From: sashiko-bot@kernel.org
To: "Tyrel Datwyler" <tyreld@linux.ibm.com>
Cc: linux-scsi@vger.kernel.org
Subject: Re: [PATCH 29/29] ibmvfc: fail nvme-fc fcp-io and ls requests during transport reset
Date: Tue, 23 Jun 2026 02:04:39 +0000 [thread overview]
Message-ID: <20260623020439.5AF251F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260623013035.3436640-30-tyreld@linux.ibm.com>
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
New issues:
- [Critical] Pooled event structures retain stale NVMe request pointers, leading to exploitable Use-After-Free and double completions when non-NVMe commands fail.
- [High] Target kref is permanently leaked when an NVMe LS request fails or is purged.
Pre-existing issues:
- [High] Returning SCSI_MLQUEUE_HOST_BUSY for NVMe requests on transport closure causes permanent IO failures instead of requeuing.
- [Medium] NVMe FCP requests are not purged during a transport reset, rendering the patch's FCP functionality dead code.
--
commit 88ce7ce7f11c11cfca700c6919820cf4d9865c8a
Author: Tyrel Datwyler <tyreld@linux.ibm.com>
ibmvfc: fail nvme-fc fcp-io and ls requests during transport reset
The current purge code for flushing outstanding commands during a
transport reset only deals with SCSI commands. Rename the
ibmvfc_scsi_eh_done completion handler to ibmvfc_vfc_eh_done and wire it
to correctly complete nvme fcp and ls commands when flushing the
inflight command list during a reset.
> The current purge code for flushing outstanding commands during a
> transport reset only deals with SCSI commands. Rename the
> ibmvfc_scsi_eh_done completion handler to ibmvfc_vfc_eh_done and wire it
> to correctly complete nvme fcp and ls commands when flushing the
> inflight command list during a reset.
[Severity: Medium]
This isn't a bug introduced by this patch, but does the purge code actually
evaluate inflight NVMe FCP commands?
Looking at ibmvfc_purge_requests(), the loop iterates over vhost->crq and
vhost->scsi_scrqs.scrqs:
drivers/scsi/ibmvscsi/ibmvfc-core.c:ibmvfc_purge_requests() {
...
for (i = 0; i < hwqs; i++) {
spin_lock_irqsave(queues[i].q_lock, flags);
...
}
It seems to entirely miss vhost->nvme_scrqs.scrqs where FCP requests are
queued. If so, inflight NVMe FCP commands might be orphaned during a reset,
making the new FCP completion handling unreachable.
> diff --git a/drivers/scsi/ibmvscsi/ibmvfc-core.c b/drivers/scsi/ibmvscsi/ibmvfc-core.c
> index a7183493cf965..aeb5b8902aa55 100644
> --- a/drivers/scsi/ibmvscsi/ibmvfc-core.c
> +++ b/drivers/scsi/ibmvscsi/ibmvfc-core.c
[ ... ]
> @@ -1092,20 +1092,28 @@ void ibmvfc_free_event(struct ibmvfc_event *evt)
> }
>
> /**
> - * ibmvfc_scsi_eh_done - EH done function for queuecommand commands
> + * ibmvfc_vfc_eh_done - EH done function for queued IO
> * @evt: ibmvfc event struct
> *
> - * This function does not setup any error status, that must be done
> - * before this function gets called.
> + * This function does not setup any error status for scsi commands, that must be
> + * done before this function gets called.
> **/
> -static void ibmvfc_scsi_eh_done(struct ibmvfc_event *evt)
> +static void ibmvfc_vfc_eh_done(struct ibmvfc_event *evt)
> {
> struct scsi_cmnd *cmnd = evt->cmnd;
> + struct nvmefc_ls_req *ls_req = evt->ls_req;
> + struct nvmefc_fcp_req *fcp_req = evt->fcp_req;
>
> if (cmnd) {
> scsi_dma_unmap(cmnd);
> scsi_done(cmnd);
> - }
> + } else if (fcp_req) {
> + fcp_req->rcv_rsplen = 0;
> + fcp_req->transferred_length = 0;
> + fcp_req->status = NVME_SC_INTERNAL;
> + fcp_req->done(fcp_req);
> + } else if (ls_req)
> + ls_req->done(ls_req, -EIO);
[Severity: High]
Does this error handler leak the target kref for LS requests?
In ibmvfc_nvme_ls_req(), a reference is taken via kref_get(&tgt->kref) which
is typically dropped in the success path by ibmvfc_ls_req_done().
Since this patch replaces the completion handler with ibmvfc_vfc_eh_done(),
the transport request is completed here but kref_put(&evt->tgt->kref, ...)
is never called, potentially leaving the target structure leaked.
>
> ibmvfc_free_event(evt);
> }
> @@ -1146,8 +1154,10 @@ static void ibmvfc_fail_request(struct ibmvfc_event *evt, int error_code)
> BUG_ON(!atomic_dec_and_test(&evt->active));
> if (evt->cmnd) {
> evt->cmnd->result = (error_code << 16);
> - evt->done = ibmvfc_scsi_eh_done;
> - } else
> + evt->done = ibmvfc_vfc_eh_done;
> + } else if (evt->fcp_req || evt->ls_req)
> + evt->done = ibmvfc_vfc_eh_done;
> + else
[Severity: Critical]
Can these checks be triggered by stale NVMe pointers on non-NVMe commands?
When ibmvfc_get_event() reallocates an event structure from the pool,
ibmvfc_init_event() doesn't clear evt->fcp_req or evt->ls_req to NULL:
drivers/scsi/ibmvscsi/ibmvfc-core.c:ibmvfc_init_event() {
...
evt->cmnd = NULL;
evt->sync_iu = NULL;
evt->eh_comp = NULL;
...
}
If a non-NVMe command (e.g., a MAD command) is allocated an event with stale
NVMe pointers and fails here, will it incorrectly replace the completion
handler with ibmvfc_vfc_eh_done() and later use-after-free the stale NVMe
pointers?
> evt->xfer_iu->mad_common.status = cpu_to_be16(IBMVFC_MAD_DRIVER_FAILED);
>
> timer_delete(&evt->timer);
[ ... ]
> @@ -1816,8 +1826,10 @@ int ibmvfc_send_event(struct ibmvfc_event *evt,
[Severity: High]
This isn't a bug introduced by this patch, but earlier in ibmvfc_send_event(),
it hardcodes a return of SCSI_MLQUEUE_HOST_BUSY on H_CLOSED without
distinguishing between SCSI and NVMe commands:
drivers/scsi/ibmvscsi/ibmvfc-core.c:ibmvfc_send_event() {
...
if (rc == H_CLOSED) {
...
return SCSI_MLQUEUE_HOST_BUSY;
}
...
}
This value (0x1055) is propagated back up through ibmvfc_nvme_fcp_io()
directly to the NVMe FC transport layer. Since 0x1055 is neither 0 nor
-EBUSY, will the NVMe FC core treat it as a hard failure and abort the
request with BLK_STS_IOERR instead of handling the queue pressure
appropriately?
> dev_err(vhost->dev, "Send error (rc=%d)\n", rc);
> if (evt->cmnd) {
> evt->cmnd->result = DID_ERROR << 16;
> - evt->done = ibmvfc_scsi_eh_done;
> - } else
> + evt->done = ibmvfc_vfc_eh_done;
> + } else if (evt->fcp_req || evt->ls_req)
> + evt->done = ibmvfc_vfc_eh_done;
> + else
> evt->xfer_iu->mad_common.status = cpu_to_be16(IBMVFC_MAD_CRQ_ERROR);
>
> evt->done(evt);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260623013035.3436640-1-tyreld@linux.ibm.com?part=29
prev parent reply other threads:[~2026-06-23 2:04 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-23 1:30 [PATCH 00/29] ibmvfc: Add NVMe-FC support Tyrel Datwyler
2026-06-23 1:30 ` [PATCH 01/29] ibmvfc: move target list from host to protocol specific channel groups Tyrel Datwyler
2026-06-23 1:49 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 02/29] ibmvfc: add NVMe/FC protocol interface definitions Tyrel Datwyler
2026-06-23 1:54 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 03/29] ibmvfc: split NVMe support into separate source file and add transport stubs Tyrel Datwyler
2026-06-23 1:50 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 04/29] ibmvfc: initialize NVMe channel configuration during driver probe Tyrel Datwyler
2026-06-23 1:51 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 05/29] ibmvfc: alloc/dealloc sub-queues for nvme channels Tyrel Datwyler
2026-06-23 1:55 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 06/29] ibmvfc: add logic for protocol specific fabric logins Tyrel Datwyler
2026-06-23 1:50 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 07/29] ibmvfc: add wrapper to get vhost associated with a channel struct Tyrel Datwyler
2026-06-23 1:30 ` [PATCH 08/29] ibmvfc: add helper for creating protocol specific discovery event Tyrel Datwyler
2026-06-23 1:30 ` [PATCH 09/29] ibmvfc: add helper to check NVMe/FC support with active channels Tyrel Datwyler
2026-06-23 1:30 ` [PATCH 10/29] ibmvfc: allocate and free NVMe channel group discover buffer Tyrel Datwyler
2026-06-23 1:30 ` [PATCH 11/29] ibmvfc: send NVMe target discovery MAD Tyrel Datwyler
2026-06-23 1:52 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 12/29] ibmvfc: add NVMe/FC Implicit Logout and Move Login support Tyrel Datwyler
2026-06-23 1:49 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 13/29] ibmvfc: add NVMe/FC Port " Tyrel Datwyler
2026-06-23 1:53 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 14/29] ibmvfc: add NVMe/FC Process " Tyrel Datwyler
2026-06-23 1:52 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 15/29] ibmvfc: add NVMe/FC Query Target support Tyrel Datwyler
2026-06-23 1:52 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 16/29] ibmvfc: allocate targets based on protocol Tyrel Datwyler
2026-06-23 1:56 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 17/29] ibmvfc: delete NVMe/FC targets as well as SCSI Tyrel Datwyler
2026-06-23 1:51 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 18/29] ibmvfc: update state machine to process NVMe/FC targets Tyrel Datwyler
2026-06-23 1:55 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 19/29] ibmvfc: implement NVMe/FC stubs for local/remote port registration Tyrel Datwyler
2026-06-23 1:51 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 20/29] ibmvfc: register local nvme fc port after fabric login Tyrel Datwyler
2026-06-23 1:57 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 21/29] ibmvfc: process NVMe/FC rports in work thread Tyrel Datwyler
2026-06-23 2:00 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 22/29] ibmvfc: extend ibmvfc_debug visibility to ibmvfc-nvme.h Tyrel Datwyler
2026-06-23 1:51 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 23/29] ibmvfc: declare global function definitions Tyrel Datwyler
2026-06-23 2:04 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 24/29] ibmvfc: implement LLDD callbacks for mapping nvme-fc queues Tyrel Datwyler
2026-06-23 2:05 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 25/29] ibmvfc: implement nvme-fc LS submission transport callback Tyrel Datwyler
2026-06-23 2:08 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 26/29] ibmvfc: implement nvme-fc IO command submission callback Tyrel Datwyler
2026-06-23 2:09 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 27/29] ibmvfc: implement nvme-fc LS abort handling callback Tyrel Datwyler
2026-06-23 2:09 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 28/29] ibmvfc: implement nvme-fc FCP abort callback Tyrel Datwyler
2026-06-23 2:05 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 29/29] ibmvfc: fail nvme-fc fcp-io and ls requests during transport reset Tyrel Datwyler
2026-06-23 2:04 ` sashiko-bot [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260623020439.5AF251F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
--cc=tyreld@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox