* Re: [PATCH v4 00/15] TP8028 Rapid Path Failure Recovery
From: Mohamed Khalfella @ 2026-05-12 21:40 UTC (permalink / raw)
To: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni,
Jens Axboe, Keith Busch, Sagi Grimberg, James Smart,
Hannes Reinecke
Cc: Aaron Dailey, Randy Jennings, Dhaval Giani, linux-nvme,
linux-kernel
On Fri 2026-03-27 17:43:31 -0700, Mohamed Khalfella wrote:
> This patchset adds support for TP8028 Rapid Path Failure Recovery for
> both nvme target and initiator. Rapid Path Failure Recovery brings
> Cross-Controller Reset (CCR) functionality to nvme. This allows nvme
> host to send an nvme command to a source nvme controller to reset
> the impacted nvme controller, provided that both source and impacted
> controllers are in the same nvme subsystem.
>
> The main use of CCR is when one path to the nvme subsystem fails.
> Inflight IOs on impacted nvme controller need to be terminated first
> before they can be retried on another path. Otherwise data corruption
> may happen. CCR provides a quick way to terminate these IOs on the
> unreachable nvme controller allowing recovery to move quickly avoiding
> unnecessary delays. In case CCR is not possible, inflight requests
> are held for the duration defined by TP4129 KATO Corrections and
> Clarifications before they are allowed to be retried.
>
>
> On the target side:
>
> * New struct members have been added to support CCR. struct nvme_id_ctrl
> has been updated with CIU (Controller Instance Uniquifier), CIRN
> (Controller Instance Random Number), and CQT (Command Quiesce Time).
> The combination of CIU, CNTLID, and CIRN is used to identify impacted
> controller in CCR command.
>
> * CCR nvme command implemented on the target causes impacted controller
> to fail and drop connections to host.
>
> * CCR logpage contains the status of pending CCR requests. An entry is
> added to the logpage after CCR request is validated. Completed CCR
> requests are removed from the logpage when controller becomes ready or
> when requested in get logpage command.
>
> * An AEN is sent when CCR completes to let the host know that it is safe
> to retry inflight requests.
>
>
> On the host side:
>
> * CIU, CIRN, and CQT have been added to struct nvme_ctrl. CIU and CIRN
> have been added to sysfs to make the values visible to the user.
> CIU and CIRN can be used to construct and manually send admin-passthru
> CCR commands.
>
> * New controller states FENCING and FENCED have been added to make sure
> that inflight requests do not get canceled if they time out during the
> fencing process. FENCED exists so that the controller state machine does
> not have a transition from FENCING to RESETTING. Instead, the path is
> FENCING -> FENCED -> RESETTING. This prevents a controller being fenced
> from getting reset. Only after fencing finishes is the impacted
> controller reset.
>
> * Controller recovery in nvme_fence_ctrl() is invoked when LIVE
> controller hits an error or when a request times out. CCR is attempted
> first to reset impacted controller. If it fails then inflight requests
> are held until it is safe to retry them.
>
> * Updated nvme fabric transports nvme-tcp, nvme-rdma, and nvme-fc to
> use CCR recovery.
>
>
> Ideally all inflight requests should be held during controller recovery
> and only retried after recovery is done. However, there are known
> situations where that is not the case in this implementation. These gaps
> will be addressed in future patches:
>
> * Manual controller reset from sysfs will result in the controller going to
> RESETTING state and all inflight requests being canceled immediately;
> they may then be retried on another path.
>
> * Manual controller delete from sysfs will also result in all inflight
> requests being canceled immediately; they may then be retried on
> another path.
>
> * In nvme-fc, the nvme controller will be deleted if the remote port
> disappears with no timeout specified. This results in immediate
> cancellation of requests, which may then be retried on another path.
>
> * In nvme-rdma, if the HCA is removed, all nvme controllers will be
> deleted. This results in canceling inflight IOs, which may then be
> retried on another path.
>
>
> Changes from v3:
> - nvmet: Implement CCR nvme command
> - Fixed a bug in the order of members of struct nvme_cross_ctrl_reset_cmd
> - Use kmalloc_obj() instead of kmalloc()
>
> - nvme: Implement cross-controller reset recovery
> - Now that CQT has been removed, updated nvme_fence_ctrl() to return
> success or failure instead of remaining time.
> - Updated nvme_issue_wait_ccr() to respect deadline set in
> nvme_fence_ctrl().

v4 dropped the CQT patches in order to focus on CCR. However, I have come to
understand that we need to bring the CQT patches back. The plan is for v5 to
be similar to v3, plus the minor fixes that came in v4.

Sagi - Does this sound good to you?

>
> - nvme-tcp: Use CCR to recover controller that hits an error
> - nvme-rdma: Use CCR to recover controller that hits an error
> - Updated log nvme_fence_ctrl() return value
>
> - nvme-fc: Refactor IO error recovery
> - Updated the commit message
> - Updated nvme_fc_start_ioerr_recovery() to handle
> CONNECTING case first.
>
> - nvme-fc: Use CCR to recover controller that hits an error
> - Updated log nvme_fence_ctrl() return value
>
> - nvmet: Add support for CQT to nvme target
> - nvme: Add support for CQT to nvme host
> - nvme: Update CCR completion wait timeout to consider CQT
> - nvme-tcp: Extend FENCING state per TP4129 on CCR failure
> - nvme-rdma: Extend FENCING state per TP4129 on CCR failure
> - nvme-fc: Extend FENCING state per TP4129 on CCR failure
> - Dropped CQT patches
>
>
> v3: https://lore.kernel.org/all/20260214042753.4073668-1-mkhalfella@purestorage.com/
>
>
> Mohamed Khalfella (15):
> nvmet: Rapid Path Failure Recovery set controller identify fields
> nvmet/debugfs: Export controller CIU and CIRN via debugfs
> nvmet: Implement CCR nvme command
> nvmet: Implement CCR logpage
> nvmet: Send an AEN on CCR completion
> nvme: Rapid Path Failure Recovery read controller identify fields
> nvme: Introduce FENCING and FENCED controller states
> nvme: Implement cross-controller reset recovery
> nvme: Implement cross-controller reset completion
> nvme-tcp: Use CCR to recover controller that hits an error
> nvme-rdma: Use CCR to recover controller that hits an error
> nvme-fc: Refactor IO error recovery
> nvme-fc: Use CCR to recover controller that hits an error
> nvme-fc: Hold inflight requests while in FENCING state
> nvme-fc: Do not cancel requests in io taget before it is initialized
>
> drivers/nvme/host/constants.c | 1 +
> drivers/nvme/host/core.c | 225 +++++++++++++++++++++++++++++++-
> drivers/nvme/host/fc.c | 215 +++++++++++++++++++++---------
> drivers/nvme/host/nvme.h | 24 ++++
> drivers/nvme/host/rdma.c | 30 ++++-
> drivers/nvme/host/sysfs.c | 25 ++++
> drivers/nvme/host/tcp.c | 30 ++++-
> drivers/nvme/target/admin-cmd.c | 123 +++++++++++++++++
> drivers/nvme/target/core.c | 110 +++++++++++++++-
> drivers/nvme/target/debugfs.c | 21 +++
> drivers/nvme/target/nvmet.h | 18 ++-
> include/linux/nvme.h | 65 ++++++++-
> 12 files changed, 812 insertions(+), 75 deletions(-)
>
>
> base-commit: dd09eb443372f9390d36051d86ebe06e9919aeec
> --
> 2.52.0
>
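
The host-side recovery order described in the cover letter (try CCR first,
and hold inflight requests only when CCR fails) can be sketched as a small
userspace model. This is an illustration, not code from the patchset: the
names fence_result, fence_plan, and plan_fence() are hypothetical, and
hold_ms is an opaque stand-in for the TP4129-defined duration, which the
sketch does not compute.

```c
#include <stdbool.h>

/* Hypothetical outcome of a fencing attempt; not taken from the patches. */
enum fence_result {
	FENCE_BY_CCR,		/* CCR reset the impacted controller */
	FENCE_BY_TIMEOUT,	/* CCR failed; requests are held instead */
};

struct fence_plan {
	enum fence_result result;
	unsigned int wait_ms;	/* how long inflight requests stay held */
};

/*
 * Model of the recovery order: CCR is attempted first, and time-based
 * holding is used only as the fallback. hold_ms is supplied by the
 * caller and stands in for the TP4129-defined duration.
 */
static struct fence_plan plan_fence(bool ccr_succeeded, unsigned int hold_ms)
{
	struct fence_plan plan;

	if (ccr_succeeded) {
		plan.result = FENCE_BY_CCR;
		plan.wait_ms = 0;	/* no extra hold time in this model */
	} else {
		plan.result = FENCE_BY_TIMEOUT;
		plan.wait_ms = hold_ms;
	}
	return plan;
}
```

In this model the caller would retry requests on another path only after
wait_ms has elapsed, which is zero when CCR succeeded.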
* Re: [PATCH v4 00/15] TP8028 Rapid Path Failure Recovery
From: Sagi Grimberg @ 2026-05-12 22:02 UTC (permalink / raw)
To: Mohamed Khalfella, Justin Tee, Naresh Gottumukkala, Paul Ely,
Chaitanya Kulkarni, Jens Axboe, Keith Busch, James Smart,
Hannes Reinecke
Cc: Aaron Dailey, Randy Jennings, Dhaval Giani, linux-nvme,
linux-kernel
> v4 dropped the CQT patches in order to focus on CCR. However, I have come to
> understand that we need to bring the CQT patches back. The plan is for v5 to
> be similar to v3, plus the minor fixes that came in v4.
>
> Sagi - Does this sound good to you?
Yes
* Re: [PATCH v4 02/15] nvmet/debugfs: Export controller CIU and CIRN via debugfs
From: Randy Jennings @ 2026-05-14 23:42 UTC (permalink / raw)
To: Mohamed Khalfella
Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni,
Jens Axboe, Keith Busch, Sagi Grimberg, James Smart,
Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme,
linux-kernel
On Fri, Mar 27, 2026 at 5:45 PM Mohamed Khalfella
<mkhalfella@purestorage.com> wrote:
>
> Export ctrl->ciu and ctrl->cirn as debugfs files under controller
> debugfs directory.
>
> Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Randy Jennings <randyj@purestorage.com>
* Re: [PATCH v4 03/15] nvmet: Implement CCR nvme command
From: Randy Jennings @ 2026-05-15 0:18 UTC (permalink / raw)
To: Mohamed Khalfella
Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni,
Jens Axboe, Keith Busch, Sagi Grimberg, James Smart,
Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme,
linux-kernel
On Fri, Mar 27, 2026 at 5:45 PM Mohamed Khalfella
<mkhalfella@purestorage.com> wrote:
>
> Defined by TP8028 Rapid Path Failure Recovery, CCR (Cross-Controller
> Reset) command is an nvme command issued to source controller by
> initiator to reset impacted controller. Implement CCR command for linux
> nvme target.
>
> Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
> ---
> +	mutex_lock(&sctrl->lock);
> +	list_for_each_entry(ccr, &sctrl->ccr_list, entry) {
> +		if (ccr->ctrl == ictrl) {
> +			status = NVME_SC_CCR_IN_PROGRESS | NVME_STATUS_DNR;
> +			goto out_unlock;
> +		}
> +
> +		ccr_total++;
> +		if (ccr->ctrl)
> +			ccr_active++;
> +	}
> +
> +	if (ccr_active >= NVMF_CCR_LIMIT) {
> +		status = NVME_SC_CCR_LIMIT_EXCEEDED;
> +		goto out_unlock;
> +	}
> +	if (ccr_total >= NVMF_CCR_PER_PAGE) {
> +		status = NVME_SC_CCR_LOGPAGE_FULL;
> +		goto out_unlock;
> +	}
> +
> +	new_ccr = kmalloc_obj(*new_ccr, GFP_KERNEL);

This allocation could be done optimistically outside of the mutex.
But that would lead to more complicated code; probably not worth it
here.

Reviewed-by: Randy Jennings <randyj@purestorage.com>
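
For reference, the optimistic-allocation alternative mentioned above could
look roughly like the userspace sketch below. It is a model under stated
assumptions, not the patch: a pthread mutex stands in for the kernel mutex,
malloc() for kmalloc_obj(), the two limits are collapsed into a single
hypothetical CCR_LIMIT, and the struct names and return codes are invented
for illustration.

```c
#include <pthread.h>
#include <stdlib.h>

#define CCR_LIMIT	4	/* hypothetical stand-in for the real limits */

struct ccr_entry {
	int ictrl_id;			/* impacted controller being reset */
	struct ccr_entry *next;
};

struct src_ctrl {
	pthread_mutex_t lock;
	struct ccr_entry *ccr_list;
	int nr_ccr;
};

/*
 * Returns 0 on success, -1 if a CCR is already pending for ictrl_id or
 * the limit is reached, -2 if the allocation fails.
 */
static int ccr_add_optimistic(struct src_ctrl *sctrl, int ictrl_id)
{
	struct ccr_entry *new_ccr, *ccr;
	int ret = -1;

	/* Allocate before taking the lock, betting that validation passes. */
	new_ccr = malloc(sizeof(*new_ccr));
	if (!new_ccr)
		return -2;

	pthread_mutex_lock(&sctrl->lock);
	for (ccr = sctrl->ccr_list; ccr; ccr = ccr->next)
		if (ccr->ictrl_id == ictrl_id)
			goto out_unlock;	/* CCR already in progress */
	if (sctrl->nr_ccr >= CCR_LIMIT)
		goto out_unlock;

	new_ccr->ictrl_id = ictrl_id;
	new_ccr->next = sctrl->ccr_list;
	sctrl->ccr_list = new_ccr;
	sctrl->nr_ccr++;
	new_ccr = NULL;			/* ownership moved to the list */
	ret = 0;
out_unlock:
	pthread_mutex_unlock(&sctrl->lock);
	free(new_ccr);			/* frees only the unused allocation */
	return ret;
}
```

The extra cleanup on the rejection paths is the added complexity the review
comment refers to: the allocation is wasted whenever validation fails.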
* Re: [PATCH v4 04/15] nvmet: Implement CCR logpage
From: Randy Jennings @ 2026-05-15 0:38 UTC (permalink / raw)
To: Mohamed Khalfella
Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni,
Jens Axboe, Keith Busch, Sagi Grimberg, James Smart,
Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme,
linux-kernel
On Fri, Mar 27, 2026 at 5:45 PM Mohamed Khalfella
<mkhalfella@purestorage.com> wrote:
>
> Defined by TP8028 Rapid Path Failure Recovery, CCR (Cross-Controller
> Reset) log page contains an entry for each CCR request submitted to
> source controller. Implement CCR logpage for nvme linux target.
> +/* NVMe Cross-Controller Reset Status */
> +enum {
> +	NVME_CCR_STATUS_IN_PROGRESS,
> +	NVME_CCR_STATUS_SUCCESS,
> +	NVME_CCR_STATUS_FAILED,
> +};
> +

Looking at the rest of the code, all the enums define explicit values except
/* NVMe Namespace Write Protect State */
which defines only the value of the first entry (0).
I think it would be preferable to add explicit values here (0, 1, 2) even
though the implicit values should be correct.

Sincerely,
Randy Jennings
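
A sketch of what the suggestion above amounts to, assuming the enumerator
names stay exactly as in the quoted hunk. The values 0, 1, 2 are the ones
named in the comment and match what the compiler already assigns
implicitly, so behavior would not change.

```c
/* NVMe Cross-Controller Reset Status, with the values spelled out */
enum {
	NVME_CCR_STATUS_IN_PROGRESS	= 0,
	NVME_CCR_STATUS_SUCCESS		= 1,
	NVME_CCR_STATUS_FAILED		= 2,
};
```

Spelling the values out keeps them stable if an entry is later inserted or
reordered.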
* Re: [PATCH v4 05/15] nvmet: Send an AEN on CCR completion
From: Randy Jennings @ 2026-05-15 0:50 UTC (permalink / raw)
To: Mohamed Khalfella
Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni,
Jens Axboe, Keith Busch, Sagi Grimberg, James Smart,
Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme,
linux-kernel
On Fri, Mar 27, 2026 at 5:45 PM Mohamed Khalfella
<mkhalfella@purestorage.com> wrote:
>
> Send an AEN to initiator when impacted controller exits. The
> notification points to CCR log page that initiator can read to check
> which CCR operation completed.
>
> Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
> +static void nvmet_add_async_event_locked(struct nvmet_ctrl *ctrl, u8 event_type,
>  		u8 event_info, u8 log_page)
>  {
>  	struct nvmet_async_event *aen;
> @@ -218,13 +218,19 @@ void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type,
>  	aen->event_info = event_info;
>  	aen->log_page = log_page;

Please add
	lockdep_assert_held(&ctrl->lock);
Looking at the rest of the code, this should go directly under the
local variable definitions.

>
> -	mutex_lock(&ctrl->lock);
>  	list_add_tail(&aen->entry, &ctrl->async_events);
> -	mutex_unlock(&ctrl->lock);
Sincerely,
Randy Jennings
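
To illustrate the placement requested above, here is a userspace sketch of
the "_locked helper plus locking wrapper" pattern. Everything in it is a
stand-in so that it compiles outside the kernel: struct mutex,
mutex_lock(), mutex_unlock(), and lockdep_assert_held() are minimal
imitations built on assert(), and nvmet_ctrl_model with its event counter
is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins; the kernel versions behave differently. */
struct mutex { bool held; };
static void mutex_lock(struct mutex *m)   { assert(!m->held); m->held = true; }
static void mutex_unlock(struct mutex *m) { assert(m->held);  m->held = false; }
#define lockdep_assert_held(m)	assert((m)->held)

/* Hypothetical, stripped-down controller with only what the sketch needs. */
struct nvmet_ctrl_model {
	struct mutex lock;
	int nr_async_events;
};

/* Caller must hold ctrl->lock, as the _locked suffix implies. */
static void add_async_event_locked(struct nvmet_ctrl_model *ctrl)
{
	int queued;	/* local variable definitions come first */

	lockdep_assert_held(&ctrl->lock);	/* assertion directly below them */

	queued = ctrl->nr_async_events + 1;
	ctrl->nr_async_events = queued;
}

/* Wrapper for callers that do not already hold the lock. */
static void add_async_event(struct nvmet_ctrl_model *ctrl)
{
	mutex_lock(&ctrl->lock);
	add_async_event_locked(ctrl);
	mutex_unlock(&ctrl->lock);
}
```

With the assertion in place, a caller that reaches the _locked helper
without the lock trips the check instead of silently racing on the list.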
* Re: [PATCH v4 06/15] nvme: Rapid Path Failure Recovery read controller identify fields
From: Randy Jennings @ 2026-05-15 2:03 UTC (permalink / raw)
To: Mohamed Khalfella
Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni,
Jens Axboe, Keith Busch, Sagi Grimberg, James Smart,
Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme,
linux-kernel
On Fri, Mar 27, 2026 at 5:45 PM Mohamed Khalfella
<mkhalfella@purestorage.com> wrote:
>
> TP8028 Rapid Path Failure Recovery added new fields to controller identify
> response. Read CIU (Controller Instance Uniquifier), CIRN (Controller
> Instance Random Number), and CCRL (Cross-Controller Reset Limit) from
> controller identify response. Expose CIU and CIRN as sysfs attributes
> so the values can be used directly by the user if needed.
>
> Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Randy Jennings <randyj@purestorage.com>
* Re: [PATCH v4 07/15] nvme: Introduce FENCING and FENCED controller states
From: Randy Jennings @ 2026-05-15 2:06 UTC (permalink / raw)
To: Mohamed Khalfella
Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni,
Jens Axboe, Keith Busch, Sagi Grimberg, James Smart,
Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme,
linux-kernel
On Fri, Mar 27, 2026 at 5:46 PM Mohamed Khalfella
<mkhalfella@purestorage.com> wrote:
>
> FENCING is a new controller state that a LIVE controller enters when an
> error is encountered. While in FENCING state, inflight IOs that timeout
> are not canceled because they should be held until either CCR succeeds
> or time-based recovery completes. While the queues remain alive, requests
> are not allowed to be sent in this state, and the controller cannot be
> reset or deleted. This is intentional because resetting or deleting the
> controller results in canceling inflight IOs.
>
> FENCED is a short-term state the controller enters before it is reset.
> It exists only to prevent manual resets from happening while controller
> is in FENCING state.
>
> Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Reviewed-by: Randy Jennings <randyj@purestorage.com>
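
The transition rules in the commit message above can be modeled with a
small table-like check. This is a partial, hypothetical model: the
ctrl_state names and transition_allowed() are invented here, only the
FENCING and FENCED rows come from the commit message, and the remaining
rows are simplified assumptions rather than the driver's real state
machine.

```c
#include <stdbool.h>

/* Reduced set of controller states for the purpose of this sketch. */
enum ctrl_state {
	CTRL_LIVE,
	CTRL_FENCING,
	CTRL_FENCED,
	CTRL_RESETTING,
	CTRL_DELETING,
};

/* Returns true if moving from state old to state next is permitted. */
static bool transition_allowed(enum ctrl_state old, enum ctrl_state next)
{
	switch (next) {
	case CTRL_FENCING:
		/* A LIVE controller enters FENCING when it hits an error. */
		return old == CTRL_LIVE;
	case CTRL_FENCED:
		/* Reached only once fencing has finished. */
		return old == CTRL_FENCING;
	case CTRL_RESETTING:
		/* No FENCING -> RESETTING edge, so fencing cannot be cut short. */
		return old == CTRL_LIVE || old == CTRL_FENCED;
	case CTRL_DELETING:
		/* Assumption: a controller in FENCING cannot be deleted either. */
		return old == CTRL_LIVE || old == CTRL_RESETTING;
	default:
		return false;
	}
}
```

Because RESETTING is reachable from FENCED but not from FENCING, a manual
reset requested while fencing is in progress is rejected by the check.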
* Re: [PATCH v4 01/15] nvmet: Rapid Path Failure Recovery set controller identify fields
From: Randy Jennings @ 2026-05-15 2:08 UTC (permalink / raw)
To: Mohamed Khalfella
Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni,
Jens Axboe, Keith Busch, Sagi Grimberg, James Smart,
Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme,
linux-kernel
On Fri, Mar 27, 2026 at 5:45 PM Mohamed Khalfella
<mkhalfella@purestorage.com> wrote:
>
> TP8028 Rapid Path Failure Recovery defined new fields in controller
> identify response. The newly defined fields are:
>
> Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Reviewed-by: Randy Jennings <randyj@purestorage.com>