The Linux Kernel Mailing List
From: Mohamed Khalfella <mkhalfella@purestorage.com>
To: Justin Tee <justin.tee@broadcom.com>,
	Naresh Gottumukkala <nareshgottumukkala83@gmail.com>,
	Paul Ely <paul.ely@broadcom.com>,
	Chaitanya Kulkarni <kch@nvidia.com>, Jens Axboe <axboe@kernel.dk>,
	Keith Busch <kbusch@kernel.org>, Sagi Grimberg <sagi@grimberg.me>,
	James Smart <jsmart833426@gmail.com>,
	Hannes Reinecke <hare@suse.de>
Cc: Aaron Dailey <adailey@purestorage.com>,
	Randy Jennings <randyj@purestorage.com>,
	Dhaval Giani <dgiani@purestorage.com>,
	linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 00/15] TP8028 Rapid Path Failure Recovery
Date: Tue, 12 May 2026 14:40:40 -0700
Message-ID: <20260512214040.GI10532-mkhalfella@purestorage.com>
In-Reply-To: <20260328004518.1729186-1-mkhalfella@purestorage.com>

On Fri 2026-03-27 17:43:31 -0700, Mohamed Khalfella wrote:
> This patchset adds support for TP8028 Rapid Path Failure Recovery for
> both nvme target and initiator. Rapid Path Failure Recovery brings
> Cross-Controller Reset (CCR) functionality to nvme. This allows nvme
> host to send an nvme command to a source nvme controller to reset
> the impacted nvme controller, provided that both source and impacted
> controllers are in the same nvme subsystem.
> 
> The main use of CCR is when one path to the nvme subsystem fails.
> Inflight IOs on impacted nvme controller need to be terminated first
> before they can be retried on another path. Otherwise data corruption
> may happen. CCR provides a quick way to terminate these IOs on the
> unreachable nvme controller allowing recovery to move quickly avoiding
> unnecessary delays. If CCR is not possible, inflight requests
> are held for duration defined by TP4129 KATO Corrections and
> Clarifications before they are allowed to be retried.
> 
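
A minimal standalone sketch of the recovery order described above: try
CCR first, and only if it fails hold inflight requests for a fallback
period before retrying them. All names here (sketch_fence, the try_ccr
callback, fallback_hold_ms) are hypothetical and are not taken from the
patches. The hold duration is passed in as a parameter because the
cover letter only says it is defined by TP4129.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: not part of the patchset. */
struct sketch_fence_result {
	bool ccr_succeeded;	/* true if CCR terminated the inflight IOs */
	uint64_t hold_ms;	/* how long to hold requests before retrying */
};

/* Stub callbacks standing in for "send CCR and wait for completion". */
static bool sketch_ccr_ok(void *arg)   { (void)arg; return true; }
static bool sketch_ccr_fail(void *arg) { (void)arg; return false; }

/*
 * Try CCR first. On success requests can be retried right away
 * (hold_ms == 0). On failure fall back to holding requests for
 * fallback_hold_ms, which the caller derives per TP4129 (assumption:
 * this sketch does not compute that value).
 */
static struct sketch_fence_result
sketch_fence(bool (*try_ccr)(void *), void *arg, uint64_t fallback_hold_ms)
{
	struct sketch_fence_result res = { false, fallback_hold_ms };

	assert(try_ccr != NULL);
	if (try_ccr(arg)) {
		res.ccr_succeeded = true;
		res.hold_ms = 0;
	}
	return res;
}
```

Calling sketch_fence(sketch_ccr_fail, NULL, 5000) yields a 5000 ms hold,
while the sketch_ccr_ok variant yields no hold at all.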
> 
> On the target side:
> 
> * New struct members have been added to support CCR. struct nvme_id_ctrl
>   has been updated with CIU (Controller Instance Uniquifier), CIRN
>   (Controller Instance Random Number), and CQT (Command Quiesce Time).
>   The combination of CIU, CNTLID, and CIRN is used to identify impacted
>   controller in CCR command.
> 
> * CCR nvme command implemented on the target causes impacted controller
>   to fail and drop connections to host.
> 
> * CCR logpage contains the status of pending CCR requests. An entry is
>   added to the logpage after CCR request is validated. Completed CCR
>   requests are removed from the logpage when controller becomes ready or
>   when requested in get logpage command.
> 
> * An AEN is sent when CCR completes to let the host know that it is safe
>   to retry inflight requests.
> 
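
A minimal standalone sketch of identifying the impacted controller by
the (CIU, CNTLID, CIRN) combination. The struct, the function names and
the field widths are assumptions for illustration and are not taken
from the patches. The point is that a matching CNTLID alone is not
enough: a different controller instance reusing the same CNTLID does
not match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical identity tuple; field widths are assumptions. */
struct sketch_ctrl_id {
	uint8_t  ciu;		/* Controller Instance Uniquifier */
	uint16_t cntlid;	/* Controller ID */
	uint64_t cirn;		/* Controller Instance Random Number */
};

/* All three fields must match for two identities to be equal. */
static bool sketch_id_equal(const struct sketch_ctrl_id *a,
			    const struct sketch_ctrl_id *b)
{
	return a->ciu == b->ciu && a->cntlid == b->cntlid &&
	       a->cirn == b->cirn;
}

/* Return the index of the controller matching 'want', or -1 if none. */
static int sketch_find_impacted(const struct sketch_ctrl_id *ctrls, size_t n,
				const struct sketch_ctrl_id *want)
{
	assert(ctrls != NULL && want != NULL);
	for (size_t i = 0; i < n; i++) {
		if (sketch_id_equal(&ctrls[i], want))
			return (int)i;
	}
	return -1;
}
```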
> 
> On the host side:
> 
> * CIU, CIRN, and CQT have been added to struct nvme_ctrl. CIU and CIRN
>   have been added to sysfs to make the values visible to the user.
>   CIU and CIRN can be used to construct and manually send admin-passthru
>   CCR commands.
> 
> * New controller states FENCING and FENCED have been added to make sure
>   that inflight requests do not get canceled if they time out during
>   the fencing process. FENCED exists so that the controller state machine
>   does not have a transition from FENCING to RESETTING. Instead the
>   sequence is FENCING -> FENCED -> RESETTING. This prevents a controller
>   being fenced from getting reset. Only after fencing finishes is the
>   impacted controller reset.
> 
> * Controller recovery in nvme_fence_ctrl() is invoked when LIVE
>   controller hits an error or when a request times out. CCR is attempted
>   first to reset impacted controller. If it fails then inflight requests
>   are held until it is safe to retry them.
> 
> * Updated nvme fabric transports nvme-tcp, nvme-rdma, and nvme-fc to
>   use CCR recovery.
> 
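
A minimal standalone sketch of the FENCING -> FENCED -> RESETTING
ordering described in the host-side list. Only the state names come
from the cover letter. The enum, the function names and the choice of
which other transitions to allow (LIVE -> FENCING on error, and
LIVE -> RESETTING for a manual reset) are assumptions for illustration,
not the actual nvme_ctrl state machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of controller states for this sketch. */
enum sketch_state { SK_LIVE, SK_FENCING, SK_FENCED, SK_RESETTING };

/*
 * FENCING may only move to FENCED, so a controller that is being
 * fenced cannot be reset until fencing has finished.
 */
static bool sketch_transition_allowed(enum sketch_state from,
				      enum sketch_state to)
{
	switch (from) {
	case SK_LIVE:
		return to == SK_FENCING || to == SK_RESETTING;
	case SK_FENCING:
		return to == SK_FENCED;
	case SK_FENCED:
		return to == SK_RESETTING;
	default:
		return false;
	}
}

/* Apply the transition if allowed; leave the state untouched otherwise. */
static bool sketch_change_state(enum sketch_state *cur, enum sketch_state to)
{
	assert(cur != NULL);
	if (!sketch_transition_allowed(*cur, to))
		return false;
	*cur = to;
	return true;
}
```

In this sketch a reset requested while in FENCING is simply refused and
the state stays FENCING. The reset becomes possible once the state has
moved to FENCED.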
> 
> Ideally all inflight requests should be held during controller recovery
> and only retried after recovery is done. However, there are known
> situations where that is not the case in this implementation. These gaps
> will be addressed in future patches:
> 
> * Manual controller reset from sysfs moves the controller to the
>   RESETTING state; all inflight requests are canceled immediately
>   and may be retried on another path.
> 
> * Manual controller delete from sysfs also cancels all inflight
>   requests immediately, and they may be retried on another path.
> 
> * In nvme-fc, nvme controller will be deleted if remote port disappears
>   with no timeout specified. This results in immediate cancellation of
>   requests that may be retried on another path.
> 
> * In nvme-rdma, if the HCA is removed all nvme controllers will be
>   deleted. This cancels inflight IOs, which may then be retried on
>   another path.
> 
> 
> Changes from v3:
> - nvmet: Implement CCR nvme command
>   - Fixed a bug in the order of members of struct nvme_cross_ctrl_reset_cmd
>   - Use kmalloc_obj() instead of kmalloc()
> 
> - nvme: Implement cross-controller reset recovery
>   - Now that CQT has been removed, updated nvme_fence_ctrl() to return
>     success or failure instead of remaining time.
>   - Updated nvme_issue_wait_ccr() to respect deadline set in
>     nvme_fence_ctrl().

v4 dropped the CQT patches in order to focus on CCR. However, I have come
to the understanding that we need to bring the CQT patches back. The plan
is for v5 to be similar to v3, plus the minor fixes that came in v4.

Sagi - Does this sound good to you?

> 
> - nvme-tcp: Use CCR to recover controller that hits an error
> - nvme-rdma: Use CCR to recover controller that hits an error
>   - Updated to log nvme_fence_ctrl() return value
> 
> - nvme-fc: Refactor IO error recovery
>   - Updated the commit message
>   - Updated nvme_fc_start_ioerr_recovery() to handle
>     CONNECTING case first.
> 
> - nvme-fc: Use CCR to recover controller that hits an error
>   - Updated to log nvme_fence_ctrl() return value
> 
> - nvmet: Add support for CQT to nvme target
> - nvme: Add support for CQT to nvme host
> - nvme: Update CCR completion wait timeout to consider CQT
> - nvme-tcp: Extend FENCING state per TP4129 on CCR failure
> - nvme-rdma: Extend FENCING state per TP4129 on CCR failure
> - nvme-fc: Extend FENCING state per TP4129 on CCR failure
>   - Dropped CQT patches
> 
> 
> v3: https://lore.kernel.org/all/20260214042753.4073668-1-mkhalfella@purestorage.com/
> 
> 
> Mohamed Khalfella (15):
>   nvmet: Rapid Path Failure Recovery set controller identify fields
>   nvmet/debugfs: Export controller CIU and CIRN via debugfs
>   nvmet: Implement CCR nvme command
>   nvmet: Implement CCR logpage
>   nvmet: Send an AEN on CCR completion
>   nvme: Rapid Path Failure Recovery read controller identify fields
>   nvme: Introduce FENCING and FENCED controller states
>   nvme: Implement cross-controller reset recovery
>   nvme: Implement cross-controller reset completion
>   nvme-tcp: Use CCR to recover controller that hits an error
>   nvme-rdma: Use CCR to recover controller that hits an error
>   nvme-fc: Refactor IO error recovery
>   nvme-fc: Use CCR to recover controller that hits an error
>   nvme-fc: Hold inflight requests while in FENCING state
>   nvme-fc: Do not cancel requests in io target before it is initialized
> 
>  drivers/nvme/host/constants.c   |   1 +
>  drivers/nvme/host/core.c        | 225 +++++++++++++++++++++++++++++++-
>  drivers/nvme/host/fc.c          | 215 +++++++++++++++++++++---------
>  drivers/nvme/host/nvme.h        |  24 ++++
>  drivers/nvme/host/rdma.c        |  30 ++++-
>  drivers/nvme/host/sysfs.c       |  25 ++++
>  drivers/nvme/host/tcp.c         |  30 ++++-
>  drivers/nvme/target/admin-cmd.c | 123 +++++++++++++++++
>  drivers/nvme/target/core.c      | 110 +++++++++++++++-
>  drivers/nvme/target/debugfs.c   |  21 +++
>  drivers/nvme/target/nvmet.h     |  18 ++-
>  include/linux/nvme.h            |  65 ++++++++-
>  12 files changed, 812 insertions(+), 75 deletions(-)
> 
> 
> base-commit: dd09eb443372f9390d36051d86ebe06e9919aeec
> -- 
> 2.52.0
> 

Thread overview: 12+ messages
     [not found] <20260328004518.1729186-1-mkhalfella@purestorage.com>
2026-05-12 21:40 ` Mohamed Khalfella [this message]
2026-05-12 22:02   ` [PATCH v4 00/15] TP8028 Rapid Path Failure Recovery Sagi Grimberg
     [not found] ` <20260328004518.1729186-3-mkhalfella@purestorage.com>
2026-05-14 23:42   ` [PATCH v4 02/15] nvmet/debugfs: Export controller CIU and CIRN via debugfs Randy Jennings
     [not found] ` <20260328004518.1729186-4-mkhalfella@purestorage.com>
2026-05-15  0:18   ` [PATCH v4 03/15] nvmet: Implement CCR nvme command Randy Jennings
     [not found] ` <20260328004518.1729186-5-mkhalfella@purestorage.com>
2026-05-15  0:38   ` [PATCH v4 04/15] nvmet: Implement CCR logpage Randy Jennings
     [not found] ` <20260328004518.1729186-6-mkhalfella@purestorage.com>
2026-05-15  0:50   ` [PATCH v4 05/15] nvmet: Send an AEN on CCR completion Randy Jennings
     [not found] ` <20260328004518.1729186-7-mkhalfella@purestorage.com>
2026-05-15  2:03   ` [PATCH v4 06/15] nvme: Rapid Path Failure Recovery read controller identify fields Randy Jennings
     [not found] ` <20260328004518.1729186-8-mkhalfella@purestorage.com>
2026-05-15  2:06   ` [PATCH v4 07/15] nvme: Introduce FENCING and FENCED controller states Randy Jennings
     [not found] ` <20260328004518.1729186-2-mkhalfella@purestorage.com>
2026-05-15  2:08   ` [PATCH v4 01/15] nvmet: Rapid Path Failure Recovery set controller identify fields Randy Jennings
     [not found] ` <20260328004518.1729186-9-mkhalfella@purestorage.com>
2026-05-15  2:32   ` [PATCH v4 08/15] nvme: Implement cross-controller reset recovery Randy Jennings
     [not found] ` <20260328004518.1729186-10-mkhalfella@purestorage.com>
2026-05-15  2:47   ` [PATCH v4 09/15] nvme: Implement cross-controller reset completion Randy Jennings
     [not found]   ` <73a9c0e2-ecd0-4170-8723-259529617ec0@suse.de>
     [not found]     ` <20260331165510.GD2861-mkhalfella@purestorage.com>
     [not found]       ` <019cf04f-8988-46fd-aecd-0f77ac5f8b8a@suse.de>
     [not found]         ` <20260407190940.GF2861-mkhalfella@purestorage.com>
2026-05-15  2:49           ` Randy Jennings
