* Re: [PATCH v4 00/15] TP8028 Rapid Path Failure Recovery [not found] <20260328004518.1729186-1-mkhalfella@purestorage.com> @ 2026-05-12 21:40 ` Mohamed Khalfella 2026-05-12 22:02 ` Sagi Grimberg [not found] ` <20260328004518.1729186-3-mkhalfella@purestorage.com> ` (8 subsequent siblings) 9 siblings, 1 reply; 12+ messages in thread From: Mohamed Khalfella @ 2026-05-12 21:40 UTC (permalink / raw) To: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni, Jens Axboe, Keith Busch, Sagi Grimberg, James Smart, Hannes Reinecke Cc: Aaron Dailey, Randy Jennings, Dhaval Giani, linux-nvme, linux-kernel On Fri 2026-03-27 17:43:31 -0700, Mohamed Khalfella wrote: > This patchset adds support for TP8028 Rapid Path Failure Recovery for > both nvme target and initiator. Rapid Path Failure Recovery brings > Cross-Controller Reset (CCR) functionality to nvme. This allows nvme > host to send an nvme command to a source nvme controller to reset > the impacted nvme controller, provided that both source and impacted > controllers are in the same nvme subsystem. > > The main use of CCR is when one path to the nvme subsystem fails. > Inflight IOs on impacted nvme controller need to be terminated first > before they can be retried on another path. Otherwise data corruption > may happen. CCR provides a quick way to terminate these IOs on the > unreachable nvme controller allowing recovery to move quickly avoiding > unnecessary delays. In case of CCR is not possible, inflight requests > are held for duration defined by TP4129 KATO Corrections and > Clarifications before they are allowed to be retried. > > > On the target side: > > * New struct members have been added to support CCR. struct nvme_id_ctrl > has been updated with CIU (Controller Instance Uniquifier), CIRN > (Controller Instance Random Number), and CQT (Command Quiesce Time). > The combination of CIU, CNTLID, and CIRN is used to identify impacted > controller in CCR command. > > * CCR nvme command implemented on the target causes impacted controller > to fail and drop connections to host. > > * CCR logpage contains the status of pending CCR requests. An entry is > added to the logpage after CCR request is validated. Completed CCR > requests are removed from the logpage when controller becomes ready or > when requested in get logpage command. > > * An AEN is sent when CCR completes to let the host know that it is safe > to retry inflight requests. > > > On the host side: > > * CIU, CIRN, and CQT have been added to struct nvme_ctrl. CIU and CIRN > have been added to sysfs to make the values visible to the user. > CIU and CIRN can be used to construct and manually send admin-passthru > CCR commands. > > * New controller states FENCING and FENCED have been added to make sure > that inflight request do not get canceled if they timeout during > fencing process. FENCED exists so that controller state machine does > not have a transition from FENCING to RESETTING. Instead FENCING -> > FENCED -> RESETTING. This prevents a controller being fenced from > getting reset. Only after fencing finishes the impacted controller is > reset. > > * Controller recovery in nvme_fence_ctrl() is invoked when LIVE > controller hits an error or when a request times out. CCR is attempted > first to reset impacted controller. If it fails then inflight requests > are held until it is safe to retry them. > > * Updated nvme fabric transports nvme-tcp, nvme-rdma, and nvme-fc to > use CCR recovery. > > > Ideally all inflight requests should be held during controller recovery > and only retried after recovery is done. However, there are known > situations where that is not the case in this implementation. These gaps > will be addressed in future patches: > > * Manual controller reset from sysfs will result in controller going to > RESETTING state and all inflight requests to be canceled immediately > and may be retried on another path. > > * Manual controller delete from sysfs will also result in all inflight > requests to be canceled immediately and may be retried on another path. > > * In nvme-fc, nvme controller will be deleted if remote port disappears > with no timeout specified. This results in immediate cancellation of > requests that may be retried on another path. > > * In nvme-rdma if HCA is removed all nvme controllers will be deleted. > This results in canceling inflight IOs and may be they will be retried > on another path. > > > Changes from v3: > - nvmet: Implement CCR nvme command > - Fixed a bug in the order of members of struct nvme_cross_ctrl_reset_cmd > - Use kmalloc_obj() instead of kmalloc() > > - nvme: Implement cross-controller reset recovery > - Now CQT has been removed updated nvme_fence_ctrl() to return > success or failure instead of remaining time. > - Updated nvme_issue_wait_ccr() to respect deadline set in > nvme_fence_ctrl(). v4 dropped CQT patches in order to focus on CCR. However, I came to the understanding that we need to bring CQT patches back. The plan for v5 is to be similar to v3 plus minor fixes came in v4. Sagi - Does this sound good to you? > > - nvme-tcp: Use CCR to recover controller that hits an error > - nvme-rdma: Use CCR to recover controller that hits an error > - Updated log nvme_fence_ctrl() return value > > - nvme-fc: Refactor IO error recovery > - Updated the commit message > - Updated nvme_fc_start_ioerr_recovery() to handle > CONNECTING case first. > > - nvme-fc: Use CCR to recover controller that hits an error > - Updated log nvme_fence_ctrl() return value > > - nvmet: Add support for CQT to nvme target > - nvme: Add support for CQT to nvme host > - nvme: Update CCR completion wait timeout to consider CQT > - nvme-tcp: Extend FENCING state per TP4129 on CCR failure > - nvme-rdma: Extend FENCING state per TP4129 on CCR failure > - nvme-fc: Extend FENCING state per TP4129 on CCR failure > - Dropped CQT patches > > > v3: https://lore.kernel.org/all/20260214042753.4073668-1-mkhalfella@purestorage.com/ > > *** BLURB HERE *** > > > Mohamed Khalfella (15): > nvmet: Rapid Path Failure Recovery set controller identify fields > nvmet/debugfs: Export controller CIU and CIRN via debugfs > nvmet: Implement CCR nvme command > nvmet: Implement CCR logpage > nvmet: Send an AEN on CCR completion > nvme: Rapid Path Failure Recovery read controller identify fields > nvme: Introduce FENCING and FENCED controller states > nvme: Implement cross-controller reset recovery > nvme: Implement cross-controller reset completion > nvme-tcp: Use CCR to recover controller that hits an error > nvme-rdma: Use CCR to recover controller that hits an error > nvme-fc: Refactor IO error recovery > nvme-fc: Use CCR to recover controller that hits an error > nvme-fc: Hold inflight requests while in FENCING state > nvme-fc: Do not cancel requests in io taget before it is initialized > > drivers/nvme/host/constants.c | 1 + > drivers/nvme/host/core.c | 225 +++++++++++++++++++++++++++++++- > drivers/nvme/host/fc.c | 215 +++++++++++++++++++++--------- > drivers/nvme/host/nvme.h | 24 ++++ > drivers/nvme/host/rdma.c | 30 ++++- > drivers/nvme/host/sysfs.c | 25 ++++ > drivers/nvme/host/tcp.c | 30 ++++- > drivers/nvme/target/admin-cmd.c | 123 +++++++++++++++++ > drivers/nvme/target/core.c | 110 +++++++++++++++- > drivers/nvme/target/debugfs.c | 21 +++ > drivers/nvme/target/nvmet.h | 18 ++- > include/linux/nvme.h | 65 ++++++++- > 12 files changed, 812 insertions(+), 75 deletions(-) > > > base-commit: dd09eb443372f9390d36051d86ebe06e9919aeec > -- > 2.52.0 > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4 00/15] TP8028 Rapid Path Failure Recovery 2026-05-12 21:40 ` [PATCH v4 00/15] TP8028 Rapid Path Failure Recovery Mohamed Khalfella @ 2026-05-12 22:02 ` Sagi Grimberg 0 siblings, 0 replies; 12+ messages in thread From: Sagi Grimberg @ 2026-05-12 22:02 UTC (permalink / raw) To: Mohamed Khalfella, Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni, Jens Axboe, Keith Busch, James Smart, Hannes Reinecke Cc: Aaron Dailey, Randy Jennings, Dhaval Giani, linux-nvme, linux-kernel > v4 dropped CQT patches in order to focus on CCR. However, I came to the > understanding that we need to bring CQT patches back. The plan for v5 is > to be similar to v3 plus minor fixes came in v4. > > Sagi - Does this sound good to you? Yes ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20260328004518.1729186-3-mkhalfella@purestorage.com>]
* Re: [PATCH v4 02/15] nvmet/debugfs: Export controller CIU and CIRN via debugfs [not found] ` <20260328004518.1729186-3-mkhalfella@purestorage.com> @ 2026-05-14 23:42 ` Randy Jennings 0 siblings, 0 replies; 12+ messages in thread From: Randy Jennings @ 2026-05-14 23:42 UTC (permalink / raw) To: Mohamed Khalfella Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni, Jens Axboe, Keith Busch, Sagi Grimberg, James Smart, Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme, linux-kernel On Fri, Mar 27, 2026 at 5:45 PM Mohamed Khalfella <mkhalfella@purestorage.com> wrote: > > Export ctrl->ciu and ctrl->cirn as debugfs files under controller > debugfs directory. > > Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> > Reviewed-by: Hannes Reinecke <hare@suse.de> > Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Randy Jennings <randyj@purestorage.com> ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20260328004518.1729186-4-mkhalfella@purestorage.com>]
* Re: [PATCH v4 03/15] nvmet: Implement CCR nvme command [not found] ` <20260328004518.1729186-4-mkhalfella@purestorage.com> @ 2026-05-15 0:18 ` Randy Jennings 0 siblings, 0 replies; 12+ messages in thread From: Randy Jennings @ 2026-05-15 0:18 UTC (permalink / raw) To: Mohamed Khalfella Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni, Jens Axboe, Keith Busch, Sagi Grimberg, James Smart, Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme, linux-kernel On Fri, Mar 27, 2026 at 5:45 PM Mohamed Khalfella <mkhalfella@purestorage.com> wrote: > > Defined by TP8028 Rapid Path Failure Recovery, CCR (Cross-Controller > Reset) command is an nvme command issued to source controller by > initiator to reset impacted controller. Implement CCR command for linux > nvme target. > > Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> > --- > + mutex_lock(&sctrl->lock); > + list_for_each_entry(ccr, &sctrl->ccr_list, entry) { > + if (ccr->ctrl == ictrl) { > + status = NVME_SC_CCR_IN_PROGRESS | NVME_STATUS_DNR; > + goto out_unlock; > + } > + > + ccr_total++; > + if (ccr->ctrl) > + ccr_active++; > + } > + > + if (ccr_active >= NVMF_CCR_LIMIT) { > + status = NVME_SC_CCR_LIMIT_EXCEEDED; > + goto out_unlock; > + } > + if (ccr_total >= NVMF_CCR_PER_PAGE) { > + status = NVME_SC_CCR_LOGPAGE_FULL; > + goto out_unlock; > + } > + > + new_ccr = kmalloc_obj(*new_ccr, GFP_KERNEL); This allocation could be done optimistically outside of the mutex. But that would lead to more complicated code; probably not worth it here. Reviewed-by: Randy Jennings <randyj@purestorage.com> ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20260328004518.1729186-5-mkhalfella@purestorage.com>]
* Re: [PATCH v4 04/15] nvmet: Implement CCR logpage [not found] ` <20260328004518.1729186-5-mkhalfella@purestorage.com> @ 2026-05-15 0:38 ` Randy Jennings 0 siblings, 0 replies; 12+ messages in thread From: Randy Jennings @ 2026-05-15 0:38 UTC (permalink / raw) To: Mohamed Khalfella Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni, Jens Axboe, Keith Busch, Sagi Grimberg, James Smart, Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme, linux-kernel On Fri, Mar 27, 2026 at 5:45 PM Mohamed Khalfella <mkhalfella@purestorage.com> wrote: > > Defined by TP8028 Rapid Path Failure Recovery, CCR (Cross-Controller > Reset) log page contains an entry for each CCR request submitted to > source controller. Implement CCR logpage for nvme linux target. > +/* NVMe Cross-Controller Reset Status */ > +enum { > + NVME_CCR_STATUS_IN_PROGRESS, > + NVME_CCR_STATUS_SUCCESS, > + NVME_CCR_STATUS_FAILED, > +}; > + Looking at the rest of the code, all the enums are defined except /* NVMe Namespace Write Protect State */ which does define the value of the first entry (0). I think it would be prefereable to add explicit values here (0, 1, 2) even though the implicit values should be correct. Sincerely, Randy Jennings ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20260328004518.1729186-6-mkhalfella@purestorage.com>]
* Re: [PATCH v4 05/15] nvmet: Send an AEN on CCR completion [not found] ` <20260328004518.1729186-6-mkhalfella@purestorage.com> @ 2026-05-15 0:50 ` Randy Jennings 0 siblings, 0 replies; 12+ messages in thread From: Randy Jennings @ 2026-05-15 0:50 UTC (permalink / raw) To: Mohamed Khalfella Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni, Jens Axboe, Keith Busch, Sagi Grimberg, James Smart, Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme, linux-kernel On Fri, Mar 27, 2026 at 5:45 PM Mohamed Khalfella <mkhalfella@purestorage.com> wrote: > > Send an AEN to initiator when impacted controller exists. The > notification points to CCR log page that initiator can read to check > which CCR operation completed. > > Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> > Reviewed-by: Hannes Reinecke <hare@suse.de> > Reviewed-by: Sagi Grimberg <sagi@grimberg.me> > +static void nvmet_add_async_event_locked(struct nvmet_ctrl *ctrl, u8 event_type, > u8 event_info, u8 log_page) > { > struct nvmet_async_event *aen; > @@ -218,13 +218,19 @@ void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type, > aen->event_info = event_info; > aen->log_page = log_page; Please add lockdep_assert_held(&ctrl->lock); Looking at the rest of the code, this should go directly under the local variable definitions. > > - mutex_lock(&ctrl->lock); > list_add_tail(&aen->entry, &ctrl->async_events); > - mutex_unlock(&ctrl->lock); Sincerely, Randy Jennings ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20260328004518.1729186-7-mkhalfella@purestorage.com>]
* Re: [PATCH v4 06/15] nvme: Rapid Path Failure Recovery read controller identify fields [not found] ` <20260328004518.1729186-7-mkhalfella@purestorage.com> @ 2026-05-15 2:03 ` Randy Jennings 0 siblings, 0 replies; 12+ messages in thread From: Randy Jennings @ 2026-05-15 2:03 UTC (permalink / raw) To: Mohamed Khalfella Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni, Jens Axboe, Keith Busch, Sagi Grimberg, James Smart, Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme, linux-kernel On Fri, Mar 27, 2026 at 5:45 PM Mohamed Khalfella <mkhalfella@purestorage.com> wrote: > > TP8028 Rapid path failure added new fileds to controller identify > response. Read CIU (Controller Instance Uniquifier), CIRN (Controller > Instance Random Number), and CCRL (Cross-Controller Reset Limit) from > controller identify response. Expose CIU and CIRN as sysfs attributes > so the values can be used directrly by user if needed. > > Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> > Reviewed-by: Hannes Reinecke <hare@suse.de> > Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Randy Jennings <randyj@purestorage.com> ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20260328004518.1729186-8-mkhalfella@purestorage.com>]
* Re: [PATCH v4 07/15] nvme: Introduce FENCING and FENCED controller states [not found] ` <20260328004518.1729186-8-mkhalfella@purestorage.com> @ 2026-05-15 2:06 ` Randy Jennings 0 siblings, 0 replies; 12+ messages in thread From: Randy Jennings @ 2026-05-15 2:06 UTC (permalink / raw) To: Mohamed Khalfella Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni, Jens Axboe, Keith Busch, Sagi Grimberg, James Smart, Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme, linux-kernel On Fri, Mar 27, 2026 at 5:46 PM Mohamed Khalfella <mkhalfella@purestorage.com> wrote: > > FENCING is a new controller state that a LIVE controller enters when an > error is encountered. While in FENCING state, inflight IOs that timeout > are not canceled because they should be held until either CCR succeeds > or time-based recovery completes. While the queues remain alive, requests > are not allowed to be sent in this state, and the controller cannot be > reset or deleted. This is intentional because resetting or deleting the > controller results in canceling inflight IOs. > > FENCED is a short-term state the controller enters before it is reset. > It exists only to prevent manual resets from happening while controller > is in FENCING state. > > Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> Reviewed-by: Randy Jennings <randyj@purestorage.com> ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20260328004518.1729186-2-mkhalfella@purestorage.com>]
* Re: [PATCH v4 01/15] nvmet: Rapid Path Failure Recovery set controller identify fields [not found] ` <20260328004518.1729186-2-mkhalfella@purestorage.com> @ 2026-05-15 2:08 ` Randy Jennings 0 siblings, 0 replies; 12+ messages in thread From: Randy Jennings @ 2026-05-15 2:08 UTC (permalink / raw) To: Mohamed Khalfella Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni, Jens Axboe, Keith Busch, Sagi Grimberg, James Smart, Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme, linux-kernel On Fri, Mar 27, 2026 at 5:45 PM Mohamed Khalfella <mkhalfella@purestorage.com> wrote: > > TP8028 Rapid Path Failure Recovery defined new fields in controller > identify response. The newly defined fields are: > > Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> Reviewed-by: Randy Jennings <randyj@purestorage.com> ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20260328004518.1729186-9-mkhalfella@purestorage.com>]
* Re: [PATCH v4 08/15] nvme: Implement cross-controller reset recovery [not found] ` <20260328004518.1729186-9-mkhalfella@purestorage.com> @ 2026-05-15 2:32 ` Randy Jennings 0 siblings, 0 replies; 12+ messages in thread From: Randy Jennings @ 2026-05-15 2:32 UTC (permalink / raw) To: Mohamed Khalfella Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni, Jens Axboe, Keith Busch, Sagi Grimberg, James Smart, Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme, linux-kernel On Fri, Mar 27, 2026 at 5:46 PM Mohamed Khalfella <mkhalfella@purestorage.com> wrote: > > A host that has more than one path connecting to an nvme subsystem > typically has an nvme controller associated with every path. This is > mostly applicable to nvmeof. If one path goes down, inflight IOs on that > path should not be retried immediately on another path because this > could lead to data corruption as described in TP4129. TP8028 defines > cross-controller reset mechanism that can be used by host to terminate > IOs on the failed path using one of the remaining healthy paths. Only > after IOs are terminated, or long enough time passes as defined by > TP4129, inflight IOs should be retried on another path. Implement core > cross-controller reset shared logic to be used by the transports. > > Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> > + now = jiffies; > + if (time_before(now, deadline)) > + tmo = min_t(unsigned long, > + secs_to_jiffies(ictrl->kato), deadline - now); At LSF last week, we talked about skipping logic here and just using the rest of the time (deadline - now). The arguments for shortening the time were dubious, so simpler logic seemed better. > + > + if (!wait_for_completion_timeout(&ccr.complete, tmo)) { > + ret = -ETIMEDOUT; > + goto out; > + } Sincerely, Randy Jennings ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20260328004518.1729186-10-mkhalfella@purestorage.com>]
* Re: [PATCH v4 09/15] nvme: Implement cross-controller reset completion [not found] ` <20260328004518.1729186-10-mkhalfella@purestorage.com> @ 2026-05-15 2:47 ` Randy Jennings [not found] ` <73a9c0e2-ecd0-4170-8723-259529617ec0@suse.de> 1 sibling, 0 replies; 12+ messages in thread From: Randy Jennings @ 2026-05-15 2:47 UTC (permalink / raw) To: Mohamed Khalfella Cc: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni, Jens Axboe, Keith Busch, Sagi Grimberg, James Smart, Hannes Reinecke, Aaron Dailey, Dhaval Giani, linux-nvme, linux-kernel On Fri, Mar 27, 2026 at 5:46 PM Mohamed Khalfella <mkhalfella@purestorage.com> wrote: > > An nvme source controller that issues CCR command expects to receive an > NVME_AER_NOTICE_CCR_COMPLETED when pending CCR succeeds or fails. Add > sctrl->ccr_work to read NVME_LOG_CCR logpage and wakeup any thread > waiting on CCR completion. > > Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> > --- > + ret = nvme_get_log(ctrl, 0, NVME_LOG_CCR, 0x01, > + 0x00, log, sizeof(*log), 0); The 0x01 is the log specific parameter. 0x01 sets the Remove Completed flag. Please explain that in a comment (if not assigning it to a well named local/ defining a constant). Sincerely, Randy Jennings ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <73a9c0e2-ecd0-4170-8723-259529617ec0@suse.de>]
[parent not found: <20260331165510.GD2861-mkhalfella@purestorage.com>]
[parent not found: <019cf04f-8988-46fd-aecd-0f77ac5f8b8a@suse.de>]
[parent not found: <20260407190940.GF2861-mkhalfella@purestorage.com>]
* Re: [PATCH v4 09/15] nvme: Implement cross-controller reset completion [not found] ` <20260407190940.GF2861-mkhalfella@purestorage.com> @ 2026-05-15 2:49 ` Randy Jennings 0 siblings, 0 replies; 12+ messages in thread From: Randy Jennings @ 2026-05-15 2:49 UTC (permalink / raw) To: Mohamed Khalfella Cc: Hannes Reinecke, Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni, Jens Axboe, Keith Busch, Sagi Grimberg, James Smart, Aaron Dailey, Dhaval Giani, linux-nvme, linux-kernel On Tue, Apr 7, 2026 at 12:09 PM Mohamed Khalfella <mkhalfella@purestorage.com> wrote: > > On Tue 2026-04-07 07:48:50 +0200, Hannes Reinecke wrote: > > On 3/31/26 18:55, Mohamed Khalfella wrote: > > > On Mon 2026-03-30 12:53:07 +0200, Hannes Reinecke wrote: > > >> On 3/28/26 01:43, Mohamed Khalfella wrote: > > >>> An nvme source controller that issues CCR command expects to receive an > > >>> NVME_AER_NOTICE_CCR_COMPLETED when pending CCR succeeds or fails. Add > > >>> sctrl->ccr_work to read NVME_LOG_CCR logpage and wakeup any thread > > >>> waiting on CCR completion. > > >>> > > >>> Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> > > >>> --- > > >>> drivers/nvme/host/core.c | 49 +++++++++++++++++++++++++++++++++++++++- > > >>> drivers/nvme/host/nvme.h | 1 + > > >>> 2 files changed, 49 insertions(+), 1 deletion(-) > > >>> > > >>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c > > >>> index 5603ae36444f..793f203bfc38 100644 > > >>> --- a/drivers/nvme/host/core.c > > >>> +++ b/drivers/nvme/host/core.c > > >>> @@ -1920,7 +1920,8 @@ EXPORT_SYMBOL_GPL(nvme_set_queue_count); > > >>> > > >>> #define NVME_AEN_SUPPORTED \ > > >>> (NVME_AEN_CFG_NS_ATTR | NVME_AEN_CFG_FW_ACT | \ > > >>> - NVME_AEN_CFG_ANA_CHANGE | NVME_AEN_CFG_DISC_CHANGE) > > >>> + NVME_AEN_CFG_ANA_CHANGE | NVME_AEN_CFG_CCR_COMPLETE | \ > > >>> + NVME_AEN_CFG_DISC_CHANGE) > > >>> > > >>> static void nvme_enable_aen(struct nvme_ctrl *ctrl) > > >>> { > > >>> @@ -4873,6 +4874,47 @@ static void nvme_get_fw_slot_info(struct nvme_ctrl *ctrl) > > >>> kfree(log); > > >>> } > > >>> > > >>> +static void nvme_ccr_work(struct work_struct *work) > > >>> +{ > > >>> + struct nvme_ctrl *ctrl = container_of(work, struct nvme_ctrl, ccr_work); > > >>> + struct nvme_ccr_entry *ccr; > > >>> + struct nvme_ccr_log_entry *entry; > > >>> + struct nvme_ccr_log *log; > > >>> + unsigned long flags; > > >>> + int ret, i; > > >>> + > > >>> + log = kmalloc(sizeof(*log), GFP_KERNEL); > > >>> + if (!log) > > >>> + return; > > >>> + > > >>> + ret = nvme_get_log(ctrl, 0, NVME_LOG_CCR, 0x01, > > >>> + 0x00, log, sizeof(*log), 0); > > >>> + if (ret) > > >>> + goto out; > > >>> + > > >>> + spin_lock_irqsave(&ctrl->lock, flags); > > >>> + for (i = 0; i < le16_to_cpu(log->ne); i++) { > > >>> + entry = &log->entries[i]; > > >>> + if (entry->ccrs == NVME_CCR_STATUS_IN_PROGRESS) > > >>> + continue; > > >>> + > > >>> + list_for_each_entry(ccr, &ctrl->ccr_list, list) { > > >>> + struct nvme_ctrl *ictrl = ccr->ictrl; > > >>> + > > >>> + if (ictrl->cntlid != le16_to_cpu(entry->icid) || > > >>> + ictrl->ciu != entry->ciu) > > >>> + continue; > > >>> + > > >>> + /* Complete matching entry */ > > >>> + ccr->ccrs = entry->ccrs; > > >>> + complete(&ccr->complete); > > >>> + } > > >>> + } > > >>> + spin_unlock_irqrestore(&ctrl->lock, flags); > > >>> +out: > > >>> + kfree(log); > > >>> +} > > >>> + > > >>> static void nvme_fw_act_work(struct work_struct *work) > > >>> { > > >>> struct nvme_ctrl *ctrl = container_of(work, > > >>> @@ -4949,6 +4991,9 @@ static bool nvme_handle_aen_notice(struct nvme_ctrl *ctrl, u32 result) > > >>> case NVME_AER_NOTICE_DISC_CHANGED: > > >>> ctrl->aen_result = result; > > >>> break; > > >>> + case NVME_AER_NOTICE_CCR_COMPLETED: > > >>> + queue_work(nvme_wq, &ctrl->ccr_work); > > >>> + break; > > >>> default: > > >>> dev_warn(ctrl->device, "async event result %08x\n", result); > > >>> } > > >>> @@ -5144,6 +5189,7 @@ void nvme_stop_ctrl(struct nvme_ctrl *ctrl) > > >>> nvme_stop_failfast_work(ctrl); > > >>> flush_work(&ctrl->async_event_work); > > >>> cancel_work_sync(&ctrl->fw_act_work); > > >>> + cancel_work_sync(&ctrl->ccr_work); > > >>> if (ctrl->ops->stop_ctrl) > > >>> ctrl->ops->stop_ctrl(ctrl); > > >>> } > > >>> @@ -5267,6 +5313,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev, > > >>> ctrl->quirks = quirks; > > >>> ctrl->numa_node = NUMA_NO_NODE; > > >>> INIT_WORK(&ctrl->scan_work, nvme_scan_work); > > >>> + INIT_WORK(&ctrl->ccr_work, nvme_ccr_work); > > >>> INIT_WORK(&ctrl->async_event_work, nvme_async_event_work); > > >>> INIT_WORK(&ctrl->fw_act_work, nvme_fw_act_work); > > >>> INIT_WORK(&ctrl->delete_work, nvme_delete_ctrl_work); > > >>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h > > >>> index f2bcff9ccd25..776ee8aa5a93 100644 > > >>> --- a/drivers/nvme/host/nvme.h > > >>> +++ b/drivers/nvme/host/nvme.h > > >>> @@ -419,6 +419,7 @@ struct nvme_ctrl { > > >>> struct nvme_effects_log *effects; > > >>> struct xarray cels; > > >>> struct work_struct scan_work; > > >>> + struct work_struct ccr_work; > > >>> struct work_struct async_event_work; > > >>> struct delayed_work ka_work; > > >>> struct delayed_work failfast_work; > > >> > > >> Hmm. The 'nvme_fence_ctrl' operation introduced in the previous patch > > >> is synchronous, yet in this patch we're looking a a log page to figure > > >> out if the cross-controller reset is complete. > > >> Which is slightly irritating. > > >> Wouldn't it be better to make the 'nvme_fence_ctrl' operation > > >> asynchronous, and then have a separate function to wait for the fence > > >> operation to complete (which then could look at log pages etc)? > > > > > > True nvme_fence_ctrl() is synchronous, but it runs in from ctrl->fencing_work. > > > What is it that you find irritating about nvme_fence_ctrl()? > > > > > > > Thins is, in order to make nvme_fence_ctrl() synchronous we have to > > wait for the operation itself (which is asynchronous) to complete. > > And that wait in itself is implemented by a wait queue. > > So we're having a wait queue calling nvme_fence_ctrl(), which calls > > another wait queue waiting for a completion. > > And then (if the IRS bit is not set) calling another waitqueue for > > checking the log page. > > There is no point of checking the CCR logpage before getting AEN. Sure > we can implement some sort of polling, but I do not think this is the > right approach. > > > > > I think we could simplify this by simply making nvme_fence_ctrl() > > asynchronous, which could do away with all the workqueue handling. > > I am not sure I understand exactly how nvme_fence_ctrl() can be make > asynchronous. Can you provide example code? I was able to convince Hannes at LSF to withdraw the request for an asynchronous structure for the code. Sincerely, Randy Jennings ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2026-05-15 2:49 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20260328004518.1729186-1-mkhalfella@purestorage.com>
2026-05-12 21:40 ` [PATCH v4 00/15] TP8028 Rapid Path Failure Recovery Mohamed Khalfella
2026-05-12 22:02 ` Sagi Grimberg
[not found] ` <20260328004518.1729186-3-mkhalfella@purestorage.com>
2026-05-14 23:42 ` [PATCH v4 02/15] nvmet/debugfs: Export controller CIU and CIRN via debugfs Randy Jennings
[not found] ` <20260328004518.1729186-4-mkhalfella@purestorage.com>
2026-05-15 0:18 ` [PATCH v4 03/15] nvmet: Implement CCR nvme command Randy Jennings
[not found] ` <20260328004518.1729186-5-mkhalfella@purestorage.com>
2026-05-15 0:38 ` [PATCH v4 04/15] nvmet: Implement CCR logpage Randy Jennings
[not found] ` <20260328004518.1729186-6-mkhalfella@purestorage.com>
2026-05-15 0:50 ` [PATCH v4 05/15] nvmet: Send an AEN on CCR completion Randy Jennings
[not found] ` <20260328004518.1729186-7-mkhalfella@purestorage.com>
2026-05-15 2:03 ` [PATCH v4 06/15] nvme: Rapid Path Failure Recovery read controller identify fields Randy Jennings
[not found] ` <20260328004518.1729186-8-mkhalfella@purestorage.com>
2026-05-15 2:06 ` [PATCH v4 07/15] nvme: Introduce FENCING and FENCED controller states Randy Jennings
[not found] ` <20260328004518.1729186-2-mkhalfella@purestorage.com>
2026-05-15 2:08 ` [PATCH v4 01/15] nvmet: Rapid Path Failure Recovery set controller identify fields Randy Jennings
[not found] ` <20260328004518.1729186-9-mkhalfella@purestorage.com>
2026-05-15 2:32 ` [PATCH v4 08/15] nvme: Implement cross-controller reset recovery Randy Jennings
[not found] ` <20260328004518.1729186-10-mkhalfella@purestorage.com>
2026-05-15 2:47 ` [PATCH v4 09/15] nvme: Implement cross-controller reset completion Randy Jennings
[not found] ` <73a9c0e2-ecd0-4170-8723-259529617ec0@suse.de>
[not found] ` <20260331165510.GD2861-mkhalfella@purestorage.com>
[not found] ` <019cf04f-8988-46fd-aecd-0f77ac5f8b8a@suse.de>
[not found] ` <20260407190940.GF2861-mkhalfella@purestorage.com>
2026-05-15 2:49 ` Randy Jennings
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox