Date: Tue, 12 May 2026 14:40:40 -0700
From: Mohamed Khalfella
To: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni,
	Jens Axboe, Keith Busch, Sagi Grimberg, James Smart, Hannes Reinecke
Cc: Aaron Dailey, Randy Jennings, Dhaval Giani,
	linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 00/15] TP8028 Rapid Path Failure Recovery
Message-ID: <20260512214040.GI10532-mkhalfella@purestorage.com>
References: <20260328004518.1729186-1-mkhalfella@purestorage.com>
In-Reply-To: <20260328004518.1729186-1-mkhalfella@purestorage.com>

On Fri 2026-03-27 17:43:31 -0700, Mohamed Khalfella wrote:
> This patchset adds support for TP8028 Rapid Path Failure Recovery for
> both nvme target and initiator. Rapid Path Failure Recovery brings
> Cross-Controller Reset (CCR) functionality to nvme. This allows nvme
> host to send an nvme command to a source nvme controller to reset
> the impacted nvme controller, provided that both source and impacted
> controllers are in the same nvme subsystem.
>
> The main use of CCR is when one path to the nvme subsystem fails.
> Inflight IOs on impacted nvme controller need to be terminated first
> before they can be retried on another path. Otherwise data corruption
> may happen.
> CCR provides a quick way to terminate these IOs on the
> unreachable nvme controller allowing recovery to move quickly avoiding
> unnecessary delays. If CCR is not possible, inflight requests
> are held for the duration defined by TP4129 KATO Corrections and
> Clarifications before they are allowed to be retried.
>
>
> On the target side:
>
> * New struct members have been added to support CCR. struct nvme_id_ctrl
>   has been updated with CIU (Controller Instance Uniquifier), CIRN
>   (Controller Instance Random Number), and CQT (Command Quiesce Time).
>   The combination of CIU, CNTLID, and CIRN is used to identify impacted
>   controller in CCR command.
>
> * CCR nvme command implemented on the target causes impacted controller
>   to fail and drop connections to host.
>
> * CCR logpage contains the status of pending CCR requests. An entry is
>   added to the logpage after CCR request is validated. Completed CCR
>   requests are removed from the logpage when controller becomes ready or
>   when requested in get logpage command.
>
> * An AEN is sent when CCR completes to let the host know that it is safe
>   to retry inflight requests.
>
>
> On the host side:
>
> * CIU, CIRN, and CQT have been added to struct nvme_ctrl. CIU and CIRN
>   have been added to sysfs to make the values visible to the user.
>   CIU and CIRN can be used to construct and manually send admin-passthru
>   CCR commands.
>
> * New controller states FENCING and FENCED have been added to make sure
>   that inflight requests do not get canceled if they time out during
>   fencing process. FENCED exists so that controller state machine does
>   not have a transition from FENCING to RESETTING. Instead FENCING ->
>   FENCED -> RESETTING. This prevents a controller being fenced from
>   getting reset. Only after fencing finishes is the impacted controller
>   reset.
>
> * Controller recovery in nvme_fence_ctrl() is invoked when LIVE
>   controller hits an error or when a request times out.
>   CCR is attempted
>   first to reset impacted controller. If it fails then inflight requests
>   are held until it is safe to retry them.
>
> * Updated nvme fabric transports nvme-tcp, nvme-rdma, and nvme-fc to
>   use CCR recovery.
>
>
> Ideally all inflight requests should be held during controller recovery
> and only retried after recovery is done. However, there are known
> situations where that is not the case in this implementation. These gaps
> will be addressed in future patches:
>
> * Manual controller reset from sysfs will result in controller going to
>   RESETTING state and all inflight requests being canceled immediately,
>   after which they may be retried on another path.
>
> * Manual controller delete from sysfs will also result in all inflight
>   requests being canceled immediately, after which they may be retried
>   on another path.
>
> * In nvme-fc, nvme controller will be deleted if remote port disappears
>   with no timeout specified. This results in immediate cancellation of
>   requests that may be retried on another path.
>
> * In nvme-rdma if HCA is removed all nvme controllers will be deleted.
>   This results in canceling inflight IOs, which may then be retried
>   on another path.
>
>
> Changes from v3:
> - nvmet: Implement CCR nvme command
>   - Fixed a bug in the order of members of struct nvme_cross_ctrl_reset_cmd
>   - Use kmalloc_obj() instead of kmalloc()
>
> - nvme: Implement cross-controller reset recovery
>   - Now that CQT has been removed, updated nvme_fence_ctrl() to return
>     success or failure instead of remaining time.
>   - Updated nvme_issue_wait_ccr() to respect deadline set in
>     nvme_fence_ctrl().

v4 dropped the CQT patches in order to focus on CCR. However, I have come
to the understanding that we need to bring the CQT patches back. The plan
is for v5 to be similar to v3, plus the minor fixes that came in v4.

Sagi - Does this sound good to you?

>
> - nvme-tcp: Use CCR to recover controller that hits an error
> - nvme-rdma: Use CCR to recover controller that hits an error
>   - Updated log nvme_fence_ctrl() return value
>
> - nvme-fc: Refactor IO error recovery
>   - Updated the commit message
>   - Updated nvme_fc_start_ioerr_recovery() to handle
>     CONNECTING case first.
>
> - nvme-fc: Use CCR to recover controller that hits an error
>   - Updated log nvme_fence_ctrl() return value
>
> - nvmet: Add support for CQT to nvme target
> - nvme: Add support for CQT to nvme host
> - nvme: Update CCR completion wait timeout to consider CQT
> - nvme-tcp: Extend FENCING state per TP4129 on CCR failure
> - nvme-rdma: Extend FENCING state per TP4129 on CCR failure
> - nvme-fc: Extend FENCING state per TP4129 on CCR failure
>   - Dropped CQT patches
>
>
> v3: https://lore.kernel.org/all/20260214042753.4073668-1-mkhalfella@purestorage.com/
>
> *** BLURB HERE ***
>
>
> Mohamed Khalfella (15):
>   nvmet: Rapid Path Failure Recovery set controller identify fields
>   nvmet/debugfs: Export controller CIU and CIRN via debugfs
>   nvmet: Implement CCR nvme command
>   nvmet: Implement CCR logpage
>   nvmet: Send an AEN on CCR completion
>   nvme: Rapid Path Failure Recovery read controller identify fields
>   nvme: Introduce FENCING and FENCED controller states
>   nvme: Implement cross-controller reset recovery
>   nvme: Implement cross-controller reset completion
>   nvme-tcp: Use CCR to recover controller that hits an error
>   nvme-rdma: Use CCR to recover controller that hits an error
>   nvme-fc: Refactor IO error recovery
>   nvme-fc: Use CCR to recover controller that hits an error
>   nvme-fc: Hold inflight requests while in FENCING state
>   nvme-fc: Do not cancel requests in io taget before it is initialized
>
>  drivers/nvme/host/constants.c   |   1 +
>  drivers/nvme/host/core.c        | 225 +++++++++++++++++++++++++++++++-
>  drivers/nvme/host/fc.c          | 215 +++++++++++++++++++++---------
>  drivers/nvme/host/nvme.h        |  24 ++++
>  drivers/nvme/host/rdma.c        |  30 ++++-
>  drivers/nvme/host/sysfs.c       |  25 ++++
>  drivers/nvme/host/tcp.c         |  30 ++++-
>  drivers/nvme/target/admin-cmd.c | 123 +++++++++++++++++
>  drivers/nvme/target/core.c      | 110 +++++++++++++++-
>  drivers/nvme/target/debugfs.c   |  21 +++
>  drivers/nvme/target/nvmet.h     |  18 ++-
>  include/linux/nvme.h            |  65 ++++++++-
>  12 files changed, 812 insertions(+), 75 deletions(-)
>
>
> base-commit: dd09eb443372f9390d36051d86ebe06e9919aeec
> --
> 2.52.0
>
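
For anyone following the host-side state discussion quoted above, here is
a minimal user-space sketch of the FENCING -> FENCED -> RESETTING ordering.
It is an illustration only, not the code from the series: the enum, its
members, and sketch_transition_allowed() are hypothetical names, and the
LIVE -> RESETTING edge is an assumption based on the manual sysfs reset
case listed in the known gaps.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified subset of controller states. The names mirror
 * the states mentioned in the cover letter, but this is NOT the kernel's
 * enum nvme_ctrl_state.
 */
enum sketch_ctrl_state {
	SKETCH_LIVE,
	SKETCH_FENCING,
	SKETCH_FENCED,
	SKETCH_RESETTING,
};

/*
 * Return true if old_state -> new_state is allowed in this sketch.
 * FENCING can only move to FENCED, so a controller that is still being
 * fenced can never be reset directly.
 */
static inline bool sketch_transition_allowed(enum sketch_ctrl_state old_state,
					     enum sketch_ctrl_state new_state)
{
	switch (new_state) {
	case SKETCH_FENCING:
		/* Fencing starts from a LIVE controller that hit an error. */
		return old_state == SKETCH_LIVE;
	case SKETCH_FENCED:
		/* Fencing finished, either via CCR or by waiting it out. */
		return old_state == SKETCH_FENCING;
	case SKETCH_RESETTING:
		/* Assumption: LIVE may still be reset directly (manual reset). */
		return old_state == SKETCH_LIVE || old_state == SKETCH_FENCED;
	default:
		return false;
	}
}
```

With this table, a request for FENCING -> RESETTING is refused, while
FENCED -> RESETTING is accepted, which matches the ordering the cover
letter describes.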