From: Mohamed Khalfella <mkhalfella@purestorage.com>
To: James Smart <jsmart833426@gmail.com>
Cc: Justin Tee <justin.tee@broadcom.com>,
Naresh Gottumukkala <nareshgottumukkala83@gmail.com>,
Paul Ely <paul.ely@broadcom.com>,
Chaitanya Kulkarni <kch@nvidia.com>,
Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
Keith Busch <kbusch@kernel.org>, Sagi Grimberg <sagi@grimberg.me>,
Aaron Dailey <adailey@purestorage.com>,
Randy Jennings <randyj@purestorage.com>,
Dhaval Giani <dgiani@purestorage.com>,
Hannes Reinecke <hare@suse.de>,
linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 08/14] nvme: Implement cross-controller reset recovery
Date: Tue, 10 Feb 2026 15:25:53 -0800
Message-ID: <20260210232553.GR3729-mkhalfella@purestorage.com>
In-Reply-To: <5f3c9cf0-7fee-432a-b6c5-44fb2acb0b1d@gmail.com>
On Tue 2026-02-10 14:49:15 -0800, James Smart wrote:
> On 2/10/2026 2:27 PM, Mohamed Khalfella wrote:
> > On Tue 2026-02-10 14:09:27 -0800, James Smart wrote:
> >> On 1/30/2026 2:34 PM, Mohamed Khalfella wrote:
> >> ...
> >>> +unsigned long nvme_fence_ctrl(struct nvme_ctrl *ictrl)
> >>> +{
> >>> + unsigned long deadline, now, timeout;
> >>> + struct nvme_ctrl *sctrl;
> >>> + u32 min_cntlid = 0;
> >>> + int ret;
> >>> +
> >>> + timeout = nvme_fence_timeout_ms(ictrl);
> >>> + dev_info(ictrl->device, "attempting CCR, timeout %lums\n", timeout);
> >>> +
> >>> + now = jiffies;
> >>> + deadline = now + msecs_to_jiffies(timeout);
> >>> + while (time_before(now, deadline)) {
> >>
> >> Q: don't we have something to identify that the controller's subsystem
> >> supports CCR before we start selecting controllers and sending CCR?
> >>
> >> I would think on older devices that don't support it we should be
> >> skipping this loop. The loop could delay the Time-Based delay without
> >> any CCR.
> >
> > I do not think we have something that identifies CCR support at the
> > subsystem level. The spec defines CCRL at the controller level. The loop
> > should not be that bad. nvme_find_ctrl_ccr() should return NULL if CCR is
> > not supported and nvme_fence_ctrl() will return immediately.
> >
> >>
> >> -- james
> >>
>
> I would think CCRL on the failed controller would be enough to assume
> the subsystem supports it.

ictrl->ccr_limit is a good indication that the subsystem supports CCR. I
do not think it is enough though, for two reasons:

- Maybe this controller does not support CCR but others on the same
  subsystem do. Nothing prevents a subsystem from putting a cap on CCR
  at the subsystem level.
- Maybe this controller supports the CCR command but cannot accept one
  right now because all CCR slots are in use. This can happen in the
  case of a cascading failure.

>
> I'm not worried that the coding on the host is so bad. It's more the
> multiple paths that must have cmds sent to them and getting error
> responses for unknown cmds (should be responded to ok, but you never
> know) as well as creating conditions for other errors where there will
> be no return for it - e.g. other paths losing connectivity while the CCR
> is outstanding, etc. Yes, they all have to work, but why bother adding
> these flows to an old controller that would never do CCR?

If nvme_find_ctrl_ccr() returns a source controller to use, then we know
that controller supports CCR and has an available slot to process this
CCR request. I do not see how this code would send a CCR request to an
old controller that does not know about the CCR command.

I am not fully opposed to using ictrl->ccr_limit to return early, but I
do not see the need for it. If you feel strongly about it, I can update
nvme_fence_ctrl() to do so.
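
To make the trade-off concrete, here is a rough userspace model of the two
checks being discussed. It is not the patch code: struct model_ctrl, its
fields (including the host-side ccr_inuse counter), and both function names
are made up for illustration, and the real nvme_find_ctrl_ccr() may select
controllers differently. It only shows that a lookup which skips controllers
without CCR support or without a free slot never picks an old controller,
and what an early return on ictrl->ccr_limit would change.

```c
#include <stdbool.h>
#include <stddef.h>

/* Made-up stand-in for struct nvme_ctrl; only what the model needs. */
struct model_ctrl {
	bool live;              /* controller is connected and usable */
	unsigned int ccr_limit; /* 0 means CCR is not supported */
	unsigned int ccr_inuse; /* assumed count of busy CCR slots */
};

/*
 * Return a controller other than ictrl that could act as the CCR source,
 * or NULL if no controller in the array qualifies.
 */
struct model_ctrl *model_find_ctrl_ccr(struct model_ctrl *ctrls, size_t n,
				       const struct model_ctrl *ictrl)
{
	for (size_t i = 0; i < n; i++) {
		struct model_ctrl *c = &ctrls[i];

		if (c == ictrl || !c->live)
			continue;
		if (c->ccr_limit == 0)            /* no CCR support */
			continue;
		if (c->ccr_inuse >= c->ccr_limit) /* all slots busy */
			continue;
		return c;
	}
	return NULL;
}

/*
 * Would a CCR be attempted for ictrl? With early_return set, the impacted
 * controller's own ccr_limit is checked first and the lookup is skipped
 * when it is zero.
 */
bool model_would_try_ccr(struct model_ctrl *ctrls, size_t n,
			 const struct model_ctrl *ictrl, bool early_return)
{
	if (early_return && ictrl->ccr_limit == 0)
		return false;
	return model_find_ctrl_ccr(ctrls, n, ictrl) != NULL;
}
```

In this model, a subsystem made only of controllers with ccr_limit == 0 never
yields a source controller, with or without the early return. The two
variants differ only when the impacted controller reports ccr_limit == 0
while another controller on the subsystem has a free CCR slot.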