From: James Smart <jsmart833426@gmail.com>
To: Mohamed Khalfella <mkhalfella@purestorage.com>,
Hannes Reinecke <hare@suse.de>
Cc: Justin Tee <justin.tee@broadcom.com>,
Naresh Gottumukkala <nareshgottumukkala83@gmail.com>,
Paul Ely <paul.ely@broadcom.com>,
Chaitanya Kulkarni <kch@nvidia.com>,
Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
Keith Busch <kbusch@kernel.org>, Sagi Grimberg <sagi@grimberg.me>,
Aaron Dailey <adailey@purestorage.com>,
Randy Jennings <randyj@purestorage.com>,
Dhaval Giani <dgiani@purestorage.com>,
linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org,
jsmart833426@gmail.com
Subject: Re: [PATCH v3 18/21] nvme: Update CCR completion wait timeout to consider CQT
Date: Thu, 19 Feb 2026 17:22:32 -0800 [thread overview]
Message-ID: <96ea8e82-ead3-498f-bb87-5b2809089950@gmail.com> (raw)
In-Reply-To: <20260217153530.GI2392949-mkhalfella@purestorage.com>
On 2/17/2026 7:35 AM, Mohamed Khalfella wrote:
> On Tue 2026-02-17 08:09:33 +0100, Hannes Reinecke wrote:
>> On 2/16/26 19:45, Mohamed Khalfella wrote:
>>> On Mon 2026-02-16 13:54:18 +0100, Hannes Reinecke wrote:
>>>> On 2/14/26 05:25, Mohamed Khalfella wrote:
>>>>> TP8028 Rapid Path Failure Recovery does not define how much time the
>>>>> host should wait for the CCR operation to complete. It is reasonable
>>>>> to assume that the CCR operation can take up to ctrl->cqt. Update the
>>>>> wait time for the CCR operation to be max(ctrl->cqt, ctrl->kato).
>>>>>
>>>>> Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
>>>>> ---
>>>>> drivers/nvme/host/core.c | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>>>> index 0680d05900c1..ff479c0263ab 100644
>>>>> --- a/drivers/nvme/host/core.c
>>>>> +++ b/drivers/nvme/host/core.c
>>>>> @@ -631,7 +631,7 @@ static int nvme_issue_wait_ccr(struct nvme_ctrl *sctrl, struct nvme_ctrl *ictrl)
>>>>> if (result & 0x01) /* Immediate Reset Successful */
>>>>> goto out;
>>>>>
>>>>> - tmo = secs_to_jiffies(ictrl->kato);
>>>>> + tmo = msecs_to_jiffies(max(ictrl->cqt, ictrl->kato * 1000));
>>>>> if (!wait_for_completion_timeout(&ccr.complete, tmo)) {
>>>>> ret = -ETIMEDOUT;
>>>>> goto out;
>>>>
>>>> That is not my understanding. I was under the impression that CQT is the
>>>> _additional_ time a controller requires to clear out outstanding
>>>> commands once it detected a loss of communication (ie _after_ KATO).
>>>> Which would mean we have to wait for up to
>>>> (ctrl->kato * 1000) + ctrl->cqt.
>>>
>>> At this point the source controller knows about the communication
>>> loss. We do not need the KATO wait. In theory we should just wait for
>>> CQT. max(cqt, kato) is a conservative guess I made.
>>>
>> Not quite. The source controller (on the host!) knows about the
>> communication loss. But the target might not, as the keep-alive
>> command might have arrived at the target _just_ before KATO
>> triggered on the host. So the target is still good, and will
>> be waiting for _another_ KATO interval before declaring
>> a loss of communication.
>> And only then will the CQT period start at the target.
>>
>> Randy, please correct me if I'm wrong ...
>>
>
> wait_for_completion_timeout(&ccr.complete, tmo) waits for the CCR
> operation to complete. The wait starts after the CCR command has
> completed successfully. IOW, it starts after the host received a CQE
> from the source controller on the target telling us all is good. If the
> source controller on the target already knows about the loss of
> communication then there is no need to wait for KATO. We just need to
> wait for the CCR operation to finish because we know it has been
> started successfully.
>
> The spec does not tell us how much time to wait for the CCR operation
> to complete. max(cqt, kato) is an estimate I think is reasonable to make.
So, we've sent CCR, received a CQE for the CCR within KATO (the timeout in
nvme_issue_wait_ccr()), and are now waiting another max(KATO, CQT) for the
I/O to die.

As CQT is the time to wait once the controller is killing the I/O, and as
the response indicates it has certainly passed that point, a minimum of CQT
should be all that is needed. Why are we bringing KATO into the picture?
-- this takes me over to patch 8 and the timeout on the CCR response being
KATO: why is KATO being used? Nothing about getting the response says it is
related to the keep-alive. Keep-alive can move along happily while CCR
hangs out, and it really has nothing to do with KATO.
If using the rationale of keep-alive command processing (round-trip time
plus minimal, prioritized processing): as CCR needs to do more, and as the
spec allows holding the command rather than always returning 1, the timeout
should be KATO+<something>, where <something> is no more than CQT.
But given that KATO can be really long, as it's trying to catch
communication failures, and given that our CCR controller should not have
comm issues, it should be fairly quick. So rather than a 2-minute KATO, why
not 10-15s? This gets a little crazy, as it takes me down paths of why not
fire off multiple CCRs (via different controllers) to the subsystem at
short intervals (the timeout) to finally find one that completes quickly,
and then start CQT. And if nothing completes quickly, bound the whole thing
to fencing start+KATO+CQT?
-- james