From: Mohamed Khalfella <mkhalfella@purestorage.com>
To: Hannes Reinecke <hare@suse.de>
Cc: Justin Tee <justin.tee@broadcom.com>,
Naresh Gottumukkala <nareshgottumukkala83@gmail.com>,
Paul Ely <paul.ely@broadcom.com>,
Chaitanya Kulkarni <kch@nvidia.com>,
Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
Keith Busch <kbusch@kernel.org>, Sagi Grimberg <sagi@grimberg.me>,
James Smart <jsmart833426@gmail.com>,
Aaron Dailey <adailey@purestorage.com>,
Randy Jennings <randyj@purestorage.com>,
Dhaval Giani <dgiani@purestorage.com>,
linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 09/21] nvme: Implement cross-controller reset completion
Date: Tue, 17 Feb 2026 10:25:54 -0800
Message-ID: <20260217182554.GE3435530-mkhalfella@purestorage.com>
In-Reply-To: <3b21ccbd-7948-436b-8faa-a5541c65946a@suse.de>
On Mon 2026-02-16 13:43:51 +0100, Hannes Reinecke wrote:
> On 2/14/26 05:25, Mohamed Khalfella wrote:
> > An nvme source controller that issues CCR command expects to receive an
> > NVME_AER_NOTICE_CCR_COMPLETED when pending CCR succeeds or fails. Add
> > sctrl->ccr_work to read NVME_LOG_CCR logpage and wakeup any thread
> > waiting on CCR completion.
> >
> > Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
> > ---
> > drivers/nvme/host/core.c | 49 +++++++++++++++++++++++++++++++++++++++-
> > drivers/nvme/host/nvme.h | 1 +
> > 2 files changed, 49 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index 765b1524b3ed..a9fcde1b411b 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -1916,7 +1916,8 @@ EXPORT_SYMBOL_GPL(nvme_set_queue_count);
> >
> >  #define NVME_AEN_SUPPORTED \
> >  	(NVME_AEN_CFG_NS_ATTR | NVME_AEN_CFG_FW_ACT | \
> > -	 NVME_AEN_CFG_ANA_CHANGE | NVME_AEN_CFG_DISC_CHANGE)
> > +	 NVME_AEN_CFG_ANA_CHANGE | NVME_AEN_CFG_CCR_COMPLETE | \
> > +	 NVME_AEN_CFG_DISC_CHANGE)
> >
> >  static void nvme_enable_aen(struct nvme_ctrl *ctrl)
> >  {
> > @@ -4880,6 +4881,47 @@ static void nvme_get_fw_slot_info(struct nvme_ctrl *ctrl)
> >  	kfree(log);
> >  }
> >
> > +static void nvme_ccr_work(struct work_struct *work)
> > +{
> > +	struct nvme_ctrl *ctrl = container_of(work, struct nvme_ctrl, ccr_work);
> > +	struct nvme_ccr_entry *ccr;
> > +	struct nvme_ccr_log_entry *entry;
> > +	struct nvme_ccr_log *log;
> > +	unsigned long flags;
> > +	int ret, i;
> > +
> > +	log = kmalloc(sizeof(*log), GFP_KERNEL);
> > +	if (!log)
> > +		return;
> > +
> > +	ret = nvme_get_log(ctrl, 0, NVME_LOG_CCR, 0x01,
> > +			   0x00, log, sizeof(*log), 0);
> > +	if (ret)
> > +		goto out;
> > +
> > +	spin_lock_irqsave(&ctrl->lock, flags);
> > +	for (i = 0; i < le16_to_cpu(log->ne); i++) {
> > +		entry = &log->entries[i];
> > +		if (entry->ccrs == NVME_CCR_STATUS_IN_PROGRESS)
> > +			continue;
> > +
> > +		list_for_each_entry(ccr, &ctrl->ccr_list, list) {
> > +			struct nvme_ctrl *ictrl = ccr->ictrl;
> > +
> > +			if (ictrl->cntlid != le16_to_cpu(entry->icid) ||
> > +			    ictrl->ciu != entry->ciu)
> > +				continue;
> > +
> > +			/* Complete matching entry */
> > +			ccr->ccrs = entry->ccrs;
> > +			complete(&ccr->complete);
> > +		}
> > +	}
> > +	spin_unlock_irqrestore(&ctrl->lock, flags);
> > +out:
> > +	kfree(log);
> > +}
> > +
> >  static void nvme_fw_act_work(struct work_struct *work)
> >  {
> >  	struct nvme_ctrl *ctrl = container_of(work,
> > @@ -4956,6 +4998,9 @@ static bool nvme_handle_aen_notice(struct nvme_ctrl *ctrl, u32 result)
> >  	case NVME_AER_NOTICE_DISC_CHANGED:
> >  		ctrl->aen_result = result;
> >  		break;
> > +	case NVME_AER_NOTICE_CCR_COMPLETED:
> > +		queue_work(nvme_wq, &ctrl->ccr_work);
> > +		break;
> >  	default:
> >  		dev_warn(ctrl->device, "async event result %08x\n", result);
> >  	}
> > @@ -5145,6 +5190,7 @@ void nvme_stop_ctrl(struct nvme_ctrl *ctrl)
> >  	nvme_stop_failfast_work(ctrl);
> >  	flush_work(&ctrl->async_event_work);
> >  	cancel_work_sync(&ctrl->fw_act_work);
> > +	cancel_work_sync(&ctrl->ccr_work);
> >  	if (ctrl->ops->stop_ctrl)
> >  		ctrl->ops->stop_ctrl(ctrl);
> >  }
> > @@ -5268,6 +5314,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
> >  	ctrl->quirks = quirks;
> >  	ctrl->numa_node = NUMA_NO_NODE;
> >  	INIT_WORK(&ctrl->scan_work, nvme_scan_work);
> > +	INIT_WORK(&ctrl->ccr_work, nvme_ccr_work);
> >  	INIT_WORK(&ctrl->async_event_work, nvme_async_event_work);
> >  	INIT_WORK(&ctrl->fw_act_work, nvme_fw_act_work);
> >  	INIT_WORK(&ctrl->delete_work, nvme_delete_ctrl_work);
> > diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> > index f3ab9411cac5..af6a4e83053e 100644
> > --- a/drivers/nvme/host/nvme.h
> > +++ b/drivers/nvme/host/nvme.h
> > @@ -365,6 +365,7 @@ struct nvme_ctrl {
> >  	struct nvme_effects_log *effects;
> >  	struct xarray cels;
> >  	struct work_struct scan_work;
> > +	struct work_struct ccr_work;
> >  	struct work_struct async_event_work;
> >  	struct delayed_work ka_work;
> >  	struct delayed_work failfast_work;
>
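
For anyone following the matching step in nvme_ccr_work() above, here is a
minimal userspace sketch of the same loop. It is only a model: struct
log_entry, struct pending_ccr, match_ccr_log() and the status values are
hypothetical stand-ins, not the kernel definitions, and it leaves out the
ctrl->lock locking and the complete() wakeup.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative status values; the real NVME_CCR_STATUS_* values may differ. */
enum { CCR_STATUS_IN_PROGRESS = 0, CCR_STATUS_SUCCESS = 1 };

/* Hypothetical model of one CCR log page entry. */
struct log_entry {
	uint16_t icid;	/* impacted controller id */
	uint8_t ciu;	/* controller instance uniquifier */
	uint8_t ccrs;	/* CCR status */
};

/* Hypothetical model of one waiter on the source controller's list. */
struct pending_ccr {
	uint16_t cntlid;
	uint8_t ciu;
	uint8_t ccrs;
	bool completed;
};

/*
 * Walk the log entries, skip those still in progress, and mark every
 * pending waiter whose (cntlid, ciu) pair matches. Returns the number
 * of waiters marked completed.
 */
static int match_ccr_log(const struct log_entry *log, size_t ne,
			 struct pending_ccr *pending, size_t np)
{
	int done = 0;

	for (size_t i = 0; i < ne; i++) {
		if (log[i].ccrs == CCR_STATUS_IN_PROGRESS)
			continue;

		for (size_t j = 0; j < np; j++) {
			if (pending[j].cntlid != log[i].icid ||
			    pending[j].ciu != log[i].ciu)
				continue;

			/* Complete matching entry */
			pending[j].ccrs = log[i].ccrs;
			pending[j].completed = true;
			done++;
		}
	}
	return done;
}
```

The point of matching on both fields is that a waiter with the right
controller id but a different CIU is left alone, as is any waiter whose log
entry is still in progress.
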
> We really would need some indicator whether 'ccr' is supported at all.
Why do we need this indicator, other than exporting it via sysfs?
> Using the number of available CCR commands would be an option, if though
> that would require us to keep two counters (one for the number of
> possible outstanding CCRs, and one for the number of actual outstanding
> CCRs.).
As mentioned above, ctrl->ccr_limit gives us the number of CCRs available
right now. It is not a 100% reliable indicator of whether CCR is supported,
but it is enough to implement CCR. A second counter could help us skip
attempting CCR when we know the impacted controller does not support it.

Do you think it is worth it?

Iterating over the controllers in the subsystem is not that bad IMO. This is
similar to the point raised by James Smart [1].

[1] https://lore.kernel.org/all/05875e07-b908-425a-ba6f-5e060e03241e@gmail.com/
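
To make the two-counter idea discussed above concrete, here is a rough
userspace sketch. Everything in it is hypothetical: struct ccr_budget and
the ccr_supported(), ccr_try_get() and ccr_put() helpers are not part of
this series, and the assumption that a maximum of zero means "no CCR
support" is mine. A kernel version would also need to take ctrl->lock (or
use atomics), which the sketch omits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair of counters kept per source controller. */
struct ccr_budget {
	uint8_t max;		/* assumed: max outstanding CCRs, 0 if unsupported */
	uint8_t in_flight;	/* CCRs currently outstanding */
};

/* The first counter alone answers "is CCR supported at all?". */
static bool ccr_supported(const struct ccr_budget *b)
{
	return b->max != 0;
}

/* Reserve one CCR slot; fails if unsupported or if all slots are in use. */
static bool ccr_try_get(struct ccr_budget *b)
{
	if (b->in_flight >= b->max)
		return false;
	b->in_flight++;
	return true;
}

/* Release a slot once the CCR completes or is abandoned. */
static void ccr_put(struct ccr_budget *b)
{
	if (b->in_flight > 0)
		b->in_flight--;
}
```

With a single "available now" counter, a value of zero cannot distinguish
"not supported" from "all slots busy". Keeping the maximum separately is
what lets a caller skip such a controller without trying.
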