public inbox for linux-nvme@lists.infradead.org
 help / color / mirror / Atom feed
From: Mohamed Khalfella <mkhalfella@purestorage.com>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: Chaitanya Kulkarni <kch@nvidia.com>,
	Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
	Keith Busch <kbusch@kernel.org>,
	Aaron Dailey <adailey@purestorage.com>,
	Randy Jennings <randyj@purestorage.com>,
	John Meneghini <jmeneghi@redhat.com>,
	Hannes Reinecke <hare@suse.de>,
	linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 05/14] nvmet: Send an AEN on CCR completion
Date: Thu, 25 Dec 2025 10:13:13 -0800	[thread overview]
Message-ID: <20251225181313.GC8129-mkhalfella@purestorage.com> (raw)
In-Reply-To: <13c2bf9c-6eac-4c48-b12c-f76e86c8ef32@grimberg.me>

On Thu 2025-12-25 15:23:51 +0200, Sagi Grimberg wrote:
> 
> 
> On 26/11/2025 4:11, Mohamed Khalfella wrote:
> > Send an AEN to initiator when impacted controller exists. The
> > notification points to CCR log page that initiator can read to check
> > which CCR operation completed.
> >
> > Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
> > ---
> >   drivers/nvme/target/core.c  | 27 +++++++++++++++++++++++----
> >   drivers/nvme/target/nvmet.h |  3 ++-
> >   include/linux/nvme.h        |  3 +++
> >   3 files changed, 28 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
> > index 7dbe9255ff42..60173833c3eb 100644
> > --- a/drivers/nvme/target/core.c
> > +++ b/drivers/nvme/target/core.c
> > @@ -202,7 +202,7 @@ static void nvmet_async_event_work(struct work_struct *work)
> >   	nvmet_async_events_process(ctrl);
> >   }
> >   
> > -void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type,
> > +static void nvmet_add_async_event_locked(struct nvmet_ctrl *ctrl, u8 event_type,
> >   		u8 event_info, u8 log_page)
> >   {
> >   	struct nvmet_async_event *aen;
> > @@ -215,12 +215,17 @@ void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type,
> >   	aen->event_info = event_info;
> >   	aen->log_page = log_page;
> >   
> > -	mutex_lock(&ctrl->lock);
> >   	list_add_tail(&aen->entry, &ctrl->async_events);
> > -	mutex_unlock(&ctrl->lock);
> >   
> >   	queue_work(nvmet_wq, &ctrl->async_event_work);
> >   }
> > +void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type,
> > +		u8 event_info, u8 log_page)
> > +{
> > +	mutex_lock(&ctrl->lock);
> > +	nvmet_add_async_event_locked(ctrl, event_type, event_info, log_page);
> > +	mutex_unlock(&ctrl->lock);
> > +}
> >   
> >   static void nvmet_add_to_changed_ns_log(struct nvmet_ctrl *ctrl, __le32 nsid)
> >   {
> > @@ -1788,6 +1793,18 @@ struct nvmet_ctrl *nvmet_alloc_ctrl(struct nvmet_alloc_ctrl_args *args)
> >   }
> >   EXPORT_SYMBOL_GPL(nvmet_alloc_ctrl);
> >   
> > +static void nvmet_ctrl_notify_ccr(struct nvmet_ctrl *ctrl)
> > +{
> > +	lockdep_assert_held(&ctrl->lock);
> > +
> > +	if (nvmet_aen_bit_disabled(ctrl, NVME_AEN_BIT_CCR_COMPLETE))
> > +		return;
> > +
> > +	nvmet_add_async_event_locked(ctrl, NVME_AER_NOTICE,
> > +				     NVME_AER_NOTICE_CCR_COMPLETED,
> > +				     NVME_LOG_CCR);
> > +}
> > +
> >   static void nvmet_ctrl_complete_pending_ccr(struct nvmet_ctrl *ctrl)
> >   {
> >   	struct nvmet_subsys *subsys = ctrl->subsys;
> > @@ -1801,8 +1818,10 @@ static void nvmet_ctrl_complete_pending_ccr(struct nvmet_ctrl *ctrl)
> >   	list_for_each_entry(sctrl, &subsys->ctrls, subsys_entry) {
> >   		mutex_lock(&sctrl->lock);
> >   		list_for_each_entry(ccr, &sctrl->ccrs, entry) {
> > -			if (ccr->ctrl == ctrl)
> > +			if (ccr->ctrl == ctrl) {
> > +				nvmet_ctrl_notify_ccr(sctrl);
> >   				ccr->ctrl = NULL;
> > +			}
> 
> Is this double loop necessary? Would you have more than one controller 
> cross resetting the same

As it is implemented now CCRs are linked to sctrl. This decision can be
revisited if found suboptimal. At some point I had CCRs linked to
ctrl->subsys but that led to lock ordering issues. Double loop is
necessary to find all CCRs in all controllers and mark them done.
Yes, it is possible to have more than one sctrl resetting the same
ictrl.

> controller? Won't it be better to install a callback+opaque that the 
> controller removal will call?

Can you elaborate more on that? Better in what terms?

nvmet_ctrl_complete_pending_ccr() is called from nvmet_ctrl_free() when
we know that ctrl->ref is zero and no new CCRs will be added to this
controller because nvmet_ctrl_find_get_ccr() will not be able to get it.


  reply	other threads:[~2025-12-25 18:13 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-26  2:11 [RFC PATCH 00/14] TP8028 Rapid Path Failure Recovery Mohamed Khalfella
2025-11-26  2:11 ` [RFC PATCH 01/14] nvmet: Rapid Path Failure Recovery set controller identify fields Mohamed Khalfella
2025-12-16  1:35   ` Randy Jennings
2025-11-26  2:11 ` [RFC PATCH 02/14] nvmet/debugfs: Add ctrl uniquifier and random values Mohamed Khalfella
2025-12-16  1:43   ` Randy Jennings
2025-11-26  2:11 ` [RFC PATCH 03/14] nvmet: Implement CCR nvme command Mohamed Khalfella
2025-12-16  3:01   ` Randy Jennings
2025-12-31 21:14     ` Mohamed Khalfella
2025-12-25 13:14   ` Sagi Grimberg
2025-12-25 17:33     ` Mohamed Khalfella
2025-12-27  9:39       ` Sagi Grimberg
2025-12-31 21:35         ` Mohamed Khalfella
2025-11-26  2:11 ` [RFC PATCH 04/14] nvmet: Implement CCR logpage Mohamed Khalfella
2025-12-16  3:11   ` Randy Jennings
2025-11-26  2:11 ` [RFC PATCH 05/14] nvmet: Send an AEN on CCR completion Mohamed Khalfella
2025-12-16  3:31   ` Randy Jennings
2025-12-25 13:23   ` Sagi Grimberg
2025-12-25 18:13     ` Mohamed Khalfella [this message]
2025-12-27  9:48       ` Sagi Grimberg
2025-12-31 22:00         ` Mohamed Khalfella
2026-01-04 21:09           ` Sagi Grimberg
2026-01-07  2:58             ` Randy Jennings
2026-01-30 22:31             ` Mohamed Khalfella
2025-11-26  2:11 ` [RFC PATCH 06/14] nvme: Rapid Path Failure Recovery read controller identify fields Mohamed Khalfella
2025-12-18 15:22   ` Randy Jennings
2025-12-31 22:26     ` Mohamed Khalfella
2026-01-02 19:06       ` Mohamed Khalfella
2025-11-26  2:11 ` [RFC PATCH 07/14] nvme: Add RECOVERING nvme controller state Mohamed Khalfella
2025-12-18 23:18   ` Randy Jennings
2025-12-19  1:39     ` Randy Jennings
2025-12-25 13:29   ` Sagi Grimberg
2025-12-25 17:17     ` Mohamed Khalfella
2025-12-27  9:52       ` Sagi Grimberg
2025-12-31 22:45         ` Mohamed Khalfella
2025-12-27  9:55       ` Sagi Grimberg
2025-12-31 22:36         ` Mohamed Khalfella
2025-12-31 23:04           ` Mohamed Khalfella
2025-11-26  2:11 ` [RFC PATCH 08/14] nvme: Implement cross-controller reset recovery Mohamed Khalfella
2025-12-19  1:21   ` Randy Jennings
2025-12-27 10:14   ` Sagi Grimberg
2025-12-31  0:04     ` Randy Jennings
2026-01-04 21:14       ` Sagi Grimberg
2026-01-07  3:16         ` Randy Jennings
2025-12-31 23:43     ` Mohamed Khalfella
2026-01-04 21:39       ` Sagi Grimberg
2026-01-30 22:01         ` Mohamed Khalfella
2025-11-26  2:11 ` [RFC PATCH 09/14] nvme: Implement cross-controller reset completion Mohamed Khalfella
2025-12-19  1:31   ` Randy Jennings
2025-12-27 10:24   ` Sagi Grimberg
2025-12-31 23:51     ` Mohamed Khalfella
2026-01-04 21:15       ` Sagi Grimberg
2026-01-30 22:32         ` Mohamed Khalfella
2025-11-26  2:11 ` [RFC PATCH 10/14] nvme-tcp: Use CCR to recover controller that hits an error Mohamed Khalfella
2025-12-19  2:06   ` Randy Jennings
2026-01-01  0:04     ` Mohamed Khalfella
2025-12-27 10:35   ` Sagi Grimberg
2025-12-31  0:13     ` Randy Jennings
2026-01-04 21:19       ` Sagi Grimberg
2026-01-01  0:27     ` Mohamed Khalfella
2025-11-26  2:11 ` [RFC PATCH 11/14] nvme-rdma: " Mohamed Khalfella
2025-12-19  2:16   ` Randy Jennings
2025-12-27 10:36   ` Sagi Grimberg
2025-11-26  2:11 ` [RFC PATCH 12/14] nvme-fc: Decouple error recovery from controller reset Mohamed Khalfella
2025-12-19  2:59   ` Randy Jennings
2025-11-26  2:12 ` [RFC PATCH 13/14] nvme-fc: Use CCR to recover controller that hits an error Mohamed Khalfella
2025-12-20  1:21   ` Randy Jennings
2025-11-26  2:12 ` [RFC PATCH 14/14] nvme-fc: Hold inflight requests while in RECOVERING state Mohamed Khalfella
2025-12-20  1:44   ` Randy Jennings

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251225181313.GC8129-mkhalfella@purestorage.com \
    --to=mkhalfella@purestorage.com \
    --cc=adailey@purestorage.com \
    --cc=axboe@kernel.dk \
    --cc=hare@suse.de \
    --cc=hch@lst.de \
    --cc=jmeneghi@redhat.com \
    --cc=kbusch@kernel.org \
    --cc=kch@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=randyj@purestorage.com \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox