From: Nilay Shroff <nilay@linux.ibm.com>
To: Keith Busch <kbusch@meta.com>,
hch@lst.de, sagi@grimberg.me, linux-nvme@lists.infradead.org
Cc: Keith Busch <kbusch@kernel.org>, Gregory Joyce <gjoyce@ibm.com>
Subject: Re: [PATCHv2] nvme-pci: do not directly handle subsys reset fallout
Date: Tue, 25 Jun 2024 15:32:59 +0530 [thread overview]
Message-ID: <e4eb3d7c-e9d9-41fa-81cf-e3f248d3e0a8@linux.ibm.com> (raw)
In-Reply-To: <20240624172649.2268634-1-kbusch@meta.com>
Hi Keith,
On 6/24/24 22:56, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> Scheduling reset_work after a nvme subsystem reset is expected to fail
> on pcie, but this also prevents potential handling from the platform's
> pcie services may provide that may successfully recovering the link
> without re-enumeration. Such examples include AER, DPC, and power's EEH.
>
> Provide a pci specific operation that safely initiates a subsystem
> reset, and instead of scheduling reset work, read back the status
> register to trigger a pcie read error.
>
> Since this only affects pci, the other fabrics drivers subscribe to a
> generic nvmf subsystem reset that is exactly the same as before. The
> loop fabric doesn't use it because nvmet doesn't support setting that
> property anyway.
>
> And since we're using the magic NSSR value in two places now, provide a
> symbolic define for it.
>
> Reported-by: Nilay Shroff <nilay@linux.ibm.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> Changes from v1, applied comments from Christoph:
>
> Split out and require new ctrl subsystem reset op
>
> Comments updated for correctness and clarity
>
> Use a symbolic name for the special reset value
>
> drivers/nvme/host/fabrics.c | 15 +++++++++++++++
> drivers/nvme/host/fabrics.h | 1 +
> drivers/nvme/host/fc.c | 1 +
> drivers/nvme/host/nvme.h | 14 +++-----------
> drivers/nvme/host/pci.c | 35 +++++++++++++++++++++++++++++++++++
> drivers/nvme/host/rdma.c | 1 +
> drivers/nvme/host/tcp.c | 1 +
> include/linux/nvme.h | 3 +++
> 8 files changed, 60 insertions(+), 11 deletions(-)
>
>
> /*
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 102a9fb0c65ff..6cb7acef05576 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1143,6 +1143,40 @@ static void nvme_pci_submit_async_event(struct nvme_ctrl *ctrl)
> spin_unlock(&nvmeq->sq_lock);
> }
>
> +static int nvme_pci_subsystem_reset(struct nvme_ctrl *ctrl)
> +{
> + struct nvme_dev *dev = to_nvme_dev(ctrl);
> + int ret = 0;
> +
> + /*
> + * Taking the shutdown_lock ensures the BAR mapping is not being
> + * altered by reset_work. Holding this lock before the RESETTING state
> + * change, if successful, also ensures nvme_remove won't be able to
> + * proceed to iounmap until we're done.
> + */
> + mutex_lock(&dev->shutdown_lock);
> + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING)) {
> + ret = -EBUSY;
> + goto unlock;
> + }
> +
> + if (!dev->bar_mapped_size) {
> + ret = -ENODEV;
> + goto unlock;
> + }
> +
> + writel(NVME_SUBSYS_RESET, dev->bar + NVME_REG_NSSR);
> +
> + /*
> + * Read controller status to flush the previous write and trigger a
> + * pcie read error.
> + */
> + readl(dev->bar + NVME_REG_CSTS);
> +unlock:
> + mutex_unlock(&dev->shutdown_lock);
> + return ret;
> +}
I noticed that in your earlier proposal we were changing the NVMe controller
state from RESETTING to LIVE just after write to NVME_REG_NSSR. In this patch,
you removed that code and due to it now pci error recovery code (nvme_error_detected())
couldn't change the state of controller to RESETTING and that causes the pci error
recovery to immediately bail out without recovering the NVMe adapter.
I think we need to change state of controller to LIVE after write to NVME_REG_NSSR.
So was there any specific reason you removed that code or was it missed out by mistake?
Thanks,
--Nilay
next prev parent reply other threads:[~2024-06-25 10:03 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-24 17:26 [PATCHv2] nvme-pci: do not directly handle subsys reset fallout Keith Busch
2024-06-24 17:36 ` Christoph Hellwig
2024-06-24 19:20 ` Keith Busch
2024-06-25 10:02 ` Nilay Shroff [this message]
2024-06-25 16:02 ` Keith Busch
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e4eb3d7c-e9d9-41fa-81cf-e3f248d3e0a8@linux.ibm.com \
--to=nilay@linux.ibm.com \
--cc=gjoyce@ibm.com \
--cc=hch@lst.de \
--cc=kbusch@kernel.org \
--cc=kbusch@meta.com \
--cc=linux-nvme@lists.infradead.org \
--cc=sagi@grimberg.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.