Linux-NVME Archive on lore.kernel.org
From: mlin@kernel.org (Ming Lin)
Subject: [PATCH v2 1/2] nvme: switch to RCU freeing the namespace
Date: Wed, 18 May 2016 22:52:26 -0700
Message-ID: <1463637146.8610.4.camel@kernel.org>
In-Reply-To: <20160517212542.GC19325@localhost.localdomain>

On Tue, 2016-05-17 at 17:25 -0400, Keith Busch wrote:
> On Tue, May 17, 2016 at 02:09:34PM -0700, Ming Lin wrote:
> > On Tue, May 17, 2016 at 2:07 PM, Keith Busch <keith.busch at intel.com> wrote:
> > > Great, thanks. I was getting good results with this as well.
> > 
> > Thanks for the fix. Could you send the patch formally?
> 
> Will do. Sending shortly.
> 
> > > Bummer. Is your controller using sparse namespaces? The kernel message
> > > before the bug appears to indicate that.
> > 
> > No. Only 1 namespace.
> 
> Something must be corrupt then. The below line shows an unallocated
> namespace is failing to identify itself, but if you only report 1 ns,
> we shouldn't have been able to get here from a simple nvme reset.
> 
> I think your resets are occurring faster than we anticipated and you've
> uncovered another bug. It looks like these may cause trouble if reset
> occurs during active scan work.

I have not found the root cause yet.
The patch below keeps a reset from running while scan work is active,
and with it applied I no longer see the crash.

So it seems there is a race somewhere between reset work and scan work.
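
As a rough illustration of the idea only (not the kernel code), the gating
can be modelled as a small state machine in which scan work must claim the
controller before it runs and a reset is refused while the claim is held.
Every name below (ctrl_state, change_state, scan_begin, scan_end,
reset_begin) is hypothetical, the transition table is reduced to three
states, and the sketch is single-threaded, so it leaves out whatever locking
the real state changes rely on. It also refuses the reset through the
transition table, whereas the patch checks the state at the top of the reset
work.

```c
#include <stdbool.h>

/* Hypothetical, simplified model; these are not the kernel's definitions. */
enum ctrl_state { CTRL_LIVE, CTRL_RESETTING, CTRL_SCANNING };

struct ctrl {
	enum ctrl_state state;
};

/* Apply a transition only if it is allowed from the current state. */
static bool change_state(struct ctrl *c, enum ctrl_state new_state)
{
	bool changed = false;

	switch (new_state) {
	case CTRL_LIVE:
		/* Both a finished reset and a finished scan return to LIVE. */
		changed = (c->state == CTRL_RESETTING ||
			   c->state == CTRL_SCANNING);
		break;
	case CTRL_SCANNING:
	case CTRL_RESETTING:
		/* Scan and reset may each start only from LIVE. */
		changed = (c->state == CTRL_LIVE);
		break;
	}

	if (changed)
		c->state = new_state;
	return changed;
}

/* Scan work claims the controller; false means "do not scan now". */
static bool scan_begin(struct ctrl *c)
{
	return change_state(c, CTRL_SCANNING);
}

/* Scan work releases the controller when it is done. */
static void scan_end(struct ctrl *c)
{
	change_state(c, CTRL_LIVE);
}

/* Reset work is refused while a scan (or another reset) is in progress. */
static bool reset_begin(struct ctrl *c)
{
	return change_state(c, CTRL_RESETTING);
}
```

In this model a reset requested between scan_begin() and scan_end() is
simply dropped, which hides the crash but says nothing about where the
underlying race is.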

 drivers/nvme/host/core.c | 13 ++++++++++++-
 drivers/nvme/host/nvme.h |  1 +
 drivers/nvme/host/pci.c  |  3 +++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a57ccd3..8560774 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -89,6 +89,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 		case NVME_CTRL_NEW:
 		case NVME_CTRL_RESETTING:
 		case NVME_CTRL_RECONNECTING:
+		case NVME_CTRL_SCANING:
 			changed = true;
 			/* FALLTHRU */
 		default:
@@ -126,6 +127,14 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 			break;
 		}
 		break;
+	case NVME_CTRL_SCANING:
+		switch (old_state) {
+		case NVME_CTRL_LIVE:
+			changed = true;
+			/* FALLTHRU */
+		default:
+			break;
+		}
 	default:
 		break;
 	}
@@ -1755,7 +1764,7 @@ static void nvme_scan_work(struct work_struct *work)
 	struct nvme_id_ctrl *id;
 	unsigned nn;
 
-	if (ctrl->state != NVME_CTRL_LIVE)
+	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_SCANING))
 		return;
 
 	if (nvme_identify_ctrl(ctrl, &id))
@@ -1776,6 +1785,8 @@ static void nvme_scan_work(struct work_struct *work)
 
 	if (ctrl->ops->post_scan)
 		ctrl->ops->post_scan(ctrl);
+
+	nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE);
 }
 
 void nvme_queue_scan(struct nvme_ctrl *ctrl)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 3f3945a..2827825 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -76,6 +76,7 @@ enum nvme_ctrl_state {
 	NVME_CTRL_RESETTING,
 	NVME_CTRL_RECONNECTING,
 	NVME_CTRL_DELETING,
+	NVME_CTRL_SCANING,
 };
 
 struct nvme_ctrl {
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 02105da..71260c8 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1761,6 +1761,9 @@ static void nvme_reset_work(struct work_struct *work)
 	struct nvme_dev *dev = container_of(work, struct nvme_dev, reset_work);
 	int result = -ENODEV;
 
+	if (dev->ctrl.state == NVME_CTRL_SCANING)
+		return;
+
 	if (WARN_ON(dev->ctrl.state == NVME_CTRL_RESETTING))
 		goto out;
 
