From: mlin@kernel.org (Ming Lin)
Subject: [PATCH v2 1/2] nvme: switch to RCU freeing the namespace
Date: Wed, 18 May 2016 22:52:26 -0700
Message-ID: <1463637146.8610.4.camel@kernel.org>
In-Reply-To: <20160517212542.GC19325@localhost.localdomain>
On Tue, 2016-05-17 at 17:25 -0400, Keith Busch wrote:
> On Tue, May 17, 2016 at 02:09:34PM -0700, Ming Lin wrote:
> > On Tue, May 17, 2016 at 2:07 PM, Keith Busch <keith.busch at intel.com>
> > wrote:
> > > Great, thanks. I was getting good results with this as well.
> >
> > Thanks for the fix. Could you send the patch formally?
>
> Will do. Sending shortly.
>
> > > Bummer. Is your controller using sparse namespaces? The kernel
> > > message before the bug appears to indicate that.
> >
> > No. Only 1 namespace.
>
> Something must be corrupt then. The below line shows an unallocated
> namespace is failing to identify itself, but if you only report 1 ns,
> we shouldn't have been able to get here from a simple nvme reset.
>
> I think your resets are occurring faster than we anticipated and
> you've uncovered another bug. It looks like these may cause trouble
> if reset occurs during active scan work.

I have not found the root cause yet.

The patch below prevents reset work from running while scan work is active.
With this patch applied I no longer see the crash.
So there seems to be a race somewhere between reset work and scan work.
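
The gating idea can be sketched outside the kernel as a small state model. This is only an illustration under stated assumptions: the `model_*` names are hypothetical, it is single-threaded with no locking, and it covers only the transitions visible in the patch below (LIVE -> SCANING, SCANING -> LIVE, and reset skipped while SCANING). It is not the driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; names are illustrative, not kernel API. */
enum model_state {
	MODEL_NEW,
	MODEL_LIVE,
	MODEL_RESETTING,
	MODEL_RECONNECTING,
	MODEL_SCANING,	/* spelling follows the patch */
};

struct model_ctrl {
	enum model_state state;
};

/* Apply only the transitions visible in the patch; no locking is modeled. */
static bool model_change_state(struct model_ctrl *c, enum model_state new_state)
{
	enum model_state old_state = c->state;
	bool changed = false;

	switch (new_state) {
	case MODEL_LIVE:
		switch (old_state) {
		case MODEL_NEW:
		case MODEL_RESETTING:
		case MODEL_RECONNECTING:
		case MODEL_SCANING:
			changed = true;
			break;
		default:
			break;
		}
		break;
	case MODEL_SCANING:
		/* A scan may only start from the LIVE state. */
		changed = (old_state == MODEL_LIVE);
		break;
	default:
		break;
	}

	if (changed)
		c->state = new_state;
	return changed;
}

/* Mirrors the early return the patch adds to nvme_reset_work(). */
static bool model_reset_may_run(const struct model_ctrl *c)
{
	return c->state != MODEL_SCANING;
}

/* Walk through one scan cycle and check the gate at each step. */
static void model_selftest(void)
{
	struct model_ctrl c = { .state = MODEL_LIVE };

	assert(model_change_state(&c, MODEL_SCANING));	/* scan starts */
	assert(!model_reset_may_run(&c));		/* reset is skipped */
	assert(!model_change_state(&c, MODEL_SCANING));	/* no nested scan */
	assert(model_change_state(&c, MODEL_LIVE));	/* scan finished */
	assert(model_reset_may_run(&c));		/* reset allowed again */
}
```

In this model a reset request that arrives during a scan is simply dropped, which matches the early return in the patch; whether a dropped reset should be retried later is left open here.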
 drivers/nvme/host/core.c | 13 ++++++++++++-
 drivers/nvme/host/nvme.h |  1 +
 drivers/nvme/host/pci.c  |  3 +++
 3 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a57ccd3..8560774 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -89,6 +89,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 		case NVME_CTRL_NEW:
 		case NVME_CTRL_RESETTING:
 		case NVME_CTRL_RECONNECTING:
+		case NVME_CTRL_SCANING:
 			changed = true;
 			/* FALLTHRU */
 		default:
@@ -126,6 +127,14 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 			break;
 		}
 		break;
+	case NVME_CTRL_SCANING:
+		switch (old_state) {
+		case NVME_CTRL_LIVE:
+			changed = true;
+			/* FALLTHRU */
+		default:
+			break;
+		}
 	default:
 		break;
 	}
@@ -1755,7 +1764,7 @@ static void nvme_scan_work(struct work_struct *work)
 	struct nvme_id_ctrl *id;
 	unsigned nn;
 
-	if (ctrl->state != NVME_CTRL_LIVE)
+	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_SCANING))
 		return;
 
 	if (nvme_identify_ctrl(ctrl, &id))
@@ -1776,6 +1785,8 @@ static void nvme_scan_work(struct work_struct *work)
 
 	if (ctrl->ops->post_scan)
 		ctrl->ops->post_scan(ctrl);
+
+	nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE);
 }
 
 void nvme_queue_scan(struct nvme_ctrl *ctrl)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 3f3945a..2827825 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -76,6 +76,7 @@ enum nvme_ctrl_state {
 	NVME_CTRL_RESETTING,
 	NVME_CTRL_RECONNECTING,
 	NVME_CTRL_DELETING,
+	NVME_CTRL_SCANING,
 };
 
 struct nvme_ctrl {
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 02105da..71260c8 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1761,6 +1761,9 @@ static void nvme_reset_work(struct work_struct *work)
 	struct nvme_dev *dev = container_of(work, struct nvme_dev, reset_work);
 	int result = -ENODEV;
 
+	if (dev->ctrl.state == NVME_CTRL_SCANING)
+		return;
+
 	if (WARN_ON(dev->ctrl.state == NVME_CTRL_RESETTING))
 		goto out;
 