From: hch@lst.de (Christoph Hellwig)
Subject: [PATCH] nvme: avoid hang on inaccessible paths
Date: Wed, 30 May 2018 14:54:52 +0200 [thread overview]
Message-ID: <20180530125452.GA2763@lst.de> (raw)
In-Reply-To: <20180530143020.50a299b8@pentland.suse.de>
On Wed, May 30, 2018@02:30:20PM +0200, Hannes Reinecke wrote:
> > Now it might make sense to have a (configurable) timeout to give
> > up in all those case, and if my vague memory serves me right you
> > actually volunteered to implement that a while ago.
> >
> The problem is that this particular code path is triggered for the
> revalidate_disk() case; when opting for requeue (as the original code
> did) the system will hang during revalidate_disk(), via
So we add the first path, which in inaccessible. So yes, the I/O
should block for now (that is until we have a timeout for that)
> As this blocks the nvmf_dev_mutex we don't have any chance to connect
> the other, optimized path.
And I guess this where the real issue starts. We should not do block
I/O under nvmf_dev_mutex.
James has some work on asynchronous connects pending for FC, so I guess
we could look into generalizing that and always exectute the real connect
work in a different thread.
Or we could move create_ctrl outside the lock, which Johannes had
patch for a few days ago, although that didn't work as-is. I'll attach
my completely untested attempt at that below.
> I had been thinking of implementing that particular handling from the
> ANA spec, but that would mean we're adding an ANA TT delay for each
> inaccessible path, which with the current defaults would mean booting
> is delayed by 10 seconds per inaccessible path.
No. We should only retry on an inaccessible or change path if we
don't have an optimized or non-optimized path available.
---
>From bf729292705351639aa1eac1bde2e1afc25f11b7 Mon Sep 17 00:00:00 2001
From: Johannes Thumshirn <jthumshirn@suse.de>
Date: Wed, 30 May 2018 14:46:39 +0200
Subject: nvme: don't hold nvmf_transports_rwsem for more than transport
lookups
Only take nvmf_transports_rwsem when doing a lookup of registered
transports, so that a blocking ->create_ctrl doesn't prevent other
actions on /dev/nvme-fabrics.
Signed-off-by: Johannes Thumshirn <jthumshirn at suse.de>
[hch: increased lock hold time a bit to be safe, added a comment
and updated the changelog]
Signed-off-by: Christoph Hellwig <hch at lst.de>
---
drivers/nvme/host/fabrics.c | 3 ++-
drivers/nvme/host/fabrics.h | 3 +++
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index aa318136460e..5f9618a2fd3d 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -969,6 +969,7 @@ nvmf_create_ctrl(struct device *dev, const char *buf, size_t count)
ret = -EBUSY;
goto out_unlock;
}
+ up_read(&nvmf_transports_rwsem);
ret = nvmf_check_required_opts(opts, ops->required_opts);
if (ret)
@@ -985,11 +986,11 @@ nvmf_create_ctrl(struct device *dev, const char *buf, size_t count)
}
module_put(ops->module);
- up_read(&nvmf_transports_rwsem);
return ctrl;
out_module_put:
module_put(ops->module);
+ goto out_free_opts;
out_unlock:
up_read(&nvmf_transports_rwsem);
out_free_opts:
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index ef46c915b7b5..d778f14bff20 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -124,6 +124,9 @@ struct nvmf_ctrl_options {
* 1. At minimum, 'required_opts' and 'allowed_opts' should
* be set to the same enum parsing options defined earlier.
* 2. create_ctrl() must be defined (even if it does nothing)
+ * 3. struct nvmf_transport_ops must be statically allocated in the
+ * modules .bss section so that a pure module_get on @module
+ * prevents the memory from beeing freed.
*/
struct nvmf_transport_ops {
struct list_head entry;
--
2.17.0
next prev parent reply other threads:[~2018-05-30 12:54 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-05-30 11:16 [PATCH] nvme: avoid hang on inaccessible paths Hannes Reinecke
2018-05-30 12:12 ` Christoph Hellwig
2018-05-30 12:30 ` Hannes Reinecke
2018-05-30 12:54 ` Christoph Hellwig [this message]
2018-05-30 23:10 ` Sagi Grimberg
2018-06-04 6:24 ` Hannes Reinecke
2018-06-04 6:37 ` Christoph Hellwig
2018-06-04 12:17 ` Sagi Grimberg
2018-06-04 12:56 ` Christoph Hellwig
2018-06-06 19:02 ` Popuri, Sriram
2018-06-07 5:54 ` Hannes Reinecke
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180530125452.GA2763@lst.de \
--to=hch@lst.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.