Linux-NVME Archive on lore.kernel.org
From: snitzer@redhat.com (Mike Snitzer)
Subject: [PATCH 4/7] nvme: implement multipath access to nvme subsystems
Date: Thu, 9 Nov 2017 16:22:17 -0500
Message-ID: <20171109212217.GA16454@redhat.com>
In-Reply-To: <20171109174450.17142-5-hch@lst.de>

On Thu, Nov 09 2017 at 12:44pm -0500,
Christoph Hellwig <hch@lst.de> wrote:

> This patch adds native multipath support to the nvme driver.  For each
> namespace we create only a single block device node, which can be used
> to access that namespace through any of the controllers that refer to it.
> The gendisk for each controller's path to the namespace still exists
> inside the kernel, but is hidden from userspace.  The character device
> nodes are still available on a per-controller basis.  A new link from
> the sysfs directory for the subsystem makes it possible to find all
> controllers for a given subsystem.
> 
> Currently we will always send I/O to the first available path.  This will
> be changed once the NVMe Asymmetric Namespace Access (ANA) TP is
> ratified and implemented, at which point we will look at the ANA state
> for each namespace.  Another possibility that was prototyped is to
> use the path that is closest to the submitting NUMA node, which will be
> mostly interesting for PCI, but might also be useful for RDMA or FC
> transports in the future.  There is no plan to implement round robin
> or I/O service time path selectors, as those are not scalable with
> the performance rates provided by NVMe.
> 
> The multipath device will go away once all paths to it disappear;
> any delay to keep it alive needs to be implemented at the controller
> level.
> 
> Signed-off-by: Christoph Hellwig <hch at lst.de>
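
The "first available path" policy described in the quoted commit message can
be sketched in plain userspace C.  This is only an illustration under stated
assumptions: struct path, struct ns_head, the live field and
find_first_live_path() are hypothetical stand-ins and are not the names or
types used by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's struct nvme_ns / nvme_ns_head. */
struct path {
	const char *ctrl_name;	/* controller this path goes through */
	bool live;		/* true if the controller can accept I/O */
};

struct ns_head {
	struct path *paths;	/* all known paths to one shared namespace */
	size_t nr_paths;
};

/* Return the first path whose controller is live, or NULL if none is. */
static struct path *find_first_live_path(struct ns_head *head)
{
	assert(head != NULL);

	for (size_t i = 0; i < head->nr_paths; i++) {
		if (head->paths[i].live)
			return &head->paths[i];
	}
	return NULL;	/* no usable path left */
}
```

With three paths where only the second and third are live, the function
returns the second; once every path is marked dead it returns NULL, which
corresponds to the case where the multipath device has nothing left to send
I/O to.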

Your 0th header (the cover letter) describes how the NVMe multipath I/O
path leverages NVMe's lack of partial completion, but I think it would be
useful for this header (the one that actually gets committed) to mention
it as well.

> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> new file mode 100644
> index 000000000000..062754ebebfd
> --- /dev/null
> +++ b/drivers/nvme/host/multipath.c
...
> +void nvme_failover_req(struct request *req)
> +{
> +	struct nvme_ns *ns = req->q->queuedata;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&ns->head->requeue_lock, flags);
> +	blk_steal_bios(&ns->head->requeue_list, req);
> +	spin_unlock_irqrestore(&ns->head->requeue_lock, flags);
> +	blk_mq_end_request(req, 0);
> +
> +	nvme_reset_ctrl(ns->ctrl);
> +	kblockd_schedule_work(&ns->head->requeue_work);
> +}
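
The list handling at the heart of the failover function quoted above can be
modelled in userspace C.  This is a minimal sketch, assuming simplified
stand-in types: struct bio, struct bio_list and struct request here are not
the kernel definitions, and steal_bios() and pop_bio() are hypothetical
helpers.  Locking, ending the request, the controller reset and the requeue
work item are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption). */
struct bio {
	int id;
	struct bio *next;
};

struct bio_list {
	struct bio *head, *tail;
};

struct request {
	struct bio *bio, *biotail;	/* chain of bios carried by the request */
};

/* Move every bio of the failed request to the tail of the requeue list. */
static void steal_bios(struct bio_list *list, struct request *rq)
{
	assert(list != NULL && rq != NULL);

	if (!rq->bio)
		return;
	if (list->tail)
		list->tail->next = rq->bio;
	else
		list->head = rq->bio;
	list->tail = rq->biotail;
	rq->bio = rq->biotail = NULL;	/* the request no longer owns any bios */
}

/* Take the oldest bio off the requeue list so it can be resubmitted. */
static struct bio *pop_bio(struct bio_list *list)
{
	struct bio *b = list->head;

	if (b) {
		list->head = b->next;
		if (!list->head)
			list->tail = NULL;
		b->next = NULL;
	}
	return b;
}
```

Moving the whole chain is safe in this model only because the failed request
still holds all of its bios untouched, which is the "no partial completion"
property discussed in this message.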

Also, the block core patch to introduce blk_steal_bios() already went in,
but should there be a QUEUE_FLAG that gets set by drivers like NVMe that
don't support partial completion?

This would make it easier for future drivers to know whether they can use
a more optimized I/O path.

Mike

