linux-block.vger.kernel.org archive mirror
* [PATCH rfc 00/30] centralize nvme controller reset, delete and periodic reconnects
@ 2017-06-18 15:21 Sagi Grimberg
  2017-06-18 15:21 ` [PATCH rfc 01/30] nvme: Add admin connect request queue Sagi Grimberg
                   ` (30 more replies)
  0 siblings, 31 replies; 69+ messages in thread
From: Sagi Grimberg @ 2017-06-18 15:21 UTC
  To: linux-nvme; +Cc: Christoph Hellwig, Keith Busch, linux-block

We have known for some time now that ever since NVMe grew additional
transports, we should really look into centralizing lots of code
around controller resets, removals, and periodic reconnects for fabrics.

This series is the first attempt to move shared logic to nvme core.
Controller probe, reset and removal flows are completely driven
from nvme core while the various transports simply implement hooks
for alloc/free/start/stop queues, alloc/free tagsets, and some
sanity post-init checks. Similarly, nvme-fabrics lib drives periodic
reconnects and fabric error recovery.
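
To make the split above concrete, here is a minimal user-space sketch of
the idea, not the code from this series: a core-owned reset flow that
only calls per-transport queue hooks. Every name in it (sketch_ctrl,
sketch_ctrl_ops, sketch_core_reset, the toy_* transport) is hypothetical
and chosen for illustration, and the ordering (I/O queues torn down
first, admin queue brought up first) is an assumption. Tagset hooks and
post-init checks are left out for brevity.

```c
#include <stdio.h>

struct sketch_ctrl;

/* Hypothetical per-transport hooks; the real ops in the series may differ. */
struct sketch_ctrl_ops {
	int  (*alloc_queue)(struct sketch_ctrl *ctrl, int qid);
	void (*free_queue)(struct sketch_ctrl *ctrl, int qid);
	int  (*start_queue)(struct sketch_ctrl *ctrl, int qid);
	void (*stop_queue)(struct sketch_ctrl *ctrl, int qid);
};

/* Hypothetical stand-in for a controller, not the kernel's struct nvme_ctrl. */
struct sketch_ctrl {
	const struct sketch_ctrl_ops *ops;
	int queue_count;	/* admin queue (qid 0) plus I/O queues */
	int allocated;		/* bookkeeping for the toy transport below */
	int started;
};

/* Generic reset flow: the core decides the order, the transport does the work. */
static int sketch_core_reset(struct sketch_ctrl *ctrl)
{
	int qid, ret = 0;

	/* Teardown (assumed order): I/O queues first, admin queue last. */
	for (qid = ctrl->queue_count - 1; qid >= 0; qid--) {
		ctrl->ops->stop_queue(ctrl, qid);
		ctrl->ops->free_queue(ctrl, qid);
	}

	/* Bring-up (assumed order): admin queue first, then I/O queues. */
	for (qid = 0; qid < ctrl->queue_count; qid++) {
		ret = ctrl->ops->alloc_queue(ctrl, qid);
		if (ret)
			goto out_unwind;
		ret = ctrl->ops->start_queue(ctrl, qid);
		if (ret) {
			ctrl->ops->free_queue(ctrl, qid);
			goto out_unwind;
		}
	}
	return 0;

out_unwind:
	/* Undo the queues that were fully brought up before the failure. */
	while (--qid >= 0) {
		ctrl->ops->stop_queue(ctrl, qid);
		ctrl->ops->free_queue(ctrl, qid);
	}
	return ret;
}

/* Toy transport that only counts and traces what the core asks for. */
static int toy_alloc_queue(struct sketch_ctrl *ctrl, int qid)
{
	printf("alloc queue %d\n", qid);
	ctrl->allocated++;
	return 0;
}

static void toy_free_queue(struct sketch_ctrl *ctrl, int qid)
{
	printf("free queue %d\n", qid);
	ctrl->allocated--;
}

static int toy_start_queue(struct sketch_ctrl *ctrl, int qid)
{
	printf("start queue %d\n", qid);
	ctrl->started++;
	return 0;
}

static void toy_stop_queue(struct sketch_ctrl *ctrl, int qid)
{
	printf("stop queue %d\n", qid);
	ctrl->started--;
}

static const struct sketch_ctrl_ops toy_ops = {
	.alloc_queue = toy_alloc_queue,
	.free_queue  = toy_free_queue,
	.start_queue = toy_start_queue,
	.stop_queue  = toy_stop_queue,
};

/* Resets a live 3-queue toy controller; returns 0 if all queues came back. */
int sketch_demo(void)
{
	struct sketch_ctrl ctrl = {
		.ops = &toy_ops, .queue_count = 3, .allocated = 3, .started = 3,
	};
	int ret = sketch_core_reset(&ctrl);

	if (ret)
		return ret;
	return (ctrl.allocated == 3 && ctrl.started == 3) ? 0 : -1;
}
```

Calling sketch_demo() from a main() prints the stop/free trace for
queues 2, 1, 0 followed by the alloc/start trace for queues 0, 1, 2.
The point of the shape is that a second transport would only supply its
own ops table while the reset ordering and error unwinding stay in one
place.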

In this set, rdma and loop drivers are fully converted to delegate
these flows to the nvme core. The implementation is incremental in
the sense that it adds the logic to nvme core but does not obligate
drivers to use it, so pci and fc drivers are left intact.

I stress-tested reset, delete and fabric errors on rdma and loop
during live IO and they seem to work (on poor VMs though; thanks,
Johannes, for fixing up rxe ;)).

I've started looking into converting pci and fc, but I don't have
much time to make real progress at the moment. Assuming that the
scheme looks fine for everyone (big IFF), I'd like to ask the
community if it will be acceptable to merge this and incrementally
enhance it to accommodate pci and fc (pci is more of a challenge in
my PoV).

About the patch set itself: I sorta worked my way up from rdma.c to
make the relevant flows and routines generic by slowly removing
transport dependencies, then made some of the routines controller ops,
and then moved chunks of the code as-is to core.c and fabrics.c
respectively. This was mainly for debugging purposes. Each patch on its
own might not make perfect sense (and I probably didn't put too much
effort into the change logs). When we get closer to inclusion, we can
squash lots of these together if desired.

Feedback is appreciated and highly needed!

As a side note, I also had a go at adding a queue representation to the
nvme core (with proper states), but it seemed to be too far out there for
now... I'll consider proposing it as a follow-up series.

Sagi Grimberg (30):
  nvme: Add admin connect request queue
  nvme-rdma: Don't alloc/free the tagset on reset
  nvme-rdma: reuse configure/destroy admin queue
  nvme-rdma: introduce configure/destroy io queues
  nvme-rdma: introduce nvme_rdma_start_queue
  nvme-rdma: rename nvme_rdma_init_queue to nvme_rdma_alloc_queue
  nvme-rdma: make stop/free queue receive a ctrl and qid struct
  nvme-rdma: cleanup error path in controller reset
  nvme: Move queue_count to the nvme_ctrl
  nvme: Add admin_tagset pointer to nvme_ctrl
  nvme: move controller cap to struct nvme_ctrl
  nvme-rdma: disable controller in reset instead of shutdown
  nvme-rdma: move queue LIVE/DELETING flags settings to queue routines
  nvme-rdma: stop queues instead of simply flipping their state
  nvme-rdma: don't check queue state for shutdown/disable
  nvme-rdma: move tagset allocation to a dedicated routine
  nvme-rdma: move admin specific resources to alloc_queue
  nvme-rdma: limit max_queues to rdma device number of completion
    vectors
  nvme-rdma: call ops->reg_read64 instead of nvmf_reg_read64
  nvme: add err, reconnect and delete work items to nvme core
  nvme-rdma: plumb nvme_ctrl down the calls tack
  nvme-rdma: Split create_ctrl to transport specific and generic parts
  nvme: add low level queue and tagset controller ops
  nvme-pci: rename to nvme_pci_configure_admin_queue
  nvme: move control plane handling to nvme core
  nvme-fabrics: handle reconnects in fabrics library
  nvme-loop: convert to nvme-core control plane management
  nvme: update tagset nr_hw_queues when reallocating io queues
  nvme: add sed-opal ctrl manipulation in admin configuration
  nvme: Add queue freeze/unfreeze handling on controller resets

 drivers/nvme/host/core.c    | 415 +++++++++++++++++++++++++
 drivers/nvme/host/fabrics.c | 104 ++++++-
 drivers/nvme/host/fabrics.h |   1 +
 drivers/nvme/host/fc.c      |  11 +-
 drivers/nvme/host/nvme.h    |  31 ++
 drivers/nvme/host/pci.c     |   4 +-
 drivers/nvme/host/rdma.c    | 741 ++++++++++----------------------------------
 drivers/nvme/target/loop.c  | 415 +++++++------------------
 8 files changed, 840 insertions(+), 882 deletions(-)

-- 
2.7.4


end of thread, other threads:[~2017-07-10 18:57 UTC | newest]

Thread overview: 69+ messages
2017-06-18 15:21 [PATCH rfc 00/30] centralize nvme controller reset, delete and periodic reconnects Sagi Grimberg
2017-06-18 15:21 ` [PATCH rfc 01/30] nvme: Add admin connect request queue Sagi Grimberg
2017-06-19  7:13   ` Christoph Hellwig
2017-06-19  7:49     ` Sagi Grimberg
2017-06-19 12:30       ` Christoph Hellwig
2017-06-19 15:56       ` Hannes Reinecke
2017-06-18 15:21 ` [PATCH rfc 02/30] nvme-rdma: Don't alloc/free the tagset on reset Sagi Grimberg
2017-06-19  7:18   ` Christoph Hellwig
2017-06-19  7:59     ` Sagi Grimberg
2017-06-19 12:35       ` Christoph Hellwig
2017-07-10 18:50     ` James Smart
2017-06-18 15:21 ` [PATCH rfc 03/30] nvme-rdma: reuse configure/destroy admin queue Sagi Grimberg
2017-06-19  7:20   ` Christoph Hellwig
2017-06-19  8:00     ` Sagi Grimberg
2017-06-18 15:21 ` [PATCH rfc 04/30] nvme-rdma: introduce configure/destroy io queues Sagi Grimberg
2017-06-19 12:35   ` Christoph Hellwig
2017-06-18 15:21 ` [PATCH rfc 05/30] nvme-rdma: introduce nvme_rdma_start_queue Sagi Grimberg
2017-06-19 12:38   ` Christoph Hellwig
2017-06-18 15:21 ` [PATCH rfc 06/30] nvme-rdma: rename nvme_rdma_init_queue to nvme_rdma_alloc_queue Sagi Grimberg
2017-06-19 12:38   ` Christoph Hellwig
2017-06-18 15:21 ` [PATCH rfc 07/30] nvme-rdma: make stop/free queue receive a ctrl and qid struct Sagi Grimberg
2017-06-19 12:39   ` Christoph Hellwig
2017-06-18 15:21 ` [PATCH rfc 08/30] nvme-rdma: cleanup error path in controller reset Sagi Grimberg
2017-06-19 12:40   ` Christoph Hellwig
2017-07-10 18:57   ` James Smart
2017-06-18 15:21 ` [PATCH rfc 09/30] nvme: Move queue_count to the nvme_ctrl Sagi Grimberg
2017-06-19 12:41   ` Christoph Hellwig
2017-06-18 15:21 ` [PATCH rfc 10/30] nvme: Add admin_tagset pointer to nvme_ctrl Sagi Grimberg
2017-06-19 12:41   ` Christoph Hellwig
2017-06-19 13:58     ` Sagi Grimberg
2017-06-18 15:21 ` [PATCH rfc 11/30] nvme: move controller cap to struct nvme_ctrl Sagi Grimberg
2017-06-19 12:42   ` Christoph Hellwig
2017-06-18 15:21 ` [PATCH rfc 12/30] nvme-rdma: disable controller in reset instead of shutdown Sagi Grimberg
2017-06-19 12:43   ` Christoph Hellwig
2017-06-18 15:21 ` [PATCH rfc 13/30] nvme-rdma: move queue LIVE/DELETING flags settings to queue routines Sagi Grimberg
2017-06-19 12:44   ` Christoph Hellwig
2017-06-18 15:21 ` [PATCH rfc 14/30] nvme-rdma: stop queues instead of simply flipping their state Sagi Grimberg
2017-06-19 12:44   ` Christoph Hellwig
2017-06-18 15:21 ` [PATCH rfc 15/30] nvme-rdma: don't check queue state for shutdown/disable Sagi Grimberg
2017-06-19 12:44   ` Christoph Hellwig
2017-06-18 15:21 ` [PATCH rfc 16/30] nvme-rdma: move tagset allocation to a dedicated routine Sagi Grimberg
2017-06-19 12:45   ` Christoph Hellwig
2017-06-18 15:21 ` [PATCH rfc 17/30] nvme-rdma: move admin specific resources to alloc_queue Sagi Grimberg
2017-06-19 12:46   ` Christoph Hellwig
2017-06-18 15:21 ` [PATCH rfc 18/30] nvme-rdma: limit max_queues to rdma device number of completion vectors Sagi Grimberg
2017-06-18 15:21 ` [PATCH rfc 19/30] nvme-rdma: call ops->reg_read64 instead of nvmf_reg_read64 Sagi Grimberg
2017-06-18 15:21 ` [PATCH rfc 20/30] nvme: add err, reconnect and delete work items to nvme core Sagi Grimberg
2017-06-19 12:49   ` Christoph Hellwig
2017-06-19 14:14     ` Sagi Grimberg
2017-06-18 15:21 ` [PATCH rfc 21/30] nvme-rdma: plumb nvme_ctrl down the calls tack Sagi Grimberg
2017-06-18 15:21 ` [PATCH rfc 22/30] nvme-rdma: Split create_ctrl to transport specific and generic parts Sagi Grimberg
2017-06-18 15:21 ` [PATCH rfc 23/30] nvme: add low level queue and tagset controller ops Sagi Grimberg
2017-06-18 15:21 ` [PATCH rfc 24/30] nvme-pci: rename to nvme_pci_configure_admin_queue Sagi Grimberg
2017-06-19  7:20   ` Christoph Hellwig
2017-06-18 15:21 ` [PATCH rfc 25/30] nvme: move control plane handling to nvme core Sagi Grimberg
2017-06-19 12:55   ` Christoph Hellwig
2017-06-19 16:24     ` Sagi Grimberg
2017-06-18 15:22 ` [PATCH rfc 26/30] nvme-fabrics: handle reconnects in fabrics library Sagi Grimberg
2017-06-18 15:22 ` [PATCH rfc 27/30] nvme-loop: convert to nvme-core control plane management Sagi Grimberg
2017-06-18 15:22 ` [PATCH rfc 28/30] nvme: update tagset nr_hw_queues when reallocating io queues Sagi Grimberg
2017-06-19  7:21   ` Christoph Hellwig
2017-06-19  8:06     ` Ming Lei
2017-06-19 16:21       ` Sagi Grimberg
2017-06-18 15:22 ` [PATCH rfc 29/30] nvme: add sed-opal ctrl manipulation in admin configuration Sagi Grimberg
2017-06-19  7:22   ` Christoph Hellwig
2017-06-19  8:03     ` Sagi Grimberg
2017-06-19 12:55       ` Christoph Hellwig
2017-06-18 15:22 ` [PATCH rfc 30/30] nvme: Add queue freeze/unfreeze handling on controller resets Sagi Grimberg
2017-06-18 15:24 ` [PATCH rfc 00/30] centralize nvme controller reset, delete and periodic reconnects Sagi Grimberg
