public inbox for linux-kernel@vger.kernel.org
* [PATCH v0 0/6] nvme-fc: fix blktests nvme/041
@ 2024-02-16  8:45 Daniel Wagner
From: Daniel Wagner @ 2024-02-16  8:45 UTC
  To: James Smart
  Cc: Keith Busch, Christoph Hellwig, Sagi Grimberg, Hannes Reinecke,
	linux-nvme, linux-kernel, Daniel Wagner

Now that the target side works with blktests and blktests is also able
to deal with the FC transport, it's time to address the fallout on the host
side. As a first step, let's fix the failing nvme/041 tests.

As we already discussed, the main issue here is that the FC transport defers
the connect attempt to a workqueue. The other fabric transports don't do this,
and all blktests expect the 'nvme connect' call to be synchronous.

Initially, I just added a completion and waited for the connect to succeed or
fail. But this triggered a lot of UAFs. After banging my head on this problem
for a while, I decided to replace the entire ref counting strategy.
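
For readers who want the "add a completion and wait" idea spelled out, here is
a minimal userspace sketch using pthreads. It is an analogy only, not the code
from this series: struct completion is re-implemented by hand, and demo_ctrl,
connect_work() and demo_connect_sync() are hypothetical names. The caller owns
the controller and joins the worker before the object goes away, which
sidesteps the lifetime problem that caused the UAFs in the real driver.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hand-rolled userspace stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical controller; the fields are illustrative only. */
struct demo_ctrl {
	struct completion connect_done;
	int simulated_result;	/* what the fake connect attempt will report */
	int connect_status;	/* filled in by the deferred work */
};

/* Plays the role of the work item that performs the deferred connect. */
static void *connect_work(void *arg)
{
	struct demo_ctrl *ctrl = arg;

	ctrl->connect_status = ctrl->simulated_result;
	complete(&ctrl->connect_done);
	return NULL;
}

/* Returns only after the deferred connect has reported success or failure. */
int demo_connect_sync(int simulated_result)
{
	struct demo_ctrl ctrl = { .simulated_result = simulated_result };
	pthread_t worker;

	init_completion(&ctrl.connect_done);
	if (pthread_create(&worker, NULL, connect_work, &ctrl))
		return -1;

	wait_for_completion(&ctrl.connect_done);
	/* Join before ctrl leaves scope so the worker cannot touch freed state. */
	pthread_join(worker, NULL);
	return ctrl.connect_status;
}
```

In this sketch the waiter trivially outlives the worker. In the driver the
controller can be deleted while the connect work is still pending, which is
why waiting alone was not enough there.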

With this new approach, all tests except nvme/048 pass, and I observed no UAFs
or other trouble. I also tested with real hardware (lpfc, qla2xxx), though I
don't have a way to trigger all sorts of transport errors, which would be
interesting for checking whether my patches break anything.

I think there is still one problem left in the module exit code path. The
cleanup function iterates over the ctrl list stored in the rport object. The
delete code path is not atomic and removes the controller from the list
somewhere along the way. Thus this races with the module unload, IMO. We could
just maintain a list of controllers which is protected by a lock, as we have in
tcp/rdma.
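
As an illustration of the "list of controllers protected by a lock" idea, here
is a small userspace sketch with a pthread mutex. It is an assumption-laden
analogy, not the tcp/rdma code: listed_ctrl, ctrl_list_lock and the three
helper functions are hypothetical names. The point is that unlinking an entry
and walking the list at teardown both happen under the same lock, so the
teardown walk can never see a half-removed controller.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical controller entry; not the real struct nvme_fc_ctrl. */
struct listed_ctrl {
	int id;
	struct listed_ctrl *next;
};

/* One global list guarded by one lock. */
static struct listed_ctrl *ctrl_list;
static pthread_mutex_t ctrl_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Register a controller; returns 0 on success, -1 on allocation failure. */
int listed_ctrl_add(int id)
{
	struct listed_ctrl *ctrl = malloc(sizeof(*ctrl));

	if (!ctrl)
		return -1;
	ctrl->id = id;

	pthread_mutex_lock(&ctrl_list_lock);
	ctrl->next = ctrl_list;
	ctrl_list = ctrl;
	pthread_mutex_unlock(&ctrl_list_lock);
	return 0;
}

/* Unlink one controller under the lock; returns 1 if it was found. */
int listed_ctrl_delete(int id)
{
	struct listed_ctrl **pp, *victim = NULL;
	int found = 0;

	pthread_mutex_lock(&ctrl_list_lock);
	for (pp = &ctrl_list; *pp; pp = &(*pp)->next) {
		if ((*pp)->id == id) {
			victim = *pp;
			*pp = victim->next;
			found = 1;
			break;
		}
	}
	pthread_mutex_unlock(&ctrl_list_lock);

	free(victim);	/* free(NULL) is a no-op */
	return found;
}

/* "Module exit": tear down everything left; returns how many were removed. */
int listed_ctrl_delete_all(void)
{
	int n = 0;

	pthread_mutex_lock(&ctrl_list_lock);
	while (ctrl_list) {
		struct listed_ctrl *ctrl = ctrl_list;

		ctrl_list = ctrl->next;
		free(ctrl);
		n++;
	}
	pthread_mutex_unlock(&ctrl_list_lock);
	return n;
}
```

A kernel version would use a mutex with the kernel's list helpers and would
still have to wait for in-flight delete work before the module text goes away.
This sketch does not model that part.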

Daniel Wagner (6):
  nvme-fabrics: introduce connect_sync option
  nvme-fc: rename free_ctrl callback to match name pattern
  nvme-fc: do not retry when auth fails or connection is refused
  nvme-fabrics: introduce ref counting for nvmf_ctrl_options
  nvme-fc: redesign locking and refcounting
  nvme-fc: wait for connect attempt to finish

 drivers/nvme/host/fabrics.c |  28 +++++-
 drivers/nvme/host/fabrics.h |   9 +-
 drivers/nvme/host/fc.c      | 180 ++++++++++++++++--------------------
 drivers/nvme/host/rdma.c    |  18 +++-
 drivers/nvme/host/tcp.c     |  21 +++--
 drivers/nvme/target/loop.c  |  19 ++--
 6 files changed, 150 insertions(+), 125 deletions(-)

-- 
2.43.0



Thread overview: 17+ messages
2024-02-16  8:45 [PATCH v0 0/6] nvme-fc: fix blktests nvme/041 Daniel Wagner
2024-02-16  8:45 ` [PATCH v0 1/6] nvme-fabrics: introduce connect_sync option Daniel Wagner
2024-02-16  9:49   ` Christoph Hellwig
2024-02-16 16:44     ` Daniel Wagner
2024-02-20  6:51       ` Christoph Hellwig
2024-02-17 16:27     ` Hannes Reinecke
2024-02-16  8:45 ` [PATCH v0 2/6] nvme-fc: rename free_ctrl callback to match name pattern Daniel Wagner
2024-02-16  9:49   ` Christoph Hellwig
2024-02-16  8:45 ` [PATCH v0 3/6] nvme-fc: do not retry when auth fails or connection is refused Daniel Wagner
2024-02-16  9:49   ` Christoph Hellwig
2024-02-16  8:45 ` [PATCH v0 4/6] nvme-fabrics: introduce ref counting for nvmf_ctrl_options Daniel Wagner
2024-02-16  9:50   ` Christoph Hellwig
2024-02-16  8:45 ` [PATCH v0 5/6] nvme-fc: redesign locking and refcounting Daniel Wagner
2024-02-16  9:51   ` Christoph Hellwig
2024-02-16 11:09   ` Hannes Reinecke
2024-02-16 12:40     ` Daniel Wagner
2024-02-16  8:45 ` [PATCH v0 6/6] nvme-fc: wait for connect attempt to finish Daniel Wagner
