public inbox for linux-kernel@vger.kernel.org
* [RFC v1 0/4] nvmet-fc blktests & autoconnect fixes
@ 2023-08-29  9:13 Daniel Wagner
  2023-08-29  9:13 ` [RFC v1 1/4] nvmet-trace: avoid dereferencing pointer too early Daniel Wagner
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Daniel Wagner @ 2023-08-29  9:13 UTC
  To: linux-nvme
  Cc: linux-kernel, Hannes Reinecke, Sagi Grimberg, Jason Gunthorpe,
	James Smart, Chaitanya Kulkarni, Christoph Hellwig, Daniel Wagner

Currently, blktests will pass with the patches [1] and the revert of
[2]. This is possible because blktests still disables the
nvmf-autoconnect service [3].

As I previously reported, blktests is able to trigger various kernel
panics with the system auto-connect running in the background. Let's try
to fix these problems.

The first two patches fix the nvmet ftrace infrastructure. I think
they could go in right now.

The third patch changes the way the refcounting for associations and
queues is done. There is a cyclic dependency between these two objects,
which makes the shutdown path very complex and error prone. As the
lifetime of the queues is coupled to the association, I decided to drop
the refcounting of the queues and rely only on the refcounts of the
association. This made the code a bit simpler to follow and also allowed
the cleanup path to be split into two halves. The first half removes the
association from the association RCU list and waits for a grace period,
so we know that no new I/Os will enter any queues. Then we drop the
refcounts and actually remove any resources once the refcount drops
to 0 (all in-flight I/O has been processed). nvme/003 is particularly
good at triggering crashes in this path.
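
For readers who want to see the two-half teardown in isolation, here is
a minimal userspace sketch. It is not the nvmet-fc code: every name in
it (struct assoc, assoc_unlist(), io_start(), teardown_demo(), ...) is
hypothetical, a plain atomic flag stands in for membership in the RCU
list, and the grace-period wait is only marked by a comment. It only
illustrates the idea that the association holds the single refcount and
the queues are freed together with it.

```c
/* Userspace sketch only; all names are hypothetical, not nvmet-fc code. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define NUM_QUEUES 2

struct queue {
	int id;
};

struct assoc {
	atomic_int ref;		/* the only refcount; it covers the queues too */
	atomic_bool listed;	/* stand-in for membership in the RCU list */
	struct queue *queues[NUM_QUEUES];
};

static int assocs_freed;	/* bookkeeping for the demo only */

static struct assoc *assoc_alloc(void)
{
	struct assoc *a = calloc(1, sizeof(*a));

	if (!a)
		abort();
	atomic_init(&a->ref, 1);	/* initial reference held by the owner */
	atomic_init(&a->listed, true);
	for (int i = 0; i < NUM_QUEUES; i++) {
		a->queues[i] = calloc(1, sizeof(struct queue));
		if (!a->queues[i])
			abort();
		a->queues[i]->id = i;
	}
	return a;
}

/* Take a reference unless the count has already dropped to zero. */
static bool assoc_tryget(struct assoc *a)
{
	int old = atomic_load(&a->ref);

	while (old > 0) {
		if (atomic_compare_exchange_weak(&a->ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; the last put frees the queues and the association. */
static void assoc_put(struct assoc *a)
{
	if (atomic_fetch_sub(&a->ref, 1) == 1) {
		for (int i = 0; i < NUM_QUEUES; i++)
			free(a->queues[i]);
		free(a);
		assocs_freed++;
	}
}

/* New I/O may only enter while the association is still listed. */
static bool io_start(struct assoc *a)
{
	if (!atomic_load(&a->listed))
		return false;
	return assoc_tryget(a);
}

static void io_done(struct assoc *a)
{
	assoc_put(a);
}

/*
 * First half of the teardown: make the association unreachable for new
 * I/O. In kernel code this is where the entry would leave the RCU list
 * and a grace period would be awaited; the sketch only flips a flag.
 */
static void assoc_unlist(struct assoc *a)
{
	atomic_store(&a->listed, false);
}

/* Walks through the sequence; returns 0 if it behaved as described. */
int teardown_demo(void)
{
	struct assoc *a = assoc_alloc();
	bool started, late;
	int freed_early;

	assocs_freed = 0;
	started = io_start(a);		/* an in-flight I/O holds a reference */
	assert(atomic_load(&a->ref) == 2);
	assoc_unlist(a);		/* first half */
	late = io_start(a);		/* rejected: no new I/O after unlist */
	assoc_put(a);			/* second half: drop the initial ref */
	freed_early = assocs_freed;	/* still 0, one I/O is in flight */
	if (started)
		io_done(a);		/* last reference gone, resources freed */
	return (started && !late && freed_early == 0 &&
		assocs_freed == 1) ? 0 : 1;
}
```

Calling teardown_demo() from a small main() walks through the sequence:
one I/O is in flight when the association is unlisted, a late I/O is
refused, and the memory is released only when the in-flight I/O
completes.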

nvme/005 is triggering crashes in get discovery log page. The req->port
pointer was never assigned a valid pointer. It looks like there is a way
to end up with no port entry binding (remember we have the external
autoconnect running in the background).

Unfortunately, there is still some more fallout, but I thought I'd post
these patches now, while my memory is fresh, in case there are any
questions.

[1] https://lore.kernel.org/linux-nvme/sgoyzwj6ckrdrpq22u6fhtcemul5rqj6de4l5gw73vz77o3ils@vmv3jue4rom7/
[2] linux: ee6fdc5055e9 ("nvme-fc: fix race between error recovery and creating association")
[3] blktests: 0478dce70696 ("nvme/rc: Avoid triggering host nvme-cli autoconnect")

Daniel Wagner (4):
  nvmet-trace: avoid dereferencing pointer too early
  nvmet-trace: null terminate device name string correctly
  nvmet-fc: untangle cross refcounting objects
  nvmet-discovery: do not use invalid port

 drivers/nvme/target/discovery.c |  9 +++++
 drivers/nvme/target/fc.c        | 67 ++++++++++++++++-----------------
 drivers/nvme/target/trace.c     |  6 +--
 drivers/nvme/target/trace.h     | 28 +++++++-------
 4 files changed, 60 insertions(+), 50 deletions(-)

-- 
2.41.0



end of thread, other threads:[~2023-09-13 11:58 UTC | newest]

Thread overview: 23+ messages
2023-08-29  9:13 [RFC v1 0/4] nvmet-fc blktests & autoconnect fixes Daniel Wagner
2023-08-29  9:13 ` [RFC v1 1/4] nvmet-trace: avoid dereferencing pointer too early Daniel Wagner
2023-09-05  6:48   ` Christoph Hellwig
2023-09-05  8:24     ` Daniel Wagner
2023-09-05  8:33       ` Christoph Hellwig
2023-09-06 11:00   ` Hannes Reinecke
2023-08-29  9:13 ` [RFC v1 2/4] nvmet-trace: null terminate device name string correctly Daniel Wagner
2023-09-05  6:49   ` Christoph Hellwig
2023-09-05 10:25     ` Daniel Wagner
2023-09-06 11:01   ` Hannes Reinecke
2023-08-29  9:13 ` [RFC v1 3/4] nvmet-fc: untangle cross refcounting objects Daniel Wagner
2023-09-06 11:22   ` Hannes Reinecke
2023-09-11 10:08     ` Daniel Wagner
2023-08-29  9:13 ` [RFC v1 4/4] nvmet-discovery: do not use invalid port Daniel Wagner
2023-09-05  6:50   ` Christoph Hellwig
2023-09-05 10:40     ` Daniel Wagner
2023-09-11 14:44       ` Daniel Wagner
2023-09-11 18:19         ` Daniel Wagner
2023-09-12  6:38           ` Daniel Wagner
2023-09-13 11:35             ` Christoph Hellwig
2023-09-13 11:59               ` Daniel Wagner
2023-09-06 11:23   ` Hannes Reinecke
2023-08-29  9:13 ` [RFC v1 4/4] nvmet-discovery: Do " Daniel Wagner
