From: keith.busch@intel.com (Keith Busch)
Subject: [PATCH] nvme-pci: Fix rapid add remove sequence
Date: Thu, 31 Jan 2019 14:05:49 -0700
Message-ID: <20190131210548.GA21082@localhost.localdomain>
In-Reply-To: <a3b0f71e-0694-36eb-4eff-2a20d51a6d5d@grimberg.me>

On Thu, Jan 31, 2019 at 12:54:03PM -0800, Sagi Grimberg wrote:
> 
> > A surprise removal may fail to tear down request queues if it is racing
> > with the initial asynchronous probe. If that happens, the remove path
> > won't see the queue resources to tear down, and the controller reset
> > path may create a new request queue on a removed device, but will not
> > be able to make forward progress, deadlocking the pci removal.
> 
> Doesn't pci removal flush the reset work before making forward
> progress? Perhaps what is needed that it will flush it earlier instead
> of serializing with the shutdown lock?

Removal does flush reset work, but that doesn't help with this
particular issue. The race is pretty timing sensitive to trigger.

Before flushing reset on a surprise removal, we do an ungraceful device
teardown first in order to unblock any IO that reset work is waiting on.

In this case that Alex discovered, though, the surprise removal happens
just before the nvme driver has set up the admin and io tagsets, so
removal doesn't find any tagsets to kill, and proceeds with flushing
the reset work.

The reset work, though, allocates brand new tagsets right after that.
They look good to use, so it dispatches an admin command to a device
that's gone.
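
To make the ordering concrete, here is a minimal single-threaded sketch
that replays the two interleavings described above. It is only a model,
not driver code: struct model_dev and every model_* name are
hypothetical, and each real code path is collapsed into a single step.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for per-device driver state. */
struct model_dev {
    bool present;        /* false once the device is surprise-removed */
    bool tagsets_ready;  /* admin and io tagsets have been allocated */
    bool tagsets_killed; /* removal found the tagsets and killed them */
    bool cmd_stuck;      /* an admin command was sent to a gone device */
};

/* Removal path: ungraceful teardown, done before flushing reset work. */
static void model_remove_teardown(struct model_dev *d)
{
    d->present = false;
    if (d->tagsets_ready)
        d->tagsets_killed = true; /* unblocks anything reset waits on */
    /* Otherwise there is nothing to kill yet and removal moves on. */
}

/* Reset work: allocate the tagsets, then dispatch an admin command. */
static void model_reset_setup(struct model_dev *d)
{
    d->tagsets_ready = true;
    if (!d->present && !d->tagsets_killed)
        d->cmd_stuck = true; /* nothing is left to complete or cancel it */
}

/* Replay one ordering and report whether a command ends up stuck. */
static bool model_replay(bool removal_first)
{
    struct model_dev d = { .present = true };

    if (removal_first) {
        model_remove_teardown(&d);
        model_reset_setup(&d);
    } else {
        model_reset_setup(&d);
        model_remove_teardown(&d);
    }
    return d.cmd_stuck;
}
```

In this model, model_replay(true) reports a stuck command because the
teardown ran before any tagsets existed, while model_replay(false) does
not, since the teardown sees the tagsets and kills them.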

You might expect the nvme_timeout() work to trigger 60 seconds later,
but we can't use that when the pci device is not in a normal channel
state. I wouldn't want to wait 60 seconds either, so the removal task
needs to handle getting things unblocked.
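
As a rough illustration of why the timeout can't rescue the stuck
command, the sketch below models a timeout handler that only attempts
recovery on a normal channel. The types and the model_* names are
hypothetical, and re-arming the timer on a non-normal channel is an
assumption of the model rather than a copy of nvme_timeout().

```c
#include <stdbool.h>

/* Hypothetical model types; these are not the kernel's definitions. */
enum model_channel { MODEL_CHANNEL_NORMAL, MODEL_CHANNEL_OFFLINE };
enum model_timeout_action { MODEL_RECOVER, MODEL_REARM_TIMER };

/* Assumed behavior: recovery is attempted only on a normal channel. */
static enum model_timeout_action model_timeout(enum model_channel ch)
{
    if (ch != MODEL_CHANNEL_NORMAL)
        return MODEL_REARM_TIMER; /* the stuck command stays stuck */
    return MODEL_RECOVER;
}

/*
 * Count how many timeout periods pass before recovery starts.
 * Returns -1 if recovery never starts within 'cap' periods.
 */
static int model_periods_until_recovery(enum model_channel ch, int cap)
{
    int periods = 0;

    while (periods < cap) {
        periods++;
        if (model_timeout(ch) == MODEL_RECOVER)
            return periods;
    }
    return -1;
}
```

With a normal channel the model recovers after the first period. With
the channel offline it never does, however large the cap, which is why
the removal task has to do the unblocking itself.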
