From: Lukas Wunner <lukas@wunner.de>
To: Keith Busch <kbusch@kernel.org>
Cc: James Puthukattukaran <james.puthukattukaran@oracle.com>,
Bjorn Helgaas <helgaas@kernel.org>,
Hans de Goede <hdegoede@redhat.com>,
linux-pci@vger.kernel.org
Subject: Re: [External] : Re: sysfs interface to force power off
Date: Tue, 8 Nov 2022 21:16:53 +0100
Message-ID: <20221108201653.GA4919@wunner.de>
In-Reply-To: <Y2p//Eqa9HGRmwWW@kbusch-mbp>
On Tue, Nov 08, 2022 at 09:12:44AM -0700, Keith Busch wrote:
> On Mon, Nov 07, 2022 at 04:14:54PM -0500, James Puthukattukaran wrote:
> >
> > There is a path to disable the controller and that code ran but did
> > not help. I checked with the nvme folks and Keith mentioned that there
> > might be an issue with the nvme queue management. Unfortunately, we
> > can't try newer kernels in the field. So, looking for a way to just
> > "shut off the device" when we have scenarios like this where we can't
> > untangle the mess.
>
> Well, I didn't request you try new kernels in the field. I asked if you
> could experiment with a newer one on a development machine to confirm if
> the bug was fixed by some of the significant changes in this path so
> that we could confirm a reason to port to stable. You're going to have
> to change your kernel to fix this observation, so it would be worth the
> effort to know if the changes being considered actually address the
> problem.
Current mainline still contains this problematic sequence:

  nvme_reset_work()
    nvme_wait_freeze()
      blk_mq_freeze_queue_wait()

So I'm inclined to believe that the issue still persists, but I agree
that validating that hypothesis with a contemporary kernel should be
the first step.
I think nvme_reset_work() is overly optimistic that resetting the drive
succeeded. It just freezes and unfreezes the I/O queue without checking
for errors.
In particular, nvme_wait_freeze() should call the _timeout variant of
blk_mq_freeze_queue_wait() and cope with failure of freezing.
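
To illustrate the suggestion above, here is a rough userspace model in C, not kernel code. Every name prefixed with mock_ is hypothetical and exists only for this sketch. It assumes the timed wait follows the usual wait_event_timeout() convention (0 on timeout, otherwise the remaining ticks, at least 1), and it simplifies by treating each queue's drain time as additional waiting on top of the previous queue's wait. The error path in the reset function is a placeholder for whatever recovery the driver would actually choose.

```c
#include <stddef.h>

/* Hypothetical stand-in for one namespace's request queue. */
struct mock_queue {
	long drain_time;	/* ticks of waiting needed; < 0 means it never drains */
};

/*
 * Models a timed freeze wait. Assumed return convention (as with
 * wait_event_timeout()): 0 if the timeout elapsed, otherwise the
 * remaining ticks, which is at least 1.
 */
static long mock_freeze_wait_timeout(const struct mock_queue *q, long timeout)
{
	long left;

	if (q->drain_time < 0 || q->drain_time > timeout)
		return 0;
	left = timeout - q->drain_time;
	return left > 0 ? left : 1;
}

/*
 * Wait for all queues to freeze within one shared time budget.
 * Returns the remaining ticks, or 0 if any queue failed to freeze.
 */
static long mock_wait_freeze_timeout(const struct mock_queue *queues,
				     size_t n, long timeout)
{
	size_t i;

	for (i = 0; i < n; i++) {
		timeout = mock_freeze_wait_timeout(&queues[i], timeout);
		if (timeout <= 0)
			return 0;
	}
	return timeout;
}

/* Reset path that checks the freeze result instead of assuming success. */
static int mock_reset_work(const struct mock_queue *queues, size_t n,
			   long timeout)
{
	if (mock_wait_freeze_timeout(queues, n, timeout) == 0)
		return -1;	/* placeholder: give up on the controller */
	return 0;		/* freeze completed; safe to unfreeze and continue */
}
```

With this shape, a queue that never drains makes the reset fail after a bounded wait rather than blocking indefinitely.
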
Thanks,
Lukas