From: jsmart2021@gmail.com (James Smart)
Subject: [PATCH v2] nvme: validate controller state before rescheduling keep alive
Date: Tue, 27 Nov 2018 17:04:44 -0800 [thread overview]
Message-ID: <20181128010444.8747-1-jsmart2021@gmail.com> (raw)
Delete operations are seeing NULL pointer references in call_timer_fn.
Tracking these back, the timer appears to be the keep alive timer.
nvme_keep_alive_work() which is tied to the timer that is cancelled
by nvme_stop_keep_alive(), simply starts the keep alive io but doesn't
wait for it's completion. So nvme_stop_keep_alive() only stops a timer
when it's pending. When a keep alive is in flight, there is no timer
running and the nvme_stop_keep_alive() will have no affect on the keep
alive io. Thus, if the io completes successfully, the keep alive timer
will be rescheduled. In the failure case, delete is called, the
controller state is changed, the nvme_stop_keep_alive() is called while
the io is outstanding, and the delete path continues on. The keep
alive happens to successfully complete before the delete paths mark it
as aborted as part of the queue termination, so the timer is restarted.
The delete paths then tear down the controller, and later on the timer
code fires and the timer entry is now corrupt.
Fix by validating the controller state before rescheduling the keep
alive. Testing with the fix has confirmed the condition above was hit.
Signed-off-by: James Smart <jsmart2021 at gmail.com>
---
v2:
added locking around controller state check. Could have used rmb
and wmb, but this isn't a performance condition.
---
drivers/nvme/host/core.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index bb39b91253c2..7c2184fa2917 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -831,6 +831,8 @@ static int nvme_submit_user_cmd(struct request_queue *q,
static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status)
{
struct nvme_ctrl *ctrl = rq->end_io_data;
+ unsigned long flags;
+ bool startka = false;
blk_mq_free_request(rq);
@@ -841,7 +843,13 @@ static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status)
return;
}
- schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);
+ spin_lock_irqsave(&ctrl->lock, flags);
+ if (ctrl->state == NVME_CTRL_LIVE ||
+ ctrl->state == NVME_CTRL_CONNECTING)
+ startka = true;
+ spin_unlock_irqrestore(&ctrl->lock, flags);
+ if (startka)
+ schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);
}
static int nvme_keep_alive(struct nvme_ctrl *ctrl)
--
2.13.7
next reply other threads:[~2018-11-28 1:04 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-28 1:04 James Smart [this message]
2018-12-04 15:10 ` [PATCH v2] nvme: validate controller state before rescheduling keep alive Christoph Hellwig
2018-12-04 17:07 ` Sagi Grimberg
2018-12-04 22:24 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181128010444.8747-1-jsmart2021@gmail.com \
--to=jsmart2021@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox