From: Greg Joyce <gjoyce@linux.ibm.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org, kbusch@kernel.org, axboe@fb.com,
sagi@grimberg.me, hare@suse.de, dwagner@suse.de,
msuchanek@suse.de, jonathan.derrick@linux.dev,
okozina@redhat.com, nilay@linux.ibm.com
Subject: Re: [PATCH 1/1] nvme: retry security commands if media not ready
Date: Wed, 02 Oct 2024 11:51:56 -0500
Message-ID: <dbfd87e4900bea3a28ac46f5abcfb39ef2c13fc1.camel@linux.ibm.com>
In-Reply-To: <20241002081633.GA22436@lst.de>
On Wed, 2024-10-02 at 10:16 +0200, Christoph Hellwig wrote:
> On Mon, Sep 30, 2024 at 11:48:43AM -0500, gjoyce@linux.ibm.com wrote:
> > +static u32 nvme_get_timeout(struct nvme_ctrl *ctrl)
>
> get_timeout feels a bit too generic for this specific controller/media
> ready timeout.
Agreed. I'll change the name to something more specific.
>
> > +	timeout = NVME_CAP_TIMEOUT(ctrl->cap);
> > +	if (ctrl->cap & NVME_CAP_CRMS_CRWMS) {
> > +		u32 crto, ready_timeout;
> > +
> > +		ret = ctrl->ops->reg_read32(ctrl, NVME_REG_CRTO, &crto);
> > +		if (ret) {
> > +			dev_err(ctrl->device, "Reading CRTO failed (%d)\n",
> > +				ret);
> > +			return ret;
> > +		}
>
> And we really should be caching these values instead of reading the
> register for every security command.
If we cache the timeout value(s) in struct nvme_ctrl, it may be possible
to eliminate nvme_get_timeout() entirely. Is this the approach you had
in mind?
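The caching idea can be pictured as computing the media-ready timeout
once, when the controller is enabled, and storing it with the rest of the
controller state. Below is a minimal userspace sketch in C, not kernel
code. The struct, the field media_ready_timeout_ms, and both functions are
hypothetical names. The macros are local stand-ins modeled on the CAP.TO,
CAP.CRMS and CRTO.CRWMT fields, and using CRWMT as the media-ready bound
(with CAP.TO as the fallback) is an assumption, not something the patch
confirms.

```c
#include <stdint.h>

/* Local stand-ins modeled on the NVMe CAP and CRTO register fields. */
#define CAP_TO(cap)      ((uint32_t)(((cap) >> 24) & 0xff)) /* 500 ms units */
#define CAP_CRMS_CRWMS   (1ULL << 59)       /* CRTO.CRWMT is supported */
#define CRTO_CRWMT(crto) ((crto) & 0xffff)  /* 500 ms units */

/* Hypothetical slice of controller state holding the cached value. */
struct ctrl_model {
	uint32_t media_ready_timeout_ms;
};

/*
 * Assumption: the media is ready no later than CRWMT after enable when
 * the controller reports CRWMS, otherwise no later than CAP.TO. The
 * larger of the two is used so the bound is never shorter than CAP.TO.
 */
uint32_t media_ready_timeout_ms(uint64_t cap, uint32_t crto)
{
	uint32_t units = CAP_TO(cap);

	if (cap & CAP_CRMS_CRWMS) {
		uint32_t crwmt = CRTO_CRWMT(crto);

		if (crwmt > units)
			units = crwmt;
	}
	return units * 500; /* convert 500 ms units to milliseconds */
}

/* Compute once at enable time; later commands only read the field. */
void ctrl_cache_timeouts(struct ctrl_model *c, uint64_t cap, uint32_t crto)
{
	c->media_ready_timeout_ms = media_ready_timeout_ms(cap, crto);
}
```

With CAP.TO = 20 and CRTO.CRWMT = 30 (both in 500 ms units) the cached
value is 15000 ms, and no register read is needed per command.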
>
> > +	u32 timeout;
> > +	unsigned long timeout_jiffies;
> > +	int ret;
> > +
> > +	timeout = nvme_get_timeout(ctrl);
> > +	timeout_jiffies = jiffies + timeout * HZ;
> >
> >  	if (send)
> >  		cmd.common.opcode = nvme_admin_security_send;
> > @@ -2335,8 +2376,19 @@ static int nvme_sec_submit(void *data, u16 spsp, u8 secp, void *buffer, size_t l
> >  	cmd.common.cdw10 = cpu_to_le32(((u32)secp) << 24 | ((u32)spsp) << 8);
> >  	cmd.common.cdw11 = cpu_to_le32(len);
> >
> > -	return __nvme_submit_sync_cmd(ctrl->admin_q, &cmd, NULL, buffer, len,
> > +	ret = __nvme_submit_sync_cmd(ctrl->admin_q, &cmd, NULL, buffer, len,
> >  			NVME_QID_ANY, NVME_SUBMIT_AT_HEAD);
> > +	while (ret == NVME_SC_ADMIN_COMMAND_MEDIA_NOT_READY) {
> > +		if (time_after(jiffies, timeout_jiffies)) {
> > +			dev_err(ctrl->device,
> > +				"Device media not ready; aborting\n");
> > +			return -ENODEV;
> > +		}
> > +		ssleep(1);
> > +		ret = __nvme_submit_sync_cmd(ctrl->admin_q, &cmd, NULL, buffer,
> > +				len, NVME_QID_ANY, NVME_SUBMIT_AT_HEAD);
> > +	}
>
> And this also feels a bit odd in that it doesn't catch
> NVME_SC_ADMIN_COMMAND_MEDIA_NOT_READY when it should be ready.
> I think just marking when the controller is past the timeout and
> only doing the retry until then might be the better approach. And
> maybe we should have it in the __nvme_submit_sync_cmd helper for
> admin command as Security Send/Receive aren't the only commands with
> this issue.
>
I'll look at the spec some more, but you're probably correct that once
the status code transitions from NVME_SC_ADMIN_COMMAND_MEDIA_NOT_READY
to success, NVME_SC_ADMIN_COMMAND_MEDIA_NOT_READY should not be
returned again.
I'm a little confused about what you're saying about the timeout.
nvme_enable_ctrl() does determine the correct timeout value and passes
it to nvme_wait_ready(), but NVME_CSTS_RDY is set well before the media
is ready (if CC.CRIME is set). Unfortunately, there doesn't seem to be
any controller status that indicates when the media is ready. I thought
about having nvme_wait_ready() wait the whole timeout if CC.CRIME is
set, but I think that is contrary to the intent of CC.CRIME. And on the
SSD that I'm looking at, the timeout is 15 seconds, which would be a
pretty big hit to boot time.
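One possible reading of the "mark when the controller is past the timeout"
suggestion is to record a deadline when the controller is enabled and to
retry a media-not-ready status only while that deadline has not passed,
so nothing waits unless a command actually fails. Below is a minimal
userspace sketch in C of that decision, not kernel code. The struct and
function names are hypothetical, time is passed in as plain milliseconds
instead of jiffies, and SC_MEDIA_NOT_READY is a local stand-in for
NVME_SC_ADMIN_COMMAND_MEDIA_NOT_READY.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for NVME_SC_ADMIN_COMMAND_MEDIA_NOT_READY. */
#define SC_MEDIA_NOT_READY 0x24

/* Hypothetical per-controller state: when the media must be ready by. */
struct media_state {
	uint64_t ready_deadline_ms;
};

/* Called once when the controller is enabled. */
void media_state_on_enable(struct media_state *m, uint64_t now_ms,
			   uint32_t timeout_ms)
{
	m->ready_deadline_ms = now_ms + timeout_ms;
}

/*
 * Retry only a media-not-ready status, and only before the deadline.
 * After the deadline the same status is treated as a real error.
 */
bool media_should_retry(const struct media_state *m, int status,
			uint64_t now_ms)
{
	return status == SC_MEDIA_NOT_READY &&
	       now_ms < m->ready_deadline_ms;
}
```

Under this model boot is not delayed by the full timeout: the controller
is used as soon as CSTS.RDY is set, and only a command that hits the
not-ready status pays for a retry.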
I was looking to solve the specific case of the security commands but
certainly any of the commands in Figure 103 could have the same issue.
The heart of the problem is that CC.CRIME support is not complete. I'll
see if I can address that in a more comprehensive manner.
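To illustrate what a more general approach could look like, here is a
minimal userspace sketch in C of a retry wrapper that is not tied to the
security commands: it takes any submit callback and retries while the
status is media-not-ready and a deadline has not passed. This is a model
under stated assumptions, not kernel code. All names are hypothetical,
the clock is simulated by advancing a counter instead of sleeping, and
fake_submit() exists only to exercise the loop.

```c
#include <errno.h>
#include <stdint.h>

/* Local stand-in for NVME_SC_ADMIN_COMMAND_MEDIA_NOT_READY. */
#define SC_MEDIA_NOT_READY 0x24

typedef int (*submit_fn)(void *arg);

/* Hypothetical retry context with a simulated clock. */
struct retry_ctx {
	uint64_t now_ms;      /* simulated current time */
	uint64_t deadline_ms; /* media must be ready by this time */
	uint32_t interval_ms; /* delay between attempts */
};

/* Submit once, then retry on media-not-ready until the deadline. */
int submit_with_media_retry(struct retry_ctx *c, submit_fn submit, void *arg)
{
	int status = submit(arg);

	while (status == SC_MEDIA_NOT_READY) {
		if (c->now_ms >= c->deadline_ms)
			return -ENODEV;
		c->now_ms += c->interval_ms; /* stands in for a sleep */
		status = submit(arg);
	}
	return status;
}

/* Fake device: reports not-ready a fixed number of times, then success. */
struct fake_dev {
	uint32_t not_ready_left;
	uint32_t calls;
};

int fake_submit(void *arg)
{
	struct fake_dev *d = arg;

	d->calls++;
	if (d->not_ready_left) {
		d->not_ready_left--;
		return SC_MEDIA_NOT_READY;
	}
	return 0;
}
```

Because the wrapper only sees a callback and a status code, the same loop
would apply to any admin command that can return the not-ready status.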
-Greg