From: jsmart2021@gmail.com (James Smart)
Subject: [PATCH 3/7] nvme_fc: retry failures to set io queue count
Date: Fri, 11 May 2018 13:19:03 -0700
Message-ID: <481a6feb-1a3b-72b4-29a7-310d3525e21c@gmail.com>
In-Reply-To: <20180508080041.62af2693@pentland.suse.de>
On 5/7/2018 11:00 PM, Hannes Reinecke wrote:
> On Mon, 7 May 2018 17:12:10 -0700
> "James Smart" <jsmart2021@gmail.com> wrote:
>
>> During the creation of a new controller association, it's possible for
>> errors and link connectivity issues to cause nvme_set_queue_count() to
>> have its SET_FEATURES command fail with a positive non-zero code. The
>> routine doesn't treat this as a hard error, instead setting the io
>> queue count to zero and returning success. This has the result of the
>> transport setting the io queue count to 0, making the storage
>> controller inoperable. The message "...Could not set queue count..."
>> is seen.
>>
>> Revise the fc transport to detect when it asked for io queues but got
>> back a result of 0 io queues. In such a case, fail the re-connection
>> attempt and fall into the retry loop.
>>
>> Signed-off-by: James Smart <james.smart at broadcom.com>
>> ---
>> drivers/nvme/host/fc.c | 14 ++++++++------
>> 1 file changed, 8 insertions(+), 6 deletions(-)
>>
> The usual problem when having _two_ return values.
> Can't have nvme_set_queue_count() return the number of queues or a
> negative number on failure?
> Then the check would be much simplified.
>
> Cheers,
>
> Hannes
>

The routine that nvme_set_queue_count() uses to issue the SET_FEATURES
command for the queue count returns either a negative error (Linux
error code), 0 for success, or a positive error (NVMe command status
value). On a positive error, nvme_set_queue_count() logs a message,
sets the io queue count to zero, and returns success. On a negative
error, it returns that negative error. On success, it sets the io
queue count and returns zero.
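
To make the three cases concrete, here is a minimal userspace sketch of
that behavior. It is a model only, not the kernel routine: the name
model_set_queue_count() and the 'status' and 'granted' parameters are
hypothetical stand-ins for the SET_FEATURES result and for the number of
io queues the controller granted.

```c
#include <stdio.h>

/*
 * Userspace model, not kernel code. All names here are hypothetical.
 *   status  < 0 : Linux error code from the submission path
 *   status == 0 : command succeeded
 *   status  > 0 : NVMe command status value
 *   granted     : io queues the (simulated) controller granted
 */
static int model_set_queue_count(int status, int granted, int *count)
{
	if (status < 0)
		return status;		/* hard error: pass the errno up */

	if (status > 0) {
		/* Degraded controller: log, keep only the admin queue. */
		fprintf(stderr, "Could not set queue count (%d)\n", status);
		*count = 0;
		return 0;		/* still reported as success */
	}

	/* Success: never hand back more queues than were requested. */
	if (granted < *count)
		*count = granted;
	return 0;
}
```

The point of the sketch is the middle branch: a caller sees a return of 0
and has to inspect *count to learn that no io queues were set up.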

Looking through the history, it appears the reason for not failing the
controller connect when the SET_FEATURES command fails with an nvme
status was to allow the (degraded) controller to come up and at least
offer the adminq for maintenance.

Obviously, my patch reverts that "maintenance" path. This is a
little more complex than desired, as it's not clear whether the nvme
status came from a real device completion or was stamped by the
transport as it recovered from a transport error or connectivity error
(NVME_SC_ABORT_REQ or NVME_SC_INTERNAL). In the former case you want
to continue; in the latter you want to fail. It appears the other
transports stamp only NVME_SC_ABORT_REQ.

I'm going to repost with nvme_set_queue_count() returning failure
(positive nvme result) if one of those two codes is returned. Any other
status will keep the existing behavior.
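
As a rough illustration of that plan, the sketch below models the
revised return convention in userspace. It is not the reposted patch:
model_set_queue_count_v2() and status_is_transport_stamped() are
hypothetical names, the 'status' and 'granted' parameters are simulated
inputs, and the two status values are defined locally (0x6 and 0x7, as
I recall them from the kernel's nvme.h) so the block stands on its own.

```c
#include <stdio.h>

/* Defined locally for the sketch; values assumed to match nvme.h. */
enum {
	NVME_SC_INTERNAL  = 0x6,
	NVME_SC_ABORT_REQ = 0x7,
};

/* Hypothetical helper: statuses a transport may stamp on recovery. */
static int status_is_transport_stamped(int status)
{
	return status == NVME_SC_ABORT_REQ || status == NVME_SC_INTERNAL;
}

/* Userspace model of the proposed behavior, not kernel code. */
static int model_set_queue_count_v2(int status, int granted, int *count)
{
	if (status < 0)
		return status;		/* Linux error code, as before */

	if (status_is_transport_stamped(status))
		return status;		/* positive result: caller retries */

	if (status > 0) {
		/* Any other NVMe status keeps the degraded path. */
		fprintf(stderr, "Could not set queue count (%d)\n", status);
		*count = 0;
		return 0;
	}

	if (granted < *count)
		*count = granted;
	return 0;
}
```

With this shape a caller can treat any non-zero return as a failed
connect attempt, while a zero return with *count == 0 still means a
degraded controller that offers only the admin queue.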
-- james

Thread overview: 18+ messages
2018-05-08  0:12 [PATCH 0/7] nvme_fc: asynchronous controller create and simple discovery James Smart
2018-05-08  0:12 ` [PATCH 1/7] nvme: remove unnecessary controller subnqn validation James Smart
2018-05-08  5:57   ` Hannes Reinecke
2018-05-08  0:12 ` [PATCH 2/7] nvme_fc: remove setting DNR on exception conditions James Smart
2018-05-08  5:58   ` Hannes Reinecke
2018-05-08  0:12 ` [PATCH 3/7] nvme_fc: retry failures to set io queue count James Smart
2018-05-08  6:00   ` Hannes Reinecke
2018-05-11 20:19     ` James Smart [this message]
2018-05-08  0:12 ` [PATCH 4/7] nvme_fc: remove reinit_request routine James Smart
2018-05-08  6:01   ` Hannes Reinecke
2018-05-08  0:12 ` [PATCH 5/7] nvme_fc: change controllers first connect to use reconnect path James Smart
2018-05-08  6:03   ` Hannes Reinecke
2018-05-08  0:12 ` [PATCH 6/7] nvme_fc: fix nulling of queue data on reconnect James Smart
2018-05-08  6:05   ` Hannes Reinecke
2018-05-08 15:12     ` James Smart
2018-05-08 15:28       ` Hannes Reinecke
2018-05-08  0:12 ` [PATCH 7/7] nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device James Smart
2018-05-08  6:06   ` Hannes Reinecke