Linux-NVME Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Hannes Reinecke <hare@suse.de>
To: Sagi Grimberg <sagi@grimberg.me>, Daniel Wagner <dwagner@suse.de>,
	Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <kbusch@kernel.org>,
	James Smart <james.smart@broadcom.com>,
	linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 3/6] nvme-tcp: short-circuit reconnect retries
Date: Wed, 10 Apr 2024 08:01:49 +0200	[thread overview]
Message-ID: <dfb41081-bf2a-4e57-9414-1cb7e678dd3f@suse.de> (raw)
In-Reply-To: <e73b73af-a8c6-49f2-b47a-126fab4836b2@grimberg.me>

On 4/9/24 22:20, Sagi Grimberg wrote:
> 
> 
> On 09/04/2024 12:35, Daniel Wagner wrote:
>> From: Hannes Reinecke <hare@suse.de>
>>
>> Returning an nvme status from nvme_tcp_setup_ctrl() indicates that the
>> association was established and we have received a status from the
>> controller; consequently we should honour the DNR bit. If not any future
>> reconnect attempts will just return the same error, so we can
>> short-circuit the reconnect attempts and fail the connection directly.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> [dwagner: add helper to decide to reconnect]
>> Signed-off-by: Daniel Wagner <dwagner@suse.de>
>> ---
>>   drivers/nvme/host/nvme.h | 24 ++++++++++++++++++++++++
>>   drivers/nvme/host/tcp.c  | 23 +++++++++++++++--------
>>   2 files changed, 39 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
>> index 9b8904a476b8..dfe103283a3d 100644
>> --- a/drivers/nvme/host/nvme.h
>> +++ b/drivers/nvme/host/nvme.h
>> @@ -701,6 +701,30 @@ static inline bool nvme_is_path_error(u16 status)
>>       return (status & 0x700) == 0x300;
>>   }
>> +/*
>> + * Evaluate the status information returned by the LLDD in order to
>> + * decided if a reconnect attempt should be scheduled.
>> + *
>> + * There are two cases where no reconnect attempt should be attempted:
>> + *
>> + * 1) The LLDD reports an negative status. There was an error (e.g. no
>> + *    memory) on the host side and thus abort the operation.
>> + *    Note, there are exception such as ENOTCONN which is
>> + *    not an internal driver error, thus we filter these errors
>> + *    out and retry later.
>> + * 2) The DNR bit is set and the specification states no further
>> + *    connect attempts with the same set of paramenters should be
>> + *    attempted.
>> + */
>> +static inline bool nvme_ctrl_reconnect(int status)
>> +{
>> +    if (status < 0 && status != -ENOTCONN)
>> +        return false;
> 
> So if the host failed to allocate a buffer it will never attempt
> another reconnect? doesn't sound right to me..,

Memory allocation errors are always tricky. If a system is under memory
pressure yours won't be the only process which will exhibit issues,
so it's questionable whether you should retry, or whether it wouldn't
be safer to just abort the operation, and retry manually once the
system has recovered.
Also we just have the interesting case where RDMA goes haywire and
reports a log buffer size of several TB. No amount of retries will
be fixing that.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich



  reply	other threads:[~2024-04-10  6:02 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-09  9:35 [PATCH v5 0/6] nvme-fabrics: short-circuit connect retries Daniel Wagner
2024-04-09  9:35 ` [PATCH v5 1/6] nvme: authentication error are always non-retryable Daniel Wagner
2024-04-09 13:59   ` Christoph Hellwig
2024-04-09 20:16   ` Sagi Grimberg
2024-04-09 20:26   ` Sagi Grimberg
2024-04-10  6:52     ` Daniel Wagner
2024-04-10 10:21       ` Sagi Grimberg
2024-04-10 12:05         ` Hannes Reinecke
2024-04-10 13:50           ` Sagi Grimberg
2024-04-11  7:11             ` Hannes Reinecke
2024-04-11  8:37               ` Sagi Grimberg
2024-04-09  9:35 ` [PATCH v5 2/6] nvmet: lock config semaphore when accessing DH-HMAC-CHAP key Daniel Wagner
2024-04-09  9:35 ` [PATCH v5 3/6] nvme-tcp: short-circuit reconnect retries Daniel Wagner
2024-04-09 13:59   ` Christoph Hellwig
2024-04-09 20:20   ` Sagi Grimberg
2024-04-10  6:01     ` Hannes Reinecke [this message]
2024-04-09 20:40   ` Sagi Grimberg
2024-04-10 10:06     ` Daniel Wagner
2024-04-09  9:35 ` [PATCH v5 4/6] nvme-rdma: " Daniel Wagner
2024-04-09 14:00   ` Christoph Hellwig
2024-04-09 14:19     ` Keith Busch
2024-04-09 20:21       ` Sagi Grimberg
2024-04-09 20:28   ` Sagi Grimberg
2024-04-12  2:50     ` Keith Busch
2024-04-09  9:35 ` [PATCH v5 5/6] nvme-fc: use nvme_ctrl_reconnect to decide reconnect attempts Daniel Wagner
2024-04-09 14:01   ` Christoph Hellwig
2024-04-09  9:35 ` [PATCH v5 6/6] nvmet: return DHCHAP status codes from nvmet_setup_auth() Daniel Wagner
2024-04-09 14:03   ` Christoph Hellwig
2024-04-09 20:23   ` Sagi Grimberg
2024-04-10  6:45     ` Hannes Reinecke
2024-04-12  0:35 ` [PATCH v5 0/6] nvme-fabrics: short-circuit connect retries Keith Busch
2024-04-12  7:24   ` Daniel Wagner
2024-04-12 15:24     ` Keith Busch
2024-04-14  8:34       ` Sagi Grimberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dfb41081-bf2a-4e57-9414-1cb7e678dd3f@suse.de \
    --to=hare@suse.de \
    --cc=dwagner@suse.de \
    --cc=hch@lst.de \
    --cc=james.smart@broadcom.com \
    --cc=kbusch@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox