From: Hannes Reinecke <hare@suse.de>
To: Sagi Grimberg <sagi@grimberg.me>, Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <kbusch@kernel.org>, linux-nvme@lists.infradead.org
Subject: Re: [PATCH 0/3] nvme-tcp: queue stalls under high load
Date: Fri, 20 May 2022 12:01:48 +0200 [thread overview]
Message-ID: <23ab4753-593f-e44a-b768-0c1d3682abb0@suse.de> (raw)
In-Reply-To: <64f57dcf-f3d2-d1db-782e-4f48c542754d@grimberg.me>
On 5/20/22 11:20, Sagi Grimberg wrote:
>
>> Hi all,
>>
>> one of our partners reported queue stalls and I/O timeouts under
>> high load. Analysis revealed an extremely 'choppy' I/O pattern
>> when running large transfers on systems with low-performance
>> links (e.g. 1GigE networks).
>> We had a system with 30 queues trying to transfer 128M requests; a
>> simple calculation shows that transferring a _single_ request on
>> all queues can take up to 38 seconds, thereby timing out the last
>> request before it even got sent.
>> As a solution I first fixed up the timeout handler to reset the
>> timeout if the request is still queued or in the process of being
>> sent. The second patch modifies the send path to only allow new
>> requests if we have enough space on the TX queue, and the third
>> breaks up the send loop to avoid system stalls when sending large
>> requests.
>
> What is the average latency you are seeing with this test?
> I'm guessing more than 30 seconds :)
Yes, of course. Simple maths, in the end.
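For anyone reaching for a calculator, a quick sanity check of the 38-second figure, assuming ~100 MB/s effective 1GigE throughput and 128 MB per request (both assumptions read off the cover letter, not measured values):

```python
# Back-of-the-envelope check of the 38-second figure from the cover
# letter. Assumes ~100 MB/s effective 1GigE throughput and 128 MB
# request payloads; both are assumptions, not measured numbers.
QUEUES = 30
REQUEST_MB = 128      # per-request payload, MB (assumed)
LINK_MB_S = 100       # effective 1GigE throughput, MB/s (assumed)

# All 30 queues share the same link, so pushing one request out on
# every queue serializes on the wire to roughly:
worst_case_s = QUEUES * REQUEST_MB / LINK_MB_S
print(worst_case_s)   # 38.4 -- past the default 30s I/O timeout
```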
(Actually it's more, as we're always triggering a reconnect cycle...)
And telling the customer to change their test case only helps _so_ much.
Cheers,
Hannes
--
Dr. Hannes Reinecke            Kernel Storage Architect
hare@suse.de                   +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer
Thread overview: 24+ messages
2022-05-19 6:26 [PATCH 0/3] nvme-tcp: queue stalls under high load Hannes Reinecke
2022-05-19 6:26 ` [PATCH 1/3] nvme-tcp: spurious I/O timeout " Hannes Reinecke
2022-05-20 9:05 ` Sagi Grimberg
2022-05-23 8:42 ` Hannes Reinecke
2022-05-23 13:36 ` Sagi Grimberg
2022-05-23 14:01 ` Hannes Reinecke
2022-05-23 15:05 ` Sagi Grimberg
2022-05-23 16:07 ` Hannes Reinecke
2022-05-24 7:57 ` Sagi Grimberg
2022-05-24 8:08 ` Hannes Reinecke
2022-05-24 8:53 ` Sagi Grimberg
2022-05-24 9:34 ` Hannes Reinecke
2022-05-24 9:58 ` Sagi Grimberg
2022-05-19 6:26 ` [PATCH 2/3] nvme-tcp: Check for write space before queueing requests Hannes Reinecke
2022-05-20 9:17 ` Sagi Grimberg
2022-05-20 10:05 ` Hannes Reinecke
2022-05-21 20:01 ` Sagi Grimberg
2022-05-19 6:26 ` [PATCH 3/3] nvme-tcp: send quota for nvme_tcp_send_all() Hannes Reinecke
2022-05-20 9:19 ` Sagi Grimberg
2022-05-20 9:59 ` Hannes Reinecke
2022-05-21 20:02 ` Sagi Grimberg
2022-05-20 9:20 ` [PATCH 0/3] nvme-tcp: queue stalls under high load Sagi Grimberg
2022-05-20 10:01 ` Hannes Reinecke [this message]
2022-05-21 20:03 ` Sagi Grimberg