Linux-NVME Archive on lore.kernel.org
From: Hannes Reinecke <hare@suse.de>
To: Sagi Grimberg <sagi@grimberg.me>, Hannes Reinecke <hare@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>, Keith Busch <kbusch@kernel.org>,
	linux-nvme@lists.infradead.org
Subject: Re: [PATCH 4/4] nvme-tcp: switch to 'cpu' affinity scope for unbound workqueues
Date: Thu, 4 Jul 2024 17:54:51 +0200	[thread overview]
Message-ID: <b6c7fe9e-d15d-45be-912b-d9ef35c15c49@suse.de> (raw)
In-Reply-To: <ed21cca6-4d79-4edc-935b-20e342a4d352@grimberg.me>

On 7/4/24 11:11, Sagi Grimberg wrote:
> 
> 
> On 7/3/24 18:50, Hannes Reinecke wrote:
[ .. ]
>>
>> As you can see, with unbound and 'cpu' affinity we are basically on par
>> with the default implementations (all tests are run with per-controller
>> workqueues, mind).
> 
> I'm puzzled that the seq vs. rand results vary this much when you work 
> against a brd device.
> Are these results stable?
> 
There is quite a bit of jitter between individual runs, but the overall 
picture doesn't change.

>> Running the same workload with 4 subsystems and 8 paths will run into
>> I/O timeouts for the default implementation, but perfectly succeed with
>> unbound and 'cpu' affinity.
>> So definitely an improvement there.
> 
> I tend to think that the io timeouts are caused by a bug, not by "non 
> optimized" code. io timeouts are eternity for this test, which makes me
> think we have a different issue here.

I did some latency measurements for the send and receive loop, and found 
that we are in fact starved by the receive side. The sending side is 
bounded reasonably well by the 'deadline' setting, but the receiving 
side has no such precaution, and I have seen per-queue receive latencies 
of over 5 milliseconds.
The worrying thing here was that only individual queues were affected; 
most queues had the expected latency of around 50 usecs, but some went 
up to thousands of usecs. And those were exactly the queues that were 
generating the I/O timeouts.

I have now modified the deadline method to cover both the send and the 
receive side, and the results are pretty good: the timeouts are gone, 
and even the overall performance for the 4-subsystem case has gone up.

Will be posting an updated patchset shortly.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich



Thread overview: 37+ messages
2024-07-03 13:50 [PATCH 0/4] nvme-tcp: improve scalability Hannes Reinecke
2024-07-03 13:50 ` [PATCH 1/4] nvme-tcp: per-controller I/O workqueues Hannes Reinecke
2024-07-03 14:11   ` Sagi Grimberg
2024-07-03 14:46     ` Hannes Reinecke
2024-07-03 15:16       ` Sagi Grimberg
2024-07-03 17:07         ` Tejun Heo
2024-07-03 19:14           ` Sagi Grimberg
2024-07-03 19:17             ` Tejun Heo
2024-07-03 19:41               ` Sagi Grimberg
2024-07-04  7:36               ` Hannes Reinecke
2024-07-05  7:10                 ` Christoph Hellwig
2024-07-05  8:11                   ` Hannes Reinecke
2024-07-05  8:16                     ` Jens Axboe
2024-07-04  5:36   ` Christoph Hellwig
2024-07-03 13:50 ` [PATCH 2/4] nvme-tcp: align I/O cpu with blk-mq mapping Hannes Reinecke
2024-07-03 14:19   ` Sagi Grimberg
2024-07-03 14:53     ` Hannes Reinecke
2024-07-03 15:03       ` Sagi Grimberg
2024-07-03 15:40         ` Hannes Reinecke
2024-07-03 19:38           ` Sagi Grimberg
2024-07-03 19:47             ` Sagi Grimberg
2024-07-04  6:43             ` Hannes Reinecke
2024-07-04  9:07               ` Sagi Grimberg
2024-07-04 14:03                 ` Hannes Reinecke
2024-07-04  5:37     ` Christoph Hellwig
2024-07-04  9:13       ` Sagi Grimberg
2024-07-03 13:50 ` [PATCH 3/4] workqueue: introduce helper workqueue_unbound_affinity_scope() Hannes Reinecke
2024-07-03 17:31   ` Tejun Heo
2024-07-04  6:04     ` Hannes Reinecke
2024-07-03 13:50 ` [PATCH 4/4] nvme-tcp: switch to 'cpu' affinity scope for unbound workqueues Hannes Reinecke
2024-07-03 14:22   ` Sagi Grimberg
2024-07-03 15:01     ` Hannes Reinecke
2024-07-03 15:09       ` Sagi Grimberg
2024-07-03 15:50         ` Hannes Reinecke
2024-07-04  9:11           ` Sagi Grimberg
2024-07-04 15:54             ` Hannes Reinecke [this message]
2024-07-05 11:48               ` Sagi Grimberg
