Linux-NVME Archive on lore.kernel.org
From: Tejun Heo <tj@kernel.org>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: Hannes Reinecke <hare@suse.de>, Hannes Reinecke <hare@kernel.org>,
	Christoph Hellwig <hch@lst.de>, Keith Busch <kbusch@kernel.org>,
	linux-nvme@lists.infradead.org
Subject: Re: [PATCH 1/4] nvme-tcp: per-controller I/O workqueues
Date: Wed, 3 Jul 2024 07:07:06 -1000	[thread overview]
Message-ID: <ZoWFOnvgWEiKipW6@slm.duckdns.org> (raw)
In-Reply-To: <ad15b94b-9414-4ec1-9acf-465a8b190fe5@grimberg.me>

Hello,

On Wed, Jul 03, 2024 at 06:16:32PM +0300, Sagi Grimberg wrote:
...
> > > OK, wonder what is the cost here. Is it in ALL conditions better
> > > than a single workqueue?
> > 
> > Well, clearly not on memory-limited systems; a workqueue per controller
> > takes up more memory than a single one. And it's questionable whether
> > such a system isn't underprovisioned for nvme anyway.

Each workqueue does take up some memory but it's not enormous (I think it's
512 + 512 * nr_cpus + some extra + rescuer if MEM_RECLAIM). Each workqueue
is just a frontend to shared backend worker pools, so splitting a workqueue
into multiple that do about the same work usually won't create more workers.

> > We will see more scheduler interaction as the scheduler needs to
> > switch between workqueues, but that was kinda the idea. And I doubt one

This isn't necessarily true. The backend worker pools don't care whether you
have one or multiple workqueues. For per-cpu workqueues, the concurrency
management applies across different workqueues. For unbound workqueues,
because the concurrency limit is per workqueue, queueing enough concurrent
work items may raise the number of concurrently running kworkers, but
that's just because the total concurrency went up. Whether you have one
or many workqueues, as long as they share the same properties, they map to
the same backend worker pools and execute exactly the same way.
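A hypothetical illustration of that per-workqueue limit: each unbound workqueue carries its own max_active (the third argument), so a burst of work queued on one cannot consume the other's execution slots, yet both still draw kworkers from the same shared backend pools. Names and the limit of 16 are made up for the example:

```c
/* Two unbound workqueues with independent max_active limits.  Work on
 * wq_a can saturate its own limit of 16 without throttling wq_b; both
 * are still serviced by the same backend worker pools.
 */
#include <linux/workqueue.h>

static struct workqueue_struct *wq_a, *wq_b;

static int example_init(void)
{
	wq_a = alloc_workqueue("example_a_wq", WQ_UNBOUND, 16);
	wq_b = alloc_workqueue("example_b_wq", WQ_UNBOUND, 16);
	if (!wq_a || !wq_b)
		return -ENOMEM;
	return 0;
}
```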

> > can measure it; the overhead between switching workqueues should be
> > pretty much identical to the overhead switching between workqueue items.

They are identical.

> > I could do some measurements, but really I don't think it'll yield any
> > surprising results.
> 
> I'm just not used to seeing drivers create non-global workqueues. I've seen
> some filesystems have workqueues per-super, but
> it's not a common pattern around the kernel.
> 
> Tejun,
> Is this a pattern that we should pursue? Do multiple symmetric workqueues
> really work better (faster, with less overhead) than
> a single global workqueue?

Yeah, there's nothing wrong with creating multiple workqueues for the right reasons.
Here are some reasons I can think of:

- Not wanting to share the concurrency limit, so that one device can't
  interfere with another. Not sharing a rescuer may also have *some* benefits
  although I doubt it'd be all that noticeable.

- To get separate flush domains. e.g. if you want to be able to do
  flush_workqueue() on the work items that service one device without
  getting affected by work items from other devices.

- To get different per-device workqueue attributes - e.g. maybe you wanna
  confine workers serving a specific device to a subset of CPUs or give them
  higher priority.

Note that separating workqueues does not necessarily change how things are
executed. e.g. You don't get your own kworkers.
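The flush-domain point in particular can be sketched as follows. Again a hypothetical fragment with made-up names, not the actual driver code:

```c
/* Hypothetical teardown path: with a per-device workqueue, the flush
 * waits only for work items queued on this device's workqueue.  Work
 * items from other devices, on their own workqueues, are outside this
 * flush domain.  (Passing WQ_SYSFS at allocation time would also expose
 * per-workqueue attributes such as cpumask and nice under
 * /sys/devices/virtual/workqueue/.)
 */
#include <linux/workqueue.h>

static void example_dev_teardown(struct workqueue_struct *dev_wq)
{
	flush_workqueue(dev_wq);
	destroy_workqueue(dev_wq);
}
```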

Thanks.

-- 
tejun



Thread overview: 37+ messages
2024-07-03 13:50 [PATCH 0/4] nvme-tcp: improve scalability Hannes Reinecke
2024-07-03 13:50 ` [PATCH 1/4] nvme-tcp: per-controller I/O workqueues Hannes Reinecke
2024-07-03 14:11   ` Sagi Grimberg
2024-07-03 14:46     ` Hannes Reinecke
2024-07-03 15:16       ` Sagi Grimberg
2024-07-03 17:07         ` Tejun Heo [this message]
2024-07-03 19:14           ` Sagi Grimberg
2024-07-03 19:17             ` Tejun Heo
2024-07-03 19:41               ` Sagi Grimberg
2024-07-04  7:36               ` Hannes Reinecke
2024-07-05  7:10                 ` Christoph Hellwig
2024-07-05  8:11                   ` Hannes Reinecke
2024-07-05  8:16                     ` Jens Axboe
2024-07-04  5:36   ` Christoph Hellwig
2024-07-03 13:50 ` [PATCH 2/4] nvme-tcp: align I/O cpu with blk-mq mapping Hannes Reinecke
2024-07-03 14:19   ` Sagi Grimberg
2024-07-03 14:53     ` Hannes Reinecke
2024-07-03 15:03       ` Sagi Grimberg
2024-07-03 15:40         ` Hannes Reinecke
2024-07-03 19:38           ` Sagi Grimberg
2024-07-03 19:47             ` Sagi Grimberg
2024-07-04  6:43             ` Hannes Reinecke
2024-07-04  9:07               ` Sagi Grimberg
2024-07-04 14:03                 ` Hannes Reinecke
2024-07-04  5:37     ` Christoph Hellwig
2024-07-04  9:13       ` Sagi Grimberg
2024-07-03 13:50 ` [PATCH 3/4] workqueue: introduce helper workqueue_unbound_affinity_scope() Hannes Reinecke
2024-07-03 17:31   ` Tejun Heo
2024-07-04  6:04     ` Hannes Reinecke
2024-07-03 13:50 ` [PATCH 4/4] nvme-tcp: switch to 'cpu' affinity scope for unbound workqueues Hannes Reinecke
2024-07-03 14:22   ` Sagi Grimberg
2024-07-03 15:01     ` Hannes Reinecke
2024-07-03 15:09       ` Sagi Grimberg
2024-07-03 15:50         ` Hannes Reinecke
2024-07-04  9:11           ` Sagi Grimberg
2024-07-04 15:54             ` Hannes Reinecke
2024-07-05 11:48               ` Sagi Grimberg
