From: David Laight <David.Laight@ACULAB.COM>
To: 'Li Feng' <fengli@smartx.com>, Hannes Reinecke <hare@suse.de>
Cc: Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@fb.com>,
"Christoph Hellwig" <hch@lst.de>,
Sagi Grimberg <sagi@grimberg.me>,
"open list:NVM EXPRESS DRIVER" <linux-nvme@lists.infradead.org>,
open list <linux-kernel@vger.kernel.org>,
"lifeng1519@gmail.com" <lifeng1519@gmail.com>
Subject: RE: [PATCH v2] nvme/tcp: Add support to set the tcp worker cpu affinity
Date: Sat, 15 Apr 2023 21:06:36 +0000 [thread overview]
Message-ID: <3e45f600db2049c4986fd8bb6aea69f4@AcuMS.aculab.com> (raw)
In-Reply-To: <CAHckoCxcmNC++AXELmnCVZNjpHcaOQWOGcjia=NBCnOA7S7EeQ@mail.gmail.com>
From: Li Feng
> Sent: 14 April 2023 10:35
> >
> > On 4/13/23 15:29, Li Feng wrote:
> > > The default worker affinity policy uses all online CPUs, i.e. from 0
> > > to N-1. However, when some CPUs are busy with other jobs, nvme-tcp
> > > performance suffers.
> > >
> > > This patch adds a module parameter to set the cpu affinity for the nvme-tcp
> > > socket worker threads. The parameter is a comma separated list of CPU
> > > numbers. The list is parsed and the resulting cpumask is used to set the
> > > affinity of the socket worker threads. If the list is empty or the
> > > parsing fails, the default affinity is used.
> > >
...
> > I am not in favour of this.
> > NVMe-over-Fabrics has _virtual_ queues, which really have no
> > relationship to the underlying hardware.
> > So trying to be clever here by tying queues to CPUs sort of works if
> > you have one subsystem to talk to, but if you have several where each
> > exposes a _different_ number of queues you end up with a quite
> > suboptimal setting (ie you rely on the resulting cpu sets to overlap,
> > but there is no guarantee that they do).
>
> Thanks for your comment.
> The current io-queue/CPU mapping is not optimal.
> It is stupid: it just walks from CPU 0 to the last CPU, and is not configurable.
Module parameters suck, and passing the buck to the user
when you can't decide how to do something isn't a good idea either.
If the system is busy, pinning threads to CPUs is very hard to
get right.
It can be better to run the threads at the lowest RT
priority - so they have priority over all 'normal' threads -
and give them a very sticky (but not fixed) cpu affinity so
that all such threads tend to get spread out by the scheduler.
This all works best if the number of RT threads isn't greater
than the number of physical cpus.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Thread overview: 28+ messages
[not found] <20230413062339.2454616-1-fengli@smartx.com>
2023-04-13 6:33 ` [PATCH] nvme/tcp: Add support to set the tcp worker cpu affinity Li Feng
2023-04-13 12:53 ` kernel test robot
2023-04-17 13:45 ` Sagi Grimberg
2023-04-18 3:39 ` Li Feng
2023-04-19 9:32 ` Sagi Grimberg
2023-04-25 8:32 ` Li Feng
2023-04-26 11:31 ` Hannes Reinecke
2023-04-27 12:21 ` Sagi Grimberg
2023-04-27 14:36 ` Ming Lei
2023-04-27 12:11 ` Sagi Grimberg
2023-04-18 3:58 ` Chaitanya Kulkarni
2023-04-18 4:21 ` Li Feng
2023-04-18 9:20 ` Li Feng
2023-04-13 13:29 ` [PATCH v2] " Li Feng
2023-04-14 8:36 ` Hannes Reinecke
2023-04-14 9:35 ` Li Feng
2023-04-15 20:21 ` Chaitanya Kulkarni
2023-04-15 21:06 ` David Laight [this message]
2023-04-17 3:31 ` Li Feng
2023-04-17 6:27 ` Hannes Reinecke
2023-04-17 8:32 ` Li Feng
2023-04-17 7:37 ` Ming Lei
2023-04-17 7:50 ` Li Feng
2023-04-17 8:05 ` Ming Lei
2023-04-17 13:33 ` Sagi Grimberg
2023-04-18 3:29 ` Li Feng
2023-04-18 4:33 ` Ming Lei
2023-04-18 9:32 ` Sagi Grimberg