From: Keith Busch <kbusch@kernel.org>
To: Pratyush Yadav <ptyadav@amazon.de>
Cc: Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
Jens Axboe <axboe@kernel.dk>,
linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] nvme-pci: do not set the NUMA node of device if it has none
Date: Wed, 26 Jul 2023 16:25:58 -0600
Message-ID: <ZMGddjINDt10BSvf@kbusch-mbp.dhcp.thefacebook.com>
In-Reply-To: <mafs08rb28o4u.fsf_-_@amazon.de>
On Wed, Jul 26, 2023 at 09:32:33PM +0200, Pratyush Yadav wrote:
> On Wed, Jul 26 2023, Keith Busch wrote:
> > Could you send the output of:
> >
> > numactl --hardware
>
> $ numactl --hardware
> available: 2 nodes (0-1)
> node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
> node 0 size: 245847 MB
> node 0 free: 245211 MB
> node 1 cpus: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
> node 1 size: 245932 MB
> node 1 free: 245328 MB
> node distances:
> node 0 1
> 0: 10 21
> 1: 21 10
>
> >
> > and then with and without your patch:
> >
> > for i in $(cat /proc/interrupts | grep nvme0 | sed "s/^ *//g" | cut -d":" -f 1); do \
> > cat /proc/irq/$i/{smp,effective}_affinity_list; \
> > done
>
> Without my patch:
>
> $ for i in $(cat /proc/interrupts | grep nvme0 | sed "s/^ *//g" | cut -d":" -f 1); do \
> > cat /proc/irq/$i/{smp,effective}_affinity_list; \
> > done
> 41
> 40
> 33
> 33
> 44
> 44
> 9
> 9
> 32
> 32
> 2
> 2
> 6
> 6
> 11
> 11
> 1
> 1
> 35
> 35
> 39
> 39
> 13
> 13
> 42
> 42
> 46
> 46
> 41
> 41
> 46
> 46
> 15
> 15
> 5
> 5
> 43
> 43
> 0
> 0
> 14
> 14
> 8
> 8
> 12
> 12
> 7
> 7
> 10
> 10
> 47
> 47
> 38
> 38
> 36
> 36
> 3
> 3
> 34
> 34
> 45
> 45
> 5
> 5
>
> With my patch:
>
> $ for i in $(cat /proc/interrupts | grep nvme0 | sed "s/^ *//g" | cut -d":" -f 1); do \
> > cat /proc/irq/$i/{smp,effective}_affinity_list; \
> > done
> 9
> 9
> 15
> 15
> 5
> 5
> 23
> 23
> 38
> 38
> 52
> 52
> 21
> 21
> 36
> 36
> 13
> 13
> 56
> 56
> 44
> 44
> 42
> 42
> 31
> 31
> 48
> 48
> 5
> 5
> 3
> 3
> 1
> 1
> 11
> 11
> 28
> 28
> 18
> 18
> 34
> 34
> 29
> 29
> 58
> 58
> 46
> 46
> 54
> 54
> 59
> 59
> 32
> 32
> 7
> 7
> 56
> 56
> 62
> 62
> 49
> 49
> 57
> 57

Hm, I wonder if there's something wrong with my script. All the CPUs
should be accounted for in the smp_affinity_list, assuming it captured
all the vectors of the nvme device, but both examples are missing half
the CPUs. It looks like you have 32 vectors. Does that sound right?
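
For reference, here's a slightly expanded variant of the loop (untested
on your box; it assumes the usual nvme0qN naming in /proc/interrupts and
a typical sysfs layout) that also prints the vector count, the
controller's NUMA node, and the node each IRQ descriptor was assigned
to:

  # count nvme0 vectors (queue interrupts show up as nvme0qN)
  grep -c nvme0q /proc/interrupts

  # NUMA node of the controller's PCI device (-1 means none)
  cat /sys/class/nvme/nvme0/device/numa_node

  # per-vector affinity plus the node of each IRQ descriptor
  for i in $(grep nvme0 /proc/interrupts | sed "s/^ *//g" | cut -d":" -f 1); do
      echo "irq $i: node=$(cat /proc/irq/$i/node)"
      cat /proc/irq/$i/{smp,effective}_affinity_list
  done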

This does show the effective affinity is indeed always on node 0
without your patch. I don't see why, though: the group_cpus_evenly()
function that spreads the interrupts doesn't know anything about the
device the resource is being grouped for, so it shouldn't take the
device's NUMA node into consideration at all. It's just supposed to
ensure every CPU has a shared resource, preferring not to share across
NUMA nodes.

I'll emulate a similar CPU topology with a similar nvme vector count
and see if I can find anything suspicious. I'm a little concerned we
may have the same problem for devices that do have an associated NUMA
node, which your patch wouldn't address.
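
If you want to poke at it too: something along these lines under qemu
should recreate the same two-node, 64-CPU layout with a 32-queue nvme
controller. Treat it as a sketch (the flags are from memory;
max_ioqpairs and the repeated cpus= mappings in particular are worth
checking against your qemu version), and add your usual boot disk and
kernel options:

  qemu-system-x86_64 -machine q35 -m 16G \
      -smp 64,sockets=2,cores=16,threads=2 \
      -object memory-backend-ram,id=mem0,size=8G \
      -object memory-backend-ram,id=mem1,size=8G \
      -numa node,nodeid=0,memdev=mem0,cpus=0-15,cpus=32-47 \
      -numa node,nodeid=1,memdev=mem1,cpus=16-31,cpus=48-63 \
      -drive file=nvme.img,if=none,id=nvm,format=raw \
      -device nvme,drive=nvm,serial=test,max_ioqpairs=32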
Thread overview: 16+ messages
2023-07-25 11:06 [PATCH] nvme-pci: do not set the NUMA node of device if it has none Pratyush Yadav
2023-07-25 14:35 ` Keith Busch
2023-07-26 7:58 ` Sagi Grimberg
2023-07-26 13:14 ` Christoph Hellwig
2023-07-26 15:30 ` Pratyush Yadav
2023-07-26 16:17 ` Keith Busch
2023-07-26 19:32 ` Pratyush Yadav
2023-07-26 22:25 ` Keith Busch [this message]
2023-07-28 18:09 ` Pratyush Yadav
2023-07-28 19:34 ` Keith Busch
2023-08-04 14:50 ` Pratyush Yadav
2023-08-04 15:19 ` Keith Busch
2023-08-08 15:51 ` Pratyush Yadav
2023-08-08 16:35 ` Keith Busch
2024-07-23 9:49 ` Maurizio Lombardi
2024-07-23 14:39 ` Christoph Hellwig