From: Jiri Slaby <jirislaby@kernel.org>
To: 'Guanjun' <guanjun@linux.alibaba.com>,
corbet@lwn.net, axboe@kernel.dk, mst@redhat.com,
jasowang@redhat.com, xuanzhuo@linux.alibaba.com,
eperezma@redhat.com, vgoyal@redhat.com, stefanha@redhat.com,
miklos@szeredi.hu, tglx@linutronix.de, peterz@infradead.org,
akpm@linux-foundation.org, paulmck@kernel.org, thuth@redhat.com,
rostedt@goodmis.org, bp@alien8.de, xiongwei.song@windriver.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-block@vger.kernel.org, virtualization@lists.linux.dev,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH RFC v1 1/2] genirq/affinity: add support for limiting managed interrupts
Date: Fri, 1 Nov 2024 08:06:02 +0100
Message-ID: <0f5c1192-090b-4c00-a951-9613289057df@kernel.org>
In-Reply-To: <20241031074618.3585491-2-guanjun@linux.alibaba.com>
Hi,
On 31. 10. 24, 8:46, 'Guanjun' wrote:
> From: Guanjun <guanjun@linux.alibaba.com>
>
> Commit c410abbbacb9 (genirq/affinity: Add is_managed to struct irq_affinity_desc)
> introduced is_managed bit to struct irq_affinity_desc. Due to queue interrupts
> treated as managed interrupts, in scenarios where a large number of
> devices are present (using massive msix queue interrupts), an excessive number
> of IRQ matrix bits (about num_online_cpus() * nvecs) are reserved during
> interrupt allocation. This sequently leads to the situation where interrupts
> for some devices cannot be properly allocated.
>
> Support for limiting the number of managed interrupts on every node per allocation.
>
> Signed-off-by: Guanjun <guanjun@linux.alibaba.com>
> ---
> .../admin-guide/kernel-parameters.txt | 9 +++
> block/blk-mq-cpumap.c | 2 +-
> drivers/virtio/virtio_vdpa.c | 2 +-
> fs/fuse/virtio_fs.c | 2 +-
> include/linux/group_cpus.h | 2 +-
> kernel/irq/affinity.c | 11 ++--
> lib/group_cpus.c | 55 ++++++++++++++++++-
> 7 files changed, 73 insertions(+), 10 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 9b61097a6448..ac80f35d04c9 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -3238,6 +3238,15 @@
> different yeeloong laptops.
> Example: machtype=lemote-yeeloong-2f-7inch
>
> + managed_irqs_per_node=
> + [KNL,SMP] Support for limiting the number of managed
> + interrupts on every node to prevent the case that
> + interrupts cannot be properly allocated where a large
> + number of devices are present. The default number is 0,
> + that means no limit to the number of managed irqs.
> + Format: integer between 0 and num_possible_cpus() / num_possible_nodes()
> + Default: 0
Kernel parameters suck. Especially here, where you have to guess a value
just to boot properly. Could this be auto-tuned instead?
> --- a/lib/group_cpus.c
> +++ b/lib/group_cpus.c
> @@ -11,6 +11,30 @@
>
> #ifdef CONFIG_SMP
>
> +static unsigned int __read_mostly managed_irqs_per_node;
> +static struct cpumask managed_irqs_cpumsk[MAX_NUMNODES] __cacheline_aligned_in_smp = {
This is quite excessive. On SUSE configs, this is 8192 CPU bits (1 KiB per
cpumask) * 1024 nodes = 1 MiB of static data. For everyone. You have to
allocate this dynamically instead. See e.g. setup_node_to_cpumask_map().
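To put numbers on it, here is a minimal userspace sketch (not kernel code)
comparing the static footprint with an allocation sized by the nodes that
are actually present. The helper names cpumask_bytes(), static_footprint()
and alloc_node_masks() are made up for illustration, and malloc() stands in
for whatever cpumask allocator the kernel code would really use:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Bytes needed for one CPU bitmap of nr_cpus bits, rounded up to whole longs. */
static size_t cpumask_bytes(size_t nr_cpus)
{
	size_t bits_per_long = 8 * sizeof(unsigned long);

	return (nr_cpus + bits_per_long - 1) / bits_per_long * sizeof(unsigned long);
}

/* Footprint of a static array holding one mask per *configured* node. */
static size_t static_footprint(size_t nr_cpus, size_t max_nodes)
{
	return cpumask_bytes(nr_cpus) * max_nodes;
}

/*
 * Allocate one mask per node that actually exists, with every byte set to
 * 0xff (padding bits included, for simplicity).
 * Returns NULL on allocation failure; the caller frees the result.
 */
static unsigned char *alloc_node_masks(size_t nr_cpus, size_t nr_nodes)
{
	size_t bytes = cpumask_bytes(nr_cpus) * nr_nodes;
	unsigned char *masks = malloc(bytes);

	if (masks)
		memset(masks, 0xff, bytes);
	return masks;
}
```

With 8192 CPU bits and 1024 configured nodes, static_footprint() gives
1048576 bytes, while a two-node machine needs only 2048 bytes when the
array is sized at run time.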
> + [0 ... MAX_NUMNODES-1] = {CPU_BITS_ALL}
> +};
> +
> +static int __init irq_managed_setup(char *str)
> +{
> + int ret;
> +
> + ret = kstrtouint(str, 10, &managed_irqs_per_node);
> + if (ret < 0) {
> + pr_warn("managed_irqs_per_node= cannot parse, ignored\n");
could not be parsed
> + return 0;
> + }
> +
> + if (managed_irqs_per_node * num_possible_nodes() > num_possible_cpus()) {
> + managed_irqs_per_node = num_possible_cpus() / num_possible_nodes();
> + pr_warn("managed_irqs_per_node= cannot be larger than %u\n",
> + managed_irqs_per_node);
> + }
> + return 1;
> +}
> +__setup("managed_irqs_per_node=", irq_managed_setup);
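For reference, the parse-and-clamp logic above can be modelled in userspace
roughly as follows. This is only a sketch under assumptions:
parse_irq_limit() is a hypothetical name, strtoul() stands in for
kstrtouint() (which differs in details, e.g. it tolerates a trailing
newline), and the CPU and node counts are passed in as plain arguments.
The clamp compares against nr_cpus / nr_nodes directly, which gives the
same result as the multiplication in the patch for ordinary values but
cannot wrap around for very large ones:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a decimal limit and clamp it to nr_cpus / nr_nodes.
 * Returns 0 and stores the (possibly clamped) value in *out on success,
 * -1 if the string could not be parsed (*out is left untouched).
 */
static int parse_irq_limit(const char *str, unsigned int nr_cpus,
			   unsigned int nr_nodes, unsigned int *out)
{
	char *end;
	unsigned long val;
	unsigned int max;

	/* Accept plain decimal digits only; reject empty or signed input. */
	if (!str || *str < '0' || *str > '9' || nr_nodes == 0)
		return -1;

	errno = 0;
	val = strtoul(str, &end, 10);
	if (errno || *end != '\0' || val > UINT_MAX)
		return -1;

	/* limit * nr_nodes > nr_cpus  <=>  limit > nr_cpus / nr_nodes */
	max = nr_cpus / nr_nodes;
	*out = val > max ? max : (unsigned int)val;
	return 0;
}
```

On a hypothetical 64-CPU, 2-node machine, "4" is stored as 4, "100" is
clamped to 32, and "abc" is rejected.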
> +
> static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
> unsigned int cpus_per_grp)
> {
...
> @@ -332,6 +380,7 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps,
> /**
> * group_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality
> * @numgrps: number of groups
> + * @is_managed: if these groups managed by kernel
are managed by the kernel
> *
> * Return: cpumask array if successful, NULL otherwise. And each element
> * includes CPUs assigned to this group
thanks,
--
js
suse labs