From: Marc Zyngier <maz@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>,
	Kunkun Jiang <jiangkunkun@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>,
	James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	"open list:IRQ\ SUBSYSTEM" <linux-kernel@vger.kernel.org>,
	"moderated list:ARM SMMU\ DRIVERS"
	<linux-arm-kernel@lists.infradead.org>,
	kvmarm@lists.linux.dev,
	"wanghaibin.wang@huawei.com" <wanghaibin.wang@huawei.com>,
	nizhiqiang1@huawei.com,
	"tangnianyao@huawei.com" <tangnianyao@huawei.com>,
	wangzhou1@hisilicon.com
Subject: Re: [bug report] GICv4.1: multiple vpus execute vgic_v4_load at the same time will greatly increase the time consumption
Date: Fri, 23 Aug 2024 09:49:25 +0100
Message-ID: <86zfp3wrmy.wl-maz@kernel.org>
In-Reply-To: <87o75kgspg.ffs@tglx>

On Thu, 22 Aug 2024 22:20:43 +0100,
Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Thu, Aug 22 2024 at 13:47, Marc Zyngier wrote:
> > On Thu, 22 Aug 2024 11:59:50 +0100,
> > Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> >> > but that will eat a significant portion of your stack if your kernel is
> >> > configured for a large number of CPUs.
> >> >
> >>
> >> Currently CONFIG_NR_CPUS=4096, each `struct cpumask` occupies 512 bytes.
> >
> > This seems crazy. Why would you build a kernel with something *that*
> > big, especially considering that you have a lot less than 1k CPUs?
>
> That's why CONFIG_CPUMASK_OFFSTACK exists, but that does not help in
> that context. :)
>
> >> > The removal of this global lock is the only option in my opinion.
> >> > Either the cpumask becomes a stack variable, or it becomes a static
> >> > per-CPU variable. Both have drawbacks, but they are not a bottleneck
> >> > anymore.
> >>
> >> I also prefer to remove the global lock. Which variable do you think is
> >> better?
> >
> > Given the number of CPUs your system is configured for, there is no
> > good answer. An on-stack variable is dangerously large, and a per-CPU
> > cpumask results in 2MB being allocated, which I find insane.
>
> Only if there are actually 4096 CPUs enumerated. The per CPU magic is
> smart enough to limit the damage to the actual number of possible CPUs
> which are enumerated at boot time. It still will over-allocate due to
> NR_CPUS being insanely large but on a 4 CPU machine this boils down to
> 2k of memory waste unless Aaarg64 is stupid enough to allocate for
> NR_CPUS instead of num_possible_cpus()...

No difference between arm64 and xyz85.999 here.

>
> That said, on a real 4k CPU system 2M of memory should be the least of
> your worries.

Don't underestimate the general level of insanity!
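
For reference, the sizes quoted in this thread follow from simple
arithmetic. A minimal user-space sketch, assuming a cpumask is one bit
per CPU rounded up to whole unsigned longs (the helper names are
hypothetical and are not kernel APIs):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: bytes needed for a bitmap holding one bit per
 * CPU, rounded up to a whole number of unsigned longs.
 */
static size_t cpumask_bytes(size_t nr_cpus)
{
	const size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;

	assert(nr_cpus > 0);
	return ((nr_cpus + bits_per_long - 1) / bits_per_long) *
	       sizeof(unsigned long);
}

/*
 * Hypothetical helper: total memory used when every possible CPU gets
 * its own cpumask sized for nr_cpus bits.
 */
static size_t percpu_cpumask_total(size_t nr_cpus, size_t possible_cpus)
{
	return cpumask_bytes(nr_cpus) * possible_cpus;
}
```

With nr_cpus = 4096 this gives 512 bytes per mask, 2 MiB in total for
4096 possible CPUs, and 2 KiB in total for 4 possible CPUs.
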
>
> > You'll have to pick your own poison and convince Thomas of the
> > validity of your approach.
>
> As this is an operation which is really not suitable for on demand
> or large stack allocations the per CPU approach makes sense.

Right, so let's shoot for that. Kunkun, can you please give the
following hack a go with your workload?

Thanks,

	M.
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index dd53298ef1a5..b6aa259ac749 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -224,15 +224,16 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
 	struct irq_desc *desc = irq_data_to_desc(data);
 	struct irq_chip *chip = irq_data_get_irq_chip(data);
 	const struct cpumask *prog_mask;
+	struct cpumask *tmp_mask;
 	int ret;
 
-	static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
-	static struct cpumask tmp_mask;
+	static DEFINE_PER_CPU(struct cpumask, __tmp_mask);
 
 	if (!chip || !chip->irq_set_affinity)
 		return -EINVAL;
 
-	raw_spin_lock(&tmp_mask_lock);
+	tmp_mask = this_cpu_ptr(&__tmp_mask);
+
 	/*
 	 * If this is a managed interrupt and housekeeping is enabled on
 	 * it check whether the requested affinity mask intersects with
@@ -258,11 +259,11 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
 
 		hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
 
-		cpumask_and(&tmp_mask, mask, hk_mask);
-		if (!cpumask_intersects(&tmp_mask, cpu_online_mask))
+		cpumask_and(tmp_mask, mask, hk_mask);
+		if (!cpumask_intersects(tmp_mask, cpu_online_mask))
 			prog_mask = mask;
 		else
-			prog_mask = &tmp_mask;
+			prog_mask = tmp_mask;
 	} else {
 		prog_mask = mask;
 	}
@@ -272,16 +273,14 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
 	 * unless we are being asked to force the affinity (in which
 	 * case we do as we are told).
 	 */
-	cpumask_and(&tmp_mask, prog_mask, cpu_online_mask);
-	if (!force && !cpumask_empty(&tmp_mask))
-		ret = chip->irq_set_affinity(data, &tmp_mask, force);
+	cpumask_and(tmp_mask, prog_mask, cpu_online_mask);
+	if (!force && !cpumask_empty(tmp_mask))
+		ret = chip->irq_set_affinity(data, tmp_mask, force);
 	else if (force)
 		ret = chip->irq_set_affinity(data, mask, force);
 	else
 		ret = -EINVAL;
 
-	raw_spin_unlock(&tmp_mask_lock);
-
 	switch (ret) {
 	case IRQ_SET_MASK_OK:
 	case IRQ_SET_MASK_OK_DONE:
--
Without deviation from the norm, progress is not possible.
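
The idea behind the hack above can be illustrated outside the kernel.
What follows is only a rough user-space analogue, assuming thread-local
storage stands in for the per-CPU variable: each caller gets a private
scratch mask, so no global lock is needed to protect it. The names
struct mask, MASK_WORDS and pick_online() are hypothetical and exist
only for this sketch. In the kernel, this_cpu_ptr() additionally relies
on the caller not migrating to another CPU while the scratch mask is in
use, which the sketch does not model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MASK_WORDS 4	/* hypothetical: 256 CPUs at 64 bits per word */

struct mask {
	uint64_t bits[MASK_WORDS];
};

/* One scratch mask per thread: a stand-in for DEFINE_PER_CPU(). */
static _Thread_local struct mask scratch_mask;

/* dst = a & b, word by word. */
static void mask_and(struct mask *dst, const struct mask *a,
		     const struct mask *b)
{
	for (size_t i = 0; i < MASK_WORDS; i++)
		dst->bits[i] = a->bits[i] & b->bits[i];
}

/* Return 1 if no bit is set in m, 0 otherwise. */
static int mask_empty(const struct mask *m)
{
	for (size_t i = 0; i < MASK_WORDS; i++)
		if (m->bits[i])
			return 0;
	return 1;
}

/*
 * Compute requested & online in the calling thread's private scratch
 * mask. Returns a pointer to the scratch mask, or NULL if the
 * intersection is empty. The result is only valid until the same
 * thread calls pick_online() again.
 */
static const struct mask *pick_online(const struct mask *requested,
				      const struct mask *online)
{
	struct mask *tmp = &scratch_mask;

	assert(requested && online);
	mask_and(tmp, requested, online);
	return mask_empty(tmp) ? NULL : tmp;
}
```

Two threads calling pick_online() at the same time each write to their
own scratch_mask, which is the property the per-CPU variable is meant
to provide in place of the global tmp_mask_lock.
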