linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Kunkun Jiang <jiangkunkun@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Oliver Upton <oliver.upton@linux.dev>,
	James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	"open list:IRQ\ SUBSYSTEM" <linux-kernel@vger.kernel.org>,
	"moderated list:ARM SMMU DRIVERS"
	<linux-arm-kernel@lists.infradead.org>, <kvmarm@lists.linux.dev>,
	"wanghaibin.wang@huawei.com" <wanghaibin.wang@huawei.com>,
	<nizhiqiang1@huawei.com>,
	"tangnianyao@huawei.com" <tangnianyao@huawei.com>,
	<wangzhou1@hisilicon.com>
Subject: Re: [bug report] GICv4.1: multiple vpus execute vgic_v4_load at the same time will greatly increase the time consumption
Date: Thu, 22 Aug 2024 13:47:01 +0100	[thread overview]
Message-ID: <864j7cybay.wl-maz@kernel.org> (raw)
In-Reply-To: <bd3c3103-a6d7-a91b-911d-5bc5f2382dae@huawei.com>

On Thu, 22 Aug 2024 11:59:50 +0100,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> 
> Hi Marc,
> 
> On 2024/8/22 16:26, Marc Zyngier wrote:
> >>>> According to analysis, this problem is due to the execution of vgic_v4_load.
> >>>> vcpu_load or kvm_sched_in
> >>>>       kvm_arch_vcpu_load
> >>>>       ...
> >>>>           vgic_v4_load
> >>>>               irq_set_affinity
> >>>>               ...
> >>>>                   irq_do_set_affinity
> >>>>                       raw_spin_lock(&tmp_mask_lock)
> >>>>                       chip->irq_set_affinity
> >>>>                       ...
> >>>>                         its_vpe_set_affinity
> >>>> 
> >>>> The tmp_mask_lock is the key. This is a global lock. I don't quite
> >>>> understand
> >>>> why tmp_mask_lock is needed here. I think there are two possible
> >>>> solutions here:
> >>>> 1. Remove this tmp_mask_lock
> >>> 
> >>> Maybe you could have a look at 33de0aa4bae98 (and 11ea68f553e24)? It
> >>> would allow you to understand the nature of the problem.
> >>> 
> >>> This can probably be replaced with a per-CPU cpumask, which would
> >>> avoid the locking, but potentially result in a larger memory usage.
> >> 
> >> Thanks, I will try it.
> > 
> > A simple alternative would be this:
> > 
> > diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> > index dd53298ef1a5..0d11b74af38c 100644
> > --- a/kernel/irq/manage.c
> > +++ b/kernel/irq/manage.c
> > @@ -224,15 +224,12 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
> >   	struct irq_desc *desc = irq_data_to_desc(data);
> >   	struct irq_chip *chip = irq_data_get_irq_chip(data);
> >   	const struct cpumask  *prog_mask;
> > +	struct cpumask tmp_mask = {};
> >   	int ret;
> >   -	static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
> > -	static struct cpumask tmp_mask;
> > -
> >   	if (!chip || !chip->irq_set_affinity)
> >   		return -EINVAL;
> >   -	raw_spin_lock(&tmp_mask_lock);
> >   	/*
> >   	 * If this is a managed interrupt and housekeeping is enabled on
> >   	 * it check whether the requested affinity mask intersects with
> > @@ -280,8 +277,6 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
> >   	else
> >   		ret = -EINVAL;
> >   -	raw_spin_unlock(&tmp_mask_lock);
> > -
> >   	switch (ret) {
> >   	case IRQ_SET_MASK_OK:
> >   	case IRQ_SET_MASK_OK_DONE:
> > 
> > but that will eat a significant portion of your stack if your kernel is
> > configured for a large number of CPUs.
> > 
> 
> Currently CONFIG_NR_CPUS=4096,each `struct cpumask` occupies 512 bytes.

This seems crazy. Why would you build a kernel with something *that*
big, specially considering that you have a lot less than 1k CPUs?

[...]

> > The removal of this global lock is the only option in my opinion.
> > Either the cpumask becomes a stack variable, or it becomes a static
> > per-CPU variable. Both have drawbacks, but they are not a bottleneck
> > anymore.
> 
> I also prefer to remove the global lock. Which variable do you think is
> better?

Given the number of CPUs your system is configured for, there is no
good answer. An on-stack variable is dangerously large, and a per-CPU
cpumask results in 2MB being allocated, which I find insane.

You'll have to pick your own poison and convince Thomas of the
validity of your approach.

	M.

-- 
Without deviation from the norm, progress is not possible.


  reply	other threads:[~2024-08-22 12:48 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-21  9:51 [bug report] GICv4.1: multiple vpus execute vgic_v4_load at the same time will greatly increase the time consumption Kunkun Jiang
2024-08-21 10:59 ` Marc Zyngier
2024-08-21 18:23   ` Kunkun Jiang
2024-08-22  8:26     ` Marc Zyngier
2024-08-22 10:59       ` Kunkun Jiang
2024-08-22 12:47         ` Marc Zyngier [this message]
2024-08-22 21:20           ` Thomas Gleixner
2024-08-23  8:49             ` Marc Zyngier
2024-08-26  3:10               ` Kunkun Jiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=864j7cybay.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=james.morse@arm.com \
    --cc=jiangkunkun@huawei.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nizhiqiang1@huawei.com \
    --cc=oliver.upton@linux.dev \
    --cc=suzuki.poulose@arm.com \
    --cc=tangnianyao@huawei.com \
    --cc=tglx@linutronix.de \
    --cc=wanghaibin.wang@huawei.com \
    --cc=wangzhou1@hisilicon.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).