All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Kunkun Jiang <jiangkunkun@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Oliver Upton <oliver.upton@linux.dev>,
	James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	"open list:IRQ\ SUBSYSTEM" <linux-kernel@vger.kernel.org>,
	"moderated list:ARM SMMU DRIVERS"
	<linux-arm-kernel@lists.infradead.org>, <kvmarm@lists.linux.dev>,
	"wanghaibin.wang@huawei.com" <wanghaibin.wang@huawei.com>,
	<nizhiqiang1@huawei.com>,
	"tangnianyao@huawei.com" <tangnianyao@huawei.com>,
	<wangzhou1@hisilicon.com>
Subject: Re: [bug report] GICv4.1: multiple vpus execute vgic_v4_load at the same time will greatly increase the time consumption
Date: Thu, 22 Aug 2024 13:47:01 +0100	[thread overview]
Message-ID: <864j7cybay.wl-maz@kernel.org> (raw)
In-Reply-To: <bd3c3103-a6d7-a91b-911d-5bc5f2382dae@huawei.com>

On Thu, 22 Aug 2024 11:59:50 +0100,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> 
> Hi Marc,
> 
> On 2024/8/22 16:26, Marc Zyngier wrote:
> >>>> According to analysis, this problem is due to the execution of vgic_v4_load.
> >>>> vcpu_load or kvm_sched_in
> >>>>       kvm_arch_vcpu_load
> >>>>       ...
> >>>>           vgic_v4_load
> >>>>               irq_set_affinity
> >>>>               ...
> >>>>                   irq_do_set_affinity
> >>>>                       raw_spin_lock(&tmp_mask_lock)
> >>>>                       chip->irq_set_affinity
> >>>>                       ...
> >>>>                         its_vpe_set_affinity
> >>>> 
> >>>> The tmp_mask_lock is the key. This is a global lock. I don't quite
> >>>> understand
> >>>> why tmp_mask_lock is needed here. I think there are two possible
> >>>> solutions here:
> >>>> 1. Remove this tmp_mask_lock
> >>> 
> >>> Maybe you could have a look at 33de0aa4bae98 (and 11ea68f553e24)? It
> >>> would allow you to understand the nature of the problem.
> >>> 
> >>> This can probably be replaced with a per-CPU cpumask, which would
> >>> avoid the locking, but potentially result in a larger memory usage.
> >> 
> >> Thanks, I will try it.
> > 
> > A simple alternative would be this:
> > 
> > diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> > index dd53298ef1a5..0d11b74af38c 100644
> > --- a/kernel/irq/manage.c
> > +++ b/kernel/irq/manage.c
> > @@ -224,15 +224,12 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
> >   	struct irq_desc *desc = irq_data_to_desc(data);
> >   	struct irq_chip *chip = irq_data_get_irq_chip(data);
> >   	const struct cpumask  *prog_mask;
> > +	struct cpumask tmp_mask = {};
> >   	int ret;
> >   -	static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
> > -	static struct cpumask tmp_mask;
> > -
> >   	if (!chip || !chip->irq_set_affinity)
> >   		return -EINVAL;
> >   -	raw_spin_lock(&tmp_mask_lock);
> >   	/*
> >   	 * If this is a managed interrupt and housekeeping is enabled on
> >   	 * it check whether the requested affinity mask intersects with
> > @@ -280,8 +277,6 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
> >   	else
> >   		ret = -EINVAL;
> >   -	raw_spin_unlock(&tmp_mask_lock);
> > -
> >   	switch (ret) {
> >   	case IRQ_SET_MASK_OK:
> >   	case IRQ_SET_MASK_OK_DONE:
> > 
> > but that will eat a significant portion of your stack if your kernel is
> > configured for a large number of CPUs.
> > 
> 
> Currently CONFIG_NR_CPUS=4096,each `struct cpumask` occupies 512 bytes.

This seems crazy. Why would you build a kernel with something *that*
big, specially considering that you have a lot less than 1k CPUs?

[...]

> > The removal of this global lock is the only option in my opinion.
> > Either the cpumask becomes a stack variable, or it becomes a static
> > per-CPU variable. Both have drawbacks, but they are not a bottleneck
> > anymore.
> 
> I also prefer to remove the global lock. Which variable do you think is
> better?

Given the number of CPUs your system is configured for, there is no
good answer. An on-stack variable is dangerously large, and a per-CPU
cpumask results in 2MB being allocated, which I find insane.

You'll have to pick your own poison and convince Thomas of the
validity of your approach.

	M.

-- 
Without deviation from the norm, progress is not possible.

  reply	other threads:[~2024-08-22 12:47 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-21  9:51 [bug report] GICv4.1: multiple vpus execute vgic_v4_load at the same time will greatly increase the time consumption Kunkun Jiang
2024-08-21 10:59 ` Marc Zyngier
2024-08-21 18:23   ` Kunkun Jiang
2024-08-22  8:26     ` Marc Zyngier
2024-08-22 10:59       ` Kunkun Jiang
2024-08-22 12:47         ` Marc Zyngier [this message]
2024-08-22 21:20           ` Thomas Gleixner
2024-08-23  8:49             ` Marc Zyngier
2024-08-26  3:10               ` Kunkun Jiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=864j7cybay.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=james.morse@arm.com \
    --cc=jiangkunkun@huawei.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nizhiqiang1@huawei.com \
    --cc=oliver.upton@linux.dev \
    --cc=suzuki.poulose@arm.com \
    --cc=tangnianyao@huawei.com \
    --cc=tglx@linutronix.de \
    --cc=wanghaibin.wang@huawei.com \
    --cc=wangzhou1@hisilicon.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.