Date: Wed, 21 Aug 2024 11:59:01 +0100
Message-ID: <86msl6xhu2.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: Kunkun Jiang <jiangkunkun@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Oliver Upton <oliver.upton@linux.dev>,
	James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	"open list:IRQ SUBSYSTEM" <linux-kernel@vger.kernel.org>,
	"moderated list:ARM SMMU DRIVERS" <linux-arm-kernel@lists.infradead.org>,
	kvmarm@lists.linux.dev,
	wanghaibin.wang@huawei.com,
	nizhiqiang1@huawei.com,
	tangnianyao@huawei.com,
	wangzhou1@hisilicon.com
Subject: Re: [bug report] GICv4.1: multiple vcpus execute vgic_v4_load at the same time will greatly increase the time consumption

On Wed, 21 Aug 2024 10:51:27 +0100,
Kunkun Jiang wrote:
> 
> Hi all,
> 
> Recently I discovered a problem about GICv4.1, the scenario is as follows:
> 1. Enable GICv4.1
> 2. Create multiple VMs. For example, 50 VMs (4U8G)

I don't know what 4U8G means. On how many physical CPUs are you
running 50 VMs? Direct injection of interrupts and over-subscription
are fundamentally incompatible.

> 3. The business running in the VMs does frequent mmio accesses that need
>    to exit to qemu for processing.
> 4. Or modify the kvm code so that wfi must trap to kvm
> 5. Then the utilization of the pcpu where the vcpu is located will be
>    100%, and basically all in sys.

What did you expect? If you trap all the time, your performance will
suck. Don't do that.

> 6. This problem does not exist in GICv3.

Because GICv3 doesn't have the same constraints.

> According to analysis, this problem is due to the execution of vgic_v4_load.
>
> vcpu_load or kvm_sched_in
>     kvm_arch_vcpu_load
>     ...
>         vgic_v4_load
>             irq_set_affinity
>             ...
>                 irq_do_set_affinity
>                     raw_spin_lock(&tmp_mask_lock)
>                     chip->irq_set_affinity
>                     ...
>                         its_vpe_set_affinity
>
> The tmp_mask_lock is the key. This is a global lock. I don't quite
> understand why tmp_mask_lock is needed here. I think there are two
> possible solutions here:
> 1. Remove this tmp_mask_lock

Maybe you could have a look at 33de0aa4bae98 (and 11ea68f553e24)? It
would allow you to understand the nature of the problem.
This can probably be replaced with a per-CPU cpumask, which would
avoid the locking, but potentially result in larger memory usage.

> 2. Modify the gicv4 driver, do not perform VMOVP via
>    irq_set_affinity.

Sure. You could also not use KVM at all if you don't care about
interrupts being delivered to your VM.

We do not send a VMOVP just for fun. We send it because your vcpu has
moved to a different CPU, and the ITS needs to know about that.

You seem to be misunderstanding the use case for GICv4: a partitioned
system, without any over-subscription, no vcpu migration between CPUs.
If that's not your setup, then GICv4 will always be a net loss
compared to SW injection with GICv3 (additional HW interaction,
doorbell interrupts).

	M.

-- 
Without deviation from the norm, progress is not possible.