Date: Wed, 21 Aug 2024 11:59:01 +0100
Message-ID: <86msl6xhu2.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: Kunkun Jiang <jiangkunkun@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Oliver Upton <oliver.upton@linux.dev>,
	James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	"open list:IRQ SUBSYSTEM" <linux-kernel@vger.kernel.org>,
	"moderated list:ARM SMMU DRIVERS" <linux-arm-kernel@lists.infradead.org>,
	kvmarm@lists.linux.dev,
	wanghaibin.wang@huawei.com,
	nizhiqiang1@huawei.com,
	tangnianyao@huawei.com,
	wangzhou1@hisilicon.com
Subject: Re: [bug report] GICv4.1: multiple vcpus execute vgic_v4_load at the same time will greatly increase the time consumption

On Wed, 21 Aug 2024 10:51:27 +0100,
Kunkun Jiang wrote:
> 
> Hi all,
> 
> Recently I discovered a problem about GICv4.1, the scenario is as follows:
> 1. Enable GICv4.1
> 2. Create multiple VMs. For example, 50 VMs (4U8G)

I don't know what 4U8G means. On how many physical CPUs are you
running 50 VMs? Direct injection of interrupts and over-subscription
are fundamentally incompatible.

> 3. The business running in the VMs does frequent mmio accesses that need
>    to exit to qemu for processing.
> 4. Or modify the kvm code so that wfi must trap to kvm
> 5. Then the utilization of the pcpu where the vcpu is located will be
>    100%, and basically all in sys.

What did you expect? If you trap all the time, your performance will
suck. Don't do that.

> 6. This problem does not exist in GICv3.

Because GICv3 doesn't have the same constraints.

> According to analysis, this problem is due to the execution of vgic_v4_load.
>
> vcpu_load or kvm_sched_in
>     kvm_arch_vcpu_load
>     ...
>         vgic_v4_load
>             irq_set_affinity
>             ...
>                 irq_do_set_affinity
>                     raw_spin_lock(&tmp_mask_lock)
>                     chip->irq_set_affinity
>                     ...
>                         its_vpe_set_affinity
>
> The tmp_mask_lock is the key. This is a global lock. I don't quite
> understand why tmp_mask_lock is needed here. I think there are two
> possible solutions here:
> 1. Remove this tmp_mask_lock

Maybe you could have a look at 33de0aa4bae98 (and 11ea68f553e24)? It
would allow you to understand the nature of the problem.
This can probably be replaced with a per-CPU cpumask, which would
avoid the locking, but potentially result in larger memory usage.

> 2. Modify the gicv4 driver, do not perform VMOVP via
>    irq_set_affinity.

Sure. You could also not use KVM at all if you don't care about
interrupts being delivered to your VM.

We do not send a VMOVP just for fun. We send it because your vcpu has
moved to a different CPU, and the ITS needs to know about that.

You seem to be misunderstanding the use case for GICv4: a partitioned
system, without any over-subscription, no vcpu migration between CPUs.
If that's not your setup, then GICv4 will always be a net loss
compared to SW injection with GICv3 (additional HW interaction,
doorbell interrupts).

	M.

-- 
Without deviation from the norm, progress is not possible.