Date: Thu, 09 Oct 2025 14:48:40 +0100
Message-ID: <86v7koxk1z.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: salil.mehta@opnsrc.net
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	salil.mehta@huawei.com, jonathan.cameron@huawei.com, will@kernel.org,
	catalin.marinas@arm.com, mark.rutland@arm.com, james.morse@arm.com,
	sudeep.holla@arm.com, lpieralisi@kernel.org, jean-philippe@linaro.org,
	tglx@linutronix.de, oliver.upton@linux.dev, peter.maydell@linaro.org,
	richard.henderson@linaro.org, andrew.jones@linux.dev, mst@redhat.com,
	david@redhat.com, philmd@linaro.org, ardb@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	gustavo.romero@linaro.org, npiggin@gmail.com, linux@armlinux.org.uk,
	karl.heubaum@oracle.com, miguel.luis@oracle.com,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com,
	wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com
Subject: Re: [RFC PATCH] KVM: arm64: vgic-v3: Cache ICC_CTLR_EL1 and allow lockless read when ready
In-Reply-To: <20251008201955.3919537-1-salil.mehta@opnsrc.net>
References: <20251008201955.3919537-1-salil.mehta@opnsrc.net>

On Wed, 08 Oct 2025 21:19:55 +0100,
salil.mehta@opnsrc.net wrote:
> 
> From: Salil Mehta
> 
> [A rough illustration of the problem and the probable solution]
> 
> Userspace reads of ICC_CTLR_EL1 via KVM device attributes currently takes a slow
> path that may acquire all vCPU locks. Under workloads that exercise userspace
> PSCI CPU_ON flows or frequent vCPU resets, this can cause vCPU lock contention
> in KVM and, in the worst cases, -EBUSY returns to userspace.
> 
> When PSCI CPU_ON and CPU_OFF calls are handled entirely in KVM, these operations
> are executed under KVM vCPU locks in the host kernel (EL1) and appear atomic to
> other vCPU threads. In this context, system register accesses are serialized
> under KVM vCPU locks, ensuring atomicity with respect to other vCPUs. After
> SMCCC filtering was introduced, PSCI CPU_ON and CPU_OFF calls can now exit to
> userspace (QEMU). During the handling of PSCI CPU_ON call in userspace, a
> cpu_reset() is exerted which reads ICC_CTLR_EL1 through KVM device attribute
> IOCTLs. To avoid transient inconsistency and -EBUSY errors, QEMU is forced to
> pause all vCPUs before issuing these IOCTLs.

I'm going to repeat in public what I already said in private.

Why does QEMU need to know this? I don't see how this is related to
PSCI, and outside of save/restore, there is no reason why QEMU should
poke at this. If QEMU needs fixing, please fix QEMU.

Honestly, I don't see why the kernel should even care about this, and
I have no intention of adopting anything of the sort for something
that has all the hallmarks of a userspace bug.

	M.

-- 
Without deviation from the norm, progress is not possible.