From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2FACA2E633;
	Fri, 24 Nov 2023 14:32:39 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="jkenYbL4"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7B6F3C433C8;
	Fri, 24 Nov 2023 14:32:39 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1700836359;
	bh=gd9+eyKPQRosH6Zy3rvn5fSoMQYqPc0CQ/4F+lvxRpo=;
	h=Date:From:To:Cc:Subject:In-Reply-To:References:From;
	b=jkenYbL4C3P1ur7PB5kdTPPQR04bbX0G9eSIFprhwFfRMBWcAxkf+CQgmG8xq8Wfn
	 xHKIixMBrlqATC9SpOsNVLxaxmyT/80R8N6mNrm4/erfoZUU14ZoTZw1Z1IYPK9uR6
	 IBpxney2Z1GXn6Z7/PIJR1yYIEVDpIBzF4ACcjvW0izlpNmRtLZkwvIelHUheihOWD
	 IqHAKRlGY/eBDcnBrxhY22eXA3gn+D8MsEvC11WvgU2ma2ElC4e8IfrToUsy04B3I9
	 PZd29BXpSrPSD3yNgnqaVrFIk4lDI1brbVt7l94dS7Zg41pjxPuJ2AiuG8N6PUw0dj
	 /JUXUV4d0MVNQ==
Received: from sofa.misterjones.org ([185.219.108.64] helo=goblin-girl.misterjones.org)
	by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
	(Exim 4.95)
	(envelope-from )
	id 1r6XES-00G7hW-QL;
	Fri, 24 Nov 2023 14:32:36 +0000
Date: Fri, 24 Nov 2023 14:32:36 +0000
Message-ID: <86leancjcr.wl-maz@kernel.org>
From: Marc Zyngier
To: Ganapatrao Kulkarni
Cc: Miguel Luis,
	"kvmarm@lists.linux.dev",
	"kvm@vger.kernel.org",
	"linux-arm-kernel@lists.infradead.org",
	Alexandru Elisei,
	Andre Przywara,
	Chase Conklin,
	Christoffer Dall,
	Darren Hart,
	Jintack Lim,
	Russell King,
	James Morse,
	Suzuki K Poulose,
	Oliver Upton,
	Zenghui Yu
Subject: Re: [PATCH v11 00/43] KVM: arm64: Nested Virtualization support (FEAT_NV2 only)
In-Reply-To:
References: <20231120131027.854038-1-maz@kernel.org>
	<86msv7ylnu.wl-maz@kernel.org>
	<05733774-4210-4097-9912-fb3aa8542fdd@oracle.com>
	<86a5r4zafh.wl-maz@kernel.org>
	<134912e4-beed-4ab6-8ce1-33e69ec382b3@os.amperecomputing.com>
	<868r6nzc5y.wl-maz@kernel.org>
	<65dc2a93-0a17-4433-b3a5-430bf516ffe9@os.amperecomputing.com>
	<86o7fjco13.wl-maz@kernel.org>
User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue)
	FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/29.1
	(aarch64-unknown-linux-gnu) MULE/6.0 (HANACHIRUSATO)
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
X-SA-Exim-Connect-IP: 185.219.108.64
X-SA-Exim-Rcpt-To: gankulkarni@os.amperecomputing.com, miguel.luis@oracle.com,
	kvmarm@lists.linux.dev, kvm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com,
	andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com,
	darren@os.amperecomputing.com, jintack@cs.columbia.edu,
	rmk+kernel@armlinux.org.uk, james.morse@arm.com, suzuki.poulose@arm.com,
	oliver.upton@linux.dev, yuzenghui@huawei.com
X-SA-Exim-Mail-From: maz@kernel.org
X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false

On Fri, 24 Nov 2023 13:22:22 +0000,
Ganapatrao Kulkarni wrote:
>
> > How is this value possible if the write to HCR_EL2 has taken place?
> > When do you sample this?
>
> I am not sure how and where it got set. I think, whatever it is set,
> it is due to false return of vcpu_el2_e2h_is_set(). Need to
> understand/debug.
> The vhcr_el2 value I have shared is traced along with hcr in function
> __activate_traps/__compute_hcr.

Here's my hunch:

The guest boots with E2H=0, because we don't advertise anything else
on your HW. So we run with NV1=1 until we try to *upgrade* to VHE. NV2
means that HCR_EL2 is writable (to memory) without a trap. But we're
still running with NV1=1.
Subsequently, we access a sysreg that should never trap for a VHE
guest, but we're running with the wrong config. Bad things happen.

Unfortunately, NV2 is pretty much incompatible with E2H being updated,
because it cannot perform the changes that this would result in at
the point where they should happen. We can try and do best-effort
handling, but you can always trick it.

Anyway, can you see if the hack below helps? I'm not keen on it at
all, but this would be a good data point.

	M.

From c4b856221661393b884cbf673d100faaa8dc018a Mon Sep 17 00:00:00 2001
From: Marc Zyngier
Date: Fri, 26 May 2023 12:16:05 +0100
Subject: [PATCH] KVM: arm64: Opportunistically track HCR_EL2.E2H being flipped

Signed-off-by: Marc Zyngier
---
 arch/arm64/include/asm/kvm_host.h       |  9 +++++++--
 arch/arm64/kvm/hyp/include/hyp/switch.h | 13 +++++++++++++
 arch/arm64/kvm/hyp/vhe/switch.c         | 10 ++++++++--
 3 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index c91f607e989d..d45ef41de5fb 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -655,6 +655,9 @@ struct kvm_vcpu_arch {
 	/* State flags for kernel bookkeeping, unused by the hypervisor code */
 	u8 sflags;
 
+	/* Bookkeeping flags for NV */
+	u8 nvflags;
+
 	/*
 	 * Don't run the guest (internal implementation need).
 	 *
@@ -858,8 +861,6 @@ struct kvm_vcpu_arch {
 #define DEBUG_STATE_SAVE_SPE	__vcpu_single_flag(iflags, BIT(5))
 /* Save TRBE context if active */
 #define DEBUG_STATE_SAVE_TRBE	__vcpu_single_flag(iflags, BIT(6))
-/* vcpu running in HYP context */
-#define VCPU_HYP_CONTEXT	__vcpu_single_flag(iflags, BIT(7))
 
 /* SVE enabled for host EL0 */
 #define HOST_SVE_ENABLED	__vcpu_single_flag(sflags, BIT(0))
@@ -878,6 +879,10 @@ struct kvm_vcpu_arch {
 /* WFI instruction trapped */
 #define IN_WFI			__vcpu_single_flag(sflags, BIT(7))
 
+/* vcpu running in HYP context */
+#define VCPU_HYP_CONTEXT	__vcpu_single_flag(nvflags, BIT(0))
+/* vcpu entered with HCR_EL2.E2H set */
+#define VCPU_HCR_E2H		__vcpu_single_flag(nvflags, BIT(1))
 
 /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
 #define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +	\
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index aed2ea35082c..9c1346116d61 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -669,6 +669,19 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 	 */
 	synchronize_vcpu_pstate(vcpu, exit_code);
 
+	if (vcpu_has_nv(vcpu) &&
+	    (!!vcpu_get_flag(vcpu, VCPU_HCR_E2H) ^ vcpu_el2_e2h_is_set(vcpu))) {
+		if (vcpu_el2_e2h_is_set(vcpu)) {
+			sysreg_clear_set(hcr_el2, HCR_NV1, 0);
+			vcpu_set_flag(vcpu, VCPU_HCR_E2H);
+		} else {
+			sysreg_clear_set(hcr_el2, 0, HCR_NV1);
+			vcpu_clear_flag(vcpu, VCPU_HCR_E2H);
+		}
+
+		return true;
+	}
+
 	/*
 	 * Check whether we want to repaint the state one way or
 	 * another.
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index 8d1e9d1adabe..395aaa06f358 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -447,10 +447,16 @@ static int __kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 	sysreg_restore_guest_state_vhe(guest_ctxt);
 	__debug_switch_to_guest(vcpu);
 
-	if (is_hyp_ctxt(vcpu))
+	if (is_hyp_ctxt(vcpu)) {
+		if (vcpu_el2_e2h_is_set(vcpu))
+			vcpu_set_flag(vcpu, VCPU_HCR_E2H);
+		else
+			vcpu_clear_flag(vcpu, VCPU_HCR_E2H);
+
 		vcpu_set_flag(vcpu, VCPU_HYP_CONTEXT);
-	else
+	} else {
 		vcpu_clear_flag(vcpu, VCPU_HYP_CONTEXT);
+	}
 
 	do {
 		/* Jump in the fire! */
-- 
2.39.2

-- 
Without deviation from the norm, progress is not possible.