Date: Wed, 31 Jul 2024 11:28:44 +0100
From: Alexandru Elisei
To: Marc Zyngier
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
	kvm@vger.kernel.org, James Morse, Suzuki K Poulose, Oliver Upton,
	Zenghui Yu, Joey Gouly
Subject: Re: [PATCH 10/12] KVM: arm64: nv: Add SW walker for AT S1 emulation
References: <20240625133508.259829-1-maz@kernel.org>
	<20240708165800.1220065-1-maz@kernel.org>
	<86v80m0wlb.wl-maz@kernel.org>
	<86ttg527c1.wl-maz@kernel.org>
In-Reply-To: <86ttg527c1.wl-maz@kernel.org>
X-Mailing-List: kvm@vger.kernel.org

Hi,

On Wed, Jul 31, 2024 at 11:18:06AM +0100, Marc Zyngier wrote:
> On Wed, 31 Jul 2024 10:53:14 +0100,
> Alexandru Elisei wrote:
> >
> > Hi,
> >
> > On Wed, Jul 31, 2024 at 09:55:28AM +0100, Marc Zyngier wrote:
> > > On Mon, 29 Jul 2024 16:26:00 +0100,
> > > Alexandru Elisei wrote:
> > > >
> > > > Hi Marc,
> > > >
> > > > On Mon, Jul 08, 2024 at 05:57:58PM +0100, Marc Zyngier wrote:
> > > > > In order to plug the brokenness of our current AT implementation,
> > > > > we need a SW walker that is going to... err.. walk the S1 tables
> > > > > and tell us what it finds.
> > > > >
> > > > > Of course, it builds on top of our S2 walker, and share similar
> > > > > concepts. The beauty of it is that since it uses kvm_read_guest(),
> > > > > it is able to bring back pages that have been otherwise evicted.
> > > > >
> > > > > This is then plugged in the two AT S1 emulation functions as
> > > > > a "slow path" fallback. I'm not sure it is that slow, but hey.
> > > > >
> > > > > Signed-off-by: Marc Zyngier
> > > > > ---
> > > > >  arch/arm64/kvm/at.c | 538 ++++++++++++++++++++++++++++++++++++++++++--
> > > > >  1 file changed, 520 insertions(+), 18 deletions(-)
> > > > >
> > > > > diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c
> > > > > index 71e3390b43b4c..8452273cbff6d 100644
> > > > > --- a/arch/arm64/kvm/at.c
> > > > > +++ b/arch/arm64/kvm/at.c
> > > > > @@ -4,9 +4,305 @@
> > > > >   * Author: Jintack Lim
> > > > >   */
> > > > >
> > > > > +#include
> > > > > +
> > > > > +#include
> > > > >  #include
> > > > >  #include
> > > > >
> > > > > +struct s1_walk_info {
> > > > > +	u64 baddr;
> > > > > +	unsigned int max_oa_bits;
> > > > > +	unsigned int pgshift;
> > > > > +	unsigned int txsz;
> > > > > +	int sl;
> > > > > +	bool hpd;
> > > > > +	bool be;
> > > > > +	bool nvhe;
> > > > > +	bool s2;
> > > > > +};
> > > > > +
> > > > > +struct s1_walk_result {
> > > > > +	union {
> > > > > +		struct {
> > > > > +			u64 desc;
> > > > > +			u64 pa;
> > > > > +			s8 level;
> > > > > +			u8 APTable;
> > > > > +			bool UXNTable;
> > > > > +			bool PXNTable;
> > > > > +		};
> > > > > +		struct {
> > > > > +			u8 fst;
> > > > > +			bool ptw;
> > > > > +			bool s2;
> > > > > +		};
> > > > > +	};
> > > > > +	bool failed;
> > > > > +};
> > > > > +
> > > > > +static void fail_s1_walk(struct s1_walk_result *wr, u8 fst, bool ptw, bool s2)
> > > > > +{
> > > > > +	wr->fst = fst;
> > > > > +	wr->ptw = ptw;
> > > > > +	wr->s2 = s2;
> > > > > +	wr->failed = true;
> > > > > +}
> > > > > +
> > > > > +#define S1_MMU_DISABLED (-127)
> > > > > +
> > > > > +static int setup_s1_walk(struct kvm_vcpu *vcpu, struct s1_walk_info *wi,
> > > > > +			 struct s1_walk_result *wr, const u64 va, const int el)
> > > > > +{
> > > > > +	u64 sctlr, tcr, tg, ps, ia_bits, ttbr;
> > > > > +	unsigned int stride, x;
> > > > > +	bool va55, tbi;
> > > > > +
> > > > > +	wi->nvhe = el == 2 && !vcpu_el2_e2h_is_set(vcpu);
> > > >
> > > > Where 'el' is computed in handle_at_slow() as:
> > > >
> > > > 	/*
> > > > 	 * We only get here from guest EL2, so the translation regime
> > > > 	 * AT applies to is solely defined by {E2H,TGE}.
> > > > 	 */
> > > > 	el = (vcpu_el2_e2h_is_set(vcpu) &&
> > > > 	      vcpu_el2_tge_is_set(vcpu)) ? 2 : 1;
> > > >
> > > > I think 'nvhe' will always be false ('el' is 2 only when E2H is
> > > > set).
> > >
> > > Yeah, there is a number of problems here. el should depend on both the
> > > instruction (some are EL2-specific) and the HCR control bits. I'll
> > > tackle that now.
> >
> > Yeah, also noticed that how sctlr, tcr and ttbr are chosen in setup_s1_walk()
> > doesn't look quite right for the nvhe case.
>
> Are you sure? Assuming the 'el' value is correct (and I think I fixed
> that on my local branch), they seem correct to me (we check for va55
> early in the function to avoid an later issue).
>
> Can you point out what exactly fails in that logic?

I was trying to say that another consequence of el being 1 in the nvhe case was
that sctlr, tcr and ttbr were read from the EL1 variants of the registers,
instead of EL2. Sorry if that wasn't clear.

Thanks,
Alex

>
> >
> > >
> > > >
> > > > I'm curious about what 'el' represents. The translation regime for the AT
> > > > instruction?
> > >
> > > Exactly that.
> >
> > Might I make a suggestion here? I was thinking about dropping the (el, wi-nvhe*)
> > tuple to represent the translation regime and have a wi->regime (or similar) to
> > unambiguously encode the regime. The value can be an enum with three values to
> > represent the three possible regimes (REGIME_EL10, REGIME_EL2, REGIME_EL20).
>
> I've been thinking of that, but I'm wondering whether that just
> results in pretty awful code in the end, because we go from 2 cases
> (el==1 or el==2) to 3. But most of the time, we don't care about the
> E2H=0 case, because we can handle it just like E2H=1.
>
> I'll give it a go and see what it looks like.
>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
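
To make the three-regime idea from the thread concrete, here is a minimal
sketch of how the regime could be derived from both the instruction and the
HCR_EL2 control bits. It is only an illustration, not code from the series:
enum s1_regime, pick_regime() and regime_uses_el2_regs() are hypothetical
names, the at_is_el2 flag stands in for decoding whether the AT instruction is
one of the EL2-specific ones, and the whole thing assumes the instruction is
executed from guest EL2, as in handle_at_slow().

```c
#include <stdbool.h>

/* Hypothetical names, following the REGIME_* values suggested in the thread. */
enum s1_regime {
	REGIME_EL10,	/* EL1&0 translation regime */
	REGIME_EL2,	/* EL2 translation regime (E2H=0) */
	REGIME_EL20,	/* EL2&0 translation regime (E2H=1) */
};

/*
 * at_is_el2: assumption, true when the AT instruction itself targets EL2.
 * e2h, tge:  the guest hypervisor's view of HCR_EL2.{E2H,TGE}.
 */
static enum s1_regime pick_regime(bool at_is_el2, bool e2h, bool tge)
{
	/* EL2-specific AT instructions: E2H alone selects EL2 vs EL2&0. */
	if (at_is_el2)
		return e2h ? REGIME_EL20 : REGIME_EL2;

	/* Other stage 1 AT instructions: EL2&0 only when {E2H,TGE} == {1,1}. */
	return (e2h && tge) ? REGIME_EL20 : REGIME_EL10;
}

/* Both EL2 regimes would take sctlr, tcr and ttbr from the EL2 registers. */
static bool regime_uses_el2_regs(enum s1_regime regime)
{
	return regime != REGIME_EL10;
}
```

In this sketch the E2H=0 case is the only path that yields REGIME_EL2, so
callers that only care about "EL1&0 or not" can test against REGIME_EL10 and
treat both EL2 regimes alike, while the few places that differ can still tell
them apart.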