Date: Fri, 14 Jul 2023 11:53:45 +0100
Message-ID: <86h6q6vk5i.wl-maz@kernel.org>
From: Marc Zyngier
To: Eric Auger
Cc: kvmarm@lists.linux.dev, kvm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Alexandru Elisei,
	Andre Przywara, Chase Conklin, Christoffer Dall,
	Ganapatrao Kulkarni, Darren Hart, Jintack Lim, Russell King,
	Miguel Luis, James Morse, Suzuki K Poulose, Oliver Upton,
	Zenghui Yu
Subject: Re: [PATCH v10 09/59] KVM: arm64: nv: Add trap forwarding infrastructure
In-Reply-To: <03f175b2-af7d-ea94-38c5-0f414518dcff@redhat.com>
References: <20230515173103.1017669-1-maz@kernel.org>
	<20230515173103.1017669-10-maz@kernel.org>
	<03f175b2-af7d-ea94-38c5-0f414518dcff@redhat.com>

Hi Eric,

Careful, you are not replying to the trap forwarding series, but to an
older full NV series. The code hasn't majorly changed since, but there
are some differences.

On Thu, 13 Jul 2023 15:29:02 +0100,
Eric Auger wrote:
>
> Hi Marc,
>
> On 5/15/23 19:30, Marc Zyngier wrote:
> > A significant part of what a NV hypervisor needs to do is to decide
> > whether a trap from a L2+ guest has to be forwarded to a L1 guest
> > or handled locally. This is done by checking for the trap bits that
> I am confused by the terminology. The comment below says
> ' When the trapped access matches one of the trap controls, the
> exception is re-injected in the nested hypervisor. '

Can you spell out what confuses you here? I'm happy to rework the
commit log, the comment, or even both of them.

>
> > the guest hypervisor has set and acting accordingly, as described by
> > the architecture.
> >
> > A previous approach was to sprinkle a bunch of checks in all the
> > system register accessors, but this is pretty error prone and doesn't
> > help getting an overview of what is happening.
> >
> > Instead, implement a set of global tables that describe a trap bit,
> > combinations of trap bits, behaviours on trap, and what bits must
> > be evaluated on a system register trap.
> >
> > Although this is painful to describe, this allows to specify each
> > and every control bit in a static manner. To make it efficient,
> > the table is inserted in an xarray that is global to the system,
> > and checked each time we trap a system register.
> >
> > Add the basic infrastructure for now, while additional patches will
> > implement configuration registers.
> >
> > Signed-off-by: Marc Zyngier
> > ---
> >  arch/arm64/include/asm/kvm_host.h   |   1 +
> >  arch/arm64/include/asm/kvm_nested.h |   2 +
> >  arch/arm64/kvm/emulate-nested.c     | 175 ++++++++++++++++++++++++++++
> >  arch/arm64/kvm/sys_regs.c           |   6 +
> >  arch/arm64/kvm/trace_arm.h          |  19 +++
> >  5 files changed, 203 insertions(+)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index f2e3b5889f8b..65810618cb42 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -960,6 +960,7 @@ int kvm_handle_cp10_id(struct kvm_vcpu *vcpu);
> >  void kvm_reset_sys_regs(struct kvm_vcpu *vcpu);
> >
> >  int __init kvm_sys_reg_table_init(void);
> > +void __init populate_nv_trap_config(void);
> >
> >  bool lock_all_vcpus(struct kvm *kvm);
> >  void unlock_all_vcpus(struct kvm *kvm);
> > diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h
> > index 8fb67f032fd1..fa23cc9c2adc 100644
> > --- a/arch/arm64/include/asm/kvm_nested.h
> > +++ b/arch/arm64/include/asm/kvm_nested.h
> > @@ -11,6 +11,8 @@ static inline bool vcpu_has_nv(const struct kvm_vcpu *vcpu)
> >  		test_bit(KVM_ARM_VCPU_HAS_EL2, vcpu->arch.features));
> >  }
> >
> > +extern bool __check_nv_sr_forward(struct kvm_vcpu *vcpu);
> > +
> >  struct sys_reg_params;
> >  struct sys_reg_desc;
> >
> > diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c
> > index b96662029fb1..a923f7f47add 100644
> > --- a/arch/arm64/kvm/emulate-nested.c
> > +++ b/arch/arm64/kvm/emulate-nested.c
> > @@ -14,6 +14,181 @@
> >
> >  #include "trace.h"
> >
> > +enum trap_behaviour {
> > +	BEHAVE_HANDLE_LOCALLY	= 0,
> > +	BEHAVE_FORWARD_READ	= BIT(0),
> > +	BEHAVE_FORWARD_WRITE	= BIT(1),
> > +	BEHAVE_FORWARD_ANY	= BEHAVE_FORWARD_READ | BEHAVE_FORWARD_WRITE,
> > +};
> > +
> > +struct trap_bits {
> > +	const enum vcpu_sysreg		index;
> > +	const enum trap_behaviour	behaviour;
> > +	const u64			value;
> > +	const u64			mask;
> > +};
> > +
> > +enum coarse_grain_trap_id {
> drop coarse in the above name? It seems to feature both coarse, combos
> and complex conditions ids?

I used 'coarse' in opposition to 'fine', but I agree this is
confusing. How about 'trap_group' instead, in an effort to preserve
the idea that it has a wider impact than the fine-grained traps?

> > +	/* Indicates no coarse trap control */
> > +	__RESERVED__,
> > +
> > +	/*
> > +	 * The first batch of IDs denote coarse trapping that are used
> > +	 * on their own instead of being part of a combination of
> > +	 * trap controls.
> > +	 */
> > +
> > +	/*
> > +	 * Anything after this point is a combination of trap controls,
> > +	 * which all must be evaluated to decide what to do.
> > +	 */
> > +	__MULTIPLE_CONTROL_BITS__,
> > +
> > +	/*
> > +	 * Anything after this point requires a callback evaluating a
> > +	 * complex trap condition. Hopefully we'll never need this...
> > +	 */
> > +	__COMPLEX_CONDITIONS__,
> > +};
> > +
> > +static const struct trap_bits coarse_trap_bits[] = {
> > +};
> > +
> > +#define MCB(id, ...)					\
> > +	[id - __MULTIPLE_CONTROL_BITS__]	=		\
> > +		(const enum coarse_grain_trap_id []){	\
> > +			__VA_ARGS__ , __RESERVED__	\
> > +		}
> nit there are few check patch errors

checkpatch? is this still a thing? ;-)

I'll have a look at what it's angry about...
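
For readers less familiar with the idiom behind MCB() quoted above, here is a minimal user-space sketch of it: each combination ID indexes, via a designated initializer, a compound-literal array of member IDs terminated by a sentinel. All names below (trap_id, COMBO, combos, combo_len and the TRAP_* values) are hypothetical stand-ins, and the choice to alias the first combination ID to the marker value is an assumption of this sketch, not something the quoted patch shows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical IDs, laid out like the quoted enum: singles, then combos. */
enum trap_id {
	TRAP_NONE,			/* stands in for __RESERVED__ */
	TRAP_A,				/* made-up single trap controls */
	TRAP_B,
	TRAP_C,
	TRAP_MULTI_START,		/* stands in for __MULTIPLE_CONTROL_BITS__ */
	TRAP_A_B = TRAP_MULTI_START,	/* assumption: first combo aliases the marker */
	TRAP_A_B_C,
	TRAP_MULTI_END,
};

/* Same shape as MCB(): slot (id - base) points at a sentinel-ended list. */
#define COMBO(id, ...)						\
	[id - TRAP_MULTI_START] =				\
		(const enum trap_id []){ __VA_ARGS__, TRAP_NONE }

static const enum trap_id *const combos[] = {
	COMBO(TRAP_A_B, TRAP_A, TRAP_B),
	COMBO(TRAP_A_B_C, TRAP_A, TRAP_B, TRAP_C),
};

/* Count the members of a combination by walking up to the sentinel. */
static size_t combo_len(enum trap_id id)
{
	const enum trap_id *members;
	size_t n = 0;

	assert(id >= TRAP_MULTI_START && id < TRAP_MULTI_END);
	members = combos[id - TRAP_MULTI_START];
	while (members[n] != TRAP_NONE)
		n++;

	return n;
}
```

The sentinel is what lets the consumer iterate without storing a length next to each list, at the cost of never being able to list the sentinel ID itself as a member.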

> > +
> > +static const enum coarse_grain_trap_id *coarse_control_combo[] = {
> > +};
> > +
> > +typedef enum trap_behaviour (*complex_condition_check)(struct kvm_vcpu *);
> > +
> > +#define CCC(id, fn)	[id - __COMPLEX_CONDITIONS__] = fn
> > +
> > +static const complex_condition_check ccc[] = {
> > +};
> > +
> > +struct encoding_to_trap_configs {
> > +	const u32			encoding;
> > +	const u32			end;
> > +	const enum coarse_grain_trap_id	id;
> > +};
> > +
> > +#define SR_RANGE_TRAP(sr_start, sr_end, trap_id)	\
> > +	{						\
> > +		.encoding	= sr_start,		\
> > +		.end		= sr_end,		\
> > +		.id		= trap_id,		\
> > +	}
> > +
> > +#define SR_TRAP(sr, trap_id)		SR_RANGE_TRAP(sr, sr, trap_id)
> > +
> > +/*
> > + * Map encoding to trap bits for exception reported with EC=0x18.
> > + * These must only be evaluated when running a nested hypervisor, but
> > + * that the current context is not a hypervisor context. When the
> > + * trapped access matches one of the trap controls, the exception is
> > + * re-injected in the nested hypervisor.
> > + */
> > +static const struct encoding_to_trap_configs encoding_to_traps[] __initdata = {
> > +};
> > +
> > +static DEFINE_XARRAY(sr_forward_xa);
> > +
> > +void __init populate_nv_trap_config(void)
> > +{
> > +	for (int i = 0; i < ARRAY_SIZE(encoding_to_traps); i++) {
> > +		const struct encoding_to_trap_configs *ett = &encoding_to_traps[i];
> > +		void *prev;
> > +
> > +		prev = xa_store_range(&sr_forward_xa, ett->encoding, ett->end,
> > +				      xa_mk_value(ett->id), GFP_KERNEL);
> > +		WARN_ON(prev);
> > +	}
> > +
> > +	kvm_info("nv: %ld trap handlers\n", ARRAY_SIZE(encoding_to_traps));
> > +}
> > +
> > +static const enum coarse_grain_trap_id get_trap_config(u32 sysreg)
> > +{
> > +	return xa_to_value(xa_load(&sr_forward_xa, sysreg));
> > +}
> > +
> > +static enum trap_behaviour get_behaviour(struct kvm_vcpu *vcpu,
> > +					 const struct trap_bits *tb)
> > +{
> > +	enum trap_behaviour b = BEHAVE_HANDLE_LOCALLY;
> > +	u64 val;
> > +
> > +	val = __vcpu_sys_reg(vcpu, tb->index);
> > +	if ((val & tb->mask) == tb->value)
> > +		b |= tb->behaviour;
> > +
> > +	return b;
> > +}
> > +
> > +static enum trap_behaviour __do_compute_behaviour(struct kvm_vcpu *vcpu,
> > +						  const enum coarse_grain_trap_id id,
> > +						  enum trap_behaviour b)
> > +{
> > +	switch (id) {
> > +		const enum coarse_grain_trap_id *cgids;
> > +
> > +	case __RESERVED__ ... __MULTIPLE_CONTROL_BITS__ - 1:
> > +		if (likely(id != __RESERVED__))
> > +			b |= get_behaviour(vcpu, &coarse_trap_bits[id]);
> > +		break;
> > +	case __MULTIPLE_CONTROL_BITS__ ... __COMPLEX_CONDITIONS__ - 1:
> > +		/* Yes, this is recursive. Don't do anything stupid. */
> > +		cgids = coarse_control_combo[id - __MULTIPLE_CONTROL_BITS__];
> > +		for (int i = 0; cgids[i] != __RESERVED__; i++)
> > +			b |= __do_compute_behaviour(vcpu, cgids[i], b);
> > +		break;
> > +	default:
> > +		if (ARRAY_SIZE(ccc))
> > +			b |= ccc[id - __COMPLEX_CONDITIONS__](vcpu);
> > +		break;
> > +	}
> > +
> > +	return b;
> > +}
> > +
> > +static enum trap_behaviour compute_behaviour(struct kvm_vcpu *vcpu, u32 sysreg)
> > +{
> > +	const enum coarse_grain_trap_id id = get_trap_config(sysreg);
> > +	enum trap_behaviour b = BEHAVE_HANDLE_LOCALLY;
> > +
> > +	return __do_compute_behaviour(vcpu, id, b);
> > +}
> > +
> > +bool __check_nv_sr_forward(struct kvm_vcpu *vcpu)
> > +{
> > +	enum trap_behaviour b;
> > +	bool is_read;
> > +	u32 sysreg;
> > +	u64 esr;
> > +
> > +	if (!vcpu_has_nv(vcpu) || is_hyp_ctxt(vcpu))
> > +		return false;
> > +
> > +	esr = kvm_vcpu_get_esr(vcpu);
> > +	sysreg = esr_sys64_to_sysreg(esr);
> > +	is_read = (esr & ESR_ELx_SYS64_ISS_DIR_MASK) == ESR_ELx_SYS64_ISS_DIR_READ;
> > +
> > +	b = compute_behaviour(vcpu, sysreg);
> nit maybe compute_trap_behaviour would be clearer/more explicit about
> what it does here and before.

Yup, that's a sensible change. I'll do that.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
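
As an illustration of the evaluation scheme discussed in this thread, the following user-space sketch mimics the mask/value test of get_behaviour() and the recursive OR-ing of behaviours over a combination. It is only a sketch under stated assumptions: the register file is a plain array, the trap controls are invented, and every name (behaviour, trap_id, single, combos, compute, should_forward) is hypothetical. The final read/write decision in should_forward() is also an assumption, since the quoted excerpt stops before the code that consumes `b` and `is_read`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum behaviour {
	HANDLE_LOCALLY	= 0,
	FORWARD_READ	= 1 << 0,
	FORWARD_WRITE	= 1 << 1,
	FORWARD_ANY	= FORWARD_READ | FORWARD_WRITE,
};

/* Hypothetical IDs: single controls first, then combinations. */
enum trap_id { T_NONE, T_X, T_Y, T_MULTI, T_X_OR_Y = T_MULTI, T_END };

struct trap_bits {
	unsigned int	reg;		/* index into the mock register file */
	enum behaviour	behaviour;
	uint64_t	value;
	uint64_t	mask;
};

/* Made-up controls: reg 0 bit 0 traps writes, reg 1 bit 3 traps everything. */
static const struct trap_bits single[] = {
	[T_X] = { .reg = 0, .behaviour = FORWARD_WRITE, .value = 1 << 0, .mask = 1 << 0 },
	[T_Y] = { .reg = 1, .behaviour = FORWARD_ANY,   .value = 1 << 3, .mask = 1 << 3 },
};

static const enum trap_id *const combos[] = {
	[T_X_OR_Y - T_MULTI] = (const enum trap_id []){ T_X, T_Y, T_NONE },
};

/* OR together the behaviour of one control, or of every member of a combo. */
static int compute(const uint64_t *regs, enum trap_id id)
{
	int b = HANDLE_LOCALLY;

	if (id == T_NONE)
		return b;

	if (id < T_MULTI) {
		const struct trap_bits *tb = &single[id];

		if ((regs[tb->reg] & tb->mask) == tb->value)
			b |= tb->behaviour;
		return b;
	}

	const enum trap_id *members = combos[id - T_MULTI];

	for (int i = 0; members[i] != T_NONE; i++)
		b |= compute(regs, members[i]);

	return b;
}

/* Assumed consumer: forward only if the access direction is trapped. */
static bool should_forward(const uint64_t *regs, enum trap_id id, bool is_read)
{
	int b = compute(regs, id);

	return (b & (is_read ? FORWARD_READ : FORWARD_WRITE)) != 0;
}
```

With both mock registers at zero nothing is forwarded. Setting bit 0 of register 0 forwards writes only, and setting bit 3 of register 1 forwards both directions.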