From: Marc Zyngier <maz@kernel.org>
To: eric.auger@redhat.com
Cc: kvmarm@lists.linux.dev, kvm@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
Catalin Marinas <catalin.marinas@arm.com>,
Mark Brown <broonie@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Will Deacon <will@kernel.org>,
Alexandru Elisei <alexandru.elisei@arm.com>,
Andre Przywara <andre.przywara@arm.com>,
Chase Conklin <chase.conklin@arm.com>,
Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>,
Darren Hart <darren@os.amperecomputing.com>,
Miguel Luis <miguel.luis@oracle.com>,
James Morse <james.morse@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Oliver Upton <oliver.upton@linux.dev>,
Zenghui Yu <yuzenghui@huawei.com>
Subject: Re: [PATCH v3 14/27] KVM: arm64: nv: Add trap forwarding infrastructure
Date: Thu, 10 Aug 2023 15:44:06 +0100
Message-ID: <87ttt7ot3d.wl-maz@kernel.org>
In-Reply-To: <18eae581-500b-9c9e-2cce-e2f5fb007071@redhat.com>

Hi Eric,

On Wed, 09 Aug 2023 14:27:27 +0100,
Eric Auger <eric.auger@redhat.com> wrote:
>
> Hi Marc,
>
> On 8/8/23 13:46, Marc Zyngier wrote:
> > A significant part of what a NV hypervisor needs to do is to decide
> > whether a trap from a L2+ guest has to be forwarded to a L1 guest
> > or handled locally. This is done by checking for the trap bits that
> > the guest hypervisor has set and acting accordingly, as described by
> > the architecture.
> >
> > A previous approach was to sprinkle a bunch of checks in all the
> > system register accessors, but this is pretty error prone and doesn't
> > help getting an overview of what is happening.
> >
> > Instead, implement a set of global tables that describe a trap bit,
> > combinations of trap bits, behaviours on trap, and what bits must
> > be evaluated on a system register trap.
> >
> > Although this is painful to describe, this allows to specify each
> > and every control bit in a static manner. To make it efficient,
> > the table is inserted in an xarray that is global to the system,
> > and checked each time we trap a system register while running
> > a L2 guest.
> >
> > Add the basic infrastructure for now, while additional patches will
> > implement configuration registers.
> >
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> > arch/arm64/include/asm/kvm_host.h | 1 +
> > arch/arm64/include/asm/kvm_nested.h | 2 +
> > arch/arm64/kvm/emulate-nested.c | 262 ++++++++++++++++++++++++++++
> > arch/arm64/kvm/sys_regs.c | 6 +
> > arch/arm64/kvm/trace_arm.h | 26 +++
> > 5 files changed, 297 insertions(+)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 721680da1011..cb1c5c54cedd 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -988,6 +988,7 @@ int kvm_handle_cp10_id(struct kvm_vcpu *vcpu);
> > void kvm_reset_sys_regs(struct kvm_vcpu *vcpu);
> >
> > int __init kvm_sys_reg_table_init(void);
> > +int __init populate_nv_trap_config(void);
> >
> > bool lock_all_vcpus(struct kvm *kvm);
> > void unlock_all_vcpus(struct kvm *kvm);
> > diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h
> > index 8fb67f032fd1..fa23cc9c2adc 100644
> > --- a/arch/arm64/include/asm/kvm_nested.h
> > +++ b/arch/arm64/include/asm/kvm_nested.h
> > @@ -11,6 +11,8 @@ static inline bool vcpu_has_nv(const struct kvm_vcpu *vcpu)
> > test_bit(KVM_ARM_VCPU_HAS_EL2, vcpu->arch.features));
> > }
> >
> > +extern bool __check_nv_sr_forward(struct kvm_vcpu *vcpu);
> > +
> > struct sys_reg_params;
> > struct sys_reg_desc;
> >
> > diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c
> > index b96662029fb1..1b1148770d45 100644
> > --- a/arch/arm64/kvm/emulate-nested.c
> > +++ b/arch/arm64/kvm/emulate-nested.c
> > @@ -14,6 +14,268 @@
> >
> > #include "trace.h"
> >
> > +enum trap_behaviour {
> > + BEHAVE_HANDLE_LOCALLY = 0,
> > + BEHAVE_FORWARD_READ = BIT(0),
> > + BEHAVE_FORWARD_WRITE = BIT(1),
> > + BEHAVE_FORWARD_ANY = BEHAVE_FORWARD_READ | BEHAVE_FORWARD_WRITE,
> > +};
> > +
> > +struct trap_bits {
> > + const enum vcpu_sysreg index;
> > + const enum trap_behaviour behaviour;
> > + const u64 value;
> > + const u64 mask;
> > +};
> > +
> > +enum trap_group {
> nit: Maybe add a comment saying that it relates to *coarse* trapping as
> opposed to the other enum which is named fgt_group_id. cgt_group_id may
> have been better but well ;-)

You mean I should apply some sort of consistency to this code in order
to help reviewers understand it? Fool! ;-)

Point taken, I'll go over it and apply your suggestion.

>
> > + /* Indicates no coarse trap control */
> > + __RESERVED__,
> > +
> > + /*
> > + * The first batch of IDs denote coarse trapping that are used
> > + * on their own instead of being part of a combination of
> > + * trap controls.
> > + */
> > +
> > + /*
> > + * Anything after this point is a combination of trap controls,
> coarse trap controls
Yup.
> > + * which all must be evaluated to decide what to do.
> > + */
> > + __MULTIPLE_CONTROL_BITS__,
> > +
> > + /*
> > + * Anything after this point requires a callback evaluating a
> > + * complex trap condition. Hopefully we'll never need this...
> > + */
> > + __COMPLEX_CONDITIONS__,
> > +
> > + /* Must be last */
> > + __NR_TRAP_GROUP_IDS__
> > +};
> > +
> > +static const struct trap_bits coarse_trap_bits[] = {
> > +};
> > +
> > +#define MCB(id, ...) \
> > + [id - __MULTIPLE_CONTROL_BITS__] = \
> > + (const enum trap_group []){ \
> > + __VA_ARGS__, __RESERVED__ \
> > + }
> > +
> > +static const enum trap_group *coarse_control_combo[] = {
> > +};
> > +
> > +typedef enum trap_behaviour (*complex_condition_check)(struct kvm_vcpu *);
> > +
> > +#define CCC(id, fn) \
> > + [id - __COMPLEX_CONDITIONS__] = fn
> > +
> > +static const complex_condition_check ccc[] = {
> > +};
> > +
> > +/*
> > + * Bit assignment for the trap controls. We use a 64bit word with the
> > + * following layout for each trapped sysreg:
> > + *
> > + * [9:0] enum trap_group (10 bits)
> > + * [13:10] enum fgt_group_id (4 bits)
> > + * [19:14] bit number in the FGT register (6 bits)
> > + * [20] trap polarity (1 bit)
> > + * [62:21] Unused (42 bits)
> > + * [63] RES0 - Must be zero, as lost on insertion in the xarray
> what do you mean by "as lost"

The xarray is only able to store 63 bits of information, not 64, as it
uses the LSB for its own purpose. This isn't a problem when storing a
pointer (the last 2 bits are usually 0 if pointing to a structure).
However, things get funny when you're trying to assign an integer
value to an xarray location.

This is why we use the xa_mk_value() helper, which is defined as:

static inline void *xa_mk_value(unsigned long v)
{
	WARN_ON((long)v < 0);
	return (void *)((v << 1) | 1);
}

and the opposite helper on retrieval:

static inline unsigned long xa_to_value(const void *entry)
{
	return (unsigned long)entry >> 1;
}

As you can now tell, the value is shifted left by one bit to make room
for the tag, so the MSB is lost on insertion. Hence the comment that
warns against the use of bit 63, as we will never get it back. I'll
add some extra checks for that in the code that populates the trap
configuration.

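
To make the loss of bit 63 concrete, here is a small user-space sketch
of the same round trip. It is not kernel code: the model_*() names are
made up for illustration, uint64_t stands in for unsigned long on a
64bit system, and model_value_survives() is only one possible shape
for such a check, not what the series ends up doing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space model of xa_mk_value(): shift left by one, set the tag bit. */
static inline uint64_t model_mk_value(uint64_t v)
{
	return (v << 1) | 1;
}

/* User-space model of xa_to_value(): drop the tag bit again. */
static inline uint64_t model_to_value(uint64_t entry)
{
	assert(entry & 1);	/* only tagged entries carry a value */
	return entry >> 1;
}

/* Hypothetical check: does the value survive a store/load round trip? */
static inline bool model_value_survives(uint64_t v)
{
	return model_to_value(model_mk_value(v)) == v;
}
```

Any value that fits in bits [62:0] comes back unchanged, while a value
with bit 63 set comes back with that bit cleared.
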
> > + */
> > +#define TC_CGT_BITS 10
> > +#define TC_FGT_BITS 4
> > +
> > +union trap_config {
> > + u64 val;
> > + struct {
> > + unsigned long cgt:TC_CGT_BITS; /* Coarse trap id */
> > + unsigned long fgt:TC_FGT_BITS; /* Fing Grained Trap id */
> Fine & align capital letter in Trap for both comments
Done.
> > + unsigned long bit:6; /* Bit number */
> > + unsigned long pol:1; /* Polarity */
> > + unsigned long unk:42; /* Unknown */
> s//Unknown/Unused?
Yup, that's better.
> > + unsigned long mbz:1; /* Must Be Zero */
> > + };
> > +};
> > +
> > +struct encoding_to_trap_config {
> > + const u32 encoding;
> > + const u32 end;
> > + const union trap_config tc;
> > +};
> > +
> > +#define SR_RANGE_TRAP(sr_start, sr_end, trap_id) \
> > + { \
> > + .encoding = sr_start, \
> > + .end = sr_end, \
> > + .tc = { \
> > + .cgt = trap_id, \
> > + }, \
> > + }
> > +
> > +#define SR_TRAP(sr, trap_id) SR_RANGE_TRAP(sr, sr, trap_id)
> > +
> > +/*
> > + * Map encoding to trap bits for exception reported with EC=0x18.
> > + * These must only be evaluated when running a nested hypervisor, but
> > + * that the current context is not a hypervisor context. When the
> > + * trapped access matches one of the trap controls, the exception is
> > + * re-injected in the nested hypervisor.
> I must confess I was confused by the "forwarding" terminology versus
> "re-injection into the nested hyp"
>
> cf.
>
> "decide whether a trap from a L2+ guest has to be forwarded to a L1 guest
> or handled locally"
>
> "re-injection into the nested hyp" sounds clearer to me.
I see them as two sides of the same coin: the "forwarding" is the
high-level action (we pass the exception on to the L1 hypervisor). The
"re-injection" is the low-level implementation of the forwarding,
which is a complicated process (a full world switch).
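
As a rough illustration of the "forwarding" half only, the decision
can be modelled in user space from the behaviour bits and the
direction of the access. This is a sketch under assumptions: the enum
mirrors the one in the patch with BIT() expanded by hand, and
should_forward() is a hypothetical helper, not a function from this
series. The re-injection itself (the world switch) is not modelled.

```c
#include <stdbool.h>

/* Mirrors enum trap_behaviour from the patch, without the kernel's BIT(). */
enum trap_behaviour {
	BEHAVE_HANDLE_LOCALLY	= 0,
	BEHAVE_FORWARD_READ	= 1 << 0,
	BEHAVE_FORWARD_WRITE	= 1 << 1,
	BEHAVE_FORWARD_ANY	= BEHAVE_FORWARD_READ | BEHAVE_FORWARD_WRITE,
};

/*
 * Hypothetical helper: returns true when the computed behaviour asks
 * for this kind of access (read or write) to be passed on to L1.
 */
static inline bool should_forward(enum trap_behaviour b, bool is_read)
{
	if (is_read)
		return !!(b & BEHAVE_FORWARD_READ);

	return !!(b & BEHAVE_FORWARD_WRITE);
}
```

With this model, BEHAVE_FORWARD_ANY forwards both reads and writes,
and BEHAVE_HANDLE_LOCALLY forwards neither.
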
>
> > + */
> > +static const struct encoding_to_trap_config encoding_to_cgt[] __initconst = {
> > +};
> > +
> > +static DEFINE_XARRAY(sr_forward_xa);
> > +
> > +static union trap_config get_trap_config(u32 sysreg)
> > +{
> > + return (union trap_config) {
> > + .val = xa_to_value(xa_load(&sr_forward_xa, sysreg)),
> > + };
> > +}
> > +
> > +int __init populate_nv_trap_config(void)
> > +{
> > + int ret = 0;
> > +
> > + BUILD_BUG_ON(sizeof(union trap_config) != sizeof(void *));
> > + BUILD_BUG_ON(__NR_TRAP_GROUP_IDS__ > BIT(TC_CGT_BITS));
> > +
> > + for (int i = 0; i < ARRAY_SIZE(encoding_to_cgt); i++) {
> > + const struct encoding_to_trap_config *cgt = &encoding_to_cgt[i];
> > + void *prev;
> > +
> > + prev = xa_store_range(&sr_forward_xa, cgt->encoding, cgt->end,
> > + xa_mk_value(cgt->tc.val), GFP_KERNEL);
> > +
> > + if (prev) {
> > + kvm_err("Duplicate CGT for (%d, %d, %d, %d, %d)\n",
> > + sys_reg_Op0(cgt->encoding),
> > + sys_reg_Op1(cgt->encoding),
> > + sys_reg_CRn(cgt->encoding),
> > + sys_reg_CRm(cgt->encoding),
> > + sys_reg_Op2(cgt->encoding));
> > + ret = -EINVAL;
> > + }
> > + }
> > +
> > + kvm_info("nv: %ld coarse grained trap handlers\n",
> > + ARRAY_SIZE(encoding_to_cgt));
> > +
> > + for (int id = __MULTIPLE_CONTROL_BITS__;
> > + id < (__COMPLEX_CONDITIONS__ - 1);
> > + id++) {
> > + const enum trap_group *cgids;
> > +
> > + cgids = coarse_control_combo[id - __MULTIPLE_CONTROL_BITS__];
> > +
> > + for (int i = 0; cgids[i] != __RESERVED__; i++) {
> > + if (cgids[i] >= __MULTIPLE_CONTROL_BITS__) {
> > + kvm_err("Recursive MCB %d/%d\n", id, cgids[i]);
> > + ret = -EINVAL;
> > + }
> > + }
> > + }
> > +
> > + if (ret)
> > + xa_destroy(&sr_forward_xa);
> > +
> > + return ret;
> > +}
> > +
> > +static enum trap_behaviour get_behaviour(struct kvm_vcpu *vcpu,
> > + const struct trap_bits *tb)
> > +{
> > + enum trap_behaviour b = BEHAVE_HANDLE_LOCALLY;
> > + u64 val;
> > +
> > + val = __vcpu_sys_reg(vcpu, tb->index);
> > + if ((val & tb->mask) == tb->value)
> > + b |= tb->behaviour;
> > +
> > + return b;
> > +}
> > +
> > +static enum trap_behaviour __do_compute_trap_behaviour(struct kvm_vcpu *vcpu,
> > + const enum trap_group id,
> > + enum trap_behaviour b)
> > +{
> > + switch (id) {
> > + const enum trap_group *cgids;
> > +
> > + case __RESERVED__ ... __MULTIPLE_CONTROL_BITS__ - 1:
> > + if (likely(id != __RESERVED__))
> > + b |= get_behaviour(vcpu, &coarse_trap_bits[id]);
> > + break;
> > + case __MULTIPLE_CONTROL_BITS__ ... __COMPLEX_CONDITIONS__ - 1:
> > + /* Yes, this is recursive. Don't do anything stupid. */
> > + cgids = coarse_control_combo[id - __MULTIPLE_CONTROL_BITS__];
> > + for (int i = 0; cgids[i] != __RESERVED__; i++)
> > + b |= __do_compute_trap_behaviour(vcpu, cgids[i], b);
> > + break;
> > + default:
> > + if (ARRAY_SIZE(ccc))
> > + b |= ccc[id - __COMPLEX_CONDITIONS__](vcpu);
> > + break;
> > + }
> > +
> > + return b;
> > +}
> > +
> > +static enum trap_behaviour compute_trap_behaviour(struct kvm_vcpu *vcpu,
> > + const union trap_config tc)
> > +{
> > + enum trap_behaviour b = BEHAVE_HANDLE_LOCALLY;
> > +
> > + return __do_compute_trap_behaviour(vcpu, tc.cgt, b);
> > +}
> > +
> > +bool __check_nv_sr_forward(struct kvm_vcpu *vcpu)
> > +{
> > + union trap_config tc;
> > + enum trap_behaviour b;
> > + bool is_read;
> > + u32 sysreg;
> > + u64 esr;
> > +
> > + if (!vcpu_has_nv(vcpu) || is_hyp_ctxt(vcpu))
> > + return false;
> > +
> > + esr = kvm_vcpu_get_esr(vcpu);
> > + sysreg = esr_sys64_to_sysreg(esr);
> > + is_read = (esr & ESR_ELx_SYS64_ISS_DIR_MASK) == ESR_ELx_SYS64_ISS_DIR_READ;
> > +
> > + tc = get_trap_config(sysreg);
> > +
> > + /*
> > + * A value of 0 for the whole entry means that we know nothing
> > + * for this sysreg, and that it cannot be forwareded. In this
> forwarded
Fixed.
Thanks again for all your suggestions.
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel