Date: Wed, 08 Dec 2021 09:56:20 +0000
Message-ID: <87h7bj1ku3.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: james.morse@arm.com, suzuki.poulose@arm.com, will@kernel.org,
	mark.rutland@arm.com, linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu, tglx@linutronix.de, mingo@redhat.com
Subject: Re: [PATCH v2 4/4] KVM: arm64: Refuse to run VCPU if the PMU
 doesn't match the physical CPU
In-Reply-To:
References: <20211206170223.309789-1-alexandru.elisei@arm.com>
	<20211206170223.309789-5-alexandru.elisei@arm.com>

On Tue, 07 Dec 2021 14:17:56 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi,
> 
> On Mon, Dec 06, 2021 at 05:02:23PM +0000, Alexandru Elisei wrote:
> > Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
> > device ioctl. If the VCPU is scheduled on a physical CPU which has a
> > different PMU, the perf events needed to emulate a guest PMU won't be
> > scheduled in and the guest performance counters will stop counting. Treat
> > it as a userspace error and refuse to run the VCPU in this situation.
> > 
> > The VCPU is flagged as being scheduled on the wrong CPU in vcpu_load(), but
> > the flag is cleared when KVM_RUN enters the non-preemptible section
> > instead of in vcpu_put(); this has been done on purpose so the error
> > condition is communicated as soon as possible to userspace, otherwise
> > vcpu_load() on the wrong CPU followed by a vcpu_put() would clear the flag.
> > 
> > Suggested-by: Marc Zyngier <maz@kernel.org>
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> > I agonized for hours about the best name for the VCPU flag and the
> > accessors. If someone has a better idea, please tell me and I'll change
> > them.
> > 
> >  Documentation/virt/kvm/devices/vcpu.rst |  6 +++++-
> >  arch/arm64/include/asm/kvm_host.h       | 12 ++++++++++++
> >  arch/arm64/include/uapi/asm/kvm.h       |  3 +++
> >  arch/arm64/kvm/arm.c                    | 19 +++++++++++++++++++
> >  arch/arm64/kvm/pmu-emul.c               |  1 +
> >  5 files changed, 40 insertions(+), 1 deletion(-)
> > 
> > diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> > index c82be5cbc268..9ae47b7c3652 100644
> > --- a/Documentation/virt/kvm/devices/vcpu.rst
> > +++ b/Documentation/virt/kvm/devices/vcpu.rst
> > @@ -128,7 +128,11 @@ systems where there are at least two CPU PMUs on the system.
> >  
> >  Note that KVM will not make any attempts to run the VCPU on the physical CPUs
> >  associated with the PMU specified by this attribute. This is entirely left to
> > -userspace.
> > +userspace. However, attempting to run the VCPU on a physical CPU not supported
> > +by the PMU will fail and KVM_RUN will return with
> > +exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
> > +the hardware_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED
> > +and the cpu field to the processor id.
> >  
> >  2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
> >  =================================
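To make the documented behaviour concrete, here is a minimal userspace
sketch (not part of the patch; it assumes an already created VCPU file
descriptor and its mmap'ed struct kvm_run, and the uapi constant added
by this series) of how a VMM could consume this exit:

#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Illustrative only: "vcpu_fd" is an open VCPU file descriptor and "run"
 * is its mmap'ed struct kvm_run.
 */
static int run_vcpu_once(int vcpu_fd, struct kvm_run *run)
{
	if (ioctl(vcpu_fd, KVM_RUN, 0) < 0) {
		perror("KVM_RUN");
		return -1;
	}

	if (run->exit_reason == KVM_EXIT_FAIL_ENTRY &&
	    run->fail_entry.hardware_entry_failure_reason ==
			KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED) {
		/* The thread ran on a physical CPU the assigned PMU does not cover. */
		fprintf(stderr, "VCPU ran on unsupported CPU %u\n",
			run->fail_entry.cpu);
		return -1;
	}

	return 0;
}

The cpu field only identifies the offending physical CPU; scheduling the
VCPU thread onto a supported CPU remains entirely up to userspace.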
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 2a5f7f38006f..0c453f2e48b6 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -385,6 +385,8 @@ struct kvm_vcpu_arch {
> >  		u64 last_steal;
> >  		gpa_t base;
> >  	} steal;
> > +
> > +	cpumask_var_t supported_cpus;
> >  };
> >  
> >  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> > @@ -420,6 +422,7 @@ struct kvm_vcpu_arch {
> >  #define KVM_ARM64_EXCEPT_MASK		(7 << 9) /* Target EL/MODE */
> >  #define KVM_ARM64_DEBUG_STATE_SAVE_SPE	(1 << 12) /* Save SPE context if active */
> >  #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE	(1 << 13) /* Save TRBE context if active */
> > +#define KVM_ARM64_ON_UNSUPPORTED_CPU	(1 << 14) /* Physical CPU not in supported_cpus */
> >  
> >  #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
> >  				    KVM_GUESTDBG_USE_SW_BP | \
> > @@ -460,6 +463,15 @@ struct kvm_vcpu_arch {
> >  #define vcpu_has_ptrauth(vcpu)		false
> >  #endif
> >  
> > +#define vcpu_on_unsupported_cpu(vcpu)					\
> > +	((vcpu)->arch.flags & KVM_ARM64_ON_UNSUPPORTED_CPU)
> > +
> > +#define vcpu_set_on_unsupported_cpu(vcpu)				\
> > +	((vcpu)->arch.flags |= KVM_ARM64_ON_UNSUPPORTED_CPU)
> > +
> > +#define vcpu_clear_on_unsupported_cpu(vcpu)				\
> > +	((vcpu)->arch.flags &= ~KVM_ARM64_ON_UNSUPPORTED_CPU)
> > +
> >  #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.regs)
> >  
> >  /*
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index 1d0a0a2a9711..d49f714f48e6 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -414,6 +414,9 @@ struct kvm_arm_copy_mte_tags {
> >  #define KVM_PSCI_RET_INVAL		PSCI_RET_INVALID_PARAMS
> >  #define KVM_PSCI_RET_DENIED		PSCI_RET_DENIED
> >  
> > +/* run->fail_entry.hardware_entry_failure_reason codes. */
> > +#define KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED	(1ULL << 0)
> > +
> >  #endif
> >  
> >  #endif /* __ARM_KVM_H__ */
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index e4727dc771bf..1124c3efdd94 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -327,6 +327,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> >  
> >  	vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
> >  
> > +	if (!zalloc_cpumask_var(&vcpu->arch.supported_cpus, GFP_KERNEL))
> > +		return -ENOMEM;
> > +	cpumask_copy(vcpu->arch.supported_cpus, cpu_possible_mask);

Nit: can we just assign the cpu_possible_mask pointer instead, and only
perform the allocation when assigning a specific PMU?
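To illustrate, a hypothetical sketch (the field type and helper names
below are made up for the example, not taken from the patch) of keeping a
pointer that defaults to cpu_possible_mask and is only narrowed when
userspace assigns a specific PMU:

#include <linux/cpumask.h>

struct example_vcpu_arch {
	/* Points at cpu_possible_mask unless a specific PMU was assigned. */
	const struct cpumask *supported_cpus;
};

static void example_vcpu_init_supported_cpus(struct example_vcpu_arch *arch)
{
	/* Default case: every possible CPU is acceptable, nothing to allocate. */
	arch->supported_cpus = cpu_possible_mask;
}

static void example_vcpu_assign_pmu_cpus(struct example_vcpu_arch *arch,
					 const struct cpumask *pmu_cpus)
{
	/* Narrow the set only when userspace selects a specific PMU. */
	arch->supported_cpus = pmu_cpus;
}

This would also remove the need for the free_cpumask_var() call in
kvm_arch_vcpu_destroy() below, since nothing is allocated per VCPU.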
> > +
> >  	/* Set up the timer */
> >  	kvm_timer_vcpu_init(vcpu);
> >  
> > @@ -354,6 +358,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> >  	if (vcpu->arch.has_run_once && unlikely(!irqchip_in_kernel(vcpu->kvm)))
> >  		static_branch_dec(&userspace_irqchip_in_use);
> >  
> > +	free_cpumask_var(vcpu->arch.supported_cpus);
> >  	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
> >  	kvm_timer_vcpu_terminate(vcpu);
> >  	kvm_pmu_vcpu_destroy(vcpu);
> > @@ -432,6 +437,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> >  	if (vcpu_has_ptrauth(vcpu))
> >  		vcpu_ptrauth_disable(vcpu);
> >  	kvm_arch_vcpu_load_debug_state_flags(vcpu);
> > +
> > +	if (!cpumask_test_cpu(smp_processor_id(), vcpu->arch.supported_cpus))
> > +		vcpu_set_on_unsupported_cpu(vcpu);
> >  }
> >  
> >  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > @@ -822,6 +830,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> >  		 */
> >  		preempt_disable();
> >  
> > +		if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
> > +			vcpu_clear_on_unsupported_cpu(vcpu);
> > +			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> > +			run->fail_entry.hardware_entry_failure_reason
> > +				= KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
> > +			run->fail_entry.cpu = smp_processor_id();
> 

Can you move this hunk to kvm_vcpu_exit_request()? It certainly would
fit better there, as we have checks for other exit reasons to
userspace.

> I just realised that this is wrong for the same reason that KVM doesn't
> clear the unsupported CPU flag on vcpu_put: a vcpu_put/load that happened
> after the vcpu_load that set the flag and before preemption is disabled
> could mean that now the thread is executing on a different physical CPU
> than the physical CPU that caused the flag to be set. To make things worse,
> this CPU might even be in supported_cpus, which would be extremely
> confusing for someone trying to decipher what went wrong.
> 
> I see three solutions here:
> 
> 1. Drop setting the fail_entry.cpu field.
> 
> 2. Make vcpu_put clear the flag, which means that if the flag is set here
> then the VCPU is definitely executing on the wrong physical CPU and
> smp_processor_id() will be useful.

This looks reasonable to me.

> 3. Carry the unsupported CPU ID information in a new field in struct
> kvm_vcpu_arch.
> 
> I honestly don't have a preference. Maybe slightly towards solution number
> 2, as it makes the code symmetrical and removes the subtlety around when
> the VCPU flag is cleared. But this would be done at the expense of
> userspace possibly finding out a lot later (or never) that something went
> wrong.

I don't really get your argument about "userspace possibly finding out
a lot later...". Yes, if the vcpu gets migrated to a 'good' CPU after
a sequence of put/load, userspace will be lucky. But that's the rule
of the game. If userspace pins the vcpu to the wrong CPU type, then
the information will be consistent.

	M.

-- 
Without deviation from the norm, progress is not possible.
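For reference, a minimal sketch of what option 2 above (clearing the flag
in vcpu_put) could look like; it reuses the helper from the quoted patch
but is illustrative only and was not posted in this thread:

/*
 * Illustrative sketch of option 2: clear the flag on vcpu_put so that,
 * whenever the flag is observed set, the thread is still running on the
 * offending physical CPU and smp_processor_id() is meaningful.
 */
void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
{
	/* ... existing put path ... */

	/*
	 * Symmetric with the check in kvm_arch_vcpu_load(): a later load on
	 * a supported CPU simply leaves the flag clear.
	 */
	vcpu_clear_on_unsupported_cpu(vcpu);
}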