From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from out-170.mta0.migadu.com (out-170.mta0.migadu.com [91.218.175.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 725CC1EE00D for ; Mon, 3 Mar 2025 22:10:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.170 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741039819; cv=none; b=N4wg7Wh5iXIxkJzreVWf8p7REG7U829Eo5hvrj05S9/L7eZ+49++P0rjy7nSW4zNZuzcHXXZ68STc56oRftz8TRXfIdxl/b9tWzYl0r7+n8oTV4ZXkmBaT12r+wI6EwxnTVd7DGQtLuxuxpjChIr/XxfHT4YIgrmv86Q9VPxMGA= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741039819; c=relaxed/simple; bh=Fq+aPCle+1rzMj5GZ0Hr68ErMZvpgCUwbKsGXlweaOA=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=mb2u+KTKUXVep79pjU5YQlLSA9BHFJ0NZYo0MfJ++NtAKC0uOgLfBbW2Xz0rP/0HCkxI4DPZELgVbPOTAXqpa+a3B6H0L9HLZpqzHB1397uGemCcQaDRNsL/N10d8BzojrOJ2GF1WfNthDYz6VFnmzR9O8zNTmW3ndv8EkgePUE= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=XnGLIOrR; arc=none smtp.client-ip=91.218.175.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="XnGLIOrR" Date: Mon, 3 Mar 2025 22:10:07 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1741039811; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=9yFPBBYOAYAtyEgP77LACsnC4woQs7RcjDSHEfoMl1w=; b=XnGLIOrR78h89TYst5HZrtxIIBxdd+8cPuZgw+LfrRXGmbmKZYiRPOPxz9VCWooqTqgxXh 54p+YbOSeq1O3vrdyaZbsxJsTONtHR8h04QXr+e4rxusDy1hgVOdRVRF1gr/i4qq68nfuD qtl6Ffgu1+Sifk/0vzsln+oMjuOq48w= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Yosry Ahmed To: Maxim Levitsky Cc: Sean Christopherson , Paolo Bonzini , kvm@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [RFC PATCH 09/13] KVM: nSVM: Handle nested TLB flush requests through TLB_CONTROL Message-ID: References: <20250205182402.2147495-1-yosry.ahmed@linux.dev> <20250205182402.2147495-10-yosry.ahmed@linux.dev> <2dfc8e02c16e78989bee94893cc48d531cdfa909.camel@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <2dfc8e02c16e78989bee94893cc48d531cdfa909.camel@redhat.com> X-Migadu-Flow: FLOW_OUT On Fri, Feb 28, 2025 at 09:06:18PM -0500, Maxim Levitsky wrote: > On Wed, 2025-02-05 at 18:23 +0000, Yosry Ahmed wrote: > > Handle L1's requests to flush L2's TLB through the TLB_CONTROL field of > > VMCB12. This is currently redundant because a full flush is executed on > > every nested transition, but is a step towards removing that. > > > > TLB_CONTROL_FLUSH_ALL_ASID flushes all ASIDs from L1's perspective, > > including its own, so do a guest TLB flush on both transitions. Never > > propagate TLB_CONTROL_FLUSH_ALL_ASID from the guest to the actual VMCB, > > as this gives the guest the power to flush the entire physical TLB > > (including translations for the host and other VMs). > > > > For other cases, the TLB flush is only done when entering L2. The nested > > NPT MMU is also sync'd because TLB_CONTROL also flushes NPT > > guest-physical mappings. > > > > All TLB_CONTROL values can be handled by KVM regardless of FLUSHBYASID > > support on the underlying CPU, so keep advertising FLUSHBYASID to the > > guest unconditionally. > > > > Signed-off-by: Yosry Ahmed > > --- > > arch/x86/kvm/svm/nested.c | 42 ++++++++++++++++++++++++++++++++------- > > arch/x86/kvm/svm/svm.c | 6 +++--- > > 2 files changed, 38 insertions(+), 10 deletions(-) > > > > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c > > index 0735177b95a1d..e2c59eb2907e8 100644 > > --- a/arch/x86/kvm/svm/nested.c > > +++ b/arch/x86/kvm/svm/nested.c > > @@ -484,19 +484,36 @@ static void nested_save_pending_event_to_vmcb12(struct vcpu_svm *svm, > > > > static void nested_svm_entry_tlb_flush(struct kvm_vcpu *vcpu) > > { > > + struct vcpu_svm *svm = to_svm(vcpu); > > + > > /* Handle pending Hyper-V TLB flush requests */ > > kvm_hv_nested_transtion_tlb_flush(vcpu, npt_enabled); > > > > + /* > > + * If L1 requested a TLB flush for L2, flush L2's TLB on nested entry > > + * and sync the nested NPT MMU, as TLB_CONTROL also flushes NPT > > + * guest-physical mappings. We technically only need to flush guest_mode > > + * page tables. > > + * > > + * KVM_REQ_TLB_FLUSH_GUEST will flush L2's ASID even if the underlying > > + * CPU does not support FLUSHBYASID (by assigning a new ASID), so we > > + * can handle all TLB_CONTROL values from L1 regardless. > > + * > > + * Note that TLB_CONTROL_FLUSH_ASID_LOCAL is handled exactly like > > + * TLB_CONTROL_FLUSH_ASID. We can technically flush less TLB entries, > > + * but this would require significantly more complexity. > > + */ > > I think it might make sense to note that we in essence support only one non zero ASID > in L1, the one that it picks for the nested guest. > > > Thus when asked to TLB_CONTROL_FLUSH_ALL_ASID > we need to flush the L2's real asid and L1 asid only. This is described by the comment in nested_svm_exit_tlb_flush(). Do you mean that we should also mention that here? I guess one way to make things clearer is to describe the behavior for all values of TLB_CONTROL here, and in nested_svm_exit_tlb_flush() just say /* see nested_svm_entry_tlb_flush() */. Would that improve things? > > > > + if (svm->nested.ctl.tlb_ctl != TLB_CONTROL_DO_NOTHING) { > > + if (nested_npt_enabled(svm)) > > + kvm_make_request(KVM_REQ_MMU_SYNC, vcpu); > > + kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu); > > + } > > + > > /* > > * TODO: optimize unconditional TLB flush/MMU sync. A partial list of > > * things to fix before this can be conditional: > > * > > - * - Honor L1's request to flush an ASID on nested VMRUN > > - * - Sync nested NPT MMU on VMRUN that flushes L2's ASID[*] > > * - Don't crush a pending TLB flush in vmcb02 on nested VMRUN > > - * > > - * [*] Unlike nested EPT, SVM's ASID management can invalidate nested > > - * NPT guest-physical mappings on VMRUN. > > */ > > kvm_make_request(KVM_REQ_MMU_SYNC, vcpu); > > kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu); > > @@ -504,9 +521,18 @@ static void nested_svm_entry_tlb_flush(struct kvm_vcpu *vcpu) > > > > static void nested_svm_exit_tlb_flush(struct kvm_vcpu *vcpu) > > { > > + struct vcpu_svm *svm = to_svm(vcpu); > > + > > /* Handle pending Hyper-V TLB flush requests */ > > kvm_hv_nested_transtion_tlb_flush(vcpu, npt_enabled); > > > > + /* > > + * If L1 had requested a full TLB flush when entering L2, also flush its > > + * TLB entries when exiting back to L1. > > + */ > > + if (svm->nested.ctl.tlb_ctl == TLB_CONTROL_FLUSH_ALL_ASID) > > + kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu); > > Makes sense. > > > + > > /* See nested_svm_entry_tlb_flush() */ > > kvm_make_request(KVM_REQ_MMU_SYNC, vcpu); > > kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu); > > @@ -825,7 +851,8 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa, > > nested_svm_copy_common_state(svm->vmcb01.ptr, svm->nested.vmcb02.ptr); > > > > svm_switch_vmcb(svm, &svm->nested.vmcb02); > > - nested_vmcb02_prepare_control(svm, vmcb12->save.rip, vmcb12->save.cs.base); > > + nested_vmcb02_prepare_control(svm, vmcb12->save.rip, > > + vmcb12->save.cs.base); > > nested_vmcb02_prepare_save(svm, vmcb12); > > > > ret = nested_svm_load_cr3(&svm->vcpu, svm->nested.save.cr3, > > @@ -1764,7 +1791,8 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu, > > nested_copy_vmcb_control_to_cache(svm, ctl); > > > > svm_switch_vmcb(svm, &svm->nested.vmcb02); > > - nested_vmcb02_prepare_control(svm, svm->vmcb->save.rip, svm->vmcb->save.cs.base); > > + nested_vmcb02_prepare_control(svm, svm->vmcb->save.rip, > > + svm->vmcb->save.cs.base); > > > > /* > > * While the nested guest CR3 is already checked and set by > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > > index 8342c7eadbba8..5e7b1c9bfa605 100644 > > --- a/arch/x86/kvm/svm/svm.c > > +++ b/arch/x86/kvm/svm/svm.c > > @@ -5242,9 +5242,9 @@ static __init void svm_set_cpu_caps(void) > > kvm_cpu_cap_set(X86_FEATURE_VMCBCLEAN); > > > > /* > > - * KVM currently flushes TLBs on *every* nested SVM transition, > > - * and so for all intents and purposes KVM supports flushing by > > - * ASID, i.e. KVM is guaranteed to honor every L1 ASID flush. > > + * KVM currently handles all TLB_CONTROL values set by L1, even > > + * if the underlying CPU does not. See > > + * nested_svm_transition_tlb_flush(). > > */ > > kvm_cpu_cap_set(X86_FEATURE_FLUSHBYASID); > > > > Patch looks OK, but I can't be 100% sure about this. > > Reviewed-by: Maxim Levitsky > > Best regards, > Maxim Levitsky > >