From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A43EA15C3; Thu, 18 Jul 2024 01:42:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.10 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721266939; cv=none; b=IDJvlexrD4xaPlmrBvxcmzbxsAxFHA5YlOg7lG2+ymi9BaGYj6PgZir6NHo8UvYEuj6usavw4DxCA5gmEt1wCMPV4bSviELZvo82/5GKihmRBJjy3N36rMDU0VcBJRlJKzal+Hl91z6+R80WNqecLjfg3LnXyD5YLbEq7szQgW0= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721266939; c=relaxed/simple; bh=fYtsXOKoarU/BaH7mJyp7Q0as/Oahc5IxEnsQfZojes=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=QE4ao53VhGJncKVM868IXzOzLiyjmezjDDnNZYNWLvrSjTbSJgv6QTnfLizBBt6ctK4gyeekBluHie3RPvbc/GO1FyNCJXSjgfx5XOzmp7WMnhbhXJcvQHjXBrFrcoW66Nd96MdpnIVHdeXRKipmv5IF6TAsdQ2N6EvW9gwZkEc= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Cw6cbdTj; arc=none smtp.client-ip=198.175.65.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Cw6cbdTj" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1721266937; x=1752802937; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=fYtsXOKoarU/BaH7mJyp7Q0as/Oahc5IxEnsQfZojes=; b=Cw6cbdTjb24WhrQYsRfHiF/+13YzSfEfg106E5pdFwce8C2v44nGmxnZ 1vSIyZUoyquwuGaxbKGpScquIU4IoZ3U6bHQeNGzTUmQGTFw7HN9PGW6r +Mpj2D4y3uEaQhFX4JfrEe5bKeLKY1weoUxJP4iAtSrGIMkwuXADMxTZ0 d0Sj6pshxY/WeLkxxRiXNiREJFwNBkXhQAaWrdawksRs3/REEyvxeNepY ZP4d9rFRX3uYWEaNYcI+GAfSMQZ0bLrinqyzhRQJxZ1SFXXRo+t5kf0w/ ZfwMs4nTmJivhJ+zZIWDP3WML4l1zNiEcOgJ6yvnL2K3C1LX1pkSISD2h Q==; X-CSE-ConnectionGUID: LoCrI9p9R4SE9Q+WlGhngg== X-CSE-MsgGUID: ES6zXrjFSsK/Zjd+B9IXOQ== X-IronPort-AV: E=McAfee;i="6700,10204,11136"; a="36250883" X-IronPort-AV: E=Sophos;i="6.09,216,1716274800"; d="scan'208";a="36250883" Received: from fmviesa003.fm.intel.com ([10.60.135.143]) by orvoesa102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Jul 2024 18:42:16 -0700 X-CSE-ConnectionGUID: Ea567sx6QYK0hJ4Sq/AGBQ== X-CSE-MsgGUID: 2tu23JPnQcOQm0AKbe4UeA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,216,1716274800"; d="scan'208";a="54779185" Received: from ysheng4-mobl.ccr.corp.intel.com (HELO [10.238.2.30]) ([10.238.2.30]) by fmviesa003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Jul 2024 18:42:13 -0700 Message-ID: Date: Thu, 18 Jul 2024 09:42:11 +0800 Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v19 117/130] KVM: TDX: Silently ignore INIT/SIPI To: Isaku Yamahata Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , Kai Huang , chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com, rick.p.edgecombe@intel.com References: <4a4225de42be0f7568c5ecb5c22f2029f8e91d62.1708933498.git.isaku.yamahata@intel.com> <20240716225504.GE1900928@ls.amr.corp.intel.com> Content-Language: en-US From: Binbin Wu In-Reply-To: <20240716225504.GE1900928@ls.amr.corp.intel.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit On 7/17/2024 6:55 AM, Isaku Yamahata wrote: > On Mon, Jun 17, 2024 at 09:20:36AM +0800, > Binbin Wu wrote: > >> >> On 2/26/2024 4:26 PM, isaku.yamahata@intel.com wrote: >>> From: Isaku Yamahata >>> >>> The TDX module API doesn't provide API for VMM to inject INIT IPI and SIPI. >>> Instead it defines the different protocols to boot application processors. >>> Ignore INIT and SIPI events for the TDX guest. >>> >>> There are two options. 1) (silently) ignore INIT/SIPI request or 2) return >>> error to guest TDs somehow. Given that TDX guest is paravirtualized to >>> boot AP, the option 1 is chosen for simplicity. >>> >>> Signed-off-by: Isaku Yamahata >>> --- >>> arch/x86/include/asm/kvm-x86-ops.h | 1 + >>> arch/x86/include/asm/kvm_host.h | 2 ++ >>> arch/x86/kvm/lapic.c | 19 +++++++++++------- >>> arch/x86/kvm/svm/svm.c | 1 + >>> arch/x86/kvm/vmx/main.c | 32 ++++++++++++++++++++++++++++-- >>> arch/x86/kvm/vmx/tdx.c | 4 ++-- >>> 6 files changed, 48 insertions(+), 11 deletions(-) >>> >>> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h >>> index 22d93d4124c8..85c04aad6ab3 100644 >>> --- a/arch/x86/include/asm/kvm-x86-ops.h >>> +++ b/arch/x86/include/asm/kvm-x86-ops.h >>> @@ -149,6 +149,7 @@ KVM_X86_OP_OPTIONAL(migrate_timers) >>> KVM_X86_OP(msr_filter_changed) >>> KVM_X86_OP(complete_emulated_msr) >>> KVM_X86_OP(vcpu_deliver_sipi_vector) >>> +KVM_X86_OP(vcpu_deliver_init) >>> KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons); >>> KVM_X86_OP_OPTIONAL(get_untagged_addr) >>> KVM_X86_OP_OPTIONAL_RET0(gmem_max_level) >>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h >>> index bb8be091f996..2686c080820b 100644 >>> --- a/arch/x86/include/asm/kvm_host.h >>> +++ b/arch/x86/include/asm/kvm_host.h >>> @@ -1836,6 +1836,7 @@ struct kvm_x86_ops { >>> int (*complete_emulated_msr)(struct kvm_vcpu *vcpu, int err); >>> void (*vcpu_deliver_sipi_vector)(struct kvm_vcpu *vcpu, u8 vector); >>> + void (*vcpu_deliver_init)(struct kvm_vcpu *vcpu); >>> /* >>> * Returns vCPU specific APICv inhibit reasons >>> @@ -2092,6 +2093,7 @@ void kvm_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); >>> void kvm_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); >>> int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, int seg); >>> void kvm_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector); >>> +void kvm_vcpu_deliver_init(struct kvm_vcpu *vcpu); >>> int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int idt_index, >>> int reason, bool has_error_code, u32 error_code); >>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c >>> index 8025c7f614e0..431074679e83 100644 >>> --- a/arch/x86/kvm/lapic.c >>> +++ b/arch/x86/kvm/lapic.c >>> @@ -3268,6 +3268,16 @@ int kvm_lapic_set_pv_eoi(struct kvm_vcpu *vcpu, u64 data, unsigned long len) >>> return 0; >>> } >>> +void kvm_vcpu_deliver_init(struct kvm_vcpu *vcpu) >>> +{ >>> + kvm_vcpu_reset(vcpu, true); >>> + if (kvm_vcpu_is_bsp(vcpu)) >>> + vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; >>> + else >>> + vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED; >>> +} >>> +EXPORT_SYMBOL_GPL(kvm_vcpu_deliver_init); >>> + >>> int kvm_apic_accept_events(struct kvm_vcpu *vcpu) >>> { >>> struct kvm_lapic *apic = vcpu->arch.apic; >>> @@ -3299,13 +3309,8 @@ int kvm_apic_accept_events(struct kvm_vcpu *vcpu) >>> return 0; >>> } >>> - if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events)) { >>> - kvm_vcpu_reset(vcpu, true); >>> - if (kvm_vcpu_is_bsp(apic->vcpu)) >>> - vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; >>> - else >>> - vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED; >>> - } >>> + if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events)) >>> + static_call(kvm_x86_vcpu_deliver_init)(vcpu); >>> if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events)) { >>> if (vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) { >>> /* evaluate pending_events before reading the vector */ >>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c >>> index f76dd52d29ba..27546d993809 100644 >>> --- a/arch/x86/kvm/svm/svm.c >>> +++ b/arch/x86/kvm/svm/svm.c >>> @@ -5037,6 +5037,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { >>> .complete_emulated_msr = svm_complete_emulated_msr, >>> .vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector, >>> + .vcpu_deliver_init = kvm_vcpu_deliver_init, >>> .vcpu_get_apicv_inhibit_reasons = avic_vcpu_get_apicv_inhibit_reasons, >>> }; >>> diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c >>> index 4f3b872cd401..84d2dc818cf7 100644 >>> --- a/arch/x86/kvm/vmx/main.c >>> +++ b/arch/x86/kvm/vmx/main.c >>> @@ -320,6 +320,14 @@ static void vt_enable_smi_window(struct kvm_vcpu *vcpu) >>> } >>> #endif >>> +static bool vt_apic_init_signal_blocked(struct kvm_vcpu *vcpu) >>> +{ >>> + if (is_td_vcpu(vcpu)) >>> + return true; >> Since for TD, INIT is always blocked, then in kvm_apic_accept_events(), the >> code path to handle INIT/SIPI delivery will not be called, i.e, the OPs >> .vcpu_deliver_init() and .vcpu_deliver_sipi_vector() are never called for >> TD. >> Seems no need to add the new interface  vcpu_deliver_init or the new wrapper >> vt_vcpu_deliver_sipi_vector(). >> >> And consider the INIT/SIPI for TD: >> - Normally, for TD, INIT ans SIPI should not be set in APIC's >> pending_events. >>   Maybe we can call KVM_BUG_ON() in vt_apic_init_signal_blocked() for TD? >> - If INIT and SIPI are allowed be set in APIC's pending_events for somehow, >> the current code has a problem, it will never clear INIT bit in APIC's >> pending_events. >>   Then kvm_apic_accept_events() needs to execute more check code if INIT was >> once set. >>   INIT bit should be cleared with this assumption. > > KVM_SET_MP_STATE and KVM_SET_VCPU_EVENTS can set INIT/SIPI by the user space. > If we change those two IOCTLs to reject INIT/SIPI for TDX, we can go for the > above "Normally" option. Because I didn't want to touch the common functions, I > ended in extra callbacks. The user space shouldn't use those two IOCTLs for > TDX, though. Even though INIT/SIPI can be set by userspace to APIC's pending_events, for TD, INIT is always blocked, then in kvm_apic_accept_events(), the code path to handle INIT/SIPI delivery will not be called, i.e, the OPs .vcpu_deliver_init() and .vcpu_deliver_sipi_vector() are never called for TD. Can we just drop the new OPs .vcpu_deliver_init() and the new wrapper vt_vcpu_deliver_sipi_vector()?