Date: Mon, 2 Mar 2026 18:22:27 -0800
In-Reply-To: <20260228033328.2285047-5-chengkev@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20260228033328.2285047-1-chengkev@google.com> <20260228033328.2285047-5-chengkev@google.com>
Subject: Re: [PATCH V4 4/4] KVM: SVM: Raise #UD if VMMCALL instruction is not intercepted
From: Sean Christopherson
To: Kevin Cheng
Cc: pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, yosry@kernel.org, Vitaly Kuznetsov
Content-Type: text/plain; charset="us-ascii"

+Vitaly

On Sat, Feb 28, 2026, Kevin Cheng wrote:
> The AMD APM states that if VMMCALL instruction is not intercepted, the
> instruction raises a #UD exception.
>
> Create a vmmcall exit handler that generates a #UD if a VMMCALL exit
> from L2 is being handled by L0, which means that L1 did not intercept
> the VMMCALL instruction. The exception to this is if the exiting
> instruction was for Hyper-V L2 TLB flush hypercalls as they are handled
> by L0.

*sigh*  Except this changelog doesn't capture *any* of the subtlety, and were
it not for an internal bug discussion, I would have literally no clue WTF is
going on.

There's no generic missed-#UD bug, because this code in recalc_intercepts()
effectively disables the VMMCALL intercept in vmcb02 if the intercept isn't
set in vmcb12.

	/*
	 * We want to see VMMCALLs from a nested guest only when Hyper-V L2 TLB
	 * flush feature is enabled.
	 */
	if (!nested_svm_l2_tlb_flush_enabled(&svm->vcpu))
		vmcb_clr_intercept(c, INTERCEPT_VMMCALL);

I.e. the only bug *knowingly* being fixed, maybe, is an edge case where
Hyper-V TLB flushes are enabled for L2 and the hypercall is something other
than one of the blessed Hyper-V hypercalls.  But in that case, it's not at all
clear to me that synthesizing a #UD into L2 is correct.  I can't find anything
in the TLFS (not surprising), so I guess anything goes?

Vitaly,

The scenario in question is where HV_X64_NESTED_DIRECT_FLUSH is enabled, L1
doesn't intercept VMMCALL, and L2 executes VMMCALL with something other than
one of the Hyper-V TLB flush hypercalls.
The proposed change is to synthesize #UD (which is what happens if
HV_X64_NESTED_DIRECT_FLUSH isn't enabled).  Does that sound sane?  Should KVM
instead return an error?

As for bugs that are *unknowingly* being fixed, intercepting VMMCALL and
manually injecting a #UD effectively fixes a bad interaction with KVM's
asinine KVM_X86_QUIRK_FIX_HYPERCALL_INSN.  If KVM doesn't intercept VMMCALL
while L2 is active (L1 doesn't want to intercept VMMCALL and the Hyper-V L2
TLB flush hypercall is disabled), then L2 will hang on the VMMCALL as KVM will
intercept the #UD, then "emulate" VMMCALL by trying to fixup the opcode and
restarting the instruction.

That can be "fixed" by disabling the quirk, or by hacking the fixup like so:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index db3f393192d9..3f6d9950f8f8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10506,17 +10506,22 @@ static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
 	 * If the quirk is disabled, synthesize a #UD and let the guest pick up
 	 * the pieces.
 	 */
-	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_FIX_HYPERCALL_INSN)) {
-		ctxt->exception.error_code_valid = false;
-		ctxt->exception.vector = UD_VECTOR;
-		ctxt->have_exception = true;
-		return X86EMUL_PROPAGATE_FAULT;
-	}
+	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_FIX_HYPERCALL_INSN))
+		goto inject_ud;
 
 	kvm_x86_call(patch_hypercall)(vcpu, instruction);
 
+	if (is_guest_mode(vcpu) && !memcmp(instruction, ctxt->fetch.data, 3))
+		goto inject_ud;
+
 	return emulator_write_emulated(ctxt, rip, instruction, 3,
 				       &ctxt->exception);
+
+inject_ud:
+	ctxt->exception.error_code_valid = false;
+	ctxt->exception.vector = UD_VECTOR;
+	ctxt->have_exception = true;
+	return X86EMUL_PROPAGATE_FAULT;
 }
 
 static int dm_request_for_irq_injection(struct kvm_vcpu *vcpu)
--

But that's extremely convoluted for no purpose that I can see.  Not
intercepting VMMCALL requires _more_ code and is overall more complex.
So unless I'm missing something, I'm going to tack on this to fix the L2
infinite loop, and then figure out what to do about Hyper-V, pending Vitaly's
input.

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 45d1496031a7..a55af647649c 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -156,13 +156,6 @@ void recalc_intercepts(struct vcpu_svm *svm)
 			vmcb_clr_intercept(c, INTERCEPT_VINTR);
 	}
 
-	/*
-	 * We want to see VMMCALLs from a nested guest only when Hyper-V L2 TLB
-	 * flush feature is enabled.
-	 */
-	if (!nested_svm_l2_tlb_flush_enabled(&svm->vcpu))
-		vmcb_clr_intercept(c, INTERCEPT_VMMCALL);
-
 	for (i = 0; i < MAX_INTERCEPT; i++)
 		c->intercepts[i] |= g->intercepts[i];