From: Sean Christopherson <seanjc@google.com>
To: mlevitsk@redhat.com
Cc: kvm@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>,
Borislav Petkov <bp@alien8.de>,
Paolo Bonzini <pbonzini@redhat.com>,
x86@kernel.org, Dave Hansen <dave.hansen@linux.intel.com>,
Ingo Molnar <mingo@redhat.com>,
linux-kernel@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH 1/3] x86: KVM: VMX: Wrap GUEST_IA32_DEBUGCTL read/write with access functions
Date: Wed, 7 May 2025 10:18:36 -0700 [thread overview]
Message-ID: <aBuV7JmMU3TcsqFW@google.com> (raw)
In-Reply-To: <14eab14d368e68cb9c94c655349f94f44a9a15b4.camel@redhat.com>
On Thu, May 01, 2025, mlevitsk@redhat.com wrote:
> On Tue, 2025-04-22 at 16:33 -0700, Sean Christopherson wrote:
> > > @@ -2653,11 +2654,17 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> > > if (vmx->nested.nested_run_pending &&
> > > (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)) {
> > > kvm_set_dr(vcpu, 7, vmcs12->guest_dr7);
> > > - vmcs_write64(GUEST_IA32_DEBUGCTL, vmcs12->guest_ia32_debugctl);
> > > + new_debugctl = vmcs12->guest_ia32_debugctl;
> > > } else {
> > > kvm_set_dr(vcpu, 7, vcpu->arch.dr7);
> > > - vmcs_write64(GUEST_IA32_DEBUGCTL, vmx->nested.pre_vmenter_debugctl);
> > > + new_debugctl = vmx->nested.pre_vmenter_debugctl;
> > > }
> > > +
> > > + if (CC(!vmx_set_guest_debugctl(vcpu, new_debugctl, false))) {
> >
> > The consistency check belongs in nested_vmx_check_guest_state(), only needs to
> > check the VM_ENTRY_LOAD_DEBUG_CONTROLS case, and should be posted as a separate
> > patch.
>
> I can move it there. Can you explain why though you want this? Is it because of the
> order of checks specified in the PRM?
To be consistent with how KVM checks guest state. The two checks in prepare_vmcs02()
are special cases. vmx_guest_state_valid() consumes a huge variety of state, and
so replicating all of its logic for vmcs12 isn't worth doing. The check on the
kvm_set_msr() for guest_ia32_perf_global_ctrl exists purely so that KVM doesn't
simply ignore the return value.
And to a lesser degree, because KVM assumes that guest state has been sanitized
after nested_vmx_check_guest_state() is called. Violating that risks introducing
bugs, e.g. consuming vmcs12->guest_ia32_debugctl before it's been vetted could
theoretically be problematic.
> Currently GUEST_IA32_DEBUGCTL of the host is *written* in prepare_vmcs02.
> Should I also move this write to nested_vmx_check_guest_state?
No. nested_vmx_check_guest_state() verifies the incoming vmcs12 state,
prepare_vmcs02() merges the vmcs12 state with KVM's desires and fills vmcs02.
> Or should I write the value blindly in prepare_vmcs02 and then check the value
> of 'vmx->msr_ia32_debugctl' in nested_vmx_check_guest_state and fail if the value
> contains reserved bits?
I don't follow. nested_vmx_check_guest_state() is called before prepare_vmcs02().
> > > +bool vmx_set_guest_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated)
> > > +{
> > > + u64 invalid = data & ~vmx_get_supported_debugctl(vcpu, host_initiated);
> > > +
> > > + if (invalid & (DEBUGCTLMSR_BTF|DEBUGCTLMSR_LBR)) {
> > > + kvm_pr_unimpl_wrmsr(vcpu, MSR_IA32_DEBUGCTLMSR, data);
> > > + data &= ~(DEBUGCTLMSR_BTF|DEBUGCTLMSR_LBR);
> > > + invalid &= ~(DEBUGCTLMSR_BTF|DEBUGCTLMSR_LBR);
> > > + }
> > > +
> > > + if (invalid)
> > > + return false;
> > > +
> > > + if (is_guest_mode(vcpu) && (get_vmcs12(vcpu)->vm_exit_controls &
> > > + VM_EXIT_SAVE_DEBUG_CONTROLS))
> > > + get_vmcs12(vcpu)->guest_ia32_debugctl = data;
> > > +
> > > + if (intel_pmu_lbr_is_enabled(vcpu) && !to_vmx(vcpu)->lbr_desc.event &&
> > > + (data & DEBUGCTLMSR_LBR))
> > > + intel_pmu_create_guest_lbr_event(vcpu);
> > > +
> > > + __vmx_set_guest_debugctl(vcpu, data);
> > > + return true;
> >
> > Return 0/-errno, not true/false.
>
> There are plenty of functions in this file and KVM that return boolean.
That doesn't make them "right". For helpers that are obvious predicates, then
absolutely use a boolean return value. The names for nested_vmx_check_eptp()
and vmx_control_verify() aren't very good, e.g. they should be
nested_vmx_is_valid_eptp() and vmx_is_valid_control(), but the intent is good.
But for flows like modifying guest state, KVM should return 0/-errno.
> e.g:
>
> static bool nested_vmx_check_eptp(struct kvm_vcpu *vcpu, u64 new_eptp)
> static inline bool vmx_control_verify(u32 control, u32 low, u32 high)
> static bool nested_evmcs_handle_vmclear(struct kvm_vcpu *vcpu, gpa_t vmptr)
> static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
> struct vmcs12 *vmcs12)
These two should return 0/-errno.
> static bool nested_vmx_check_eptp(struct kvm_vcpu *vcpu, u64 new_eptp)
> static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
Probably should return 0/-errno, but nested_get_vmcs12_pages() is a bit of a mess.
> ...
>
>
> I personally think that functions that emulate hardware should return boolean
> values or some hardware specific status code (e.g VMX failure code) because
> the real hardware never returns -EINVAL and such.
Real hardware absolutely "returns" granular error codes. KVM even has informal
mappings between some of them, e.g. -EINVAL == #GP, -EFAULT == #PF, -EOPNOTSUPP == #UD,
BUG() == 3-strike #MC.
And hardware has many more ways to report errors to software. E.g. VMLAUNCH can
#UD, #GP(0), VM-Exit, VMfailInvalid, or VMFailValid with 30+ unique reasons. #MC
has a crazy number of possible error encodings. And so on and so forth.
Software visible error codes aside, comparing individual KVM functions to an
overall CPU is wildly misguided. A more appropriate comparison would be between
a KVM function and the ucode for a single instruction/operation. I highly, highly
doubt ucode flows are limited to binary yes/no outputs.
next prev parent reply other threads:[~2025-05-07 17:18 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-16 0:25 [PATCH 0/3] KVM: x86: allow DEBUGCTL.DEBUGCTLMSR_FREEZE_IN_SMM passthrough Maxim Levitsky
2025-04-16 0:25 ` [PATCH 1/3] x86: KVM: VMX: Wrap GUEST_IA32_DEBUGCTL read/write with access functions Maxim Levitsky
2025-04-22 23:33 ` Sean Christopherson
2025-05-01 20:35 ` mlevitsk
2025-05-07 17:18 ` Sean Christopherson [this message]
2025-05-13 0:34 ` mlevitsk
2025-05-14 14:28 ` Sean Christopherson
2025-04-23 9:51 ` Mi, Dapeng
2025-05-01 20:34 ` mlevitsk
2025-05-07 5:17 ` Mi, Dapeng
2025-04-16 0:25 ` [PATCH 2/3] x86: KVM: VMX: cache guest written value of MSR_IA32_DEBUGCTL Maxim Levitsky
2025-04-16 0:25 ` [PATCH 3/3] x86: KVM: VMX: preserve host's DEBUGCTLMSR_FREEZE_IN_SMM while in the guest mode Maxim Levitsky
2025-04-22 23:41 ` Sean Christopherson
2025-05-01 20:41 ` mlevitsk
2025-05-01 20:53 ` mlevitsk
2025-05-07 5:27 ` Mi, Dapeng
2025-05-07 14:31 ` mlevitsk
2025-05-07 23:03 ` Sean Christopherson
2025-05-08 13:35 ` Sean Christopherson
2025-05-15 0:19 ` mlevitsk
2025-04-23 10:10 ` Mi, Dapeng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aBuV7JmMU3TcsqFW@google.com \
--to=seanjc@google.com \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=hpa@zytor.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=mlevitsk@redhat.com \
--cc=pbonzini@redhat.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.