public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: mlevitsk@redhat.com
Cc: kvm@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>,
	 Paolo Bonzini <pbonzini@redhat.com>,
	x86@kernel.org,  Dave Hansen <dave.hansen@linux.intel.com>,
	Ingo Molnar <mingo@redhat.com>,
	 linux-kernel@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH 1/3] x86: KVM: VMX: Wrap GUEST_IA32_DEBUGCTL read/write with access functions
Date: Wed, 7 May 2025 10:18:36 -0700	[thread overview]
Message-ID: <aBuV7JmMU3TcsqFW@google.com> (raw)
In-Reply-To: <14eab14d368e68cb9c94c655349f94f44a9a15b4.camel@redhat.com>

On Thu, May 01, 2025, mlevitsk@redhat.com wrote:
> On Tue, 2025-04-22 at 16:33 -0700, Sean Christopherson wrote:
> > > @@ -2653,11 +2654,17 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> > >  	if (vmx->nested.nested_run_pending &&
> > >  	    (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)) {
> > >  		kvm_set_dr(vcpu, 7, vmcs12->guest_dr7);
> > > -		vmcs_write64(GUEST_IA32_DEBUGCTL, vmcs12->guest_ia32_debugctl);
> > > +		new_debugctl = vmcs12->guest_ia32_debugctl;
> > >  	} else {
> > >  		kvm_set_dr(vcpu, 7, vcpu->arch.dr7);
> > > -		vmcs_write64(GUEST_IA32_DEBUGCTL, vmx->nested.pre_vmenter_debugctl);
> > > +		new_debugctl = vmx->nested.pre_vmenter_debugctl;
> > >  	}
> > > +
> > > +	if (CC(!vmx_set_guest_debugctl(vcpu, new_debugctl, false))) {
> > 
> > The consistency check belongs in nested_vmx_check_guest_state(), only needs to
> > check the VM_ENTRY_LOAD_DEBUG_CONTROLS case, and should be posted as a separate
> > patch.
> 
> I can move it there. Can you explain why though you want this? Is it because of the
> order of checks specified in the PRM?

To be consistent with how KVM checks guest state.  The two checks in prepare_vmcs02()
are special cases.  vmx_guest_state_valid() consumes a huge variety of state, and
so replicating all of its logic for vmcs12 isn't worth doing.  The check on the
kvm_set_msr() for guest_ia32_perf_global_ctrl exists purely so that KVM doesn't
simply ignore the return value.

And to a lesser degree, because KVM assumes that guest state has been sanitized
after nested_vmx_check_guest_state() is called.  Violating that risks introducing
bugs, e.g. consuming vmcs12->guest_ia32_debugctl before it's been vetted could
theoretically be problematic.

> Currently GUEST_IA32_DEBUGCTL of the host is *written* in prepare_vmcs02. 
> Should I also move this write to nested_vmx_check_guest_state?

No.  nested_vmx_check_guest_state() verifies the incoming vmcs12 state,
prepare_vmcs02() merges the vmcs12 state with KVM's desires and fills vmcs02.

> Or should I write the value blindly in prepare_vmcs02 and then check the value
> of 'vmx->msr_ia32_debugctl' in nested_vmx_check_guest_state and fail if the value
> contains reserved bits? 

I don't follow.  nested_vmx_check_guest_state() is called before prepare_vmcs02().

> > > +bool vmx_set_guest_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated)
> > > +{
> > > +	u64 invalid = data & ~vmx_get_supported_debugctl(vcpu, host_initiated);
> > > +
> > > +	if (invalid & (DEBUGCTLMSR_BTF|DEBUGCTLMSR_LBR)) {
> > > +		kvm_pr_unimpl_wrmsr(vcpu, MSR_IA32_DEBUGCTLMSR, data);
> > > +		data &= ~(DEBUGCTLMSR_BTF|DEBUGCTLMSR_LBR);
> > > +		invalid &= ~(DEBUGCTLMSR_BTF|DEBUGCTLMSR_LBR);
> > > +	}
> > > +
> > > +	if (invalid)
> > > +		return false;
> > > +
> > > +	if (is_guest_mode(vcpu) && (get_vmcs12(vcpu)->vm_exit_controls &
> > > +					VM_EXIT_SAVE_DEBUG_CONTROLS))
> > > +		get_vmcs12(vcpu)->guest_ia32_debugctl = data;
> > > +
> > > +	if (intel_pmu_lbr_is_enabled(vcpu) && !to_vmx(vcpu)->lbr_desc.event &&
> > > +	    (data & DEBUGCTLMSR_LBR))
> > > +		intel_pmu_create_guest_lbr_event(vcpu);
> > > +
> > > +	__vmx_set_guest_debugctl(vcpu, data);
> > > +	return true;
> > 
> > Return 0/-errno, not true/false.
> 
> There are plenty of functions in this file and KVM that return boolean.

That doesn't make them "right".  For helpers that are obvious predicates, then
absolutely use a boolean return value.  The names for nested_vmx_check_eptp()
and vmx_control_verify() aren't very good, e.g. they should be
nested_vmx_is_valid_eptp() and vmx_is_valid_control(), but the intent is good.

But for flows like modifying guest state, KVM should return 0/-errno.

> e.g: 
> 
> static bool nested_vmx_check_eptp(struct kvm_vcpu *vcpu, u64 new_eptp)
> static inline bool vmx_control_verify(u32 control, u32 low, u32 high)
> static bool nested_evmcs_handle_vmclear(struct kvm_vcpu *vcpu, gpa_t vmptr)
> static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
> 						 struct vmcs12 *vmcs12)

These two should return 0/-errno.

 
> static bool nested_vmx_check_eptp(struct kvm_vcpu *vcpu, u64 new_eptp)
> static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)

Probably should return 0/-errno, but nested_get_vmcs12_pages() is a bit of a mess.

> ...
> 
> 
> I personally think that functions that emulate hardware should return boolean
> values or some hardware specific status code (e.g VMX failure code) because
> the real hardware never returns -EINVAL and such.

Real hardware absolutely "returns" granular error codes.  KVM even has informal
mappings between some of them, e.g. -EINVAL == #GP, -EFAULT == #PF, -EOPNOTSUPP == #UD,
BUG() == 3-strike #MC.

And hardware has many more ways to report errors to software. E.g. VMLAUNCH can
#UD, #GP(0), VM-Exit, VMfailInvalid, or VMFailValid with 30+ unique reasons.  #MC
has a crazy number of possible error encodings.  And so on and so forth.

Software visible error codes aside, comparing individual KVM functions to an
overall CPU is wildly misguided.  A more appropriate comparison would be between
a KVM function and the ucode for a single instruction/operation.  I highly, highly
doubt ucode flows are limited to binary yes/no outputs.

  reply	other threads:[~2025-05-07 17:18 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-16  0:25 [PATCH 0/3] KVM: x86: allow DEBUGCTL.DEBUGCTLMSR_FREEZE_IN_SMM passthrough Maxim Levitsky
2025-04-16  0:25 ` [PATCH 1/3] x86: KVM: VMX: Wrap GUEST_IA32_DEBUGCTL read/write with access functions Maxim Levitsky
2025-04-22 23:33   ` Sean Christopherson
2025-05-01 20:35     ` mlevitsk
2025-05-07 17:18       ` Sean Christopherson [this message]
2025-05-13  0:34         ` mlevitsk
2025-05-14 14:28           ` Sean Christopherson
2025-04-23  9:51   ` Mi, Dapeng
2025-05-01 20:34     ` mlevitsk
2025-05-07  5:17       ` Mi, Dapeng
2025-04-16  0:25 ` [PATCH 2/3] x86: KVM: VMX: cache guest written value of MSR_IA32_DEBUGCTL Maxim Levitsky
2025-04-16  0:25 ` [PATCH 3/3] x86: KVM: VMX: preserve host's DEBUGCTLMSR_FREEZE_IN_SMM while in the guest mode Maxim Levitsky
2025-04-22 23:41   ` Sean Christopherson
2025-05-01 20:41     ` mlevitsk
2025-05-01 20:53       ` mlevitsk
2025-05-07  5:27         ` Mi, Dapeng
2025-05-07 14:31           ` mlevitsk
2025-05-07 23:03         ` Sean Christopherson
2025-05-08 13:35           ` Sean Christopherson
2025-05-15  0:19             ` mlevitsk
2025-04-23 10:10   ` Mi, Dapeng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aBuV7JmMU3TcsqFW@google.com \
    --to=seanjc@google.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mlevitsk@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox