From: Manali Shukla <manali.shukla@amd.com>
To: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org, linux-kselftest@vger.kernel.org,
pbonzini@redhat.com, shuah@kernel.org, nikunj@amd.com,
thomas.lendacky@amd.com, vkuznets@redhat.com, bp@alien8.de,
babu.moger@amd.com, neeraj.upadhyay@amd.com,
Manali Shukla <manali.shukla@amd.com>
Subject: Re: [PATCH v6 2/3] KVM: SVM: Add Idle HLT intercept support
Date: Thu, 27 Feb 2025 21:35:52 +0530
Message-ID: <9e35b27f-affe-4345-8a87-07f4f285b63f@amd.com>
In-Reply-To: <Z8B3l8VGA2RHRI1j@google.com>

On 2/27/2025 8:02 PM, Sean Christopherson wrote:
> On Thu, Feb 27, 2025, Manali Shukla wrote:
>> On 2/26/2025 3:37 AM, Sean Christopherson wrote:
>>>> @@ -5225,6 +5230,8 @@ static __init void svm_set_cpu_caps(void)
>>>>  	if (vnmi)
>>>>  		kvm_cpu_cap_set(X86_FEATURE_VNMI);
>>>>
>>>> +	kvm_cpu_cap_check_and_set(X86_FEATURE_IDLE_HLT);
>>>
>>> I am 99% certain this is wrong. Or at the very least, severely lacking an
>>> explanation of why it's correct. If L1 enables Idle HLT but not HLT interception,
>>> then it is KVM's responsibility to NOT exit to L1 if there is a pending V_IRQ or
>>> V_NMI.
>>>
>>> Yeah, it's buggy. But, it's buggy in part because *existing* KVM support is buggy.
>>> If L1 disables HLT exiting, but it's enabled in KVM, then KVM will run L2 with
>>> HLT exiting and so it becomes KVM's responsibility to check for valid L2 wake events
>>> prior to scheduling out the vCPU if L2 executes HLT. E.g. nVMX handles this by
>>> reading vmcs02.GUEST_INTERRUPT_STATUS.RVI as part of vmx_has_nested_events(). I
>>> don't see the equivalent in nSVM.
>>>
>>> Amusingly, that means Idle HLT is actually a bug fix to some extent. E.g. if there
>>> is a pending V_IRQ/V_NMI in vmcb02, then running with Idle HLT will naturally do
>>> the right thing, i.e. not hang the vCPU.
>>>
>>> Anyways, for now, I think the easiest and best option is to simply skip full nested
>>> support for the moment.
>>>
>>
>> Got it, I see the issue you're talking about. I'll need to look into it a bit more to
>> fully understand it. So yeah, we can hold off on full nested support for idle HLT
>> intercept for now.
>>
>> Since we are planning to disable Idle HLT support on nested guests, should we do
>> something like this ?
>>
>> @@ -167,10 +167,15 @@ void recalc_intercepts(struct vcpu_svm *svm)
>>  	if (!nested_svm_l2_tlb_flush_enabled(&svm->vcpu))
>>  		vmcb_clr_intercept(c, INTERCEPT_VMMCALL);
>>
>> +	if (!guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_IDLE_HLT))
>> +		vmcb_clr_intercept(c, INTERCEPT_IDLE_HLT);
>> +
>>
>> When recalc_intercepts() copies the intercept values from vmcb01 to vmcb02, it also copies
>> the IDLE HLT intercept bit, which is set to 1 in vmcb01. Normally, this isn't a problem
>> because the HLT intercept takes priority when it's on. But if the HLT intercept gets
>> turned off for some reason, the IDLE HLT intercept will stay on, which is not what we
>> want.
>
> Why don't we want that?

The Idle HLT intercept bit remains '1' in vmcb02 for the L2 guest. When L2 then
executes HLT and there is no pending event, it still takes an Idle HLT exit, even
though Idle HLT was never explicitly enabled for the L2 guest.
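
To make the inheritance concrete, here is a minimal standalone sketch of the merge. It is not the KVM code: the bit values, the sketch_* names, and the with_fix flag are made up for illustration, and it assumes vmcb01 carries only the Idle HLT intercept while L1 has cleared the HLT intercept in vmcb12.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for illustration; not the real VMCB layout. */
#define SKETCH_INTERCEPT_HLT		UINT32_C(0x1)
#define SKETCH_INTERCEPT_IDLE_HLT	UINT32_C(0x2)

/*
 * Simplified model of the merge: vmcb02 starts as a copy of L0's
 * intercepts (vmcb01), optionally drops Idle HLT, and then ORs in the
 * intercepts requested by L1 (vmcb12).
 */
static uint32_t sketch_recalc_intercepts(uint32_t vmcb01, uint32_t vmcb12,
					 bool with_fix, bool guest_has_idle_hlt)
{
	uint32_t vmcb02 = vmcb01;

	/* Proposed check: clear Idle HLT if L1 was never offered the feature. */
	if (with_fix && !guest_has_idle_hlt)
		vmcb02 &= ~SKETCH_INTERCEPT_IDLE_HLT;

	return vmcb02 | vmcb12;
}

void sketch_self_check(void)
{
	/* Assumed setup: L0 intercepts only Idle HLT, L1 intercepts nothing. */
	uint32_t vmcb01 = SKETCH_INTERCEPT_IDLE_HLT;
	uint32_t vmcb12 = 0;

	/* Without the check, L2 silently inherits the Idle HLT intercept. */
	assert(sketch_recalc_intercepts(vmcb01, vmcb12, false, false) &
	       SKETCH_INTERCEPT_IDLE_HLT);

	/* With the check and no IDLE_HLT exposed to L1, the bit is dropped. */
	assert(!(sketch_recalc_intercepts(vmcb01, vmcb12, true, false) &
		 SKETCH_INTERCEPT_IDLE_HLT));
}
```

In this model, an intercept that L1 requests itself in vmcb12 is still honored, because the OR happens after the clear.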

I found this behavior by modifying the ipi_hlt_test a bit.

static void l2_guest_code(void)
{
	uint64_t icr_val;
	int i;

	x2apic_enable();

	icr_val = (APIC_DEST_SELF | APIC_INT_ASSERT | INTR_VECTOR);

	for (i = 0; i < NUM_ITERATIONS; i++) {
		cli();
		x2apic_write_reg(APIC_ICR, icr_val);
		safe_halt();
		GUEST_ASSERT(READ_ONCE(irq_received));
		WRITE_ONCE(irq_received, false);
		asm volatile("hlt");
	}

	GUEST_DONE();
}

static void l1_svm_code(struct svm_test_data *svm)
{
	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
	struct vmcb *vmcb = svm->vmcb;

	generic_svm_setup(svm, l2_guest_code,
			  &l2_guest_stack[L2_GUEST_STACK_SIZE]);

	vmcb->control.intercept &= ~BIT(INTERCEPT_HLT);

	run_guest(vmcb, svm->vmcb_gpa);
	GUEST_DONE();
}

Let me know if I am missing something.

-Manali