Re: [PATCH] KVM: nVMX: Fix bug of injecting L2 exception into L1

public inbox for kvm@vger.kernel.org

From: Liran Alon <LIRAN.ALON@ORACLE.COM>
To: Paolo Bonzini <pbonzini@redhat.com>,
	rkrcmar@redhat.com, kvm@vger.kernel.org
Cc: jmattson@google.com, wanpeng.li@hotmail.com,
	idan.brown@ORACLE.COM,
	Krish Sadhukhan <krish.sadhukhan@ORACLE.COM>
Subject: Re: [PATCH] KVM: nVMX: Fix bug of injecting L2 exception into L1
Date: Tue, 21 Nov 2017 00:46:01 +0200
Message-ID: <5A135B29.9080805@ORACLE.COM>
In-Reply-To: <2d56b269-0d21-10a3-80ce-19e989d6903b@redhat.com>



On 20/11/17 23:47, Paolo Bonzini wrote:
> On 19/11/2017 17:25, Liran Alon wrote:
>> L2 RDMSR could be handled as described below:
>> 1) L2 exits to L0 on RDMSR and calls handle_rdmsr().
>> 2) handle_rdmsr() calls kvm_inject_gp() which sets
>> KVM_REQ_EVENT, exception.pending=true and exception.injected=false.
>> 3) vcpu_enter_guest() consumes KVM_REQ_EVENT and calls
>> inject_pending_event() which calls vmx_check_nested_events()
>> which sees that exception.pending=true but
>> nested_vmx_check_exception() returns 0 and therefore does nothing at
>> this point. However let's assume it later sees vmx-preemption-timer
>> expired and therefore exits from L2 to L1 by calling
>> nested_vmx_vmexit().
>> 4) nested_vmx_vmexit() calls prepare_vmcs12()
>> which calls vmcs12_save_pending_event() but it does nothing as
>> exception.injected is false. Also prepare_vmcs12() calls
>> kvm_clear_exception_queue() which does nothing as
>> exception.injected is already false.
>> 5) We now return from vmx_check_nested_events() with 0 while still
>> having exception.pending=true!
>> 6) Therefore inject_pending_event() continues
>> and we inject L2 exception to L1!...
>
> But this is buggy as well, because the #GP is lost, isn't it?

I don't think so.

Since commit 664f8e26b00c ("KVM: X86: Fix loss of exception which has
not yet been injected"), there is a fundamental difference between a
pending exception and an injected exception.
A pending exception means that no side effects of the exception have
been applied yet, including advancing RIP past the instruction that
caused the exception. In our case, for example, handle_rdmsr() calls
kvm_inject_gp() and returns without calling
kvm_skip_emulated_instruction(), which is what advances RIP.

Therefore, when we exit from L2 to L1 on vmx-preemption-timer, we can
safely clear exception.pending: when L1 resumes L2, the same RDMSR
instruction will run again and raise #GP again.
This is also why vmcs12_save_pending_event() only saves "injected"
events in the VMCS12 idt-vectoring-info and not "pending" events
(interrupt.pending is a misleading name and I will rename it to
interrupt.injected in an upcoming patch; see explanation below). And
this is also why exception.pending is checked for interception by L1
but exception.injected is not.
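
To make the pending/injected distinction concrete, here is a toy model
of the scenario. It is a sketch only, not kernel code: struct toy_vcpu
and every toy_* function are hypothetical names made up for this mail,
and the model keeps just the two flags and RIP. The "patched" parameter
stands for whether the clear-on-vmexit step also resets the pending
flag, as this patch does in kvm_clear_exception_queue().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the vcpu state discussed above (hypothetical). */
struct toy_vcpu {
	unsigned long rip;        /* still points at the faulting RDMSR */
	bool exception_pending;   /* queued, no side effects applied yet */
	bool exception_injected;  /* side effects applied, being delivered */
};

/* Models step 2: queue #GP and leave RIP untouched. */
static void toy_queue_gp(struct toy_vcpu *vcpu)
{
	vcpu->exception_pending = true;
	vcpu->exception_injected = false;
}

/* Models the save on nested vmexit: only injected events are recorded. */
static bool toy_save_in_vmcs12(const struct toy_vcpu *vcpu)
{
	return vcpu->exception_injected;
}

/* Models step 4: clear the queue, with or without the fix. */
static void toy_nested_vmexit(struct toy_vcpu *vcpu, bool patched)
{
	if (patched)
		vcpu->exception_pending = false;
	vcpu->exception_injected = false;
}

/* Returns true if a #GP meant for L2 is still pending once L1 runs. */
static bool toy_gp_leaks_to_l1(bool patched)
{
	struct toy_vcpu vcpu = { .rip = 0x1000 };

	toy_queue_gp(&vcpu);
	assert(!toy_save_in_vmcs12(&vcpu)); /* nothing to save: not injected */
	toy_nested_vmexit(&vcpu, patched);
	assert(vcpu.rip == 0x1000);         /* RDMSR will simply run again */
	return vcpu.exception_pending;
}
```

In this model toy_gp_leaks_to_l1(false) reports the stale pending flag,
while toy_gp_leaks_to_l1(true) does not, and in both cases RIP still
points at the RDMSR, so nothing is lost by dropping the pending #GP.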

I can confirm this patch works because I have written a kvm-unit-test
that reproduces this issue. After the fix, the #GP is not lost and is
correctly raised directly to L2.
(I haven't posted the unit-test yet because it depends heavily on a
correct vmx-preemption-timer config, which varies between environments.)

>
> Is the bug that the preemption timer vmexit should only be injected if
> there are no pending exceptions?  In fact, the same bug could also
> happened for NMIs or interrupts, I think.
As explained above, I don't think so.

In general, KVM code is a bit of a mess when it comes to cleanly
separating a "pending" event from an "injected" event. NMIs and
exceptions are now separated nicely. However, interrupt.pending is
actually interrupt.injected, as it is set after the side effects have
occurred (for example, the bit has moved from IRR to ISR).
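
As an illustration of what "side effects have occurred" means for
interrupts, here is a toy model of the IRR to ISR move. It is a sketch
under loud assumptions: struct toy_lapic and the toy_* functions are
hypothetical, only 32 vectors are modelled, and priority classes, TPR
and EOI are ignored entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Toy local APIC with 32 vectors (hypothetical, not KVM's layout). */
struct toy_lapic {
	uint32_t irr;  /* requested, not yet acknowledged */
	uint32_t isr;  /* acknowledged, in service */
};

/* Raising a vector only sets its IRR bit: no side effects yet. */
static void toy_raise(struct toy_lapic *apic, unsigned int vector)
{
	apic->irr |= UINT32_C(1) << vector;
}

/*
 * Acknowledge the highest requested vector: its bit moves from IRR to
 * ISR. Returns the vector, or -1 if nothing is requested. After this
 * point the event can no longer be dropped and simply re-evaluated.
 */
static int toy_ack(struct toy_lapic *apic)
{
	for (int vector = 31; vector >= 0; vector--) {
		uint32_t bit = UINT32_C(1) << vector;

		if (apic->irr & bit) {
			apic->irr &= ~bit;
			apic->isr |= bit;
			return vector;
		}
	}
	return -1;
}

/* Small self-check: acknowledging moves exactly one bit. */
static void toy_demo(void)
{
	struct toy_lapic apic = { 0, 0 };

	toy_raise(&apic, 3);
	assert(toy_ack(&apic) == 3);
	assert(apic.irr == 0 && apic.isr == (UINT32_C(1) << 3));
}
```

In terms of this model, a flag set only after toy_ack() has run behaves
like "injected" rather than "pending", which is the naming problem
described above.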

Tomorrow I am going to post another series (a v2 of a series I posted
here previously) that attempts to clean this up and fixes a couple of
bugs in this area along the way.

>
> So, something like (101% untested):
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 5b436f4e6e3e..64eecb8b126d 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -11137,8 +11137,9 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
>   	bool block_nested_events =
>   	    vmx->nested.nested_run_pending || kvm_event_needs_reinjection(vcpu);
>
> -	if (vcpu->arch.exception.pending &&
> -		nested_vmx_check_exception(vcpu, &exit_qual)) {
> +	if (vcpu->arch.exception.pending) {
> +		if (!nested_vmx_check_exception(vcpu, &exit_qual))
> +			return 0;
>   		if (block_nested_events)
>   			return -EBUSY;
>   		nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
> @@ -11146,15 +11147,9 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
>   		return 0;
>   	}
>
> -	if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) &&
> -	    vmx->nested.preemption_timer_expired) {
> -		if (block_nested_events)
> -			return -EBUSY;
> -		nested_vmx_vmexit(vcpu, EXIT_REASON_PREEMPTION_TIMER, 0, 0);
> -		return 0;
> -	}
> -
> -	if (vcpu->arch.nmi_pending && nested_exit_on_nmi(vcpu)) {
> +	if (vcpu->arch.nmi_pending) {
> +		if (!nested_exit_on_nmi(vcpu))
> +			return 0;
>   		if (block_nested_events)
>   			return -EBUSY;
>   		nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
> @@ -11169,14 +11164,23 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
>   		return 0;
>   	}
>
> -	if ((kvm_cpu_has_interrupt(vcpu) || external_intr) &&
> -	    nested_exit_on_intr(vcpu)) {
> +	if (kvm_cpu_has_interrupt(vcpu) || external_intr) {
> +		if (!nested_exit_on_intr(vcpu))
> +			return 0;
>   		if (block_nested_events)
>   			return -EBUSY;
>   		nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT, 0, 0);
>   		return 0;
>   	}
>
> +	if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) &&
> +	    vmx->nested.preemption_timer_expired) {
> +		if (block_nested_events)
> +			return -EBUSY;
> +		nested_vmx_vmexit(vcpu, EXIT_REASON_PREEMPTION_TIMER, 0, 0);
> +		return 0;
> +	}
> +
>   	vmx_complete_nested_posted_interrupt(vcpu);
>   	return 0;
>   }
>
> Paolo
>
>> This commit will fix above issue by changing step (4) to
>> clear exception.pending in kvm_clear_exception_queue().
>>
>> Fixes: 664f8e26b00c ("KVM: X86: Fix loss of exception which has not
>> yet been injected")
>>
>> Signed-off-by: Liran Alon <liran.alon@oracle.com>
>> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
>> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
>> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
>> ---
>>   arch/x86/kvm/vmx.c | 1 -
>>   arch/x86/kvm/x86.h | 1 +
>>   2 files changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 7c3522a989d0..bee08782c781 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -11035,7 +11035,6 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
>>   		if (vmx->nested.nested_run_pending)
>>   			return -EBUSY;
>>   		nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
>> -		vcpu->arch.exception.pending = false;
>>   		return 0;
>>   	}
>>
>> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
>> index d0b95b7a90b4..6d112d8f799c 100644
>> --- a/arch/x86/kvm/x86.h
>> +++ b/arch/x86/kvm/x86.h
>> @@ -12,6 +12,7 @@
>>
>>   static inline void kvm_clear_exception_queue(struct kvm_vcpu *vcpu)
>>   {
>> +	vcpu->arch.exception.pending = false;
>>   	vcpu->arch.exception.injected = false;
>>   }
>>
>>
>
> Should kvm_clear_interrupt_queue do the same?
Interrupts currently only have interrupt.pending (which, as I said
above, is actually interrupt.injected). Therefore
kvm_clear_interrupt_queue() already clears all there is.
>
> Thanks,
>
> Paolo
>

Regards,
-Liran
