From: Liran Alon <LIRAN.ALON@ORACLE.COM>
To: Paolo Bonzini <pbonzini@redhat.com>,
rkrcmar@redhat.com, kvm@vger.kernel.org
Cc: jmattson@google.com, wanpeng.li@hotmail.com,
idan.brown@ORACLE.COM,
Krish Sadhukhan <krish.sadhukhan@ORACLE.COM>
Subject: Re: [PATCH] KVM: nVMX: Fix bug of injecting L2 exception into L1
Date: Tue, 21 Nov 2017 00:46:01 +0200 [thread overview]
Message-ID: <5A135B29.9080805@ORACLE.COM> (raw)
In-Reply-To: <2d56b269-0d21-10a3-80ce-19e989d6903b@redhat.com>
On 20/11/17 23:47, Paolo Bonzini wrote:
> On 19/11/2017 17:25, Liran Alon wrote:
>> L2 RDMSR could be handled as described below:
>> 1) L2 exits to L0 on RDMSR and calls handle_rdmsr().
>> 2) handle_rdmsr() calls kvm_inject_gp() which sets
>> KVM_REQ_EVENT, exception.pending=true and exception.injected=false.
>> 3) vcpu_enter_guest() consumes KVM_REQ_EVENT and calls
>> inject_pending_event() which calls vmx_check_nested_events()
>> which sees that exception.pending=true but
>> nested_vmx_check_exception() returns 0 and therefore does nothing at
>> this point. However let's assume it later sees vmx-preemption-timer
>> expired and therefore exits from L2 to L1 by calling
>> nested_vmx_vmexit().> 4) nested_vmx_vmexit() calls prepare_vmcs12()
>> which calls vmcs12_save_pending_event() but it does nothing as
>> exception.injected is false. Also prepare_vmcs12() calls
>> kvm_clear_exception_queue() which does nothing as
>> exception.injected is already false.
>> 5) We now return from vmx_check_nested_events() with 0 while still
>> having exception.pending=true!
>> 6) Therefore inject_pending_event() continues
>> and we inject L2 exception to L1!...
>
> But this is buggy as well, because the #GP is lost, isn't it?
I don't think so.
Since commit 664f8e26b00c ("KVM: X86: Fix loss of exception which has
not yet been injected"), there is a fundamental difference between a
pending exception and an injected exception.
A pending exception means that no side-effects of the exception have
been applied yet. Including incrementing the RIP after the instruction
which cause exception. In our case for example, handle_wrmsr() calls
kvm_inject_gp() and returns without calling
kvm_skip_emulated_instruction() which increments the RIP.
Therefore, when we exit from L2 to L1 on vmx-preemption-timer, we can
safely clear exception.pending because when L1 will resume L2, the
exception will be raised again (the same WRMSR instruction will be run
again which will raise #GP again).
This is also why vmcs12_save_pending_event() only makes sure to save in
VMCS12 idt-vectoring-info the "injected" events and not the "pending"
events (interrupt.pending is misleading name and I would rename it in
upcoming patch to interrupt.injected. See explanation below). And this
is also why exception.pending is intercepted but exception.injected is not.
I can confirm this patch works because I have wrote a kvm-unit-test
which reproduce this issue. And after the fix the #GP is not lost and
raised to L2 directly correctly.
(I haven't posted the unit-test yet because it is very dependent on
correct vmx-preemption-timer timer config that varies between environments).
>
> Is the bug that the preemption timer vmexit should only be injected if
> there are no pending exceptions? In fact, the same bug could also
> happened for NMIs or interrupts, I think.
As explained above, I don't think so.
In general there is a bit of mess in KVM code regarding clean separation
between a "pending" event and an "injected" event. NMIs & Exceptions are
now separated nicely. However, interrupt.pending is actually
interrupt.injected as it is signaled after the side-effects have
occurred (bit moved from IRR to ISR for example).
I am going to post another series (which is a v2 of a previous series I
posted here) tomorrow that will attempt to clean this and on the way fix
a couple of bugs in this area.
>
> So, something like (101% untested):
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 5b436f4e6e3e..64eecb8b126d 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -11137,8 +11137,9 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
> bool block_nested_events =
> vmx->nested.nested_run_pending || kvm_event_needs_reinjection(vcpu);
>
> - if (vcpu->arch.exception.pending &&
> - nested_vmx_check_exception(vcpu, &exit_qual)) {
> + if (vcpu->arch.exception.pending) {
> + if (!nested_vmx_check_exception(vcpu, &exit_qual))
> + return 0;
> if (block_nested_events)
> return -EBUSY;
> nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
> @@ -11146,15 +11147,9 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
> return 0;
> }
>
> - if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) &&
> - vmx->nested.preemption_timer_expired) {
> - if (block_nested_events)
> - return -EBUSY;
> - nested_vmx_vmexit(vcpu, EXIT_REASON_PREEMPTION_TIMER, 0, 0);
> - return 0;
> - }
> -
> - if (vcpu->arch.nmi_pending && nested_exit_on_nmi(vcpu)) {
> + if (vcpu->arch.nmi_pending) {
> + if (!nested_exit_on_nmi(vcpu))
> + return 0;
> if (block_nested_events)
> return -EBUSY;
> nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
> @@ -11169,14 +11164,23 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
> return 0;
> }
>
> - if ((kvm_cpu_has_interrupt(vcpu) || external_intr) &&
> - nested_exit_on_intr(vcpu)) {
> + if (kvm_cpu_has_interrupt(vcpu) || external_intr) {
> + if (!nested_exit_on_intr(vcpu))
> + return 0;
> if (block_nested_events)
> return -EBUSY;
> nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT, 0, 0);
> return 0;
> }
>
> + if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) &&
> + vmx->nested.preemption_timer_expired) {
> + if (block_nested_events)
> + return -EBUSY;
> + nested_vmx_vmexit(vcpu, EXIT_REASON_PREEMPTION_TIMER, 0, 0);
> + return 0;
> + }
> +
> vmx_complete_nested_posted_interrupt(vcpu);
> return 0;
> }
>
> Paolo
>
>> This commit will fix above issue by changing step (4) to
>> clear exception.pending in kvm_clear_exception_queue().
>>
>> Fixes: 664f8e26b00c ("KVM: X86: Fix loss of exception which has not
>> yet been injected")
>>
>> Signed-off-by: Liran Alon <liran.alon@oracle.com>
>> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
>> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
>> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
>> ---
>> arch/x86/kvm/vmx.c | 1 -
>> arch/x86/kvm/x86.h | 1 +
>> 2 files changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 7c3522a989d0..bee08782c781 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -11035,7 +11035,6 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
>> if (vmx->nested.nested_run_pending)
>> return -EBUSY;
>> nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
>> - vcpu->arch.exception.pending = false;
>> return 0;
>> }
>>
>> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
>> index d0b95b7a90b4..6d112d8f799c 100644
>> --- a/arch/x86/kvm/x86.h
>> +++ b/arch/x86/kvm/x86.h
>> @@ -12,6 +12,7 @@
>>
>> static inline void kvm_clear_exception_queue(struct kvm_vcpu *vcpu)
>> {
>> + vcpu->arch.exception.pending = false;
>> vcpu->arch.exception.injected = false;
>> }
>>
>>
>
> Should kvm_clear_interrupt_queue do the same?
interrupts currently only have interrupt.pending (which as I said above,
it is actually interrupt.injected). Therefore
kvm_clear_interrupt_queue() clears all there is...
>
> Thanks,
>
> Paolo
>
Regards,
-Liran
next prev parent reply other threads:[~2017-11-20 22:46 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-19 16:25 [PATCH] KVM: nVMX: Fix bug of injecting L2 exception into L1 Liran Alon
2017-11-20 21:47 ` Paolo Bonzini
2017-11-20 22:46 ` Liran Alon [this message]
2017-11-20 22:58 ` Paolo Bonzini
2018-01-09 10:08 ` Paolo Bonzini
-- strict thread matches above, loose matches on Subject: below --
2017-12-12 0:06 Liran Alon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5A135B29.9080805@ORACLE.COM \
--to=liran.alon@oracle.com \
--cc=idan.brown@ORACLE.COM \
--cc=jmattson@google.com \
--cc=krish.sadhukhan@ORACLE.COM \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=rkrcmar@redhat.com \
--cc=wanpeng.li@hotmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox