From: Wanpeng Li <wanpeng.li@linux.intel.com>
To: Bandan Das <bsd@redhat.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>,
Wanpeng Li <wanpeng.li@linux.intel.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Gleb Natapov <gleb@kernel.org>, Hu Robert <robert.hu@intel.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] KVM: nVMX: Fix IRQs inject to L2 which belong to L1 since race
Date: Fri, 4 Jul 2014 14:17:01 +0800
Message-ID: <20140704061701.GB3453@kernel>
In-Reply-To: <jpgy4wb9g5t.fsf@redhat.com>

On Thu, Jul 03, 2014 at 01:15:26AM -0400, Bandan Das wrote:
>Jan Kiszka <jan.kiszka@siemens.com> writes:
>
>> On 2014-07-02 08:54, Wanpeng Li wrote:
>>> This patch fixes bug https://bugzilla.kernel.org/show_bug.cgi?id=72381
>>>
>>> If we didn't inject a still-pending event to L1 because of nested_run_pending,
>>> KVM_REQ_EVENT should be requested after the vmexit in order to inject the
>>> event to L1. However, the current logic blindly requests KVM_REQ_EVENT even if
>>> there is no still-pending event for L1 that was blocked by nested_run_pending.
>>> This opens a race: if L0 sends an interrupt to L1 during this window, an
>>> interrupt that belongs to L1 will be injected into L2.
>>>
>>>     VCPU0                               another thread
>>>
>>>     L1 intr not blocked on L2 first entry
>>>     vmx_vcpu_run req event
>>>     kvm check request req event
>>>     check_nested_events don't have any intr
>>>                         not nested exit
>>>                                         intr occur (8254, lapic timer etc)
>>>     inject_pending_event now have intr
>>>                          inject interrupt
>>>
>>> This patch fixes the race by introducing an l1_events_blocked field in nested_vmx,
>>> which indicates that there is a still-pending event blocked by nested_run_pending,
>>> and by requesting KVM_REQ_EVENT only when such a blocked event exists.
>>
>> There are more, unrelated reasons why KVM_REQ_EVENT could be set. Why
>> aren't those able to trigger this scenario?
>>
>> In any case, unconditionally setting KVM_REQ_EVENT seems strange and
>> should be changed.
>
>
>Ugh! I think I am hitting another one but this one's probably because
>we are not setting KVM_REQ_EVENT for something we should.
>
>Before this patch, I was able to hit this bug every time with
>"modprobe kvm_intel ept=0 nested=1 enable_shadow_vmcs=0" and then booting
>L2. I can verify that I was indeed hitting the race in inject_pending_event.
>
>After this patch, I believe I am hitting another bug - this happens
>after I boot L2, as above, and then start a Linux kernel compilation
>and then wait and watch :) It's a pain to debug because this happens
>almost once in three times; it never happens if I run with ept=1, however,
>I think that's only because the test completes sooner. But I can confirm
>that I don't see it if I always set REQ_EVENT if nested_run_pending is set instead of
>the approach this patch takes.
>(Any debugging hints appreciated!)
>
>So, I am not sure if this is the right fix. Rather, I think the safer thing
>to do is to have the interrupt pending check for injection into L1 at
>the "same site" as the call to kvm_queue_interrupt() just like we had before
>commit b6b8a1451fc40412c57d1. Is there any advantage to having all the
>nested events checks together ?
>
How about reverting commit b6b8a1451 and checking whether the bug you
mentioned is still there?

Regards,
Wanpeng Li
>PS - Actually, a much easier fix (or rather hack) is to return 1 in
>vmx_interrupt_allowed() (as I mentioned elsewhere) only if
>!is_guest_mode(vcpu). That way, the pending interrupt
>can be taken care of correctly during the next vmexit.
>
>Bandan
>
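
A minimal userspace sketch of the hack described in the PS above, not kernel
code. The toy_* names and the struct fields are hypothetical stand-ins, and
the shape of the existing check (nested_run_pending, RFLAGS.IF, interrupt
shadow) is an assumption about vmx_interrupt_allowed(), not a copy of it.

```c
#include <stdbool.h>

/* Hypothetical flattened view of the state the check looks at. */
struct toy_irq_state {
	bool guest_mode;         /* stands in for is_guest_mode(vcpu) */
	bool nested_run_pending; /* L2 must run next */
	bool rflags_if;          /* guest RFLAGS.IF */
	bool intr_shadow;        /* STI / MOV SS blocking */
};

/* Assumed shape of the existing check. */
static bool toy_interrupt_allowed(const struct toy_irq_state *s)
{
	return !s->nested_run_pending && s->rflags_if && !s->intr_shadow;
}

/* Hack variant: never report "allowed" while the vcpu is in guest mode (L2). */
static bool toy_interrupt_allowed_hack(const struct toy_irq_state *s)
{
	return !s->guest_mode && toy_interrupt_allowed(s);
}
```

Under these assumptions, a vcpu running L2 with interrupts enabled passes the
plain check but fails the hack variant, so the interrupt stays pending until
the next vmexit.
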
>> Jan
>>
>>>
>>> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
>>> ---
>>> arch/x86/kvm/vmx.c | 20 +++++++++++++++-----
>>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index f4e5aed..fe69c49 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -372,6 +372,7 @@ struct nested_vmx {
>>>  	u64 vmcs01_tsc_offset;
>>>  	/* L2 must run next, and mustn't decide to exit to L1. */
>>>  	bool nested_run_pending;
>>> +	bool l1_events_blocked;
>>>  	/*
>>>  	 * Guest pages referred to in vmcs02 with host-physical pointers, so
>>>  	 * we must keep them pinned while L2 runs.
>>> @@ -7380,8 +7381,10 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>>>  	 * we did not inject a still-pending event to L1 now because of
>>>  	 * nested_run_pending, we need to re-enable this bit.
>>>  	 */
>>> -	if (vmx->nested.nested_run_pending)
>>> +	if (to_vmx(vcpu)->nested.l1_events_blocked) {
>>> +		to_vmx(vcpu)->nested.l1_events_blocked = false;
>>>  		kvm_make_request(KVM_REQ_EVENT, vcpu);
>>> +	}
>>>
>>>  	vmx->nested.nested_run_pending = 0;
>>>
>>> @@ -8197,15 +8200,20 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
>>>
>>>  	if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) &&
>>>  	    vmx->nested.preemption_timer_expired) {
>>> -		if (vmx->nested.nested_run_pending)
>>> +		if (vmx->nested.nested_run_pending) {
>>> +			vmx->nested.l1_events_blocked = true;
>>>  			return -EBUSY;
>>> +		}
>>>  		nested_vmx_vmexit(vcpu, EXIT_REASON_PREEMPTION_TIMER, 0, 0);
>>>  		return 0;
>>>  	}
>>>
>>>  	if (vcpu->arch.nmi_pending && nested_exit_on_nmi(vcpu)) {
>>> -		if (vmx->nested.nested_run_pending ||
>>> -		    vcpu->arch.interrupt.pending)
>>> +		if (vmx->nested.nested_run_pending) {
>>> +			vmx->nested.l1_events_blocked = true;
>>> +			return -EBUSY;
>>> +		}
>>> +		if (vcpu->arch.interrupt.pending)
>>>  			return -EBUSY;
>>>  		nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
>>>  				  NMI_VECTOR | INTR_TYPE_NMI_INTR |
>>> @@ -8221,8 +8229,10 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
>>>
>>>  	if ((kvm_cpu_has_interrupt(vcpu) || external_intr) &&
>>>  	    nested_exit_on_intr(vcpu)) {
>>> -		if (vmx->nested.nested_run_pending)
>>> +		if (vmx->nested.nested_run_pending) {
>>> +			vmx->nested.l1_events_blocked = true;
>>>  			return -EBUSY;
>>> +		}
>>>  		nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT, 0, 0);
>>>  	}
>>>
>>>
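
To make the intent of the quoted patch concrete, here is a minimal userspace
sketch of its bookkeeping, not kernel code. The toy_* names and the
l1_event_pending field are hypothetical, req_event stands in for
KVM_REQ_EVENT, and -1 stands in for -EBUSY. It does not model the concurrent
interrupt from another thread; it only compares when the request gets raised
before and after the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the vcpu state the patch touches. */
struct toy_vcpu {
	bool nested_run_pending; /* L2 must run next */
	bool l1_events_blocked;  /* an L1 event was deferred because of the above */
	bool l1_event_pending;   /* assumption: some event destined for L1 is waiting */
	bool req_event;          /* stands in for KVM_REQ_EVENT */
};

/* Mirrors the -EBUSY paths of vmx_check_nested_events() with the patch applied. */
static int toy_check_nested_events(struct toy_vcpu *v)
{
	if (v->l1_event_pending) {
		if (v->nested_run_pending) {
			v->l1_events_blocked = true; /* remember the deferral */
			return -1;                   /* -EBUSY in the real code */
		}
		v->l1_event_pending = false;         /* would be a nested vmexit to L1 */
	}
	return 0;
}

/* Tail of vmx_vcpu_run() before the patch: request whenever a run was pending. */
static void toy_run_tail_old(struct toy_vcpu *v)
{
	if (v->nested_run_pending)
		v->req_event = true;
	v->nested_run_pending = false;
}

/* Tail of vmx_vcpu_run() after the patch: request only if an event was deferred. */
static void toy_run_tail_new(struct toy_vcpu *v)
{
	if (v->l1_events_blocked) {
		v->l1_events_blocked = false;
		v->req_event = true;
	}
	v->nested_run_pending = false;
}

/* One first entry into L2; returns whether the request ends up raised. */
static bool toy_first_l2_entry(bool l1_event_pending, bool patched)
{
	struct toy_vcpu v = {
		.nested_run_pending = true,
		.l1_event_pending = l1_event_pending,
	};

	(void)toy_check_nested_events(&v);
	if (patched)
		toy_run_tail_new(&v);
	else
		toy_run_tail_old(&v);
	assert(!v.nested_run_pending);
	return v.req_event;
}
```

In this model, toy_first_l2_entry(false, false) raises the request even though
nothing was deferred, while toy_first_l2_entry(false, true) does not. With a
deferred event, both variants raise it.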