From: Gleb Natapov <gleb@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org, Jan Kiszka <jan.kiszka@siemens.com>
Subject: Re: [PATCH RFC] KVM: Fix race in apic->pending_events processing
Date: Tue, 28 May 2013 18:00:57 +0300 [thread overview]
Message-ID: <20130528150057.GA6891@redhat.com> (raw)
In-Reply-To: <51A4B5CA.9070109@redhat.com>
On Tue, May 28, 2013 at 03:48:58PM +0200, Paolo Bonzini wrote:
> Il 28/05/2013 14:56, Gleb Natapov ha scritto:
> >> > else
> >> > vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
> >> > }
> >> > - if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events) &&
> >> > + /*
> >> > + * Note that we may get another INIT+SIPI sequence right here; process
> >> > + * the INIT first. Assumes that there are only KVM_APIC_INIT/SIPI.
> >> > + */
> >> > + if (cmpxchg(&apic->pending_events, KVM_APIC_SIPI, 0) == KVM_APIC_SIPI &&
> >> > vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
> > Because pending_events can be INIT/SIPI at this point and it should be
> > interpreted as: do SIPI and ignore INIT (atomically).
>
> My patch does "do another INIT (which will have no effect) and do SIPI
> after that INIT", which is different but has almost the same effect.
> If pending_events is INIT/SIPI, it ignores the SIPI for now and lets
> the next iteration of kvm_apic_accept_events do both. The difference
> would be that in a carefully-timed sequence of interrupts
>
You assume that the next processing will actually happen, but this is
not necessary the case.
> INIT-INIT-SIPI-INIT-SIPI
>
> your version would do many SIPIs, while mine would do just one.
>
> Hmm... there is a reference to this in 25.2 "Other causes of VM exits":
> "If a logical processor is in the wait-for-SIPI state, INIT signals are
> blocked. They do not cause VM exits in this case." It is not for the
> physical processor, but it makes sense to have the same thing. Is this
> the reason why you did the cmpxchg at the end?
>
No, but if helps us model proper behaviour for nested guest +1 to it :)
But until we handle INIT in nested virt it is not something that
dictates the solution.
> But then, there's another way to mask INITs in the wait-for-SIPI
> state. Considering that KVM_MP_STATE_INIT_RECEIVED is really a
> wait-for-SIPI, you can do:
>
Haven't checked it for races (especially races between multiple CPUS
sending INIT), but looks more complicated to me.
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index e1adbb4..36bc308 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -720,8 +720,12 @@ out:
> break;
>
> case APIC_DM_INIT:
> - if (!trig_mode || level) {
> + if ((!trig_mode || level) &&
> + vcpu->arch.mp_state != KVM_MP_STATE_INIT_RECEIVED) {
> result = 1;
> +
> + /* check mp_state before writing apic->pending_events */
> + smp_mb();
> /* assumes that there are only KVM_APIC_INIT/SIPI */
> apic->pending_events = (1UL << KVM_APIC_INIT);
> /* make sure pending_events is visible before sending
> @@ -1865,13 +1869,17 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
> if (!kvm_vcpu_has_lapic(vcpu))
> return;
>
> - if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events)) {
> + if (test_bit(KVM_APIC_INIT, &apic->pending_events)) {
> kvm_lapic_reset(vcpu);
> kvm_vcpu_reset(vcpu);
> if (kvm_vcpu_is_bsp(apic->vcpu))
> vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
> else
> vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
> +
> + /* write mp_state before toggling KVM_APIC_INIT */
> + smp_mb__before_clear_bit();
> + clear_bit(KVM_APIC_INIT, &apic->pending_events);
> }
> if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events) &&
> vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
>
> I don't have a particular preference.
>
> Paolo
--
Gleb.
next prev parent reply other threads:[~2013-05-28 15:01 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-05-26 13:00 [PATCH RFC] KVM: Fix race in apic->pending_events processing Gleb Natapov
2013-05-28 10:56 ` Paolo Bonzini
2013-05-28 12:56 ` Gleb Natapov
2013-05-28 13:48 ` Paolo Bonzini
2013-05-28 15:00 ` Gleb Natapov [this message]
2013-05-28 16:33 ` Paolo Bonzini
2013-05-30 1:20 ` Gleb Natapov
2013-05-30 5:41 ` Paolo Bonzini
2013-05-30 6:01 ` Gleb Natapov
2013-05-30 6:31 ` Paolo Bonzini
2013-05-30 7:09 ` Gleb Natapov
2013-05-30 7:30 ` Paolo Bonzini
2013-05-30 12:34 ` Gleb Natapov
2013-05-30 12:58 ` Paolo Bonzini
2013-05-30 13:10 ` Gleb Natapov
2013-05-30 13:23 ` Paolo Bonzini
2013-05-30 13:35 ` Gleb Natapov
2013-05-30 14:15 ` Paolo Bonzini
2013-05-31 4:36 ` Gleb Natapov
2013-05-31 8:48 ` Paolo Bonzini
2013-05-31 9:18 ` Gleb Natapov
2013-05-31 9:48 ` Paolo Bonzini
2013-06-02 13:14 ` Gleb Natapov
2013-06-02 14:32 ` Paolo Bonzini
2013-06-02 17:33 ` Gleb Natapov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130528150057.GA6891@redhat.com \
--to=gleb@redhat.com \
--cc=jan.kiszka@siemens.com \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.