kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: Christoffer Dall <cdall@linaro.org>
Cc: marc.zyngier@arm.com, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org
Subject: Re: [PATCH v2 4/9] KVM: arm/arm64: replace vcpu->arch.pause with a vcpu request
Date: Wed, 5 Apr 2017 13:37:10 +0200	[thread overview]
Message-ID: <91ab3b0b-c30e-7a19-f60f-882d26388ef5@redhat.com> (raw)
In-Reply-To: <20170405070959.GA1526@cbox>

[-- Attachment #1: Type: text/plain, Size: 5661 bytes --]



On 05/04/2017 09:09, Christoffer Dall wrote:
>>>  - In the explanation you wrote, you use the term 'we' a lot, but when
>>>    talking about SMP barriers, I think it only makes sense to talk about
>>>    actions and observations between multiple CPUs and we have to be
>>>    specific about which CPU observes or does what with respect to the
>>>    other.  Maybe I'm being a stickler here, but there something here
>>>    which is making me uneasy.
>> The write1-mb-if(read2) / write2-mb-if(read1) pattern is pretty common,
>> so I think it is justified to cut the ordering on the reasoning and just
>> focus on what the two memory locations and conditions mean.
> ok, but the pattern above was not common to me (and I'm pretty sure I'm
> not the only fool in the bunch here), so if we can reference something
> that explains that this is a known pattern which has been tried and
> proven, that would be even better.

I found https://lwn.net/Articles/573436/ which shows this example:

  CPU 0					CPU 1
  ---------------------			----------------------
  WRITE_ONCE(x, 1);			WRITE_ONCE(y, 1);
  smp_mb();				smp_mb();
  r2 = READ_ONCE(y);			r4 = READ_ONCE(x);

And says that it is a bug if r2 == 0 && r4 == 0.  This is exactly what 
happens in KVM:

  CPU 0					CPU 1
  ---------------------			----------------------
  vcpu->mode = IN_GUEST_MODE;		kvm_make_request(REQ, vcpu);
  smp_mb();				smp_mb();
  r2 = kvm_request_pending(vcpu)	r4 = (vcpu->mode == IN_GUEST_MODE)
  if (r2)				if (r4)
	abort entry				kick();

If r2 sees no request and r4 doesn't kick there would be a bug.
But why can't this happen?

- if no request is pending at the time of the read to r2, CPU 1 must
not have executed kvm_make_request yet.  In CPU 0, kvm_request_pending
must happen after vcpu->mode is set to IN_GUEST_MODE, therefore CPU 1
will read IN_GUEST_MODE and kick.

- if no kick happens in CPU 1, CPU 0 must not have set vcpu->mode yet.
In CPU 1, vcpu->mode is read after setting the request bit, therefore
CPU 0 will see the request bit and abort the guest entry.

>>>  - Finally, it feels very hard to prove the correctness of this, and
>>>    equally hard to test it (given how long we've been running with
>>>    apparently racy code).  I would hope that we could abstract some of
>>>    this into architecture generic things, that someone who eat memory
>>>    barriers for breakfast could help us verify, but again, maybe this is
>>>    Radim's series I'm asking for here.
>>
>> What I can do here is to suggest copying the paradigms from x86, which
>> is quite battle tested (Windows hammers it really hard).
>
> That sounds reasonable, but I think part of the problem was that we
> simply didn't understand what the paradigms were (see the
> kvm_make_all_cpus_request above as an example), so Drew's action about
> documenting what this all is and the constraints of using it is really
> important for me to do that.

Yes, totally agreed on that.

>> For QEMU I did use model checking in the past for some similarly hairy
>> synchronization code, but that is really just "executable documentation"
>> because the model is not written in C.
>>
> I played with using blast on some of the KVM/ARM code a long time ago,
> and while I was able to find a bug with it, it was sort of an obvious
> bug, and the things I was able to do with it was pretty limited to the
> problems I could imagine myself anyhow.  Perhaps this is what you mean
> with executable documentation.

I prepared three examples of a spin model for KVM vCPU kicking, and
the outcome was actually pretty surprising: the mode check seems not
to be necessary.

I haven't covered all x86 cases so I'm not going to remove it right
ahead, but for ARM it really seems like EXITING_GUEST_MODE is nothing
but an optimization of consecutive kvm_vcpu_kicks.

All three models can use C preprocessor #defines to inject bugs:

- kvm-arm-pause.promela: the "paused" mechanism; the model proves that
  the "paused" test in the interrupt-disabled region is necessary

- kvm-req.promela: the requests mechanism; the model proves that
  the requests check in the interrupt-disabled region is necessary

- kvm-x86-pi.promela: the x86 posted interrupt mechanism (simplified
  a bit); the model proves that KVM must disable interrupts before
  checking for interrupts injected while outside guest mode
  (commit b95234c84004, "kvm: x86: do not use KVM_REQ_EVENT for APICv
  interrupt injection", 2017-02-15)

So it seems like there are no races after all in KVM/ARM code, though
the code can still be cleaned up.  And I have been convinced of the wrong
thing all the time. :)

But why is KVM/ARM using KVM_REQ_VCPU_EXIT
just fine without checking for requests (kvm-req.promela)?  Because,
as mentioned earlier in the thread, KVM/ARM is using kvm_make_all_vcpus_request
simply to kick all VCPUs.  The paused variable _is_ checked after disabling
interrupts, and that is fine.

After this experiment, I think I like Drew's KVM_REQ_PAUSE more than I did
yesterday.  However, yet another alternative is to leave pause/power_off as
they are, while taking some inspiration from his patch to do some cleanups:

1) change the "if"

                if (ret <= 0 || need_new_vmid_gen(vcpu->kvm) ||
                        vcpu->arch.power_off || vcpu->arch.pause) {

to test kvm_requests_pending instead of pause/power_off

2) clear KVM_REQ_VCPU_EXIT before the other "if":

                if (vcpu->arch.power_off || vcpu->arch.pause)
                        vcpu_sleep(vcpu);


In any case, the no-wakeup behavior of kvm_make_all_vcpus_request suits
either use of requests (KVM_REQ_PAUSE or "fixed" KVM_REQ_VCPU_EXIT).

Paolo

[-- Attachment #2: kvm-arm-pause.promela --]
[-- Type: text/plain, Size: 2286 bytes --]

/* To run the model checker:
 *
 *      spin -a kvm-arm-pause.promela
 *      gcc -O2 pan.c
 *      ./a.out -a -f
 *
 * Remove the tests using -DREMOVE_MODE_TEST, -DREMOVE_PAUSED_TEST
 * right after -a.  The mode test is not necessary, the paused test is.
 */
#define OUTSIDE_GUEST_MODE      0
#define IN_GUEST_MODE           1
#define EXITING_GUEST_MODE      2

bool kick;
bool paused;
int vcpu_mode = OUTSIDE_GUEST_MODE;

active proctype vcpu_run()
{
    do
        :: true -> {
            /* In paused state, sleep with interrupts on */
            if
                :: !paused -> skip;
            fi;

            /* IPIs are eaten until interrupts are turned off.  */
            kick = 0;

            /* Interrupts are now off. */
            vcpu_mode = IN_GUEST_MODE;

            if
#ifndef REMOVE_MODE_TEST
                :: vcpu_mode != IN_GUEST_MODE -> skip;
#endif
#ifndef REMOVE_PAUSED_TEST
                :: paused -> skip;
#endif
                :: else -> {
                    do
                        /* Stay in guest mode until an IPI comes */
                        :: kick -> break;
                    od;
                }
            fi;
            vcpu_mode = OUTSIDE_GUEST_MODE;

            /* Turn on interrupts */
        }
    od
}

active proctype vcpu_kick()
{
    int old;

    do
        :: true -> {
            paused = 1;
            /* cmpxchg */
            atomic {
                old = vcpu_mode;
                if
                    :: vcpu_mode == IN_GUEST_MODE ->
                        vcpu_mode = EXITING_GUEST_MODE;
                    :: else -> skip;
                fi;
            }

            if
                :: old == IN_GUEST_MODE -> kick = 1;
                :: else -> skip;
            fi;

            if
               :: vcpu_mode == OUTSIDE_GUEST_MODE -> paused = 0;
            fi; 
        }
    od;
}

never {
    do
       /* After an arbitrarily long prefix */
       :: 1 -> skip;

       /* if we get a pause request */
       :: paused -> break;
    od; 

accept:
    /* we must eventually leave guest mode (this condition is reversed!) */
    do
       :: vcpu_mode != OUTSIDE_GUEST_MODE
    od; 
}

[-- Attachment #3: kvm-req.promela --]
[-- Type: text/plain, Size: 2358 bytes --]

/* To run the model checker:
 *
 *      spin -a kvm-req.promela
 *      gcc -O2 pan.c
 *      ./a.out -a -f
 *
 * Remove the tests using -DREMOVE_MODE_TEST, -DREMOVE_REQ_TEST
 * right after -a.  The mode test is not necessary, the vcpu_req test is.
 */
#define OUTSIDE_GUEST_MODE      0
#define IN_GUEST_MODE           1
#define EXITING_GUEST_MODE      2

bool kick;
int vcpu_req;
int vcpu_mode = OUTSIDE_GUEST_MODE;

active proctype vcpu_run()
{
    do
        :: true -> {
            /* Requests are processed with interrupts on */
            vcpu_req = 0;

            /* IPIs are eaten until interrupts are turned off.  */
            kick = 0;

            /* Interrupts are now off. */
            vcpu_mode = IN_GUEST_MODE;

            if
#ifndef REMOVE_MODE_TEST
                :: vcpu_mode != IN_GUEST_MODE -> skip;
#endif
#ifndef REMOVE_REQ_TEST
                :: vcpu_req -> skip;
#endif
                :: else -> {
                    do
                        /* Stay in guest mode until an IPI comes */
                        :: kick -> break;
                    od;
                }
            fi;
            vcpu_mode = OUTSIDE_GUEST_MODE;

            /* Turn on interrupts */
        }
    od
}

active proctype vcpu_kick()
{
    int old;

    do
        :: true -> {
            vcpu_req = 1;
            if
                :: old == 0 -> {
                    /* cmpxchg */
                    atomic {
                        old = vcpu_mode;
                        if
                            :: vcpu_mode == IN_GUEST_MODE ->
                                vcpu_mode = EXITING_GUEST_MODE;
                            :: else -> skip;
                        fi;
                    }

                    if
                        :: old == IN_GUEST_MODE -> kick = 1;
                        :: else -> skip;
                    fi;
                }
                :: else -> skip;
            fi;
        }
    od;
}

never {
    do
       /* After an arbitrarily long prefix */
       :: 1 -> skip;

       /* we get in guest mode */
       :: vcpu_mode == IN_GUEST_MODE -> break;
    od; 

accept:
    /* and never leave it (this condition is reversed!) */
    do
       :: vcpu_mode != OUTSIDE_GUEST_MODE
    od; 
}

[-- Attachment #4: kvm-x86-pi.promela --]
[-- Type: text/plain, Size: 3000 bytes --]

/* To run the model checker:
 *
 *      spin -a kvm-x86-pi.promela
 *      gcc -O2 pan.c
 *      ./a.out -a -f
 *
 * Remove the test using -DREMOVE_MODE_TEST, move the PIR->IRR sync
 * before local_irq_disable() with SYNC_WITH_INTERRUPTS_ENABLED.  The
 * mode test is not necessary, while sync_pir_to_irr must be placed
 * after interrupts are disabled.
 */
#define OUTSIDE_GUEST_MODE      0
#define IN_GUEST_MODE           1
#define EXITING_GUEST_MODE      2

bool kick;
bool posted_interrupt;
int vcpu_pir;
int vcpu_mode = OUTSIDE_GUEST_MODE;

active proctype vcpu_run()
{
    do
        :: true -> {
#ifdef SYNC_WITH_INTERRUPTS_ENABLED
            /* Guest interrupts are injected with interrupts off */
            vcpu_pir = 0;
#endif

            /* Both kinds of IPI are eaten until interrupts are turned off.  */
            atomic {
                kick = 0;
                posted_interrupt = 0;
            }

            /* Interrupts are now off. */
            vcpu_mode = IN_GUEST_MODE;

#ifndef SYNC_WITH_INTERRUPTS_ENABLED
            /* Guest interrupts are injected with interrupts off */
            vcpu_pir = 0;
#endif

            if
#ifndef REMOVE_MODE_TEST
                :: vcpu_mode != IN_GUEST_MODE -> skip;
#endif

                :: else -> {
                    do
                        /* Stay in guest mode until an IPI comes */
                        :: kick -> break;

                        /* The processor handles the posted interrupt IPI */
                        :: posted_interrupt -> vcpu_pir = 0;
                    od;
                }
            fi;
            vcpu_mode = OUTSIDE_GUEST_MODE;

            /* Turn on interrupts */
        }
    od
}

active proctype vcpu_posted_interrupt()
{
    int old;

    do
        :: vcpu_pir == 0 -> {
            vcpu_pir = 1;
            if
                :: vcpu_mode == IN_GUEST_MODE ->
                    /* If in guest mode, we can send a posted interrupt IPI */
                    posted_interrupt = 1;

                :: else -> {
                    /* Else, do a kvm_vcpu_kick.  */
                    atomic {
                        old = vcpu_mode;
                        if
                            :: vcpu_mode == IN_GUEST_MODE ->
                                vcpu_mode = EXITING_GUEST_MODE;
                            :: else -> skip;
                        fi;
                    }

                    if
                        :: old == IN_GUEST_MODE -> kick = 1;
                        :: else -> skip;
                    fi;
                }
            fi;
        }
    od;
}

never {
    do
       /* After an arbitrarily long prefix */
       :: 1 -> skip;

       /* if we get an interrupt */
       :: vcpu_pir -> break;
    od; 

accept:
    /* we must eventually inject it (this condition is reversed!) */
    do
       :: vcpu_pir
    od; 
}

[-- Attachment #5: Type: text/plain, Size: 151 bytes --]

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

  reply	other threads:[~2017-04-05 11:37 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-31 16:06 [PATCH v2 0/9] KVM: arm/arm64: race fixes and vcpu requests Andrew Jones
2017-03-31 16:06 ` [PATCH v2 1/9] KVM: add kvm_request_pending Andrew Jones
2017-04-04 15:30   ` Christoffer Dall
2017-04-04 16:41     ` Andrew Jones
2017-04-05 13:10       ` Radim Krčmář
2017-04-05 17:39         ` Christoffer Dall
2017-04-05 18:30           ` Paolo Bonzini
2017-04-05 20:20           ` Radim Krčmář
2017-04-06 12:02             ` Andrew Jones
2017-04-06 14:37               ` Christoffer Dall
2017-04-06 15:08                 ` Andrew Jones
2017-04-07 15:33                   ` Paolo Bonzini
2017-04-08 18:19                     ` Christoffer Dall
2017-04-06 14:25             ` Christoffer Dall
2017-04-07 13:15               ` Radim Krčmář
2017-04-08 18:23                 ` Christoffer Dall
2017-04-08 19:32                   ` Paolo Bonzini
2017-04-11 21:06                     ` Radim Krčmář
2017-03-31 16:06 ` [PATCH v2 2/9] KVM: Add documentation for VCPU requests Andrew Jones
2017-04-04 15:24   ` Christoffer Dall
2017-04-04 17:06     ` Andrew Jones
2017-04-04 17:23       ` Christoffer Dall
2017-04-04 17:36         ` Paolo Bonzini
2017-04-05 14:11         ` Radim Krčmář
2017-04-05 17:45           ` Christoffer Dall
2017-04-05 18:29             ` Paolo Bonzini
2017-04-05 20:46               ` Radim Krčmář
2017-04-06 14:29                 ` Christoffer Dall
2017-04-07 11:44                   ` Paolo Bonzini
2017-04-06 14:27               ` Christoffer Dall
2017-04-06 10:18   ` Christian Borntraeger
2017-04-06 12:08     ` Andrew Jones
2017-04-06 12:29     ` Radim Krčmář
2017-03-31 16:06 ` [PATCH v2 3/9] KVM: arm/arm64: prepare to use vcpu requests Andrew Jones
2017-04-04 15:34   ` Christoffer Dall
2017-04-04 17:06     ` Andrew Jones
2017-03-31 16:06 ` [PATCH v2 4/9] KVM: arm/arm64: replace vcpu->arch.pause with a vcpu request Andrew Jones
2017-04-04 13:39   ` Marc Zyngier
2017-04-04 14:47     ` Andrew Jones
2017-04-04 14:51       ` Paolo Bonzini
2017-04-04 15:05         ` Marc Zyngier
2017-04-04 17:07         ` Andrew Jones
2017-04-04 16:04   ` Christoffer Dall
2017-04-04 16:24     ` Paolo Bonzini
2017-04-04 17:19       ` Christoffer Dall
2017-04-04 17:35         ` Paolo Bonzini
2017-04-04 17:57           ` Christoffer Dall
2017-04-04 18:15             ` Paolo Bonzini
2017-04-04 18:38               ` Christoffer Dall
2017-04-04 18:18           ` Andrew Jones
2017-04-04 18:59             ` Paolo Bonzini
2017-04-04 17:57     ` Andrew Jones
2017-04-04 19:04       ` Christoffer Dall
2017-04-04 20:10         ` Paolo Bonzini
2017-04-05  7:09           ` Christoffer Dall
2017-04-05 11:37             ` Paolo Bonzini [this message]
2017-04-06 14:14               ` Christoffer Dall
2017-04-07 11:47                 ` Paolo Bonzini
2017-04-08  8:35                   ` Christoffer Dall
2017-03-31 16:06 ` [PATCH v2 5/9] KVM: arm/arm64: replace vcpu->arch.power_off " Andrew Jones
2017-04-04 17:37   ` Christoffer Dall
2017-03-31 16:06 ` [PATCH v2 6/9] KVM: arm/arm64: use a vcpu request on irq injection Andrew Jones
2017-04-04 17:42   ` Christoffer Dall
2017-04-04 18:27     ` Andrew Jones
2017-04-04 18:59     ` Paolo Bonzini
2017-04-04 18:51   ` Paolo Bonzini
2017-03-31 16:06 ` [PATCH v2 7/9] KVM: arm/arm64: PMU: remove request-less vcpu kick Andrew Jones
2017-04-04 17:46   ` Christoffer Dall
2017-04-04 18:29     ` Andrew Jones
2017-04-04 19:35       ` Christoffer Dall
2017-03-31 16:06 ` [PATCH v2 8/9] KVM: arm/arm64: fix race in kvm_psci_vcpu_on Andrew Jones
2017-04-04 19:42   ` Christoffer Dall
2017-04-05  8:35     ` Andrew Jones
2017-04-05  8:50       ` Christoffer Dall
2017-04-05  9:12         ` Andrew Jones
2017-04-05  9:30           ` Christoffer Dall
2017-03-31 16:06 ` [PATCH v2 9/9] KVM: arm/arm64: avoid race by caching MPIDR Andrew Jones
2017-04-04 19:44   ` Christoffer Dall
2017-04-05  8:50     ` Andrew Jones
2017-04-05 11:03       ` Christoffer Dall
2017-04-05 11:14         ` Andrew Jones
2017-04-03 15:28 ` [PATCH v2 0/9] KVM: arm/arm64: race fixes and vcpu requests Christoffer Dall
2017-04-03 17:11   ` Paolo Bonzini
2017-04-04  7:27   ` Andrew Jones
2017-04-04 16:05     ` Christoffer Dall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=91ab3b0b-c30e-7a19-f60f-882d26388ef5@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=cdall@linaro.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=marc.zyngier@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).