From: "Radim Krčmář" <rkrcmar@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
kvm-devel <kvm@vger.kernel.org>, stable <stable@vger.kernel.org>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: x86: kvm: Revert "remove sched notifier for cross-cpu migrations"
Date: Wed, 25 Mar 2015 12:08:14 +0100
Message-ID: <20150325110814.GE21522@potion.brq.redhat.com>
In-Reply-To: <CALCETrVR6zb6Nvb+tV9qHvDRtx3fhsinBrd9YwNgT6cvAg4a3Q@mail.gmail.com>
2015-03-24 15:33-0700, Andy Lutomirski:
> On Tue, Mar 24, 2015 at 8:34 AM, Radim Krčmář <rkrcmar@redhat.com> wrote:
> > What is the problem?
>
> The kvmclock spec says that the host will increment a version field to
> an odd number, then update stuff, then increment it to an even number.
> The host is buggy and doesn't do this, and the result is observable
> when one vcpu reads another vcpu's kvmclock data.
>
> Since there's no good way for a guest kernel to keep its vdso from
> reading a different vcpu's kvmclock data, this is a real corner-case
> bug. This patch allows the vdso to retry when this happens. I don't
> think it's a great solution, but it should mostly work.
Great explanation, thank you.

Reverting the patch protects us from any migration, but I don't think we
need to care about changing VCPUs as long as we read consistent data
from kvmclock.  (The VCPU can change outside of this loop too, so it
doesn't matter if we return a value that does not fit this VCPU.)

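To illustrate what "consistent data" means on the reading side, here is a
minimal user-space sketch of a version-checked read.  It is not the vdso
code: struct clock_page, struct clock_snapshot, clock_try_read() and
clock_read() are hypothetical names, the payload is reduced to two
fields, and C11 atomics stand in for the kernel's barriers.  It only
assumes the protocol quoted above (odd version while an update is in
progress, even once it is finished).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one VCPU's kvmclock page. */
struct clock_page {
	_Atomic uint32_t version;	/* odd while an update is in progress */
	_Atomic uint64_t tsc_timestamp;
	_Atomic uint64_t system_time;
};

/* Private copy of the payload handed back to the caller. */
struct clock_snapshot {
	uint64_t tsc_timestamp;
	uint64_t system_time;
};

/*
 * Try once to copy a consistent snapshot.  Returns false if the version
 * was odd, or if it changed while the payload was being copied.
 */
static bool clock_try_read(struct clock_page *p, struct clock_snapshot *out)
{
	uint32_t before = atomic_load_explicit(&p->version,
					       memory_order_acquire);

	if (before & 1)
		return false;	/* writer is in the middle of an update */

	out->tsc_timestamp = atomic_load_explicit(&p->tsc_timestamp,
						  memory_order_relaxed);
	out->system_time = atomic_load_explicit(&p->system_time,
						memory_order_relaxed);

	/* Order the payload loads before the version re-check. */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&p->version,
				    memory_order_relaxed) == before;
}

/* Retry until a consistent snapshot has been copied. */
static void clock_read(struct clock_page *p, struct clock_snapshot *out)
{
	while (!clock_try_read(p, out))
		;
}
```

Nothing in this loop depends on which VCPU owns the page; it only works
if the writer really does make the version odd before touching the
payload.
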
I think we could drop the second __getcpu if our kvmclock was being
handled better; maybe with a patch like the one below:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cc2c759f69a3..8658599e0024 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1658,12 +1658,24 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 		&guest_hv_clock, sizeof(guest_hv_clock))))
 		return 0;
 
-	/*
-	 * The interface expects us to write an even number signaling that the
-	 * update is finished. Since the guest won't see the intermediate
-	 * state, we just increase by 2 at the end.
+	/* A guest can read other VCPU's kvmclock; specification says that
+	 * version is odd if data is being modified and even after it is
+	 * consistent.
+	 * We write three times to be sure.
+	 *  1) update version to odd number
+	 *  2) write modified data (version is still odd)
+	 *  3) update version to even number
+	 *
+	 * TODO: optimize
+	 *  - only two writes should be enough -- version is first
+	 *  - the second write could update just version
 	 */
-	vcpu->hv_clock.version = guest_hv_clock.version + 2;
+	guest_hv_clock.version += 1;
+	kvm_write_guest_cached(v->kvm, &vcpu->pv_time,
+				&guest_hv_clock,
+				sizeof(guest_hv_clock));
+
+	vcpu->hv_clock.version = guest_hv_clock.version;
 
 	/* retain PVCLOCK_GUEST_STOPPED if set in guest copy */
 	pvclock_flags = (guest_hv_clock.flags & PVCLOCK_GUEST_STOPPED);
@@ -1684,6 +1696,11 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 
 	kvm_write_guest_cached(v->kvm, &vcpu->pv_time,
 				&vcpu->hv_clock,
 				sizeof(vcpu->hv_clock));
+
+	vcpu->hv_clock.version += 1;
+	kvm_write_guest_cached(v->kvm, &vcpu->pv_time,
+				&vcpu->hv_clock,
+				sizeof(vcpu->hv_clock));
 	return 0;
 }
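
The three writes in the patch above can be tried out in user space with
a sketch like the following.  It is only an illustration under stated
assumptions: struct clock_data is a reduced stand-in for the real
structure, guest_write() stands in for kvm_write_guest_cached() and
copies the whole structure, and update_begin(), update_payload() and
update_end() are hypothetical helpers, one per step.  The release fences
only mark where a real implementation would presumably need write
barriers between the steps.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the kvmclock structure. */
struct clock_data {
	uint32_t version;
	uint64_t tsc_timestamp;
	uint64_t system_time;
};

/* Stand-in for kvm_write_guest_cached(): copies the whole structure. */
static void guest_write(struct clock_data *guest, const struct clock_data *src)
{
	memcpy(guest, src, sizeof(*src));
}

/* 1) Read back the guest copy and publish it with an odd version. */
static void update_begin(struct clock_data *guest, struct clock_data *scratch)
{
	*scratch = *guest;
	scratch->version += 1;
	guest_write(guest, scratch);
}

/* 2) Write the modified data; the version stays odd. */
static void update_payload(struct clock_data *guest, struct clock_data *scratch,
			   uint64_t tsc_timestamp, uint64_t system_time)
{
	atomic_thread_fence(memory_order_release);
	scratch->tsc_timestamp = tsc_timestamp;
	scratch->system_time = system_time;
	guest_write(guest, scratch);
}

/* 3) Publish the even version: the data is consistent again. */
static void update_end(struct clock_data *guest, struct clock_data *scratch)
{
	atomic_thread_fence(memory_order_release);
	scratch->version += 1;
	guest_write(guest, scratch);
}
```

Calling the three helpers in order on a structure whose version is even
leaves the version odd after the first two steps and even, two higher
than at the start, after the third.  A reader that samples the structure
between any two steps therefore sees an odd version and knows to retry.
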