public inbox for kvm@vger.kernel.org
From: "Radim Krčmář" <rkrcmar@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	kvm-devel <kvm@vger.kernel.org>, stable <stable@vger.kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: x86: kvm: Revert "remove sched notifier for cross-cpu migrations"
Date: Wed, 25 Mar 2015 12:08:14 +0100
Message-ID: <20150325110814.GE21522@potion.brq.redhat.com>
In-Reply-To: <CALCETrVR6zb6Nvb+tV9qHvDRtx3fhsinBrd9YwNgT6cvAg4a3Q@mail.gmail.com>

2015-03-24 15:33-0700, Andy Lutomirski:
> On Tue, Mar 24, 2015 at 8:34 AM, Radim Krčmář <rkrcmar@redhat.com> wrote:
> > What is the problem?
> 
> The kvmclock spec says that the host will increment a version field to
> an odd number, then update stuff, then increment it to an even number.
> The host is buggy and doesn't do this, and the result is observable
> when one vcpu reads another vcpu's kvmclock data.
> 
> Since there's no good way for a guest kernel to keep its vdso from
> reading a different vcpu's kvmclock data, this is a real corner-case
> bug.  This patch allows the vdso to retry when this happens.  I don't
> think it's a great solution, but it should mostly work.

Great explanation, thank you.

Reverting the patch protects us from any migration, but I don't think we
need to care about changing VCPUs as long as we read consistent data
from kvmclock.  (The VCPU can change outside of this loop too, so it
doesn't matter if we return a value not fit for this VCPU.)
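
To make "read consistent data" concrete, here is a minimal userspace
sketch of the reader side of the odd/even version protocol described
above, written with C11 atomics.  It is not the vdso code: the struct
layout and the sketch_* names are hypothetical, and the choice of
fences is an assumption of this model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one VCPU's kvmclock area. */
struct sketch_pvclock {
	_Atomic uint32_t version;	/* odd while an update is in flight */
	_Atomic uint64_t tsc_timestamp;
	_Atomic uint64_t system_time;
};

/* A snapshot is usable only if the version was even and did not move. */
static int sketch_snapshot_ok(uint32_t before, uint32_t after)
{
	return !(before & 1) && before == after;
}

/*
 * Copy out the payload, retrying until the copy was not torn by a
 * concurrent update.  Returns the (even) version the snapshot belongs to.
 * Spins for as long as the version stays odd.
 */
static uint32_t sketch_pvclock_read(struct sketch_pvclock *src,
				    uint64_t *tsc_timestamp,
				    uint64_t *system_time)
{
	uint32_t before, after;

	assert(src != NULL);

	do {
		before = atomic_load_explicit(&src->version,
					      memory_order_acquire);
		*tsc_timestamp = atomic_load_explicit(&src->tsc_timestamp,
						      memory_order_relaxed);
		*system_time = atomic_load_explicit(&src->system_time,
						    memory_order_relaxed);
		/* Keep the payload loads ahead of the version re-check. */
		atomic_thread_fence(memory_order_acquire);
		after = atomic_load_explicit(&src->version,
					     memory_order_relaxed);
	} while (!sketch_snapshot_ok(before, after));

	return before;
}
```

Nothing in this loop depends on which VCPU the area belongs to; it only
needs the writer to keep the version odd for the whole update.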

I think we could drop the second __getcpu if our kvmclock was being
handled better;  maybe with a patch like the one below:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cc2c759f69a3..8658599e0024 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1658,12 +1658,24 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 		&guest_hv_clock, sizeof(guest_hv_clock))))
 		return 0;
 
-	/*
-	 * The interface expects us to write an even number signaling that the
-	 * update is finished. Since the guest won't see the intermediate
-	 * state, we just increase by 2 at the end.
+	/* A guest can read other VCPU's kvmclock; specification says that
+	 * version is odd if data is being modified and even after it is
+	 * consistent.
+	 * We write three times to be sure.
+	 *  1) update version to odd number
+	 *  2) write modified data (version is still odd)
+	 *  3) update version to even number
+	 *
+	 * TODO: optimize
+	 *  - only two writes should be enough -- version is first
+	 *  - the second write could update just version
 	 */
-	vcpu->hv_clock.version = guest_hv_clock.version + 2;
+	guest_hv_clock.version += 1;
+	kvm_write_guest_cached(v->kvm, &vcpu->pv_time,
+				&guest_hv_clock,
+				sizeof(guest_hv_clock));
+
+	vcpu->hv_clock.version = guest_hv_clock.version;
 
 	/* retain PVCLOCK_GUEST_STOPPED if set in guest copy */
 	pvclock_flags = (guest_hv_clock.flags & PVCLOCK_GUEST_STOPPED);
@@ -1684,6 +1696,11 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 	kvm_write_guest_cached(v->kvm, &vcpu->pv_time,
 				&vcpu->hv_clock,
 				sizeof(vcpu->hv_clock));
+
+	vcpu->hv_clock.version += 1;
+	kvm_write_guest_cached(v->kvm, &vcpu->pv_time,
+				&vcpu->hv_clock,
+				sizeof(vcpu->hv_clock));
 	return 0;
 }
 

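The three steps listed in the patch comment can also be modelled
outside the kernel.  The following is a minimal userspace sketch with
C11 atomics, assuming a single writer per area; the struct and the
sketch_* names are hypothetical, and the explicit fences are an
assumption of this model, not something taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the area shared with the guest. */
struct sketch_shared_clock {
	_Atomic uint32_t version;	/* odd while an update is in flight */
	_Atomic uint64_t tsc_timestamp;
	_Atomic uint64_t system_time;
};

/* Publish new values so that a concurrent reader can detect the update. */
static void sketch_publish(struct sketch_shared_clock *dst,
			   uint64_t tsc_timestamp, uint64_t system_time)
{
	uint32_t v = atomic_load_explicit(&dst->version,
					  memory_order_relaxed);

	/* Single writer: the previous update must have completed. */
	assert(!(v & 1));

	/* 1) update version to odd number */
	atomic_store_explicit(&dst->version, v + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	/* 2) write modified data (version is still odd) */
	atomic_store_explicit(&dst->tsc_timestamp, tsc_timestamp,
			      memory_order_relaxed);
	atomic_store_explicit(&dst->system_time, system_time,
			      memory_order_relaxed);

	/* 3) update version to even number */
	atomic_store_explicit(&dst->version, v + 2, memory_order_release);
}
```

Each call moves the version forward by two, so a reader that sees the
same even version before and after copying the payload knows no update
overlapped its copy.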
