Re: [PATCH] KVM: x86: do not go through vcpu in __get_kvmclock_ns

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Marcelo Tosatti <mtosatti@redhat.com>
To: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH] KVM: x86: do not go through vcpu in __get_kvmclock_ns
Date: Mon, 14 Nov 2016 15:05:52 -0200	[thread overview]
Message-ID: <20161114170550.GA6838@amt.cnet> (raw)
In-Reply-To: <20161114145239.GA2185@potion>

On Mon, Nov 14, 2016 at 03:52:40PM +0100, Radim Krčmář wrote:
> 2016-11-11 11:12+0100, Paolo Bonzini:
> > Going through the first VCPU is wrong if you follow a KVM_SET_CLOCK with
> > a KVM_GET_CLOCK immediately after, without letting the VCPU run and
> > call kvm_guest_time_update.
> > 
> > This is easily fixed however, because kvm_get_time_and_clockread provides
> > the information we want.
> > 
> > Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  arch/x86/kvm/x86.c | 18 ++++++++++--------
> >  1 file changed, 10 insertions(+), 8 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 1ba08278a9a9..1c16c6d7df7a 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -1620,6 +1620,11 @@ static bool kvm_get_time_and_clockread(s64 *kernel_ns, cycle_t *cycle_now)
> >  
> >  	return do_monotonic_boot(kernel_ns, cycle_now) == VCLOCK_TSC;
> >  }
> > +#else
> > +static inline bool kvm_get_time_and_clockread(s64 *kernel_ns, cycle_t *cycle_now)
> > +{
> > +	return false;
> > +}
> >  #endif
> >  
> >  /*
> > @@ -1724,18 +1729,15 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
> >  
> >  static u64 __get_kvmclock_ns(struct kvm *kvm)
> >  {
> > -	struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);
> >  	struct kvm_arch *ka = &kvm->arch;
> > +	cycle_t cycle_now;
> >  	s64 ns;
> >  
> > -	if (vcpu->arch.hv_clock.flags & PVCLOCK_TSC_STABLE_BIT) {
> > -		u64 tsc = kvm_read_l1_tsc(vcpu, rdtsc());
> > -		ns = __pvclock_read_cycles(&vcpu->arch.hv_clock, tsc);
> 
> This patch regresses the behavior as well, because the assumption that
> kvm_get_time_and_clockread() and __pvclock_read_cycles() count the same
> time doesn't hold.  See the end of the message for a quick test.
> 
> kvm_get_time_and_clockread() is actually the same as ktime_get_boot_ns()
> (if it works), so we'd be just obfucating the code. :)
> 
> I think that making kvmclock count as ktime_get_boot_ns() would be the
> best solution, but not possible this late in 4.9 ...
> 
> As a quick hack, I think it would be better to duplicate the update that
> would happen when running the VCPU before calling
> __pvclock_read_cycles(), i.e. paste something like this:
> 
>   if (kvm_check_request(KVM_REQ_CLOCK_UPDATE, vcpu))
>   	kvm_guest_time_update(vcpu);
> 
> > -	} else {
> > -		ns = ktime_get_boot_ns() + ka->kvmclock_offset;
> > -	}
> > +	if (!ka->use_master_clock ||
> > +	    !kvm_get_time_and_clockread(&ns, &cycle_now))
> > +		ns = ktime_get_boot_ns();
> >  
> > -	return ns;
> > +	return ns + ka->kvmclock_offset;
> >  }
> 
> The hunk below should return the same value in pvclock_ns and kernel_ns
> if they can be used interchangeably.  boot_ns is expected to be a bit
> delayed, because it is read late.  boot_ns shows a bounded offset from
> kernel_ns, unlike the drifting pvclock_ns.
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 83990ad3710e..30d4d3d02ac7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6653,6 +6653,17 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		goto cancel_injection;
>  	}
>  
> +	if (vcpu->kvm->arch.use_master_clock) {
> +		s64 kernel_ns;
> +		cycle_t tsc_now, pvclock_ns, boot_ns;
> +
> +		kvm_get_time_and_clockread(&kernel_ns, &tsc_now);
> +		pvclock_ns = __pvclock_read_cycles(&vcpu->arch.hv_clock, kvm_read_l1_tsc(vcpu, tsc_now)) - vcpu->kvm->arch.kvmclock_offset;
> +		boot_ns = ktime_get_boot_ns();
> +
> +		printk("ns diff: %lld %lld\n", pvclock_ns - kernel_ns, boot_ns - kernel_ns);
> +	}
> +
>  	preempt_disable();
>  
>  	kvm_x86_ops->prepare_guest_switch(vcpu);
> 
> and a sample output:

KVM_GET_CLOCK should return what the guest sees at the moment 
KVM_GET_CLOCK is called, which should include 

        if (vcpu->arch.hv_clock.flags & PVCLOCK_TSC_STABLE_BIT) {
                u64 tsc = kvm_read_l1_tsc(vcpu, rdtsc());
                ns = __pvclock_read_cycles(&vcpu->arch.hv_clock, tsc);
        } else {
                ns = ktime_get_boot_ns() + ka->kvmclock_offset;

                >>> add (rdtsc() - tsc_timestamp),
                    if kvmclock is enabled
        }

The addition under >>> above.

     prev parent reply	other threads:[~2016-11-14 17:05 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-11 10:12 [PATCH] KVM: x86: do not go through vcpu in __get_kvmclock_ns Paolo Bonzini
2016-11-14 14:52 ` Radim Krčmář
2016-11-14 16:17   ` Paolo Bonzini
2016-11-14 17:05   ` Marcelo Tosatti [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161114170550.GA6838@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=rkrcmar@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.