Re: [PATCH] kvm: kvmclock: let KVM_GET_CLOCK return whether the master clock is in use

From: Marcelo Tosatti <mtosatti@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	"Radim Krčmář" <rkrcmar@redhat.com>
Subject: Re: [PATCH] kvm: kvmclock: let KVM_GET_CLOCK return whether the master clock is in use
Date: Wed, 9 Nov 2016 19:43:32 -0200
Message-ID: <20161109214329.GA5611@amt.cnet>
In-Reply-To: <1478710095-15022-1-git-send-email-pbonzini@redhat.com>

On Wed, Nov 09, 2016 at 05:48:15PM +0100, Paolo Bonzini wrote:
> Userspace can read the exact value of kvmclock by reading the TSC
> and fetching the timekeeping parameters out of guest memory.  This
> however is brittle and not necessary anymore with KVM 4.11.  Provide
> a mechanism that lets userspace know if the new KVM_GET_CLOCK
> semantics are in effect, and---since we are at it---if the clock
> is stable across all VCPUs.
> 
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  Documentation/virtual/kvm/api.txt | 11 +++++++++++
>  arch/x86/kvm/x86.c                | 10 +++++++---
>  include/uapi/linux/kvm.h          |  7 +++++++
>  3 files changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 739db9ab16b2..6bbceb9a3a19 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -777,6 +777,17 @@ Gets the current timestamp of kvmclock as seen by the current guest. In
>  conjunction with KVM_SET_CLOCK, it is used to ensure monotonicity on scenarios
>  such as migration.
>  
> +When KVM_CAP_ADJUST_CLOCK is passed to KVM_CHECK_EXTENSION, it returns the
> +set of bits that KVM can return in struct kvm_clock_data's flag member.
> +
> +The only flag defined now is KVM_CLOCK_TSC_STABLE.  If set, the returned
> +value is the exact kvmclock value seen by all VCPUs at the instant
> +when KVM_GET_CLOCK was called.  If clear, the returned value is simply
> +CLOCK_MONOTONIC plus a constant offset; the offset can be modified
> +with KVM_SET_CLOCK.  KVM will try to make all VCPUs follow this clock,
> +but the exact value read by each VCPU could differ, because the host
> +TSC is not stable.
> +
>  struct kvm_clock_data {
>  	__u64 clock;  /* kvmclock current value */
>  	__u32 flags;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 3017de0431bd..1ba08278a9a9 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2596,7 +2596,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_PIT_STATE2:
>  	case KVM_CAP_SET_IDENTITY_MAP_ADDR:
>  	case KVM_CAP_XEN_HVM:
> -	case KVM_CAP_ADJUST_CLOCK:
>  	case KVM_CAP_VCPU_EVENTS:
>  	case KVM_CAP_HYPERV:
>  	case KVM_CAP_HYPERV_VAPIC:
> @@ -2623,6 +2622,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  #endif
>  		r = 1;
>  		break;
> +	case KVM_CAP_ADJUST_CLOCK:
> +		r = KVM_CLOCK_TSC_STABLE;
> +		break;
>  	case KVM_CAP_X86_SMM:
>  		/* SMBASE is usually relocated above 1M on modern chipsets,
>  		 * and SMM handlers might indeed rely on 4G segment limits,
> @@ -4103,9 +4105,11 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  		struct kvm_clock_data user_ns;
>  		u64 now_ns;
>  
> -		now_ns = get_kvmclock_ns(kvm);
> +		local_irq_disable();
> +		now_ns = __get_kvmclock_ns(kvm);
>  		user_ns.clock = now_ns;
> -		user_ns.flags = 0;
> +		user_ns.flags = kvm->arch.use_master_clock ? KVM_CLOCK_TSC_STABLE : 0;
> +		local_irq_enable();
>  		memset(&user_ns.pad, 0, sizeof(user_ns.pad));
>  
>  		r = -EFAULT;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 300ef255d1e0..4ee67cb99143 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -972,12 +972,19 @@ struct kvm_irqfd {
>  	__u8  pad[16];
>  };
>  
> +/* For KVM_CAP_ADJUST_CLOCK */
> +
> +/* Do not use 1, KVM_CHECK_EXTENSION returned it before we had flags.  */
> +#define KVM_CLOCK_TSC_STABLE		2
> +
>  struct kvm_clock_data {
>  	__u64 clock;
>  	__u32 flags;
>  	__u32 pad[9];
>  };
>  
> +/* For KVM_CAP_SW_TLB */
> +
>  #define KVM_MMU_FSL_BOOKE_NOHV		0
>  #define KVM_MMU_FSL_BOOKE_HV		1
>  
> -- 
> 1.8.3.1

Never mind the previous comments - I had imagined you were using CLOCK_MONOTONIC_RAW.

Looks good to me, modulo the race mentioned in the other email: let's see if
KVM_GET_CLOCK at pre_save gets rid of that 100ms.
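
For reference, a minimal sketch of how userspace might interpret the
capability and the flag described in the patch. The constant and struct
layout are copied from the quoted diff; the helper names
(clock_flags_supported, clock_is_exact) and the struct name clock_data are
hypothetical, and the real ioctl calls are left out so the logic can be
read on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the value added to include/uapi/linux/kvm.h by the quoted patch. */
#define KVM_CLOCK_TSC_STABLE 2

/* Same layout as struct kvm_clock_data in the quoted patch. */
struct clock_data {
	uint64_t clock;   /* kvmclock current value */
	uint32_t flags;
	uint32_t pad[9];
};

/*
 * Hypothetical helper: map the return value of
 * KVM_CHECK_EXTENSION(KVM_CAP_ADJUST_CLOCK) to the set of flags that
 * KVM_GET_CLOCK may report.  0 means the capability is absent; 1 is what
 * kernels returned before flags existed, so neither carries flag bits.
 */
uint32_t clock_flags_supported(int check_ext_ret)
{
	if (check_ext_ret <= 1)
		return 0;
	return (uint32_t)check_ext_ret;
}

/*
 * Hypothetical helper: true only if the kernel advertises
 * KVM_CLOCK_TSC_STABLE and the value returned by KVM_GET_CLOCK has it set,
 * i.e. the value is the exact kvmclock seen by all VCPUs.
 */
bool clock_is_exact(int check_ext_ret, const struct clock_data *data)
{
	uint32_t supported = clock_flags_supported(check_ext_ret);

	return (supported & KVM_CLOCK_TSC_STABLE) &&
	       (data->flags & KVM_CLOCK_TSC_STABLE);
}
```

In a real caller, check_ext_ret would come from
ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_ADJUST_CLOCK) and data from
KVM_GET_CLOCK on the VM fd.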

Why is advance_ns a hack? I would like to send over the delta between
kvmclock pre_save and the EOF packet so the destination can advance that
as well (which requires a separate migration entry).
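
Roughly what I have in mind, as a sketch only: the struct and function
names below are made up for illustration and are not existing QEMU or KVM
code. The source records the kvmclock value at pre_save, computes the
delta up to the EOF packet, and the destination adds that delta before
setting the clock.

```c
#include <stdint.h>

/* Hypothetical migration fields; not taken from QEMU or from the patch. */
struct clock_migration_state {
	uint64_t clock_at_pre_save; /* KVM_GET_CLOCK value read at pre_save */
	uint64_t advance_ns;        /* delta between pre_save and EOF packet */
};

/* Source side: nanoseconds elapsed between pre_save and the EOF packet. */
uint64_t compute_advance_ns(uint64_t clock_at_pre_save, uint64_t clock_at_eof)
{
	/* Guard against a clock that appears to have gone backwards. */
	if (clock_at_eof < clock_at_pre_save)
		return 0;
	return clock_at_eof - clock_at_pre_save;
}

/* Destination side: value that would be handed to KVM_SET_CLOCK. */
uint64_t destination_clock(const struct clock_migration_state *s)
{
	return s->clock_at_pre_save + s->advance_ns;
}
```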
