Re: [PATCH] kvm: kvmclock: let KVM_GET_CLOCK return whether the master clock is in use

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Marcelo Tosatti <mtosatti@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	"Radim Krčmář" <rkrcmar@redhat.com>
Subject: Re: [PATCH] kvm: kvmclock: let KVM_GET_CLOCK return whether the master clock is in use
Date: Wed, 9 Nov 2016 19:43:32 -0200	[thread overview]
Message-ID: <20161109214329.GA5611@amt.cnet> (raw)
In-Reply-To: <1478710095-15022-1-git-send-email-pbonzini@redhat.com>

On Wed, Nov 09, 2016 at 05:48:15PM +0100, Paolo Bonzini wrote:
> Userspace can read the exact value of kvmclock by reading the TSC
> and fetching the timekeeping parameters out of guest memory.  This
> however is brittle and not necessary anymore with KVM 4.11.  Provide
> a mechanism that lets userspace know if the new KVM_GET_CLOCK
> semantics are in effect, and---since we are at it---if the clock
> is stable across all VCPUs.
> 
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  Documentation/virtual/kvm/api.txt | 11 +++++++++++
>  arch/x86/kvm/x86.c                | 10 +++++++---
>  include/uapi/linux/kvm.h          |  7 +++++++
>  3 files changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 739db9ab16b2..6bbceb9a3a19 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -777,6 +777,17 @@ Gets the current timestamp of kvmclock as seen by the current guest. In
>  conjunction with KVM_SET_CLOCK, it is used to ensure monotonicity on scenarios
>  such as migration.
>  
> +When KVM_CAP_ADJUST_CLOCK is passed to KVM_CHECK_EXTENSION, it returns the
> +set of bits that KVM can return in struct kvm_clock_data's flag member.
> +
> +The only flag defined now is KVM_CLOCK_TSC_STABLE.  If set, the returned
> +value is the exact kvmclock value seen by all VCPUs at the instant
> +when KVM_GET_CLOCK was called.  If clear, the returned value is simply
> +CLOCK_MONOTONIC plus a constant offset; the offset can be modified
> +with KVM_SET_CLOCK.  KVM will try to make all VCPUs follow this clock,
> +but the exact value read by each VCPU could differ, because the host
> +TSC is not stable.
> +
>  struct kvm_clock_data {
>  	__u64 clock;  /* kvmclock current value */
>  	__u32 flags;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 3017de0431bd..1ba08278a9a9 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2596,7 +2596,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_PIT_STATE2:
>  	case KVM_CAP_SET_IDENTITY_MAP_ADDR:
>  	case KVM_CAP_XEN_HVM:
> -	case KVM_CAP_ADJUST_CLOCK:
>  	case KVM_CAP_VCPU_EVENTS:
>  	case KVM_CAP_HYPERV:
>  	case KVM_CAP_HYPERV_VAPIC:
> @@ -2623,6 +2622,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  #endif
>  		r = 1;
>  		break;
> +	case KVM_CAP_ADJUST_CLOCK:
> +		r = KVM_CLOCK_TSC_STABLE;
> +		break;
>  	case KVM_CAP_X86_SMM:
>  		/* SMBASE is usually relocated above 1M on modern chipsets,
>  		 * and SMM handlers might indeed rely on 4G segment limits,
> @@ -4103,9 +4105,11 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  		struct kvm_clock_data user_ns;
>  		u64 now_ns;
>  
> -		now_ns = get_kvmclock_ns(kvm);
> +		local_irq_disable();
> +		now_ns = __get_kvmclock_ns(kvm);
>  		user_ns.clock = now_ns;
> -		user_ns.flags = 0;
> +		user_ns.flags = kvm->arch.use_master_clock ? KVM_CLOCK_TSC_STABLE : 0;
> +		local_irq_enable();
>  		memset(&user_ns.pad, 0, sizeof(user_ns.pad));
>  
>  		r = -EFAULT;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 300ef255d1e0..4ee67cb99143 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -972,12 +972,19 @@ struct kvm_irqfd {
>  	__u8  pad[16];
>  };
>  
> +/* For KVM_CAP_ADJUST_CLOCK */
> +
> +/* Do not use 1, KVM_CHECK_EXTENSION returned it before we had flags.  */
> +#define KVM_CLOCK_TSC_STABLE		2
> +
>  struct kvm_clock_data {
>  	__u64 clock;
>  	__u32 flags;
>  	__u32 pad[9];
>  };
>  
> +/* For KVM_CAP_SW_TLB */
> +
>  #define KVM_MMU_FSL_BOOKE_NOHV		0
>  #define KVM_MMU_FSL_BOOKE_HV		1
>  
> -- 
> 1.8.3.1

Nevermind the previous comments - imagined you were using CLOCK_MONOTONIC_RAW.

Looks good to me modulo the race mentioned on the other email: lets see if 
KVM_GET_CLOCK at pre_save gets rid of that 100ms.

Why is advance_ns a hack? Would like to send over the delta between
kvmclock pre_save and EOF packet so destination can advance that 
as well (which requires separate migration entry).

next prev parent reply	other threads:[~2016-11-09 21:43 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-09 16:48 [PATCH] kvm: kvmclock: let KVM_GET_CLOCK return whether the master clock is in use Paolo Bonzini
2016-11-09 20:12 ` Marcelo Tosatti
2016-11-09 20:17   ` Marcelo Tosatti
2016-11-09 21:14     ` Marcelo Tosatti
2016-11-09 22:03     ` Paolo Bonzini
2016-11-09 21:43 ` Marcelo Tosatti [this message]
2016-11-19 19:34 ` Radim Krčmář

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161109214329.GA5611@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=rkrcmar@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.