All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Avi Kivity <avi@redhat.com>
Cc: "Liu, Jinsong" <jinsong.liu@intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [PATCH] KVM: emulate lapic tsc deadline timer for hvm
Date: Tue, 23 Aug 2011 07:47:24 -0300	[thread overview]
Message-ID: <20110823104723.GA2898@amt.cnet> (raw)
In-Reply-To: <4E521EF9.7020404@redhat.com>

On Mon, Aug 22, 2011 at 12:18:49PM +0300, Avi Kivity wrote:
> On 08/17/2011 07:19 AM, Liu, Jinsong wrote:
> > From a9670ddff84080c56183e2d678189e100f891174 Mon Sep 17 00:00:00 2001
> >From: Liu, Jinsong<jinsong.liu@intel.com>
> >Date: Wed, 17 Aug 2011 11:36:28 +0800
> >Subject: [PATCH] KVM: emulate lapic tsc deadline timer for hvm
> 
> kvm doesn't have hvm.
> 
> >This patch emulate lapic tsc deadline timer for hvm:
> >Enumerate tsc deadline timer capacibility by CPUID;
> >Enable tsc deadline timer mode by LAPIC MMIO;
> >Start tsc deadline timer by MSR;
> 
> >diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> >index 4258aac..28bcf48 100644
> >--- a/arch/x86/include/asm/cpufeature.h
> >+++ b/arch/x86/include/asm/cpufeature.h
> >@@ -120,6 +120,7 @@
> >  #define X86_FEATURE_X2APIC	(4*32+21) /* x2APIC */
> >  #define X86_FEATURE_MOVBE	(4*32+22) /* MOVBE instruction */
> >  #define X86_FEATURE_POPCNT      (4*32+23) /* POPCNT instruction */
> >+#define X86_FEATURE_TSC_DEADLINE_TIMER    (4*32+24) /* Tsc deadline timer */
> >  #define X86_FEATURE_AES		(4*32+25) /* AES instructions */
> >  #define X86_FEATURE_XSAVE	(4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV */
> >  #define X86_FEATURE_OSXSAVE	(4*32+27) /* "" XSAVE enabled in the OS */
> >diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >index 307e3cf..28f7128 100644
> >--- a/arch/x86/include/asm/kvm_host.h
> >+++ b/arch/x86/include/asm/kvm_host.h
> >@@ -635,6 +635,7 @@ struct kvm_x86_ops {
> >  	int (*check_intercept)(struct kvm_vcpu *vcpu,
> >  			       struct x86_instruction_info *info,
> >  			       enum x86_intercept_stage stage);
> >+	u64 (*guest_to_host_tsc)(u64 guest_tsc);
> >  };
> 
> Please put this near the other tsc functions.  Add a comment
> explaining what value is returned under nested virtualization.
> 
> Please add the svm callback implementation.
> 
> >
> >--- a/arch/x86/include/asm/msr-index.h
> >+++ b/arch/x86/include/asm/msr-index.h
> >@@ -229,6 +229,8 @@
> >  #define MSR_IA32_APICBASE_ENABLE	(1<<11)
> >  #define MSR_IA32_APICBASE_BASE		(0xfffff<<12)
> >
> >+#define MSR_IA32_TSCDEADLINE		0x000006e0
> >+
> >  #define MSR_IA32_UCODE_WRITE		0x00000079
> >  #define MSR_IA32_UCODE_REV		0x0000008b
> 
> Need to add to msrs_to_save so live migration works.

MSR must be explicitly listed in qemu, also.

> >+		if (!apic->lapic_timer.tscdeadline)
> >+			return;
> >+
> >+		tsc_target = kvm_x86_ops->
> >+			guest_to_host_tsc(apic->lapic_timer.tscdeadline);
> >+		rdtscll(tsc_now);
> >+		tsc_delta = tsc_target - tsc_now;
> 
> This only works if we have a constant tsc, that's not true for large
> multiboard machines.  Need to do this with irqs disabled as well
> (reading both 'now' and 'tsc_now' in the same critical section).

Should look like this:

local_irq_disable();
u64 guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu);
if (guest_tsc <= tscdeadline)
        hrtimer_start(now);
else {
	ns = convert_to_ns(guest_tsc - tscdeadline);
	hrtimer_start(now + ns);
}
local_irq_enable();

Note the vcpus tsc can have different frequency than the hosts, so
vcpu_tsc_khz() should be used to convert to nanoseconds, not tsc_khz.

> >+		if (tsc_delta<  0)
> >+			tsc_delta = 0;
> >+
> >+		nsec = tsc_delta * 1000000L / tsc_khz;
> >+		hrtimer_start(&apic->lapic_timer.timer,
> >+			ktime_add_ns(now, nsec), HRTIMER_MODE_ABS);
> >+	}
> >  }
> >
> >@@ -883,6 +936,28 @@ void kvm_free_lapic(struct kvm_vcpu *vcpu)
> >   *----------------------------------------------------------------------
> >   */
> >
> >+u64 kvm_get_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu)
> >+{
> >+	struct kvm_lapic *apic = vcpu->arch.apic;
> >+
> >+	if (apic_lvtt_oneshot(apic) || apic_lvtt_period(apic))
> >+		return 0;
> 
> Why?

The hardware reset value of the IA32_TSC_DEADLINE MSR is 0. In other
timer modes (LVT bit 18 = 0), the IA32_TSC_DEADLINE MSR reads zero and
writes are ignored.

> 
> >+
> >+	return apic->lapic_timer.tscdeadline;
> >+}
> >+
> >+void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data)
> >+{
> >+	struct kvm_lapic *apic = vcpu->arch.apic;
> >+
> >+	if (apic_lvtt_oneshot(apic) || apic_lvtt_period(apic))
> >+		return;
> >+
> >+	hrtimer_cancel(&apic->lapic_timer.timer);
> >+	apic->lapic_timer.tscdeadline = data;
> >+	start_apic_timer(apic);
> 
> Shouldn't the msr value be updated even if we're outside tsc-deadline mode?

WARNING: multiple messages have this Message-ID (diff)
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Avi Kivity <avi@redhat.com>
Cc: "Liu, Jinsong" <jinsong.liu@intel.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: [Qemu-devel] [PATCH] KVM: emulate lapic tsc deadline timer for hvm
Date: Tue, 23 Aug 2011 07:47:24 -0300	[thread overview]
Message-ID: <20110823104723.GA2898@amt.cnet> (raw)
In-Reply-To: <4E521EF9.7020404@redhat.com>

On Mon, Aug 22, 2011 at 12:18:49PM +0300, Avi Kivity wrote:
> On 08/17/2011 07:19 AM, Liu, Jinsong wrote:
> > From a9670ddff84080c56183e2d678189e100f891174 Mon Sep 17 00:00:00 2001
> >From: Liu, Jinsong<jinsong.liu@intel.com>
> >Date: Wed, 17 Aug 2011 11:36:28 +0800
> >Subject: [PATCH] KVM: emulate lapic tsc deadline timer for hvm
> 
> kvm doesn't have hvm.
> 
> >This patch emulate lapic tsc deadline timer for hvm:
> >Enumerate tsc deadline timer capacibility by CPUID;
> >Enable tsc deadline timer mode by LAPIC MMIO;
> >Start tsc deadline timer by MSR;
> 
> >diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> >index 4258aac..28bcf48 100644
> >--- a/arch/x86/include/asm/cpufeature.h
> >+++ b/arch/x86/include/asm/cpufeature.h
> >@@ -120,6 +120,7 @@
> >  #define X86_FEATURE_X2APIC	(4*32+21) /* x2APIC */
> >  #define X86_FEATURE_MOVBE	(4*32+22) /* MOVBE instruction */
> >  #define X86_FEATURE_POPCNT      (4*32+23) /* POPCNT instruction */
> >+#define X86_FEATURE_TSC_DEADLINE_TIMER    (4*32+24) /* Tsc deadline timer */
> >  #define X86_FEATURE_AES		(4*32+25) /* AES instructions */
> >  #define X86_FEATURE_XSAVE	(4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV */
> >  #define X86_FEATURE_OSXSAVE	(4*32+27) /* "" XSAVE enabled in the OS */
> >diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >index 307e3cf..28f7128 100644
> >--- a/arch/x86/include/asm/kvm_host.h
> >+++ b/arch/x86/include/asm/kvm_host.h
> >@@ -635,6 +635,7 @@ struct kvm_x86_ops {
> >  	int (*check_intercept)(struct kvm_vcpu *vcpu,
> >  			       struct x86_instruction_info *info,
> >  			       enum x86_intercept_stage stage);
> >+	u64 (*guest_to_host_tsc)(u64 guest_tsc);
> >  };
> 
> Please put this near the other tsc functions.  Add a comment
> explaining what value is returned under nested virtualization.
> 
> Please add the svm callback implementation.
> 
> >
> >--- a/arch/x86/include/asm/msr-index.h
> >+++ b/arch/x86/include/asm/msr-index.h
> >@@ -229,6 +229,8 @@
> >  #define MSR_IA32_APICBASE_ENABLE	(1<<11)
> >  #define MSR_IA32_APICBASE_BASE		(0xfffff<<12)
> >
> >+#define MSR_IA32_TSCDEADLINE		0x000006e0
> >+
> >  #define MSR_IA32_UCODE_WRITE		0x00000079
> >  #define MSR_IA32_UCODE_REV		0x0000008b
> 
> Need to add to msrs_to_save so live migration works.

MSR must be explicitly listed in qemu, also.

> >+		if (!apic->lapic_timer.tscdeadline)
> >+			return;
> >+
> >+		tsc_target = kvm_x86_ops->
> >+			guest_to_host_tsc(apic->lapic_timer.tscdeadline);
> >+		rdtscll(tsc_now);
> >+		tsc_delta = tsc_target - tsc_now;
> 
> This only works if we have a constant tsc, that's not true for large
> multiboard machines.  Need to do this with irqs disabled as well
> (reading both 'now' and 'tsc_now' in the same critical section).

Should look like this:

local_irq_disable();
u64 guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu);
if (guest_tsc <= tscdeadline)
        hrtimer_start(now);
else {
	ns = convert_to_ns(guest_tsc - tscdeadline);
	hrtimer_start(now + ns);
}
local_irq_enable();

Note the vcpus tsc can have different frequency than the hosts, so
vcpu_tsc_khz() should be used to convert to nanoseconds, not tsc_khz.

> >+		if (tsc_delta<  0)
> >+			tsc_delta = 0;
> >+
> >+		nsec = tsc_delta * 1000000L / tsc_khz;
> >+		hrtimer_start(&apic->lapic_timer.timer,
> >+			ktime_add_ns(now, nsec), HRTIMER_MODE_ABS);
> >+	}
> >  }
> >
> >@@ -883,6 +936,28 @@ void kvm_free_lapic(struct kvm_vcpu *vcpu)
> >   *----------------------------------------------------------------------
> >   */
> >
> >+u64 kvm_get_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu)
> >+{
> >+	struct kvm_lapic *apic = vcpu->arch.apic;
> >+
> >+	if (apic_lvtt_oneshot(apic) || apic_lvtt_period(apic))
> >+		return 0;
> 
> Why?

The hardware reset value of the IA32_TSC_DEADLINE MSR is 0. In other
timer modes (LVT bit 18 = 0), the IA32_TSC_DEADLINE MSR reads zero and
writes are ignored.

> 
> >+
> >+	return apic->lapic_timer.tscdeadline;
> >+}
> >+
> >+void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data)
> >+{
> >+	struct kvm_lapic *apic = vcpu->arch.apic;
> >+
> >+	if (apic_lvtt_oneshot(apic) || apic_lvtt_period(apic))
> >+		return;
> >+
> >+	hrtimer_cancel(&apic->lapic_timer.timer);
> >+	apic->lapic_timer.tscdeadline = data;
> >+	start_apic_timer(apic);
> 
> Shouldn't the msr value be updated even if we're outside tsc-deadline mode?

  reply	other threads:[~2011-08-23 10:47 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-08-17  4:19 [PATCH] KVM: emulate lapic tsc deadline timer for hvm Liu, Jinsong
2011-08-17  4:19 ` [Qemu-devel] " Liu, Jinsong
2011-08-22  9:18 ` Avi Kivity
2011-08-22  9:18   ` [Qemu-devel] " Avi Kivity
2011-08-23 10:47   ` Marcelo Tosatti [this message]
2011-08-23 10:47     ` Marcelo Tosatti
2011-08-29  6:47     ` Tian, Kevin
2011-08-29  6:47       ` [Qemu-devel] " Tian, Kevin
2011-09-06 11:21     ` Liu, Jinsong
2011-09-06 11:21       ` [Qemu-devel] " Liu, Jinsong
2011-09-08 17:12     ` Liu, Jinsong
2011-09-08 17:12       ` [Qemu-devel] " Liu, Jinsong
2011-09-09 12:56       ` Marcelo Tosatti
2011-09-09 12:56         ` [Qemu-devel] " Marcelo Tosatti
2011-09-09 18:11         ` Liu, Jinsong
2011-09-09 18:11           ` [Qemu-devel] " Liu, Jinsong
2011-09-09 18:36           ` Marcelo Tosatti
2011-09-09 18:36             ` [Qemu-devel] " Marcelo Tosatti
2011-09-09 18:47             ` Liu, Jinsong
2011-09-09 18:47               ` [Qemu-devel] " Liu, Jinsong
2011-09-09 18:21         ` Liu, Jinsong
2011-09-09 18:21           ` [Qemu-devel] " Liu, Jinsong
2011-09-06 11:18   ` Liu, Jinsong
2011-09-06 11:18     ` [Qemu-devel] " Liu, Jinsong
2011-09-06 11:26     ` Avi Kivity
2011-09-06 11:26       ` [Qemu-devel] " Avi Kivity
2011-09-07 16:45   ` Liu, Jinsong
2011-09-07 16:45     ` [Qemu-devel] " Liu, Jinsong
2011-09-07 17:06     ` Avi Kivity
2011-09-07 17:06       ` [Qemu-devel] " Avi Kivity
2011-09-07 17:33       ` Liu, Jinsong
2011-09-07 17:33         ` [Qemu-devel] " Liu, Jinsong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110823104723.GA2898@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=avi@redhat.com \
    --cc=jinsong.liu@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.