From: Glauber Costa <gcosta@redhat.com>
To: Glauber Costa <glommer@gmail.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
kvm-devel <kvm@vger.kernel.org>,
kraxel@redhat.com, chrisw@redhat.com
Subject: Re: kvm guest loops_per_jiffy miscalibration under host load
Date: Mon, 07 Jul 2008 18:35:52 -0300 [thread overview]
Message-ID: <48728C38.9060303@redhat.com> (raw)
In-Reply-To: <5d6222a80807071232m1031ebdao9f93997fcaefd958@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 1749 bytes --]
Glauber Costa wrote:
> On Mon, Jul 7, 2008 at 4:21 PM, Anthony Liguori <aliguori@us.ibm.com> wrote:
>> Marcelo Tosatti wrote:
>>> On Mon, Jul 07, 2008 at 03:27:16PM -0300, Glauber Costa wrote:
>>>
>>>>> I agree. A paravirt solution solves the problem.
>>>>>
>>>> Please, look at the patch I've attached.
>>>>
>>>> It does __delay with host help. This may have the nice effect of not
>>>> busy waiting for long-enough delays, and may well.
>>>>
>>>> It is _completely_ PoC, just to show the idea. It's ugly, broken,
>>>> obviously have to go through pv-ops, etc.
>>>>
>>>> Also, I intend to add a lpj field in the kvm clock memory area. We
>>>> could do just this later, do both, etc.
>>>>
>>>> If we agree this is a viable solution, I'll start working on a patch
>>>>
>>> This stops interrupts from being processed during the delay. And also
>>> there are cases like this recently introduced break:
>>>
>>> /* Allow RT tasks to run */
>>> preempt_enable();
>>> rep_nop();
>>> preempt_disable();
>>>
>>> I think it would be better to just pass the lpj value via paravirt and
>>> let the guest busy-loop as usual.
>>>
>> I agree. VMI and Xen already pass a cpu_khz paravirt value. Something
>> similar would probably do the trick.
>
> yeah, there is a pv-op for this, so I won't have to mess with the
> clock interface. I'll draft a patch for it, and sent it.
>
>> It may be worthwhile having udelay() or spinlocks call into KVM if they've
>> been spinning long enough but I think that's a separate discussion.
>
> I think it is, but I'd have to back it up with numbers. measurements
> are on the way.
>> Regards,
>>
>> Anthony Liguori
>>
>>
>
>
>
How about this? RFC only for now
[-- Attachment #2: preset.patch --]
[-- Type: text/plain, Size: 2677 bytes --]
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 87edf1c..8514b04 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -78,6 +78,14 @@ static cycle_t kvm_clock_read(void)
return ret;
}
+static unsigned long kvm_get_cpu_khz(void)
+{
+ struct pvclock_vcpu_time_info *info;
+
+ info = &per_cpu(hv_clock, 0);
+ return pv_cpu_khz(info);
+}
+
static struct clocksource kvm_clock = {
.name = "kvm-clock",
.read = kvm_clock_read,
@@ -153,6 +161,7 @@ void __init kvmclock_init(void)
pv_time_ops.get_wallclock = kvm_get_wallclock;
pv_time_ops.set_wallclock = kvm_set_wallclock;
pv_time_ops.sched_clock = kvm_clock_read;
+ pv_time_ops.get_cpu_khz = kvm_get_cpu_khz;
#ifdef CONFIG_X86_LOCAL_APIC
pv_apic_ops.setup_secondary_clock = kvm_setup_secondary_clock;
#endif
diff --git a/arch/x86/kernel/pvclock.c b/arch/x86/kernel/pvclock.c
index 05fbe9a..2d325ae 100644
--- a/arch/x86/kernel/pvclock.c
+++ b/arch/x86/kernel/pvclock.c
@@ -97,6 +97,18 @@ static unsigned pvclock_get_time_values(struct pvclock_shadow_time *dst,
return dst->version;
}
+unsigned long pv_cpu_khz(struct pvclock_vcpu_time_info *info)
+{
+ u64 cpu_khz = 1000000ULL << 32;
+
+ do_div(cpu_khz, info->tsc_to_system_mul);
+ if (info->tsc_shift < 0)
+ cpu_khz <<= -info->tsc_shift;
+ else
+ cpu_khz >>= info->tsc_shift;
+ return cpu_khz;
+}
+
cycle_t pvclock_clocksource_read(struct pvclock_vcpu_time_info *src)
{
struct pvclock_shadow_time shadow;
diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index 64f0038..074cabb 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -200,17 +200,10 @@ unsigned long long xen_sched_clock(void)
/* Get the CPU speed from Xen */
unsigned long xen_cpu_khz(void)
{
- u64 xen_khz = 1000000ULL << 32;
- const struct pvclock_vcpu_time_info *info =
+ struct pvclock_vcpu_time_info *info =
&HYPERVISOR_shared_info->vcpu_info[0].time;
- do_div(xen_khz, info->tsc_to_system_mul);
- if (info->tsc_shift < 0)
- xen_khz <<= -info->tsc_shift;
- else
- xen_khz >>= info->tsc_shift;
-
- return xen_khz;
+ return pv_cpu_khz(info);
}
static cycle_t xen_clocksource_read(void)
diff --git a/include/asm-x86/pvclock.h b/include/asm-x86/pvclock.h
index 85b1bba..41d816f 100644
--- a/include/asm-x86/pvclock.h
+++ b/include/asm-x86/pvclock.h
@@ -6,6 +6,7 @@
/* some helper functions for xen and kvm pv clock sources */
cycle_t pvclock_clocksource_read(struct pvclock_vcpu_time_info *src);
+unsigned long pv_cpu_khz(struct pvclock_vcpu_time_info *info);
void pvclock_read_wallclock(struct pvclock_wall_clock *wall,
struct pvclock_vcpu_time_info *vcpu,
struct timespec *ts);
next prev parent reply other threads:[~2008-07-07 21:37 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-07-02 16:40 kvm guest loops_per_jiffy miscalibration under host load Marcelo Tosatti
2008-07-03 13:17 ` Glauber Costa
2008-07-04 22:51 ` Marcelo Tosatti
2008-07-07 1:56 ` Anthony Liguori
2008-07-07 18:27 ` Glauber Costa
2008-07-07 18:48 ` Marcelo Tosatti
2008-07-07 19:21 ` Anthony Liguori
2008-07-07 19:32 ` Glauber Costa
2008-07-07 21:35 ` Glauber Costa [this message]
2008-07-11 21:18 ` David S. Ahern
2008-07-12 14:10 ` Marcelo Tosatti
2008-07-12 19:28 ` David S. Ahern
2008-07-07 18:17 ` Daniel P. Berrange
-- strict thread matches above, loose matches on Subject: below --
2008-07-22 3:25 Marcelo Tosatti
2008-07-22 8:22 ` Jan Kiszka
2008-07-22 12:49 ` Marcelo Tosatti
2008-07-22 15:54 ` Jan Kiszka
2008-07-22 22:00 ` Dor Laor
2008-07-22 19:56 ` David S. Ahern
2008-07-23 2:57 ` David S. Ahern
2008-07-29 14:58 ` Marcelo Tosatti
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=48728C38.9060303@redhat.com \
--to=gcosta@redhat.com \
--cc=aliguori@us.ibm.com \
--cc=chrisw@redhat.com \
--cc=glommer@gmail.com \
--cc=kraxel@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=mtosatti@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox