From mboxrd@z Thu Jan  1 00:00:00 1970
From: Michael Tokarev <mjt@tls.msk.ru>
Subject: Re: smp guest questions
Date: Thu, 18 Jun 2009 13:14:05 +0400
Message-ID: <4A3A055D.7040002@msgid.tls.msk.ru>
References: <4A38ABA3.2010401@msgid.tls.msk.ru> <4A38B43E.6010704@redhat.com> <4A38C997.5020005@msgid.tls.msk.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: KVM list <kvm@vger.kernel.org>,
	Marcelo Tosatti <mtosatti@redhat.com>
To: Avi Kivity <avi@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from isrv.corpit.ru ([81.13.33.159]:34922 "EHLO isrv.corpit.ru"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753281AbZFRJOI (ORCPT <rfc822;kvm@vger.kernel.org>);
	Thu, 18 Jun 2009 05:14:08 -0400
In-Reply-To: <4A38C997.5020005@msgid.tls.msk.ru>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

Replying to myself & top-posting for reference.

I can't reproduce the problem - neither of the
two issues with timers mentioned in my original
email quoted below.

But there IS a race somewhere, that's for sure.

When I saw both - "pm-timer running at 200% rate"
and "hrtimer: interrupt too slow" (and I saw them
more than once on this configuration) - it was
during host system startup, when it starts all
the guest machines (several of them) and they
continue their own startup in the background, all
at once.  I.e., it happened more than once when
several kvm guests get started all together.

Playing with it more I wasn't able to repeat the
issue, and can't trigger it with 4 guests on my
test machine at home either.  But it happened
again "when I wasn't watching", also during
massive guest startup.

Another issue happened during startup (or, rather,
AFTER such massive startup when one guest reported
the 200% rate of pm-timer, probably at the same time
when the hrtimer message popped up) - another guest
locked up hard: the kvm process was looping using 100%
cpu time and did not answer monitor socket requests
(it was supposed to listen on a unix socket for monitor
commands).  *Probably* at the time when one guest was
in the locked state, another guest reported that hrtimer
message - but I'm not 100% sure since I can only see
it by the "--MARK--" messages in syslog of the dead guest,
which are at 20-minute intervals.  Maybe some "random
glitch", I dunno ;)
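
For anyone hitting this later, here is a minimal sketch of
how one could pick the two messages out of a saved guest
dmesg or syslog.  The function name scan_log and the label
strings are made up for illustration and are not part of
any kvm tooling; the patterns are just the two message
texts quoted in this thread.

```python
import re

# Patterns for the two guest kernel messages quoted in this thread.
PM_TIMER = re.compile(r"PM-Timer running at invalid rate: (\d+)% of normal")
HRTIMER = re.compile(
    r"hrtimer: interrupt too slow, forcing clock min delta to (\d+) ns"
)


def scan_log(lines):
    """Return (kind, value, line_number) for every matching log line.

    'kind' is a hypothetical label ("pm-timer" or "hrtimer"); 'value' is
    the reported percentage or the forced min delta in nanoseconds.
    """
    hits = []
    for n, line in enumerate(lines, start=1):
        m = PM_TIMER.search(line)
        if m:
            hits.append(("pm-timer", int(m.group(1)), n))
            continue
        m = HRTIMER.search(line)
        if m:
            hits.append(("hrtimer", int(m.group(1)), n))
    return hits
```

Feeding it the lines of each guest's log (for example
scan_log(open(path)), with path being wherever the log was
saved) gives a quick list of which guests reported what.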

In any case, since I can't provide more information
about all this despite all my attempts to reproduce
the situation, I consider this issue closed, for now
anyway.  But let it be archived for future reference :)

Thanks!

/mjt

Michael Tokarev wrote:
> Avi Kivity wrote:
>> On 06/17/2009 11:38 AM, Michael Tokarev wrote:
>>> After seeing words from Avi that smp guests
>>> are ok now, I decided to try.  And immediately
>>> got a few questions.
>>>
>>> Running on a Phenom 9750 machine (PhenomI), AMD780G
>>> chipset.  Host is 2.6.29 x86-64, qemu-kvm 0.10.5,
>>> guests are linux with kvm paravirt bits enabled, also
>>> dynticks (on both host and guest).
>>>
>>>
>>> When booting a 2-CPU guest, I see in dmesg:
>>>
>>> PM-Timer running at invalid rate: 200% of normal - aborting.
>>>
>>> and indeed, in available_clocksource there's no pmtimer.
>>> Should I be concerned?  It does not look healthy.
>>>
>>
>> It's a bug, please post guest details (kernel version, bitness).
> 
> The guest kernel is also 2.6.29[.5], but this time it's x86-32
> (compiled for P4).  kvm userspace is also 32bits (historical) --
> only host kernel is 64bit for now.  I'll try to do some more
> experiments later today on a test machine (this is a production
> box) -- "hopefully" that same issue will occur on another
> machine :)
> 
>> Copying Marcelo.
>>
>>>
>>> Some time later, I see stuff like:
>>>
>>> hrtimer: interrupt too slow, forcing clock min delta to 47210997 ns
>>>
>>> Which reminds me of issues I had with broken hpet (time goes
>>> back-n-forth with similar messages shown in dmesg, but
>>> about hpet not hrtimer).  Also does not look healthy.
>>>
>>>
>>> I haven't seen either of the two messages above on any of
>>> single-processor guests so far, at least with recent kernels
>>> and kvm userspace, only on smp (2 cpu for now).
>>
>> Please also post host /proc/cpuinfo.
> 
> HOST cpuinfo (only for 4th core, other cores are similar):
> processor    : 3
> vendor_id    : AuthenticAMD
> cpu family    : 16
> model        : 2
> model name    : AMD Phenom(tm) 9750 Quad-Core Processor
> stepping    : 3
> cpu MHz        : 1200.000
> (yes ondemand cpufreq is in effect - nominal frequency is 2400.
> I had no issues with cpufreq on this box so far, including all
> the guests).
> cache size    : 512 KB
> physical id    : 0
> siblings    : 4
> core id        : 3
> cpu cores    : 4
> apicid        : 3
> initial apicid    : 3
> fpu        : yes
> fpu_exception    : yes
> cpuid level    : 5
> wp        : yes
> flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
> cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt 
> pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc pni 
> monitor cx16 lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a 
> misalignsse 3dnowprefetch osvw ibs
> bogomips    : 4812.67
> TLB size    : 1024 4K pages
> clflush size    : 64
> cache_alignment    : 64
> address sizes    : 48 bits physical, 48 bits virtual
> power management: ts ttp tm stc 100mhzsteps hwpstate
> 
> 
> 
> cpuinfo on GUEST (also for only one CPU):
> 
> processor    : 1
> vendor_id    : AuthenticAMD
> cpu family    : 6
> model        : 2
> model name    : QEMU Virtual CPU version 0.10.5
> stepping    : 3
> cpu MHz        : 2405.894
> cache size    : 512 KB
> fdiv_bug    : no
> hlt_bug        : no
> f00f_bug    : no
> coma_bug    : no
> fpu        : yes
> fpu_exception    : yes
> cpuid level    : 2
> wp        : yes
> flags        : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
> pat pse36 clflush mmx fxsr sse sse2 syscall lm pni hypervisor
> bogomips    : 4811.78
> clflush size    : 64
> power management:
> 
> 
> Thanks!
> 
> /mjt