On Sun, Sep 30, 2012 at 01:10:55PM +0200, Avi Kivity wrote:
> On 09/28/2012 05:35 AM, Paul E. McKenney wrote:
> > On Thu, Sep 27, 2012 at 12:40:44PM +0800, Fengguang Wu wrote:
> >> On Wed, Sep 26, 2012 at 09:28:50PM -0700, Paul E. McKenney wrote:
> >> > On Thu, Sep 27, 2012 at 10:54:00AM +0800, Fengguang Wu wrote:
> >> > > On Wed, Sep 26, 2012 at 09:45:43AM -0700, Paul E. McKenney wrote:
> >> > > > On Wed, Sep 26, 2012 at 04:15:01PM +0800, Fengguang Wu wrote:
> >
> > [ . . . ]
> >
> >> > > > But could you also please send your .config file and a description of
> >> > >
> >> > > .config attached.
> >> > >
> >> > > > the workload you are running?
> >> > >
> >> > > It's basically the below commands. The exact initrd is not relevant in
> >> > > this case because it's a boot time warning before user space is
> >> > > started. The stalls roughly happen 1 time on every 10 boots.
> >> >
> >> > Yow!!!
> >> >
> >> > You have severe cross-CPU time-synchronization problems. See for
> >> > example the first dmesg, with the relevant part extracted right here.
> >> > One CPU believes that it is about 37 seconds past boot, and the other
> >> > CPU believes that it is about 137 seconds past boot. Given that large
> >> > of a time difference, an RCU CPU stall warning is expected behavior.
> >>
> >> Good spot! Yeah I noticed that huge timestamp gap, however didn't take
> >> it seriously enough..
> >>
> >> > Get your two CPUs in agreement about what time it is, and I bet that
> >> > the CPU stall warnings will go away.
> >>
> >> Possibly KVM related? Because the warnings show up in many test boxes
> >> running KVM and so is not likely some hardware specific issue.
> >
> > I vaguely recall seeing something recently. But let's ask the KVM and
> > timekeeping guys.
>
> From the logs it looks like hpet (why not kvmclock?) is used for the
> clock, it should not generate such drifts since it is a global clock.
> Can you verify current_clocksource on a boot that actually failed (in
> case the clocksource is switched during runtime)?

I've checked the dmesg that Paul cited (attached). Yes, it contains the line

[    4.970051] Switching to clocksource hpet

and then

[    7.250353] Switching to clocksource tsc

And there are no kvm-clock lines. Oh well, for this particular kernel:

# CONFIG_KVM_CLOCK is not set

I'm not sure how this happened; maybe some kconfig option that
CONFIG_KVM_CLOCK depends on was randconfig'ed to off..

Thanks,
Fengguang
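
For anyone repeating this check across many boot logs, here is a minimal sketch of it, not a tested tool. It assumes the dmesg uses the same "Switching to clocksource <name>" wording as the two lines quoted above (other kernel versions may word the message differently), and that the running kernel exposes the usual sysfs file for the active clocksource. The function names are made up for illustration.

```python
import re
from pathlib import Path

# Message wording taken from the dmesg lines quoted above; this is an
# assumption about the log format, other kernels may print it differently.
SWITCH_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+Switching to clocksource\s+(\S+)")

# Standard sysfs location of the currently active clocksource.
SYSFS_CURRENT = "/sys/devices/system/clocksource/clocksource0/current_clocksource"


def clocksource_switches(dmesg_text):
    """Return (seconds_since_boot, clocksource_name) for each switch line, in log order."""
    return [(float(ts), name) for ts, name in SWITCH_RE.findall(dmesg_text)]


def current_clocksource(path=SYSFS_CURRENT):
    """Read the active clocksource from sysfs, or return None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None
```

Feeding the text of a saved dmesg to clocksource_switches() lists every runtime switch with its timestamp, and current_clocksource() reports what the guest ended up on, which together answer whether the clocksource changed after boot.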