From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Tokarev
Subject: Re: kvm guest: hrtimer: interrupt too slow
Date: Thu, 08 Oct 2009 12:09:57 +0400
Message-ID: <4ACD9E55.4040604@msgid.tls.msk.ru>
References: <4AC207B1.7020901@msgid.tls.msk.ru>
 <20091003231205.GA15015@amt.cnet> <20091007231733.GG5903@nowhere>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Marcelo Tosatti, Thomas Gleixner, kvm, Ingo Molnar
To: Frederic Weisbecker
Return-path:
Received: from isrv.corpit.ru ([81.13.33.159]:36709 "EHLO isrv.corpit.ru"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1753138AbZJHIKf (ORCPT ); Thu, 8 Oct 2009 04:10:35 -0400
In-Reply-To: <20091007231733.GG5903@nowhere>
Sender: kvm-owner@vger.kernel.org
List-ID:

[]
>>> hrtimer: interrupt too slow, forcing clock min delta to 461487495 ns
[]
> All that does not make sense anymore in a guest. The hang detection
> and warnings, the recalibrations of the min_clock_deltas are completely
> wrong in this context.
> Not only does it spuriously warn, but the minimum timer is increasing
> slowly and the guest progressively suffers from higher and higher
> latencies.

Well, it's not "slowly" -- that huge jump shown above is typical.
If my calculations are correct, that's about 0.5 sec min_delta.

> That's really bad.

*nod* :)

> Your patch lowers the immediate impact and makes this illness evolving
> smoother by scaling down the recalibration to the min_clock_delta.
> This appeases the bug but doesn't solve it. I fear it could be even
> worse because it makes it more discreet.

Well, long-term it's still not worse. The new code has a chance of
hitting the same values for min_delta in the long run, but this chance
is so small, and the time it would take so long, that it can be
forgotten about completely.

> May be can we instead increase the minimum threshold of loop in the
> hrtimer interrupt before considering it as a hang?
> Hmm, but a too high
> number could make this check useless, depending of the number of pending
> timers, which is a finite number.
>
> Well, actually I'm not confident anymore in this check. Or actually we
> should change it. May be we can rebase it on the time spent on the hrtimer
> interrupt (and check it every 10 loops of reprocessing in hrtimer_interrupts).
>
> Would a minimum threshold of 5 seconds spent in hrtimer_interrupt() be
> a reasonable check to perform?
> We should probably base our check on such kind of high boundary.
> What we want is an ultimate rescue against hard hangs anyway, not
> something that can solve the hang source itself. After the min_clock_delta
> recalibration, the system will be unstable (eg: high latencies).
> So if this must behave as a hammer, let's ensure we really need this hammer,
> even if we need to wait for few seconds before it triggers.

By the way, in all the other cases where I've seen this message
(hrtimer: interrupt too slow..) triggered, the problem was elsewhere,
and recalibrating the timer was not a good idea anyway, because
changing the timer didn't solve it.

Back to the VM issue at hand. I (almost) understand what's happening
in the discussion above, but I do not see how such *huge* delays can
be explained by scheduling on a different CPU etc. The delays are
measured in *seconds*, not nano- or microseconds. I can imagine, say,
swapping on the host that causes the whole guest to be swapped out for
a while during the timer interrupt handling, for example. But that is
NOT what's happening here, at least not as far as I can see.

Yes, the host had some swapping:

 pswpin 17535
 pswpout 41602

but it's not massive and I know exactly when it happened - when I
was testing something else.
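
The pswpin and pswpout values quoted above are cumulative counters from /proc/vmstat, so a single reading only shows that swapping happened at some point since boot. A minimal Python sketch of comparing two samples to see whether the host is swapping at the moment; the names parse_vmstat and swap_delta are illustrative helpers, not part of any existing tool, and the second sample is a hypothetical value invented for the example:

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name <unsigned integer>" lines.
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Return pages swapped in/out between two cumulative samples."""
    return {key: after[key] - before[key] for key in ("pswpin", "pswpout")}


# First sample: the counters quoted in the message.
first = parse_vmstat("pswpin 17535\npswpout 41602\n")
# Second sample: HYPOTHETICAL values, as if read a few seconds later.
second = parse_vmstat("pswpin 17535\npswpout 41610\n")

print(swap_delta(first, second))  # {'pswpin': 0, 'pswpout': 8}
```

On a live host both samples would come from reading /proc/vmstat some seconds apart. Deltas that stay at zero while the guest logs "interrupt too slow" would support the point that host swapping is not the cause.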
Right now free(1) reports:

             total       used       free     shared    buffers     cached
Mem:       8155280    8105704      49576          0    1209136      27440
-/+ buffers/cache:    6869128    1286152
Swap:      8388856     124112    8264744

(and the f*ng vmstat, again, does not show any swapping activity at all)

So, I think, the problem is somewhere else.

By the way, I *think* it only happens with kvm_clock, and does not
happen with the acpi_pm clocksource. Is it worth checking?

Thanks!

/mjt
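
For the kvm_clock versus acpi_pm comparison, a first step is to confirm which clocksource the guest is actually using. A minimal Python sketch, assuming a Linux guest that exposes the usual sysfs clocksource files; read_clocksource is an illustrative helper name, and the KVM paravirtual clock appears in these files as kvm-clock:

```python
import os

CLOCKSOURCE_DIR = "/sys/devices/system/clocksource/clocksource0"


def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksource names from sysfs-style files."""
    with open(os.path.join(base, "current_clocksource")) as f:
        current = f.read().strip()
    with open(os.path.join(base, "available_clocksource")) as f:
        available = f.read().split()
    return current, available


try:
    # Only works where the sysfs files exist and are readable.
    print(read_clocksource())
except OSError as exc:
    print("clocksource files not readable:", exc)
```

To test the other clocksource, root can write acpi_pm into current_clocksource at run time, or the guest can be booted with clocksource=acpi_pm on the kernel command line.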