From: Marcelo Tosatti <mtosatti@redhat.com>
To: Michael Tokarev <mjt@tls.msk.ru>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Frederic Weisbecker <fweisbec@gmail.com>,
kvm <kvm@vger.kernel.org>
Subject: Re: kvm guest: hrtimer: interrupt too slow
Date: Thu, 8 Oct 2009 16:52:32 -0300
Message-ID: <20091008195232.GA8350@amt.cnet>
In-Reply-To: <4ACDF1EF.3070005@msgid.tls.msk.ru>
On Thu, Oct 08, 2009 at 06:06:39PM +0400, Michael Tokarev wrote:
> Thomas Gleixner wrote:
>> On Thu, 8 Oct 2009, Michael Tokarev wrote:
>>
>>> Thomas Gleixner wrote:
>>>> On Thu, 8 Oct 2009, Michael Tokarev wrote:
>>>>> Yesterday I was "lucky" enough to actually watch what's
>>>>> going on when the delay actually happens.
>>>>>
>>>>> I run desktop environment on a kvm virtual machine here.
>>>>> The server is on diskless terminal, and the rest, incl.
>>>>> the window manager etc, is started from a VM.
>>>>>
>>>>> And yesterday, during normal system load (nothing extra,
>>>>> and not idle either, and all the other guests were running
>>>>> under normal load too), I had a stall of everything on this
>>>>> X session for about 2..3, maybe 5 seconds.
>>>>>
>>>>> It felt like a completely stuck machine. Nothing was moving
>>>>> on the screen, no reaction to the keyboard etc.
>>>>>
>>>>> And after several seconds it returned to normal. With
>>>>> the familiar message in dmesg -- increasing hrtimer etc,
>>>>> to the next 50%. (Without a patch from Marcelo at this
>>>>> time it should increase min_delta to a large number).
>>>>>
>>>>> To summarize: there's something, well, more interesting
>>>>> going on here. In addition to the scheduling issues that
>>>>> causes timers to be calculated on the "wrong" CPU etc as
>>>> Care to elaborate ?
>>> Such huge delays (in terms of seconds, not ms or ns) - I don't
>>> understand how such delays can be explained by scheduling to the
>>> different cpu etc. That's what I mean. I know very little about
>>> all this low-level stuff so I may be completely out of context,
>>> but such explanation does not look right to me, simple as that.
>>> By "scheduling mistakes" we can get mistakes in range of millisecs,
>>> but not secs.
>>
>> I'm really missing the big picture here.
>>
>> What means "causes timers to be calculated on the "wrong" CPU etc" ?
>> And what do you consider a "scheduling mistake" ?
>
> From the initial diagnostics by Marcelo:
>
> > It seems the way hrtimer_interrupt_hanging calculates min_delta is
> > wrong (especially to virtual machines). The guest vcpu can be scheduled
> > out during the execution of the hrtimer callbacks (and the callbacks
> > themselves can do operations that translate to blocking operations in
> > the hypervisor).
> >
> > So high min_delta values can be calculated if, for example, a single
> > hrtimer_interrupt run takes two host time slices to execute, while some
> > other higher priority task runs for N slices in between.
>
> From this I conclude that the huge min_delta is due to some other task(s)
> on the host being run while this guest is in hrtimer callback. But I
> fail to see why that process on the host takes SO MUCH time, enough to
> push the resulting min_delta to 0.5s, or to cause delays of 3..5 seconds
> in the guest. Delays in the range of several extra milliseconds are ok,
> but *seconds* is too much.
>
> Note again that neither host nor guest is under high load when
> this jump happens. Also note that there are no high-priority processes
> running on the host, all are of the same priority level, including
> all the guests.
>
> Note also that so far I only see it on SMP guests, never on UP
> guests. And only on guests with kvm_clock, not with acpi_pm
> clocksource.
>
> What I'm trying to say is that it looks like there's something
> else wrong here in the guest code. Huge stalls, huge delays
> while in hrtimer callback (I think it always happens when such a
> delay occurs, it's just noticed by the hrtimer code) -- that's
> the root cause of all this, (probably) wrong logic in hrtimer
> calibration just shows the results of something that's wrong
> elsewhere.
True.
It would be useful to collect sar output (sar -B -b -u) at one-second
intervals on both host and guest. You already mentioned load was low, but this should
give more details.
Was there swapping going on?
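
A minimal sketch of that collection, assuming sar from the sysstat
package is installed on both host and guest. The function name
collect_sar and the sar-<label>.log file names are made up for
illustration; only the sar flags come from the suggestion above.

```shell
# Hypothetical helper (function name and log file naming are assumptions).
# Samples once per second: -B paging, -b I/O and transfer rates, -u CPU.
collect_sar() {
    label=$1            # e.g. "host" or "guest"
    samples=${2:-60}    # number of one-second samples, default 60
    if ! command -v sar >/dev/null 2>&1; then
        echo "sar not found; it ships with the sysstat package" >&2
        return 1
    fi
    sar -B -b -u 1 "$samples" > "sar-${label}.log"
}

# Usage: run "collect_sar host 120" on the host and "collect_sar guest 120"
# in the guest at the same time, then line the two logs up by timestamp.
```

For the swapping question, sar -W (pswpin/s, pswpout/s) or the si/so
columns of "vmstat 1" should show it directly.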