* kvm linux guest hanging for minutes at a time
@ 2011-08-07 14:06 Thomas Fjellstrom
  2011-08-09  9:16 ` Avi Kivity
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Fjellstrom @ 2011-08-07 14:06 UTC (permalink / raw)
  To: kvm@vger.kernel.org

Occasionally when there's heavy cpu and/or io load, a kvm guest will lock up 
for minutes at a time. The last occurrence lasted about 12 minutes, and the 
guest itself reported:

[1992982.639514] Clocksource tsc unstable (delta = -747307707123 ns)

in dmesg after it came back. The only other hint as to what is going on is 
that the irq counts for "local timer requests", virtio-input and 
virtio-requests spike rather high. Also, one of the cpu cores on the host was 
pegged the entire time.
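
As a rough sanity check, the delta in the dmesg line above can be converted 
to minutes. The Python sketch below is only illustrative arithmetic; the names 
DELTA_NS and delta_to_minutes are made up for this example. The result is 
roughly 12.5 minutes, which is in line with the hang of about 12 minutes.

```python
# Convert the delta from the "Clocksource tsc unstable" line into minutes.
DELTA_NS = -747307707123  # value copied from the dmesg line
NS_PER_SECOND = 1_000_000_000


def delta_to_minutes(delta_ns: int) -> float:
    """Return the magnitude of a nanosecond delta in minutes."""
    return abs(delta_ns) / NS_PER_SECOND / 60


if __name__ == "__main__":
    seconds = abs(DELTA_NS) / NS_PER_SECOND
    # prints: 747.3 s = 12.5 min
    print(f"{seconds:.1f} s = {delta_to_minutes(DELTA_NS):.1f} min")
```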

The last thing to cause a hang was an "aptitude upgrade" in the guest, which 
was a bit behind, so it had to update over 300 packages.

The host is running 2.6.38-1-amd64 (2.6.38+32) from debian, qemu-kvm 0.14.0, 
and the guest was running 2.6.38-2-amd64 (not sure on the + number).

Is this a known problem that's hopefully fixed in newer kernels and qemu/kvm 
packages?

Thanks

-- 
Thomas Fjellstrom
thomas@fjellstrom.ca

* Re: kvm linux guest hanging for minutes at a time
  2011-08-07 14:06 kvm linux guest hanging for minutes at a time Thomas Fjellstrom
@ 2011-08-09  9:16 ` Avi Kivity
  2011-08-09 12:03   ` Thomas Fjellstrom
  0 siblings, 1 reply; 14+ messages in thread
From: Avi Kivity @ 2011-08-09  9:16 UTC (permalink / raw)
  To: thomas; +Cc: kvm@vger.kernel.org

On 08/07/2011 05:06 PM, Thomas Fjellstrom wrote:
> Occasionally when there's heavy cpu and/or io load, a kvm guest will lock up
> for minutes at a time, last occurrence was for about 12 minutes or so, and the
> guest itself reported:
>
> [1992982.639514] Clocksource tsc unstable (delta = -747307707123 ns)
>
> in dmesg after it came back. The only other hint as to what is going on is
> that the irq count for "local timer requests", virtio-input and virtio-
> requests spikes rather high. Also one of the cpu cores on the host was pegged
> the entire time.
>
> The last thing to cause a hang was an "aptitude upgrade" in the guest, which
> was a bit behind, so it had to update over 300 packages.
>
> The host is running 2.6.38-1-amd64 (2.6.38+32) from debian, qemu-kvm 0.14.0,
> and the guest was running 2.6.38-2-amd64 (not sure on the + number).
>
> Is this a known problem, thats hopefully fixed in newer kernels and qemu/kvm
> packages?
>

Your guest isn't using kvmclock for some reason.  Is it compiled in the 
guest kernel?  What are the contents of 
/sys/devices/system/clocksource/clocksource0/available_clocksource and 
/sys/devices/system/clocksource/clocksource0/current_clocksource (in the 
guest filesystem)?
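
If it is easier to script the check, the two sysfs files can be read with 
something like the Python sketch below. The helper names read_clocksources 
and uses_kvmclock are hypothetical, and the base directory is a parameter 
only so the helpers can be pointed at a copy of the files.

```python
from pathlib import Path
from typing import List, Tuple

# sysfs directory named in the message; overridable for testing.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base: Path = SYSFS_DIR) -> Tuple[List[str], str]:
    """Return (available clocksources, current clocksource)."""
    available = (base / "available_clocksource").read_text().split()
    current = (base / "current_clocksource").read_text().strip()
    return available, current


def uses_kvmclock(base: Path = SYSFS_DIR) -> bool:
    """True if the current clocksource is kvm-clock."""
    _, current = read_clocksources(base)
    return current == "kvm-clock"


if __name__ == "__main__" and SYSFS_DIR.exists():
    available, current = read_clocksources()
    print("available:", " ".join(available))
    print("current:", current)
```

Run inside the guest, this reports the same information as reading the two 
files by hand.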

-- 
error compiling committee.c: too many arguments to function


* Re: kvm linux guest hanging for minutes at a time
  2011-08-09  9:16 ` Avi Kivity
@ 2011-08-09 12:03   ` Thomas Fjellstrom
  2011-08-09 12:36     ` Avi Kivity
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Fjellstrom @ 2011-08-09 12:03 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm@vger.kernel.org

On August 9, 2011, Avi Kivity wrote:
> On 08/07/2011 05:06 PM, Thomas Fjellstrom wrote:
> > Occasionally when there's heavy cpu and/or io load, a kvm guest will lock
> > up for minutes at a time, last occurrence was for about 12 minutes or
> > so, and the guest itself reported:
> > 
> > [1992982.639514] Clocksource tsc unstable (delta = -747307707123 ns)
> > 
> > in dmesg after it came back. The only other hint as to what is going on
> > is that the irq count for "local timer requests", virtio-input and
> > virtio- requests spikes rather high. Also one of the cpu cores on the
> > host was pegged the entire time.
> > 
> > The last thing to cause a hang was an "aptitude upgrade" in the guest,
> > which was a bit behind, so it had to update over 300 packages.
> > 
> > The host is running 2.6.38-1-amd64 (2.6.38+32) from debian, qemu-kvm
> > 0.14.0, and the guest was running 2.6.38-2-amd64 (not sure on the +
> > number).
> > 
> > Is this a known problem, thats hopefully fixed in newer kernels and
> > qemu/kvm packages?
> 
> Your guest isn't using kvmclock for some reason.  Is it compiled in the
> guest kernel?  What are the contents of
> /sys/devices/system/clocksource/clocksource0/available_clocksource and
> /sys/devices/system/clocksource/clocksource0/current_clocksource (in the
> guest filesystem)?

Hi, it seems it is using kvm-clock:

$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
kvm-clock hpet acpi_pm
$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
kvm-clock

-- 
Thomas Fjellstrom
thomas@fjellstrom.ca

* Re: kvm linux guest hanging for minutes at a time
  2011-08-09 12:03   ` Thomas Fjellstrom
@ 2011-08-09 12:36     ` Avi Kivity
  2011-08-09 14:31       ` Thomas Fjellstrom
  0 siblings, 1 reply; 14+ messages in thread
From: Avi Kivity @ 2011-08-09 12:36 UTC (permalink / raw)
  To: thomas; +Cc: kvm@vger.kernel.org

On 08/09/2011 03:03 PM, Thomas Fjellstrom wrote:
> >
> >  Your guest isn't using kvmclock for some reason.  Is it compiled in the
> >  guest kernel?  What are the contents of
> >  /sys/devices/system/clocksource/clocksource0/available_clocksource and
> >  /sys/devices/system/clocksource/clocksource0/current_clocksource (in the
> >  guest filesystem)?
>
> Hi, it seems it is using kvm-clock:
>
> $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> kvm-clock hpet acpi_pm
> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> kvm-clock
>


Yikes.  Please trace such a hang according to 
http://www.linux-kvm.org/page/Tracing.

-- 
error compiling committee.c: too many arguments to function


* Re: kvm linux guest hanging for minutes at a time
  2011-08-09 12:36     ` Avi Kivity
@ 2011-08-09 14:31       ` Thomas Fjellstrom
  2011-08-09 14:37         ` Avi Kivity
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Fjellstrom @ 2011-08-09 14:31 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm@vger.kernel.org

On August 9, 2011, Avi Kivity wrote:
> On 08/09/2011 03:03 PM, Thomas Fjellstrom wrote:
> > >  Your guest isn't using kvmclock for some reason.  Is it compiled in
> > >  the guest kernel?  What are the contents of
> > >  /sys/devices/system/clocksource/clocksource0/available_clocksource and
> > >  /sys/devices/system/clocksource/clocksource0/current_clocksource (in
> > >  the guest filesystem)?
> > 
> > Hi, it seems it is using kvm-clock:
> > 
> > $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > kvm-clock hpet acpi_pm
> > $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> > kvm-clock
> 
> Yikes.  Please trace such a hang according to
> http://www.linux-kvm.org/page/Tracing.

Does it matter that I have several vms running? Is there a way to limit it to 
tracing the single kvm process that's been locking up?

-- 
Thomas Fjellstrom
thomas@fjellstrom.ca

* Re: kvm linux guest hanging for minutes at a time
  2011-08-09 14:31       ` Thomas Fjellstrom
@ 2011-08-09 14:37         ` Avi Kivity
  2011-08-09 14:46           ` Thomas Fjellstrom
  2011-08-09 15:33           ` Nick
  0 siblings, 2 replies; 14+ messages in thread
From: Avi Kivity @ 2011-08-09 14:37 UTC (permalink / raw)
  To: thomas; +Cc: kvm@vger.kernel.org

On 08/09/2011 05:31 PM, Thomas Fjellstrom wrote:
> On August 9, 2011, Avi Kivity wrote:
> >  On 08/09/2011 03:03 PM, Thomas Fjellstrom wrote:
> >  >  >   Your guest isn't using kvmclock for some reason.  Is it compiled in
> >  >  >   the guest kernel?  What are the contents of
> >  >  >   /sys/devices/system/clocksource/clocksource0/available_clocksource and
> >  >  >   /sys/devices/system/clocksource/clocksource0/current_clocksource (in
> >  >  >   the guest filesystem)?
> >  >
> >  >  Hi, it seems it is using kvm-clock:
> >  >
> >  >  $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> >  >  kvm-clock hpet acpi_pm
> >  >  $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> >  >  kvm-clock
> >
> >  Yikes.  Please trace such a hang according to
> >  http://www.linux-kvm.org/page/Tracing.
>
> Does it matter that I have several vms running? Is there a way to limit it to
> tracing the single kvm process that's been locking up?
>

You can use "trace-cmd record -F ... qemu ..." but that misses out on 
events that run from workqueues.

Best to stop those other guests.
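
For a guest that is already running, a possible alternative to -F is to 
attach by PID. The Python sketch below rests on several assumptions and is 
not taken from the Tracing page: it assumes each guest was started with 
"-name <guest>" on its command line, and that the installed trace-cmd 
supports "-e kvm", "-P <pid>" and "-o <file>" as described in 
trace-cmd-record(1). find_guest_pids and trace_cmd_argv are hypothetical 
helper names. The code only builds an argument list and runs nothing. Like 
-F, attaching by PID presumably still misses events that run from workqueues.

```python
from pathlib import Path
from typing import List


def find_guest_pids(guest_name: str, proc: Path = Path("/proc")) -> List[int]:
    """Return PIDs whose command line contains '-name <guest_name>'."""
    wanted = guest_name.encode()
    pids = []
    for entry in proc.iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/self
        try:
            # /proc/<pid>/cmdline separates arguments with NUL bytes.
            args = (entry / "cmdline").read_bytes().split(b"\0")
        except OSError:
            continue  # process exited or is not readable
        for i, arg in enumerate(args[:-1]):
            if arg == b"-name" and args[i + 1] == wanted:
                pids.append(int(entry.name))
                break
    return sorted(pids)


def trace_cmd_argv(pid: int, output: str = "trace.dat") -> List[str]:
    """Build a trace-cmd command line limited to one existing process.

    Assumes trace-cmd record accepts -e, -P and -o; check the man page.
    """
    return ["trace-cmd", "record", "-e", "kvm", "-P", str(pid), "-o", output]
```

The resulting list could be passed to subprocess.run() on the host as root 
once the flags have been checked against the local trace-cmd version.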

-- 
error compiling committee.c: too many arguments to function


* Re: kvm linux guest hanging for minutes at a time
  2011-08-09 14:37         ` Avi Kivity
@ 2011-08-09 14:46           ` Thomas Fjellstrom
  2011-08-09 14:49             ` Avi Kivity
  2011-08-10 15:03             ` Philipp Hahn
  2011-08-09 15:33           ` Nick
  1 sibling, 2 replies; 14+ messages in thread
From: Thomas Fjellstrom @ 2011-08-09 14:46 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm@vger.kernel.org

On August 9, 2011, Avi Kivity wrote:
> On 08/09/2011 05:31 PM, Thomas Fjellstrom wrote:
> > On August 9, 2011, Avi Kivity wrote:
> > >  On 08/09/2011 03:03 PM, Thomas Fjellstrom wrote:
> > >  >  >   Your guest isn't using kvmclock for some reason.  Is it
> > >  >  >   compiled in the guest kernel?  What are the contents of
> > >  >  >   /sys/devices/system/clocksource/clocksource0/available_clocksource
> > >  >  >   and
> > >  >  >   /sys/devices/system/clocksource/clocksource0/current_clocksource
> > >  >  >   (in the guest filesystem)?
> > >  >  
> > >  >  Hi, it seems it is using kvm-clock:
> > >  >  
> > >  >  $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > >  >  kvm-clock hpet acpi_pm
> > >  >  $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> > >  >  kvm-clock
> > >  
> > >  Yikes.  Please trace such a hang according to
> > >  http://www.linux-kvm.org/page/Tracing.
> > 
> > Does it matter that I have several vms running? Is there a way to limit
> > it to tracing the single kvm process that's been locking up?
> 
> You can use "trace-cmd record -F ... qemu ..." but that misses out on
> events the run from workqueues.
> 
> Best to stop those other guests.

I would prefer not to do that; those other guests are my web server, mail 
server, and database server. I have no idea if I can reproduce the problem in 
a reasonable time frame.

-- 
Thomas Fjellstrom
thomas@fjellstrom.ca

* Re: kvm linux guest hanging for minutes at a time
  2011-08-09 14:46           ` Thomas Fjellstrom
@ 2011-08-09 14:49             ` Avi Kivity
  2011-08-09 14:54               ` Thomas Fjellstrom
  2011-08-09 15:01               ` Thomas Fjellstrom
  2011-08-10 15:03             ` Philipp Hahn
  1 sibling, 2 replies; 14+ messages in thread
From: Avi Kivity @ 2011-08-09 14:49 UTC (permalink / raw)
  To: thomas; +Cc: kvm@vger.kernel.org

On 08/09/2011 05:46 PM, Thomas Fjellstrom wrote:
> >  >  Does it matter that I have several vms running? Is there a way to limit
> >  >  it to tracing the single kvm process that's been locking up?
> >
> >  You can use "trace-cmd record -F ... qemu ..." but that misses out on
> >  events the run from workqueues.
> >
> >  Best to stop those other guests.
>
> I would prefer not to do that, those other guests are my web server, mail
> server, and database server. I have no idea if I can reproduce the problem in
> a reasonable time frame.
>

Okay then, please use -F.

Please be sure to note the time at which the guest hangs so we can 
correlate it with the trace.
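
One way to capture that time, sketched here under assumptions: have the host 
record a timestamp every second or so while the guest answers some probe 
(a ping, say), then look for gaps afterwards. The probe itself is left out; 
find_hangs and max_gap are hypothetical names, and the timestamps are assumed 
to be host-side seconds.

```python
from typing import Iterable, List, Tuple


def find_hangs(
    heartbeats: Iterable[float], max_gap: float = 5.0
) -> List[Tuple[float, float]]:
    """Return (last_seen, next_seen) pairs where heartbeats stopped.

    A pair is reported whenever two consecutive timestamps are more than
    max_gap seconds apart.
    """
    hangs = []
    previous = None
    for stamp in heartbeats:
        if previous is not None and stamp - previous > max_gap:
            hangs.append((previous, stamp))
        previous = stamp
    return hangs
```

Each reported pair brackets a window in which the guest was unresponsive, 
which can then be lined up against the timestamps in the trace.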

-- 
error compiling committee.c: too many arguments to function


* Re: kvm linux guest hanging for minutes at a time
  2011-08-09 14:49             ` Avi Kivity
@ 2011-08-09 14:54               ` Thomas Fjellstrom
  2011-08-09 15:01               ` Thomas Fjellstrom
  1 sibling, 0 replies; 14+ messages in thread
From: Thomas Fjellstrom @ 2011-08-09 14:54 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm@vger.kernel.org

On August 9, 2011, Avi Kivity wrote:
> On 08/09/2011 05:46 PM, Thomas Fjellstrom wrote:
> > >  >  Does it matter that I have several vms running? Is there a way to
> > >  >  limit it to tracing the single kvm process that's been locking up?
> > >  
> > >  You can use "trace-cmd record -F ... qemu ..." but that misses out on
> > >  events the run from workqueues.
> > >  
> > >  Best to stop those other guests.
> > 
> > I would prefer not to do that, those other guests are my web server, mail
> > server, and database server. I have no idea if I can reproduce the
> > problem in a reasonable time frame.
> 
> Okay then, please use -F.
> 
> Note, please be sure to note the time the guest hangs so we can
> correlate it with the trace.

The fun part is that the last thing to cause a hang was an 'aptitude 
dist-upgrade', which updated the kernel and removed the running kernel, so if 
it has to be restarted, I won't be able to run again with the same kernel 
unless I can find the package for the old kernel somewhere.

-- 
Thomas Fjellstrom
thomas@fjellstrom.ca

* Re: kvm linux guest hanging for minutes at a time
  2011-08-09 14:49             ` Avi Kivity
  2011-08-09 14:54               ` Thomas Fjellstrom
@ 2011-08-09 15:01               ` Thomas Fjellstrom
  1 sibling, 0 replies; 14+ messages in thread
From: Thomas Fjellstrom @ 2011-08-09 15:01 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm@vger.kernel.org

On August 9, 2011, Avi Kivity wrote:
> On 08/09/2011 05:46 PM, Thomas Fjellstrom wrote:
> > >  >  Does it matter that I have several vms running? Is there a way to
> > >  >  limit it to tracing the single kvm process that's been locking up?
> > >  
> > >  You can use "trace-cmd record -F ... qemu ..." but that misses out on
> > >  events the run from workqueues.
> > >  
> > >  Best to stop those other guests.
> > 
> > I would prefer not to do that, those other guests are my web server, mail
> > server, and database server. I have no idea if I can reproduce the
> > problem in a reasonable time frame.
> 
> Okay then, please use -F.
> 
> Note, please be sure to note the time the guest hangs so we can
> correlate it with the trace.

Probably a stupid question, but what is the full syntax for the command? I 
only have kvm processes, and qemu is set to give the threads names of the 
form qemu:instance-name.

-- 
Thomas Fjellstrom
thomas@fjellstrom.ca

* Re: kvm linux guest hanging for minutes at a time
  2011-08-09 14:37         ` Avi Kivity
  2011-08-09 14:46           ` Thomas Fjellstrom
@ 2011-08-09 15:33           ` Nick
  2011-08-11  6:38             ` Avi Kivity
  1 sibling, 1 reply; 14+ messages in thread
From: Nick @ 2011-08-09 15:33 UTC (permalink / raw)
  To: Avi Kivity; +Cc: thomas, kvm@vger.kernel.org

Hi,

Just joined this list, looking for leads to solve a similar-sounding problem
(guest processes hanging for seconds or minutes when host IO load is high).
I'll say more in a separate email, but I caught the end of this thread and
wanted to ask about kvm-clock.

Naively I'd have thought that using the wrong clock would not actually *cause*
hangs like this. Or is that what you're implying?

Thanks

Nick

* Re: kvm linux guest hanging for minutes at a time
  2011-08-09 14:46           ` Thomas Fjellstrom
  2011-08-09 14:49             ` Avi Kivity
@ 2011-08-10 15:03             ` Philipp Hahn
  1 sibling, 0 replies; 14+ messages in thread
From: Philipp Hahn @ 2011-08-10 15:03 UTC (permalink / raw)
  To: Avi Kivity, kvm@vger.kernel.org

Hi,

> > > On August 9, 2011, Avi Kivity wrote:
> > > >  Yikes.  Please trace such a hang according to
> > > >  http://www.linux-kvm.org/page/Tracing.

I have observed the same problem multiple times now, and so has a colleague 
of mine. He is able to reproduce it every time. For me it manifests as the 
serial console hanging after reboot; for him it occurs after the reboot when 
a process does a sleep().
The problem occurs with clocksource=kvm-clock.
After the reboot tsc is disabled with the message "Clocksource tsc unstable 
(delta = 137303562 ns)" and is no longer shown in available_clocksource.
When booting with clocksource=hpet, the system does not hang.
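
To confirm which clocksource a guest was actually booted with, the kernel 
command line can be inspected. The Python sketch below is an illustration 
only: clocksource_from_cmdline is a hypothetical helper, and it assumes that 
a later clocksource= occurrence overrides an earlier one.

```python
from typing import Optional


def clocksource_from_cmdline(cmdline: str) -> Optional[str]:
    """Return the value of the last clocksource= parameter, or None."""
    value = None
    for token in cmdline.split():
        if token.startswith("clocksource="):
            value = token.split("=", 1)[1]
    return value
```

In the guest, the input would typically be the contents of /proc/cmdline; a 
result of None means no override was given and the kernel picked the 
clocksource itself.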

Our trace-cmd data is available from 
<http://download.univention.de/download/temp/kvm-clock/kvm-clock.tar.bz2>
including the trace.dat. show-task-states_portmap.txt and 
show-task-states_syslogd.txt contain the captured output of Alt-SysRq-t.

Both host and guest are running a 2.6.32.x amd64 kernel, qemu is 0.14.1. The 
system is currently not in production and can be used for experiments.

Sincerely
Philipp Hahn

PS: This issue is tracked in our German bugtracker at 
<https://forge.univention.org/bugzilla/show_bug.cgi?id=23258>
-- 
Philipp Hahn           Open Source Software Engineer      hahn@univention.de
Univention GmbH        Linux for Your Business        fon: +49 421 22 232- 0
Mary-Somerville-Str.1  D-28359 Bremen                 fax: +49 421 22 232-99
                                                   http://www.univention.de/


* Re: kvm linux guest hanging for minutes at a time
  2011-08-09 15:33           ` Nick
@ 2011-08-11  6:38             ` Avi Kivity
  2011-08-11 16:37               ` Thomas Fjellstrom
  0 siblings, 1 reply; 14+ messages in thread
From: Avi Kivity @ 2011-08-11  6:38 UTC (permalink / raw)
  To: Nick; +Cc: thomas, kvm@vger.kernel.org

On 08/09/2011 06:33 PM, Nick wrote:
> Hi,
>
> Just joined this list, looking for leads to solve a similar-sounding problem
> (guest processes hanging for seconds or minutes when host IO load is high).
> I'll say more in a separate email, but I caught the end of this thread and
> wanted to ask about kvm-clock.
>
> Naively I'd have thought that using the wrong clock would not actually *cause*
> hangs like this. Or is that what you're implying?
>
>

Using the wrong clock easily causes hangs.  The system schedules a 
wakeup in 3 ms, the wrong clock causes it to wake up in 3 years, and you 
get a hang for (3 years - 3 ms).
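
A toy model of that arithmetic, in Python: the timer is armed for 
"clock now + requested" and fires when the clock reading reaches that value, 
so a backward jump of the clock after arming delays the wakeup by the same 
amount. This is a deliberately simplified sketch and not how the kernel's 
timers are implemented; actual_sleep_ns is a hypothetical name, and the 
delta from the first message is reused purely as an example number.

```python
NS_PER_MS = 1_000_000
NS_PER_SECOND = 1_000_000_000


def actual_sleep_ns(requested_ns: int, clock_jump_ns: int) -> int:
    """Real time until wakeup if the clock jumps after the timer is armed.

    A negative clock_jump_ns (clock goes backward) lengthens the sleep;
    a forward jump shortens it, but never below zero.
    """
    return max(0, requested_ns - clock_jump_ns)


if __name__ == "__main__":
    # 3 ms requested, clock steps back by the delta seen in dmesg.
    slept = actual_sleep_ns(3 * NS_PER_MS, -747307707123)
    # prints: 747.3 s
    print(f"{slept / NS_PER_SECOND:.1f} s")
```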

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


* Re: kvm linux guest hanging for minutes at a time
  2011-08-11  6:38             ` Avi Kivity
@ 2011-08-11 16:37               ` Thomas Fjellstrom
  0 siblings, 0 replies; 14+ messages in thread
From: Thomas Fjellstrom @ 2011-08-11 16:37 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Nick, kvm@vger.kernel.org

On August 11, 2011, Avi Kivity wrote:
> On 08/09/2011 06:33 PM, Nick wrote:
> > Hi,
> > 
> > Just joined this list, looking for leads to solve a similar-sounding
> > problem (guest processes hanging for seconds or minutes when host IO
> > load is high). I'll say more in a separate email, but I caught the end
> > of this thread and wanted to ask about kvm-clock.
> > 
> > Naively I'd have thought that using the wrong clock would not actually
> > *cause* hangs like this. Or is that what you're implying?
> 
> Using the wrong clock easily causes hangs.  The system schedules a
> wakeup in 3 ms, wrong clock causes it to wakeup in 3 years, you get a
> hang for (3 years - 3 ms).

I am wondering, though, why the system time would be used to schedule wakeups 
when something more like the POSIX CLOCK_MONOTONIC would make more sense. You 
really don't want host clock changes (at least I don't think you do) to 
interfere with a guest so much that it sleeps forever.
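
To illustrate the distinction in userspace terms only (this says nothing 
about how the guest kernel arms its timers), here is a Python sketch of a 
sleep measured against a monotonic clock. On Linux, time.monotonic() is 
backed by CLOCK_MONOTONIC, so stepping the wall clock does not affect the 
deadline. sleep_until_monotonic is a hypothetical helper name.

```python
import time


def sleep_until_monotonic(delay_s: float) -> float:
    """Sleep for delay_s measured on the monotonic clock.

    Returns the elapsed monotonic time in seconds. The deadline is
    re-checked after every wakeup, so an early return from time.sleep()
    simply sleeps again for the remainder.
    """
    start = time.monotonic()
    deadline = start + delay_s
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        time.sleep(remaining)
    return time.monotonic() - start
```

A deadline computed from time.time() instead would move with any adjustment 
of the wall clock, which is the failure mode I am worried about.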

-- 
Thomas Fjellstrom
thomas@fjellstrom.ca
