public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
* Hangs
@ 2008-11-14 15:34 Chris Jones
  2008-11-16 16:36 ` Hangs Chris Jones
  2008-11-18 21:34 ` Hangs Marcelo Tosatti
  0 siblings, 2 replies; 26+ messages in thread
From: Chris Jones @ 2008-11-14 15:34 UTC (permalink / raw)
  To: kvm

I've set up a couple of virtual machines and they work great... for anywhere
between 2 and 24 hours.  But then, for no reason I can determine, they just go
100% busy and stop responding.

My basic setup is:
   Ubuntu 8.10 server running on both host and guests.
   kvm version is the one from the Ubuntu distribution (kvm-72)
   Kernel is Ubuntu 8.10 kernel (2.6.27-7-server)
   
The two VMs are both run like:
     kvm -daemonize \
     -hda Imgs/ndev_root.img \
     -m 1024 -cdrom ISOs/ubuntu-8.10-server-amd64.iso      \
     -vnc :1 -net nic,macaddr=DE:AD:BE:EF:04:04,model=e1000 \
     -net tap,ifname=tap1,script=/home/chris/kvm/qemu-ifup.sh 

(With different disk imgs, vnc addresses and macaddrs)

The disk images are raw format:
    qemu-img create -f raw Imgs/ndev_root.img 80G

I've tried running with these options, and in different combos:

    -no-kvm-pit 
    -no-kvm-irqchip
    -no-acpi 

They don't seem to help much.  If anything, -no-kvm-irqchip seems to cause more
trouble.

I tried running with -no-kvm and it won't boot at all.

I also built my own 2.6.27.4 kernel, from kernel.org, and built my own version
of kvm-78, but saw the same behavior.

Anyone have any advice on how I can resolve these hangs?  When the setup works,
it's a beautiful setup and I love kvm.  But with the 100% busy hangs I can't
really continue with it.

Thanks!
Chris



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-11-14 15:34 Hangs Chris Jones
@ 2008-11-16 16:36 ` Chris Jones
  2008-11-18 21:34 ` Hangs Marcelo Tosatti
  1 sibling, 0 replies; 26+ messages in thread
From: Chris Jones @ 2008-11-16 16:36 UTC (permalink / raw)
  To: kvm


Oops,

I was wrong: -no-kvm does boot and run cleanly (it's just insanely slow).  In
the other configurations the guests still eventually hang.

Any advice would be much appreciated.

Thanks,
Chris



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-11-14 15:34 Hangs Chris Jones
  2008-11-16 16:36 ` Hangs Chris Jones
@ 2008-11-18 21:34 ` Marcelo Tosatti
  2008-11-19 10:57   ` Hangs Roland Lammel
  1 sibling, 1 reply; 26+ messages in thread
From: Marcelo Tosatti @ 2008-11-18 21:34 UTC (permalink / raw)
  To: Chris Jones; +Cc: kvm

On Fri, Nov 14, 2008 at 03:34:57PM +0000, Chris Jones wrote:
> I've set up a couple of virtual machines and they work great... for anywhere
> between 2 and 24 hours.  But then, for no reason I can determine, they just
> go 100% busy and stop responding.

Hi Chris,

Can you please reproduce with kvm-79 and provide "kvm_stat -l"
(kvm-79/kvm_stat) output (for 10s or so).
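For anyone following along, a capture of that data might look like the sketch below; the path to the kvm_stat script inside the kvm-79 source tree and the exact log file name are assumptions, not a prescribed procedure.

```shell
# Hypothetical capture: run the kvm_stat shipped with kvm-79 in
# logging mode for ~10 seconds while the guest is hung, keeping a
# copy of the samples to post to the list (paths are assumptions).
timeout 10 ./kvm-79/kvm_stat -l | tee kvm_stat-hang.log
```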


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-11-18 21:34 ` Hangs Marcelo Tosatti
@ 2008-11-19 10:57   ` Roland Lammel
  2008-11-19 21:53     ` Hangs Roland Lammel
  0 siblings, 1 reply; 26+ messages in thread
From: Roland Lammel @ 2008-11-19 10:57 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Chris Jones, kvm

Sorry to repost under a different topic again, but it fits far better here.

I saw similar issues running a Debian lenny 2.6.26-1-amd64 64-bit KVM host
(currently on kvm-72) with Debian lenny 2.6.26-1-486 32-bit guests, so the
setup is similar to the Ubuntu one.

I have ntpd configured on the host system and the guest systems, but of course
ntpd crashes after such a severe clock jump.

The problem shows exactly the same symptoms, but my system is able to recover
from time to time, which allowed me to see the apparent cause: a severe
backward time jump (mostly to somewhere in Nov 1912).  It looks like a
backward shift from the current time (e.g. an integer overflow), and it causes
the VM to hang.

When it is able to recover, I see a very big clock jump (for the kernel timer
it is a forward jump, but it leaves the system clock in Nov 1912):
Nov 12 20:56:03 bit kernel: [   38.061596] warning: `ntpd' uses 32-bit
capabilities (legacy support in use)
Nov 13 06:25:03 bit kernel: imklog 3.18.2, log source = /proc/kmsg started.
Nov 30 06:25:48 bit kernel: imklog 3.18.2, log source = /proc/kmsg started.
Nov 30 06:25:48 bit kernel: imklog 3.18.2, log source = /proc/kmsg started.
Nov 30 06:25:51 bit kernel: [1266940721.901855] INFO: task
postdrop:19268 blocked for more than 120 seconds.
Nov 30 06:25:51 bit kernel: [1266940721.902793] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 30 06:25:51 bit kernel: [1266940721.905843] postdrop      D
c014f55e     0 19268  19267
Nov 30 06:25:51 bit kernel: [1266940721.906697]        dd8f9c00
00000086 00000000 c014f55e 54541a81 1194f8cd dd8f9d8c 00015f63
Nov 30 06:25:51 bit kernel: [1266940721.907799]        00000000
be709d78 be709d78 c657c3c4 dec3b400 c02a5b89 dd92ded4 dda43ed4
Nov 30 06:25:51 bit kernel: [1266940721.908245]        be709d78
c0121fc7 dd8f9c00 c03ec700 c02a5b84 74736f70 706f7264 642d7000
Nov 30 06:25:51 bit kernel: [1266940721.909838] Call Trace:
Nov 30 06:25:51 bit kernel: [1266940721.910957]  [<c014f55e>]
write_cache_pages+0x227/0x26d
Nov 30 06:25:51 bit kernel: [1266940721.911801]  [<c02a5b89>]
schedule_timeout+0x69/0x86
Nov 30 06:25:51 bit kernel: [1266940721.912646]  [<c0121fc7>]
process_timeout+0x0/0x5
Nov 30 06:25:51 bit kernel: [1266940721.913463]  [<c02a5b84>]
schedule_timeout+0x64/0x86
Nov 30 06:25:51 bit kernel: [1266940721.914288]  [<e00852e4>]
journal_stop+0x7d/0x12b [jbd]
Nov 30 06:25:51 bit kernel: [1266940721.915134]  [<c017bfcd>]
__writeback_single_inode+0x13f/0x231
Nov 30 06:25:51 bit kernel: [1266940721.916017]  [<c014f5ee>]
do_writepages+0x29/0x30
Nov 30 06:25:51 bit kernel: [1266940721.916834]  [<c014ace8>]
__filemap_fdatawrite_range+0x65/0x70
Nov 30 06:25:51 bit kernel: [1266940721.917722]  [<e00fbeab>]
ext3_sync_file+0x87/0x9c [ext3]
Nov 30 06:25:51 bit kernel: [1266940721.918580]  [<c017e6f0>] do_fsync+0x3d/0x7e
Nov 30 06:25:51 bit kernel: [1266940721.919356]  [<c017e74e>]
__do_fsync+0x1d/0x2b
Nov 30 06:25:51 bit kernel: [1266940721.920142]  [<c010372f>]
sysenter_past_esp+0x78/0xb9
Nov 30 06:25:51 bit kernel: [1266940721.920993]  =======================

The guest is not really usable any more, as almost all disk I/O (mostly writes,
but also reads) tends to hang the system completely.

I have now manually compiled kvm-79 (including the kernel modules) and am
running 3 instances on it.  None of them has crashed yet, but it has only been
20 hours.

For me, the ping check is actually enough to detect whether a guest is still
OK, and I'll probably use mon or something similar to just shut down and
restart the instance.
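A minimal sketch of such a ping-based liveness check, assuming a hypothetical guest address and leaving the actual recovery action (mon, restarting the kvm process, etc.) as a stub:

```shell
#!/bin/sh
# Minimal liveness probe in the spirit described above: one ping with
# a short timeout decides whether the guest counts as alive.
guest_alive() {
    # $1 = guest IP/hostname; returns 0 if it answers one ping in 2s
    ping -c 1 -W 2 "$1" > /dev/null 2>&1
}

GUEST=${1:-127.0.0.1}   # hypothetical guest address
if guest_alive "$GUEST"; then
    echo "$GUEST alive"
else
    echo "$GUEST hung"  # a real watchdog would shut down/restart here
fi
```

In a real setup this would run from cron or a monitor daemon against each guest's address.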

From what I remember, kvm_stat suddenly dropped all counters to 0 (the chance
of a hang increases with heavy disk I/O) while there was only minor block
device activity.  I'll try to reproduce it and provide the kvm_stat output too.

Cheers

+rl


On Tue, Nov 18, 2008 at 10:34 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Fri, Nov 14, 2008 at 03:34:57PM +0000, Chris Jones wrote:
> > I've set up a couple of virtual machines and they work great... for anywhere
> > between 2 and 24 hours.  But then, for no reason I can determine, they just
> > go 100% busy and stop responding.
>
> Hi Chris,
>
> Can you please reproduce with kvm-79 and provide "kvm_stat -l"
> (kvm-79/kvm_stat) output (for 10s or so).
>

-- 
Roland Lammel
QuikIT - IT Lösungen - flexibel und schnell
Web: http://www.quikit.at
Email: info@quikit.at

"Enjoy your job, make lots of money, work within the law. Choose any two."

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-11-19 10:57   ` Hangs Roland Lammel
@ 2008-11-19 21:53     ` Roland Lammel
       [not found]       ` <20081120015600.GB10846@dmt.cnet>
  0 siblings, 1 reply; 26+ messages in thread
From: Roland Lammel @ 2008-11-19 21:53 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Chris Jones, kvm

Actually, it just happened again with the host running kvm-79.  Host CPU is at
100% but I'm still able to log in (it recovered from the first hang).  I'm not
able to start e.g. top, though.  Writing to disk works for e.g. a 1MB dd of
/dev/zero to /tmp/test.file; a 1GB dd already caused the instance to hang.

In the guest I see:

The soft lockup of CPU#0 (only 1 CPU is assigned to the guest) seems to
either be caused by, or itself cause, the clock problem:

bit:~# date
Fri Dec  6 13:50:40 CET 1912

[   57.348217] eth1: no IPv6 routers present
[1266956800.037898] BUG: soft lockup - CPU#0 stuck for 1179869795s!
[logcheck:23795]
[1266956800.037898] Modules linked in: ipv6 dm_snapshot dm_mirror
dm_log dm_mod loop virtio_balloon serio_raw snd_pcsp virtio_net
snd_pcm snd_timer snd soundcore psmouse snd_page_alloc evdev ext3 jbd
mbcache ide_cd_mod cdrom ata_generic libata scsi_mod dock
ide_pci_generic virtio_blk uhci_hcd usbcore piix ide_core virtio_pci
thermal_sys
[1266956800.037898]
[1266956800.037898] Pid: 23795, comm: logcheck Not tainted (2.6.26-1-486 #1)
[1266956800.037898] EIP: 0060:[<c0118c28>] EFLAGS: 00000246 CPU: 0
[1266956800.037898] EIP is at finish_task_switch+0x20/0x78
[1266956800.037898] EAX: c03cb620 EBX: de82e800 ECX: 00000003 EDX: de824000
[1266956800.037898] ESI: 00000000 EDI: de824000 EBP: 00000000 ESP: c51e1f9c
[1266956800.037898]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[1266956800.037898] CR0: 8005003b CR2: 088a711c CR3: 04f08000 CR4: 00000690
[1266956800.037898] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[1266956800.037898] DR6: ffff0ff0 DR7: 00000400
[1266956800.037898]  [<c0118e22>] schedule_tail+0xe/0x39
[1266956800.037898]  [<c0103646>] ret_from_fork+0x6/0x20
[1266956800.037898]  =======================
[1266957741.780134] INFO: task postdrop:24584 blocked for more than 120 seconds.
[1266957741.780592] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[1266957741.781276] postdrop      D c014f55e     0 24584  24583
[1266957741.781696]        de894000 00000086 00000000 c014f55e
49506ad8 1195233c de89418c 00013315
[1266957741.782256]        00000000 bf229803 bf229803 d41593ec
ddb2d400 c02a5b89 c03ec750 ddbf319c
[1266957741.783040]        bf229803 c0121fc7 de894000 c03ec700
c02a5b84 74736f70 706f7264 642d7000


On the host (with 2 other guests running under no load), kvm_stat shows:
kvm statistics

 efer_reload                  0       0
 exits                 83407498    2353
 fpu_reload              399797      17
 halt_exits             2079501      18
 halt_wakeup              67976       0
 host_state_reload      5925084      38
 hypercalls            20515664       0
 insn_emulation        22039607    1318
 insn_emulation_fail          0       0
 invlpg                 3237765       0
 io_exits               3745042       0
 irq_exits              1845939     722
 irq_injections         3705460     704
 irq_window                   0       0
 kvm_request_irq              0       0
 largepages                   0       0
 mmio_exits               82282       0
 mmu_cache_miss         1533699       0
 mmu_flooded            1194315       0
 mmu_pde_zapped         1468996       0
 mmu_pte_updated       18613367       0
 mmu_pte_write         28421353       0
 mmu_recycled                 0       0
 mmu_shadow_zapped      1779168       0
 mmu_unsync                  18       0
 nmi_injections               0       0
 nmi_window                   0       0
 pf_fixed              15449889       0
 pf_guest              18947094       0
 remote_tlb_flush             2       0
 request_nmi                  0       0
 signal_exits                 0       0
 tlb_flush             25533150     255

Clock sources used are (for host and guest):
host:~# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
acpi_pm
host:~# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
acpi_pm jiffies tsc

guest:~# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
kvm-clock
guest:~# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
kvm-clock jiffies tsc

The command line used to start it (kvm-79) is:
/usr/local/bin/qemu-system-x86_64 -S -M pc -m 500 -smp 1 -name bit
-monitor pty -no-acpi -boot c -drive
file=/var/kvm/bit.img,if=virtio,index=0,boot=on -net
nic,macaddr=24:42:53:21:52:45,vlan=0,model=virtio -net
tap,fd=11,script=,vlan=0,ifname=vnet0 -serial tcp:127.0.0.1:50401
-parallel none -usb -vnc 0.0.0.0:45001

Hope that helps.  Cheers,

+rl
-- 
Roland Lammel
QuikIT - IT Lösungen - flexibel und schnell
Web: http://www.quikit.at
Email: info@quikit.at

"Enjoy your job, make lots of money, work within the law. Choose any two."

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
@ 2008-11-19 22:43 chris
  2008-11-20 17:10 ` Hangs chris
  0 siblings, 1 reply; 26+ messages in thread
From: chris @ 2008-11-19 22:43 UTC (permalink / raw)
  To: kvm, mtosatti, rl; +Cc: chris

Thanks for the responses,

I'm not sure if my problem is the same as Roland's, but it definitely sounds
plausible.  I had been running ntpdate on the host to synchronize time every
hour (via a cron job), so it sounds as if we could be seeing the same issue.

I'll spend some time trying to reproduce it with and without the ntpdate cron
job and see if I can get it to go away.

As for providing kvm_stat output - is kvm-78 OK?  On kvm-79, I've been
hitting the same compiler errors as described in Bug #2287677
(undeclared kvm_arch_do_ioperm).

Thanks,
Chris

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-11-19 22:43 Hangs chris
@ 2008-11-20 17:10 ` chris
  2008-11-21 19:32   ` Hangs Marcelo Tosatti
  0 siblings, 1 reply; 26+ messages in thread
From: chris @ 2008-11-20 17:10 UTC (permalink / raw)
  To: kvm, mtosatti, rl

On Wed, Nov 19, 2008 at 02:43:42PM -0800, chris@versecorp.net wrote:
> Thanks for the responses,
> 
> I'm not sure if my problem is the same as Roland's, but it definitely sounds
> plausible.  I had been running ntpdate on the host to synchronize time every
> hour (via a cron job), so it sounds as if we could be seeing the same issue.
> 

Actually, with ntpdate taken out of the crontab, I'm still seeing periodic
hangs, so it's either a different problem or I'm hitting it in a different
manner.

OK, I installed kvm-79 and kernel 2.6.27.6, and here's the kvm_stat output
with 1 guest hung and 3 more operational:

 efer_relo      exits  fpu_reloa  halt_exit  halt_wake  host_stat  hypercall  insn_emul  insn_emul     invlpg   io_exits  irq_exits  irq_windo  largepage  mmio_exit  mmu_cache  mmu_flood  mmu_pde_z  mmu_pte_u  mmu_pte_w  mmu_recyc  mmu_shado  nmi_windo   pf_fixed   pf_guest  remote_tl  request_i  signal_ex  tlb_flush
         0        333         24         32          0        331          2        212          0          0         78          4          0          0        188          0          0          0          2          2          0          0          0          1          1          0          0          0         12
         0        360          3         30          0        331          0        290          0          0          0          4          0          0        269          0          0          0          5          5          0          0          0         35          5          0          0          0         15
         0        287          2         30          0        307          0        202          0          0         52          2          0          0        194          0          0          0          0          0          0          0          0          0          0          0          0          0          4
         0        389         20         29          0        405          0        277          0          0         78          3          0          0        267          0          0          0          0          0          0          0          0          0          0          0          0          0          6
         0        307          4         32          0        315          0        219          0          0         52          3          0          0        198          0          0          0          0          0          0          0          0          0          0          0          0          0         11
         0        327          2         35          0        346          2        285          0          0          0          4          0          0        274          0          0          0          2          2          0          0          0          1          1          0          0          0          7
         0        334         22         31          0        342          0        217          0          0         78          4          0          0        201          0          0          0          0          0          0          0          0          0          0          0          0          0          8
         0        311          3         28          0        324          0        280          0          0          0          2          0          0        265          0          0          0          0          0          0          0          0          0          0          0          0          0          9
         0        292          2         32          0        313          0        204          0          0         52          3          0          0        196          0          0          0          0          0          0          0          0          0          0          0          0          0          4
         0        791         23         46          0        780         10        352          0          0        364          6          0          0        320          0          0          0         10         10          0          0          0          5          5          0          0          0         20
         0        251          3         30          0        259          2        214          0          0          0          4          0          0        198          0          0          0          2          2          0          0          0          1          1          0          0          0         10
         0        313          2         31          0        330          0        278          0          0          0          4          0          0        266          0          0          0          0          0          0          0          0          0          0          0          0          0          6
         0        330         22         30          0        339          0        215          0          0         78          3          0          0        199          0          0          0          0          0          0          0          0          0          0          0          0          0          8

If I shut down the 3 operational guests, leaving just the hung guest, the
kvm_stat output is all 0s:

 efer_relo      exits  fpu_reloa  halt_exit  halt_wake  host_stat  hypercall  insn_emul  insn_emul     invlpg   io_exits  irq_exits  irq_windo  largepage  mmio_exit  mmu_cache  mmu_flood  mmu_pde_z  mmu_pte_u  mmu_pte_w  mmu_recyc  mmu_shado  nmi_windo   pf_fixed   pf_guest  remote_tl  request_i  signal_ex  tlb_flush
         0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0


The hung guest in this case was run with this command:

sudo /usr/local/bin/qemu-system-x86_64             \
     -daemonize                                    \
     -no-kvm-irqchip                               \
     -hda Imgs/ndev_root.img                       \
     -m 1024                                       \
     -cdrom ISOs/ubuntu-8.10-server-amd64.iso      \
     -vnc :4                                       \
     -net nic,macaddr=DE:AD:BE:EF:04:04,model=e1000 \
     -net tap,ifname=tap4,script=/home/chris/kvm/qemu-ifup.sh \
     >>& Logs/ndev_run.log


I should also mention that when the guest is hung, I can still switch to the
monitor with Ctrl-Alt-2.  So at least it's a little bit alive.

I've also noticed that the behavior with the hung guest is slightly different
on kvm-79 than it was earlier.  When the guest hangs, the kvm process in the
host no longer spins at 100% busy - the guest is just unresponsive at both the
network and the VNC console.

Also, I've noticed that if I reset the guest from the monitor, the guest will
boot up again and I can reach it on the network, but strangely, the mouse and
keyboard will still be hung at the VNC console (except that I can still switch
back and forth to the monitor).

Hope some of this helps; let me know if you need me to provide any other
troubleshooting info.

Chris

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
       [not found]       ` <20081120015600.GB10846@dmt.cnet>
@ 2008-11-21 15:55         ` Glauber Costa
  2008-11-21 20:29           ` Hangs Roland Lammel
  0 siblings, 1 reply; 26+ messages in thread
From: Glauber Costa @ 2008-11-21 15:55 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Roland Lammel, Chris Jones, kvm, Glauber de Oliveira Costa,
	Gerd Hoffmann

On Thu, Nov 20, 2008 at 02:56:00AM +0100, Marcelo Tosatti wrote:
> 
> On Wed, Nov 19, 2008 at 10:53:27PM +0100, Roland Lammel wrote:
> > Actually, it just happened again with the host running kvm-79.  Host CPU
> > is at 100% but I'm still able to log in (it recovered from the first hang).
> > I'm not able to start e.g. top, though.  Writing to disk works for e.g. a
> > 1MB dd of /dev/zero to /tmp/test.file; a 1GB dd already caused the
> > instance to hang.
> > 
> > In the guest I see:
> > 
> > The soft lockup of CPU#0 (only 1 CPU is assigned to the guest) seems to
> > either be caused by, or itself cause, the clock problem:
> > 
> > bit:~# date
> > Fri Dec  6 13:50:40 CET 1912
> > 
> > [   57.348217] eth1: no IPv6 routers present
> > [1266956800.037898] BUG: soft lockup - CPU#0 stuck for 1179869795s!
> 
> Funny. Glauber, Gerd?
So, can you provide a more informative dmesg?  It doesn't need to be a full
dmesg, but something more than these two messages would help, especially
because they have printk timestamps on them.  It seems to me that our
sched_clock went crazy, since the timestamp in the second printk is so much
bigger than the first, and it never changes after that.
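As a back-of-the-envelope check on why those timestamps look broken: read as uptime in seconds, the printk prefixes quoted in this thread would mean the guest had been up for decades. A quick awk calculation (numbers taken from the logs above):

```shell
# Sanity check on the printk timestamps quoted in the guest logs.
# If the prefix really were monotonic uptime in seconds, the guest
# would have been up for ~40 years, so sched_clock has clearly jumped.
awk 'BEGIN {
    spy = 365.25 * 24 * 3600                       # seconds per year
    printf "printk timestamp:     %.1f years\n", 1266956800.037898 / spy
    printf "soft-lockup duration: %.1f years\n", 1179869795 / spy
}'
```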

Do you have an older version of both host/guest in which it used to work?

> 
> > [logcheck:23795]
> > [1266956800.037898] Modules linked in: ipv6 dm_snapshot dm_mirror
> > dm_log dm_mod loop virtio_balloon serio_raw snd_pcsp virtio_net
> > snd_pcm snd_timer snd soundcore psmouse snd_page_alloc evdev ext3 jbd
> > mbcache ide_cd_mod cdrom ata_generic libata scsi_mod dock
> > ide_pci_generic virtio_blk uhci_hcd usbcore piix ide_core virtio_pci
> > thermal_sys
> > [1266956800.037898]
> > [1266956800.037898] Pid: 23795, comm: logcheck Not tainted (2.6.26-1-486 #1)
> > [1266956800.037898] EIP: 0060:[<c0118c28>] EFLAGS: 00000246 CPU: 0
> > [1266956800.037898] EIP is at finish_task_switch+0x20/0x78
> > [1266956800.037898] EAX: c03cb620 EBX: de82e800 ECX: 00000003 EDX: de824000
> > [1266956800.037898] ESI: 00000000 EDI: de824000 EBP: 00000000 ESP: c51e1f9c
> > [1266956800.037898]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> > [1266956800.037898] CR0: 8005003b CR2: 088a711c CR3: 04f08000 CR4: 00000690
> > [1266956800.037898] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > [1266956800.037898] DR6: ffff0ff0 DR7: 00000400
> > [1266956800.037898]  [<c0118e22>] schedule_tail+0xe/0x39
> > [1266956800.037898]  [<c0103646>] ret_from_fork+0x6/0x20
> > [1266956800.037898]  =======================
> > [1266957741.780134] INFO: task postdrop:24584 blocked for more than 120 seconds.
> > [1266957741.780592] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [1266957741.781276] postdrop      D c014f55e     0 24584  24583
> > [1266957741.781696]        de894000 00000086 00000000 c014f55e
> > 49506ad8 1195233c de89418c 00013315
> > [1266957741.782256]        00000000 bf229803 bf229803 d41593ec
> > ddb2d400 c02a5b89 c03ec750 ddbf319c
> > [1266957741.783040]        bf229803 c0121fc7 de894000 c03ec700
> > c02a5b84 74736f70 706f7264 642d7000
> >
> > Clock sources used are (for host and guest):
> > host:~# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> > acpi_pm
> > host:~# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > acpi_pm jiffies tsc
> > 
> > guest:~# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> > kvm-clock
> > guest:~# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > kvm-clock jiffies tsc
> > bit:~#
> >
> > Commandline for starting is (kvm-79):
> > /usr/local/bin/qemu-system-x86_64 -S -M pc -m 500 -smp 1 -name bit
> > -monitor pty -no-acpi -boot c -drive
> > file=/var/kvm/bit.img,if=virtio,index=0,boot=on -net
> > nic,macaddr=24:42:53:21:52:45,vlan=0,model=virtio -net
> > tap,fd=11,script=,vlan=0,ifname=vnet0 -serial tcp:127.0.0.1:50401
> > -parallel none -usb -vnc 0.0.0.0:45001
> 
> Why are you using -no-acpi? Perhaps switch the guest to acpi_pm to isolate
> kvm-clock issues?
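Concretely, that suggested isolation test might look like the following inside the guest (the sysfs path is the one quoted earlier in the thread; whether acpi_pm shows up as available depends on the guest kernel and on dropping -no-acpi):

```shell
# Run inside the guest, as root.  Requires booting without -no-acpi
# so the guest actually has an ACPI PM timer to fall back on.
cd /sys/devices/system/clocksource/clocksource0
cat available_clocksource            # should now list acpi_pm as well
echo acpi_pm > current_clocksource   # switch away from kvm-clock
cat current_clocksource
```

If the hangs stop with acpi_pm, that points at kvm-clock rather than the rest of the stack.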

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-11-20 17:10 ` Hangs chris
@ 2008-11-21 19:32   ` Marcelo Tosatti
  2008-11-21 23:43     ` Hangs Roland Lammel
  2008-11-22 17:54     ` Hangs chris
  0 siblings, 2 replies; 26+ messages in thread
From: Marcelo Tosatti @ 2008-11-21 19:32 UTC (permalink / raw)
  To: chris; +Cc: kvm, rl

On Thu, Nov 20, 2008 at 09:10:57AM -0800, chris@versecorp.net wrote:
> On Wed, Nov 19, 2008 at 02:43:42PM -0800, chris@versecorp.net wrote:
> > Thanks for the responses,
> > 
> > I'm not sure if my problem is the same as Roland's, but it definitely sounds
> > plausible.  I had been running ntpdate on the host to synchronize time every
> > hour (via a cron job), so it sounds as if we could be seeing the same issue.
> > 
> 
> Actually, with ntpdate taken out of crontab, I'm still seeing periodic
> hangs, so it's either a different problem or I'm hitting it in a
> different manner.
> 
> OK, I installed kvm-79 and kernel 2.6.27.6, and here's the kvm_stat output
> with 1 guest hung and 3 more operational:

<snip>

> If I shut down the 3 operational guests leaving just the hung guest, the
> kvm-stat output is all 0s:
> 
>  efer_relo      exits  fpu_reloa  halt_exit  halt_wake  host_stat  hypercall  insn_emul  insn_emul     invlpg   io_exits  irq_exits  irq_windo  largepage  mmio_exit  mmu_cache  mmu_flood  mmu_pde_z  mmu_pte_u  mmu_pte_w  mmu_recyc  mmu_shado  nmi_windo   pf_fixed   pf_guest  remote_tl  request_i  signal_ex  tlb_flush
>          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0

So the guest is not actually running here, which means it's QEMU itself that
is hanging.

> The hung guest in this case was run with this command:
> 
> sudo /usr/local/bin/qemu-system-x86_64             \
>      -daemonize                                    \
>      -no-kvm-irqchip                               \
>      -hda Imgs/ndev_root.img                       \
>      -m 1024                                       \
>      -cdrom ISOs/ubuntu-8.10-server-amd64.iso      \
>      -vnc :4                                       \
>      -net nic,macaddr=DE:AD:BE:EF:04:04,model=e1000 \
>      -net tap,ifname=tap4,script=/home/chris/kvm/qemu-ifup.sh \
>      >>& Logs/ndev_run.log
> 
> 
> I should also mention that when the guest is hung, I can still switch
> to the monitor with Ctrl-Alt-2.  So at least it's a little bit alive.

In a coma, perhaps.

> I've also noticed that the behavior with the hung guest is slightly
> different on kvm-79 than it was earlier. When the guest hangs, the kvm
> process in the host doesn't spin at 100% busy any longer - the guest is
> just unresponsive at both the network and VNC console.

> Also, I've noticed that if I reset the guest from the monitor, the
> guest will boot up again, and I can get through to it on the network,
> but strangely, the mouse and keyboard will still be hung at the
> VNC console (except that I can still switch back and forth to the
> monitor).
>
> Hope some of this helps; let me know if you need me to provide any
> other troubleshooting info.

$ gdb -p pid-of-qemu

(gdb) info threads

Print the backtrace for every thread with:

(gdb) thread N
(gdb) bt
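A non-interactive variant of the same procedure can grab every thread's backtrace in one shot (the binary name passed to pidof is an assumption; substitute the actual qemu process or pid):

```shell
# One-shot version of the gdb session above: attach to the qemu
# process, dump a backtrace for every thread, then detach.
# The process name given to pidof is an assumption.
gdb -batch -ex 'thread apply all bt' -p "$(pidof qemu-system-x86_64)"
```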


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-11-21 15:55         ` Hangs Glauber Costa
@ 2008-11-21 20:29           ` Roland Lammel
  2008-11-21 21:01             ` Hangs Daniel P. Berrange
  0 siblings, 1 reply; 26+ messages in thread
From: Roland Lammel @ 2008-11-21 20:29 UTC (permalink / raw)
  To: Glauber Costa
  Cc: Marcelo Tosatti, Chris Jones, kvm, Glauber de Oliveira Costa,
	Gerd Hoffmann

[-- Attachment #1: Type: text/plain, Size: 5763 bytes --]

On Fri, Nov 21, 2008 at 4:55 PM, Glauber Costa <glommer@redhat.com> wrote:
> On Thu, Nov 20, 2008 at 02:56:00AM +0100, Marcelo Tosatti wrote:
>>
>> On Wed, Nov 19, 2008 at 10:53:27PM +0100, Roland Lammel wrote:
>> > Actually, it just happened again with the host running kvm-79.  Host CPU
>> > is at 100% but I'm still able to log in (it recovered from the first
>> > hang).  I'm not able to start e.g. top, though.  Writing to disk works
>> > for e.g. a 1MB dd of /dev/zero to /tmp/test.file; a 1GB dd already
>> > caused the instance to hang.
>> >
>> > In the guest I see:
>> >
>> > The soft lockup of CPU#0 (only 1 CPU is assigned to the guest) seems to
>> > either be caused by, or itself cause, the clock problem:
>> >
>> > bit:~# date
>> > Fri Dec  6 13:50:40 CET 1912
>> >
>> > [   57.348217] eth1: no IPv6 routers present
>> > [1266956800.037898] BUG: soft lockup - CPU#0 stuck for 1179869795s!
>>
>> Funny. Glauber, Gerd?
> So, can you provide a more informative dmesg?  It doesn't need to be a full
> dmesg, but something more than these two messages would help, especially
> because they have printk timestamps on them.  It seems to me that our
> sched_clock went crazy, since the timestamp in the second printk is so much
> bigger than the first, and it never changes after that.
>

Of course. I have 3 guests running, all with the same guest configuration
(Debian 32-bit); I'll now enable a 4th guest with Debian 64-bit to see if
that makes any difference. Hosts usually crash within 12-48 hours
(although one has been running for 50 hours right now).

Attached are the full dmesg of the host system and the guest's kern.log,
which logged the CPU lockup.

> Do you have an older version of both host/guest in which it used to work?

Actually no - I just started out with KVM. I've been using Xen until
now, but not on this particular machine.

>>
>> > [logcheck:23795]
>> > [1266956800.037898] Modules linked in: ipv6 dm_snapshot dm_mirror
>> > dm_log dm_mod loop virtio_balloon serio_raw snd_pcsp virtio_net
>> > snd_pcm snd_timer snd soundcore psmouse snd_page_alloc evdev ext3 jbd
>> > mbcache ide_cd_mod cdrom ata_generic libata scsi_mod dock
>> > ide_pci_generic virtio_blk uhci_hcd usbcore piix ide_core virtio_pci
>> > thermal_sys
>> > [1266956800.037898]
>> > [1266956800.037898] Pid: 23795, comm: logcheck Not tainted (2.6.26-1-486 #1)
>> > [1266956800.037898] EIP: 0060:[<c0118c28>] EFLAGS: 00000246 CPU: 0
>> > [1266956800.037898] EIP is at finish_task_switch+0x20/0x78
>> > [1266956800.037898] EAX: c03cb620 EBX: de82e800 ECX: 00000003 EDX: de824000
>> > [1266956800.037898] ESI: 00000000 EDI: de824000 EBP: 00000000 ESP: c51e1f9c
>> > [1266956800.037898]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
>> > [1266956800.037898] CR0: 8005003b CR2: 088a711c CR3: 04f08000 CR4: 00000690
>> > [1266956800.037898] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
>> > [1266956800.037898] DR6: ffff0ff0 DR7: 00000400
>> > [1266956800.037898]  [<c0118e22>] schedule_tail+0xe/0x39
>> > [1266956800.037898]  [<c0103646>] ret_from_fork+0x6/0x20
>> > [1266956800.037898]  =======================
>> > [1266957741.780134] INFO: task postdrop:24584 blocked for more than 120 seconds.
>> > [1266957741.780592] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> > disables this message.
>> > [1266957741.781276] postdrop      D c014f55e     0 24584  24583
>> > [1266957741.781696]        de894000 00000086 00000000 c014f55e
>> > 49506ad8 1195233c de89418c 00013315
>> > [1266957741.782256]        00000000 bf229803 bf229803 d41593ec
>> > ddb2d400 c02a5b89 c03ec750 ddbf319c
>> > [1266957741.783040]        bf229803 c0121fc7 de894000 c03ec700
>> > c02a5b84 74736f70 706f7264 642d7000
>> >
>> > Clock sources used are (for host and guest):
>> > host:~# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>> > acpi_pm
>> > host:~# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
>> > acpi_pm jiffies tsc
>> >
>> > guest:~# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>> > kvm-clock
>> > guest:~# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
>> > kvm-clock jiffies tsc
>> > bit:~#
>> >
>> > Commandline for starting is (kvm-79):
>> > /usr/local/bin/qemu-system-x86_64 -S -M pc -m 500 -smp 1 -name bit
>> > -monitor pty -no-acpi -boot c -drive
>> > file=/var/kvm/bit.img,if=virtio,index=0,boot=on -net
>> > nic,macaddr=24:42:53:21:52:45,vlan=0,model=virtio -net
>> > tap,fd=11,script=,vlan=0,ifname=vnet0 -serial tcp:127.0.0.1:50401
>> > -parallel none -usb -vnc 0.0.0.0:45001
>>
>> Why are you using -no-acpi? Perhaps switch the guest to acpi_pm to isolate
>> kvm-clock issues?

I've changed one guest to use ACPI and the acpi_pm clock source.

Running now with:
/usr/local/bin/qemu-system-x86_64 -S -M pc -m 500 -smp 1 -name bit
-monitor pty -no-acpi -boot c -drive
file=/var/kvm/bit.img,if=virtio,index=0,boot=on -net
nic,macaddr=24:42:53:21:52:45,vlan=0,model=virtio -net
tap,fd=11,script=,vlan=0,ifname=vnet0 -serial tcp:127.0.0.1:50401
-parallel none -usb -vnc 0.0.0.0:45001

Actually I had similar problems when using kvm-72 and ACPI. After
switching to libvirt for configuration, it automatically changed to
-no-acpi, which I just left that way to try it. I've read mixed
recommendations concerning ACPI.

So should ACPI be enabled generally (for modern kernels and systems)?
Should ntpd be running on the guests, or just on the host, when using acpi_pm?


Cheers and thanks

+rl

-- 
Roland Lammel
QuikIT - IT Lösungen - flexibel und schnell
Web: http://www.quikit.at
Email: info@quikit.at

"Enjoy your job, make lots of money, work within the law. Choose any two."

[-- Attachment #2: dmesg_host.gz --]
[-- Type: application/x-gzip, Size: 10620 bytes --]

[-- Attachment #3: kern.log.gz --]
[-- Type: application/x-gzip, Size: 8371 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-11-21 20:29           ` Hangs Roland Lammel
@ 2008-11-21 21:01             ` Daniel P. Berrange
  2008-11-21 23:46               ` Hangs Roland Lammel
  0 siblings, 1 reply; 26+ messages in thread
From: Daniel P. Berrange @ 2008-11-21 21:01 UTC (permalink / raw)
  To: Roland Lammel
  Cc: Glauber Costa, Marcelo Tosatti, Chris Jones, kvm,
	Glauber de Oliveira Costa, Gerd Hoffmann

On Fri, Nov 21, 2008 at 09:29:43PM +0100, Roland Lammel wrote:
> On Fri, Nov 21, 2008 at 4:55 PM, Glauber Costa <glommer@redhat.com> wrote:
> > On Thu, Nov 20, 2008 at 02:56:00AM +0100, Marcelo Tosatti wrote:
> >>
> >> Why are you using -no-acpi? Perhaps switch the guest to acpi_pm to isolate
> >> kvm-clock issues?
> 
> I've changed one guest to use acpi and use the acpi_pm clock source.
> 
> Running now with:
> /usr/local/bin/qemu-system-x86_64 -S -M pc -m 500 -smp 1 -name bit
> -monitor pty -no-acpi -boot c -drive
> file=/var/kvm/bit.img,if=virtio,index=0,boot=on -net
> nic,macaddr=24:42:53:21:52:45,vlan=0,model=virtio -net
> tap,fd=11,script=,vlan=0,ifname=vnet0 -serial tcp:127.0.0.1:50401
> -parallel none -usb -vnc 0.0.0.0:45001
> 
> Actually I had similar problems when using kvm-72 and acpi, and after
> switching to libvirt for configuration it automatically changed to
> no-acpi which I just left that way to try it. I've read mixed
> recommendations concerning ACPI.

FYI, libvirt allows you to turn ACPI on or off. In the XML document,
the top-level <domain> tag can contain a set of features, one of
which is ACPI, eg.


  <domain type='kvm'>
    <name>foo</name>
    .....
    <features>
      <acpi/>
    </features>
    ....
  </domain>

If the '<acpi/>' tag is left out, libvirt will add -no-acpi to the
QEMU command line.

For more information check out the docs here:

  http://libvirt.org/formatdomain.html#elementsFeatures
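A domain definition can also be checked or fixed up programmatically before feeding it to libvirt. A minimal sketch using Python's standard xml.etree; the ensure_acpi helper is my own invention, and the input is the abbreviated example above, not a complete libvirt domain document:

```python
import xml.etree.ElementTree as ET

def ensure_acpi(domain_xml):
    """Return the domain XML with <features><acpi/></features> present,
    so libvirt will not pass -no-acpi to QEMU."""
    root = ET.fromstring(domain_xml)
    features = root.find("features")
    if features is None:
        features = ET.SubElement(root, "features")
    if features.find("acpi") is None:
        ET.SubElement(features, "acpi")
    return ET.tostring(root, encoding="unicode")

fixed = ensure_acpi("<domain type='kvm'><name>foo</name></domain>")
```

The helper is idempotent, so it is safe to run over a definition that already enables ACPI.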

> So should acpi be enabled generally (for modern kernels and systems)?
> Should ntpd be running on the guests, or just the host when using acpi_pm?

The virt-install & virt-manager tools will enable ACPI by default for
nearly all OSes, except for a couple of very old Windows versions on
which it doesn't commonly work.

Regards,
Daniel
-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-11-21 19:32   ` Hangs Marcelo Tosatti
@ 2008-11-21 23:43     ` Roland Lammel
  2008-11-22 17:54     ` Hangs chris
  1 sibling, 0 replies; 26+ messages in thread
From: Roland Lammel @ 2008-11-21 23:43 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: chris, kvm

Hi all,

So it seems this is not the same issue chris is seeing: I have now run
kvm_stat for my hanging domain, and I'm not seeing all-zero counters.
The CPU is still stuck at 100%, I can still log in, and I again saw
the time jump in the dmesg output (this guest was started with ACPI
but still used kvm-clock; the other guest had used both ACPI and the
acpi_pm clock source).

I'm switching this guest to acpi_pm now as well.

dmesg in the guest again shows that nice time jump:
[    3.968415] Uniform CD-ROM driver Revision: 3.20
[    4.122814]  vda: vda1 vda2 < vda5 >
[    4.620176] kjournald starting.  Commit interval 5 seconds
[    4.626042] EXT3-fs: mounted filesystem with ordered data mode.
[    5.641077] udevd version 125 started
[    6.548858] input: Power Button (FF) as /class/input/input1
[    6.568119] ACPI: Power Button (FF) [PWRF]
[    6.847566] piix4_smbus 0000:00:01.3: Found 0000:00:01.3 device
[    7.065429] input: PC Speaker as /class/input/input2
[    7.229412] input: ImExPS/2 Generic Explorer Mouse as /class/input/input3
[    7.269947] Error: Driver 'pcspkr' is already registered, aborting...
[    7.277315] udev: renamed network interface eth0 to eth1
[    8.674616] Adding 489940k swap on /dev/vda5.  Priority:-1
extents:1 across:489940k
[    8.767526] EXT3 FS on vda1, internal journal
[   10.270122] loop: module loaded
[   10.461641] device-mapper: uevent: version 1.0.3
[   10.475258] device-mapper: ioctl: 4.13.0-ioctl (2007-10-18)
initialised: dm-devel@redhat.com
[   11.221794] NET: Registered protocol family 10
[   11.224276] lo: Disabled Privacy Extensions
[   19.770963] warning: `ntpd' uses 32-bit capabilities (legacy support in use)
[   21.420169] eth1: no IPv6 routers present
[1266862591.699790] BUG: soft lockup - CPU#0 stuck for 1179853412s!
[logcheck:4056]
[1266862591.699790] Modules linked in: video output ac battery ipv6
dm_snapshot dm_mirror dm_log dm_mod loop virtio_net virtio_balloon
snd_pcsp serio_raw psmouse snd_pcm snd_timer snd soundcore
snd_page_alloc i2c_piix4 i2c_core button evdev ext3 jbd mbcache
virtio_blk ide_cd_mod cdrom ata_generic libata scsi_mod dock
ide_pci_generic floppy virtio_pci uhci_hcd usbcore piix ide_core
thermal processor fan thermal_sys
[1266862591.699790]
[1266862591.699790] Pid: 4056, comm: logcheck Not tainted (2.6.26-1-486 #1)
[1266862591.699790] EIP: 0060:[<c0115324>] EFLAGS: 00000202 CPU: 0
[1266862591.699790] EIP is at ptep_set_access_flags+0x3e/0x6e
[1266862591.699790] EAX: 19070067 EBX: 09661cc0 ECX: ddb0d984 EDX: 09661cc0
[1266862591.699790] ESI: ddb0d984 EDI: 00000001 EBP: ddb0541c ESP: dededeb0
[1266862591.699790]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[1266862591.699791] CR0: 8005003b CR2: 09661cc0 CR3: 1dc49000 CR4: 00000690
[1266862591.699791] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[1266862591.699791] DR6: ffff0ff0 DR7: 00000400
[1266862591.699791]  [<c0154c38>] ? do_wp_page+0x3db/0x434
[1266862591.699791]  [<c011314b>] ? pvclock_clocksource_read+0x4b/0xd0
[1266862591.699791]  [<c011314b>] ? pvclock_clocksource_read+0x4b/0xd0
[1266862591.699791]  [<c0155da9>] ? handle_mm_fault+0x55a/0x5d2
[1266862591.699791]  [<c0116b87>] ? __dequeue_entity+0x1f/0x71
[1266862591.699791]  [<c0113ac2>] ? do_page_fault+0x294/0x5ea
[1266862591.699791]  [<c011f275>] ? __do_softirq+0x3e/0x87
[1266862591.699791]  [<c011382e>] ? do_page_fault+0x0/0x5ea
[1266862591.699791]  [<c02a6a1a>] ? error_code+0x6a/0x70
[1266862591.699791]  =======================
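The scale of that jump is easier to see converted to years. A back-of-the-envelope sketch only; the two constants are the printk timestamp and the "stuck for" value from the soft-lockup report above, and roughly four decades of apparent uptime is the same magnitude as the bogus 1912 dates seen elsewhere in this thread:

```python
SECONDS_PER_YEAR = 86400 * 365.25

printk_ts = 1266862591   # bracketed printk timestamp from the dmesg above
stuck_for = 1179853412   # "stuck for ...s" in the soft-lockup line

printk_years = printk_ts / SECONDS_PER_YEAR   # ~40.1 years
stuck_years = stuck_for / SECONDS_PER_YEAR    # ~37.4 years
print(f"printk clock ~{printk_years:.1f} years, lockup ~{stuck_years:.1f} years")
```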


Six consecutive kvm_stat readings while the guest hangs (every counter
not listed below stayed at 0; the sixth sample was cut off):

  counter       #1    #2    #3    #4    #5    #6
  exits       1848  1852  1848  1843  1825  1832
  host_stat      5     5     5     6     6     6
  insn_emul    948   949   949   951   946   948
  irq_exits    646   649   649   649   649     -
  irq_injec    653   654   649   645   625     -
  tlb_flush    150   149   151   149   150     -


I've tried a dd if=/dev/zero of=/tmp/zero.file bs=10M count=100 to
test IO in the hanging guest, and now the console hangs there. Doing an
strace shows:
select(17, [4 7 9 10 11 12 14 16], [], [], {1, 0}) = 2 (in [12 14], left {1, 0})
read(12, 0x7fff94b745e0, 8)             = -1 EIO (Input/output error)
write(15, "\1\0\0\0\0\0\0\0"..., 8)     = 8
clock_gettime(CLOCK_MONOTONIC, {263807, 676112562}) = 0
clock_gettime(CLOCK_MONOTONIC, {263807, 676175416}) = 0
clock_gettime(CLOCK_MONOTONIC, {263807, 676237712}) = 0
timer_gettime(0, {it_interval={0, 0}, it_value={0, 9550245}}) = 0
read(14, "\2\0\0\0\0\0\0\0"..., 8)      = 8
select(17, [4 7 9 10 11 14 16], [], [], {1, 0}) = 1 (in [16], left {0, 992000})
read(16, "\16\0\0\0\0\0\0\0\376\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
128) = 128
rt_sigaction(SIGALRM, NULL, {0x405980, ~[KILL STOP RTMIN RT_1],
SA_RESTORER, 0x7fe58c0aaa80}, 8) = 0
write(5, "\0"..., 1)                    = 1
read(16, 0x7fff94b74950, 128)           = -1 EAGAIN (Resource
temporarily unavailable)
select(17, [4 7 9 10 11 14 16], [], [], {1, 0}) = 1 (in [4], left {1, 0})
read(4, "\0"..., 512)                   = 1
read(4, 0x7fff94b747e0, 512)            = -1 EAGAIN (Resource
temporarily unavailable)
clock_gettime(CLOCK_MONOTONIC, {263807, 686388502}) = 0
clock_gettime(CLOCK_MONOTONIC, {263807, 686449959}) = 0
clock_gettime(CLOCK_MONOTONIC, {263807, 686511137}) = 0
clock_gettime(CLOCK_MONOTONIC, {263807, 686572035}) = 0
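The EIO on fd 12 raises the question of what that descriptor actually is; on Linux it can be resolved through /proc. A tiny sketch (fd_target is my name for the helper; the qemu pid would come from ps):

```python
import os

def fd_target(pid, fd):
    """Return what file descriptor `fd` of process `pid` points at,
    using the /proc/<pid>/fd symlinks (Linux only)."""
    return os.readlink("/proc/%d/fd/%d" % (pid, fd))

# e.g. fd_target(qemu_pid, 12) might name a pty, a tap device, or an image file
```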

Should I start a different thread for this issue, so as not to mix
things up with chris's problem?

+rl

On Fri, Nov 21, 2008 at 8:32 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Thu, Nov 20, 2008 at 09:10:57AM -0800, chris@versecorp.net wrote:
>> On Wed, Nov 19, 2008 at 02:43:42PM -0800, chris@versecorp.net wrote:
>> > Thanks for the responses,
>> >
>> > I'm not sure if my problem is the same as Roland's, but it definitely sounds
>> > plausible.  I had been running ntpdate in the host to synchronize time every hour (in a cron job), so it sounds as if we could be seeing the same issue.
>> >
>>
>> Actually, with ntpdate taken out of crontab, I'm still seeing periodic
>> hangs, so it's either a different problem or I'm hitting it in a
>> different manner.
>>
>> OK, I installed kvm-79 and kernel 2.6.27.6, and here's the the kvm-stat output
>> with 1 guest hung and 3 more operational:
>
> <snip>
>
>> If I shut down the 3 operational guests leaving just the hung guest, the
>> kvm-stat output is all 0s:
>>
>>  efer_relo      exits  fpu_reloa  halt_exit  halt_wake  host_stat  hypercall  insn_emul  insn_emul     invlpg   io_exits  irq_exits  irq_windo  largepage  mmio_exit  mmu_cache  mmu_flood  mmu_pde_z  mmu_pte_u  mmu_pte_w  mmu_recyc  mmu_shado  nmi_windo   pf_fixed   pf_guest  remote_tl  request_i  signal_ex  tlb_flush
>>          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
>
> So the guest is not actually running here, which means it's
> QEMU that is hanging.
>
>> The hung guest in this case was run with this command:
>>
>> sudo /usr/local/bin/qemu-system-x86_64             \
>>      -daemonize                                    \
>>      -no-kvm-irqchip                               \
>>      -hda Imgs/ndev_root.img                       \
>>      -m 1024                                       \
>>      -cdrom ISOs/ubuntu-8.10-server-amd64.iso      \
>>      -vnc :4                                       \
>>      -net nic,macaddr=DE:AD:BE:EF:04:04,model=e1000 \
>>      -net tap,ifname=tap4,script=/home/chris/kvm/qemu-ifup.sh \
>>      >>& Logs/ndev_run.log
>>
>>
>> I should also mention that when the guest is hung, I can still switch
>> to the monitor with ctrl-alt 2. So, at least it's a little bit alive.
>
> In a coma, perhaps.
>
>> I've also noticed that the behavior with the hung guest is slightly
>> different on kvm-79 than it was earlier. When the guest hangs, the kvm
>> process in the host doesn't spin at 100% busy any longer - the guest is
>> just unresponsive at both the network and VNC console.
>
>> Also, I've noticed that if I reset the guest from the monitor, the
>> guest will boot up again, and I can get through to it on the network,
>> but strangely, the mouse and keyboard will still be hung at the
>> VNC console (except that I can still switch back and forth to the
>> monitor).
>>
>> Hope some of this helps, let me know if you need me to provide any
>> other troubleshooting info.
>
> $ gdb -p pid-of-qemu
>
> (gdb) info threads
>
> Print the backtrace for every thread with:
>
> (gdb) thread N
> (gdb) bt
>
>



-- 
Roland Lammel
QuikIT - IT Lösungen - flexibel und schnell
Web: http://www.quikit.at
Email: info@quikit.at

"Enjoy your job, make lots of money, work within the law. Choose any two."

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-11-21 21:01             ` Hangs Daniel P. Berrange
@ 2008-11-21 23:46               ` Roland Lammel
  2008-12-06 23:18                 ` Hangs Roland Lammel
  0 siblings, 1 reply; 26+ messages in thread
From: Roland Lammel @ 2008-11-21 23:46 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Glauber Costa, Marcelo Tosatti, Chris Jones, kvm,
	Glauber de Oliveira Costa, Gerd Hoffmann

On Fri, Nov 21, 2008 at 10:01 PM, Daniel P. Berrange
<berrange@redhat.com> wrote:
> On Fri, Nov 21, 2008 at 09:29:43PM +0100, Roland Lammel wrote:
>> On Fri, Nov 21, 2008 at 4:55 PM, Glauber Costa <glommer@redhat.com> wrote:
>> > On Thu, Nov 20, 2008 at 02:56:00AM +0100, Marcelo Tosatti wrote:
>> >>
>> >> Why are you using -no-acpi? Perhaps switch the guest to acpi_pm to isolate
>> >> kvm-clock issues?
>>
>> I've changed one guest to use acpi and use the acpi_pm clock source.
>>
>> Running now with:
>> /usr/local/bin/qemu-system-x86_64 -S -M pc -m 500 -smp 1 -name bit
>> -monitor pty -no-acpi -boot c -drive
>> file=/var/kvm/bit.img,if=virtio,index=0,boot=on -net
>> nic,macaddr=24:42:53:21:52:45,vlan=0,model=virtio -net
>> tap,fd=11,script=,vlan=0,ifname=vnet0 -serial tcp:127.0.0.1:50401
>> -parallel none -usb -vnc 0.0.0.0:45001
>>
>> Actually I had similar problems when using kvm-72 and acpi, and after
>> switching to libvirt for configuration it automatically changed to
>> no-acpi which I just left that way to try it. I've read mixed
>> recommendations concerning ACPI.
>
> FYI, libvirt allows you to turn ACPI on or off. In the XML document,
> the top level <domain> tag can contains a set of features, one of
> which is ACPI, eg.
>
>
>  <domain type='kvm'>
>    <name>foo</name>
>    .....
>    <features>
>      <acpi/>
>    </features>
>    ....
>  </domain>
>
> If the '<acpi/>' tag is left out, libvirt will add -no-acpi to QEMU
> command line.

Thanks, just found the acpi flag today and am using it now for the domains.

> For more information check out the docs here:
>
>  http://libvirt.org/formatdomain.html#elementsFeatures
>
>> So should acpi be enabled generally (for modern kernels and systems)?
>> Should ntpd be running on the guests, or just the host when using acpi_pm?
>
> The virt-install & virt-manager tools will enable ACPI by default for
> nearly all OS, except for a couple of very old Windows which don't
> commonly work

I actually installed the domain manually once and just cloned it, but
thanks for pointing that out!

Regards

+rl
-- 
Roland Lammel
QuikIT - IT Lösungen - flexibel und schnell
Web: http://www.quikit.at
Email: info@quikit.at

"Enjoy your job, make lots of money, work within the law. Choose any two."

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-11-21 19:32   ` Hangs Marcelo Tosatti
  2008-11-21 23:43     ` Hangs Roland Lammel
@ 2008-11-22 17:54     ` chris
       [not found]       ` <519a8b110811280305v764fade1w9d02f5c9188f56e5@mail.gmail.com>
  1 sibling, 1 reply; 26+ messages in thread
From: chris @ 2008-11-22 17:54 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, rl

> 
> In coma perhaps.
> 
> > let me know if you need me to provide any other troubleshooting info.
> 
> $ gdb -p pid-of-qemu
> 
> (gdb) info threads
> 
> Print the backtrace for every thread with:
> 
> (gdb) thread N
> (gdb) bt
> 

OK, here is the gdb log with the guest stuck in this state:

chris@k9[1]:~/pkgs/kvm/kvm-79% ps aux | grep qemu
root       414  1.9  5.0 1131616 197852 ?      Sl   09:22   0:28 /usr/local/bin/qemu-system-x86_64 -daemonize -serial file:Logs/deb_db_serial.log -no-kvm-irqchip -hda Imgs/debdb_root.img -hdb Imgs/debdb_ora.img -m 1024 -cdrom ISOs/ubuntu-8.10-server-amd64.iso -vnc :2 -net nic,macaddr=DE:AD:BE:EF:02:02,model=e1000 -net tap,ifname=tap2,script=/home/chris/kvm/qemu-ifup.sh
chris@k9[2]:~/pkgs/kvm/kvm-79% sudo gdb -p 414
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Attaching to process 414
Reading symbols from /usr/local/bin/qemu-system-x86_64...done.
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /usr/lib/libgnutls.so.26...done.
Loaded symbols for /usr/lib/libgnutls.so.26
Reading symbols from /lib/librt.so.1...done.
Loaded symbols for /lib/librt.so.1
Reading symbols from /lib/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread 0x7f36f1f306e0 (LWP 414)]
[New Thread 0x414f1950 (LWP 422)]
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libutil.so.1...done.
Loaded symbols for /lib/libutil.so.1
Reading symbols from /usr/lib/libSDL-1.2.so.0...done.
Loaded symbols for /usr/lib/libSDL-1.2.so.0
Reading symbols from /lib/libncurses.so.5...done.
Loaded symbols for /lib/libncurses.so.5
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /usr/lib/libtasn1.so.3...done.
Loaded symbols for /usr/lib/libtasn1.so.3
Reading symbols from /lib/libgcrypt.so.11...done.
Loaded symbols for /lib/libgcrypt.so.11
Reading symbols from /lib/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /usr/lib/libasound.so.2...done.
Loaded symbols for /usr/lib/libasound.so.2
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /usr/lib/libdirectfb-1.0.so.0...done.
Loaded symbols for /usr/lib/libdirectfb-1.0.so.0
Reading symbols from /usr/lib/libfusion-1.0.so.0...done.
Loaded symbols for /usr/lib/libfusion-1.0.so.0
Reading symbols from /usr/lib/libdirect-1.0.so.0...done.
Loaded symbols for /usr/lib/libdirect-1.0.so.0
Reading symbols from /lib/libgpg-error.so.0...done.
Loaded symbols for /lib/libgpg-error.so.0
0x00007f36f084b482 in select () from /lib/libc.so.6
(gdb) info threads
  2 Thread 0x414f1950 (LWP 422)  0x00007f36f07a03e1 in sigtimedwait ()
   from /lib/libc.so.6
  1 Thread 0x7f36f1f306e0 (LWP 414)  0x00007f36f084b482 in select ()
   from /lib/libc.so.6
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f36f1f306e0 (LWP 414))]#0  0x00007f36f084b482 in select () from /lib/libc.so.6
(gdb) bt
#0  0x00007f36f084b482 in select () from /lib/libc.so.6
#1  0x00000000004094cb in main_loop_wait (timeout=0)
    at /home/chris/pkgs/kvm/kvm-79/qemu/vl.c:4719
#2  0x000000000050a7ea in kvm_main_loop ()
    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:619
#3  0x000000000040fafc in main (argc=<value optimized out>,
    argv=0x7ffff9f41948) at /home/chris/pkgs/kvm/kvm-79/qemu/vl.c:4871
(gdb) thread 2
[Switching to thread 2 (Thread 0x414f1950 (LWP 422))]#0  0x00007f36f07a03e1 in sigtimedwait () from /lib/libc.so.6
(gdb) bt
#0  0x00007f36f07a03e1 in sigtimedwait () from /lib/libc.so.6
#1  0x000000000050a560 in kvm_main_loop_wait (env=0xc319e0, timeout=0)
    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:284
#2  0x000000000050aaf7 in ap_main_loop (_env=<value optimized out>)
    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:425
#3  0x00007f36f11ba3ea in start_thread () from /lib/libpthread.so.0
#4  0x00007f36f0852c6d in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()

Thanks for your help, let me know if I can provide more.
Chris 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
       [not found]       ` <519a8b110811280305v764fade1w9d02f5c9188f56e5@mail.gmail.com>
@ 2008-11-28 12:35         ` xming
  2008-12-02 10:47           ` Hangs xming
  0 siblings, 1 reply; 26+ messages in thread
From: xming @ 2008-11-28 12:35 UTC (permalink / raw)
  To: kvm

My guests do time-travel too. The host is vanilla 2.6.27.6 with kvm-79.

Right now I have a guest which looks like this:

# uname -a
Linux spaceball 2.6.27.6 #1 SMP Fri Nov 14 11:51:10 CET 2008 i686 QEMU
Virtual CPU version 0.9.1 AuthenticAMD GNU/Linux

# uptime
 02:00:11 up 14663 days, 18:37,  5 users,  load average: 0.99, 0.97, 0.91

# uptime
 02:11:10 up 14663 days, 18:48,  5 users,  load average: 0.99, 0.97, 0.91

Other symptoms are the same as in my previous report: top hangs, sync hangs, ....

Alt-SysRq-t, -q, and -m show nothing.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-11-28 12:35         ` Hangs xming
@ 2008-12-02 10:47           ` xming
  2008-12-02 12:09             ` Hangs Avi Kivity
  0 siblings, 1 reply; 26+ messages in thread
From: xming @ 2008-12-02 10:47 UTC (permalink / raw)
  To: kvm

The same guest did it again.

# uname -a
Linux spaceball 2.6.27.6 #1 SMP Fri Nov 14 11:51:10 CET 2008 i686 QEMU
Virtual CPU version 0.9.1 AuthenticAMD GNU/Linux

# date
Thu Dec 19 01:54:27 WET 1912

# uptime
 01:54:29 up 14666 days, 21:17, 12 users,  load average: 3.99, 3.97, 3.91

What can I do to provide more info?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-12-02 10:47           ` Hangs xming
@ 2008-12-02 12:09             ` Avi Kivity
  2008-12-02 20:58               ` Hangs chris
  0 siblings, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2008-12-02 12:09 UTC (permalink / raw)
  To: xming; +Cc: kvm

xming wrote:
> The same guest did it again.
>
> # uname -a
> Linux spaceball 2.6.27.6 #1 SMP Fri Nov 14 11:51:10 CET 2008 i686 QEMU
> Virtual CPU version 0.9.1 AuthenticAMD GNU/Linux
>
> # date
> Thu Dec 19 01:54:27 WET 1912
>
> # uptime
>  01:54:29 up 14666 days, 21:17, 12 users,  load average: 3.99, 3.97, 3.91
>
> What can I do to provide more info?
>   

A way to reproduce would be best.  If you have access to multiple hosts,
try to isolate whether it happens only on AMD or only on Intel.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-12-02 12:09             ` Hangs Avi Kivity
@ 2008-12-02 20:58               ` chris
  2008-12-02 23:01                 ` Hangs xming
  2008-12-03 10:44                 ` Hangs Avi Kivity
  0 siblings, 2 replies; 26+ messages in thread
From: chris @ 2008-12-02 20:58 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

On Tue, Dec 02, 2008 at 02:09:39PM +0200, Avi Kivity wrote:
> xming wrote:
> >The same guest did it again.
> >
> ># uname -a
> >Linux spaceball 2.6.27.6 #1 SMP Fri Nov 14 11:51:10 CET 2008 i686 QEMU
> >Virtual CPU version 0.9.1 AuthenticAMD GNU/Linux
> >
> ># date
> >Thu Dec 19 01:54:27 WET 1912
> >
> ># uptime
> > 01:54:29 up 14666 days, 21:17, 12 users,  load average: 3.99, 3.97, 3.91
> >
> >What can I do to provide more info?
> >  
> 
> A way to reproduce would be best.  If you have access to multiple hosts, 
> try to isolate whether it happens only on amd or only on intel.
> 
> -- 
> error compiling committee.c: too many arguments to function
> 


I have a way to reproduce my instance of the problem easily now. I was
trying to build a new kernel on my guest, and found that depmod hangs
guests every time.

In my case I only have an AMD processor - I don't have an Intel host to
try it on right now - but it happens on Ubuntu 8.04 and Ubuntu 8.10
guests, both with kvm-79 and with the version of kvm that ships with
Ubuntu 8.10.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-12-02 20:58               ` Hangs chris
@ 2008-12-02 23:01                 ` xming
  2008-12-03  1:20                   ` Hangs chris
  2008-12-03 10:44                 ` Hangs Avi Kivity
  1 sibling, 1 reply; 26+ messages in thread
From: xming @ 2008-12-02 23:01 UTC (permalink / raw)
  To: chris; +Cc: kvm

> I have a way to reproduce my instance of the problem easily now.   I was trying
> to build a new kernel on my guest,  and found that depmod hangs guests every
> time.
>   In my case, I only have an amd processor - I don't have an intel
> host to try it on, right now,  but it happens on Ubuntu 8.04
> and Ubuntu 8.10 guests, both using kvm-79 and the version of kvm that ships
> with ubuntu 8.10.

I have AMD too, with a vanilla 2.6.27.6 kernel and kvm-79 (although I
saw this before 79). The guest is SMP (UP guests hang too, but less
frequently).

depmod does not hang here (not reproducible). Heavy CPU load plus heavy
IO on NFS mounts triggers it on my side.

I have a very subjective feeling that it happens more frequently when
the host has less uptime (i.e. was freshly rebooted).
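For anyone trying to reproduce this, that trigger can be approximated with a minimal CPU-plus-IO load generator. A sketch only: the file name and sizes are arbitrary, and workdir would point at the NFS mount to match the reported workload:

```python
import os
import threading
import time

def cpu_burn(stop):
    x = 0
    while not stop.is_set():
        x += 1                      # keep one core spinning

def io_burn(stop, path, stats):
    buf = b"\0" * (1 << 20)         # 1 MiB per write
    with open(path, "wb") as f:
        while not stop.is_set():
            f.write(buf)
            f.flush()
            os.fsync(f.fileno())    # push the IO all the way out
            stats["bytes"] += len(buf)

def run_load(seconds, workdir):
    stop, stats = threading.Event(), {"bytes": 0}
    path = os.path.join(workdir, "burn.dat")
    workers = [threading.Thread(target=cpu_burn, args=(stop,)),
               threading.Thread(target=io_burn, args=(stop, path, stats))]
    for w in workers:
        w.start()
    time.sleep(seconds)
    stop.set()
    for w in workers:
        w.join()
    os.remove(path)
    return stats["bytes"]
```

Run it for minutes at a time in the guest; a hang would show up as the host CPU pegging while the guest stops responding.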

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-12-02 23:01                 ` Hangs xming
@ 2008-12-03  1:20                   ` chris
  2008-12-03  9:13                     ` Hangs xming
  0 siblings, 1 reply; 26+ messages in thread
From: chris @ 2008-12-03  1:20 UTC (permalink / raw)
  To: xming; +Cc: kvm

Sounds like your configuration is very similar to mine.  
I'm also on a vanilla kernel (2.6.27.7 in my case) with kvm-79 and 
AMD processors.

You sparked my curiosity on the depmod -a issue, so I spent some time trying
it on different configurations.

I have two servers:
  * 1.8GHz AMD Opteron 2210 in an HP DL385G2
  * 2.5GHz AMD Athlon 4850e in a "green" build w/ a Gigabyte GA-MA74Gm-S2

I can reproduce it on the Gigabyte build with ease.

On the HP DL385G2 server I could reproduce it on one of my guests, but
only about once in every four tries (and with multiple guests running -
not sure if that made a difference).

Like you, I also have had the feeling that my occasional hangs were more
likely after a fresh reboot, but I don't have anything really conclusive to
prove it.

I'm afraid I don't have an Intel-based server around right now to see if this
is an AMD-only issue.  I might be able to scrounge up an HP DL380G5 
(with an Intel Core 2) but I'm not sure.

Chris


On Wed, Dec 03, 2008 at 12:01:32AM +0100, xming wrote:
> > I have a way to reproduce my instance of the problem easily now.   I was trying
> > to build a new kernel on my guest,  and found that depmod hangs guests every
> > time.
> >   In my case, I only have an amd processor - I don't have an intel
> > host to try it on, right now,  but it happens on Ubuntu 8.04
> > and Ubuntu 8.10 guests, both using kvm-79 and the version of kvm that ships
> > with ubuntu 8.10.
> 
> I have AMD too, a vanilla 2.6.27.6 kernel and kvm-79 (although I had
> this before 79). The guest is SMP (UP guests hang too, but less
> frequently).
> 
> depmod does not hang here (not reproducible). Heavy CPU load plus heavy
> IO on NFS mounts triggers it on my side.
> 
> I have a very subjective feeling that it happens more frequently when
> the host has less uptime (freshly rebooted).
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-12-03  1:20                   ` Hangs chris
@ 2008-12-03  9:13                     ` xming
  0 siblings, 0 replies; 26+ messages in thread
From: xming @ 2008-12-03  9:13 UTC (permalink / raw)
  To: kvm

Neither do I have access to an Intel box.

I have an AMD Athlon(tm) X2 Dual Core Processor BE-2300
on an Asus M2A-VM with 6GB RAM.

The host is running 32-bit Gentoo with PAE. Guests are 32-bit
Gentoo (without PAE; tried with PAE too, same hangs) using virtio
block and virtio net.

I suspect it's the load (CPU + IO) that causes the hang, and that
it's inversely proportional to the uptime.

When the host has just booted, it sometimes even has trouble booting
the first guest completely, but killing it and restarting it works.

I have tried different clocks for the guest (kvm and acpi_pm); it
doesn't matter. I have tried different kernel configs; my last attempt
was to build separate kernels for host and guest. Still hangs.

I also tried to disable oos_shadow, but that doesn't change anything.

The host is booted with (as Marcelo told me):
kernel /boot/bzImage-2.6.27.6-host root=/dev/md0 clocksource=acpi_pm
notsc profile=kvm

but I don't have /proc/profile, so can't do readprofile.
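
A quick way to sanity-check that (a sketch; the System.map path below is an
assumption about this host's layout):

```shell
# /proc/profile only appears when the kernel was built with
# CONFIG_PROFILING=y and booted with a working profile= parameter.
if [ -r /proc/profile ]; then
    # top 20 kernel symbols by tick count (System.map path is assumed)
    readprofile -m "/boot/System.map-$(uname -r)" | sort -rn | head -20
else
    echo "no /proc/profile: profiling not compiled in or not enabled at boot"
fi
```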

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-12-02 20:58               ` Hangs chris
  2008-12-02 23:01                 ` Hangs xming
@ 2008-12-03 10:44                 ` Avi Kivity
  2008-12-03 17:49                   ` Hangs chris
  1 sibling, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2008-12-03 10:44 UTC (permalink / raw)
  To: chris; +Cc: kvm

chris@versecorp.net wrote:
>
> I have a way to reproduce my instance of the problem easily now.   I was trying
> to build a new kernel on my guest,  and found that depmod hangs guests every 
> time. 
>    In my case, I only have an amd processor - I don't have an intel 
> host to try it on, right now,  but it happens on Ubuntu 8.04
> and Ubuntu 8.10 guests, both using kvm-79 and the version of kvm that ships
> with ubuntu 8.10.
>   

What's your guest, how is qemu launched (command line)?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-12-03 10:44                 ` Hangs Avi Kivity
@ 2008-12-03 17:49                   ` chris
  2008-12-18 18:05                     ` Hangs chris
  0 siblings, 1 reply; 26+ messages in thread
From: chris @ 2008-12-03 17:49 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

On Wed, Dec 03, 2008 at 12:44:54PM +0200, Avi Kivity wrote:
> chris@versecorp.net wrote:
> >
> >I have a way to reproduce my instance of the problem easily now.   I was 
> >trying
> >to build a new kernel on my guest,  and found that depmod hangs guests 
> >every time. 
> >   In my case, I only have an amd processor - I don't have an intel 
> >host to try it on, right now,  but it happens on Ubuntu 8.04
> >and Ubuntu 8.10 guests, both using kvm-79 and the version of kvm that ships
> >with ubuntu 8.10.
> >  
> 
> What's your guest, how is qemu launched (command line)?
> 
> -- 
> error compiling committee.c: too many arguments to function
> 

The guest is Ubuntu 8.10 server (64-bit version).  I also have the same 
problems with Ubuntu 8.04LTS server.

Here's the command line:

sudo /usr/local/bin/qemu-system-x86_64        \
     -no-kvm-irqchip                         \
     -daemonize                               \
     -hda Imgs/sam_home.img                   \
     -m 512                                   \
     -cdrom ISOs/ubuntu-8.10-server-amd64.iso \
     -parallel /dev/lp0                       \
     -vnc :1                                  \
     -net nic,macaddr=DE:AD:BE:EF:01:01,model=e1000 \
     -net tap,ifname=tap1,script=/home/chris/kvm/qemu-ifup.sh \
     >>& Logs/sam_run.log

Earlier in the mail chain, Marcelo had me run vmstat when it was hung,
and it was all zeros.  He also asked for a stack trace of the qemu process,
which showed two threads:

	(gdb) info threads
	  2 Thread 0x414f1950 (LWP 422)  0x00007f36f07a03e1 in sigtimedwait ()
	   from /lib/libc.so.6
	  1 Thread 0x7f36f1f306e0 (LWP 414)  0x00007f36f084b482 in select ()
	   from /lib/libc.so.6
	(gdb) thread 1
	[Switching to thread 1 (Thread 0x7f36f1f306e0 (LWP 414))]#0  0x00007f36f084b482
	+in select () from /lib/libc.so.6
	(gdb) bt
	#0  0x00007f36f084b482 in select () from /lib/libc.so.6
	#1  0x00000000004094cb in main_loop_wait (timeout=0)
	    at /home/chris/pkgs/kvm/kvm-79/qemu/vl.c:4719
	#2  0x000000000050a7ea in kvm_main_loop ()
	    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:619
	#3  0x000000000040fafc in main (argc=<value optimized out>,
	    argv=0x7ffff9f41948) at /home/chris/pkgs/kvm/kvm-79/qemu/vl.c:4871
	(gdb) thread 2
	[Switching to thread 2 (Thread 0x414f1950 (LWP 422))]#0  0x00007f36f07a03e1 in
	+sigtimedwait () from /lib/libc.so.6
	(gdb) bt
	#0  0x00007f36f07a03e1 in sigtimedwait () from /lib/libc.so.6
	#1  0x000000000050a560 in kvm_main_loop_wait (env=0xc319e0, timeout=0)
	    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:284
	#2  0x000000000050aaf7 in ap_main_loop (_env=<value optimized out>)
	    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:425
	#3  0x00007f36f11ba3ea in start_thread () from /lib/libpthread.so.0
	#4  0x00007f36f0852c6d in clone () from /lib/libc.so.6
	#5  0x0000000000000000 in ?? ()
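
For what it's worth, the same per-thread backtraces can be captured
non-interactively from a hung instance (a sketch; assumes the binary is
named qemu-system-x86_64 as in the command line above, and that gdb is
installed):

```shell
# Attach gdb in batch mode and dump a backtrace of every thread.
pid=$(pgrep -o -f qemu-system-x86_64 || true)
if [ -n "$pid" ]; then
    gdb -p "$pid" -batch -ex 'thread apply all bt'
else
    echo "no qemu-system-x86_64 process found"
fi
```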

If I can provide any other debug info, I'm happy to.  I'm beginning to suspect
you'll be able to reproduce it easily if you run Ubuntu 8.10 as a guest on
AMD processors.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-11-21 23:46               ` Hangs Roland Lammel
@ 2008-12-06 23:18                 ` Roland Lammel
  2008-12-09  0:34                   ` Hangs xming
  0 siblings, 1 reply; 26+ messages in thread
From: Roland Lammel @ 2008-12-06 23:18 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Glauber Costa, Marcelo Tosatti, Chris Jones, kvm,
	Glauber de Oliveira Costa, Gerd Hoffmann

I'd just like to note that my issues with the clock were solved when I
switched to the acpi_pm clock. I can still reproduce the hang with
kvm_clock, though.

I still use ntpd in the guests, as it is not clear to me whether the
acpi_pm clock synchronizes to the host clock or not.
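
For reference, this is roughly how the clocksource can be checked and
switched through sysfs (a sketch; the runtime switch needs root, and
booting with clocksource=acpi_pm has the same effect):

```shell
# Inspect the guest clocksource via the sysfs interface (2.6.21+ kernels).
cs=/sys/devices/system/clocksource/clocksource0
if [ -r "$cs/current_clocksource" ]; then
    echo "available: $(cat "$cs/available_clocksource")"
    echo "current:   $(cat "$cs/current_clocksource")"
else
    echo "no sysfs clocksource interface on this kernel"
fi
# As root, to switch at runtime:
#   echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource
```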

random@bit:~$ uptime
 00:15:21 up 13 days, 21:44,  1 user,  load average: 0.00, 0.00, 0.00

In case you want some more info on the hangs when using kvm_clock,
please drop me a mail.

Cheers and thanks for all your help

+rl

On Sat, Nov 22, 2008 at 00:46, Roland Lammel <rl@brabbel.net> wrote:
> On Fri, Nov 21, 2008 at 10:01 PM, Daniel P. Berrange
> <berrange@redhat.com> wrote:
>> On Fri, Nov 21, 2008 at 09:29:43PM +0100, Roland Lammel wrote:
>>> On Fri, Nov 21, 2008 at 4:55 PM, Glauber Costa <glommer@redhat.com> wrote:
>>> > On Thu, Nov 20, 2008 at 02:56:00AM +0100, Marcelo Tosatti wrote:
>>> >>
>>> >> Why are you using -no-acpi? Perhaps switch the guest to acpi_pm to isolate
>>> >> kvm-clock issues?
>>>
>>> I've changed one guest to use acpi and use the acpi_pm clock source.
>>>
>>> Running now with:
>>> /usr/local/bin/qemu-system-x86_64 -S -M pc -m 500 -smp 1 -name bit
>>> -monitor pty -no-acpi -boot c -drive
>>> file=/var/kvm/bit.img,if=virtio,index=0,boot=on -net
>>> nic,macaddr=24:42:53:21:52:45,vlan=0,model=virtio -net
>>> tap,fd=11,script=,vlan=0,ifname=vnet0 -serial tcp:127.0.0.1:50401
>>> -parallel none -usb -vnc 0.0.0.0:45001
>>>
>>> Actually I had similar problems when using kvm-72 and acpi, and after
>>> switching to libvirt for configuration it automatically changed to
>>> no-acpi, which I just left that way to try it. I've read mixed
>>> recommendations concerning ACPI.
>>
>> FYI, libvirt allows you to turn ACPI on or off. In the XML document,
>> the top level <domain> tag can contains a set of features, one of
>> which is ACPI, eg.
>>
>>
>>  <domain type='kvm'>
>>    <name>foo</name>
>>    .....
>>    <features>
>>      <acpi/>
>>    </features>
>>    ....
>>  </domain>
>>
>> If the '<acpi/>' tag is left out, libvirt will add -no-acpi to QEMU
>> command line.
>
> Thanks, just found the acpi flag today and am using it now for the domains.
>
>> For more information check out the docs here:
>>
>>  http://libvirt.org/formatdomain.html#elementsFeatures
>>
>>> So should acpi be enabled generally (for modern kernels and systems)?
>>> Should ntpd be running on the guests, or just the host when using acpi_pm?
>>
>> The virt-install & virt-manager tools will enable ACPI by default for
>> nearly all OSes, except for a couple of very old Windows versions which
>> don't commonly work with it
>
> I actually installed the domain manually once and just cloned it, but
> thanks for pointing that out!
>
> Regards
>
> +rl
> --
> Roland Lammel
> QuikIT - IT Lösungen - flexibel und schnell
> Web: http://www.quikit.at
> Email: info@quikit.at
>
> "Enjoy your job, make lots of money, work within the law. Choose any two."
>



-- 
Roland Lammel
QuikIT - IT Lösungen - flexibel und schnell
Web: http://www.quikit.at
Email: info@quikit.at

"Enjoy your job, make lots of money, work within the law. Choose any two."

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hangs
  2008-12-06 23:18                 ` Hangs Roland Lammel
@ 2008-12-09  0:34                   ` xming
  0 siblings, 0 replies; 26+ messages in thread
From: xming @ 2008-12-09  0:34 UTC (permalink / raw)
  To: Roland Lammel
  Cc: Daniel P. Berrange, Glauber Costa, Marcelo Tosatti, Chris Jones,
	kvm, Glauber de Oliveira Costa, Gerd Hoffmann

Updated to kvm-80 and vanilla kernel 2.6.27.8.

This is still happening after a few hours on an idle host.

# uptime
 01:19:43 up  8:02,  1 user,  load average: 0.08, 0.03, 0.01
# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0     68   7516   4084 379072    0    0    41    15   27   83  0  0 99  0
 1  0     68   7516   4084 379072    0    0     0     0  970 681963  0  0 100  0
 1  0     68   7516   4084 379072    0    0     0     0  991 708242  0  0 100  0
 0  0     68   7516   4084 379072    0    0     0     0  975 710206  0  0 100  0
 0  0     68   7308   4084 379072    0    0     0     0 1026 674657  1  1 98  0
 0  0     68   7308   4084 379072    0    0     0     0 1000 650320  0  0 100  0
 1  0     68   7308   4084 379072    0    0     0     0  995 696319  0  0 100  0
 1  0     68   7308   4084 379072    0    0     0     0  985 707663  0  0 100  0
 1  0     68   7308   4084 379072    0    0     0     0  987 706937  0  0 100  0
 0  0     68   7308   4084 379072    0    0     0     0  989 706222  0  0 100  0
 1  0     68   7308   4084 379072    0    0     0     0  978 700453  0  0 100  0
^C
# uptime
 01:19:43 up  8:02,  1 user,  load average: 0.08, 0.03, 0.01
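
The number that jumps out of the vmstat sample is the "cs" column:
roughly 700,000 context switches per second on a nominally idle host.
Pulling just that field out of a captured line (field 12 in this
procps layout):

```shell
# One of the sampled lines from above; awk field 12 is the "cs" column.
line=" 1  0     68   7516   4084 379072    0    0     0     0  970 681963  0  0 100  0"
echo "$line" | awk '{ print "context switches/sec:", $12 }'
# prints: context switches/sec: 681963
```

To watch it live: vmstat 1 | awk 'NR>2 { print $12 }'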

Here are the outputs of sysrq-t, sysrq-m and sysrq-w:

Dec  9 01:19:43 builder SysRq : Show State
Dec  9 01:19:43 builder task                PC stack   pid father
Dec  9 01:19:43 builder init          S df82bb3c     0     1      0
Dec  9 01:19:43 builder df830000 00000086 00000002 df82bb3c df82bb44
00000000 df8310e0 df830154
Dec  9 01:19:43 builder df830154 c1413500 c03ce080 c03ce080 c03ce080
df82bb64 01b3bb70 00000000
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f df82bb64
01b48dd0 00000000 00001388
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c02ef71b>] schedule_timeout+0x4b/0xd0
Dec  9 01:19:43 builder [<c013546d>] add_wait_queue+0x1d/0x50
Dec  9 01:19:43 builder [<c012b5d0>] process_timeout+0x0/0x10
Dec  9 01:19:43 builder [<c017d50e>] do_select+0x3ae/0x4a0
Dec  9 01:19:43 builder [<c017db60>] __pollwait+0x0/0x100
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<c01155de>] pvclock_clocksource_read+0x4e/0xe0
Dec  9 01:19:43 builder [<c013a66d>] getnstimeofday+0x3d/0xe0
Dec  9 01:19:43 builder [<c01155de>] pvclock_clocksource_read+0x4e/0xe0
Dec  9 01:19:43 builder [<c01e2f82>] __next_cpu+0x12/0x20
Dec  9 01:19:43 builder [<c01e2f82>] __next_cpu+0x12/0x20
Dec  9 01:19:43 builder [<c011b335>] find_busiest_group+0x195/0x560
Dec  9 01:19:43 builder [<c01155de>] pvclock_clocksource_read+0x4e/0xe0
Dec  9 01:19:43 builder [<c01e8303>] number+0x2d3/0x2e0
Dec  9 01:19:43 builder [<c01155de>] pvclock_clocksource_read+0x4e/0xe0
Dec  9 01:19:43 builder [<c01155de>] pvclock_clocksource_read+0x4e/0xe0
Dec  9 01:19:43 builder [<c01e8ba7>] vsnprintf+0x307/0x5e0
Dec  9 01:19:43 builder [<c02f0b73>] _spin_lock_irq+0x13/0x20
Dec  9 01:19:43 builder [<c012b426>] run_timer_softirq+0x166/0x1a0
Dec  9 01:19:43 builder [<c013ebba>] tick_program_event+0x2a/0x40
Dec  9 01:19:43 builder [<c0127118>] __do_softirq+0x88/0x100
Dec  9 01:19:43 builder [<c0183af0>] destroy_inode+0x20/0x40
Dec  9 01:19:43 builder [<c0183af0>] destroy_inode+0x20/0x40
Dec  9 01:19:43 builder [<c02f0a85>] _spin_lock+0x5/0x10
Dec  9 01:19:43 builder [<c0182103>] __d_lookup+0xe3/0x110
Dec  9 01:19:43 builder [<c0183af0>] destroy_inode+0x20/0x40
Dec  9 01:19:43 builder [<c018177c>] dput+0x1c/0x120
Dec  9 01:19:43 builder [<c017d7c8>] core_sys_select+0x1c8/0x2f0
Dec  9 01:19:43 builder [<c017a097>] __link_path_walk+0xa27/0xba0
Dec  9 01:19:43 builder [<c01a60e9>] proc_flush_task+0x159/0x280
Dec  9 01:19:43 builder [<c0186de5>] mntput_no_expire+0x15/0xe0
Dec  9 01:19:43 builder [<c017a25f>] path_walk+0x4f/0x90
Dec  9 01:19:43 builder [<c017a35a>] do_path_lookup+0x6a/0x120
Dec  9 01:19:43 builder [<c0179513>] getname+0xb3/0xe0
Dec  9 01:19:43 builder [<c0173c98>] cp_new_stat64+0xf8/0x110
Dec  9 01:19:43 builder [<c017dd42>] sys_select+0xe2/0x1b0
Dec  9 01:19:43 builder [<c0103d1f>] sysenter_do_call+0x12/0x33
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder kthreadd      S df839fa8     0     2      0
Dec  9 01:19:43 builder df830360 00000046 00000002 df839fa8 df839fb0
00000000 deddde60 df8304b4
Dec  9 01:19:43 builder df8304b4 c1409500 c03ce080 c03ce080 c03ce080
dedddeb0 fffc4930 00000000
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f c03751b4
00001010 deddde8c c03eb684
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c0135029>] kthreadd+0x169/0x170
Dec  9 01:19:43 builder [<c0134ec0>] kthreadd+0x0/0x170
Dec  9 01:19:43 builder [<c0104fc7>] kernel_thread_helper+0x7/0x10
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder migration/0   S df83bf90     0     3      2
Dec  9 01:19:43 builder df8306c0 00000046 00000002 df83bf90 df83bf98
00000000 d219bf2c df830814
Dec  9 01:19:43 builder df830814 c1409500 c03ce080 c03ce080 c03ce080
d219bf6c 01ab9906 00000000
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f c1409918
c140fdc0 c1409500 00000000
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c011ee48>] migration_thread+0x158/0x250
Dec  9 01:19:43 builder [<c011ecf0>] migration_thread+0x0/0x250
Dec  9 01:19:43 builder [<c0134e92>] kthread+0x42/0x70
Dec  9 01:19:43 builder [<c0134e50>] kthread+0x0/0x70
Dec  9 01:19:43 builder [<c0104fc7>] kernel_thread_helper+0x7/0x10
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder [<c01351b0>] autoremove_wake_function+0x0/0x50
Dec  9 01:19:43 builder [<c02443f0>] serio_thread+0x0/0x300
Dec  9 01:19:43 builder [<c0134e92>] /0x200
Dec  9 01:19:43 builder [<c0126ab3>] current_fs_time+0x13/0x20
Dec  9 01:19:43 builder [<c014dbc5>] __generic_file_aio_write_nolock+0x255/0x520
Dec  9 01:19:43 builder [<c02580aa>] verify_iovec+0x2a/0x90
Dec  9 01:19:43 builder [<c0250674>] sys_sendmsg+0x164/0x280
Dec  9 01:19:43 builder [<c01155de>] pvclock_clocksource_read+0x4e/0xe0
Dec  9 01:19:43 builder [<c01155de>] pvclock_clocksource_read+0x4e/0xe0
Dec  9 01:19:43 builder [<c013a66d>] getnstimeofday+0x3d/0xe0
Dec  9 01:19:43 builder [<c010f75c>] lapic_next_event+0xc/0x10
Dec  9 01:19:43 builder [<c013d8d8>] clockevents_program_event+0xa8/0x120
Dec  9 01:19:43 builder [<c02f0b73>] _spin_lock_irq+0x13/0x20
Dec  9 01:19:43 builder [<c0251abb>] sys_socketcall+0x25b/0x2b0
Dec  9 01:19:43 builder [<c017d0a5>] sys_poll+0x35/0x80
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f 0000000a
00000000 00000000 ffffffff
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c02ef747>] schedule_timeout+0x77/0xd0
Dec  9 01:19:43 builder [<c013546d>] add_wait_queue+0x1d/0x50
Dec  9 01:19:43 builder [<c02844c4>] tcp_poll+0x14/0x160
Dec  9 01:19:43 builder [<c017d50e>] do_select+0x3ae/0x4a0
Dec  9 01:19:43 builder [<c017db60>] __pollwait+0x0/0x100
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<c027e08f>] ip_finish_output+0x14f/0x2d0
Dec  9 01:19:43 builder [<c027c9c8>] ip_cork_release+0x28/0x40
Dec  9 01:19:43 builder [<c027de64>] ip_push_pending_frames+0x274/0x350
Dec  9 01:19:43 builder [<c0298c97>] udp_push_pending_frames+0x167/0x3a0
Dec  9 01:19:43 builder [<c02f0b2f>] _spin_lock_bh+0xf/0x20
Dec  9 01:19:43 builder [<c0252783>] release_sock+0x13/0xa0
Dec  9 01:19:43 builder [<c029a0eb>] udp_sendmsg+0x34b/0x690
Dec  9 01:19:43 builder [<c0258057>] memcpy_toiovec+0x37/0x60
Dec  9 01:19:43 builder [<c02f0b2f>] _spin_lock_bh+0xf/0x20
Dec  9 01:19:43 builder [<c02528b5>] lock_sock_nested+0xa5/0xb0
Dec  9 01:19:43 builder [<c02f0b2f>] _spin_lock_bh+0xf/0x20
Dec  9 01:19:43 builder [<c0252783>] release_sock+0x13/0xa0
Dec  9 01:19:43 builder [<c029adb9>] udp_recvmsg+0x1a9/0x2e0
Dec  9 01:19:43 builder [<c02520f5>] sock_common_recvmsg+0x45/0x70
Dec  9 01:19:43 builder [<c01155de>] pvclock_clocksource_read+0x4e/0xe0
Dec  9 01:19:43 builder [<c0192228>] __find_get_block+0x88/0x190
Dec  9 01:19:43 builder [<c011a19c>] resched_task+0x1c/0x60
Dec  9 01:19:43 builder dfb9a314 c1409500 c03ce080 c03ce080 c03ce080
00000000 00000082 df832880
Dec  9 01:19:43 builder dfb9a314 c02f0bc3 00000282 c0135360 dede3680
dede368c 00000000 00000000
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c011da4d>] finish_task_switch+0x2d/0xb0
Dec  9 01:19:43 builder [<c02f0bc3>] _spin_lock_irqsave+0x23/0x30
Dec  9 01:19:43 builder [<c0135360>] prepare_to_wait+0x20/0x70
Dec  9 01:19:43 builder [<c0132655>] worker_thread+0x95/0xb0
Dec  9 01:19:43 builder [<c01351b0>] autoremove_wake_function+0x0/0x50
Dec  9 01:19:43 builder [<c01325c0>] worker_thread+0x0/0xb0
Dec  9 01:19:43 builder [<c0134e92>] kthread+0x42/0x70
Dec  9 01:19:43 builder df89ed34 c1413500 c03ce080 c03ce080 c03ce080
c02f0b73 c0131ce7 00000000
Dec  9 01:19:43 builder 00000000 c02f0bc3 00000282 c0135360 dede3780
dede378c 00000000 00000000
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<e08a668b>] rpc_wake_up+0xb/0x60 [sunrpc]
Dec  9 01:19:43 builder [<c02f0b73>] _spin_lock_irq+0x13/0x20
Dec  9 01:19:43 builder [<c0131ce7>] run_workqueue+0xb7/0x120
Dec  9 01:19:43 builder [<c02f0bc3>] _spin_lock_irqsave+0x23/0x30
Dec  9 01:19:43 builder [<c0135360>] prepare_to_wait+0x20/0x70
Dec  9 01:19:43 builder [<c0132655>] worker_thread+0x95/0xb0
Dec  9 01:19:43 builder rpc.mountd    S dededb3c     0  4102      1
Dec  9 01:19:43 builder df89c6c0 00000086 00000002 dededb3c dededb44
00000000 df565000 df89c814
Dec  9 01:19:43 builder df89c814 c1413500 c03ce080 c03ce080 c03ce080
00000007 fffc4918 00000000
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f 00000008
00000000 00000000 ffffffff
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c02ef747>] schedule_timeout+0x77/0xd0
Dec  9 01:19:43 builder [<c013546d>] add_wait_queue+0x1d/0x50
Dec  9 01:19:43 builder [<c02844c4>] tcp_poll+0x14/0x160
Dec  9 01:19:43 builder [<c017d50e>] do_select+0x3ae/0x4a0
Dec  9 01:19:43 builder [<c017db60>] __pollwait+0x0/0x100
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<c01a3bc7>] proc_alloc_inode+0x47/0x70
Dec  9 01:19:43 builder [<c01155de>] pvclock_clocksource_read+0x4e/0xe0
Dec  9 01:19:43 builder [<c014c245>] find_get_page+0x25/0xa0
Dec  9 01:19:43 builder [<c02f0a85>] _spin_lock+0x5/0x10
Dec  9 01:19:43 builder [<c0182103>] __d_lookup+0xe3/0x110
Dec  9 01:19:43 builder [<c0154e1b>] mark_page_accessed+0x2b/0x40
Dec  9 01:19:43 builder [<c018177c>] dput+0x1c/0x120
Dec  9 01:19:43 builder [<c01783b5>] do_lookup+0x65/0x1a0
Dec  9 01:19:43 builder [<c018177c>] dput+0x1c/0x120
Dec  9 01:19:43 builder [<c0103d1f>] sysenter_do_call+0x12/0x33
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder lockd         S deddbee0     0  4104      2
Dec  9 01:19:43 builder df89f600 00000046 00000002 deddbee0 deddbee8
00000000 c0119d9d df89f754
Dec  9 01:19:43 builder df89f754 c1413500 c03ce080 c03ce080 c03ce080
c1409500 fffc4918 00000000
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f df8edd90
deddbf50 ded9ce04 df8ed000
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c0119d9d>] update_curr+0x4d/0x70
Dec  9 01:19:43 builder [<c02ef747>] schedule_timeout+0x77/0xd0
Dec  9 01:19:43 builder [<c01155de>] pvclock_clocksource_read+0x4e/0xe0
Dec  9 01:19:43 builder [<c02f0bc3>] _spin_lock_irqsave+0x23/0x30
Dec  9 01:19:43 builder [<e0880797>] lockd+0x97/0x1a0 [lockd]
Dec  9 01:19:43 builder [<c02ef1bc>] schedule+0x21c/0x620
Dec  9 01:19:43 builder [<e0880700>] lockd+0x0/0x1a0 [lockd]
Dec  9 01:19:43 builder [<c0134e92>] kthread+0x42/0x70
Dec  9 01:19:43 builder [<c0134e50>] kthread+0x0/0x70
Dec  9 01:19:43 builder [<c0104fc7>] kernel_thread_helper+0x7/0x10
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder nfsd          S df833b64     0  4105      2
Dec  9 01:19:43 builder df89d7a0 00000046 df833960 df833b64 df89d7a0
c011da4d df833960 df89d8f4
Dec  9 01:19:43 builder df89d8f4 c1409500 c03ce080 c03ce080 c03ce080
deddff44 deddff44 00000282
Dec  9 01:19:43 builder [<e08b42af>] svc_recv+0x26f/0x770 [sunrpc]
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<e08ec720>] nfsd+0x0/0x270 [nfsd]
Dec  9 01:19:43 builder [<e08ec7dc>] nfsd+0xbc/0x270 [nfsd]
Dec  9 01:19:43 builder [<c011a28d>] complete+0x3d/0x60
Dec  9 01:19:43 builder [<e08ec720>] nfsd+0x0/0x270 [nfsd]
Dec  9 01:19:43 builder [<c0134e92>] kthread+0x42/0x70
Dec  9 01:19:43 builder [<c0134e50>] kthread+0x0/0x70
Dec  9 01:19:43 builder [<c0104fc7>] kernel_thread_helper+0x7/0x10
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder [<c012b5d0>] process_timeout+0x0/0x10
Dec  9 01:19:43 builder [<e08b42af>] svc_recv+0x26f/0x770 [sunrpc]
Dec  9 01:19:43 builder [<c0119282>] enqueue_task+0x12/0x30
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<e08ec720>] nfsd+0x0/0x270 [nfsd]
Dec  9 01:19:43 builder [<e08ec7dc>] nfsd+0xbc/0x270 [nfsd]
Dec  9 01:19:43 builder [<c011a28d>] complete+0x3d/0x60
Dec  9 01:19:43 builder [<e08ec720>] nfsd+0x0/0x270 [nfsd]
Dec  9 01:19:43 builder [<c0134e92>] kthread+0x42/0x70
Dec  9 01:19:43 builder [<c0134e50>] kthread+0x0/0x70
Dec  9 01:19:43 builder df832f40 00000046 df832f6c c0119d9d c140953c
df832f40 df8321c0 df833094
Dec  9 01:19:43 builder df833094 c1409500 c03ce080 c03ce080 c03ce080
ded2bf44 ded2bf44 00000282
Dec  9 01:19:43 builder c03ea300 c012bac7 00000000 00000282 ded2bf44
01eaabbc df8aff84 ded08000
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c0119d9d>] update_curr+0x4d/0x70
Dec  9 01:19:43 builder [<c012bac7>] __mod_timer+0x97/0xb0
Dec  9 01:19:43 builder [<c02ef71b>] schedule_timeout+0x4b/0xd0
Dec  9 01:19:43 builder [<c012b5d0>] process_timeout+0x0/0x10
Dec  9 01:19:43 builder [<e08b42af>] svc_recv+0x26f/0x770 [sunrpc]
Dec  9 01:19:43 builder [<c0119282>] enqueue_task+0x12/0x30
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder nfsd          S c0119d9d     0  4109      2
Dec  9 01:19:43 builder df832520 00000046 df83254c c0119d9d c140953c
df832520 df832f40 df832674
Dec  9 01:19:43 builder df832674 c1409500 c03ce080 c03ce080 c03ce080
ded2df44 ded2df44 00000282
Dec  9 01:19:43 builder c03ea300 c012bac7 00000000 00000282 ded2df44
01eaabbc df8aff84 ded09000
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c0119d9d>] update_curr+0x4d/0x70
Dec  9 01:19:43 builder [<c012bac7>] __mod_timer+0x97/0xb0
Dec  9 01:19:43 builder [<c02ef71b>] schedule_timeout+0x4b/0xd0
Dec  9 01:19:43 builder [<c012b5d0>] process_timeout+0x0/0x10
Dec  9 01:19:43 builder [<e08b42af>] svc_recv+0x26f/0x770 [sunrpc]
Dec  9 01:19:43 builder [<c0119282>] enqueue_task+0x12/0x30
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<e08ec720>] nfsd+0x0/0x270 [nfsd]
Dec  9 01:19:43 builder [<e08ec7dc>] nfsd+0xbc/0x270 [nfsd]
Dec  9 01:19:43 builder [<c011a28d>] complete+0x3d/0x60
Dec  9 01:19:43 builder [<e08ec720>] nfsd+0x0/0x270 [nfsd]
Dec  9 01:19:43 builder [<c0134e92>] kthread+0x42/0x70
Dec  9 01:19:43 builder [<c0134e50>] kthread+0x0/0x70
Dec  9 01:19:43 builder [<c0104fc7>] kernel_thread_helper+0x7/0x10
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder nfsd          S c0119d9d     0  4110      2
Dec  9 01:19:43 builder df833600 00000046 df83362c c0119d9d c140953c
df833600 df832520 df833754
Dec  9 01:19:43 builder df833754 c1409500 c03ce080 c03ce080 c03ce080
dee8df44 dee8df44 00000282
Dec  9 01:19:43 builder c03ea300 c012bac7 00000000 00000282 dee8df44
01eaabbc df8aff84 ded0a000
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c0119d9d>] update_curr+0x4d/0x70
Dec  9 01:19:43 builder [<c012bac7>] __mod_timer+0x97/0xb0
Dec  9 01:19:43 builder [<c02ef71b>] schedule_timeout+0x4b/0xd0
Dec  9 01:19:43 builder [<c012b5d0>] process_timeout+0x0/0x10
Dec  9 01:19:43 builder [<e08b42af>] svc_recv+0x26f/0x770 [sunrpc]
Dec  9 01:19:43 builder [<c0119282>] enqueue_task+0x12/0x30
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<e08ec720>] nfsd+0x0/0x270 [nfsd]
Dec  9 01:19:43 builder [<e08ec7dc>] nfsd+0xbc/0x270 [nfsd]
Dec  9 01:19:43 builder [<c011a28d>] complete+0x3d/0x60
Dec  9 01:19:43 builder [<e08ec720>] nfsd+0x0/0x270 [nfsd]
Dec  9 01:19:43 builder [<c0134e92>] kthread+0x42/0x70
Dec  9 01:19:43 builder [<c0134e50>] kthread+0x0/0x70
Dec  9 01:19:43 builder [<c0104fc7>] kernel_thread_helper+0x7/0x10
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder nfsd          S c0119d9d     0  4111      2
Dec  9 01:19:43 builder df8332a0 00000046 df8332cc c0119d9d c140953c
df8332a0 df833600 df8333f4
Dec  9 01:19:43 builder df8333f4 c1409500 c03ce080 c03ce080 c03ce080
dee8ff44 dee8ff44 00000282
Dec  9 01:19:43 builder c03ea300 c012bac7 00000000 00000282 dee8ff44
01eaabbc df8aff84 ded0b000
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c0119d9d>] update_curr+0x4d/0x70
Dec  9 01:19:43 builder [<c012bac7>] __mod_timer+0x97/0xb0
Dec  9 01:19:43 builder [<c02ef71b>] schedule_timeout+0x4b/0xd0
Dec  9 01:19:43 builder [<c012b5d0>] process_timeout+0x0/0x10
Dec  9 01:19:43 builder [<e08b42af>] svc_recv+0x26f/0x770 [sunrpc]
Dec  9 01:19:43 builder [<c02ef1bc>] schedule+0x21c/0x620
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<e08ec720>] nfsd+0x0/0x270 [nfsd]
Dec  9 01:19:43 builder [<e08ec7dc>] nfsd+0xbc/0x270 [nfsd]
Dec  9 01:19:43 builder [<c011a28d>] complete+0x3d/0x60
Dec  9 01:19:43 builder [<e08ec720>] nfsd+0x0/0x270 [nfsd]
Dec  9 01:19:43 builder [<c0134e92>] kthread+0x42/0x70
Dec  9 01:19:43 builder [<c0134e50>] kthread+0x0/0x70
Dec  9 01:19:43 builder [<c0104fc7>] kernel_thread_helper+0x7/0x10
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder nfsd          S deeb9f1c     0  4112      2
Dec  9 01:19:43 builder df832be0 00000046 00000002 deeb9f1c deeb9f24
00000000 df89d7a0 df832d34
Dec  9 01:19:43 builder df832d34 c1409500 c03ce080 c03ce080 c03ce080
deeb9f44 deeb9f44 00000282
Dec  9 01:19:43 builder c03ea300 c012bac7 00000000 00000282 deeb9f44
01eaabbc df8aff84 ded0c000
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c012bac7>] __mod_timer+0x97/0xb0
Dec  9 01:19:43 builder [<c02ef71b>] schedule_timeout+0x4b/0xd0
Dec  9 01:19:43 builder [<c012b5d0>] process_timeout+0x0/0x10
Dec  9 01:19:43 builder [<e08b42af>] svc_recv+0x26f/0x770 [sunrpc]
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<e08ec720>] nfsd+0x0/0x270 [nfsd]
Dec  9 01:19:43 builder [<e08ec7dc>] nfsd+0xbc/0x270 [nfsd]
Dec  9 01:19:43 builder [<c011a28d>] complete+0x3d/0x60
Dec  9 01:19:43 builder [<e08ec720>] nfsd+0x0/0x270 [nfsd]
Dec  9 01:19:43 builder [<c0134e92>] kthread+0x42/0x70
Dec  9 01:19:43 builder [<c0134e50>] kthread+0x0/0x70
Dec  9 01:19:43 builder [<c0104fc7>] kernel_thread_helper+0x7/0x10
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder smbd          S deeabb3c     0  4216      1
Dec  9 01:19:43 builder dfb98000 00000086 00000002 deeabb3c deeabb44
00000000 dfb9af40 dfb98154
Dec  9 01:19:43 builder dfb98154 c1409500 c03ce080 c03ce080 c03ce080
00000000 01b47a4c 00000000
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f 00000015
00000000 00000000 ffffffff
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c02ef747>] schedule_timeout+0x77/0xd0
Dec  9 01:19:43 builder [<c013546d>] add_wait_queue+0x1d/0x50
Dec  9 01:19:43 builder [<c0176712>] pipe_poll+0x32/0xb0
Dec  9 01:19:43 builder [<c017d50e>] do_select+0x3ae/0x4a0
Dec  9 01:19:43 builder [<c017db60>] __pollwait+0x0/0x100
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<c0254499>] sock_def_readable+0x69/0x70
Dec  9 01:19:43 builder [<c012b987>] lock_timer_base+0x27/0x60
Dec  9 01:19:43 builder [<c012bac7>] __mod_timer+0x97/0xb0
Dec  9 01:19:43 builder [<c02f0bc3>] _spin_lock_irqsave+0x23/0x30
Dec  9 01:19:43 builder [<c01351b0>] autoremove_wake_function+0x0/0x50
Dec  9 01:19:43 builder [<c017dcad>] sys_select+0x4d/0x1b0
Dec  9 01:19:43 builder [<c0171781>] sys_read+0x41/0x70
Dec  9 01:19:43 builder [<c0103d1f>] sysenter_do_call+0x12/0x33
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder smbd          S deeddf94     0  4225   4216
Dec  9 01:19:43 builder deee06c0 00000082 00000002 deeddf94 deeddf9c
00000000 00000000 deee0814
Dec  9 01:19:43 builder deee0814 c1413500 c03ce080 c03ce080 c03ce080
00000000 fffc508e 00000000
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f b7f63ff4
b7f6bc80 00000000 deedc000
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f deedfb64
01b4a158 00000000 00002710
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c02ef71b>] schedule_timeout+0x4b/0xd0
Dec  9 01:19:43 builder [<c012b5d0>] process_timeout+0x0/0x10
Dec  9 01:19:43 builder [<c017d50e>] do_select+0x3ae/0x4a0
Dec  9 01:19:43 builder [<c017db60>] __pollwait+0x0/0x100
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<c027e08f>] ip_finish_output+0x14f/0x2d0
Dec  9 01:19:43 builder [<c02a0827>] inet_sendmsg+0x37/0x70
Dec  9 01:19:43 builder [<c02504df>] sock_sendmsg+0xbf/0xf0
Dec  9 01:19:43 builder [<c01351b0>] autoremove_wake_function+0x0/0x50
Dec  9 01:19:43 builder [<c017d7c8>] core_sys_select+0x1c8/0x2f0
Dec  9 01:19:43 builder [<c0250897>] sys_sendto+0x107/0x160
Dec  9 01:19:43 builder [<c013a66d>] getnstimeofday+0x3d/0xe0
Dec  9 01:19:43 builder [<c010f75c>] lapic_next_event+0xc/0x10
Dec  9 01:19:43 builder [<c013d8d8>] clockevents_program_event+0xa8/0x120
Dec  9 01:19:43 builder sshd          S deef7b3c     0  4284      1
Dec  9 01:19:43 builder deee1e60 00000086 00000002 deef7b3c deef7b44
00000000 dfb9b960 deee1fb4
Dec  9 01:19:43 builder deee1fb4 c1413500 c03ce080 c03ce080 c03ce080
00000046 01a678d7 00000000
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f 00000005
00000000 00000000 ffffffff
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c02ef747>] schedule_timeout+0x77/0xd0
Dec  9 01:19:43 builder [<c013546d>] add_wait_queue+0x1d/0x50
Dec  9 01:19:43 builder [<c0113535>] handle_vm86_fault+0x285/0x810
Dec  9 01:19:43 builder [<c01155de>] pvclock_clocksource_read+0x4e/0xe0
Dec  9 01:19:43 builder [<c01155de>] pvclock_clocksource_read+0x4e/0xe0
Dec  9 01:19:43 builder [<c01e8ba7>] vsnprintf+0x307/0x5e0
Dec  9 01:19:43 builder [<c01155de>] pvclock_clocksource_read+0x4e/0xe0
Dec  9 01:19:43 builder [<c01244bf>] release_task+0x1cf/0x300
Dec  9 01:19:43 builder [<c01244bf>] release_task+0x1cf/0x300
Dec  9 01:19:43 builder [<c01244bf>] release_task+0x1cf/0x300
Dec  9 01:19:43 builder [<c0124507>] release_task+0x217/0x300
Dec  9 01:19:43 builder [<c010b5ee>] restore_i387+0xee/0x100
Dec  9 01:19:43 builder [<c017dcad>] sys_select+0x4d/0x1b0
Dec  9 01:19:43 builder [<c0103d1f>] sysenter_do_call+0x12/0x33
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder zabbix_agentd S deec7f0c     0  4399      1
Dec  9 01:19:43 builder dfae5b00 00000082 00000002 deec7f0c deec7f14
00000000 c02f0b73 dfae5c54
Dec  9 01:19:43 builder dfae5c54 c1409500 c03ce080 c03ce080 c03ce080
c015b836 fffc5cb9 00000000
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f 00000000
dfae5b00 dfae5bf0 00000000
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c02f0b73>] _spin_lock_irq+0x13/0x20
Dec  9 01:19:43 builder [<c015b836>] handle_mm_fault+0x4c6/0x6a0
Dec  9 01:19:43 builder [<c01250fc>] do_wait+0x29c/0x350
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<c0138390>] hrtimer_wakeup+0x0/0x20
Dec  9 01:19:43 builder [<c0138988>] sys_nanosleep+0x58/0x60
Dec  9 01:19:43 builder [<c0103d1f>] sysenter_do_call+0x12/0x33
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder [<c02528b5>] lock_sock_nested+0xa5/0xb0
Dec  9 01:19:43 builder [<c02f0b2f>] _spin_lock_bh+0xf/0x20
Dec  9 01:19:43 builder [<c0252783>] release_sock+0x13/0xa0
Dec  9 01:19:43 builder [<c0194265>] invalidate_inode_buffers+0x15/0x80
Dec  9 01:19:43 builder [<c01812d7>] d_kill+0x37/0x50
Dec  9 01:19:43 builder [<c0251932>] sys_socketcall+0xd2/0x2b0
Dec  9 01:19:43 builder [<c016ef67>] filp_close+0x47/0x80
Dec  9 01:19:43 builder [<c0282b82>] inet_csk_accept+0x132/0x230
Dec  9 01:19:43 builder [<c01351b0>] autoremove_wake_function+0x0/0x50
Dec  9 01:19:43 builder [<c02a1445>] inet_accept+0x25/0xa0
Dec  9 01:19:43 builder [<c025175e>] do_accept+0x12e/0x230
Dec  9 01:19:43 builder [<c0290a08>] __tcp_push_pending_frames+0x108/0x760
Dec  9 01:19:43 builder [<c02f0b2f>] _spin_lock_bh+0xf/0x20
Dec  9 01:19:43 builder [<c02528b5>] lock_sock_nested+0xa5/0xb0
Dec  9 01:19:43 builder [<c02f0b2f>] _spin_lock_bh+0xf/0x20
Dec  9 01:19:43 builder [<c0252783>] release_sock+0x13/0xa0
Dec  9 01:19:43 builder [<c0194265>] invalidate_inode_buffers+0x15/0x80
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c02ef747>] schedule_timeout+0x77/0xd0
Dec  9 01:19:43 builder [<c02528b5>] lock_sock_nested+0xa5/0xb0
Dec  9 01:19:43 builder [<c02f0b2f>] _spin_lock_bh+0xf/0x20
Dec  9 01:19:43 builder [<c0252783>] release_sock+0x13/0xa0
Dec  9 01:19:43 builder [<c01352f0>] prepare_to_wait_exclusive+0x20/0x70
Dec  9 01:19:43 builder [<c0282b82>] inet_csk_accept+0x132/0x230
Dec  9 01:19:43 builder [<c0194265>] invalidate_inode_buffers+0x15/0x80
Dec  9 01:19:43 builder [<c01812d7>] d_kill+0x37/0x50
Dec  9 01:19:43 builder [<c0251932>] sys_socketcall+0xd2/0x2b0
Dec  9 01:19:43 builder [<c016ef67>] filp_close+0x47/0x80
Dec  9 01:19:43 builder [<c01703ee>] sys_close+0x6e/0xc0
Dec  9 01:19:43 builder [<c0103d1f>] sysenter_do_call+0x12/0x33
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder zabbix_agentd S def2df30     0  4407   4399
Dec  9 01:19:43 builder deee2520 00000086 00000002 def2df30 def2df38
00000000 00000001 deee2674
Dec  9 01:19:43 builder deee2674 c1409500 c03ce080 c03ce080 c03ce080
297d5df0 01b47a4c 00000000
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f def2df60
deee2520 00000000 bfbd9c54
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c02efd6f>] do_nanosleep+0x5f/0x90
Dec  9 01:19:43 builder [<c0103d1f>] sysenter_do_call+0x12/0x33
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder agetty        S deef1eb0     0  4469      1
Dec  9 01:19:43 builder dfae7960 00000082 00000002 deef1eb0 deef1eb8
00000000 00000007 dfae7ab4
Dec  9 01:19:43 builder dfae7ab4 c1409500 c03ce080 c03ce080 c03ce080
c038750c fffc656a 00000000
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f 00000001
dfbad800 7fffffff 7fffffff
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c02ef747>] schedule_timeout+0x77/0xd0
Dec  9 01:19:43 builder [<c02f0bc3>] _spin_lock_irqsave+0x23/0x30
Dec  9 01:19:43 builder [<c0201c85>] tty_read+0x85/0xc0
Dec  9 01:19:43 builder [<c0170b17>] do_sync_read+0xc7/0x110
Dec  9 01:19:43 builder [<c01351b0>] autoremove_wake_function+0x0/0x50
Dec  9 01:19:43 builder [<c0171453>] vfs_read+0x133/0x140
Dec  9 01:19:43 builder [<c0171781>] sys_read+0x41/0x70
Dec  9 01:19:43 builder [<c0103d1f>] sysenter_do_call+0x12/0x33
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder sshd          S 00000000     0  4519   4516
Dec  9 01:19:43 builder deee2f40 00000086 00000000 00000000 00000000
00000000 00000000 deee3094
Dec  9 01:19:43 builder deee3094 c1409500 c03ce080 c03ce080 c03ce080
00000000 00000001 00000082
Dec  9 01:19:43 builder c011a35e 00000000 00000000 00000003 0000000b
00000000 00000000 ffffffff
Dec  9 01:19:43 builder [<c02640bb>] neigh_resolve_output+0xdb/0x280

Dec  9 01:19:43 builder SysRq : Show Blocked State
Dec  9 01:19:43 builder task                PC stack   pid father
Dec  9 01:19:43 builder pdflush       D c140fdc0     0    94      2
Dec  9 01:19:43 builder df89c000 00000046 00000818 c140fdc0 00000002
df87def8 df87df34 df89c154
Dec  9 01:19:43 builder df89c154 c1413500 c03ce080 c03ce080 c03ce080
df87df24 01b3bb70 00000000
Dec  9 01:19:43 builder 00000000 00000000 00000818 0000000f df87df24
01b47a49 df567198 00000001
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c02ef71b>] schedule_timeout+0x4b/0xd0
Dec  9 01:19:43 builder [<c012b5d0>] process_timeout+0x0/0x10
Dec  9 01:19:43 builder [<c01c46b1>] journal_stop+0x141/0x1d0
Dec  9 01:19:43 builder [<c0153db0>] pdflush+0x0/0x1e0
Dec  9 01:19:43 builder [<c01bd36d>] ext3_sync_fs+0xd/0x30
Dec  9 01:19:43 builder [<c017267a>] __fsync_super+0x4a/0x70
Dec  9 01:19:43 builder [<c01726a8>] fsync_super+0x8/0x20
Dec  9 01:19:43 builder [<c01726e9>] do_remount_sb+0x29/0x170
Dec  9 01:19:43 builder [<c0153db0>] pdflush+0x0/0x1e0
Dec  9 01:19:43 builder [<c0172c35>] do_emergency_remount+0x95/0xc0
Dec  9 01:19:43 builder [<c0153eb6>] pdflush+0x106/0x1e0
Dec  9 01:19:43 builder [<c0172ba0>] do_emergency_remount+0x0/0xc0
Dec  9 01:19:43 builder [<c0134e92>] kthread+0x42/0x70
Dec  9 01:19:43 builder [<c0134e50>] kthread+0x0/0x70
Dec  9 01:19:43 builder [<c0104fc7>] kernel_thread_helper+0x7/0x10
Dec  9 01:19:43 builder =======================

Dec  9 01:19:43 builder [<c027e08f>] ip_finish_output+0x14f/0x2d0
Dec  9 01:19:43 builder [<c027dbe5>] ip_local_out+0x15/0x20
Dec  9 01:19:43 builder [<c027e725>] ip_queue_xmit+0x145/0x310
Dec  9 01:19:43 builder [<c0294613>] tcp_v4_send_check+0x43/0xf0
Dec  9 01:19:43 builder [<c028f366>] tcp_transmit_skb+0x386/0x690
Dec  9 01:19:43 builder [<c01155de>] pvclock_clocksource_read+0x4e/0xe0
Dec  9 01:19:43 builder [<c0290a08>] __tcp_push_pending_frames+0x108/0x760
Dec  9 01:19:43 builder [<c0283fe5>] sk_stream_alloc_skb+0x35/0xe0
Dec  9 01:19:43 builder [<c017dcad>] sys_select+0x4d/0x1b0
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f 00000000
deee32a0 deee3390 00000003
Dec  9 01:19:43 builder Call Trace:
Dec  9 01:19:43 builder [<c015b836>] handle_mm_fault+0x4c6/0x6a0
Dec  9 01:19:43 builder [<c01250fc>] do_wait+0x29c/0x350
Dec  9 01:19:43 builder [<c011ca60>] default_wake_function+0x0/0x10
Dec  9 01:19:43 builder [<c0125225>] sys_wait4+0x75/0xc0
Dec  9 01:19:43 builder [<c0125295>] sys_waitpid+0x25/0x30
Dec  9 01:19:43 builder [<c0103d1f>] sysenter_do_call+0x12/0x33
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder bash          S def33eb0     0  4526   4525
Dec  9 01:19:43 builder [<c0205d10>] read_chan+0x0/0x650
Dec  9 01:19:43 builder [<c01713c1>] vfs_read+0xa1/0x140
Dec  9 01:19:43 builder [<c0201c00>] tty_read+0x0/0xc0
Dec  9 01:19:43 builder [<c0171781>] sys_read+0x41/0x70
Dec  9 01:19:43 builder [<c0103d1f>] sysenter_do_call+0x12/0x33
Dec  9 01:19:43 builder =======================
Dec  9 01:19:43 builder smbd          S def53b3c     0  5148   4216
Dec  9 01:19:43 builder deee0000 00000086 00000002 def53b3c def53b44
00000000 c0371340 deee0154
Dec  9 01:19:43 builder deee0154 c1409500 c03ce080 c03ce080 c03ce080
def53b64 01b3f66d 00000000
Dec  9 01:19:43 builder 00000000 00000000 00000000 0000000f def53b64
01b4edf2 00000000 0000ea60
Dec  9 01:19:43 builder [<c02640bb>] neigh_resolve_output+0xdb/0x280
Dec  9 01:19:43 builder [<c027e08f>] ip_finish_output+0x14f/0x2d0
Dec  9 01:19:43 builder [<c027dbe5>] ip_local_out+0x15/0x20
Dec  9 01:19:43 builder [<c027e725>] ip_queue_xmit+0x145/0x310
Dec  9 01:19:43 builder [<c0294613>] tcp_v4_send_check+0x43/0xf0
Dec  9 01:19:43 builder [<c028f366>] tcp_transmit_skb+0x386/0x690
Dec  9 01:19:43 builder [<c02f0bc3>] _spin_lock_irqsave+0x23/0x30
Dec  9 01:19:43 builder [<c012b987>] lock_timer_base+0x27/0x60
Dec  9 01:19:43 builder [<c012bac7>] __mod_timer+0x97/0xb0
Dec  9 01:19:43 builder [<c02528fc>] sk_reset_timer+0xc/0x20

Dec  9 01:19:43 builder SysRq : Show Memory
Dec  9 01:19:43 builder Mem-Info:
Dec  9 01:19:43 builder DMA per-cpu:
Dec  9 01:19:43 builder CPU    0: hi:    0, btch:   1 usd:   0
Dec  9 01:19:43 builder CPU    1: hi:    0, btch:   1 usd:   0
Dec  9 01:19:43 builder Normal per-cpu:
Dec  9 01:19:43 builder CPU    0: hi:  186, btch:  31 usd: 171
Dec  9 01:19:43 builder CPU    1: hi:  186, btch:  31 usd:  75
Dec  9 01:19:43 builder Active:34994 inactive:89089 dirty:19
writeback:0 unstable:0
Dec  9 01:19:43 builder free:1827 slab:2161 mapped:1690 pagetables:144 bounce:0
Dec  9 01:19:43 builder DMA free:2324kB min:84kB low:104kB high:124kB
active:1964kB inactive:8580kB present:15868kB pages_scanned:0
all_unreclaimable? no
Dec  9 01:19:43 builder lowmem_reserve[]: 0 492 492
Dec  9 01:19:43 builder Normal free:4984kB min:2792kB low:3488kB
high:4188kB active:138012kB inactive:347776kB present:503872kB
pages_scanned:0 all_unreclaimable?
 no
Dec  9 01:19:43 builder lowmem_reserve[]: 0 0 0
Dec  9 01:19:43 builder DMA: 3*4kB 11*8kB 1*16kB 5*32kB 8*64kB 2*128kB
1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 2324kB
Dec  9 01:19:43 builder Normal: 230*4kB 33*8kB 8*16kB 18*32kB 3*64kB
3*128kB 0*256kB 1*512kB 2*1024kB 0*2048kB 0*4096kB = 5024kB
Dec  9 01:19:43 builder 95789 total pagecache pages
Dec  9 01:19:43 builder 0 pages in swap cache
Dec  9 01:19:43 builder Swap cache stats: add 17, delete 17, find 0/0
Dec  9 01:19:43 builder Free swap  = 262068kB
Dec  9 01:19:43 builder Total swap = 262136kB
Dec  9 01:19:43 builder 131056 pages RAM
Dec  9 01:19:43 builder 1975 pages reserved
Dec  9 01:19:43 builder 7105 pages shared
Dec  9 01:19:43 builder 123482 pages non-shared


* Re: Hangs
  2008-12-03 17:49                   ` Hangs chris
@ 2008-12-18 18:05                     ` chris
  0 siblings, 0 replies; 26+ messages in thread
From: chris @ 2008-12-18 18:05 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

On Wed, Dec 03, 2008 at 09:49:18AM -0800, chris@versecorp.net wrote:
> On Wed, Dec 03, 2008 at 12:44:54PM +0200, Avi Kivity wrote:
> > chris@versecorp.net wrote:
> > >
> > >I have a way to reproduce my instance of the problem easily now.   I was 
> > >trying
> > >to build a new kernel on my guest,  and found that depmod hangs guests 
> > >every time. 
> > >   In my case, I only have an amd processor - I don't have an intel 
> > >host to try it on, right now,  but it happens on Ubuntu 8.04
> > >and Ubuntu 8.10 guests, both using kvm-79 and the version of kvm that ships
> > >with ubuntu 8.10.
> > >  
> > 
> > What's your guest, how is qemu launched (command line)?
> > 
> > -- 
> > error compiling committee.c: too many arguments to function
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> The guest is Ubuntu 8.10 server (64-bit version).  I also have the same 
> problems with Ubuntu 8.04LTS server.
> 
> Here's the command line:
> 
> sudo /usr/local/bin/qemu-system-x86_64        \
>      -no-kvm-irqchip                         \
>      -daemonize                               \
>      -hda Imgs/sam_home.img                   \
>      -m 512                                   \
>      -cdrom ISOs/ubuntu-8.10-server-amd64.iso \
>      -parallel /dev/lp0                       \
>      -vnc :1                                  \
>      -net nic,macaddr=DE:AD:BE:EF:01:01,model=e1000 \
>      -net tap,ifname=tap1,script=/home/chris/kvm/qemu-ifup.sh \
>      >>& Logs/sam_run.log
> 
> Earlier in the mail chain, Marcelo had me run vmstat when it was hung,
> and it was all zeros.  He also asked for a stack trace on the qemu and it
> showed two threads:
> 
> 	(gdb) info threads
> 	  2 Thread 0x414f1950 (LWP 422)  0x00007f36f07a03e1 in sigtimedwait ()
> 	   from /lib/libc.so.6
> 	  1 Thread 0x7f36f1f306e0 (LWP 414)  0x00007f36f084b482 in select ()
> 	   from /lib/libc.so.6
> 	(gdb) thread 1
> 	[Switching to thread 1 (Thread 0x7f36f1f306e0 (LWP 414))]#0  0x00007f36f084b482
> 	+in select () from /lib/libc.so.6
> 	(gdb) bt
> 	#0  0x00007f36f084b482 in select () from /lib/libc.so.6
> 	#1  0x00000000004094cb in main_loop_wait (timeout=0)
> 	    at /home/chris/pkgs/kvm/kvm-79/qemu/vl.c:4719
> 	#2  0x000000000050a7ea in kvm_main_loop ()
> 	    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:619
> 	#3  0x000000000040fafc in main (argc=<value optimized out>,
> 	    argv=0x7ffff9f41948) at /home/chris/pkgs/kvm/kvm-79/qemu/vl.c:4871
> 	(gdb) thread 2
> 	[Switching to thread 2 (Thread 0x414f1950 (LWP 422))]#0  0x00007f36f07a03e1 in
> 	+sigtimedwait () from /lib/libc.so.6
> 	(gdb) bt
> 	#0  0x00007f36f07a03e1 in sigtimedwait () from /lib/libc.so.6
> 	#1  0x000000000050a560 in kvm_main_loop_wait (env=0xc319e0, timeout=0)
> 	    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:284
> 	#2  0x000000000050aaf7 in ap_main_loop (_env=<value optimized out>)
> 	    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:425
> 	#3  0x00007f36f11ba3ea in start_thread () from /lib/libpthread.so.0
> 	#4  0x00007f36f0852c6d in clone () from /lib/libc.so.6
> 	#5  0x0000000000000000 in ?? ()
> 
> If I can provide any other debug info I'm happy to.   I'm beginning to suspect
> you'll be able to easily reproduce it if you run Ubuntu 8.10 as a guest on 
> amd processors.

FYI, I'm seeing hangs on kvm-81, although the circumstances seem to have
changed somewhat.  Now, when the guest hangs, the qemu-system-x86_64 process
in the host runs 100% busy.   And now I do see some kvm_stat counters
increasing, not just all 0s:

kvm_stat -l
-----------------
 efer_relo      exits  fpu_reloa  halt_exit  halt_wake  host_stat  hypercall  insn_emul  insn_emul     invlpg   io_exits  irq_exits  irq_injec  irq_windo  kvm_reque  largepage  mmio_exit  mmu_cache  mmu_flood  mmu_pde_z  mmu_pte_u  mmu_pte_w  mmu_recyc  mmu_shado  mmu_unsyn  mmu_unsyn  nmi_injec  nmi_windo   pf_fixed   pf_guest  remote_tl  request_n  signal_ex  tlb_flush
         0        277          0          0          0         29          0          0          0          0          0        277          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0        289          0          0          0         35          0          0          0          0          0        290          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0        277          0          0          0         27          0          0          0          0          0        277          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0        290          0          0          0         33          0          0          0          0          0        289          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0        290          0          0          0         34          0          0          0          0          0        290          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0        288          0          0          0         30          0          0          0          0          0        288          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0        296          0          0          0         36          0          0          0          0          0        297          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0        292          0          0          0         33          0          0          0          0          0        291          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0        275          0          0          0         26          0          0          0          0          0        275          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0        289          0          0          0         31          0          0          0          0          0        290          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0        309          0          0          0         41          0          0          0          0          0        308          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
         0        289          0          0          0         29          0          0          0          0          0        289          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
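(For anyone reading the table above: `kvm_stat -l` samples the counter files
under /sys/kernel/debug/kvm once per interval and prints the deltas.  A minimal
sketch of that delta computation — the counter names and values below are
made-up illustration data, not real samples:)

```python
# Sketch of the per-interval delta computation that `kvm_stat -l` performs.
# On a real host the snapshots would come from reading the counter files
# under /sys/kernel/debug/kvm; the names/values here are hypothetical.

def counter_deltas(prev, curr):
    """Return per-counter deltas between two snapshots (dicts of name -> count)."""
    return {name: curr[name] - prev.get(name, 0) for name in curr}

# Two hypothetical snapshots, one sampling interval apart.
prev = {"exits": 10000, "irq_exits": 9000, "halt_exits": 0, "io_exits": 0}
curr = {"exits": 10290, "irq_exits": 9289, "halt_exits": 0, "io_exits": 0}

print(counter_deltas(prev, curr))
```

The pattern in the table (exits tracking irq_exits almost one-for-one, with
halt_exits and io_exits pinned at 0) looks consistent with a vcpu that is
spinning on interrupt exits without ever halting, which would match the 100%
busy qemu process.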


GDB on it shows:
---------------
(gdb) info threads
  2 Thread 0x41fad950 (LWP 6514)  0x00007f498e8e4a17 in ioctl ()
   from /lib/libc.so.6
  1 Thread 0x7f498ffc36e0 (LWP 6506)  0x00007f498e8e5482 in select ()
   from /lib/libc.so.6

(gdb) thread 1
[Switching to thread 1 (Thread 0x7f498ffc36e0 (LWP 6506))]#0  0x00007f498e8e5482 in select () from /lib/libc.so.6
(gdb) bt
#0  0x00007f498e8e5482 in select () from /lib/libc.so.6
#1  0x0000000000409b8b in main_loop_wait (timeout=0)
    at /home/chris/pkgs/kvm/kvm-81/qemu/vl.c:3623
#2  0x000000000051653a in kvm_main_loop ()
    at /home/chris/pkgs/kvm/kvm-81/qemu/qemu-kvm.c:597
#3  0x000000000040e0bc in main (argc=<value optimized out>, 
    argv=0x7fff97fda878) at /home/chris/pkgs/kvm/kvm-81/qemu/vl.c:3785

(gdb) thread 2
[Switching to thread 2 (Thread 0x41fad950 (LWP 6514))]#0  0x00007f498e8e4a17 in ioctl () from /lib/libc.so.6
(gdb) bt
#0  0x00007f498e8e4a17 in ioctl () from /lib/libc.so.6
#1  0x0000000000543bba in kvm_run (kvm=0xbe0040, vcpu=0, env=0xc5a1d0)
    at libkvm.c:881
#2  0x0000000000516649 in kvm_cpu_exec (env=<value optimized out>)
    at /home/chris/pkgs/kvm/kvm-81/qemu/qemu-kvm.c:207
#3  0x0000000000516948 in ap_main_loop (_env=<value optimized out>)
    at /home/chris/pkgs/kvm/kvm-81/qemu/qemu-kvm.c:414
#4  0x00007f498f2543ea in start_thread () from /lib/libpthread.so.0
#5  0x00007f498e8ecc6d in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thanks,
Chris




Thread overview: 26+ messages
2008-11-19 22:43 Hangs chris
2008-11-20 17:10 ` Hangs chris
2008-11-21 19:32   ` Hangs Marcelo Tosatti
2008-11-21 23:43     ` Hangs Roland Lammel
2008-11-22 17:54     ` Hangs chris
     [not found]       ` <519a8b110811280305v764fade1w9d02f5c9188f56e5@mail.gmail.com>
2008-11-28 12:35         ` Hangs xming
2008-12-02 10:47           ` Hangs xming
2008-12-02 12:09             ` Hangs Avi Kivity
2008-12-02 20:58               ` Hangs chris
2008-12-02 23:01                 ` Hangs xming
2008-12-03  1:20                   ` Hangs chris
2008-12-03  9:13                     ` Hangs xming
2008-12-03 10:44                 ` Hangs Avi Kivity
2008-12-03 17:49                   ` Hangs chris
2008-12-18 18:05                     ` Hangs chris
  -- strict thread matches above, loose matches on Subject: below --
2008-11-14 15:34 Hangs Chris Jones
2008-11-16 16:36 ` Hangs Chris Jones
2008-11-18 21:34 ` Hangs Marcelo Tosatti
2008-11-19 10:57   ` Hangs Roland Lammel
2008-11-19 21:53     ` Hangs Roland Lammel
     [not found]       ` <20081120015600.GB10846@dmt.cnet>
2008-11-21 15:55         ` Hangs Glauber Costa
2008-11-21 20:29           ` Hangs Roland Lammel
2008-11-21 21:01             ` Hangs Daniel P. Berrange
2008-11-21 23:46               ` Hangs Roland Lammel
2008-12-06 23:18                 ` Hangs Roland Lammel
2008-12-09  0:34                   ` Hangs xming
