All of lore.kernel.org
 help / color / mirror / Atom feed
From: Philipp Hahn <hahn@univention.de>
To: kvm <kvm@vger.kernel.org>
Subject: RFH: kvm-0.14.1 on 2.6.32.42 stuck in flash_tlb_others_ipi()
Date: Mon, 9 Jan 2012 12:41:41 +0100	[thread overview]
Message-ID: <201201091241.45281.hahn@univention.de> (raw)


[-- Attachment #1.1: Type: text/plain, Size: 1366 bytes --]

Hello,

one of our VMs regularly get stuck: the VM is completely unresponsive (no ssh, 
no serial console, no VNC). Using "gdbserver" and a remote system to debug 
the running VM, I see 3 CPUs (1,3,4) stuck in
 pgd_alloc() → spin_lock_irqsave(pgd_lock)
while the 4th CPU (2) is waiting in
 pgd_alloc() → pgd_prepopulate_pmb() →... →  flush_tlb_others_ipi()

195                     while (!cpumask_empty(to_cpumask(f->flush_cpumask)))
196                             cpu_relax();
(gdb) print f->flush_cpumask
$5 = {1}

CPU 1 is duing a do_exec() syscall, will CPU 2-4 are doing a do_fork() syscall 
according to "thread apply all backtrace".

After a "set variable f->flush_cpumask 0" from gdb the kernel continued 
dumping the trace-informations, which I attached.


Host: Debian linux-2.6.32-38-amd64 (=2.6.32.42), 8 Cores
Kvm: 0.14.1+dfsg
Guest: Debian linux-2.6.32-38-i686-bigmem, 4 CPUs

Is this a known bug and/or is a fix available?
I can gather more information from the VM if needd.

Sincerely
Philipp
-- 
Philipp Hahn           Open Source Software Engineer      hahn@univention.de
Univention GmbH        Linux for Your Business        fon: +49 421 22 232- 0
Mary-Somerville-Str.1  D-28359 Bremen                 fax: +49 421 22 232-99
                                                   http://www.univention.de/

[-- Attachment #1.2: dmesg.txt --]
[-- Type: text/plain, Size: 7840 bytes --]

[261850.404057] BUG: soft lockup - CPU#1 stuck for 219010s! [sshd:2707]
[261850.404057] Modules linked in: binfmt_misc nfsd exportfs xt_tcpudp nf_conntrack_ipv6 ip6table_mangle ip6table_filter ip6_tables xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables nfs lockd fscache nfs_acl auth_rpcgss sunrpc quota_v2 quota_tree sd_mod crc_t10dif ide_generic ide_gd_mod ide_core cirrusfb 8139cp mii snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 psmouse i2c_core i6300esb virtio_balloon joydev serio_raw pcspkr processor evdev usbhid hid ext3 jbd dm_snapshot dm_mirror dm_region_hash dm_log dm_mod uhci_hcd ata_generic virtio_blk virtio_net ehci_hcd ata_piix thermal usbcore nls_base virtio_pci floppy thermal_sys libata button [last unloaded: scsi_wait_scan]
[261850.404057]
[261850.404057] Pid: 2707, comm: sshd Not tainted (2.6.32-ucs52-686-bigmem #1) Bochs
[261850.404057] EIP: 0060:[<c1293e1d>] EFLAGS: 00000292 CPU: 1
[261850.404057] EIP is at _spin_unlock_irqrestore+0x9/0xf
[261850.404057] EAX: 00000292 EBX: 00000003 ECX: 00000800 EDX: 00000292
[261850.404057] ESI: 00000000 EDI: 00000001 EBP: f55f3eb4 ESP: f55f3e64
[261850.404057]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[261850.404057] CR0: 8005003b CR2: b72a1000 CR3: 35607000 CR4: 000006f0
[261850.404057] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[261850.404057] DR6: ffff0ff0 DR7: 00000400
[261850.404057] Call Trace:
[261850.404057]  [<c102422e>] ? pgd_alloc+0x1c7/0x21a
[261850.404057]  [<c1034908>] mm_init+0xa8/0xd4
[261850.404057]  [<c1034dd1>] ? dup_mm+0x6a/0x389
[261850.404057]  [<c10356bf>] ? copy_process+0x57a/0xf28
[261850.404057]  [<c1035a60>] ? copy_process+0x91b/0xf28
[261850.404057]  [<c10361a7>] ? do_fork+0x13a/0x2bc
[261850.404057]  [<c1006e6e>] ? sys_clone+0x21/0x27
[261850.404057]  [<c100829c>] ? syscall_call+0x7/0xb
[261850.404058] BUG: soft lockup - CPU#2 stuck for 218991s! [runsv:1236]
[261850.404058] Modules linked in: binfmt_misc nfsd exportfs xt_tcpudp nf_conntrack_ipv6 ip6table_mangle ip6table_filter ip6_tables xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables nfs lockd fscache nfs_acl auth_rpcgss sunrpc quota_v2 quota_tree sd_mod crc_t10dif ide_generic ide_gd_mod ide_core cirrusfb 8139cp mii snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 psmouse i2c_core i6300esb virtio_balloon joydev serio_raw pcspkr processor evdev usbhid hid ext3 jbd dm_snapshot dm_mirror dm_region_hash dm_log dm_mod uhci_hcd ata_generic virtio_blk virtio_net ehci_hcd ata_piix thermal usbcore nls_base virtio_pci floppy thermal_sys libata button [last unloaded: scsi_wait_scan]
[261850.404058]
[261850.404058] Pid: 1236, comm: runsv Not tainted (2.6.32-ucs52-686-bigmem #1) Bochs
[261850.404058] EIP: 0060:[<c1293e1d>] EFLAGS: 00000296 CPU: 2
[261850.404058] EIP is at _spin_unlock_irqrestore+0x9/0xf
[261850.404058] EAX: 00000296 EBX: 00000003 ECX: 00000003 EDX: 00000296
[261850.404058] ESI: 00000000 EDI: 00000001 EBP: f602febc ESP: f602fe6c
[261850.404058]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[261850.404058] CR0: 8005003b CR2: 0810acf4 CR3: 36a91000 CR4: 000006f0
[261850.404058] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[261850.404058] DR6: ffff0ff0 DR7: 00000400
[261850.404058] Call Trace:
[261850.404058]  [<c102422e>] ? pgd_alloc+0x1c7/0x21a
[261850.404058]  [<c1034908>] mm_init+0xa8/0xd4
[261850.404058]  [<c1034dd1>] ? dup_mm+0x6a/0x389
[261850.404058]  [<c10356bf>] ? copy_process+0x57a/0xf28
[261850.404058]  [<c1035a60>] ? copy_process+0x91b/0xf28
[261850.404058]  [<c10361a7>] ? do_fork+0x13a/0x2bc
[261850.404058]  [<c10bae59>] ? do_sync_read+0x0/0x107
[261850.404058]  [<c10bb8ae>] ? vfs_read+0x7b/0xd3
[261850.404058]  [<c100dc12>] ? sys_fork+0x15/0x19
[261850.404058]  [<c100829c>] ? syscall_call+0x7/0xb
[261850.438247] BUG: soft lockup - CPU#0 stuck for 218999s! [pbuilder-update:2713]
[261850.438247] Modules linked in: binfmt_misc nfsd exportfs xt_tcpudp nf_conntrack_ipv6 ip6table_mangle ip6table_filter ip6_tables xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables nfs lockd fscache nfs_acl auth_rpcgss sunrpc quota_v2 quota_tree sd_mod crc_t10dif ide_generic ide_gd_mod ide_core cirrusfb 8139cp mii snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 psmouse i2c_core i6300esb virtio_balloon joydev serio_raw pcspkr processor evdev usbhid hid ext3 jbd dm_snapshot dm_mirror dm_region_hash dm_log dm_mod uhci_hcd ata_generic virtio_blk virtio_net ehci_hcd ata_piix thermal usbcore nls_base virtio_pci floppy thermal_sys libata button [last unloaded: scsi_wait_scan]
[261850.438247]
[261850.438247] Pid: 2713, comm: pbuilder-update Not tainted (2.6.32-ucs52-686-bigmem #1) Bochs
[261850.438247] EIP: 0060:[<c1293e1d>] EFLAGS: 00200292 CPU: 0
[261850.438247] EIP is at _spin_unlock_irqrestore+0x9/0xf
[261850.438247] EAX: 00200292 EBX: 00000003 ECX: 00000000 EDX: 00200292
[261850.438247] ESI: 00000000 EDI: 00000001 EBP: f54ddf4c ESP: f54ddefc
[261850.438247]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[261850.438247] CR0: 8005003b CR2: b73611a0 CR3: 35683000 CR4: 000006f0
[261850.438247] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[261850.438247] DR6: ffff0ff0 DR7: 00000400
[261850.438247] Call Trace:
[261850.438247]  [<c102422e>] ? pgd_alloc+0x1c7/0x21a
[261850.438247]  [<c1292e1f>] ? _cond_resched+0x25/0x3c
[261850.438247]  [<c1292e56>] ? wait_for_common+0x20/0x100
[261850.438247]  [<c1034908>] mm_init+0xa8/0xd4
[261850.438247]  [<c10bfb14>] ? bprm_mm_init+0x14/0x163
[261850.438247]  [<c10bfffc>] ? do_execve+0xe3/0x25b
[261850.438247]  [<c1006e29>] ? sys_execve+0x23/0x47
[261850.438247]  [<c100829c>] ? syscall_call+0x7/0xb
[261850.483139] BUG: soft lockup - CPU#3 stuck for 218991s! [nrpe:1254]
[261850.483139] Modules linked in: binfmt_misc nfsd exportfs xt_tcpudp nf_conntrack_ipv6 ip6table_mangle ip6table_filter ip6_tables xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables nfs lockd fscache nfs_acl auth_rpcgss sunrpc quota_v2 quota_tree sd_mod crc_t10dif ide_generic ide_gd_mod ide_core cirrusfb 8139cp mii snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 psmouse i2c_core i6300esb virtio_balloon joydev serio_raw pcspkr processor evdev usbhid hid ext3 jbd dm_snapshot dm_mirror dm_region_hash dm_log dm_mod uhci_hcd ata_generic virtio_blk virtio_net ehci_hcd ata_piix thermal usbcore nls_base virtio_pci floppy thermal_sys libata button [last unloaded: scsi_wait_scan]
[261850.483139]
[261850.483139] Pid: 1254, comm: nrpe Not tainted (2.6.32-ucs52-686-bigmem #1) Bochs
[261850.483139] EIP: 0060:[<c1293e1d>] EFLAGS: 00200292 CPU: 3
[261850.483139] EIP is at _spin_unlock_irqrestore+0x9/0xf
[261850.483139] EAX: 00200292 EBX: 00000003 ECX: 00000004 EDX: 00200292
[261850.483139] ESI: 00000000 EDI: 00000001 EBP: f647feb4 ESP: f647fe64
[261850.483139]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[261850.483139] CR0: 8005003b CR2: b74ee580 CR3: 36010000 CR4: 000006f0
[261850.483139] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[261850.483139] DR6: ffff0ff0 DR7: 00000400
[261850.483139] Call Trace:
[261850.483139]  [<c102422e>] ? pgd_alloc+0x1c7/0x21a
[261850.483139]  [<c1034908>] mm_init+0xa8/0xd4
[261850.483139]  [<c1034dd1>] ? dup_mm+0x6a/0x389
[261850.483139]  [<c10356bf>] ? copy_process+0x57a/0xf28
[261850.483139]  [<c1035a60>] ? copy_process+0x91b/0xf28
[261850.483139]  [<c10361a7>] ? do_fork+0x13a/0x2bc
[261850.483139]  [<c114cae3>] ? copy_to_user+0x29/0xf8
[261850.483139]  [<c1006e6e>] ? sys_clone+0x21/0x27
[261850.483139]  [<c100829c>] ? syscall_call+0x7/0xb
[261852.541339] nfs: RPC call returned error 88

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

             reply	other threads:[~2012-01-09 11:51 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-09 11:41 Philipp Hahn [this message]
2012-03-30 17:44 ` RFH: kvm-1.0+git on 3.3.0 stuck in flash_tlb_others_ipi() Philipp Hahn
2012-04-19 13:42   ` [SOLVED] 2.6.32 stuck in flush_tlb_others_ipi() Philipp Hahn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201201091241.45281.hahn@univention.de \
    --to=hahn@univention.de \
    --cc=kvm@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.