public inbox for kvm@vger.kernel.org
* [Bug 153571] New: 100% CPU usage on guest VM
@ 2016-08-23 13:41 bugzilla-daemon
  2016-08-24  5:47 ` [Bug 153571] " bugzilla-daemon
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: bugzilla-daemon @ 2016-08-23 13:41 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=153571

            Bug ID: 153571
           Summary: 100% CPU usage on guest VM
           Product: Virtualization
           Version: unspecified
    Kernel Version: 3.10.0-327.22.2.el7.x86_64
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: kvm
          Assignee: virtualization_kvm@kernel-bugs.osdl.org
          Reporter: laurentiu@soica.ro
        Regression: No

Hardware details:

Baremetal:
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
128 GB RAM

L1 guest:
36 vCPUs
100 GB RAM

Software details:

Baremetal:
CentOS Linux release 7.2.1511 (Core)
qemu-kvm 2.3.0-31.el7.16.1
Nested KVM enabled

VM:
CentOS Linux release 7.2.1511 (Core)
On this VM I have 15 L2 guests running.

Problem description:
After a few days of running (more than 90% idle on both the baremetal host and
the compute node), the compute node (the L1 guest) goes to 100% CPU usage on
all vCPUs (3600%). The L2 guests are no longer accessible.

The only workaround is to shut down the L1 guest and start it again. A restart
on the L1 guest isn't enough.

Running perf record -a -g on the baremetal host shows that most of the CPU time
is spent in _raw_spin_lock:

Children      Self    Command          Shared Object                Symbol
-  93.62%     93.62%  qemu-kvm         [kernel.kallsyms]            [k] _raw_spin_lock
   - _raw_spin_lock
      + 45.30% kvm_mmu_sync_roots
      + 28.49% kvm_mmu_load
      + 25.00% mmu_free_roots
      + 1.12% tdp_page_fault
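
To make the caller breakdown above easier to compare across runs, here is a
minimal sketch that parses those call-graph lines and ranks the callers of
_raw_spin_lock. The percentages are copied from the excerpt above; the helper
name parse_callers and the line pattern are assumptions for illustration, not
part of perf itself.

```python
import re

# Caller breakdown copied from the perf excerpt in this report.
PERF_CALLERS = """\
      + 45.30% kvm_mmu_sync_roots
      + 28.49% kvm_mmu_load
      + 25.00% mmu_free_roots
      + 1.12% tdp_page_fault
"""

# Matches lines such as "+ 45.30% kvm_mmu_sync_roots" (hypothetical pattern
# written for this excerpt only, not a general perf output parser).
LINE_PATTERN = re.compile(r"^\s*[+-]\s+(\d+(?:\.\d+)?)%\s+(\S+)\s*$")


def parse_callers(text):
    """Return (symbol, percent) pairs sorted by descending share."""
    rows = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            rows.append((match.group(2), float(match.group(1))))
    return sorted(rows, key=lambda row: row[1], reverse=True)


callers = parse_callers(PERF_CALLERS)
# Combined share of the three largest callers listed in the excerpt.
top_three_share = sum(percent for _, percent in callers[:3])

for symbol, percent in callers:
    print(f"{percent:6.2f}%  {symbol}")
print(f"top three callers combined: {top_three_share:.2f}%")
```

The three largest callers (kvm_mmu_sync_roots, kvm_mmu_load and
mmu_free_roots) account for almost all of the listed lock time.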

When the CPU goes to 100%:
 - the reported free memory on the host is close to 0 (around 300 MB), although
about 50 GB is in use as cached/buffered memory.
 - the resident memory of the L1 VM on the host is around 50-60 GB
 - the free memory inside the L1 VM is around 60 GB
 - there is no swapping activity on the host (around 150 MB of swap used); swap
is disabled on the L1 guest.
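
The difference between "free" and "cached/buffered" memory noted above can be
read from /proc/meminfo. The following is a minimal sketch: the sample text
uses illustrative values chosen only to resemble the figures in this report
(they are not measured data), and the helper names parse_meminfo and
summarize are hypothetical.

```python
# Illustrative /proc/meminfo excerpt. The numbers are made up to resemble the
# situation described in this report; they were not captured from the host.
SAMPLE_MEMINFO = """\
MemTotal:       131072000 kB
MemFree:           307200 kB
Buffers:          1048576 kB
Cached:          51380224 kB
"""


def parse_meminfo(text):
    """Map each field name to its value in kB."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def summarize(fields):
    """Return free memory and buffer/page-cache memory in GiB."""
    kib_per_gib = 1024 ** 2
    return {
        "free_gib": fields["MemFree"] / kib_per_gib,
        "cache_gib": (fields["Buffers"] + fields["Cached"]) / kib_per_gib,
    }


summary = summarize(parse_meminfo(SAMPLE_MEMINFO))
print(f"free:  {summary['free_gib']:.2f} GiB")
print(f"cache: {summary['cache_gib']:.2f} GiB")
```

On a live host the same functions could be fed the contents of /proc/meminfo.
Buffer and page-cache memory can usually be reclaimed by the kernel, so a low
MemFree value alone does not necessarily mean the host is out of memory.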

qemu command line: /usr/libexec/qemu-kvm -name baremetalbrbm_1 -S -machine
pc-i440fx-rhel7.0.0,accel=kvm,usb=off -cpu host -m 102400 -realtime mlock=off
-smp 36,sockets=36,cores=1,threads=1 -uuid 534e9b54-5e4c-4acb-adcf-793f841551a7
-no-user-config -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-baremetalbrbm_1/monitor.sock,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown
-boot menu=off,strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
-device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device
ahci,id=sata0,bus=pci.0,addr=0x5 -drive
file=/var/lib/libvirt/images/baremetalbrbm_1.qcow2,if=none,id=drive-sata0-0-0,format=qcow2,cache=unsafe
-device ide-hd,bus=sata0.0,drive=drive-sata0-0-0,id=sata0-0-0,bootindex=1
-netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=26 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=00:f1:15:20:c5:46,bus=pci.0,addr=0x3
-netdev tap,fd=28,id=hostnet1 -device
rtl8139,netdev=hostnet1,id=net1,mac=52:54:00:d3:c9:24,bus=pci.0,addr=0x7
-chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0
-vnc 127.0.0.1:2 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on
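
As a cross-check of the command line above against the L1 guest sizing given
earlier, here is a minimal sketch that extracts the -m and -smp values. Only
an excerpt of the command line is reproduced, and the helper option_value is
hypothetical. It relies on QEMU interpreting a bare -m number as mebibytes.

```python
import shlex

# Excerpt of the qemu-kvm command line from this report (most options omitted).
CMDLINE_EXCERPT = (
    "/usr/libexec/qemu-kvm -name baremetalbrbm_1 -cpu host -m 102400 "
    "-smp 36,sockets=36,cores=1,threads=1"
)


def option_value(argv, flag):
    """Return the argument that follows `flag`, or None if it is absent."""
    for index, token in enumerate(argv[:-1]):
        if token == flag:
            return argv[index + 1]
    return None


argv = shlex.split(CMDLINE_EXCERPT)

# A bare number given to -m is a size in MiB.
mem_gib = int(option_value(argv, "-m")) / 1024

# The first comma-separated field of -smp is the vCPU count.
vcpus = int(option_value(argv, "-smp").split(",")[0])

# top-style accounting reports 100% per fully busy CPU.
cpu_ceiling_percent = vcpus * 100

print(f"memory: {mem_gib:.0f} GiB, vCPUs: {vcpus}, "
      f"CPU ceiling: {cpu_ceiling_percent}%")
```

The values match the 36 vCPUs and 100 GB RAM listed for the L1 guest, and the
3600% figure corresponds to all 36 vCPUs being fully busy.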

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 153571] 100% CPU usage on guest VM
  2016-08-23 13:41 [Bug 153571] New: 100% CPU usage on guest VM bugzilla-daemon
@ 2016-08-24  5:47 ` bugzilla-daemon
  2016-08-24  6:14 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2016-08-24  5:47 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=153571

Wanpeng Li <wanpeng.li@hotmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wanpeng.li@hotmail.com

--- Comment #1 from Wanpeng Li <wanpeng.li@hotmail.com> ---
Could you reproduce this against last kvm tree? Then we can pay more attention
to it.


* [Bug 153571] 100% CPU usage on guest VM
  2016-08-23 13:41 [Bug 153571] New: 100% CPU usage on guest VM bugzilla-daemon
  2016-08-24  5:47 ` [Bug 153571] " bugzilla-daemon
@ 2016-08-24  6:14 ` bugzilla-daemon
  2016-08-24  8:10 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2016-08-24  6:14 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=153571

--- Comment #2 from Wanpeng Li <wanpeng.li@hotmail.com> ---
s/last/latest


* [Bug 153571] 100% CPU usage on guest VM
  2016-08-23 13:41 [Bug 153571] New: 100% CPU usage on guest VM bugzilla-daemon
  2016-08-24  5:47 ` [Bug 153571] " bugzilla-daemon
  2016-08-24  6:14 ` bugzilla-daemon
@ 2016-08-24  8:10 ` bugzilla-daemon
  2016-08-26 11:42 ` bugzilla-daemon
  2017-02-05 11:01 ` bugzilla-daemon
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2016-08-24  8:10 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=153571

--- Comment #3 from Laurentiu Soica <laurentiu@soica.ro> ---
(In reply to Wanpeng Li from comment #1)
> Could you reproduce this against last kvm tree? Then we can pay more
> attention to it.

Is there a kvm build from the latest sources reported as working on CentOS 7?


* [Bug 153571] 100% CPU usage on guest VM
  2016-08-23 13:41 [Bug 153571] New: 100% CPU usage on guest VM bugzilla-daemon
                   ` (2 preceding siblings ...)
  2016-08-24  8:10 ` bugzilla-daemon
@ 2016-08-26 11:42 ` bugzilla-daemon
  2017-02-05 11:01 ` bugzilla-daemon
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2016-08-26 11:42 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=153571

Paolo Bonzini <bonzini@gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bonzini@gnu.org

--- Comment #4 from Paolo Bonzini <bonzini@gnu.org> ---
You should be able to get a Fedora kernel package from koji.fedoraproject.org
and use it on CentOS 7.  That's close enough.  Please use the latest 4.7 build
you can find there.


* [Bug 153571] 100% CPU usage on guest VM
  2016-08-23 13:41 [Bug 153571] New: 100% CPU usage on guest VM bugzilla-daemon
                   ` (3 preceding siblings ...)
  2016-08-26 11:42 ` bugzilla-daemon
@ 2017-02-05 11:01 ` bugzilla-daemon
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2017-02-05 11:01 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=153571

--- Comment #5 from Laurentiu Soica <laurentiu@soica.ro> ---
I wasn't able to boot with the kernel from koji.fedoraproject.org. Instead, I
built 4.9.6 from source.

Although the CPU usage is not at 100% for now (it is still very high), the
profile looks similar. This time the load is in queued_spin_lock_slowpath:

  Children      Self  Command          Shared Object               Symbol
-   69.38%    69.38%  qemu-kvm         [kernel.vmlinux]            [k] queued_spin_lock_slowpath
   - queued_spin_lock_slowpath
      - _raw_spin_lock
         + 46.76% kvm_mmu_sync_roots
         + 29.95% mmu_free_roots
         + 23.12% kvm_mmu_load
