* KVM lock contention on 48 core AMD machine
From: Ben Nagy @ 2011-03-18 12:02 UTC (permalink / raw)
  To: kvm

Hi,

We've been trying to debug a problem when bringing up VMs on a 48 core
AMD machine (4 x Opteron 6168). After some investigation and some
helpful comments from #kvm, it appears that we hit a serious lock
contention issue at a certain point. We have enabled lockdep debugging
(had to increase MAX_LOCK_DEPTH in sched.h to 144!) and have some
output, but I'm not all that sure how to progress from here in
troubleshooting the issue.

Linux eax 2.6.38-7-vmhost #35 SMP Thu Mar 17 13:25:10 SGT 2011 x86_64
x86_64 x86_64 GNU/Linux
(vmhost is a custom flavour which has the lock debugging stuff
enabled, the base distro is Ubuntu Natty alpha3)

QEMU emulator version 0.14.0 (qemu-kvm-0.14.0), Copyright (c)
2003-2008 Fabrice Bellard

CPUs - 4 physical 12 core CPUs.
model name      : AMD Opteron(tm) Processor 6168
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl
nonstop_tsc extd_apicid amd_dcm pni monitor cx16 popcnt lahf_lm
cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch
osvw ibs skinit wdt nodeid_msr npt lbrv svm_lock nrip_save pausefilter

RAM: 96GB

KVM commandline (using libvirt):
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin
QEMU_AUDIO_DRV=none /usr/local/bin/kvm-snapshot -S -M pc-0.14
-enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name fb-0
-uuid de59229b-eb06-9ecc-758e-d20bc5ddc291 -nodefconfig -nodefaults
-chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/fb-0.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=readline -rtc base=localtime
-no-acpi -boot cd -drive
file=/mnt/big/bigfiles/kvm_disks/eax/fb-0.ovl,if=none,id=drive-ide0-0-0,format=qcow2
-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0
-drive if=none,media=cdrom,id=drive-ide0-0-1,readonly=on,format=raw
-device ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1
-netdev tap,fd=17,id=hostnet0 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:d9:09:ef,bus=pci.0,addr=0x3
-usb -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -k en-us -vga
cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4

kvm-snapshot is just a script that runs /usr/bin/kvm "$@" -snapshot

The VMs are .ovl files which link to a single qcow2 disk, which is
hosted on an iscsi volume, with ocfs2 as the filesystem. However, we
reproduced the problem running all the VMs locally, so that seems to
indicate that it's not an IB issue.

Basically, after 30 to 40 machines finish booting, the system CPU
utilisation climbs as high as 99.9%. The VMs are unresponsive, but the
host itself is still responsive.

Here's some output from perf top while the system is locky:
           263832.00 46.3% delay_tsc                [kernel.kallsyms]
           231491.00 40.7% __ticket_spin_trylock    [kernel.kallsyms]
            14609.00  2.6% native_read_tsc          [kernel.kallsyms]
             9414.00  1.7% do_raw_spin_lock         [kernel.kallsyms]
             8041.00  1.4% local_clock              [kernel.kallsyms]
             6081.00  1.1% native_safe_halt         [kernel.kallsyms]
             3901.00  0.7% __lock_acquire.clone.18  [kernel.kallsyms]
             3665.00  0.6% do_raw_spin_unlock       [kernel.kallsyms]
             3042.00  0.5% __delay                  [kernel.kallsyms]
             2484.00  0.4% lock_contended           [kernel.kallsyms]
             2484.00  0.4% sched_clock_cpu          [kernel.kallsyms]
             1906.00  0.3% sched_clock_local        [kernel.kallsyms]
             1419.00  0.2% lock_acquire             [kernel.kallsyms]
             1332.00  0.2% lock_release             [kernel.kallsyms]
              987.00  0.2% tg_load_down             [kernel.kallsyms]
              895.00  0.2% _raw_spin_lock_irqsave   [kernel.kallsyms]
              686.00  0.1% find_busiest_group       [kernel.kallsyms]
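
As a rough way to read the numbers above, here is a minimal Python sketch
that parses the first few perf top rows and adds up the share taken by
symbols that look like part of the spin-wait path. The grouping in
SPIN_SYMBOLS is an assumption made for illustration (on a kernel with
spinlock debugging enabled, the contended path is expected to retry a
trylock with a short delay in between), and the helper names are
hypothetical, not part of perf.

```python
# Rows copied from the perf top output above: samples, percent, symbol.
PERF_TOP = """\
263832.00 46.3% delay_tsc
231491.00 40.7% __ticket_spin_trylock
14609.00  2.6% native_read_tsc
9414.00  1.7% do_raw_spin_lock
8041.00  1.4% local_clock
6081.00  1.1% native_safe_halt
3901.00  0.7% __lock_acquire.clone.18
3665.00  0.6% do_raw_spin_unlock
3042.00  0.5% __delay
"""

# Assumption: these symbols belong to the "spin, delay, retry" loop.
SPIN_SYMBOLS = {
    "delay_tsc",
    "__ticket_spin_trylock",
    "native_read_tsc",
    "do_raw_spin_lock",
    "__delay",
}


def parse_perf_top(text):
    """Return a list of (symbol, samples, percent) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not "<samples> <pct>% <symbol>".
        if len(parts) < 3 or not parts[1].endswith("%"):
            continue
        rows.append((parts[2], float(parts[0]), float(parts[1].rstrip("%"))))
    return rows


def spin_share(rows):
    """Sum the percentages of the rows whose symbol is in SPIN_SYMBOLS."""
    return sum(pct for sym, _samples, pct in rows if sym in SPIN_SYMBOLS)


rows = parse_perf_top(PERF_TOP)
print(f"spin-related share: {spin_share(rows):.1f}%")
```

If that grouping is right, nearly all of the sampled time is going into
waiting for locks rather than into useful work.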

I have been looking at the top contended locks when the system is
idle, with some VMs running, and when it's in the locky condition:

http://paste.ubuntu.com/582025/ - idle
http://paste.ubuntu.com/582007/ - some VMs
http://paste.ubuntu.com/582019/ - locky

(output is from grep : /proc/lock_stat | head -n 30)
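
For comparing the three pastes, a small Python sketch like the one below
could rank lock classes by contention count. It assumes the usual
/proc/lock_stat layout, where each per-class row is
"<class name>: <con-bounces> <contentions> ..."; the function name
top_contended is hypothetical.

```python
def top_contended(lock_stat_text, limit=10):
    """Return (class_name, contentions) pairs, most contended first."""
    entries = []
    for line in lock_stat_text.splitlines():
        # Split on the last colon so class names containing ':' survive.
        name, sep, stats = line.rpartition(":")
        fields = stats.split()
        # Keep only per-class rows; headers, separators and call-site
        # breakdown lines have no "name: numbers" shape and are skipped.
        if not sep or len(fields) < 2 or not fields[1].isdigit():
            continue
        entries.append((name.strip(), int(fields[1])))
    entries.sort(key=lambda entry: entry[1], reverse=True)
    return entries[:limit]


# Usage on the host (needs a kernel built with CONFIG_LOCK_STAT):
#   with open("/proc/lock_stat") as f:
#       for name, contentions in top_contended(f.read()):
#           print(contentions, name)
```

Sorting by contentions rather than relying on the order in the file makes
it easier to see which classes move between the idle and locky captures.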

The main things that I see are fidvid_mutex and idr_lock#3. The
fidvid_mutex seems like it might be related to the high % spent in
delay_tsc from perf top...

Anyway, if someone could give me some suggestions for things to try,
or more information that might help... anything really. :)

Thanks a lot,

ben
