From: Avi Kivity
Subject: Re: KVM lock contention on 48 core AMD machine
Date: Mon, 21 Mar 2011 19:02:33 +0200
Message-ID: <4D8784A9.8040303@redhat.com>
References: <20110318123031.GB6066@8bytes.org> <4D871F6C.40207@redhat.com> <4D875842.9050308@redhat.com> <4D8773AA.8030408@redhat.com> <1300726498.2884.493.camel@edumazet-laptop>
In-Reply-To: <1300726498.2884.493.camel@edumazet-laptop>
To: Eric Dumazet
Cc: ben@iagu.net, KVM list

On 03/21/2011 06:54 PM, Eric Dumazet wrote:
> On Monday 21 March 2011 at 22:01 +0545, Ben Nagy wrote:
> > >> On Mon, Mar 21, 2011 at 7:38 PM, Avi Kivity wrote:
> > >> > In the future, please post the binary perf.dat.
> > >>
> > >> Hi Avi,
> > >>
> > >> How do I do that?
> > >
> > > 'make nconfig' and go to the kernel hacking section.
> >
> > Imprecise question sorry, I meant how do I get the perf.dat not how do
> > I disable the debugging.
> >
> > On the non-debug kernel: Linux eax 2.6.38-7-server #36-Ubuntu SMP Fri
> > Mar 18 23:36:13 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
> >
> >  150512.00 67.8% __ticket_spin_lock          [kernel.kallsyms]
> >   11126.00  5.0% memcmp_pages                [kernel.kallsyms]
> >    9563.00  4.3% native_safe_halt            [kernel.kallsyms]
> >    8965.00  4.0% svm_vcpu_run                /lib/modules/2.6.38-7-server/kernel/arch/x86/kvm/kvm-amd.ko
> >    6489.00  2.9% tg_load_down                [kernel.kallsyms]
> >    4676.00  2.1% kvm_get_cs_db_l_bits        /lib/modules/2.6.38-7-server/kernel/arch/x86/kvm/kvm.ko
> >    1931.00  0.9% load_balance_fair           [kernel.kallsyms]
> >    1917.00  0.9% ktime_get                   [kernel.kallsyms]
> >    1624.00  0.7% walk_tg_tree.clone.129      [kernel.kallsyms]
> >    1542.00  0.7% find_busiest_group          [kernel.kallsyms]
> >    1326.00  0.6% find_next_bit               [kernel.kallsyms]
> >     673.00  0.3% lock_hrtimer_base.clone.25  [kernel.kallsyms]
> >     624.00  0.3% copy_user_generic_string    [kernel.kallsyms]
> >
> > top now says:
> >
> > top - 00:11:35 up 22 min,  4 users,  load average: 0.11, 6.15, 7.78
> > Tasks: 491 total,   3 running, 488 sleeping,   0 stopped,   0 zombie
> > Cpu(s):  0.9%us, 15.4%sy,  0.0%ni, 83.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> > Mem:  99068660k total, 70831760k used, 28236900k free,    10036k buffers
> > Swap:  2438140k total,  2173652k used,   264488k free,  3396144k cached
> >
> > With average 'resting cpu' per idle guest 8%, 96 guests running.
> >
> > Is this as good as I am going to get? It seems like I can't really
> > debug that lock contention without blowing stuff up because of the
> > load of the lock debugging...
> >
> > Don't know if I mentioned this before, but those guests are each
> > pinned to a cpu (cpu guestnum%48) with taskset.
> >
> > Cheers,
>
> It seems you hit idr_lock contention (used in kernel/posix-timers.c)
>

Any ideas on how to fix it?  We could pre-allocate IDs and batch them in
per-cpu caches, but it seems like a lot of work.
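
To make the batching idea concrete, here is a rough userspace sketch, not
kernel code and not a proposed patch.  It assumes a pthread mutex standing in
for idr_lock and a per-thread cache standing in for a per-cpu cache; the names
id_pool, id_cache, id_alloc and ID_BATCH are all made up for illustration.  It
only covers handing out fresh IDs.  Freeing and reusing IDs, and the
ID-to-pointer lookup that the real idr provides, are left out, and those are
probably where most of the work would be.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define ID_BATCH 64 /* hypothetical batch size, not taken from the kernel */

/* Shared state: the only part that needs the contended lock. */
struct id_pool {
    pthread_mutex_t lock;  /* stands in for idr_lock (assumption) */
    uint64_t next;         /* first ID not yet reserved by any cache */
    unsigned long refills; /* number of times the lock was taken */
};

/* Per-thread cache, standing in for a per-cpu cache. */
struct id_cache {
    uint64_t next; /* next ID to hand out from the reserved range */
    uint64_t end;  /* one past the last reserved ID */
};

static void id_pool_init(struct id_pool *p)
{
    pthread_mutex_init(&p->lock, NULL);
    p->next = 0;
    p->refills = 0;
}

/*
 * Hand out one ID.  The shared lock is taken only when the local
 * cache is empty, i.e. once per ID_BATCH allocations per cache.
 */
static uint64_t id_alloc(struct id_pool *p, struct id_cache *c)
{
    assert(p && c);

    if (c->next == c->end) {
        /* Slow path: reserve a whole batch under the lock. */
        pthread_mutex_lock(&p->lock);
        c->next = p->next;
        p->next += ID_BATCH;
        c->end = p->next;
        p->refills++;
        pthread_mutex_unlock(&p->lock);
    }

    /* Fast path: purely local, no shared cache line touched. */
    return c->next++;
}
```

Each thread would keep its own zero-initialised struct id_cache and call
id_alloc() with it, so the lock is taken roughly once per ID_BATCH
allocations per cache instead of once per allocation.  The cost is that IDs
are no longer dense or globally ordered, and a cache that goes idle strands
the unused remainder of its batch.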

-- 
error compiling committee.c: too many arguments to function