From mboxrd@z Thu Jan 1 00:00:00 1970 From: Michael Tokarev Subject: Re: high load with usb device Date: Tue, 14 Sep 2010 14:00:01 +0400 Message-ID: <4C8F47A1.4030600@msgid.tls.msk.ru> References: <4C80E839.50604@msgid.tls.msk.ru> <4C8A1AE3.8030705@redhat.com> <4C8A2187.1020202@msgid.tls.msk.ru> <4C8A59FE.9090400@msgid.tls.msk.ru> <4C8C9CAA.9030203@redhat.com> <4C8F1B54.60604@msgid.tls.msk.ru> <4C8F3B00.3000703@redhat.com> <4C8F3BA7.7060302@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: KVM list , Jes Sorensen , linux-perf-users@vger.kernel.org To: Avi Kivity Return-path: Received: from isrv.corpit.ru ([86.62.121.231]:53906 "EHLO isrv.corpit.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752321Ab0INKAF (ORCPT ); Tue, 14 Sep 2010 06:00:05 -0400 In-Reply-To: <4C8F3BA7.7060302@redhat.com> Sender: kvm-owner@vger.kernel.org List-ID: 14.09.2010 13:08, Avi Kivity wrote: >> On 09/14/2010 08:51 AM, Michael Tokarev wrote: >>> 12.09.2010 13:26, Avi Kivity =D0=BF=D0=B8=D1=88=D0=B5=D1=82: >>> Guest: 0 load >>> Host: >>> Cpu(s): 5.3%us, 6.7%sy, 0.0%ni, 88.0%id, 0.0%wa, 0.0%hi,=20 >>> 0.0%si, 0.0%st >>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMA= ND >>> 25685 mjt 20 0 1074m 109m 4652 S 19 1.9 0:29.03 >>> qemu-system-x86 >>> >>> So it appears it's the qemu process that's doing something, >>> after all. Is there a way to compile it with profiling >>> information? I tried, but it does not compile: it uses >>> -fomit-frame-pointer which is not compatible with -p[g], >>> but when removing -fo-f-p it complains about assembly. >> >> Really, 'perf top' should have shown what qemu was doing. I don't >> have a lot of experience with profiling userspace, copying >> linux-perf-users for tips. `perf top' shows nothing interesting. Here it is: -----------------------------------------------------------------------= ------------------ PerfTop: 1161 irqs/sec kernel:93.7% exact: 0.0% [1000Hz cycles= ], (all, 2 CPUs) -----------------------------------------------------------------------= ------------------ samples pcnt function DSO _______ _____ ______________________________ _____________= _____ 12345.00 74.4% native_safe_halt [kernel.kalls= yms] 224.00 1.4% hpet_next_event [kernel.kalls= yms] 178.00 1.1% tick_dev_program_event [kernel.kalls= yms] 125.00 0.8% tick_broadcast_oneshot_control [kernel.kalls= yms] 113.00 0.7% do_select [kernel.kalls= yms] 105.00 0.6% delay_tsc [kernel.kalls= yms] 94.00 0.6% schedule [kernel.kalls= yms] 79.00 0.5% ktime_get [kernel.kalls= yms] 78.00 0.5% main_loop_wait qemu-system-x= 86_64 75.00 0.5% __hrtimer_start_range_ns [kernel.kalls= yms] 69.00 0.4% tick_nohz_stop_sched_tick [kernel.kalls= yms] 61.00 0.4% rb_erase [kernel.kalls= yms] 58.00 0.3% _raw_spin_unlock_irqrestore [kernel.kalls= yms] 55.00 0.3% _raw_spin_lock_irqsave [kernel.kalls= yms] 54.00 0.3% enqueue_hrtimer [kernel.kalls= yms] 54.00 0.3% __switch_to [kernel.kalls= yms] 49.00 0.3% ia32_syscall [kernel.kalls= yms] I started perf after guest were running for some time, and waited for about a minute for perf to get proper stats. Here's vmstat on host, when an idle linux guest is running with usbdevi= ce: procs -----------memory---------- ---swap-- -----io---- -system-- ----c= pu---- r b swpd free buff cache si so bi bo in cs us sy= id wa 1 0 0 306744 1213984 1376000 0 0 1 0 5531 4114 4 = 5 91 0 1 0 0 306356 1213996 1375972 0 0 0 0 5886 4229 4 = 6 90 0 when linux guest is running w/o -usbdevice: 1 0 0 303604 1214116 1376388 0 0 0 9 986 623 1 = 2 97 0 1 0 0 303356 1214116 1376340 0 0 0 0 946 600 2 = 1 96 0 when no guest running: 1 0 0 415048 1214000 1374860 0 0 0 9 821 481 2 = 2 96 0 0 0 0 415056 1214000 1374868 0 0 0 0 717 454 1 = 1 98 0 > Also, please strace qemu. Try 'strace -p $pid1 -p $pid2 -p $pid3 -c'= to > get a summary. There are only two qemu threads running (as shown in /proc/$pid/task). = Here's a sample (after hitting Ctrl+C): # strace -p 23020 -p 23023 -c Process 23020 attached - interrupt to quit Process 23023 attached - interrupt to quit ^CProcess 23020 detached Process 23023 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 98.98 23.462158 15569 1507 ioctl 0.64 0.151725 37 4063 5 futex 0.11 0.025051 0 86932 select 0.09 0.021188 0 408836 clock_gettime 0.07 0.016543 0 271933 136484 read 0.04 0.008958 0 181118 gettimeofday 0.02 0.005756 0 101190 write 0.02 0.005506 0 89180 timer_settime 0.02 0.004754 0 96271 timer_gettime 0.01 0.002682 0 50595 rt_sigaction 0.00 0.000000 0 1 rt_sigpending 0.00 0.000000 0 1 1 rt_sigtimedwait ------ ----------- ----------- --------- --------- ---------------- 100.00 23.704321 1291627 136490 total (Note that during strace run the CPU is >100% busy -- top(1) reports 65% for strace and 45% for the qemu process). Thanks! /mjt