From: "David S. Ahern" <daahern@cisco.com>
To: Avi Kivity <avi@qumranet.com>
Cc: kvm@vger.kernel.org
Subject: Re: [kvm-devel] performance with guests running 2.4 kernels (specifically RHEL3)
Date: Thu, 29 May 2008 10:42:22 -0600
Message-ID: <483EDCEE.6070307@cisco.com>
In-Reply-To: <483D391F.7050007@qumranet.com>
Avi Kivity wrote:
> David S. Ahern wrote:
>> The short answer is that I am still seeing large system time hiccups in the
>> guests due to kscand in the guest scanning its active lists. I do see
>> better response for a KVM_MAX_PTE_HISTORY of 3 than with 4. (For
>> completeness I also tried a history of 2, but it performed worse than 3
>> which is no surprise given the meaning of it.)
>>
>>
>> I have been able to scratch out a simplistic program that stimulates
>> kscand activity similar to what is going on in my real guest (see
>> attached). The program requests a memory allocation, initializes it (to
>> get it backed) and then in a loop sweeps through the memory in chunks
>> similar to a program using parts of its memory here and there but
>> eventually accessing all of it.
>>
>> Start the RHEL3/CentOS 3 guest with *2GB* of RAM (or more). The key is
>> using a fair amount of highmem. Start a couple of instances of the
>> attached. For example, I've been using these 2:
>>
>> memuser 768M 120 5 300
>> memuser 384M 300 10 600
>>
>> Together these instances take up 1GB of RAM and once initialized
>> consume very little CPU. On kvm they make kscand and kswapd go nuts
>> every 5-15 minutes. For comparison, I do not see the same behavior for
>> an identical setup running on esx 3.5.
>>
>
> I haven't been able to reproduce this:
>
>> [root@localhost root]# ps -elf | grep -E 'memuser|kscand'
>> 1 S root 7 1 1 75 0 - 0 schedu 10:07 ?
>> 00:00:26 [kscand]
>> 0 S root 1464 1 1 75 0 - 196986 schedu 10:20 pts/0
>> 00:00:21 ./memuser 768M 120 5 300
>> 0 S root 1465 1 0 75 0 - 98683 schedu 10:20 pts/0
>> 00:00:10 ./memuser 384M 300 10 600
>> 0 S root 2148 1293 0 75 0 - 922 pipe_w 10:48 pts/0
>> 00:00:00 grep -E memuser|kscand
>
> The workload has been running for about half an hour, and kswapd cpu
> usage doesn't seem significant. This is running with my patch ported
> to kvm.git HEAD; the guest has 2GB of memory.
>
I'm running on the per-page-pte-tracking branch, and I am still seeing it.
I doubt you want to sit and watch the screen for an hour, so install sysstat if it isn't already installed, change the sample rate to 1 minute (/etc/cron.d/sysstat), let the server run for a few hours, and then run 'sar -u'. You'll see something like this:
10:12:11 AM LINUX RESTART
10:13:03 AM CPU %user %nice %system %iowait %idle
10:14:01 AM all 0.08 0.00 2.08 0.35 97.49
10:15:03 AM all 0.05 0.00 0.79 0.04 99.12
10:15:59 AM all 0.15 0.00 1.52 0.06 98.27
10:17:01 AM all 0.04 0.00 0.69 0.04 99.23
10:17:59 AM all 0.01 0.00 0.39 0.00 99.60
10:18:59 AM all 0.00 0.00 0.12 0.02 99.87
10:20:02 AM all 0.18 0.00 14.62 0.09 85.10
10:21:01 AM all 0.71 0.00 26.35 0.01 72.94
10:22:02 AM all 0.67 0.00 10.61 0.00 88.72
10:22:59 AM all 0.14 0.00 1.80 0.00 98.06
10:24:03 AM all 0.13 0.00 0.50 0.00 99.37
10:24:59 AM all 0.09 0.00 11.46 0.00 88.45
10:26:03 AM all 0.16 0.00 0.69 0.03 99.12
10:26:59 AM all 0.14 0.00 10.01 0.02 89.83
10:28:03 AM all 0.57 0.00 2.20 0.03 97.20
Average: all 0.21 0.00 5.55 0.05 94.20
Every one of those jumps in %system time correlates directly with kscand activity. Without the memuser programs running, the guest %system time is <1%. The point of this silly memuser program is just to use high memory -- let it age, then make it active again, sit idle, repeat. If you run kvm_stat with -l in the host you'll see the corresponding jump in pte writes/updates. An intern here added a timestamp to the kvm_stat output for me, which helps to correlate the guest and host data directly.
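For reference, memuser is essentially just the following pattern -- allocate a region, touch it so it is backed, then sweep it in chunks with idle periods in between. This is a minimal sketch of the idea only; the argument handling and numbers below are illustrative, not the real memuser interface:

    /*
     * Sketch of the memuser idea, not the attached program itself.
     * Usage (illustrative): ./memsweep <size_mb> <chunk_mb> <sleep_secs>
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
            size_t size_mb   = (argc > 1) ? strtoul(argv[1], NULL, 0) : 768;
            size_t chunk_mb  = (argc > 2) ? strtoul(argv[2], NULL, 0) : 64;
            unsigned sleep_s = (argc > 3) ? strtoul(argv[3], NULL, 0) : 60;
            size_t size  = size_mb  << 20;
            size_t chunk = chunk_mb << 20;
            char *mem = malloc(size);

            if (!mem) {
                    perror("malloc");
                    return 1;
            }

            /* Touch every page once so the allocation is actually backed. */
            memset(mem, 0xa5, size);

            for (;;) {
                    size_t off;

                    /* Sweep the region a chunk at a time, idling in between,
                     * so the pages age and are later re-activated. */
                    for (off = 0; off < size; off += chunk) {
                            size_t len = (off + chunk > size) ? size - off : chunk;
                            memset(mem + off, 0x5a, len);
                            sleep(sleep_s);
                    }
            }

            return 0;
    }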
I also ran my real guest on the branch. Performance from boot through the first 15 minutes was much better, but I'm still seeing recurring hits every 5 minutes when kscand kicks in. Here's the data from the guest for the first one, which happened after 15 minutes of uptime:
active_anon_scan: HighMem, age 11, count[age] 24886 -> 5796, direct 24845, dj 59
active_anon_scan: HighMem, age 7, count[age] 47772 -> 21289, direct 40868, dj 103
active_anon_scan: HighMem, age 3, count[age] 91007 -> 329, direct 45805, dj 1212
The kvm_stat data for this time period is attached due to line lengths.
Also, I forgot to mention this before, but there is a bug in the kscand code in the RHEL3U8 kernel. When it scans the cache list, it passes the count from the anonymous list:
if (need_active_cache_scan(zone)) {
        for (age = MAX_AGE-1; age >= 0; age--) {
                scan_active_list(zone, age,
                                 &zone->active_cache_list[age],
                                 zone->active_anon_count[age]);
                                       ^^^^^^^^^^^^^^^^^
                if (current->need_resched)
                        schedule();
        }
}
When the anonymous count is higher, the cache list gets scanned repeatedly. An example of that was captured here:
active_cache_scan: HighMem, age 7, count[age] 222 -> 179, count anon 111967, direct 626, dj 3
"count anon" is active_anon_count[age], which at this moment was 111,967. There were only 222 entries on the cache list, but the count value passed to scan_active_list was 111,967. When the cache list holds a lot of direct pages, that causes a larger hit on kvm than necessary. That said, I have to live with the bug in the guest.
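Presumably the fix is just to pass the cache list's own per-age count. Assuming the zone struct carries a parallel active_cache_count[] array (a guess by analogy with active_anon_count[]; I haven't verified the field name against the U8 source), the call would become something like:

        /* presumed fix: pass the cache list's own count; active_cache_count[]
         * is assumed by analogy with active_anon_count[] and should be
         * checked against the RHEL3 source */
        scan_active_list(zone, age,
                         &zone->active_cache_list[age],
                         zone->active_cache_count[age]);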
david
[-- Attachment #2: kvm_stat.kscand --]
[-- Type: text/plain, Size: 2650 bytes --]
kvm-69/kvm_stat -f 'mmu*|pf*' -l:
mmio_exit mmu_cache mmu_flood mmu_pde_z mmu_pte_u mmu_pte_w mmu_recyc mmu_shado pf_fixed pf_guest
182 18 18 0 5664 5682 0 18 5720 21
211 59 59 0 7040 7105 0 59 7348 99
81 0 48 0 45861 45909 0 48 45910 1
209 683 814 0 178527 179405 0 814 181410 9
67 111 320 0 175602 175922 0 320 177202 0
28 0 29 0 181365 181394 0 29 181394 0
7 0 22 0 181834 181856 0 22 181855 0
35 0 14 0 180129 180143 0 14 180144 0
7 0 10 0 179141 179151 0 10 179150 0
35 0 3 0 181359 181361 0 3 181362 0
7 0 4 0 181565 181570 0 4 181570 0
21 0 3 0 181435 181437 0 3 181437 0
21 0 4 0 181281 181286 0 4 181285 0
21 0 3 0 179444 179447 0 3 179448 0
91 0 61 0 179841 179902 0 61 179902 0
7 0 247 0 176628 176875 0 247 176874 0
313 478 133 1 100486 100604 0 133 126690 80
162 21 18 0 6361 6379 0 18 6584 5
294 40 23 21 9144 9188 0 25 9544 45
143 5 1 0 5026 5027 0 1 5502 1
The above corresponds to the following from the guest:
active_anon_scan: HighMem, age 11, count[age] 24886 -> 5796, direct 24845, dj 59
active_anon_scan: HighMem, age 7, count[age] 47772 -> 21289, direct 40868, dj 103
active_anon_scan: HighMem, age 3, count[age] 91007 -> 329, direct 45805, dj 1212