From: "David S. Ahern" <daahern@cisco.com>
To: Avi Kivity <avi@qumranet.com>
Cc: kvm@vger.kernel.org
Subject: Re: [kvm-devel] performance with guests running 2.4 kernels (specifically RHEL3)
Date: Mon, 02 Jun 2008 10:42:22 -0600 [thread overview]
Message-ID: <484422EE.5090501@cisco.com> (raw)
In-Reply-To: <4841094A.8090507@qumranet.com>
Avi Kivity wrote:
> David S. Ahern wrote:
>>> I haven't been able to reproduce this:
>>>
>>>
>>>> [root@localhost root]# ps -elf | grep -E 'memuser|kscand'
>>>> 1 S root 7 1 1 75 0 - 0 schedu 10:07 ?
>>>> 00:00:26 [kscand]
>>>> 0 S root 1464 1 1 75 0 - 196986 schedu 10:20 pts/0
>>>> 00:00:21 ./memuser 768M 120 5 300
>>>> 0 S root 1465 1 0 75 0 - 98683 schedu 10:20 pts/0
>>>> 00:00:10 ./memuser 384M 300 10 600
>>>> 0 S root 2148 1293 0 75 0 - 922 pipe_w 10:48 pts/0
>>>> 00:00:00 grep -E memuser|kscand
>>>>
>>> The workload has been running for about half an hour, and kswapd cpu
>>> usage doesn't seem significant. This is a guest with 2GB of memory,
>>> running with my patch ported to kvm.git HEAD.
>>>
>>>
>>
>> I'm running on the per-page-pte-tracking branch, and I am still seeing
>> it.
>> I doubt you want to sit and watch the screen for an hour, so install
>> sysstat if it isn't already installed, change the sample rate to 1
>> minute (/etc/cron.d/sysstat), let the server run for a few hours, and
>> then run 'sar -u'. You'll see something like this:
>>
>> 10:12:11 AM LINUX RESTART
>>
>> 10:13:03 AM CPU %user %nice %system %iowait %idle
>> 10:14:01 AM all 0.08 0.00 2.08 0.35 97.49
>> 10:15:03 AM all 0.05 0.00 0.79 0.04 99.12
>> 10:15:59 AM all 0.15 0.00 1.52 0.06 98.27
>> 10:17:01 AM all 0.04 0.00 0.69 0.04 99.23
>> 10:17:59 AM all 0.01 0.00 0.39 0.00 99.60
>> 10:18:59 AM all 0.00 0.00 0.12 0.02 99.87
>> 10:20:02 AM all 0.18 0.00 14.62 0.09 85.10
>> 10:21:01 AM all 0.71 0.00 26.35 0.01 72.94
>> 10:22:02 AM all 0.67 0.00 10.61 0.00 88.72
>> 10:22:59 AM all 0.14 0.00 1.80 0.00 98.06
>> 10:24:03 AM all 0.13 0.00 0.50 0.00 99.37
>> 10:24:59 AM all 0.09 0.00 11.46 0.00 88.45
>> 10:26:03 AM all 0.16 0.00 0.69 0.03 99.12
>> 10:26:59 AM all 0.14 0.00 10.01 0.02 89.83
>> 10:28:03 AM all 0.57 0.00 2.20 0.03 97.20
>> Average: all 0.21 0.00 5.55 0.05 94.20
>>
>>
>> Every one of those jumps in %system time directly correlates with
>> kscand activity. Without the memuser programs running, the guest
>> %system time is <1%. The point of this silly memuser program is just
>> to use high memory -- let it age, then make it active again, sit
>> idle, repeat. If you run kvm_stat with -l in the host you'll see the
>> jump in pte writes/updates. An intern here added a timestamp to the
>> kvm_stat output for me, which helps directly correlate guest/host
>> data.
>>
>>
>> I also ran my real guest on the branch. Performance at boot through
>> the first 15 minutes was much better, but I'm still seeing recurring
>> hits every 5 minutes when kscand kicks in. Here's the data from the
>> guest for the first one which happened after 15 minutes of uptime:
>>
>> active_anon_scan: HighMem, age 11, count[age] 24886 -> 5796, direct
>> 24845, dj 59
>>
>> active_anon_scan: HighMem, age 7, count[age] 47772 -> 21289, direct
>> 40868, dj 103
>>
>> active_anon_scan: HighMem, age 3, count[age] 91007 -> 329, direct
>> 45805, dj 1212
>>
>>
>
> We touched 90,000 ptes in 12 seconds. That's 8,000 ptes per second.
> Yet we see 180,000 page faults per second in the trace.
>
> Oh! Only 45K pages were direct, so the other 45K were shared, with
> perhaps many ptes. We should count ptes, not pages.
>
> Can you modify page_referenced() to count the numbers of ptes mapped (1
> for direct pages, nr_chains for indirect pages) and print the total
> deltas in active_anon_scan?
>
Here you go. I've shortened the line lengths to get them to squeeze into
80 columns:
anon_scan, all HighMem zone, 187,910 active pages at loop start:
count[12] 21462 -> 230, direct 20469, chains 3479, dj 58
count[11] 1338 -> 1162, direct 227, chains 26144, dj 59
count[8] 29397 -> 5410, direct 26115, chains 27617, dj 117
count[4] 35804 -> 25556, direct 31508, chains 82929, dj 256
count[3] 2738 -> 2207, direct 2680, chains 58, dj 7
count[0] 92580 -> 89509, direct 75024, chains 262834, dj 726
(age number is the index in [])
cache_scan, all HighMem zone, 48,298 active pages at loop start:
count[12] 3642 -> 2982, direct 499, chains 20022, dj 44
count[8] 11254 -> 11187, direct 7189, chains 9854, dj 37
count[4] 15709 -> 15702, direct 5071, chains 9388, dj 31
(with anon_cache_count bug fixed)
If you sum the direct pages and the chains count for each row and
convert dj into dt (dividing by HZ = 100), you get:
( 20469 + 3479 ) / 0.58 = 41289
( 227 + 26144 ) / 0.59 = 44696
( 26115 + 27617 ) / 1.17 = 45924
( 31508 + 82929 ) / 2.56 = 44701
( 2680 + 58 ) / 0.07 = 39114
( 75024 + 262834 ) / 7.26 = 46536
( 499 + 20022 ) / 0.44 = 46638
( 7189 + 9854 ) / 0.37 = 46062
( 5071 + 9388 ) / 0.31 = 46641
At 4 pte writes per direct page or chain entry, that comes to
~187,000/sec, which is close to the total collected by kvm_stat (data
width shrunk to fit in e-mail; hope this is still readable):
|---------- mmu_ ----------|----- pf_ -----|
cache flood pde_z pte_u pte_w shado fixed guest
267 271 95 21455 21842 285 22840 165
66 88 0 12102 12224 88 12458 0
2042 2133 0 178146 180515 2133 188089 387
1053 1212 0 187067 188485 1212 193011 8
4771 4811 88 185129 190998 4825 207490 448
910 824 7 183066 184050 824 195836 12
707 785 0 176381 177300 785 180350 6
1167 1144 0 189618 191014 1144 195902 10
4238 4193 87 188381 193590 4206 207030 465
1448 1400 7 187786 189509 1400 198688 21
982 971 0 187880 189076 971 198405 2
1165 1208 0 190007 191503 1208 195746 13
1106 1146 0 189144 190550 1146 195143 0
4767 4788 96 185802 191704 4802 206362 477
1388 1431 0 187387 188991 1431 195115 3
584 551 0 77176 77802 551 84829 10
12 7 0 3601 3609 7 13497 4
243 153 91 31085 31333 167 35059 879
21 18 6 3130 3155 18 3827 2
21 4 1 4665 4670 4 6825 9
>> The kvm_stat data for this time period is attached due to line lengths.
>>
>>
>> Also, I forgot to mention this before, but there is a bug in the
>> kscand code in the RHEL3U8 kernel. When it scans the cache list it
>> uses the count from the anonymous list:
>>
>> if (need_active_cache_scan(zone)) {
>> for (age = MAX_AGE-1; age >= 0; age--) {
>> scan_active_list(zone, age,
>> &zone->active_cache_list[age],
>> zone->active_anon_count[age]);
>> ^^^^^^^^^^^^^^^^^
>> if (current->need_resched)
>> schedule();
>> }
>> }
>>
>> When the anonymous count is higher it is scanning the cache list
>> repeatedly. An example of that was captured here:
>>
>> active_cache_scan: HighMem, age 7, count[age] 222 -> 179, count anon
>> 111967, direct 626, dj 3
>>
>> count anon is active_anon_count[age] which at this moment was 111,967.
>> There were only 222 entries in the cache list, but the count value
>> passed to scan_active_list was 111,967. When the cache list has a lot
>> of direct pages, that causes a larger hit on kvm than needed. That
>> said, I have to live with the bug in the guest.
>>
>
> For debugging, can you fix it? It certainly has a large impact.
>
Yes, I have run a few tests with it fixed to get a ballpark on the
impact. The fix is included in the numbers above.
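For reference, the fix is just the one-line change to pass the cache-side
counter instead of the anon one (I'm assuming here the field in the RHEL3
source is named active_cache_count to mirror active_cache_list; adjust to
the actual name):

```
 	if (need_active_cache_scan(zone)) {
 		for (age = MAX_AGE-1; age >= 0; age--) {
 			scan_active_list(zone, age,
 				&zone->active_cache_list[age],
-				zone->active_anon_count[age]);
+				zone->active_cache_count[age]);
 			if (current->need_resched)
 				schedule();
 		}
 	}
```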
> Perhaps it is fixed in an update kernel. There's a 2.4.21-50.EL in the
> CentOS 3.8 update repos.
>