From: "David S. Ahern" <daahern@cisco.com>
To: Avi Kivity <avi@qumranet.com>
Cc: kvm@vger.kernel.org
Subject: Re: [kvm-devel] performance with guests running 2.4 kernels (specifically RHEL3)
Date: Mon, 02 Jun 2008 10:42:22 -0600 [thread overview]
Message-ID: <484422EE.5090501@cisco.com> (raw)
In-Reply-To: <4841094A.8090507@qumranet.com>
Avi Kivity wrote:
> David S. Ahern wrote:
>>> I haven't been able to reproduce this:
>>>
>>>
>>>> [root@localhost root]# ps -elf | grep -E 'memuser|kscand'
>>>> 1 S root 7 1 1 75 0 - 0 schedu 10:07 ?
>>>> 00:00:26 [kscand]
>>>> 0 S root 1464 1 1 75 0 - 196986 schedu 10:20 pts/0
>>>> 00:00:21 ./memuser 768M 120 5 300
>>>> 0 S root 1465 1 0 75 0 - 98683 schedu 10:20 pts/0
>>>> 00:00:10 ./memuser 384M 300 10 600
>>>> 0 S root 2148 1293 0 75 0 - 922 pipe_w 10:48 pts/0
>>>> 00:00:00 grep -E memuser|kscand
>>>>
>>> The workload has been running for about half an hour, and kswapd cpu
>>> usage doesn't seem significant. This is a 2GB guest running with my
>>> patch ported to kvm.git HEAD. The guest has 2GB of memory.
>>>
>>>
>>
>> I'm running on the per-page-pte-tracking branch, and I am still seeing
>> it.
>> I doubt you want to sit and watch the screen for an hour, so install
>> sysstat if it isn't already installed, change the sample rate to 1
>> minute (/etc/cron.d/sysstat), let the server run for a few hours, and
>> then run 'sar -u'. You'll see something like this:
>>
>> 10:12:11 AM LINUX RESTART
>>
>> 10:13:03 AM CPU %user %nice %system %iowait %idle
>> 10:14:01 AM all 0.08 0.00 2.08 0.35 97.49
>> 10:15:03 AM all 0.05 0.00 0.79 0.04 99.12
>> 10:15:59 AM all 0.15 0.00 1.52 0.06 98.27
>> 10:17:01 AM all 0.04 0.00 0.69 0.04 99.23
>> 10:17:59 AM all 0.01 0.00 0.39 0.00 99.60
>> 10:18:59 AM all 0.00 0.00 0.12 0.02 99.87
>> 10:20:02 AM all 0.18 0.00 14.62 0.09 85.10
>> 10:21:01 AM all 0.71 0.00 26.35 0.01 72.94
>> 10:22:02 AM all 0.67 0.00 10.61 0.00 88.72
>> 10:22:59 AM all 0.14 0.00 1.80 0.00 98.06
>> 10:24:03 AM all 0.13 0.00 0.50 0.00 99.37
>> 10:24:59 AM all 0.09 0.00 11.46 0.00 88.45
>> 10:26:03 AM all 0.16 0.00 0.69 0.03 99.12
>> 10:26:59 AM all 0.14 0.00 10.01 0.02 89.83
>> 10:28:03 AM all 0.57 0.00 2.20 0.03 97.20
>> Average: all 0.21 0.00 5.55 0.05 94.20
>>
>>
>> Every one of those jumps in %system time correlates directly with
>> kscand activity. Without the memuser programs running, the guest
>> %system time is <1%. The point of this silly memuser program is simply
>> to use high memory -- let it age, then make it active again, sit idle,
>> repeat. If you run kvm_stat with -l in the host, you'll see the jump
>> in pte writes/updates. An intern here added a timestamp to the
>> kvm_stat output for me, which helps to directly correlate guest/host
>> data.
>>
>>
>> I also ran my real guest on the branch. Performance at boot through
>> the first 15 minutes was much better, but I'm still seeing recurring
>> hits every 5 minutes when kscand kicks in. Here's the data from the
>> guest for the first one which happened after 15 minutes of uptime:
>>
>> active_anon_scan: HighMem, age 11, count[age] 24886 -> 5796, direct
>> 24845, dj 59
>>
>> active_anon_scan: HighMem, age 7, count[age] 47772 -> 21289, direct
>> 40868, dj 103
>>
>> active_anon_scan: HighMem, age 3, count[age] 91007 -> 329, direct
>> 45805, dj 1212
>>
>>
>
> We touched 90,000 ptes in 12 seconds. That's 8,000 ptes per second.
> Yet we see 180,000 page faults per second in the trace.
>
> Oh! Only 45K pages were direct, so the other 45K were shared, with
> perhaps many ptes. We should count ptes, not pages.
>
> Can you modify page_referenced() to count the numbers of ptes mapped (1
> for direct pages, nr_chains for indirect pages) and print the total
> deltas in active_anon_scan?
>
Here you go. I've shortened the line lengths to get them to squeeze into
80 columns:
anon_scan, all HighMem zone, 187,910 active pages at loop start:
count[12] 21462 -> 230, direct 20469, chains 3479, dj 58
count[11] 1338 -> 1162, direct 227, chains 26144, dj 59
count[8] 29397 -> 5410, direct 26115, chains 27617, dj 117
count[4] 35804 -> 25556, direct 31508, chains 82929, dj 256
count[3] 2738 -> 2207, direct 2680, chains 58, dj 7
count[0] 92580 -> 89509, direct 75024, chains 262834, dj 726
(age number is the index in [])
cache_scan, all HighMem zone, 48,298 active pages at loop start:
count[12] 3642 -> 2982, direct 499, chains 20022, dj 44
count[8] 11254 -> 11187, direct 7189, chains 9854, dj 37
count[4] 15709 -> 15702, direct 5071, chains 9388, dj 31
(with anon_cache_count bug fixed)
If you sum the direct pages and chains counts for each row and convert
dj into dt (dividing by HZ = 100), you get:
( 20469 + 3479 ) / 0.58 = 41289
( 227 + 26144 ) / 0.59 = 44696
( 26115 + 27617 ) / 1.17 = 45924
( 31508 + 82929 ) / 2.56 = 44701
( 2680 + 58 ) / 0.07 = 39114
( 75024 + 262834 ) / 7.26 = 46536
( 499 + 20022 ) / 0.44 = 46638
( 7189 + 9854 ) / 0.37 = 46062
( 5071 + 9388 ) / 0.31 = 46641
At 4 pte writes per direct page or chain entry, that comes to
~187,000/sec, which is close to the total collected by kvm_stat (data
width shrunk to fit in e-mail; I hope it is still readable):
|---------- mmu_ ----------|----- pf_ -----|
cache flood pde_z pte_u pte_w shado fixed guest
267 271 95 21455 21842 285 22840 165
66 88 0 12102 12224 88 12458 0
2042 2133 0 178146 180515 2133 188089 387
1053 1212 0 187067 188485 1212 193011 8
4771 4811 88 185129 190998 4825 207490 448
910 824 7 183066 184050 824 195836 12
707 785 0 176381 177300 785 180350 6
1167 1144 0 189618 191014 1144 195902 10
4238 4193 87 188381 193590 4206 207030 465
1448 1400 7 187786 189509 1400 198688 21
982 971 0 187880 189076 971 198405 2
1165 1208 0 190007 191503 1208 195746 13
1106 1146 0 189144 190550 1146 195143 0
4767 4788 96 185802 191704 4802 206362 477
1388 1431 0 187387 188991 1431 195115 3
584 551 0 77176 77802 551 84829 10
12 7 0 3601 3609 7 13497 4
243 153 91 31085 31333 167 35059 879
21 18 6 3130 3155 18 3827 2
21 4 1 4665 4670 4 6825 9
>> The kvm_stat data for this time period is attached due to line lengths.
>>
>>
>> Also, I forgot to mention this before, but there is a bug in the
>> kscand code in the RHEL3U8 kernel. When it scans the cache list it
>> uses the count from the anonymous list:
>>
>> if (need_active_cache_scan(zone)) {
>> for (age = MAX_AGE-1; age >= 0; age--) {
>> scan_active_list(zone, age,
>> &zone->active_cache_list[age],
>> zone->active_anon_count[age]);
>> ^^^^^^^^^^^^^^^^^
>> if (current->need_resched)
>> schedule();
>> }
>> }
>>
>> When the anonymous count is higher, the cache list is scanned
>> repeatedly. An example of that was captured here:
>>
>> active_cache_scan: HighMem, age 7, count[age] 222 -> 179, count anon
>> 111967, direct 626, dj 3
>>
>> count anon is active_anon_count[age], which at this moment was 111,967.
>> There were only 222 entries in the cache list, but the count value
>> passed to scan_active_list was 111,967. When the cache list has a lot
>> of direct pages, that causes a larger hit on kvm than necessary. That
>> said, I have to live with the bug in the guest.
>>
>
> For debugging, can you fix it? It certainly has a large impact.
>
Yes, I have run a few tests with it fixed to get a ballpark on the
impact. The fix is included in the numbers above.
> Perhaps it is fixed in an update kernel. There's a 2.4.21-50.EL in the
> centos 3.8 update repos.
>