From: Konstantin Khlebnikov <khlebnikov@openvz.org>
To: Zlatko Calusic <zcalusic@bitsync.net>
Cc: linux-mm@kvack.org
Subject: Re: [PATCH RFC] mm: lru milestones, timestamps and ages
Date: Sat, 04 May 2013 17:01:14 +0400
Message-ID: <5185069A.1080306@openvz.org>
In-Reply-To: <5184F6C9.4060506@openvz.org>

Konstantin Khlebnikov wrote:
> Zlatko Calusic wrote:
>> On 30.04.2013 13:02, Konstantin Khlebnikov wrote:
>>> This patch adds an engine for estimating the rotation time of pages in lru lists.
>>>
>>> It adds a bunch of 'milestones' to each struct lruvec and inserts them into
>>> lru lists periodically. A milestone flows through the lru together with pages and
>>> carries a timestamp to the end of the lru. Because milestones are embedded in the
>>> lruvec, they can be easily distinguished from pages by comparing pointers.
>>> Only a few functions need to care about that.
>>>
>>> This machinery provides a discrete-time estimate of the age of pages at the end
>>> of each lru and of the average age of each kind of evictable lru in each zone.
>>
>> Great stuff!
>
> Thanks!
>
>>
>> Believe it or not, I had an idea of writing something similar to this, but of course having an idea and actually
>> implementing it are two very different things. Thank you for your work!
>>
>> I will use this to prove (or not) that file pages in the normal zone on a 4GB RAM machine are reused waaaay too soon.
>> Actually, I already have the patch applied and running on the desktop, but it should be much more useful on server
>> workloads. Desktops have erratic load and can go for a long time with very little I/O activity. But, here are the
>> current numbers anyway:
>>
>> Node 0, zone DMA32
>> pages free 5371
>> nr_inactive_anon 4257
>> nr_active_anon 139719
>> nr_inactive_file 617537
>> nr_active_file 51671
>> inactive_ratio: 5
>> avg_age_inactive_anon: 2514752
>> avg_age_active_anon: 2514752
>> avg_age_inactive_file: 876416
>> avg_age_active_file: 2514752
>> Node 0, zone Normal
>> pages free 424
>> nr_inactive_anon 253
>> nr_active_anon 54480
>> nr_inactive_file 63274
>> nr_active_file 44116
>> inactive_ratio: 1
>> avg_age_inactive_anon: 2531712
>> avg_age_active_anon: 2531712
>> avg_age_inactive_file: 901120
>> avg_age_active_file: 2531712
>>
>>> In our kernel we use a similar engine as a source of statistics for the scheduler
>>> in the memory reclaimer. This is an O(1) scheduler which shifts vmscan priorities for
>>> lru vectors depending on their sizes, limits and ages. It tries to balance memory
>>> pressure among containers. I'll try to rework it for the mainline kernel soon.
>>>
>>> It seems these ages can also be used for optimal distribution of memory pressure
>>> between file and anon pages, and probably for balancing pressure among zones.
>>
>> This all sounds very promising. Especially because I currently observe quite some imbalance among zones.
>
> As I see it, the most likely reason for such imbalances is the 'break' condition inside
> shrink_lruvec(). So you can try disabling it and see what happens.
>
> But these numbers from your desktop don't actually prove this problem. It seems the difference
> between zones is within the precision of this method. I don't know how to describe this precisely.
> Probably the irregularity between milestones should also be taken into account to describe the
> current situation and the quality of the measurement.
>
> Here are the current numbers from my 8GB node. The main workload is a torrent client.
>
> Node 0, zone DMA32
> nr_inactive_anon 1
> nr_active_anon 1494
> nr_inactive_file 404028
> nr_active_file 365525
> nr_dirtied 855068
> nr_written 854991
> avg_age_inactive_anon: 64942528
> avg_age_active_anon: 64942528
> avg_age_inactive_file: 1281317
> avg_age_active_file: 15813376
> Node 0, zone Normal
> nr_inactive_anon 376
> nr_active_anon 13793
> nr_inactive_file 542605
> nr_active_file 542247
> nr_dirtied 2746747
> nr_written 2746266
> avg_age_inactive_anon: 65064192
> avg_age_active_anon: 65064192
> avg_age_inactive_file: 1260611
> avg_age_active_file: 8765240
>
> So, there is a noticeable imbalance here in the ages of the active file lru and in nr_dirtied/nr_written.
> I have no idea why, but the torrent client uses the fadvise() syscall, which messes up the whole picture.

Hey! I can reproduce this:

Node 0, zone    DMA32
     nr_inactive_anon 1
     nr_active_anon 2368
     nr_inactive_file 373642
     nr_active_file 375462
     nr_dirtied   2887369
     nr_written   2887291
   inactive_ratio:    5
   avg_age_inactive_anon: 64942528
   avg_age_active_anon:   64942528
   avg_age_inactive_file: 389824
   avg_age_active_file:   1330368
Node 0, zone   Normal
     nr_inactive_anon 376
     nr_active_anon 17768
     nr_inactive_file 534695
     nr_active_file 533685
     nr_dirtied   12071397
     nr_written   11940007
   inactive_ratio:    6
   avg_age_inactive_anon: 65064192
   avg_age_active_anon:   65064192
   avg_age_inactive_file: 28074
   avg_age_active_file:   1304800

I'm just copying huge files from one disk to another with rsync.
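
The difference between the zones in the dump above can be put into numbers with a small script. This is only a sketch: the avg_age_* fields exist only with the RFC patch applied, the sample text is a subset of the dump above, and the helper names parse_zones() and age_ratio() are made up for illustration.

```python
import re

# Subset of the dump above; avg_age_* fields come from the RFC patch.
SAMPLE = """\
Node 0, zone    DMA32
     nr_inactive_file 373642
     nr_active_file 375462
   avg_age_inactive_file: 389824
   avg_age_active_file:   1330368
Node 0, zone   Normal
     nr_inactive_file 534695
     nr_active_file 533685
   avg_age_inactive_file: 28074
   avg_age_active_file:   1304800
"""

def parse_zones(text):
    """Return {(node, zone): {field: value}} for a zoneinfo-style dump."""
    zones = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"\s*Node\s+(\d+),\s+zone\s+(\S+)", line)
        if header:
            key = (int(header.group(1)), header.group(2))
            current = zones.setdefault(key, {})
            continue
        # Accept both "name value" and "name: value" lines; skip the rest.
        field = re.match(r"\s*(\w+):?\s+(\d+)\s*$", line)
        if field and current is not None:
            current[field.group(1)] = int(field.group(2))
    return zones

def age_ratio(zones, field, zone_a, zone_b):
    """Ratio of one field between two zones (zone_a / zone_b)."""
    return zones[zone_a][field] / zones[zone_b][field]

zones = parse_zones(SAMPLE)
for field in ("avg_age_inactive_file", "avg_age_active_file"):
    ratio = age_ratio(zones, field, (0, "DMA32"), (0, "Normal"))
    print(f"{field}: DMA32/Normal = {ratio:.2f}")
```

With these numbers the inactive file lru in DMA32 is roughly 14 times older than in Normal, while the active file ages differ by only about 2%.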

In /proc/vmstat, pgsteal_kswapd_normal and pgscan_kswapd_normal are rising rapidly, while
the other pgscan_* and pgsteal_* counters are standing still. So the bug is somewhere in kswapd.
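
To see which of these counters are moving, two snapshots of /proc/vmstat can be diffed. The following is a minimal sketch, assuming a Linux system where /proc/vmstat is a list of "name value" lines. The function names read_vmstat(), moving_counters() and watch() are hypothetical, and the exact counter names depend on the kernel version, so the filter matches on prefixes only.

```python
import time

def read_vmstat(path="/proc/vmstat"):
    """Read 'name value' lines into a dict of integer counters."""
    counters = {}
    with open(path) as f:
        for line in f:
            name, _, value = line.partition(" ")
            if value.strip().isdigit():
                counters[name] = int(value)
    return counters

def moving_counters(before, after, prefixes=("pgscan_", "pgsteal_")):
    """Return {name: delta} for matching counters that changed."""
    return {
        name: after[name] - before[name]
        for name in after
        if name.startswith(prefixes)
        and name in before
        and after[name] != before[name]
    }

def watch(interval=5.0):
    """Take two snapshots 'interval' seconds apart and print what moved."""
    first = read_vmstat()
    time.sleep(interval)
    second = read_vmstat()
    for name, delta in sorted(moving_counters(first, second).items()):
        print(f"{name} +{delta}")
```

Calling watch() while the rsync copy is running should list only the counters that changed during the interval. Counters that stand still are left out.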


Thread overview: 9+ messages
2013-04-30 11:02 [PATCH RFC] mm: lru milestones, timestamps and ages Konstantin Khlebnikov
2013-05-03 14:07 ` Zlatko Calusic
2013-05-04 11:53   ` Konstantin Khlebnikov
2013-05-04 13:01     ` Konstantin Khlebnikov [this message]
2013-05-04 21:36       ` Zlatko Calusic
2013-05-06 19:08       ` Johannes Weiner
2013-05-04 13:32     ` Zlatko Calusic
2013-05-10 10:28 ` Mel Gorman
2013-05-10 14:12   ` Konstantin Khlebnikov
