From: Zlatko Calusic <zcalusic@bitsync.net>
To: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: linux-mm@kvack.org
Subject: Re: [PATCH RFC] mm: lru milestones, timestamps and ages
Date: Sat, 04 May 2013 15:32:58 +0200 [thread overview]
Message-ID: <51850E0A.5010803@bitsync.net> (raw)
In-Reply-To: <5184F6C9.4060506@openvz.org>
On 04.05.2013 13:53, Konstantin Khlebnikov wrote:
> Zlatko Calusic wrote:
>> On 30.04.2013 13:02, Konstantin Khlebnikov wrote:
>>> This patch adds engine for estimating rotation time for pages in lru
>>> lists.
>>>
>>> This adds bunch of 'milestones' into each struct lruvec and inserts
>>> them into
>>> lru lists periodically. Milestone flows in lru together with pages
>>> and brings
>>> timestamp to the end of lru. Because milestones are embedded into
>>> lruvec they
>>> can be easily distinguished from pages by comparing pointers.
>>> Only few functions should care about that.
>>>
>>> This machinery provides discrete-time estimation for age of pages
>>> from the end
>>> of each lru and average age of each kind of evictable lrus in each zone.
>>
>> Great stuff!
>
> Thanks!
>
>>
>> Believe it or not, I had an idea of writing something similar to this,
>> but of course having an idea and actually implementing it are two very
>> different things. Thank you for your work!
>>
>> I will use this to prove (or not) that file pages in the normal zone
>> on a 4GB RAM machine are reused waaaay too soon. Actually, I already
>> have the patch applied and running on the desktop, but it should be
>> much more useful on server workloads. Desktops have erratic load and
>> can go for a long time with very little I/O activity. But, here are
>> the current numbers anyway:
>>
>> Node 0, zone DMA32
>> pages free 5371
>> nr_inactive_anon 4257
>> nr_active_anon 139719
>> nr_inactive_file 617537
>> nr_active_file 51671
>> inactive_ratio: 5
>> avg_age_inactive_anon: 2514752
>> avg_age_active_anon: 2514752
>> avg_age_inactive_file: 876416
>> avg_age_active_file: 2514752
>> Node 0, zone Normal
>> pages free 424
>> nr_inactive_anon 253
>> nr_active_anon 54480
>> nr_inactive_file 63274
>> nr_active_file 44116
>> inactive_ratio: 1
>> avg_age_inactive_anon: 2531712
>> avg_age_active_anon: 2531712
>> avg_age_inactive_file: 901120
>> avg_age_active_file: 2531712
>>
>>> In our kernel we use similar engine as source of statistics for
>>> scheduler in
>>> memory reclaimer. This is O(1) scheduler which shifts vmscan
>>> priorities for lru
>>> vectors depending on their sizes, limits and ages. It tries to
>>> balance memory
>>> pressure among containers. I'll try to rework it for the mainline
>>> kernel soon.
>>>
>>> Seems like these ages also can be used for optimal memory pressure
>>> distribution
>>> between file and anon pages, and probably for balancing pressure
>>> among zones.
>>
>> This all sounds very promising. Especially because I currently observe
>> quite some imbalance among zones.
>
> As I see, most likely reason of such imbalances is 'break' condition
> inside of shrink_lruvec().
> So can try to disable it see what will happen.
Thanks for the hint. I will pay some more attention to this function
next time I investigate code.
>
> But these numbers from your desktop actually doesn't proves this
> problem. Seems like difference
> between zones is within the precision of this method. I don't know how
> to describe this precisely.
> Probably irregularity between milestones also should be taken into the
> account to describe current
> situation and quality of measurement.
>
Ah, no, the numbers were more like a proof that your patch is running
fine, nothing specific about them. I was just making a quick check that
your patch is stable enough before I run it in production, and it seems
it's working just fine.
In the next hour or so I will patch the kernel on the server where I
intend to do much more analysis. I also prepared a set of graphs based
on the numbers your code provides. Based on the preliminary tests, I
believe that I'll be interested only in the aging of the inactive file
lists. What I'm after is the bug explained here
http://marc.info/?l=linux-mm&m=136571221426984 and if I'm right, your
patch will help to better reveal extreme disbalance observed between
dma32 and normal zone file LRU aging. But only on a 4GB nodes. I haven't
seen anything similar on a 8GB nodes, where dma32 and normal zones are
approximately the same sizes.
--
Zlatko
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2013-05-04 13:33 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-30 11:02 [PATCH RFC] mm: lru milestones, timestamps and ages Konstantin Khlebnikov
2013-05-03 14:07 ` Zlatko Calusic
2013-05-04 11:53 ` Konstantin Khlebnikov
2013-05-04 13:01 ` Konstantin Khlebnikov
2013-05-04 21:36 ` Zlatko Calusic
2013-05-06 19:08 ` Johannes Weiner
2013-05-04 13:32 ` Zlatko Calusic [this message]
2013-05-10 10:28 ` Mel Gorman
2013-05-10 14:12 ` Konstantin Khlebnikov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=51850E0A.5010803@bitsync.net \
--to=zcalusic@bitsync.net \
--cc=khlebnikov@openvz.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.