From: Zlatko Calusic <zcalusic@bitsync.net>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
Andrea Arcangeli <aarcange@redhat.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 0/3] mm: improve page aging fairness between zones/nodes
Date: Mon, 22 Jul 2013 19:14:22 +0200 [thread overview]
Message-ID: <51ED686E.6030606@bitsync.net> (raw)
In-Reply-To: <20130722170112.GE715@cmpxchg.org>
[-- Attachment #1: Type: text/plain, Size: 2359 bytes --]
On 22.07.2013 19:01, Johannes Weiner wrote:
> Hi Zlatko,
>
> On Mon, Jul 22, 2013 at 06:48:52PM +0200, Zlatko Calusic wrote:
>> On 19.07.2013 22:55, Johannes Weiner wrote:
>>> The way the page allocator interacts with kswapd creates aging
>>> imbalances, where the amount of time a userspace page gets in memory
>>> under reclaim pressure is dependent on which zone, which node the
>>> allocator took the page frame from.
>>>
>>> #1 fixes missed kswapd wakeups on NUMA systems, which lead to some
>>> nodes falling behind for a full reclaim cycle relative to the other
>>> nodes in the system
>>>
>>> #3 fixes an interaction where kswapd and a continuous stream of page
>>> allocations keep the preferred zone of a task between the high and
>>> low watermark (allocations succeed + kswapd does not go to sleep)
>>> indefinitely, completely underutilizing the lower zones and
>>> thrashing on the preferred zone
>>>
>>> These patches are the aging fairness part of the thrash-detection
>>> based file LRU balancing. Andrea recommended to submit them
>>> separately as they are bugfixes in their own right.
>>>
>>
>> I have the patch applied and under testing. So far, so good. It
>> looks like it could finally fix the bug that I was chasing few
>> months ago (nicely described in your bullet #3). But, few more days
>> of testing will be needed before I can reach a quality verdict.
>
> I should have remembered that you talked about this problem... Thanks
> a lot for testing!
>
> May I ask for the zone layout of your test machine(s)? I.e. how many
> nodes if NUMA, how big Normal and DMA32 (on Node 0) are.
>
I have been reading about NUMA hw for at least a decade, but I guess
another one will pass before I actually see one. ;) Find /proc/zoneinfo
attached.
If your patchset fails my case, then nr_{in,}active_file in Normal zone
will drop close to zero in a matter of days. If it fixes this particular
imbalance, and I have faith it will, then those two counters will stay
in relative balance with nr_{in,}active_anon in the same zone. I also
applied Konstantin's excellent lru-milestones-timestamps-and-ages, and
graphing of interesting numbers on top of that, which is why I already
have faith in your patchset. I can see much better balance between zones
already. But, let's give it some more time...
--
Zlatko
[-- Attachment #2: zoneinfo --]
[-- Type: text/plain, Size: 3970 bytes --]
Node 0, zone DMA
pages free 3975
min 132
low 165
high 198
scanned 0
spanned 4095
present 3998
managed 3977
nr_free_pages 3975
nr_inactive_anon 0
nr_active_anon 0
nr_inactive_file 0
nr_active_file 0
nr_unevictable 0
nr_mlock 0
nr_anon_pages 0
nr_mapped 0
nr_file_pages 0
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 0
nr_slab_unreclaimable 2
nr_page_table_pages 0
nr_kernel_stack 0
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 0
nr_dirtied 0
nr_written 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 3236, 3933, 3933)
pagesets
cpu: 0
count: 0
high: 0
batch: 1
vm stats threshold: 4
cpu: 1
count: 0
high: 0
batch: 1
vm stats threshold: 4
all_unreclaimable: 1
start_pfn: 1
inactive_ratio: 1
avg_age_inactive_anon: 0
avg_age_active_anon: 0
avg_age_inactive_file: 0
avg_age_active_file: 0
Node 0, zone DMA32
pages free 83177
min 27693
low 34616
high 41539
scanned 0
spanned 1044480
present 847429
managed 829295
nr_free_pages 83177
nr_inactive_anon 2061
nr_active_anon 313380
nr_inactive_file 199460
nr_active_file 207097
nr_unevictable 0
nr_mlock 0
nr_anon_pages 239688
nr_mapped 38888
nr_file_pages 424978
nr_dirty 87
nr_writeback 0
nr_slab_reclaimable 9119
nr_slab_unreclaimable 2054
nr_page_table_pages 1795
nr_kernel_stack 144
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 18421
nr_dirtied 725414
nr_written 768505
nr_anon_transparent_hugepages 112
nr_free_cma 0
protection: (0, 0, 697, 697)
pagesets
cpu: 0
count: 132
high: 186
batch: 31
vm stats threshold: 24
cpu: 1
count: 146
high: 186
batch: 31
vm stats threshold: 24
all_unreclaimable: 0
start_pfn: 4096
inactive_ratio: 5
avg_age_inactive_anon: 5467648
avg_age_active_anon: 5467648
avg_age_inactive_file: 3184128
avg_age_active_file: 5467648
Node 0, zone Normal
pages free 17164
min 5965
low 7456
high 8947
scanned 0
spanned 196607
present 196607
managed 178491
nr_free_pages 17164
nr_inactive_anon 294
nr_active_anon 64754
nr_inactive_file 42191
nr_active_file 44925
nr_unevictable 0
nr_mlock 0
nr_anon_pages 51456
nr_mapped 9580
nr_file_pages 91492
nr_dirty 27
nr_writeback 0
nr_slab_reclaimable 2686
nr_slab_unreclaimable 1194
nr_page_table_pages 401
nr_kernel_stack 65
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 4376
nr_dirtied 163250
nr_written 172369
nr_anon_transparent_hugepages 18
nr_free_cma 0
protection: (0, 0, 0, 0)
pagesets
cpu: 0
count: 177
high: 186
batch: 31
vm stats threshold: 16
cpu: 1
count: 170
high: 186
batch: 31
vm stats threshold: 16
all_unreclaimable: 0
start_pfn: 1048576
inactive_ratio: 1
avg_age_inactive_anon: 5468672
avg_age_active_anon: 5468672
avg_age_inactive_file: 3382628
avg_age_active_file: 5468672
next prev parent reply other threads:[~2013-07-22 17:14 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-07-19 20:55 [patch 0/3] mm: improve page aging fairness between zones/nodes Johannes Weiner
2013-07-19 20:55 ` [patch 1/3] mm: vmscan: fix numa reclaim balance problem in kswapd Johannes Weiner
2013-07-22 19:47 ` Rik van Riel
2013-07-22 20:14 ` Johannes Weiner
2013-07-26 22:53 ` Andrew Morton
2013-07-30 17:45 ` Johannes Weiner
2013-07-31 12:43 ` Johannes Weiner
2013-07-19 20:55 ` [patch 2/3] mm: page_alloc: rearrange watermark checking in get_page_from_freelist Johannes Weiner
2013-07-22 19:51 ` Rik van Riel
2013-07-19 20:55 ` [patch 3/3] mm: page_alloc: fair zone allocator policy Johannes Weiner
2013-07-22 20:21 ` Rik van Riel
2013-07-22 21:04 ` Johannes Weiner
2013-07-22 22:48 ` Rik van Riel
2013-07-25 6:50 ` Paul Bolle
2013-07-25 15:10 ` Johannes Weiner
2013-07-25 15:20 ` Paul Bolle
2013-07-29 17:48 ` Andrea Arcangeli
2013-07-29 22:24 ` Johannes Weiner
2013-08-01 2:56 ` Minchan Kim
2013-08-01 4:31 ` Rik van Riel
2013-08-01 15:51 ` Andrea Arcangeli
2013-08-01 19:58 ` Johannes Weiner
2013-08-01 22:16 ` Andrea Arcangeli
2013-08-02 6:22 ` Johannes Weiner
2013-08-02 7:32 ` Minchan Kim
2013-07-22 16:48 ` [patch 0/3] mm: improve page aging fairness between zones/nodes Zlatko Calusic
2013-07-22 17:01 ` Johannes Weiner
2013-07-22 17:14 ` Zlatko Calusic [this message]
2013-07-24 11:18 ` Zlatko Calusic
2013-07-24 12:46 ` Hush Bensen
2013-07-24 13:59 ` Zlatko Calusic
2013-07-31 9:33 ` Zlatko Calusic
2013-07-26 22:45 ` Andrew Morton
2013-07-26 23:14 ` Johannes Weiner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=51ED686E.6030606@bitsync.net \
--to=zcalusic@bitsync.net \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).