From: Zlatko Calusic <zcalusic@bitsync.net>
To: Mel Gorman <mgorman@suse.de>
Cc: linux-mm <linux-mm@kvack.org>
Subject: The pagecache unloved in zone NORMAL?
Date: Thu, 11 Apr 2013 22:30:05 +0200
Message-ID: <51671D4D.9080003@bitsync.net>
[-- Attachment #1: Type: text/plain, Size: 2840 bytes --]
This is something that I've been chasing for months, and I'm getting
tired of it. :(
The issue has been observed on 4GB RAM x86_64 machines (one server, one
desktop) without a swap subsystem (not even compiled in). The important
thing to remember about a 4GB x86_64 machine is that the NORMAL zone is
about 6 times smaller than the DMA32 zone.
As a picture is worth 10000 words, I've attached two graphs that nicely
show what I've observed. As memory usage slowly rises, the MM subsystem
gradually evicts pagecache pages from the NORMAL zone, trying to
eventually get rid of all of them! This process takes days, typically
more than 5 on this particular server. Of course, this means that
eventually the zone will be chock-full of anon pages, and without swap,
the kernel can't do much about it. But as it tries to balance the zone,
various bad things happen. On the server I've seen sudden freeing
of hundreds of MB of pagecache; on the desktop there's a general
slowdown, sound dropouts (HTTP streaming) and so on.
The first graph was probably a 3.8 kernel; the second one is 3.9.0-rc4+
patched with the kswapd series v2. Obviously not much has changed wrt
this problem, although it seems to me that the kernel now hesitates to
free large amounts of memory needlessly, or does it less often. But on
the desktop there's no improvement: as soon as the pagecache gets really
low in the NORMAL zone, there's severe slowdown, dropouts, etc. One
other thing: the lower graphs say "Normal zone file pages", but what is
actually graphed is nr_active_file + nr_inactive_file from the NORMAL
zone!
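For anyone wanting to reproduce the graphed value, here is a minimal
sketch of how it could be derived from zoneinfo text. It assumes the
counter names shown in the attached zoneinfo outputs (nr_active_file,
nr_inactive_file); the function name zone_file_pages and its parameters
are hypothetical, not part of any existing tool.

```python
import re

# "Node 0, zone   Normal" style headers, tolerant of variable spacing.
ZONE_HEADER = re.compile(r"^Node\s+(\d+),\s+zone\s+(\S+)")
# Per-zone counters such as "nr_active_file 33".
COUNTER = re.compile(r"^\s*(nr_\w+)\s+(\d+)\s*$")

FILE_LRU_COUNTERS = ("nr_active_file", "nr_inactive_file")


def zone_file_pages(zoneinfo_text, zone_name, node=0):
    """Return nr_active_file + nr_inactive_file for one zone of one node."""
    in_zone = False
    total = 0
    for line in zoneinfo_text.splitlines():
        header = ZONE_HEADER.match(line)
        if header:
            # Track whether we are inside the requested zone's block.
            in_zone = (int(header.group(1)) == node
                       and header.group(2).lower() == zone_name.lower())
            continue
        if not in_zone:
            continue
        counter = COUNTER.match(line)
        if counter and counter.group(1) in FILE_LRU_COUNTERS:
            total += int(counter.group(2))
    return total
```

Feeding it the contents of /proc/zoneinfo at regular intervals, e.g.
zone_file_pages(text, "Normal"), should give a series comparable to the
lower graphs, provided the running kernel still exports these counters
under the same names.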
I've also attached two zoneinfo outputs. Notice how the DMA32 zones have
hundreds of thousands of pagecache pages, but only a few dozen are in
the NORMAL zone! Also, nr_vmscan_write is telling: much higher values for
zone NORMAL (especially when you take into account how little pagecache
is there!). I guess those poor pagecache pages that survive there get
written a millisecond after they're dirtied, a probable cause of the
slowdown I experience on the desktop.
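The comparison above can be sketched as a small script. This is only an
illustration under stated assumptions: a single NUMA node (zone names
would collide otherwise), the counter names from the attached outputs,
and hypothetical helper names parse_zones and
vmscan_writes_per_file_page. Note that nr_vmscan_write is cumulative
while the file page counts are instantaneous, so the ratio is a rough
indicator rather than a precise rate.

```python
import re

ZONE_HEADER = re.compile(r"^Node\s+\d+,\s+zone\s+(\S+)")
FIELD = re.compile(r"^\s*(nr_\w+)\s+(\d+)\s*$")


def parse_zones(zoneinfo_text):
    """Map zone name -> {counter name: value} for all nr_* counters."""
    zones = {}
    current = None
    for line in zoneinfo_text.splitlines():
        header = ZONE_HEADER.match(line)
        if header:
            current = zones.setdefault(header.group(1), {})
            continue
        field = FIELD.match(line)
        if field and current is not None:
            current[field.group(1)] = int(field.group(2))
    return zones


def vmscan_writes_per_file_page(counters):
    """nr_vmscan_write divided by LRU file pages; None if there are none."""
    file_pages = (counters.get("nr_active_file", 0)
                  + counters.get("nr_inactive_file", 0))
    if file_pages == 0:
        return None
    return counters.get("nr_vmscan_write", 0) / file_pages


# Values copied from the attached zoneinfo-server.txt.
SERVER_SAMPLE = """\
Node 0, zone DMA32
nr_inactive_file 159143
nr_active_file 148277
nr_vmscan_write 203475
Node 0, zone Normal
nr_inactive_file 121
nr_active_file 33
nr_vmscan_write 12486086
"""

for name, counters in parse_zones(SERVER_SAMPLE).items():
    print(name, vmscan_writes_per_file_page(counters))
```

On the server numbers, the Normal zone's ratio comes out several orders
of magnitude above DMA32's, which is the imbalance described above.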
There's a reasonable possibility that this imbalance between zones was
introduced somewhere between 3.3 and 3.4, because the VM behaves slightly
differently in 3.3 (doesn't evict pagecache from the NORMAL zone so
aggressively). Unfortunately, I have some userspace incompatibilities
when running 3.3, so I'm not 100% sure (didn't run it long enough to be
absolutely sure). I tried to find the problematic commit, and
cc715d99e529 certainly looked like the culprit, but it's not!
buffer_heads_over_limit is NEVER true on the machine, not even close. So
that commit is basically a no-op. Also, it doesn't matter whether THP is
on or off; the behaviour stays the same.
My apologies for the long email, I tried to provide as much information
as possible.
--
Zlatko
[-- Attachment #2: server-3.8.png --]
[-- Type: image/png, Size: 33133 bytes --]
[-- Attachment #3: server-kswapd-v2.png --]
[-- Type: image/png, Size: 34895 bytes --]
[-- Attachment #4: zoneinfo-desktop.txt --]
[-- Type: text/plain, Size: 4261 bytes --]
Node 0, zone DMA
pages free 3974
min 128
low 160
high 192
scanned 0
spanned 4080
present 3912
managed 3976
nr_free_pages 3974
nr_inactive_anon 0
nr_active_anon 0
nr_inactive_file 0
nr_active_file 0
nr_unevictable 0
nr_mlock 0
nr_anon_pages 0
nr_mapped 0
nr_file_pages 0
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 0
nr_slab_unreclaimable 2
nr_page_table_pages 0
nr_kernel_stack 0
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 0
nr_dirtied 0
nr_written 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 3259, 4015, 4015)
pagesets
cpu: 0
count: 0
high: 0
batch: 1
vm stats threshold: 6
cpu: 1
count: 0
high: 0
batch: 1
vm stats threshold: 6
cpu: 2
count: 0
high: 0
batch: 1
vm stats threshold: 6
cpu: 3
count: 0
high: 0
batch: 1
vm stats threshold: 6
all_unreclaimable: 1
start_pfn: 16
inactive_ratio: 1
Node 0, zone DMA32
pages free 135587
min 27326
low 34157
high 40989
scanned 0
spanned 1044480
present 834513
managed 828967
nr_free_pages 135587
nr_inactive_anon 8165
nr_active_anon 264237
nr_inactive_file 190424
nr_active_file 198798
nr_unevictable 1
nr_mlock 1
nr_anon_pages 219052
nr_mapped 33586
nr_file_pages 397576
nr_dirty 82
nr_writeback 0
nr_slab_reclaimable 21757
nr_slab_unreclaimable 3505
nr_page_table_pages 3293
nr_kernel_stack 134
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 8354
nr_dirtied 5734689
nr_written 5557592
nr_anon_transparent_hugepages 88
nr_free_cma 0
protection: (0, 0, 756, 756)
pagesets
cpu: 0
count: 181
high: 186
batch: 31
vm stats threshold: 36
cpu: 1
count: 103
high: 186
batch: 31
vm stats threshold: 36
cpu: 2
count: 154
high: 186
batch: 31
vm stats threshold: 36
cpu: 3
count: 149
high: 186
batch: 31
vm stats threshold: 36
all_unreclaimable: 0
start_pfn: 4096
inactive_ratio: 5
Node 0, zone Normal
pages free 7954
min 6337
low 7921
high 9505
scanned 0
spanned 196608
present 193536
managed 178447
nr_free_pages 7954
nr_inactive_anon 1916
nr_active_anon 136297
nr_inactive_file 32
nr_active_file 0
nr_unevictable 7767
nr_mlock 7767
nr_anon_pages 118628
nr_mapped 3090
nr_file_pages 3784
nr_dirty 4
nr_writeback 0
nr_slab_reclaimable 5476
nr_slab_unreclaimable 5581
nr_page_table_pages 2785
nr_kernel_stack 254
nr_unstable 0
nr_bounce 0
nr_vmscan_write 2693969
nr_vmscan_immediate_reclaim 10529
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 2348
nr_dirtied 1912471
nr_written 1784816
nr_anon_transparent_hugepages 46
nr_free_cma 0
protection: (0, 0, 0, 0)
pagesets
cpu: 0
count: 151
high: 186
batch: 31
vm stats threshold: 24
cpu: 1
count: 171
high: 186
batch: 31
vm stats threshold: 24
cpu: 2
count: 143
high: 186
batch: 31
vm stats threshold: 24
cpu: 3
count: 54
high: 186
batch: 31
vm stats threshold: 24
all_unreclaimable: 0
start_pfn: 1048576
inactive_ratio: 1
[-- Attachment #5: zoneinfo-server.txt --]
[-- Type: text/plain, Size: 3628 bytes --]
Node 0, zone DMA
pages free 3975
min 132
low 165
high 198
scanned 0
spanned 4080
present 3983
managed 3977
nr_free_pages 3975
nr_inactive_anon 0
nr_active_anon 0
nr_inactive_file 0
nr_active_file 0
nr_unevictable 0
nr_mlock 0
nr_anon_pages 0
nr_mapped 0
nr_file_pages 0
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 0
nr_slab_unreclaimable 2
nr_page_table_pages 0
nr_kernel_stack 0
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 0
nr_dirtied 0
nr_written 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 3236, 3934, 3934)
pagesets
cpu: 0
count: 0
high: 0
batch: 1
vm stats threshold: 4
cpu: 1
count: 0
high: 0
batch: 1
vm stats threshold: 4
all_unreclaimable: 1
start_pfn: 16
inactive_ratio: 1
Node 0, zone DMA32
pages free 198806
min 27693
low 34616
high 41539
scanned 0
spanned 1044480
present 847429
managed 828646
nr_free_pages 198806
nr_inactive_anon 152
nr_active_anon 296082
nr_inactive_file 159143
nr_active_file 148277
nr_unevictable 0
nr_mlock 0
nr_anon_pages 212100
nr_mapped 30139
nr_file_pages 325028
nr_dirty 61
nr_writeback 0
nr_slab_reclaimable 23373
nr_slab_unreclaimable 1418
nr_page_table_pages 1044
nr_kernel_stack 55
nr_unstable 0
nr_bounce 0
nr_vmscan_write 203475
nr_vmscan_immediate_reclaim 1159794
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 17608
nr_dirtied 120403187
nr_written 119379429
nr_anon_transparent_hugepages 130
nr_free_cma 0
protection: (0, 0, 697, 697)
pagesets
cpu: 0
count: 121
high: 186
batch: 31
vm stats threshold: 24
cpu: 1
count: 107
high: 186
batch: 31
vm stats threshold: 24
all_unreclaimable: 0
start_pfn: 4096
inactive_ratio: 5
Node 0, zone Normal
pages free 7449
min 5965
low 7456
high 8947
scanned 0
spanned 196607
present 196607
managed 178497
nr_free_pages 7449
nr_inactive_anon 280
nr_active_anon 149997
nr_inactive_file 121
nr_active_file 33
nr_unevictable 0
nr_mlock 0
nr_anon_pages 138419
nr_mapped 2050
nr_file_pages 2796
nr_dirty 4
nr_writeback 0
nr_slab_reclaimable 2388
nr_slab_unreclaimable 2284
nr_page_table_pages 1203
nr_kernel_stack 156
nr_unstable 0
nr_bounce 0
nr_vmscan_write 12486086
nr_vmscan_immediate_reclaim 1290613
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 2642
nr_dirtied 16946001
nr_written 16543553
nr_anon_transparent_hugepages 18
nr_free_cma 0
protection: (0, 0, 0, 0)
pagesets
cpu: 0
count: 93
high: 186
batch: 31
vm stats threshold: 16
cpu: 1
count: 114
high: 186
batch: 31
vm stats threshold: 16
all_unreclaimable: 0
start_pfn: 1048576
inactive_ratio: 1