From: Kuo-Hsin Yang <vovoy@chromium.org>
To: Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>, Sonny Rao <sonnyrao@chromium.org>,
Kuo-Hsin Yang <vovoy@chromium.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH] mm: vmscan: fix not scanning anonymous pages when detecting file refaults
Date: Wed, 19 Jun 2019 16:08:35 +0800 [thread overview]
Message-ID: <20190619080835.GA68312@google.com> (raw)
When file refaults are detected and there are many inactive file pages,
the system never reclaim anonymous pages, the file pages are dropped
aggressively when there are still a lot of cold anonymous pages and
system thrashes. This issue impacts the performance of applications with
large executable, e.g. chrome.
When file refaults are detected. inactive_list_is_low() may return
different values depends on the actual_reclaim parameter, the following
2 conditions could be satisfied at the same time.
1) inactive_list_is_low() returns false in get_scan_count() to trigger
scanning file lists only.
2) inactive_list_is_low() returns true in shrink_list() to allow
scanning active file list.
In that case vmscan would only scan file lists, and as active file list
is also scanned, inactive_list_is_low() may keep returning false in
get_scan_count() until file cache is very low.
Before commit 2a2e48854d70 ("mm: vmscan: fix IO/refault regression in
cache workingset transition"), inactive_list_is_low() never returns
different value in get_scan_count() and shrink_list() in one
shrink_node_memcg() run. The original design should be that when
inactive_list_is_low() returns false for file lists, vmscan only scan
inactive file list. As only inactive file list is scanned,
inactive_list_is_low() would soon return true.
This patch makes the return value of inactive_list_is_low() independent
of actual_reclaim.
The problem can be reproduced by the following test program.
---8<---
void fallocate_file(const char *filename, off_t size)
{
struct stat st;
int fd;
if (!stat(filename, &st) && st.st_size >= size)
return;
fd = open(filename, O_WRONLY | O_CREAT, 0600);
if (fd < 0) {
perror("create file");
exit(1);
}
if (posix_fallocate(fd, 0, size)) {
perror("fallocate");
exit(1);
}
close(fd);
}
long *alloc_anon(long size)
{
long *start = malloc(size);
memset(start, 1, size);
return start;
}
long access_file(const char *filename, long size, long rounds)
{
int fd, i;
volatile char *start1, *end1, *start2;
const int page_size = getpagesize();
long sum = 0;
fd = open(filename, O_RDONLY);
if (fd == -1) {
perror("open");
exit(1);
}
/*
* Some applications, e.g. chrome, use a lot of executable file
* pages, map some of the pages with PROT_EXEC flag to simulate
* the behavior.
*/
start1 = mmap(NULL, size / 2, PROT_READ | PROT_EXEC, MAP_SHARED,
fd, 0);
if (start1 == MAP_FAILED) {
perror("mmap");
exit(1);
}
end1 = start1 + size / 2;
start2 = mmap(NULL, size / 2, PROT_READ, MAP_SHARED, fd, size / 2);
if (start2 == MAP_FAILED) {
perror("mmap");
exit(1);
}
for (i = 0; i < rounds; ++i) {
struct timeval before, after;
volatile char *ptr1 = start1, *ptr2 = start2;
gettimeofday(&before, NULL);
for (; ptr1 < end1; ptr1 += page_size, ptr2 += page_size)
sum += *ptr1 + *ptr2;
gettimeofday(&after, NULL);
printf("File access time, round %d: %f (sec)\n", i,
(after.tv_sec - before.tv_sec) +
(after.tv_usec - before.tv_usec) / 1000000.0);
}
return sum;
}
int main(int argc, char *argv[])
{
const long MB = 1024 * 1024;
long anon_mb, file_mb, file_rounds;
const char filename[] = "large";
long *ret1;
long ret2;
if (argc != 4) {
printf("usage: thrash ANON_MB FILE_MB FILE_ROUNDS\n");
exit(0);
}
anon_mb = atoi(argv[1]);
file_mb = atoi(argv[2]);
file_rounds = atoi(argv[3]);
fallocate_file(filename, file_mb * MB);
printf("Allocate %ld MB anonymous pages\n", anon_mb);
ret1 = alloc_anon(anon_mb * MB);
printf("Access %ld MB file pages\n", file_mb);
ret2 = access_file(filename, file_mb * MB, file_rounds);
printf("Print result to prevent optimization: %ld\n",
*ret1 + ret2);
return 0;
}
---8<---
Running the test program on 2GB RAM VM with kernel 5.2.0-rc5, the
program fills ram with 2048 MB memory, access a 200 MB file for 10
times. Without this patch, the file cache is dropped aggresively and
every access to the file is from disk.
$ ./thrash 2048 200 10
Allocate 2048 MB anonymous pages
Access 200 MB file pages
File access time, round 0: 2.489316 (sec)
File access time, round 1: 2.581277 (sec)
File access time, round 2: 2.487624 (sec)
File access time, round 3: 2.449100 (sec)
File access time, round 4: 2.420423 (sec)
File access time, round 5: 2.343411 (sec)
File access time, round 6: 2.454833 (sec)
File access time, round 7: 2.483398 (sec)
File access time, round 8: 2.572701 (sec)
File access time, round 9: 2.493014 (sec)
With this patch, these file pages can be cached.
$ ./thrash 2048 200 10
Allocate 2048 MB anonymous pages
Access 200 MB file pages
File access time, round 0: 2.475189 (sec)
File access time, round 1: 2.440777 (sec)
File access time, round 2: 2.411671 (sec)
File access time, round 3: 1.955267 (sec)
File access time, round 4: 0.029924 (sec)
File access time, round 5: 0.000808 (sec)
File access time, round 6: 0.000771 (sec)
File access time, round 7: 0.000746 (sec)
File access time, round 8: 0.000738 (sec)
File access time, round 9: 0.000747 (sec)
Fixes: 2a2e48854d70 ("mm: vmscan: fix IO/refault regression in cache workingset transition")
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
---
mm/vmscan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7889f583ced9f..b95d05fe828d1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2151,7 +2151,7 @@ static bool inactive_list_is_low(struct lruvec *lruvec, bool file,
* rid of the stale workingset quickly.
*/
refaults = lruvec_page_state_local(lruvec, WORKINGSET_ACTIVATE);
- if (file && actual_reclaim && lruvec->refaults != refaults) {
+ if (file && lruvec->refaults != refaults) {
inactive_ratio = 0;
} else {
gb = (inactive + active) >> (30 - PAGE_SHIFT);
--
2.22.0.410.gd8fdbe21b5-goog
next reply other threads:[~2019-06-19 8:08 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-06-19 8:08 Kuo-Hsin Yang [this message]
2019-06-27 4:03 ` [PATCH] mm: vmscan: fix not scanning anonymous pages when detecting file refaults Andrew Morton
2019-06-27 18:41 ` Johannes Weiner
2019-06-28 6:51 ` Minchan Kim
2019-06-28 8:44 ` Vovo Yang
2019-06-28 14:22 ` Johannes Weiner
2019-06-28 23:34 ` Minchan Kim
2019-06-28 6:58 ` Minchan Kim
2019-06-28 11:16 ` [PATCH v2] " Kuo-Hsin Yang
2019-06-28 14:32 ` Johannes Weiner
2019-06-28 23:36 ` Minchan Kim
2019-07-01 7:56 ` Kuo-Hsin Yang
2019-07-01 8:10 ` [PATCH] mm: vmscan: scan anonymous pages on " Kuo-Hsin Yang
2019-07-03 14:30 ` Michal Hocko
2019-07-04 9:47 ` Kuo-Hsin Yang
2019-07-04 11:04 ` Michal Hocko
2019-07-05 12:45 ` Kuo-Hsin Yang
2019-07-12 7:13 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190619080835.GA68312@google.com \
--to=vovoy@chromium.org \
--cc=akpm@linux-foundation.org \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=sonnyrao@chromium.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.