From: Dave Hansen <dave.hansen@linux.intel.com>
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen <dave.hansen@linux.intel.com>,
mhocko@suse.com, jannh@google.com, vbabka@suse.cz,
minchan@kernel.org, dancol@google.com, joel@joelfernandes.org,
akpm@linux-foundation.org
Subject: [PATCH 2/2] mm/madvise: skip MADV_PAGEOUT on shared swap cache pages
Date: Mon, 23 Mar 2020 16:41:51 -0700
Message-ID: <20200323234151.10AF5617@viggo.jf.intel.com>
In-Reply-To: <20200323234147.558EBA81@viggo.jf.intel.com>
From: Dave Hansen <dave.hansen@linux.intel.com>
MADV_PAGEOUT might interfere with other processes if it is
allowed to reclaim pages shared with other processes. A
previous patch tried to avoid this for anonymous pages
which were shared by a fork(). It did this by checking
page_mapcount().
That works great for mapped pages. But it cannot detect
unmapped swap cache pages. This was not a problem until
the previous patch, which added the ability for
MADV_PAGEOUT to *find* swap cache pages.
A process doing MADV_PAGEOUT which finds an unmapped swap
cache page and evicts it might interfere with another process
which had the same page mapped. But, such a page would have
a page_mapcount() of 1 since the page is only actually mapped
in the *other* process. The page_mapcount() test would fail
to detect the situation.
Thankfully, there is a reference count for swap entries.
To fix this, simply consult both page_mapcount() and the swap
reference count via page_swapcount().
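The accounting described above can be modeled in userspace. This is only
a sketch: struct page_model, may_pageout() and check_examples() are
hypothetical names invented for illustration, and the three integer/bool
fields merely stand in for what PageSwapCache(), page_swapcount() and
page_mapcount() would report in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-page state the patch consults. */
struct page_model {
	bool in_swap_cache;	/* models PageSwapCache(page) */
	int swapcount;		/* models page_swapcount(page) */
	int mapcount;		/* models page_mapcount(page) */
};

/* True if the modeled page has at most one user and may be reclaimed. */
static bool may_pageout(const struct page_model *p)
{
	int nr_page_references = 0;

	/* Swap entry references can become real mappings at any time. */
	if (p->in_swap_cache)
		nr_page_references += p->swapcount;

	/* Count every current mapping of the page as well. */
	nr_page_references += p->mapcount;

	return nr_page_references <= 1;
}

/* Walks through the cases discussed in the changelog. */
static void check_examples(void)
{
	/* Mapped only by the caller: fine to reclaim. */
	struct page_model private_page = { false, 0, 1 };
	/* Shared by fork(): two mappings, already caught by mapcount. */
	struct page_model forked_page = { false, 0, 2 };
	/*
	 * Caller holds only a swap entry; another process has the page
	 * mapped. mapcount alone is 1, so the old test would miss it.
	 */
	struct page_model shared_swap_cache = { true, 1, 1 };

	assert(may_pageout(&private_page));
	assert(!may_pageout(&forked_page));
	assert(!may_pageout(&shared_swap_cache));
}
```

The third case is the one this patch is about: only the sum of the two
counts exposes the second user.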
I rigged up a little test program to try to create these
situations. Basically, if the parent "reader" RSS changes
in response to MADV_PAGEOUT actions in the child, there is
a problem.
https://www.sr71.net/~dave/intel/madv-pageout.c
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Jann Horn <jannh@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Daniel Colascione <dancol@google.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
b/mm/madvise.c | 37 +++++++++++++++++++++++++++++--------
1 file changed, 29 insertions(+), 8 deletions(-)
diff -puN mm/madvise.c~madv-pageout-ignore-shared-swap-cache mm/madvise.c
--- a/mm/madvise.c~madv-pageout-ignore-shared-swap-cache 2020-03-23 16:30:52.022385888 -0700
+++ b/mm/madvise.c 2020-03-23 16:41:15.448384333 -0700
@@ -261,6 +261,7 @@ static struct page *pte_get_reclaim_page
{
swp_entry_t entry;
struct page *page;
+ int nr_page_references = 0;
/* Totally empty PTE: */
if (pte_none(ptent))
@@ -271,7 +272,7 @@ static struct page *pte_get_reclaim_page
page = vm_normal_page(vma, addr, ptent);
if (page)
get_page(page);
- return page;
+ goto got_page;
}
/*
@@ -292,7 +293,33 @@ static struct page *pte_get_reclaim_page
* The PTE was a true swap entry. The page may be in
* the swap cache.
*/
- return lookup_swap_cache(entry, vma, addr);
+ page = lookup_swap_cache(entry, vma, addr);
+ if (!page)
+ return NULL;
+got_page:
+ /*
+ * Account for references to the swap entry. These
+ * might be "upgraded" to a normal mapping at any
+ * time.
+ */
+ if (PageSwapCache(page))
+ nr_page_references += page_swapcount(page);
+
+ /*
+ * Account for all mappings of the page, including
+ * when it is in the swap cache. This ensures that
+ * MADV_PAGEOUT does not interfere with anything shared
+ * with another process.
+ */
+ nr_page_references += page_mapcount(page);
+
+ /* Any extra references? Do not reclaim it. */
+ if (nr_page_references > 1) {
+ put_page(page);
+ return NULL;
+ }
+
+ return page;
}
/*
@@ -477,12 +504,6 @@ regular_page:
continue;
}
- /* Do not interfere with other mappings of this page */
- if (page_mapcount(page) != 1) {
- put_page(page);
- continue;
- }
-
VM_BUG_ON_PAGE(PageTransCompound(page), page);
if (!is_swap_pte(ptent) && pte_young(ptent)) {
_
Thread overview: 7+ messages
2020-03-23 23:41 [PATCH 0/2] mm/madvise: teach MADV_PAGEOUT about swap cache Dave Hansen
2020-03-23 23:41 ` [PATCH 1/2] mm/madvise: help MADV_PAGEOUT to find swap cache pages Dave Hansen
2020-03-26 6:24 ` Minchan Kim
2020-03-23 23:41 ` Dave Hansen [this message]
2020-03-26 6:28 ` [PATCH 2/2] mm/madvise: skip MADV_PAGEOUT on shared " Minchan Kim
2020-03-26 23:00 ` Dave Hansen
2020-03-27 6:42 ` Minchan Kim