linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Rik van Riel <riel@redhat.com>
To: Andres Freund <andres@anarazel.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Vlastimil Babka <vbabka@suse.cz>
Subject: Re: Unhelpful caching decisions, possibly related to active/inactive sizing
Date: Wed, 17 Feb 2016 16:17:44 -0500	[thread overview]
Message-ID: <20160217161744.6ce0b1e5@annuminas.surriel.com> (raw)
In-Reply-To: <20160212193553.6pugckvamgtk4x5q@alap3.anarazel.de>

On Fri, 12 Feb 2016 20:35:53 +0100
Andres Freund <andres@anarazel.de> wrote:

> On 2016-02-12 13:46:53 +0100, Andres Freund wrote:
> > I'm wondering why pages that are repeatedly written to, in units above
> > the page size, are promoted to the active list? I mean if there never
> > are any reads or re-dirtying an already-dirty page, what's the benefit
> > of moving that page onto the active list?
> 
> We chatted about this on IRC and you proposed testing this by removing
> FGP_ACCESSED in grab_cache_page_write_begin.  I ran tests with that,
> after removing the aforementioned code to issue posix_fadvise(DONTNEED)
> in postgres.

That looks promising.

> Here the active/inactive lists didn't change as much as I hoped. A bit
> of reading made it apparent that the workingset logic in
> add_to_page_cache_lru() defated that attempt,

The patch below should help with that.

Does the GFP_ACCESSED change still help with the patch
below applied?

If so, we can add the partial write logic everywhere,
and use it here, too.

---8<---

Subject: mm,workingset: only do workingset activations on reads

When rewriting a page, the data in that page is replaced with new
data. This means that evicting something else from the active file
list, in order to cache data that will be replaced by something else,
is likely to be a waste of memory.

It is better to save the active list for frequently read pages, because
reads actually use the data that is in the page.

This patch ignores partial writes, because it is unclear whether the
complexity of identifying those is worth any potential performance
gain obtained from better caching pages that see repeated partial
writes at large enough intervals to not get caught by the use-twice
promotion code used for the inactive file list.

Signed-off-by: Rik van Riel <riel@redhat.com>
Reported-by: Andres Freund <andres@anarazel.de>
---
 mm/filemap.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index bc943867d68c..1235d27b2c97 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -703,8 +703,12 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
 		 * The page might have been evicted from cache only
 		 * recently, in which case it should be activated like
 		 * any other repeatedly accessed page.
+		 * The exception is pages getting rewritten; evicting other
+		 * data from the working set, only to cache data that will
+		 * get overwritten with something else, is a waste of memory.
 		 */
-		if (shadow && workingset_refault(shadow)) {
+		if (shadow && !(gfp_mask & GFP_WRITE) &&
+					workingset_refault(shadow)) {
 			SetPageActive(page);
 			workingset_activation(page);
 		} else

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2016-02-17 21:17 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-09 16:52 Unhelpful caching decisions, possibly related to active/inactive sizing Andres Freund
2016-02-09 22:42 ` Johannes Weiner
2016-02-11 20:34   ` Rik van Riel
2016-02-12 12:46     ` Andres Freund
2016-02-12 19:35       ` Andres Freund
2016-02-16 19:29         ` Johannes Weiner
2016-02-17 21:17         ` Rik van Riel [this message]
2016-02-19 22:19           ` Andres Freund
2016-02-12 12:56     ` Andres Freund
2016-02-12 20:24     ` Johannes Weiner
2016-02-19 22:07       ` Andres Freund

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160217161744.6ce0b1e5@annuminas.surriel.com \
    --to=riel@redhat.com \
    --cc=andres@anarazel.de \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).