public inbox for linux-kernel@vger.kernel.org
From: John Stultz <john.stultz@linaro.org>
To: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>,
	Dmitry Adamushko <dmitry.adamushko@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Android Kernel Team <kernel-team@android.com>,
	Robert Love <rlove@google.com>, Mel Gorman <mel@csn.ul.ie>,
	Hugh Dickins <hughd@google.com>, Rik van Riel <riel@redhat.com>,
	Dave Chinner <david@fromorbit.com>, Neil Brown <neilb@suse.de>,
	Andrea Righi <andrea@betterlinux.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Taras Glek <tgek@mozilla.com>, Mike Hommey <mh@glandium.org>,
	Jan Kara <jack@suse.cz>
Subject: Re: [PATCH 3/3] [RFC] tmpfs: Add FALLOC_FL_MARK_VOLATILE/UNMARK_VOLATILE handlers
Date: Fri, 08 Jun 2012 20:45:09 -0700
Message-ID: <4FD2C6C5.1070900@linaro.org>
In-Reply-To: <4FD1848B.7040102@gmail.com>

On 06/07/2012 09:50 PM, KOSAKI Motohiro wrote:
> (6/7/12 11:03 PM), John Stultz wrote:
>
>> So I'm falling back to using a shrinker for now, but I think Dmitry's
>> point is an interesting one, and am interested in finding a better
>> place to trigger purging volatile ranges from the mm code. If anyone
>> has any suggestions, let me know, otherwise I'll go back to trying to
>> better grok the mm code.
>
> I dislike vm features that abuse shrink_slab(), because it was not
> designed as a generic callback. It was designed for shrinking
> filesystem metadata, so the vm keeps a balance between page scanning
> and slab scanning. Heavy misuse of shrink_slab() may break that
> balancing logic, i.e. drop too much icache/dcache and hurt performance.
>
> As long as the code impact is small, I'd prefer to connect to the vm
> reclaim code directly.

I can see your concern about misusing the shrinker code. Your other
email's point about the problem of LRU range purging behavior on a
NUMA system makes some sense too.  Unfortunately I'm not yet familiar
enough with the reclaim core to sort out how best to track the volatile
ranges and connect their purging to the vm's reclaim core.

So for now, I've moved the code back to using the shrinker (along with
fixing a few bugs along the way).
Thus, currently we manage the ranges like so:
     [per-fs volatile range lru head] -> [volatile range] -> [volatile range] -> [volatile range]
With the per-fs shrinker zapping the volatile ranges from the lru.
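
To make that layout concrete, here is a rough userspace model of it (not
the patch code). All of the names (vrange, vrange_lru, vrange_mark,
vrange_unmark, vrange_shrink) are made up for illustration, and "purging"
here just sets a flag instead of freeing pages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one volatile range; not the kernel structure. */
struct vrange {
	size_t start, end;		/* byte range [start, end) */
	int purged;			/* set once the shrinker zapped it */
	struct vrange *prev, *next;	/* lru linkage */
};

/* Per-fs lru: head.next is the least recently marked range. */
struct vrange_lru {
	struct vrange head;		/* sentinel node, carries no range */
};

static void vrange_lru_init(struct vrange_lru *lru)
{
	lru->head.prev = lru->head.next = &lru->head;
}

/* Mark a range volatile: append it at the tail (most recently marked). */
static void vrange_mark(struct vrange_lru *lru, struct vrange *r,
			size_t start, size_t end)
{
	assert(start < end);
	r->start = start;
	r->end = end;
	r->purged = 0;
	r->prev = lru->head.prev;
	r->next = &lru->head;
	lru->head.prev->next = r;
	lru->head.prev = r;
}

/* Unmark a range; returns 1 if it was purged while volatile, else 0. */
static int vrange_unmark(struct vrange *r)
{
	if (!r->purged) {		/* still linked on the lru */
		r->prev->next = r->next;
		r->next->prev = r->prev;
	}
	r->prev = r->next = r;
	return r->purged;
}

/* Shrinker model: zap up to nr_to_scan ranges, oldest first.
 * Returns the number of bytes the zapped ranges covered. */
static size_t vrange_shrink(struct vrange_lru *lru, unsigned int nr_to_scan)
{
	size_t freed = 0;
	unsigned int i;

	for (i = 0; i < nr_to_scan && lru->head.next != &lru->head; i++) {
		struct vrange *r = lru->head.next;

		lru->head.next = r->next;
		r->next->prev = &lru->head;
		r->prev = r->next = r;
		r->purged = 1;
		freed += r->end - r->start;
	}
	return freed;
}
```

Note the selection is purely by the order ranges were marked volatile;
nothing here knows which zone or node the backing pages live on, which
is the NUMA problem mentioned above.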

I *think* ideally, the pages in a volatile range should be similar to
non-dirty file-backed pages.  There is a cost to restore them, but
freeing them is very cheap.  The trick is that volatile ranges
introduce a new relationship between pages. Since the neighboring
virtual pages in a volatile range are in effect tied together, purging
one effectively ruins the value of keeping the others, regardless of
which zone they are physically in.

So maybe the right approach is to give up the per-fs volatile range lru,
and try a variant of what DaveC and DaveH have suggested: letting the
page-based lru reclamation handle the selection on a physical page basis,
but then zapping the entirety of the neighboring range if any one page is
reclaimed.  In order to try to preserve the range-based LRU behavior, we
would activate all the pages in the range together when the range is
marked volatile.  Since we assume ranges are untouched while volatile,
that should preserve LRU purging behavior on single-node systems, and on
multi-node systems it should approximate it fairly closely.
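
As a rough sketch of the "reclaim one page, zap the whole range" idea,
here is a small userspace model. Everything in it (page_model, vol_range,
mark_volatile, reclaim_page) is hypothetical and only meant to show the
page-to-range relationship, not how the mm code would actually do it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical volatile range, expressed in page indices (inclusive). */
struct vol_range {
	size_t first, last;
	int purged;
};

/* Hypothetical per-page state; range is NULL for non-volatile pages. */
struct page_model {
	int resident;
	struct vol_range *range;
};

/* Tie every page in [first, last] to the same volatile range. */
static void mark_volatile(struct page_model *pages, struct vol_range *r,
			  size_t first, size_t last)
{
	size_t i;

	assert(first <= last);
	r->first = first;
	r->last = last;
	r->purged = 0;
	for (i = first; i <= last; i++)
		pages[i].range = r;
}

/* Page-based reclaim picked page idx.  An ordinary page is dropped on
 * its own; a page inside a volatile range takes the whole range with
 * it.  Returns the number of pages freed. */
static size_t reclaim_page(struct page_model *pages, size_t idx)
{
	struct vol_range *r = pages[idx].range;
	size_t freed = 0;
	size_t i;

	if (!pages[idx].resident)
		return 0;
	if (!r) {
		pages[idx].resident = 0;
		return 1;
	}
	for (i = r->first; i <= r->last; i++) {
		if (pages[i].resident) {
			pages[i].resident = 0;
			freed++;
		}
		pages[i].range = NULL;	/* range no longer exists */
	}
	r->purged = 1;
	return freed;
}
```

The selection is made per physical page, so it can follow whatever
per-zone or per-node policy reclaim already has, while the purge itself
stays range-granular.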

My main concern with this approach is that marking and unmarking volatile
ranges needs to be fast, so I'm worried about the additional overhead of
activating each of the containing pages on mark_volatile.

The other question I have with this approach is what happens on a system
that doesn't have swap: it *seems* (not totally sure I understand it
yet) the tmpfs file pages will be skipped over when we call
shrink_lruvec.  So it seems we may need to add a new lru_list enum and
nr[] entry (maybe LRU_VOLATILE?).  So then it may be that when we mark
a range as volatile, instead of just activating it, we move it to the
volatile lru, and then when we shrink from that list, we call back to
the filesystem to trigger the entire range purging.
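
A toy model of why a separate list might help, under the assumption that
swap-backed pages (which is where tmpfs pages sit) get a scan target of
zero when there is no swap.  The enum and scan_targets() below are made
up for illustration; the real reclaim code weighs the lists in ways this
ignores, and M_VOLATILE is just the hypothetical new entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's lru lists plus one new entry. */
enum lru_model {
	M_INACTIVE_ANON,
	M_ACTIVE_ANON,
	M_INACTIVE_FILE,
	M_ACTIVE_FILE,
	M_VOLATILE,		/* hypothetical list for volatile pages */
	M_NR_LISTS
};

/* Fill nr[] with how many pages to scan from each list.  Assumption:
 * without swap, the swap-backed (anon) lists are not scanned at all,
 * while every other list, including the volatile one, still is. */
static void scan_targets(const unsigned long size[M_NR_LISTS],
			 bool have_swap, unsigned long nr[M_NR_LISTS])
{
	int l;

	for (l = 0; l < M_NR_LISTS; l++) {
		bool anon = (l == M_INACTIVE_ANON || l == M_ACTIVE_ANON);

		nr[l] = (anon && !have_swap) ? 0 : size[l];
	}
}
```

In this model, volatile pages left on the anon lists would never be
looked at on a swapless system, whereas pages moved to the volatile list
still get scanned and can trigger the range purge.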

Does that sound reasonable?  Any other suggested approaches?  I'll think 
some more about it this weekend and try to get a patch scratched out 
early next week.

thanks
-john
