Re: [PATCH 0/3] Volatile Ranges (v11)

From: Minchan Kim <minchan@kernel.org>
To: Dave Hansen <dave@sr71.net>
Cc: Michal Hocko <mhocko@suse.cz>,
	John Stultz <john.stultz@linaro.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Android Kernel Team <kernel-team@android.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Robert Love <rlove@google.com>, Mel Gorman <mel@csn.ul.ie>,
	Hugh Dickins <hughd@google.com>, Rik van Riel <riel@redhat.com>,
	Dmitry Adamushko <dmitry.adamushko@gmail.com>,
	Neil Brown <neilb@suse.de>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Mike Hommey <mh@glandium.org>, Taras Glek <tglek@mozilla.com>,
	Dhaval Giani <dgiani@mozilla.com>, Jan Kara <jack@suse.cz>,
	KOSAKI Motohiro <kosaki.motohiro@gmail.com>,
	Michel Lespinasse <walken@google.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH 0/3] Volatile Ranges (v11)
Date: Thu, 20 Mar 2014 16:45:29 +0900
Message-ID: <20140320074529.GB5902@bbox>
In-Reply-To: <532A3872.1080101@sr71.net>

Hello Dave,

On Wed, Mar 19, 2014 at 05:38:10PM -0700, Dave Hansen wrote:
> On 03/18/2014 05:24 AM, Michal Hocko wrote:
> > On Fri 14-03-14 11:33:30, John Stultz wrote:
> > [...]
> >> Volatile ranges provides a method for userland to inform the kernel that
> >> a range of memory is safe to discard (ie: can be regenerated) but
> >> userspace may want to try access it in the future.  It can be thought of
> >> as similar to MADV_DONTNEED, but that the actual freeing of the memory
> >> is delayed and only done under memory pressure, and the user can try to
> >> cancel the action and be able to quickly access any unpurged pages. The
> >> idea originated from Android's ashmem, but I've since learned that other
> >> OSes provide similar functionality.
> > 
> > Maybe I have missed something (I've only glanced through the patches)
> > but it seems that marking a range volatile doesn't alter neither
> > reference bits nor position in the LRU. I thought that a volatile page
> > would be moved to the end of inactive LRU with the reference bit
> > dropped. Or is this expectation wrong and volatility is not supposed to
> > touch page aging?
> 
> I'm not really convinced it should alter the aging.  Things could
> potentially go in and out of volatile state frequently, and requiring
> aging means we've got to go after them page-by-page or pte-by-pte at
> best.  That doesn't seem like something we want to do in a path we want
> to be fast.

Since the vrange syscall design was changed from range-based to pte-based,
it is no longer necessarily fast. Sure, vrange(VOLATILE) could be fast if it
just sets VMA_VOLATILE in vma->vm_flags, but vrange(NOVOLATILE) has to
look at every page in the range, so it could be slow.
Even though the vrange(VOLATILE) call is fast now, I want to account volatile
pages and expose the count to the user via vmstat so that the user can see
the current state of system memory, which makes userspace happier
and more predictable. If we add such a stat, vrange(VOLATILE) would have to
look at every page in the range as well, so it could be slow, too.
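
To make the cost difference concrete, here is a toy userspace model in
Python. It is not kernel code and not taken from the patches: ToyVma,
mark_volatile_flag_only, mark_volatile_with_accounting and the
"nr_volatile" counter are all names made up for this sketch.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size for the toy model


@dataclass
class ToyVma:
    """Hypothetical stand-in for a VMA: an address range plus a volatile flag."""
    start: int
    end: int
    volatile: bool = False
    # Page indexes (relative to start) that are currently resident.
    resident: set = field(default_factory=set)


def mark_volatile_flag_only(vma):
    """Fast path: flip one flag. Returns the number of pages inspected (0)."""
    vma.volatile = True
    return 0


def mark_volatile_with_accounting(vma, stats):
    """Slow path: flip the flag and count resident pages into a vmstat-like
    counter. Returns the number of pages inspected."""
    vma.volatile = True
    inspected = 0
    for index in range((vma.end - vma.start) // PAGE_SIZE):
        inspected += 1
        if index in vma.resident:
            stats["nr_volatile"] = stats.get("nr_volatile", 0) + 1
    return inspected
```

In this model the flag-only variant does constant work per VMA, while the
accounting variant does work proportional to the number of pages in the
range, which is the trade-off described above.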

> 
> Why not just let normal page aging deal with them?  It seems to me like
> trying to infer intended lru position from volatility is the wrong
> thing.  It's quite possible we'd have two pages in the same range that
> we want in completely different parts of the LRU.  Maybe the structure
> has a hot page and a cold one, and we would ideally want the cold one
> swapped out and not the hot one.

Yes, this is really arguable and it depends on the user's use case.
That's why I'd like to add VRANGE_NORMAL_AGING, which simply doesn't move
the page from its current position in the LRU. It would be useful when used
with VRANGE_SIGBUS because such users can cope with a range that is only
partially purged.

Otherwise, I'd like to move those pages to the inactive list's tail so that
they are reclaimed before the hot pages.
If there is no memory pressure, we get a chance to reuse the volatile
pages, so they can rotate back to the head of the LRU when the VM reclaim
logic is triggered.
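
A toy model of the two policies, again in Python and again only a sketch:
the deque stands in for the inactive list, and mark_volatile, reclaim and
the normal_aging argument are hypothetical names, with normal_aging playing
the role of the proposed VRANGE_NORMAL_AGING. Real reclaim is far more
involved than this.

```python
from collections import deque


def mark_volatile(inactive, pages, normal_aging=False):
    """Mark pages volatile on a toy inactive list.

    inactive: deque where the left end is the head and the right end is
    the tail (the tail is scanned first by reclaim).
    """
    if normal_aging:
        return  # leave every page at its current LRU position
    for page in pages:
        inactive.remove(page)
        inactive.append(page)  # move to the tail


def reclaim(inactive, count, referenced=frozenset()):
    """Scan up to `count` pages from the tail. Pages that were referenced
    since being marked rotate back to the head; the rest are freed."""
    freed = []
    for _ in range(min(count, len(inactive))):
        page = inactive.pop()
        if page in referenced:
            inactive.appendleft(page)
        else:
            freed.append(page)
    return freed
```

With the default policy the volatile pages are freed before the hot ones.
With normal_aging=True reclaim takes whatever happens to sit at the tail,
which may be a hot page.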

I agree with John's opinion that we should keep the approach as simple as
possible and extend it later, so we should leave room in the syscall
semantics and agree on what the default should be for now.

Thanks.
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim

Thread overview: 22+ messages
2014-03-14 18:33 [PATCH 0/3] Volatile Ranges (v11) John Stultz
2014-03-14 18:33 ` [PATCH 1/3] vrange: Add vrange syscall and handle splitting/merging and marking vmas John Stultz
2014-03-17  9:21   ` Jan Kara
2014-03-17  9:43     ` Jan Kara
2014-03-18  0:36       ` John Stultz
2014-03-17 22:19     ` John Stultz
2014-03-14 18:33 ` [PATCH 2/3] vrange: Add purged page detection on setting memory non-volatile John Stultz
2014-03-17  9:39   ` Jan Kara
2014-03-17 22:22     ` John Stultz
2014-03-14 18:33 ` [PATCH 3/3] vrange: Add page purging logic & SIGBUS trap John Stultz
2014-03-18 12:24 ` [PATCH 0/3] Volatile Ranges (v11) Michal Hocko
2014-03-18 17:53   ` John Stultz
2014-03-20  0:38   ` Dave Hansen
2014-03-20  0:57     ` John Stultz
2014-03-20  7:45     ` Minchan Kim [this message]
2014-03-18 15:11 ` Minchan Kim
2014-03-18 18:07   ` John Stultz
2014-03-19  0:49     ` Minchan Kim
2014-03-19 10:12       ` Jan Kara
2014-03-20  1:09         ` Minchan Kim
2014-03-20  8:13           ` Jan Kara
2014-03-21  5:29             ` Minchan Kim
