From: Dave Chinner <david@fromorbit.com>
To: John Stultz <john.stultz@linaro.org>
Cc: linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Android Kernel Team <kernel-team@android.com>,
	Robert Love <rlove@google.com>, Mel Gorman <mel@csn.ul.ie>,
	Hugh Dickins <hughd@google.com>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	Rik van Riel <riel@redhat.com>
Subject: Re: [PATCH 2/2] [RFC] fadvise: Add _VOLATILE,_ISVOLATILE, and _NONVOLATILE flags
Date: Wed, 15 Feb 2012 10:51:06 +1100
Message-ID: <20120214235106.GL7479@dastard>
In-Reply-To: <1329198932.2753.62.camel@work-vm>

On Mon, Feb 13, 2012 at 09:55:32PM -0800, John Stultz wrote:
> On Tue, 2012-02-14 at 16:16 +1100, Dave Chinner wrote:
> > On Thu, Feb 09, 2012 at 04:16:33PM -0800, John Stultz wrote:
> > > This patch provides new fadvise flags that can be used to mark
> > > file pages as volatile, which will allow it to be discarded if the
> > > kernel wants to reclaim memory.
> > > 
> > > This is useful for userspace to allocate things like caches, and lets
> > > the kernel destructively (but safely) reclaim them when there's memory
> > > pressure.
> > .....
> > > @@ -655,6 +656,8 @@ struct address_space {
> > >  	spinlock_t		private_lock;	/* for use by the address_space */
> > >  	struct list_head	private_list;	/* ditto */
> > >  	struct address_space	*assoc_mapping;	/* ditto */
> > > +	struct range_tree_node	*volatile_root;	/* volatile range list */
> > > +	struct mutex		vlist_mutex;	/* protect volatile_list */
> > >  } __attribute__((aligned(sizeof(long))));
> > 
> > So you're adding roughly 32 bytes to every cached inode in the
> > system? This will increase the memory footprint of the inode cache
> > by 2-5% (depending on the filesystem). Almost no-one will be using
> > this functionality on most inodes that are cached in the system, so
> > that seems like a pretty bad trade-off to me...
> 
> Yea. Bloating the address_space is a concern I'm aware of, but for the
> initial passes I left it to see where folks would rather I keep it.
> Pushing the mutex into a range_tree_root structure or something could
> cut this down, but I still suspect it won't be loved. Another idea would
> be to manage the mapping -> range tree separately via something like a
> hash.  Do you have any preferences or suggestions here?

Given that it is a single state bit per page (volatile/non-volatile),
you could just use a radix tree tag to keep the state. Changing
the state isn't a performance-critical operation, and tagging large
ranges isn't that expensive (e.g. we do that in the writeback code),
so I'm not sure the overhead of a separate tree is necessary here....
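
As a rough illustration of the tagging idea, here is a toy userspace
model in C. It is not kernel code and does not use the kernel's radix
tree API; the toy_* names, the two-level layout and the 4096-page limit
are all assumptions made up for this sketch. It keeps one "volatile"
bit per page index plus a summary bit per leaf, so a lookup can skip
whole leaves that hold no tagged pages, which is roughly what tag
propagation buys in a real radix tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LEAF_BITS   6
#define LEAF_SLOTS  (1u << LEAF_BITS)          /* 64 page indexes per leaf */
#define ROOT_SLOTS  64u
#define MAX_INDEX   (ROOT_SLOTS * LEAF_SLOTS)  /* toy tree covers 4096 pages */

/* Hypothetical two-level tag store: per-page bits plus a per-leaf summary. */
struct toy_tree {
	uint64_t summary;           /* bit n set: leaf n holds a tagged page */
	uint64_t leaf[ROOT_SLOTS];  /* bit m of leaf n: page n*64+m is volatile */
};

/* Set or clear the volatile tag on every page index in [start, end]. */
static void toy_tag_range(struct toy_tree *t, unsigned start, unsigned end,
			  bool set)
{
	assert(start <= end);
	for (unsigned i = start; i <= end && i < MAX_INDEX; i++) {
		unsigned n = i >> LEAF_BITS;
		uint64_t bit = UINT64_C(1) << (i & (LEAF_SLOTS - 1));

		if (set) {
			t->leaf[n] |= bit;
			t->summary |= UINT64_C(1) << n;
		} else {
			t->leaf[n] &= ~bit;
			/* Drop the summary bit once the leaf is empty. */
			if (t->leaf[n] == 0)
				t->summary &= ~(UINT64_C(1) << n);
		}
	}
}

/* Report whether a single page index carries the volatile tag. */
static bool toy_is_volatile(const struct toy_tree *t, unsigned index)
{
	if (index >= MAX_INDEX)
		return false;
	return t->leaf[index >> LEAF_BITS] &
	       (UINT64_C(1) << (index & (LEAF_SLOTS - 1)));
}

/* Return the first tagged index at or after 'from', or -1 if none. */
static long toy_next_volatile(const struct toy_tree *t, unsigned from)
{
	for (unsigned i = from; i < MAX_INDEX; i++) {
		unsigned n = i >> LEAF_BITS;

		if (!(t->summary & (UINT64_C(1) << n))) {
			/* No tagged page in this leaf: jump to its last slot. */
			i |= LEAF_SLOTS - 1;
			continue;
		}
		if (t->leaf[n] & (UINT64_C(1) << (i & (LEAF_SLOTS - 1))))
			return (long)i;
	}
	return -1;
}
```

Marking a range volatile costs one bit operation per page, and the
only per-mapping state is the tag storage itself. A zero-initialised
struct toy_tree is an empty tree, so a caller can tag a range with
toy_tag_range(), query one page with toy_is_volatile(), and walk the
tagged pages with repeated toy_next_volatile() calls.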

That doesn't help with the reclaim side of things, but I would have
thought that such functionality would be better integrated into the
VM page cache/LRU scanning code than implemented as a shrinker that
shrinks the page cache again, on top of what the VM has already done
before calling the shrinkers. I'm not sure what is best here,
though...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
