From: NeilBrown <neilb@suse.de>
To: Dave Chinner <david@fromorbit.com>
Cc: John Stultz <john.stultz@linaro.org>,
linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Android Kernel Team <kernel-team@android.com>,
Robert Love <rlove@google.com>, Mel Gorman <mel@csn.ul.ie>,
Hugh Dickins <hughd@google.com>,
Dave Hansen <dave@linux.vnet.ibm.com>,
Rik van Riel <riel@redhat.com>
Subject: Re: [PATCH 2/2] [RFC] fadvise: Add _VOLATILE,_ISVOLATILE, and _NONVOLATILE flags
Date: Fri, 17 Feb 2012 16:27:14 +1100
Message-ID: <20120217162714.09250710@notabene.brown>
In-Reply-To: <20120217044557.GI14132@dastard>
On Fri, 17 Feb 2012 15:45:57 +1100 Dave Chinner <david@fromorbit.com> wrote:
> On Wed, Feb 15, 2012 at 12:37:50PM +1100, NeilBrown wrote:
> > On Tue, 14 Feb 2012 16:29:10 -0800 John Stultz <john.stultz@linaro.org> wrote:
> >
> > > But I'm open to other ideas and arguments.
> >
> > I didn't notice the original patch, but found it at
> > https://lwn.net/Articles/468837/
> > and had a look.
> >
> > My first comment is -ENODOC. A bit of background always helps, so let me try to
> > construct that:
> >
> > The goal is to allow applications to interact with the kernel's cache
> > management infrastructure. In particular an application can say "this
> > memory contains data that might be useful in the future, but can be
> > reconstructed if necessary, and it is cheaper to reconstruct it than to read
> > it back from disk, so don't bother writing it out".
> >
> > The proposed mechanism - at a high level - is for user-space to be able to
> > say "This memory is volatile" and then later "this memory is no longer
> > volatile". If the content of the memory is still available the second
> > request succeeds. If not, it fails. Well, actually it succeeds but reports
> > that some content has been lost (not sure what happens then - can the app do
> > a binary search to find which pages it still has, or something?).
> >
> > (technically we should probably include the cost to reconstruct the page,
> > which the kernel measures as 'seeks' but maybe that isn't necessary).
> >
> > This is implemented by using files in a 'tmpfs' filesystem. These files
> > support three new flags to fadvise:
> >
> > POSIX_FADV_VOLATILE - this marks a range of pages as 'volatile'. They may be
> > removed from the page cache as needed, even if they are not 'clean'.
> > POSIX_FADV_NONVOLATILE - this marks a range of pages as non-volatile.
> > If any pages in the range were previously volatile but have since been
> > removed, then a status is returned reporting this.
> > POSIX_FADV_ISVOLATILE - this does not actually give any advice to the kernel
> > but rather asks a question: Are any of these pages volatile?
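To make the three flags described above concrete, here is a toy user-space model of their semantics. It is only a sketch: it does not call fadvise() and is not taken from the patch. Every name in it (struct vfile, model_volatile() and friends) is hypothetical, and the choice to treat a lost page as present-but-empty once it is marked non-volatile is an assumption of the model, not something the patch description states.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; all names are hypothetical, not from the patch. */
enum { MODEL_NPAGES = 8 };

struct vpage {
    bool present;      /* content is still in the (modelled) page cache */
    bool is_volatile;  /* marked by the VOLATILE advice */
};

struct vfile {
    struct vpage pages[MODEL_NPAGES];
};

static size_t clamp_end(size_t end)
{
    return end > MODEL_NPAGES ? MODEL_NPAGES : end;
}

/* Start with every page cached and non-volatile. */
void model_init(struct vfile *f)
{
    for (size_t i = 0; i < MODEL_NPAGES; i++) {
        f->pages[i].present = true;
        f->pages[i].is_volatile = false;
    }
}

/* Models POSIX_FADV_VOLATILE: pages in [start, end) may be dropped. */
void model_volatile(struct vfile *f, size_t start, size_t end)
{
    for (size_t i = start; i < clamp_end(end); i++)
        f->pages[i].is_volatile = true;
}

/* Models memory pressure: drop every volatile page, dirty or not.
 * Returns the number of pages dropped. */
size_t model_purge(struct vfile *f)
{
    size_t dropped = 0;

    for (size_t i = 0; i < MODEL_NPAGES; i++) {
        if (f->pages[i].is_volatile && f->pages[i].present) {
            f->pages[i].present = false;
            dropped++;
        }
    }
    return dropped;
}

/* Models POSIX_FADV_NONVOLATILE: clear the mark and report whether any
 * page in [start, end) was lost while it was volatile.
 * Assumption: a lost page becomes present (but empty) again afterwards. */
bool model_nonvolatile(struct vfile *f, size_t start, size_t end)
{
    bool lost = false;

    for (size_t i = start; i < clamp_end(end); i++) {
        if (!f->pages[i].present) {
            lost = true;
            f->pages[i].present = true;
        }
        f->pages[i].is_volatile = false;
    }
    return lost;
}

/* Models POSIX_FADV_ISVOLATILE: is any page in [start, end) volatile? */
bool model_isvolatile(const struct vfile *f, size_t start, size_t end)
{
    for (size_t i = start; i < clamp_end(end); i++)
        if (f->pages[i].is_volatile)
            return true;
    return false;
}
```

In this model the NONVOLATILE call only says that something in the range was lost, not which pages, which is why the question of how the application finds out what it still has comes up.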
>
> What about for files that aren't on tmpfs? the fadvise() interface
> is not tmpfs specific, and given that everyone is talking about
> volatility of page cache pages, I fail to see what is tmpfs specific
> about this proposal.
It seems I was looking at an earlier version of the patch, which only seemed
to affect tmpfs files. I see now that the latest version can affect all
filesystems.
>
> So what are the semantics that are supposed to apply to a file that
> is on a filesystem with stable storage that is cached in the page
> cache?
This is my question too. Does this make any sense at all for a
storage-backed filesystem?
If I understand the current code (which is by no means certain), then
there is nothing concrete that stops volatile pages from being written back
to storage. Whether they are or not would be the result of a race between
the 'volatile_shrinker' purging them, and the VM cleaning them.
Given that the volatile_shrinker sets 'seeks = DEFAULT_SEEKS * 4', I would
guess that the VM would get to the pages before the shrinker, but that is
mostly just a guess.
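A rough sketch of why a larger 'seeks' value should mean less pressure on the shrinker. It assumes the scan-target arithmetic that shrink_slab() in mm/vmscan.c used in kernels of this period, as best I recall it; the function name and parameter names below are made up for illustration, and only DEFAULT_SEEKS (2) is a real kernel constant.

```c
#include <stdint.h>

#define DEFAULT_SEEKS 2  /* kernel's default cost to recreate an object */

/* Hypothetical helper. Assumed to mirror the shrink_slab() arithmetic:
 * how many cache objects a shrinker is asked to scan, given how many
 * LRU pages the VM scanned in the same pass. */
uint64_t shrinker_scan_target(uint64_t nr_pages_scanned,
                              uint64_t lru_pages,
                              uint64_t cache_objects,
                              unsigned int seeks)
{
    uint64_t delta = (4 * nr_pages_scanned) / seeks;

    delta *= cache_objects;
    return delta / (lru_pages + 1);
}
```

With 1024 pages scanned, 4095 LRU pages and 4096 objects in the cache, the target is 2048 objects for seeks = DEFAULT_SEEKS but only 512 for seeks = DEFAULT_SEEKS * 4. Under that assumption the volatile pages see a quarter of the usual shrinker pressure, while ordinary writeback of dirty pages is not slowed down at all.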
Is this really what we want?
Certainly having this clarified in Documentation/volatile.txt would help a lot :-)
Thanks,
NeilBrown