From: Andrew Morton <akpm@linux-foundation.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>,
LKML <linux-kernel@vger.kernel.org>,
Nick Piggin <npiggin@suse.de>,
Stewart Smith <stewart@flamingspork.com>,
linux-mm@kvack.org, linux-arch@vger.kernel.org
Subject: Re: [patch 1/2] mm: fincore()
Date: Fri, 15 Feb 2013 15:42:35 -0800
Message-ID: <20130215154235.0fb36f53.akpm@linux-foundation.org>
In-Reply-To: <20130215231304.GB23930@cmpxchg.org>

On Fri, 15 Feb 2013 18:13:04 -0500
Johannes Weiner <hannes@cmpxchg.org> wrote:

> On Fri, Feb 15, 2013 at 01:27:38PM -0800, Andrew Morton wrote:
> > On Fri, 15 Feb 2013 01:34:50 -0500
> > Johannes Weiner <hannes@cmpxchg.org> wrote:
> >
> > > + * The status is returned in a vector of bytes. The least significant
> > > + * bit of each byte is 1 if the referenced page is in memory, otherwise
> > > + * it is zero.
> >
> > Also, this is going to be dreadfully inefficient for some obvious cases.
> >
> > We could address that by returning the info in some more efficient
> > representation. That will be run-length encoded in some fashion.
> >
> > The obvious way would be to populate an array of
> >
> > struct page_status {
> > 	u32 present:1;
> > 	u32 count:31;
> > };
> >
> > or whatever.
>
> I'm having a hard time seeing how this could be extended to more
> status bits without stifling the optimization too much.

See other email: add a syscall arg which specifies the boolean status
which we're searching for.

> If we just
> add more status bits to one page_status, the likelihood of long runs
> where all bits are in agreement decreases. But as the optimization
> becomes less and less effective, we are stuck with an interface that
> is more PITA than just using mmap and mincore again.
>
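A small userspace sketch of the trade-off being discussed here: it counts how many runs a run-length encoding would need when every status bit has to agree, versus when only one requested bit is considered. The `PG_STAT_DIRTY` bit, the macro names, and `count_runs()` are hypothetical and only for illustration; the posted patch defines just the low "in memory" bit.

```c
#include <stddef.h>

/* Hypothetical status bits; only bit 0 (page in memory) is in the patch. */
#define PG_STAT_PRESENT 0x01
#define PG_STAT_DIRTY   0x02 /* assumed extra bit, for illustration only */

/*
 * Count runs of consecutive pages whose status, restricted to `mask`,
 * is identical. Each run would cost one run-length record.
 */
static size_t count_runs(const unsigned char *vec, size_t npages,
                         unsigned char mask)
{
    size_t runs = 0;

    for (size_t i = 0; i < npages; i++) {
        /* A new run starts at page 0 or wherever the masked status changes. */
        if (i == 0 || (vec[i] & mask) != (vec[i - 1] & mask))
            runs++;
    }
    return runs;
}
```

For eight resident pages where every other page is also dirty, a mask covering both bits gives eight runs, while a mask of `PG_STAT_PRESENT` alone gives one.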
> The user has to supply a worst-case-sized vector with one struct
> page_status per page in the range, but the per-page item will be
> bigger than with the byte vector because of the additional run length
> variable.

Yes, we'd need to tell the kernel how much storage is available for the
structures.

> However, one struct page_status per run leaves you with a worst case
> of one syscall per page in the range.
Yes.
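To make the storage limit and the worst case concrete, here is a minimal userspace sketch of a run-length encoder that stops when the caller's array is full and reports how far it got. It is not the kernel's implementation; `encode_runs()`, its parameters, and the resume convention are assumptions, and `unsigned int` stands in for the kernel's `u32`.

```c
#include <stddef.h>

/* Run record as proposed in the thread: one status bit plus a run length. */
struct page_status {
    unsigned int present:1;
    unsigned int count:31;
};

/*
 * Encode per-page residency bytes (bit 0 set = in memory) into at most
 * `cap` runs. The number of runs written is stored in *nruns. The return
 * value is the number of pages covered, so a caller with too little
 * storage can call again starting at that page.
 */
static size_t encode_runs(const unsigned char *vec, size_t npages,
                          struct page_status *out, size_t cap, size_t *nruns)
{
    const unsigned int max_count = (1u << 31) - 1; /* limit of the 31-bit field */
    size_t i = 0, n = 0;

    while (i < npages && n < cap) {
        unsigned int present = vec[i] & 1;
        unsigned int count = 0;

        /* Extend the run while the residency bit stays the same. */
        while (i < npages && (unsigned int)(vec[i] & 1) == present &&
               count < max_count) {
            count++;
            i++;
        }
        out[n].present = present;
        out[n].count = count;
        n++;
    }
    *nruns = n;
    return i;
}
```

With `cap` set to 1 and strictly alternating residency, each call covers a single page, which is the one-call-per-page worst case mentioned above.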
> I dunno. The byte vector might not be optimal but its worst cases
> seem more attractive, is just as extensible, and dead simple to use.

But I think "which pages from this 4TB file are in core" will not be an
uncommon usage, and writing a gig of memory to find three pages is just
awful.

I wonder what the most common usage would be (one should know this
before merging the syscall :)). I guess "is this relatively-small
range of the file in core" and/or "which pages from this
relatively-small range of the file will I need to read", etc.

The syscall should handle the common usages very well. But it
shouldn't handle uncommon usages very badly!
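The "gig of memory" figure for the 4TB case can be checked with a few lines of arithmetic. This sketch assumes 4 KiB pages and treats 4TB as 4 TiB; the helper names are hypothetical.

```c
#include <stdint.h>

/* Size in bytes of a one-byte-per-page status vector for a file. */
static uint64_t byte_vector_size(uint64_t file_bytes, uint64_t page_size)
{
    /* Round up so a partial trailing page still gets an entry. */
    return (file_bytes + page_size - 1) / page_size;
}

/*
 * Upper bound on run-length records when `resident` pages are in core:
 * each resident page can at most split the range into one more absent
 * run, giving 2 * resident + 1 runs (assuming no run exceeds the 31-bit
 * count field).
 */
static uint64_t max_runs(uint64_t resident)
{
    return 2 * resident + 1;
}
```

Under those assumptions `byte_vector_size(4ULL << 40, 4096)` is 2^30 bytes, i.e. 1 GiB, while three resident pages need at most `max_runs(3)` = 7 records of 4 bytes each, or 28 bytes.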
Thread overview: 17+ messages
[not found] <87a9rbh7b4.fsf@rustcorp.com.au>
[not found] ` <20130211162701.GB13218@cmpxchg.org>
[not found] ` <20130211141239.f4decf03.akpm@linux-foundation.org>
2013-02-15 6:34 ` [patch 1/2] mm: fincore() Johannes Weiner
2013-02-15 20:39 ` David Miller
2013-02-15 21:14 ` Andrew Morton
2013-02-15 22:28 ` Johannes Weiner
2013-02-15 22:34 ` Andrew Morton
2013-02-15 21:27 ` Andrew Morton
2013-02-15 23:13 ` Johannes Weiner
2013-02-15 23:42 ` Andrew Morton [this message]
2013-02-16 4:23 ` Rusty Russell
2013-02-17 22:51 ` Johannes Weiner
2013-02-17 22:54 ` Andrew Morton
2013-05-29 14:53 ` Andres Freund
2013-05-29 17:32 ` Johannes Weiner
2013-05-29 17:52 ` Andres Freund
2013-02-18 5:41 ` Rusty Russell
2013-02-19 10:25 ` Simon Jeons
2013-02-15 6:35 ` [patch 2/2] x86-64: hook up fincore() syscall Johannes Weiner