From: Peter Collingbourne <pcc@google.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: John Hubbard <jhubbard@nvidia.com>,
	Matthew Wilcox <willy@infradead.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	 Evgenii Stepanov <eugenis@google.com>,
	Jann Horn <jannh@google.com>,
	 Linux ARM <linux-arm-kernel@lists.infradead.org>,
	 Linux Memory Management List <linux-mm@kvack.org>,
	kernel test robot <lkp@intel.com>,
	 Linux API <linux-api@vger.kernel.org>,
	linux-doc@vger.kernel.org
Subject: Re: [PATCH v4] mm: introduce reference pages
Date: Fri, 16 Jul 2021 19:58:31 -0700
Message-ID: <CAMn1gO4P_VcLsFi4kSNNeqgtcWANdhyMGDb2TOj6VjC0mSgV1g@mail.gmail.com>
In-Reply-To: <20210628122455.sqo77q4jfxtiwt5b@box.shutemov.name>

On Mon, Jun 28, 2021 at 5:24 AM Kirill A. Shutemov <kirill@shutemov.name> wrote:
>
> On Sat, Jun 19, 2021 at 02:20:02AM -0700, Peter Collingbourne wrote:
> >   #include <stdio.h>
> >   #include <stdlib.h>
> >   #include <string.h>
> >   #include <sys/mman.h>
> >   #include <unistd.h>
> >
> >   constexpr unsigned char pattern_byte = 0xaa;
> >
> >   #define PAGE_SIZE 4096
> >
> >   _Alignas(PAGE_SIZE) static unsigned char pattern[PAGE_SIZE];
> >
> >   int main(int argc, char **argv) {
> >     if (argc < 3)
> >       return 1;
> >     bool use_refpage = argc > 3;
> >     size_t mmap_size = atoi(argv[1]);
> >     size_t touch_size = atoi(argv[2]);
> >
> >     int refpage_fd;
> >     if (use_refpage) {
> >       memset(pattern, pattern_byte, PAGE_SIZE);
> >       refpage_fd = syscall(448, pattern, 0);
> >     }
> >     for (unsigned i = 0; i != 1000; ++i) {
> >       char *p;
> >       if (use_refpage) {
> >         p = (char *)mmap(0, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE,
> >                          refpage_fd, 0);
> >       } else {
> >         p = (char *)mmap(0, mmap_size, PROT_READ | PROT_WRITE,
> >                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> >         memset(p, pattern_byte, mmap_size);
> >       }
> >       for (unsigned j = 0; j < touch_size; j += PAGE_SIZE)
> >         p[j] = 0;
> >       munmap(p, mmap_size);
> >     }
> >   }
>
> I don't like the interface. It is tied to PAGE_SIZE and this doesn't seem
> to be very future looking. How would it work with THPs?

The idea with this interface is that the FD would be passed to mmap,
and anything that uses mmap already needs to be tied to the page size
to some extent.

For THPs I would expect that the kernel would duplicate the contents
of the page as needed.

Another reason I chose a page-size-based interface was to allow
future optimizations that reuse the actual page passed to the
syscall. For example, if libc.so contained a page filled with the
required pattern and the allocator passed a pointer to that page,
then it could be shared between all of the processes on the system
that link against that libc.

But I suppose that such optimizations would not require passing in a
whole page like that. For pattern based optimizations we could use a
reference counted hash table or something, and for larger patterns we
could activate the optimization only if the size argument were equal
to the page size.
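
To make the reference-counted table idea a bit more concrete, here is
a minimal userspace sketch. Everything in it is hypothetical and not
taken from the patch: the names refpage_get/refpage_put, the 64-bit
pattern key, the fixed 4096-byte page and the bucket count are all
assumptions for illustration, and there is no locking.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define REF_PAGE_SIZE 4096u /* assumed page size, illustration only */
#define NBUCKETS 16u        /* power of two, kept small for the sketch */

/* One shared, pattern-filled page plus the number of users holding it. */
struct ref_entry {
    struct ref_entry *next;
    uint64_t pattern;
    unsigned refcount;
    unsigned char page[REF_PAGE_SIZE];
};

static struct ref_entry *buckets[NBUCKETS];

/* Multiplicative hash; the top 4 bits select one of the 16 buckets. */
static size_t bucket_of(uint64_t pattern)
{
    return (size_t)((pattern * 0x9E3779B97F4A7C15ull) >> 60);
}

/* Return the shared page for this pattern, creating it on first use. */
static unsigned char *refpage_get(uint64_t pattern)
{
    struct ref_entry **head = &buckets[bucket_of(pattern)];

    for (struct ref_entry *e = *head; e; e = e->next) {
        if (e->pattern == pattern) {
            e->refcount++;
            return e->page;
        }
    }

    struct ref_entry *e = malloc(sizeof *e);
    if (!e)
        return NULL;
    e->pattern = pattern;
    e->refcount = 1;
    /* Replicate the 8-byte pattern across the whole page. */
    for (size_t i = 0; i < REF_PAGE_SIZE; i += sizeof pattern)
        memcpy(e->page + i, &pattern, sizeof pattern);
    e->next = *head;
    *head = e;
    return e->page;
}

/* Drop one reference; free the page when the last user goes away. */
static void refpage_put(uint64_t pattern)
{
    struct ref_entry **pp = &buckets[bucket_of(pattern)];

    for (; *pp; pp = &(*pp)->next) {
        struct ref_entry *e = *pp;
        if (e->pattern != pattern)
            continue;
        if (--e->refcount == 0) {
            *pp = e->next;
            free(e);
        }
        return;
    }
}
```

Two callers asking for the same pattern get the same page back, and
the page is freed only after both have called refpage_put(). A kernel
version would need locking and a different allocator.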

> Maybe we should consider passing down a filling pattern to kernel and let
> kernel allocate appropriate page size on read page fault? The pattern has
> to be power of 2 and limited in length.

Okay, so this sounds like my idea for handling THPs except applied to
any size. This seems reasonable enough to me. However, in order to
optimize use cases where the page is only ever read, let's have the
kernel prepare the reference page instead of recreating it every time.
In v5 I've adopted Matthew's proposed prototype:

int refpage_create(const void *__user content, unsigned int size,
                unsigned long pattern, unsigned long flags);

Peter

