From: Jerome Glisse <jglisse@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: John Hubbard <jhubbard@nvidia.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linux-MM <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Yu Zhao <yuzhao@google.com>, Andy Lutomirski <luto@kernel.org>,
Peter Xu <peterx@redhat.com>, Pavel Emelyanov <xemul@openvz.org>,
Mike Kravetz <mike.kravetz@oracle.com>,
Mike Rapoport <rppt@linux.vnet.ibm.com>,
Minchan Kim <minchan@kernel.org>, Will Deacon <will@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Hugh Dickins <hughd@google.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Matthew Wilcox <willy@infradead.org>,
Oleg Nesterov <oleg@redhat.com>, Jann Horn <jannh@google.com>,
Kees Cook <keescook@chromium.org>,
Leon Romanovsky <leonro@nvidia.com>,
Jason Gunthorpe <jgg@ziepe.ca>, Jan Kara <jack@suse.cz>,
Kirill Tkhai <ktkhai@virtuozzo.com>,
Nadav Amit <nadav.amit@gmail.com>, Jens Axboe <axboe@kernel.dk>
Subject: Re: [PATCH 0/1] mm: restore full accuracy in COW page reuse
Date: Tue, 12 Jan 2021 18:51:04 -0500
Message-ID: <20210112235104.GA490399@redhat.com>
In-Reply-To: <CAHk-=wje9r3fREBdZcOu=NihGczBtkqkhXRPDhY-ZkNVv=thiQ@mail.gmail.com>
On Mon, Jan 11, 2021 at 02:18:13PM -0800, Linus Torvalds wrote:
> On Mon, Jan 11, 2021 at 11:19 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > On Sun, Jan 10, 2021 at 11:27 PM John Hubbard <jhubbard@nvidia.com> wrote:
> > > IMHO, a lot of the bits in page _refcount are still being wasted (even
> > > after GUP_PIN_COUNTING_BIAS overloading), because it's unlikely that
> > > there are many callers of gup/pup per page.
> >
> > It may be unlikely under real loads.
> >
> > But we've actually had overflow issues on this because rather than
> > real loads you can do attack loads (ie "lots of processes, lots of
> > pipe file descriptors, lots of vmsplice() operations on the same
> > page".
> >
> > We had to literally add that conditional "try_get_page()" that
> > protects against overflow..
>
> Actually, what I think might be a better model is to actually
> strengthen the rules even more, and get rid of GUP_PIN_COUNTING_BIAS
> entirely.
>
> What we could do is just make a few clear rules explicit (most of
> which we already basically hold to). Starting from that basic
>
> (a) Anonymous pages are made writable (ie COW) only when they have a
> page_count() of 1
>
> That very simple rule then automatically results in the corollary
>
> (b) a writable page in a COW mapping always starts out reachable
> _only_ from the page tables
>
> and now we could have a couple of really simple new rules:
>
> (c) we never ever make a writable page in a COW mapping read-only
> _unless_ it has a page_count() of 1

This breaks mprotect(PROT_READ), and I do not think we want to do that.
It could break the security scheme of user-space applications that
expect mprotect() to make the CPU mapping read-only.
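
To make that user-space expectation concrete, here is a minimal sketch.
The function name write_faults_after_mprotect() and the SIGSEGV-handler
approach are only illustrative and are not taken from any patch in this
thread. It maps one private anonymous page, writes to it, calls
mprotect(PROT_READ), and reports whether a later store faults:

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;

/* SIGSEGV handler: jump back to the sigsetjmp() point in the caller. */
static void on_segv(int sig)
{
	(void)sig;
	siglongjmp(fault_env, 1);
}

/*
 * Returns 1 if a store to a private anonymous page faults after
 * mprotect(PROT_READ), 0 if the store succeeded, -1 on setup failure.
 */
int write_faults_after_mprotect(void)
{
	long psz = sysconf(_SC_PAGESIZE);
	struct sigaction sa, old;
	volatile char *p;
	void *map;
	int faulted;

	map = mmap(NULL, psz, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;
	p = map;

	p[0] = 'x';	/* fault the page in while it is still writable */

	if (mprotect(map, psz, PROT_READ) != 0) {
		munmap(map, psz);
		return -1;
	}

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_segv;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGSEGV, &sa, &old) != 0) {
		munmap(map, psz);
		return -1;
	}

	if (sigsetjmp(fault_env, 1) == 0) {
		p[0] = 'y';	/* expected to raise SIGSEGV */
		faulted = 0;
	} else {
		faulted = 1;	/* we got here through the handler */
	}

	sigaction(SIGSEGV, &old, NULL);
	munmap(map, psz);
	return faulted;
}
```

An application relying on mprotect() for protection expects this to
return 1 no matter what page_count() happens to be for the page.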

Maybe an alternative would be to copy the page on mprotect() when its
page_count is not 1? But that makes me uneasy about how it treats
short-lived GUP (direct I/O racing with an mprotect(), or maybe even just
page migration) versus unbounded GUP (like RDMA).

Also, I want to make sure I properly understand what happens on fork()
of a COW mapping for a page that has a page_count > 1. Do we copy the
page instead of write-protecting it?
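
For reference, the baseline behaviour that user space can observe today
for an ordinary page (no GUP reference involved) can be sketched as
below. The function name is made up for illustration. It only shows the
usual fork() COW semantics and says nothing about the page_count > 1
case asked about above:

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the parent still sees its own data after the child wrote
 * to the same private anonymous page, 0 if the child's write leaked into
 * the parent, -1 on setup failure.
 */
int parent_unaffected_by_child_write(void)
{
	long psz = sysconf(_SC_PAGESIZE);
	int status, ok;
	pid_t pid;
	char *p;

	p = mmap(NULL, psz, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	p[0] = 'P';	/* page is now present and writable in the parent */

	pid = fork();	/* ordinary case: page is shared write-protected */
	if (pid < 0) {
		munmap(p, psz);
		return -1;
	}
	if (pid == 0) {
		p[0] = 'C';	/* write fault: the child must not hit the parent's data */
		_exit(p[0] == 'C' ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) < 0 ||
	    !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
		munmap(p, psz);
		return -1;
	}

	ok = (p[0] == 'P');	/* the parent's view must be unchanged */
	munmap(p, psz);
	return ok;
}
```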

I believe it would be better here to write-protect the page on the CPU
but forbid the child from reusing it: if the child ever inherits the
page (because the parent unmapped it, for instance), it will have to
make a copy, and the GUP reference (taken before the fork) might linger
on a page that is no longer associated with any address space. This way
we keep fork fast.
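
As a rough illustration of that idea, here is a toy user-space model. It
is not kernel code: struct toy_page, the no_child_reuse flag and all the
function names are hypothetical, and how such a marker would really be
stored is left open. It only models the child's write fault, since that
is the only case described above:

```c
#include <stdbool.h>

enum cow_action { COW_REUSE, COW_COPY };

/* Hypothetical stand-in for struct page; not the kernel's layout. */
struct toy_page {
	int refcount;		/* mappings plus GUP references */
	bool no_child_reuse;	/* set at fork() if extra references existed */
};

/* fork(): only write-protect and share the page, never copy eagerly. */
void toy_fork(struct toy_page *pg)
{
	if (pg->refcount > 1)
		pg->no_child_reuse = true;	/* a GUP reference predates the fork */
	pg->refcount++;				/* the child's new mapping */
}

/* The parent drops its mapping; the child is the only mapper left. */
void toy_parent_unmap(struct toy_page *pg)
{
	pg->refcount--;
}

/* Write fault in the child on the write-protected page. */
enum cow_action toy_child_write_fault(const struct toy_page *pg)
{
	if (pg->no_child_reuse)
		return COW_COPY;	/* leave the page to the GUP user */
	return pg->refcount == 1 ? COW_REUSE : COW_COPY;
}
```

In this model a page that had a GUP reference at fork() time is always
copied by the child, even after the parent has unmapped it, while an
unreferenced page can still be reused once the child is its sole owner.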
Jérôme