All of lore.kernel.org
 help / color / mirror / Atom feed
From: Rik van Riel <riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Linus Torvalds
	<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Cc: Linux Kernel Mailing List
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Mike Kravetz
	<mike.kravetz-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	linux-mm <linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org>,
	Florian Weimer <fweimer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	colm-ZXBCfW2eEe/k1uMJSBkQmQ@public.gmane.org,
	Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
	Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	"Kirill A. Shutemov"
	<kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org>,
	Dave Hansen <dave.hansen-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Matthew Wilcox <willy-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Subject: Re: [PATCH 2/2] mm,fork: introduce MADV_WIPEONFORK
Date: Fri, 11 Aug 2017 16:27:45 -0400	[thread overview]
Message-ID: <1502483265.6577.52.camel@redhat.com> (raw)
In-Reply-To: <CA+55aFzA+7CeCdUi-13DfOeE3FfhtTPMMmBA4UQx8FixXiD4YA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Fri, 2017-08-11 at 12:42 -0700, Linus Torvalds wrote:
> On Fri, Aug 11, 2017 at 12:19 PM,  <riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 0e517be91a89..f9b0ad7feb57 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1134,6 +1134,16 @@ int copy_page_range(struct mm_struct
> > *dst_mm, struct mm_struct *src_mm,
> >                         !vma->anon_vma)
> >                 return 0;
> > 
> > +       /*
> > +        * With VM_WIPEONFORK, the child inherits the VMA from the
> > +        * parent, but not its contents.
> > +        *
> > +        * A child accessing VM_WIPEONFORK memory will see all
> > zeroes;
> > +        * a child accessing VM_DONTCOPY memory receives a
> > segfault.
> > +        */
> > +       if (vma->vm_flags & VM_WIPEONFORK)
> > +               return 0;
> > +
> 
> Is this right?
> 
> Yes, you don't do the page table copies. Fine. But you leave vma with
> the the anon_vma pointer - doesn't that mean that it's still
> connected
> to the original anonvma chain, and we might end up swapping something
> in?

Swapping something in would require there to be a swap entry in
the page table entries, which we are not copying, so this should
not be a correctness issue.

> And even if that ends up not being an issue, I'd expect that you'd
> want to break the anon_vma chain just to not make it grow
> unnecessarily.

This is a good point. I can send a v4 that skips the anon_vma_fork()
call if VM_WIPEONFORK, and calls anon_vma_prepare(), instead.

> So my gut feel is that doing this in "copy_page_range()" is wrong,
> and
> the logic should be moved up to dup_mmap(), where we can also
> short-circuit the anon_vma chain entirely.
> 
> No?

There is another test in copy_page_range already which ends up
skipping the page table copy when it should not be done.

If you want, I can move that test into a should_copy_page_range()
function, and call that from dup_mmap(), skipping the call to
copy_page_range() if should_copy_page_range() returns false.

Having only one of the two sets of tests in dup_mmap(), and
the other in copy_page_range() seems wrong.

Just let me know what you prefer, and I'll put that in v4.

> The madvice() interface looks fine to me.

That was the main reason for adding you to the thread :)

kind regards,

Rik

WARNING: multiple messages have this Message-ID (diff)
From: Rik van Riel <riel@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@kernel.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	linux-mm <linux-mm@kvack.org>,
	Florian Weimer <fweimer@redhat.com>,
	colm@allcosts.net, Andrew Morton <akpm@linux-foundation.org>,
	Kees Cook <keescook@chromium.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Will Drewry <wad@chromium.org>, Ingo Molnar <mingo@kernel.org>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Dave Hansen <dave.hansen@intel.com>,
	Linux API <linux-api@vger.kernel.org>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH 2/2] mm,fork: introduce MADV_WIPEONFORK
Date: Fri, 11 Aug 2017 16:27:45 -0400	[thread overview]
Message-ID: <1502483265.6577.52.camel@redhat.com> (raw)
In-Reply-To: <CA+55aFzA+7CeCdUi-13DfOeE3FfhtTPMMmBA4UQx8FixXiD4YA@mail.gmail.com>

On Fri, 2017-08-11 at 12:42 -0700, Linus Torvalds wrote:
> On Fri, Aug 11, 2017 at 12:19 PM,A A <riel@redhat.com> wrote:
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 0e517be91a89..f9b0ad7feb57 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1134,6 +1134,16 @@ int copy_page_range(struct mm_struct
> > *dst_mm, struct mm_struct *src_mm,
> > A A A A A A A A A A A A A A A A A A A A A A A A !vma->anon_vma)
> > A A A A A A A A A A A A A A A A return 0;
> > 
> > +A A A A A A A /*
> > +A A A A A A A A * With VM_WIPEONFORK, the child inherits the VMA from the
> > +A A A A A A A A * parent, but not its contents.
> > +A A A A A A A A *
> > +A A A A A A A A * A child accessing VM_WIPEONFORK memory will see all
> > zeroes;
> > +A A A A A A A A * a child accessing VM_DONTCOPY memory receives a
> > segfault.
> > +A A A A A A A A */
> > +A A A A A A A if (vma->vm_flags & VM_WIPEONFORK)
> > +A A A A A A A A A A A A A A A return 0;
> > +
> 
> Is this right?
> 
> Yes, you don't do the page table copies. Fine. But you leave vma with
> the the anon_vma pointer - doesn't that mean that it's still
> connected
> to the original anonvma chain, and we might end up swapping something
> in?

Swapping something in would require there to be a swap entry in
the page table entries, which we are not copying, so this should
not be a correctness issue.

> And even if that ends up not being an issue, I'd expect that you'd
> want to break the anon_vma chain just to not make it grow
> unnecessarily.

This is a good point. I can send a v4 that skips the anon_vma_fork()
call if VM_WIPEONFORK, and calls anon_vma_prepare(), instead.

> So my gut feel is that doing this in "copy_page_range()" is wrong,
> and
> the logic should be moved up to dup_mmap(), where we can also
> short-circuit the anon_vma chain entirely.
> 
> No?

There is another test in copy_page_range already which ends up
skipping the page table copy when it should not be done.

If you want, I can move that test into a should_copy_page_range()
function, and call that from dup_mmap(), skipping the call to
copy_page_range() if should_copy_page_range() returns false.

Having only one of the two sets of tests in dup_mmap(), and
the other in copy_page_range() seems wrong.

Just let me know what you prefer, and I'll put that in v4.

> The madvice() interface looks fine to me.

That was the main reason for adding you to the thread :)

kind regards,

Rik

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Rik van Riel <riel@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@kernel.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	linux-mm <linux-mm@kvack.org>,
	Florian Weimer <fweimer@redhat.com>,
	colm@allcosts.net, Andrew Morton <akpm@linux-foundation.org>,
	Kees Cook <keescook@chromium.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Will Drewry <wad@chromium.org>, Ingo Molnar <mingo@kernel.org>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Dave Hansen <dave.hansen@intel.com>,
	Linux API <linux-api@vger.kernel.org>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH 2/2] mm,fork: introduce MADV_WIPEONFORK
Date: Fri, 11 Aug 2017 16:27:45 -0400	[thread overview]
Message-ID: <1502483265.6577.52.camel@redhat.com> (raw)
In-Reply-To: <CA+55aFzA+7CeCdUi-13DfOeE3FfhtTPMMmBA4UQx8FixXiD4YA@mail.gmail.com>

On Fri, 2017-08-11 at 12:42 -0700, Linus Torvalds wrote:
> On Fri, Aug 11, 2017 at 12:19 PM,  <riel@redhat.com> wrote:
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 0e517be91a89..f9b0ad7feb57 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1134,6 +1134,16 @@ int copy_page_range(struct mm_struct
> > *dst_mm, struct mm_struct *src_mm,
> >                         !vma->anon_vma)
> >                 return 0;
> > 
> > +       /*
> > +        * With VM_WIPEONFORK, the child inherits the VMA from the
> > +        * parent, but not its contents.
> > +        *
> > +        * A child accessing VM_WIPEONFORK memory will see all
> > zeroes;
> > +        * a child accessing VM_DONTCOPY memory receives a
> > segfault.
> > +        */
> > +       if (vma->vm_flags & VM_WIPEONFORK)
> > +               return 0;
> > +
> 
> Is this right?
> 
> Yes, you don't do the page table copies. Fine. But you leave vma with
> the the anon_vma pointer - doesn't that mean that it's still
> connected
> to the original anonvma chain, and we might end up swapping something
> in?

Swapping something in would require there to be a swap entry in
the page table entries, which we are not copying, so this should
not be a correctness issue.

> And even if that ends up not being an issue, I'd expect that you'd
> want to break the anon_vma chain just to not make it grow
> unnecessarily.

This is a good point. I can send a v4 that skips the anon_vma_fork()
call if VM_WIPEONFORK, and calls anon_vma_prepare(), instead.

> So my gut feel is that doing this in "copy_page_range()" is wrong,
> and
> the logic should be moved up to dup_mmap(), where we can also
> short-circuit the anon_vma chain entirely.
> 
> No?

There is another test in copy_page_range already which ends up
skipping the page table copy when it should not be done.

If you want, I can move that test into a should_copy_page_range()
function, and call that from dup_mmap(), skipping the call to
copy_page_range() if should_copy_page_range() returns false.

Having only one of the two sets of tests in dup_mmap(), and
the other in copy_page_range() seems wrong.

Just let me know what you prefer, and I'll put that in v4.

> The madvice() interface looks fine to me.

That was the main reason for adding you to the thread :)

kind regards,

Rik

  parent reply	other threads:[~2017-08-11 20:27 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-11 19:19 [PATCH v3 0/2] mm,fork,security: introduce MADV_WIPEONFORK riel-H+wXaHxf7aLQT0dZR+AlfA
2017-08-11 19:19 ` riel
2017-08-11 19:19 ` riel
     [not found] ` <20170811191942.17487-1-riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-08-11 19:19   ` [PATCH 1/2] x86,mpx: make mpx depend on x86-64 to free up VMA flag riel-H+wXaHxf7aLQT0dZR+AlfA
2017-08-11 19:19     ` riel
2017-08-11 19:19     ` riel
2017-08-11 19:19 ` [PATCH 2/2] mm,fork: introduce MADV_WIPEONFORK riel
2017-08-11 19:19   ` riel
2017-08-11 19:19   ` riel
     [not found]   ` <20170811191942.17487-3-riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-08-11 19:42     ` Linus Torvalds
2017-08-11 19:42       ` Linus Torvalds
2017-08-11 19:42       ` Linus Torvalds
     [not found]       ` <CA+55aFzA+7CeCdUi-13DfOeE3FfhtTPMMmBA4UQx8FixXiD4YA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-11 20:27         ` Rik van Riel [this message]
2017-08-11 20:27           ` Rik van Riel
2017-08-11 20:27           ` Rik van Riel
2017-08-11 20:50           ` Linus Torvalds
2017-08-11 20:50             ` Linus Torvalds
  -- strict thread matches above, loose matches on Subject: below --
2017-08-11 21:28 [PATCH v4 0/2] mm,fork,security: " riel-H+wXaHxf7aLQT0dZR+AlfA
2017-08-11 21:28 ` [PATCH 2/2] mm,fork: " riel
2017-08-11 21:28   ` riel
2017-08-15 22:51   ` Andrew Morton
2017-08-15 22:51     ` Andrew Morton
2017-08-16  2:18     ` Rik van Riel
2017-08-16  2:18       ` Rik van Riel
2017-08-16  2:18       ` Rik van Riel
2017-08-17 22:50       ` Andrew Morton
2017-08-17 22:50         ` Andrew Morton
2017-08-18 16:28         ` Rik van Riel
2017-08-18 16:28           ` Rik van Riel
2017-08-18 16:28           ` Rik van Riel
     [not found]           ` <1503073709.6577.68.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-08-18 18:15             ` Andrew Morton
2017-08-18 18:15               ` Andrew Morton
2017-08-18 18:15               ` Andrew Morton
     [not found]               ` <20170818111545.ab371cfedb71d13d76590030-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2017-08-19  0:02                 ` Rik van Riel
2017-08-19  0:02                   ` Rik van Riel
2017-08-19  0:02                   ` Rik van Riel
2017-08-18 17:25   ` Mike Kravetz
2017-08-18 17:25     ` Mike Kravetz
2017-08-06 14:04 [PATCH v2 0/2] mm,fork,security: " riel
2017-08-06 14:04 ` [PATCH 2/2] mm,fork: " riel
2017-08-06 14:04   ` riel
2017-08-10 15:23   ` Michal Hocko
2017-08-10 15:23     ` Michal Hocko
2017-08-11 15:23     ` Rik van Riel
2017-08-11 15:23       ` Rik van Riel
2017-08-11 16:36       ` Mike Kravetz
2017-08-11 16:36         ` Mike Kravetz
2017-08-11 16:59         ` Rik van Riel
2017-08-11 16:59           ` Rik van Riel
2017-08-11 17:07           ` Mike Kravetz
2017-08-11 17:07             ` Mike Kravetz
2017-08-04 19:07 [PATCH 0/2] mm,fork,security: " riel
2017-08-04 19:07 ` [PATCH 2/2] mm,fork: " riel
2017-08-04 19:07   ` riel
2017-08-04 23:09   ` Mike Kravetz
2017-08-04 23:09     ` Mike Kravetz
2017-08-05 14:05     ` Rik van Riel
2017-08-05 14:05       ` Rik van Riel
2017-08-14 15:45   ` kbuild test robot
2017-08-04 19:01 [PATCH 0/2] mm,fork: MADV_WIPEONFORK - an empty VMA in the child riel
2017-08-04 19:01 ` [PATCH 2/2] mm,fork: introduce MADV_WIPEONFORK riel
2017-08-05 18:46   ` kbuild test robot
2017-08-05 19:33   ` kbuild test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1502483265.6577.52.camel@redhat.com \
    --to=riel-h+wxahxf7alqt0dzr+alfa@public.gmane.org \
    --cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    --cc=colm-ZXBCfW2eEe/k1uMJSBkQmQ@public.gmane.org \
    --cc=dave.hansen-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    --cc=fweimer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org \
    --cc=kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org \
    --cc=linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org \
    --cc=luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org \
    --cc=mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=mike.kravetz-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
    --cc=mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    --cc=wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org \
    --cc=willy-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.