All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Hugh Dickins <hughd@google.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH 0/2] RFC: READ/WRITE_ONCE vma/mm cleanups
Date: Fri, 1 Mar 2019 11:54:52 -0500	[thread overview]
Message-ID: <20190301165452.GP14294@redhat.com> (raw)
In-Reply-To: <3e8b2ff0-d188-5259-b488-e31355e1e8ad@suse.cz>

Hello Kirill and Vlastimil,

On Fri, Mar 01, 2019 at 02:04:38PM +0100, Vlastimil Babka wrote:
> On 3/1/19 10:37 AM, Kirill A. Shutemov wrote:
> > On Thu, Feb 28, 2019 at 10:55:48PM -0500, Andrea Arcangeli wrote:
> >> Hello,
> >>
> >> This was a well known issue for more than a decade, but until a few
> >> months ago we relied on the compiler to stick to atomic accesses and
> >> updates while walking and updating pagetables.
> >>
> >> However now the 64bit native_set_pte finally uses WRITE_ONCE and
> >> gup_pmd_range uses READ_ONCE as well.
> >>
> >> This convert more racy VM places to avoid depending on the expected
> >> compiler behavior to achieve kernel runtime correctness.
> >>
> >> It mostly guarantees gcc to do atomic updates at 64bit granularity
> >> (practically not needed) and it also prevents gcc to emit code that
> >> risks getting confused if the memory unexpectedly changes under it
> >> (unlikely to ever be needed).
> >>
> >> The list of vm_start/end/pgoff to update isn't complete, I covered the
> >> most obvious places, but before wasting too much time at doing a full
> >> audit I thought it was safer to post it and get some comment. More
> >> updates can be posted incrementally anyway.
> > 
> > The intention is described well to my eyes.
> > 
> > Do I understand correctly, that it's attempt to get away with modifying
> > vma's fields under down_read(mmap_sem)?

The issue is that we already get away with it, but we do it without
READ/WRITE_ONCE. The patch should changes nothing, it should only
reduce the dependency on the compiler to do what we expect.

> If that's the intention, then IMHO it's not that well described. It
> talks about "racy VM places" but e.g. the __mm_populate() changes are
> for code protected by down_read(). So what's going on here?

expand_stack can move anonymous vma vm_end up or vm_start/pgoff down,
while we hold the mmap_sem for writing. See the location of the three
WRITE_ONCE in the patch.

So whenever we deal with a vma that we don't know if it's filebacked
(filebacked vmas cannot growsup/down) and that we don't know if it has
VM_GROWSDOWN/UP set, we shall use READ_ONCE to access
vm_start/end/pgoff. This is the only thing the patch is about, and it
should make no runtime difference at all, but then the WRITE_ONCE in
native_set_pte also should make no runtime difference just like the
READ_ONCE in gup_pmd_range should make no runtime difference. I mean
we don't trust the compiler with gup_fast but then we trust it with
expand_stack vs find_vma.


  reply	other threads:[~2019-03-01 16:54 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-01  3:55 [PATCH 0/2] RFC: READ/WRITE_ONCE vma/mm cleanups Andrea Arcangeli
2019-03-01  3:55 ` [PATCH 1/2] coredump: use READ_ONCE to read mm->flags Andrea Arcangeli
2019-03-01  3:55 ` [PATCH 2/2] mm: use READ/WRITE_ONCE to access anonymous vmas vm_start/vm_end/vm_pgoff Andrea Arcangeli
2019-03-01  9:37 ` [PATCH 0/2] RFC: READ/WRITE_ONCE vma/mm cleanups Kirill A. Shutemov
2019-03-01 13:04   ` Vlastimil Babka
2019-03-01 16:54     ` Andrea Arcangeli [this message]
2019-03-01 18:49       ` Davidlohr Bueso
2019-03-04 10:12       ` Kirill A. Shutemov
2019-03-05 13:00         ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190301165452.GP14294@redhat.com \
    --to=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.