linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Hugh Dickins <hughd@google.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH 0/2] RFC: READ/WRITE_ONCE vma/mm cleanups
Date: Fri, 1 Mar 2019 11:54:52 -0500	[thread overview]
Message-ID: <20190301165452.GP14294@redhat.com> (raw)
In-Reply-To: <3e8b2ff0-d188-5259-b488-e31355e1e8ad@suse.cz>

Hello Kirill and Vlastimil,

On Fri, Mar 01, 2019 at 02:04:38PM +0100, Vlastimil Babka wrote:
> On 3/1/19 10:37 AM, Kirill A. Shutemov wrote:
> > On Thu, Feb 28, 2019 at 10:55:48PM -0500, Andrea Arcangeli wrote:
> >> Hello,
> >>
> >> This was a well known issue for more than a decade, but until a few
> >> months ago we relied on the compiler to stick to atomic accesses and
> >> updates while walking and updating pagetables.
> >>
> >> However now the 64bit native_set_pte finally uses WRITE_ONCE and
> >> gup_pmd_range uses READ_ONCE as well.
> >>
> >> This convert more racy VM places to avoid depending on the expected
> >> compiler behavior to achieve kernel runtime correctness.
> >>
> >> It mostly guarantees gcc to do atomic updates at 64bit granularity
> >> (practically not needed) and it also prevents gcc to emit code that
> >> risks getting confused if the memory unexpectedly changes under it
> >> (unlikely to ever be needed).
> >>
> >> The list of vm_start/end/pgoff to update isn't complete, I covered the
> >> most obvious places, but before wasting too much time at doing a full
> >> audit I thought it was safer to post it and get some comment. More
> >> updates can be posted incrementally anyway.
> > 
> > The intention is described well to my eyes.
> > 
> > Do I understand correctly, that it's attempt to get away with modifying
> > vma's fields under down_read(mmap_sem)?

The issue is that we already get away with it, but we do it without
READ/WRITE_ONCE. The patch should changes nothing, it should only
reduce the dependency on the compiler to do what we expect.

> If that's the intention, then IMHO it's not that well described. It
> talks about "racy VM places" but e.g. the __mm_populate() changes are
> for code protected by down_read(). So what's going on here?

expand_stack can move anonymous vma vm_end up or vm_start/pgoff down,
while we hold the mmap_sem for writing. See the location of the three
WRITE_ONCE in the patch.

So whenever we deal with a vma that we don't know if it's filebacked
(filebacked vmas cannot growsup/down) and that we don't know if it has
VM_GROWSDOWN/UP set, we shall use READ_ONCE to access
vm_start/end/pgoff. This is the only thing the patch is about, and it
should make no runtime difference at all, but then the WRITE_ONCE in
native_set_pte also should make no runtime difference just like the
READ_ONCE in gup_pmd_range should make no runtime difference. I mean
we don't trust the compiler with gup_fast but then we trust it with
expand_stack vs find_vma.


  reply	other threads:[~2019-03-01 16:54 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-01  3:55 [PATCH 0/2] RFC: READ/WRITE_ONCE vma/mm cleanups Andrea Arcangeli
2019-03-01  3:55 ` [PATCH 1/2] coredump: use READ_ONCE to read mm->flags Andrea Arcangeli
2019-03-01  3:55 ` [PATCH 2/2] mm: use READ/WRITE_ONCE to access anonymous vmas vm_start/vm_end/vm_pgoff Andrea Arcangeli
2019-03-01  9:37 ` [PATCH 0/2] RFC: READ/WRITE_ONCE vma/mm cleanups Kirill A. Shutemov
2019-03-01 13:04   ` Vlastimil Babka
2019-03-01 16:54     ` Andrea Arcangeli [this message]
2019-03-01 18:49       ` Davidlohr Bueso
2019-03-04 10:12       ` Kirill A. Shutemov
2019-03-05 13:00         ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190301165452.GP14294@redhat.com \
    --to=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).