public inbox for linux-arm-kernel@lists.infradead.org
From: Matthew Wilcox <willy@infradead.org>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, michel@lespinasse.org,
	jglisse@google.com, mhocko@suse.com, vbabka@suse.cz,
	hannes@cmpxchg.org, mgorman@techsingularity.net,
	dave@stgolabs.net, liam.howlett@oracle.com, peterz@infradead.org,
	ldufour@linux.ibm.com, paulmck@kernel.org, mingo@redhat.com,
	will@kernel.org, luto@kernel.org, songliubraving@fb.com,
	peterx@redhat.com, david@redhat.com, dhowells@redhat.com,
	hughd@google.com, bigeasy@linutronix.de,
	kent.overstreet@linux.dev, punit.agrawal@bytedance.com,
	lstoakes@gmail.com, peterjung1337@gmail.com, rientjes@google.com,
	chriscli@google.com, axelrasmussen@google.com, joelaf@google.com,
	minchan@google.com, rppt@kernel.org, jannh@google.com,
	shakeelb@google.com, tatashin@google.com, edumazet@google.com,
	gthelen@google.com, gurua@google.com, arjunroy@google.com,
	soheil@google.com, leewalsh@google.com, posk@google.com,
	michalechner92@googlemail.com, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org,
	linuxppc-dev@lists.ozlabs.org, x86@kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH v3 26/35] mm: fall back to mmap_lock if vma->anon_vma is not yet set
Date: Mon, 3 Apr 2023 20:49:22 +0100
Message-ID: <ZCstwjKhn76/mxIa@casper.infradead.org>
In-Reply-To: <CAJuCfpG_ZWJs3mZkL0z7m-bBe1SmeoTZydfFocZaRbHob_89Hg@mail.gmail.com>

On Fri, Feb 17, 2023 at 08:10:35AM -0800, Suren Baghdasaryan wrote:
> On Fri, Feb 17, 2023 at 8:05 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Thu, Feb 16, 2023 at 06:14:59PM -0800, Suren Baghdasaryan wrote:
> > > On Thu, Feb 16, 2023 at 11:43 AM Suren Baghdasaryan <surenb@google.com> wrote:
> > > >
> > > > On Thu, Feb 16, 2023 at 7:44 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > > >
> > > > > On Wed, Feb 15, 2023 at 09:17:41PM -0800, Suren Baghdasaryan wrote:
> > > > > > > When vma->anon_vma is not set, the page fault handler will set it by
> > > > > > > either reusing the anon_vma of an adjacent VMA, if the VMAs are
> > > > > > > compatible, or by allocating a new one. find_mergeable_anon_vma() walks
> > > > > > > the VMA tree to find a compatible adjacent VMA, and that requires not
> > > > > > > only the faulting VMA to be stable but also the tree structure and the
> > > > > > > other VMAs inside that tree. Therefore locking just the faulting VMA is
> > > > > > > not enough for this search. Fall back to taking mmap_lock when
> > > > > > > vma->anon_vma is not set. This situation happens only on the first page
> > > > > > > fault and should not affect overall performance.
> > > > >
> > > > > > I think I asked this before, but don't remember getting an answer.
> > > > > Why do we defer setting anon_vma to the first fault?  Why don't we
> > > > > set it up at mmap time?
> > > >
> > > > > Yeah, I remember that conversation, Matthew, and I could not find a
> > > > > definitive answer at the time. I'll look into that again or maybe
> > > > > someone can answer it here.
> > >
> > > After looking into it again I'm still under the impression that
> > > vma->anon_vma is populated lazily (during the first page fault rather
> > > than at mmap time) to avoid doing extra work for areas which are never
> > > faulted. Though I might be missing some important detail here.
> >
> > How often does userspace call mmap() and then _never_ fault on it?
> > I appreciate that userspace might mmap() gigabytes of address space and
> > then only end up using a small amount of it, so populating it lazily
> > makes sense.  But creating a region and never faulting on it?  The only
> > use-case I can think of is loading shared libraries:
> >
> > openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
> > (...)
> > mmap(NULL, 1970000, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f0ce612e000
> > mmap(0x7f0ce6154000, 1396736, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x26000) = 0x7f0ce6154000
> > mmap(0x7f0ce62a9000, 339968, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x17b000) = 0x7f0ce62a9000
> > mmap(0x7f0ce62fc000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ce000) = 0x7f0ce62fc000
> > mmap(0x7f0ce6302000, 53072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f0ce6302000
> >
> > but that's a file-backed VMA, not an anon VMA.
> 
> Might the case of dup_mmap() while forking be the reason why a VMA in
> the child process might never be used while the parent uses it (or vice
> versa)? Again, I'm not sure this is the reason but I can find no other
> good explanation.

I found an explanation!  Well, a partial one.  If we MAP_PRIVATE a file
mapping (like, er, those ones up there) and only take read faults on it,
we can postpone allocation of the anon_vma indefinitely.  But once we
take a write fault in that VMA, we need to allocate an anon_vma for it
so that we can track the anonymous pages that have been allocated to
satisfy the copy-on-write (see do_cow_fault()).
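
The read-then-write sequence can be sketched from userspace.  This is
only an illustration under stated assumptions:
private_write_stays_private() is a made-up helper, the temporary file
name is arbitrary, and the comments about anon_vma describe what the
kernel is expected to do, which the program cannot observe directly.
What it can check is the visible effect of copy-on-write: the store
lands in a private page and the file keeps its original contents.

```c
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if a store through a MAP_PRIVATE file mapping stayed
 * private (the file still holds the original byte), 0 if it did not,
 * and -1 if setting up the scenario failed.
 */
int private_write_stays_private(void)
{
	char path[] = "/tmp/cow-demo-XXXXXX";	/* arbitrary scratch file */
	int fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);				/* file lives on via fd only */

	if (write(fd, "A", 1) != 1) {
		close(fd);
		return -1;
	}

	char *p = mmap(NULL, 1, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* Read fault: served from the page cache; no anonymous page needed. */
	char seen = p[0];

	/* Write fault: copy-on-write; an anon_vma is expected to be needed now. */
	p[0] = 'B';

	/* Re-read the file itself to confirm the store did not reach it. */
	char in_file = 0;
	ssize_t n = pread(fd, &in_file, 1, 0);

	int ok = (n == 1 && seen == 'A' && p[0] == 'B' && in_file == 'A');

	munmap(p, 1);
	close(fd);
	return ok;
}
```

A mapping that is only ever read would stop after the first access and,
per the reasoning above, never need an anon_vma at all.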

However, I think in that case, we could probably skip the
find_mergeable_anon_vma() step.  We don't today; we check whether
a->vm_file == b->vm_file in anon_vma_compatible(), but I wonder if that
triggers often.
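
For reference, the compatibility test can be modelled outside the
kernel.  The sketch below is an approximation built on assumptions, not
the kernel's code: struct vma_model, model_anon_vma_compatible(),
MODEL_PAGE_SHIFT and MODEL_IGNORED_FLAGS are invented for illustration,
and the NUMA policy comparison is left out.  Only the vm_file check is
taken from the discussion above; the adjacency, flag and offset checks
are written from memory of mm/mmap.c and should be verified against the
tree.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT	12	/* assumption: 4 KiB pages */
#define MODEL_IGNORED_FLAGS	0x7ULL	/* assumption: read/write/exec bits are ignored */

/* Hypothetical stand-in for the few vm_area_struct fields used here. */
struct vma_model {
	uint64_t vm_start;	/* first address of the mapping */
	uint64_t vm_end;	/* first address past the mapping */
	uint64_t vm_pgoff;	/* offset into the file, in pages */
	uint64_t vm_flags;	/* model flag bits */
	const void *vm_file;	/* backing file, NULL if anonymous */
};

/* Model of "could b share a's anon_vma?", where a sits just below b. */
bool model_anon_vma_compatible(const struct vma_model *a,
			       const struct vma_model *b)
{
	return a->vm_end == b->vm_start &&		/* adjacent */
	       a->vm_file == b->vm_file &&		/* same backing file */
	       !((a->vm_flags ^ b->vm_flags) & ~MODEL_IGNORED_FLAGS) &&
	       b->vm_pgoff == a->vm_pgoff +		/* contiguous offsets */
			((b->vm_start - a->vm_start) >> MODEL_PAGE_SHIFT);
}
```

Plugging in the second and third libc mappings from the strace output
quoted earlier (file offsets 0x26000 and 0x17b000, i.e. page offsets
0x26 and 0x17b) makes the model return true: they are adjacent, map the
same file at contiguous offsets, and differ only in access bits.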

