From: Matthew Wilcox <willy@infradead.org>
To: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>,
Michal Hocko <mhocko@kernel.org>,
akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 1/8] mm: mmap: unmap large mapping by section
Date: Thu, 22 Mar 2018 08:40:55 -0700
Message-ID: <20180322154055.GB28468@bombadil.infradead.org>
In-Reply-To: <18a727fd-f006-9fae-d9ca-74b9004f0a8b@linux.vnet.ibm.com>

On Thu, Mar 22, 2018 at 04:32:00PM +0100, Laurent Dufour wrote:
> On 21/03/2018 23:46, Matthew Wilcox wrote:
> > On Wed, Mar 21, 2018 at 02:45:44PM -0700, Yang Shi wrote:
> >> Marking vma as deleted sounds good. The problem for my current approach is
> >> the concurrent page fault may succeed if it access the not yet unmapped
> >> section. Marking deleted vma could tell page fault the vma is not valid
> >> anymore, then return SIGSEGV.
> >>
> >>> does not care; munmap will need to wait for the existing munmap operation
> >>
> >> Why mmap doesn't care? How about MAP_FIXED? It may fail unexpectedly, right?
> >
> > The other thing about MAP_FIXED that we'll need to handle is unmapping
> > conflicts atomically. Say a program has a 200GB mapping and then
> > mmap(MAP_FIXED) another 200GB region on top of it. So I think page faults
> > are also going to have to wait for deleted vmas (then retry the fault)
> > rather than immediately raising SIGSEGV.
>
> Regarding the page fault, why not relying on the PTE locking ?
>
> When munmap() will unset the PTE it will have to held the PTE lock, so this
> will serialize the access.
> If the page fault occurs before the mmap(MAP_FIXED), the page mapped will be
> removed when mmap(MAP_FIXED) would do the cleanup. Fair enough.

The page fault handler will walk the VMA tree to find the correct
VMA and then find that the VMA is marked as deleted. If it assumes
that the VMA has been deleted because of munmap(), then it can raise
SIGSEGV immediately. But if the VMA is marked as deleted because of
mmap(MAP_FIXED), it must wait until the new VMA is in place.

I think I was wrong to describe VMAs as being *deleted*. I think we
instead need the concept of a *locked* VMA that page faults will block on.
Conceptually, it's a per-VMA rwsem, but I'd use a completion instead of
an rwsem since the only reason to write-lock the VMA is because it is
being deleted.
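
To make the locked-VMA idea concrete, here is a rough userspace sketch
using pthreads. It is not kernel code and not part of the patch set.
Everything in it is an assumption made for illustration: struct vma and
struct mm are tiny stand-ins for the real structures, the "VMA tree" is
a single slot, and vma_lock(), vma_finish_unmap(), handle_fault() and
fault_thread() are hypothetical names. Only the completion helpers are
named after the kernel's complete_all() and wait_for_completion(), which
they imitate with a mutex and a condition variable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

void complete_all(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical, heavily reduced VMA and mm; not the kernel structures. */
struct vma {
	unsigned long start, end;	/* covers [start, end) */
	bool locked;			/* teardown in progress */
	struct completion unlocked;	/* fired once teardown has finished */
};

struct mm {
	pthread_mutex_t lock;		/* stands in for mmap_sem */
	struct vma *vma;		/* one slot instead of a VMA tree */
};

enum fault_result { FAULT_OK, FAULT_SIGSEGV };

void vma_init(struct vma *vma, unsigned long start, unsigned long end)
{
	vma->start = start;
	vma->end = end;
	vma->locked = false;
	pthread_mutex_init(&vma->unlocked.lock, NULL);
	pthread_cond_init(&vma->unlocked.cond, NULL);
	vma->unlocked.done = false;
}

/* Start of munmap() or mmap(MAP_FIXED): faults on this VMA now block. */
void vma_lock(struct mm *mm, struct vma *vma)
{
	pthread_mutex_lock(&mm->lock);
	vma->locked = true;
	pthread_mutex_unlock(&mm->lock);
}

/* End of teardown: install the replacement (NULL for a plain munmap()). */
void vma_finish_unmap(struct mm *mm, struct vma *old, struct vma *replacement)
{
	pthread_mutex_lock(&mm->lock);
	assert(old->locked && mm->vma == old);
	mm->vma = replacement;
	pthread_mutex_unlock(&mm->lock);
	complete_all(&old->unlocked);	/* wake every blocked fault */
}

/* Fault path: block on a locked VMA, then redo the lookup from scratch. */
enum fault_result handle_fault(struct mm *mm, unsigned long addr)
{
	for (;;) {
		pthread_mutex_lock(&mm->lock);
		struct vma *vma = mm->vma;
		bool hit = vma && addr >= vma->start && addr < vma->end;
		bool locked = hit && vma->locked;
		pthread_mutex_unlock(&mm->lock);
		if (!hit)
			return FAULT_SIGSEGV;	/* nothing is mapped here */
		if (!locked)
			return FAULT_OK;	/* a live VMA covers addr */
		/* A real version must pin the VMA so it cannot be freed here. */
		wait_for_completion(&vma->unlocked);
	}
}

/* pthread entry point so a fault can race with an unmap in a test. */
struct fault_req { struct mm *mm; unsigned long addr; enum fault_result result; };

void *fault_thread(void *arg)
{
	struct fault_req *req = arg;
	req->result = handle_fault(req->mm, req->addr);
	return NULL;
}
```

The point of the retry loop is that the fault handler never has to know
why the VMA was locked. After the completion fires it simply looks the
address up again: if munmap() ran, nothing is there and it reports
SIGSEGV; if mmap(MAP_FIXED) ran, it finds the new VMA and proceeds. The
sketch ignores VMA lifetime, so the waiter relies on the caller keeping
the old VMA allocated.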