From: Michel Lespinasse <walken@google.com>
To: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Jan Kara <jack@suse.cz>,
sergey.senozhatsky.work@gmail.com,
Peter Zijlstra <peterz@infradead.org>,
Will Deacon <will.deacon@arm.com>,
Michal Hocko <mhocko@kernel.org>, linux-mm <linux-mm@kvack.org>,
Paul Mackerras <paulus@samba.org>,
Punit Agrawal <punitagrawal@gmail.com>,
"H. Peter Anvin" <hpa@zytor.com>,
Mike Rapoport <rppt@linux.ibm.com>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Andi Kleen <ak@linux.intel.com>, Minchan Kim <minchan@kernel.org>,
aneesh.kumar@linux.ibm.com, x86@kernel.org,
Matthew Wilcox <willy@infradead.org>,
Daniel Jordan <daniel.m.jordan@oracle.com>,
Ingo Molnar <mingo@redhat.com>,
David Rientjes <rientjes@google.com>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Haiyan Song <haiyanx.song@intel.com>,
Nick Piggin <npiggin@gmail.com>,
sj38.park@gmail.com, Jerome Glisse <jglisse@redhat.com>,
dave@stgolabs.net, kemi.wang@intel.com,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Thomas Gleixner <tglx@linutronix.de>,
zhong jiang <zhongjiang@huawei.com>,
Ganesh Mahendran <opensource.ganesh@gmail.com>,
Yang Shi <yang.shi@linux.alibaba.com>,
linuxppc-dev@lists.ozlabs.org,
LKML <linux-kernel@vger.kernel.org>,
Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
vinayak menon <vinayakm.list@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Tim Chen <tim.c.chen@linux.intel.com>,
haren@linux.vnet.ibm.com
Subject: Re: [PATCH v12 00/31] Speculative page faults
Date: Fri, 26 Apr 2019 18:53:15 -0700
Message-ID: <20190427015315.GA174296@google.com>
In-Reply-To: <05df6720-7130-62fe-a71f-074b6fafff3e@linux.ibm.com>
On Wed, Apr 24, 2019 at 09:33:44AM +0200, Laurent Dufour wrote:
> On 23/04/2019 at 11:38, Peter Zijlstra wrote:
> > On Mon, Apr 22, 2019 at 02:29:16PM -0700, Michel Lespinasse wrote:
> > > The proposed spf mechanism only handles anon vmas. Is there a
> > > fundamental reason why it couldn't handle mapped files too?
> > > My understanding is that the mechanism of verifying the vma after
> > > taking back the ptl at the end of the fault would work there too?
> > > The file has to stay referenced during the fault, but holding the vma's
> > > refcount could be made to cover that? The vm_file refcount would have
> > > to be released in __free_vma() instead of remove_vma(); I'm not quite sure
> > > if that has more implications than I realize?
> >
> > IIRC (and I really don't remember all that much) the trickiest bit was
> > vs unmount. Since files can stay open past the 'expected' duration,
> > umount could be delayed.
> >
> > But yes, I think I had a version that did all that just 'fine'. Like
> > mentioned, I didn't keep the refcount because it sucked just as hard as
> > the mmap_sem contention, but the SRCU callback did the fput() just fine
> > (esp. now that we have delayed_fput).
>
> I had to use a refcount for the VMA because I'm using RCU in place of SRCU
> and only protecting the RB tree using RCU.
>
> Regarding the file pointer, I decided to release it synchronously to avoid
> the latency of RCU during the file closing. As you mentioned, this could
> delay the umount, but not only that, as Linus Torvalds demonstrated in the
> past [1]. Anyway, since file support is not there yet, there is no need for
> that currently.
>
> [1] https://lore.kernel.org/linux-mm/alpine.LFD.2.00.1001041904250.3630@localhost.localdomain/

Just to make sure I understand this correctly: if a program tries to
munmap a region while page faults are occurring (which means that the
program has a race condition in the first place), before spf the
mmap_sem would delay the munmap until the page fault completes. With
spf the munmap will happen immediately, while vm_ops->fault()
is running, with spf holding a ref to the file. vm_ops->fault() is
expected to execute a read from the file to the page cache, and the
page cache page will never be mapped into the process because after
taking the ptl, spf will notice the vma changed. So, the side effects
that may be observed after munmap completes would be:

- side effects from reading a file into the page cache - I'm not sure
  what they are, the main one I can think of is that userspace may observe
  the file's atime changing?
- side effects from holding a reference to the file - which userspace
  may observe by trying to umount().

Is that the extent of the side effects, or are there more that I have
not thought of?
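
To make the ordering described above concrete, here is a minimal
user-space sketch of it. This is only a toy model under stated assumptions,
not code from the patch series: struct toy_vma, struct toy_fault and the
toy_*() helpers are all hypothetical names, the sequence counter stands in
for whatever check spf performs after taking the ptl, and file_refs stands
in for the reference held on vm_file.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vma; not the kernel's vm_area_struct. */
struct toy_vma {
	unsigned int seq;	/* bumped on every modification, e.g. munmap */
	int file_refs;		/* models references held on the mapped file */
};

/* State carried by one speculative fault. */
struct toy_fault {
	struct toy_vma *vma;
	unsigned int seq_snapshot;
};

/* Start of the fault: snapshot the sequence and pin the file. */
static void toy_spf_begin(struct toy_fault *f, struct toy_vma *vma)
{
	f->vma = vma;
	f->seq_snapshot = vma->seq;
	vma->file_refs++;
}

/*
 * munmap is not delayed by the fault in this model: it bumps the
 * sequence and drops the mapping's own file reference right away.
 */
static void toy_munmap(struct toy_vma *vma)
{
	vma->seq++;
	vma->file_refs--;
}

/*
 * End of the fault: models the re-check done after taking the ptl.
 * Returns true if the page may be mapped, false if the vma changed
 * and the result of the fault must be discarded.
 */
static bool toy_spf_end(struct toy_fault *f)
{
	bool unchanged = (f->vma->seq == f->seq_snapshot);

	assert(f->vma->file_refs > 0);
	f->vma->file_refs--;	/* release the pin taken in toy_spf_begin() */
	return unchanged;
}
```

In this model, calling toy_munmap() between toy_spf_begin() and
toy_spf_end() leaves file_refs above zero after the munmap has returned,
which corresponds to the window in which an umount could be affected, and
toy_spf_end() then reports that the page must not be mapped.
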
> Regarding the file mapping support, the concern is to ensure that
> vm_ops->fault() will not try to release the mmap_sem. This is true for most
> of the file systems, which use the generic operation, but there is currently
> no clever way to identify that except by checking the vm_ops->fault pointer.
> Adding a flag to the vm_operations_struct structure is another option.
>
> That's doable as long as the underlying fault() function is not dealing with
> the mmap_sem. I gave it a try in the past, but thought that the
> anonymous case should be accepted first before moving forward this way.

Yes, that makes sense. Updating all of the fault handlers would be a
lot of work - but there doesn't seem to be anything fundamental that
wouldn't work there (except for the side effects of reordering spf
against munmap, as discussed above, which don't look easy to fully hide).
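
For illustration, the two ways of identifying a suitable fault handler
mentioned in the quoted text (comparing the vm_ops->fault pointer, or
adding a flag to the operations structure) could be sketched as below.
This is a user-space toy with hypothetical names only: struct toy_vm_ops,
the fault_keeps_mmap_sem flag and both toy handlers are assumptions made
for the example, and none of them exist in the kernel or in the series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for vm_operations_struct. */
struct toy_vm_ops {
	int (*fault)(void *vmf);
	bool fault_keeps_mmap_sem;	/* hypothetical opt-in flag */
};

/* Stands in for a generic handler known not to release the mmap_sem. */
static int toy_generic_fault(void *vmf)
{
	(void)vmf;
	return 0;
}

/* Stands in for a filesystem-specific handler of unknown behaviour. */
static int toy_custom_fault(void *vmf)
{
	(void)vmf;
	return 0;
}

/* Option 1: allow spf only for a handler recognised by its address. */
static bool toy_spf_ok_by_pointer(const struct toy_vm_ops *ops)
{
	return ops != NULL && ops->fault == toy_generic_fault;
}

/* Option 2: allow spf for any handler that explicitly opted in. */
static bool toy_spf_ok_by_flag(const struct toy_vm_ops *ops)
{
	return ops != NULL && ops->fault != NULL && ops->fault_keeps_mmap_sem;
}
```

The pointer comparison needs no structure change but only covers handlers
listed in advance; the flag lets each handler opt in once it has been
audited, at the cost of touching every vm_operations_struct that wants it.
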
--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
Thread overview: 98+ messages
2019-04-16 13:44 [PATCH v12 00/31] Speculative page faults Laurent Dufour
2019-04-16 13:44 ` [PATCH v12 01/31] mm: introduce CONFIG_SPECULATIVE_PAGE_FAULT Laurent Dufour
2019-04-18 21:47 ` Jerome Glisse
2019-04-23 15:21 ` Laurent Dufour
2019-04-16 13:44 ` [PATCH v12 02/31] x86/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT Laurent Dufour
2019-04-18 21:48 ` Jerome Glisse
2019-04-16 13:44 ` [PATCH v12 03/31] powerpc/mm: set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT Laurent Dufour
2019-04-18 21:49 ` Jerome Glisse
2019-04-16 13:44 ` [PATCH v12 04/31] arm64/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT Laurent Dufour
2019-04-16 14:27 ` Mark Rutland
2019-04-16 14:31 ` Laurent Dufour
2019-04-16 14:41 ` Mark Rutland
2019-04-18 21:51 ` Jerome Glisse
2019-04-23 15:36 ` Laurent Dufour
2019-04-23 16:19 ` Mark Rutland
2019-04-24 10:34 ` Laurent Dufour
2019-04-16 13:44 ` [PATCH v12 05/31] mm: prepare for FAULT_FLAG_SPECULATIVE Laurent Dufour
2019-04-18 22:04 ` Jerome Glisse
2019-04-23 15:45 ` Laurent Dufour
2019-04-16 13:44 ` [PATCH v12 06/31] mm: introduce pte_spinlock " Laurent Dufour
2019-04-18 22:05 ` Jerome Glisse
2019-04-16 13:44 ` [PATCH v12 07/31] mm: make pte_unmap_same compatible with SPF Laurent Dufour
2019-04-18 22:10 ` Jerome Glisse
2019-04-23 15:43 ` Matthew Wilcox
2019-04-23 15:47 ` Laurent Dufour
2019-04-16 13:44 ` [PATCH v12 08/31] mm: introduce INIT_VMA() Laurent Dufour
2019-04-18 22:22 ` Jerome Glisse
2019-04-16 13:45 ` [PATCH v12 09/31] mm: VMA sequence count Laurent Dufour
2019-04-18 22:48 ` Jerome Glisse
2019-04-19 15:45 ` Laurent Dufour
2019-04-22 15:51 ` Jerome Glisse
2019-04-16 13:45 ` [PATCH v12 10/31] mm: protect VMA modifications using " Laurent Dufour
2019-04-22 19:43 ` Jerome Glisse
2019-04-16 13:45 ` [PATCH v12 11/31] mm: protect mremap() against SPF hanlder Laurent Dufour
2019-04-22 19:51 ` Jerome Glisse
2019-04-23 15:51 ` Laurent Dufour
2019-04-16 13:45 ` [PATCH v12 12/31] mm: protect SPF handler against anon_vma changes Laurent Dufour
2019-04-22 19:53 ` Jerome Glisse
2019-04-16 13:45 ` [PATCH v12 13/31] mm: cache some VMA fields in the vm_fault structure Laurent Dufour
2019-04-22 20:06 ` Jerome Glisse
2019-04-16 13:45 ` [PATCH v12 14/31] mm/migrate: Pass vm_fault pointer to migrate_misplaced_page() Laurent Dufour
2019-04-22 20:09 ` Jerome Glisse
2019-04-16 13:45 ` [PATCH v12 15/31] mm: introduce __lru_cache_add_active_or_unevictable Laurent Dufour
2019-04-22 20:11 ` Jerome Glisse
2019-04-16 13:45 ` [PATCH v12 16/31] mm: introduce __vm_normal_page() Laurent Dufour
2019-04-22 20:15 ` Jerome Glisse
2019-04-16 13:45 ` [PATCH v12 17/31] mm: introduce __page_add_new_anon_rmap() Laurent Dufour
2019-04-22 20:18 ` Jerome Glisse
2019-04-16 13:45 ` [PATCH v12 18/31] mm: protect against PTE changes done by dup_mmap() Laurent Dufour
2019-04-22 20:32 ` Jerome Glisse
2019-04-24 10:33 ` Laurent Dufour
2019-04-16 13:45 ` [PATCH v12 19/31] mm: protect the RB tree with a sequence lock Laurent Dufour
2019-04-22 20:33 ` Jerome Glisse
2019-04-16 13:45 ` [PATCH v12 20/31] mm: introduce vma reference counter Laurent Dufour
2019-04-22 20:36 ` Jerome Glisse
2019-04-24 14:26 ` Laurent Dufour
2019-04-16 13:45 ` [PATCH v12 21/31] mm: Introduce find_vma_rcu() Laurent Dufour
2019-04-22 20:57 ` Jerome Glisse
2019-04-24 14:39 ` Laurent Dufour
2019-04-23 9:27 ` Peter Zijlstra
2019-04-23 18:13 ` Davidlohr Bueso
2019-04-24 7:57 ` Laurent Dufour
2019-04-16 13:45 ` [PATCH v12 22/31] mm: provide speculative fault infrastructure Laurent Dufour
2019-04-22 21:26 ` Jerome Glisse
2019-04-24 14:56 ` Laurent Dufour
2019-04-24 15:13 ` Jerome Glisse
2019-04-16 13:45 ` [PATCH v12 23/31] mm: don't do swap readahead during speculative page fault Laurent Dufour
2019-04-22 21:36 ` Jerome Glisse
2019-04-24 14:57 ` Laurent Dufour
2019-04-16 13:45 ` [PATCH v12 24/31] mm: adding speculative page fault failure trace events Laurent Dufour
2019-04-16 13:45 ` [PATCH v12 25/31] perf: add a speculative page fault sw event Laurent Dufour
2019-04-16 13:45 ` [PATCH v12 26/31] perf tools: add support for the SPF perf event Laurent Dufour
2019-04-16 13:45 ` [PATCH v12 27/31] mm: add speculative page fault vmstats Laurent Dufour
2019-04-16 13:45 ` [PATCH v12 28/31] x86/mm: add speculative pagefault handling Laurent Dufour
2019-04-16 13:45 ` [PATCH v12 29/31] powerpc/mm: add speculative page fault Laurent Dufour
2019-04-16 13:45 ` [PATCH v12 30/31] arm64/mm: " Laurent Dufour
2019-04-16 13:45 ` [PATCH v12 31/31] mm: Add a speculative page fault switch in sysctl Laurent Dufour
2019-04-22 21:29 ` [PATCH v12 00/31] Speculative page faults Michel Lespinasse
2019-04-23 9:38 ` Peter Zijlstra
2019-04-24 7:33 ` Laurent Dufour
2019-04-27 1:53 ` Michel Lespinasse [this message]
2019-04-23 10:47 ` Michal Hocko
2019-04-23 12:41 ` Matthew Wilcox
2019-04-23 12:48 ` Peter Zijlstra
2019-04-23 13:42 ` Michal Hocko
2019-04-24 18:01 ` Laurent Dufour
2019-04-27 6:00 ` Michel Lespinasse
2019-04-23 11:35 ` Anshuman Khandual
2019-06-06 6:51 ` Haiyan Song
2019-06-14 8:37 ` Laurent Dufour
2019-06-14 8:44 ` Laurent Dufour
2019-06-20 8:19 ` Haiyan Song
2020-07-06 9:25 ` Chinwen Chang
2020-07-06 12:27 ` Laurent Dufour
2020-07-07 5:31 ` Chinwen Chang
2020-12-14 2:03 ` Joel Fernandes
2020-12-14 9:36 ` Laurent Dufour
2020-12-14 18:10 ` Joel Fernandes