* Re: [LSF/MM TOPIC] Addressing mmap_sem contention
2018-01-29 14:29 [LSF/MM TOPIC] Addressing mmap_sem contention Laurent Dufour
@ 2018-01-31 14:33 ` Davidlohr Bueso
0 siblings, 0 replies; 2+ messages in thread
From: Davidlohr Bueso @ 2018-01-31 14:33 UTC (permalink / raw)
To: Laurent Dufour
Cc: lsf-pc, Linux MM, Andrea Arcangeli, Michal Hocko,
Anshuman Khandual, Andrew Morton, Andi Kleen, Matthew Wilcox,
Peter Zijlstra
On Mon, 29 Jan 2018, Laurent Dufour wrote:
>Hi,
>
>I would like to talk about the way to remove the mmap_sem contention we
>could see on large threaded systems.
'cause what's lsfmm without mmap_sem, right? ;)
>
>I already resurrected the Speculative Page Fault patchset from Peter
>Zijlstra [1]. This series allows concurrency between page fault handler and
>the other thread's activity. Running a massively threaded benchmark like
>ebizzy [2] on top of this kernel shows that there is an opportunity to
>scale far better on large systems (x2). But the SPF series is addressing
>only one part of the issue, and there is a need to address the other part
>of picture.
>
>There have been some discussions last year about the range locking but this
>has been put in hold, especially because this implies huge change in the
>kernel as the mmap_sem is used to protect so many resources (should we need
>to protect the process command line with the mmap_sem ?), and sometimes the
>assumption is made that the mmap_sem is protecting code against concurrency
>while it is not dealing clearly with the mmap_sem.
>
>This will be a massive change and rebasing such a series will be hard, so
>it may be far better to first agreed on best options to improve mmap_sem's
>performance and scalability. There are several additional options on the
>table, including range locking, multiple fine-grained locks, etc...
>In addition, I would like to discuss the options and the best way to make
>the move smooth in breaking or replacing the mmap_sem.
I'd also like to discuss this stuff. In particular I've been focusing on
range locking the mm. With the range_lock primitive ready by now (including
rbtree optimizations), my priority as been converting mmap_sem and getting
adequate performance data for the worst case scenario (full range).
Also, fyi recently by means of auditing handle_mm_fault() and gup family,
two new naughty users were found that were doing gup() without mmap_sem:
https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?h=wip/dl-for-next&id=487f6683f1b738e40aca2386b9f73da4ebb8223d
https://lkml.org/lkml/2018/1/22/640
With a from-scratch conversion, it's been mostly pretty straightforward
although I've done some hacks along the way. In particular avoiding having
to teach file_operations about mmrange, which would be a gazillion times
more changes than what we already have. So removing the is_locked() check
for calls like zap_pmd_range(), thp (pmd_trans_huge_lock()),
vm_insert_page() (which I audited and all ->fault() users seem to correctly
set VM_MIXEDMAP, so we might be able to get rid of it, dunno).
All this said, yes, I hope to have the patches and numbers asap (way
before lsfmm).
>Peoples (sorry if I missed someone) :
> Andrea Arcangeli
> Davidlohr Bueso
> Michal Hocko
> Anshuman Khandual
> Andi Kleen
> Andrew Morton
> Matthew Wilcox
> Peter Zijlstra
I'd also add Kirill A. Shutemov.
Thanks,
Davidlohr
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 2+ messages in thread