Date: Thu, 26 Dec 2024 13:15:40 -0800
To:
 mm-commits@vger.kernel.org,willy@infradead.org,vbabka@suse.cz,
 souravpanda@google.com,shakeel.butt@linux.dev,peterz@infradead.org,
 peterx@redhat.com,paulmck@kernel.org,pasha.tatashin@soleen.com,
 oliver.sang@intel.com,oleg@redhat.com,mjguzik@gmail.com,
 minchan@google.com,mhocko@suse.com,mgorman@techsingularity.net,
 lorenzo.stoakes@oracle.com,lokeshgidra@google.com,
 Liam.Howlett@Oracle.com,klarasmodin@gmail.com,jannh@google.com,
 hughd@google.com,hdanton@sina.com,hannes@cmpxchg.org,
 dhowells@redhat.com,david@redhat.com,dave@stgolabs.net,corbet@lwn.net,
 brauner@kernel.org,surenb@google.com,akpm@linux-foundation.org
From: Andrew Morton
Subject: + mm-introduce-vma_start_read_locked_nested-helpers.patch added to mm-unstable branch
Message-Id: <20241226211541.78B03C4CED1@smtp.kernel.org>
Precedence: bulk
X-Mailing-List: mm-commits@vger.kernel.org


The patch titled
     Subject: mm: introduce vma_start_read_locked{_nested} helpers
has been added to the -mm mm-unstable branch.
Its filename is
     mm-introduce-vma_start_read_locked_nested-helpers.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-introduce-vma_start_read_locked_nested-helpers.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Suren Baghdasaryan
Subject: mm: introduce vma_start_read_locked{_nested} helpers
Date: Thu, 26 Dec 2024 09:06:53 -0800

Patch series "move per-vma lock into vm_area_struct", v7.

Back when per-vma locks were introduced, vm_lock was moved out of
vm_area_struct in [1] because of the performance regression caused by
false cacheline sharing.  Recent investigation [2] revealed that the
regression is limited to a rather old Broadwell microarchitecture and
even there it can be mitigated by disabling adjacent cacheline
prefetching, see [3].

Splitting a single logical structure into multiple ones leads to more
complicated management, extra pointer dereferences and overall less
maintainable code.  When that split-away part is a lock, it complicates
things even further.  With no performance benefits, there are no reasons
for this split.  Merging the vm_lock back into vm_area_struct also allows
vm_area_struct to use SLAB_TYPESAFE_BY_RCU later in this patchset.

This patchset:

1. moves vm_lock back into vm_area_struct, aligning it at the
   cacheline boundary and changing the cache to be cacheline-aligned to
   minimize cacheline sharing;

2. changes vm_area_struct initialization to mark new vma as detached
   until it is inserted into vma tree;

3. replaces vm_lock and vma->detached flag with a reference counter;

4. changes vm_area_struct cache to SLAB_TYPESAFE_BY_RCU to allow for
   their reuse and to minimize call_rcu() calls.

Pagefault microbenchmarks show performance improvement:

Hmean     faults/cpu-1    507926.5547 (   0.00%)   506519.3692 *  -0.28%*
Hmean     faults/cpu-4    479119.7051 (   0.00%)   481333.6802 *   0.46%*
Hmean     faults/cpu-7    452880.2961 (   0.00%)   455845.6211 *   0.65%*
Hmean     faults/cpu-12   347639.1021 (   0.00%)   352004.2254 *   1.26%*
Hmean     faults/cpu-21   200061.2238 (   0.00%)   229597.0317 *  14.76%*
Hmean     faults/cpu-30   145251.2001 (   0.00%)   164202.5067 *  13.05%*
Hmean     faults/cpu-48   106848.4434 (   0.00%)   120641.5504 *  12.91%*
Hmean     faults/cpu-56    92472.3835 (   0.00%)   103464.7916 *  11.89%*
Hmean     faults/sec-1    507566.1468 (   0.00%)   506139.0811 *  -0.28%*
Hmean     faults/sec-4   1880478.2402 (   0.00%)  1886795.6329 *   0.34%*
Hmean     faults/sec-7   3106394.3438 (   0.00%)  3140550.7485 *   1.10%*
Hmean     faults/sec-12  4061358.4795 (   0.00%)  4112477.0206 *   1.26%*
Hmean     faults/sec-21  3988619.1169 (   0.00%)  4577747.1436 *  14.77%*
Hmean     faults/sec-30  3909839.5449 (   0.00%)  4311052.2787 *  10.26%*
Hmean     faults/sec-48  4761108.4691 (   0.00%)  5283790.5026 *  10.98%*
Hmean     faults/sec-56  4885561.4590 (   0.00%)  5415839.4045 *  10.85%*

[1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/
[2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/
[3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@mail.gmail.com/


This patch (of 17):

Introduce helper functions which can be used to read-lock a VMA when
holding mmap_lock for read.  Replace direct accesses to vma->vm_lock with
these new helpers.

Link: https://lkml.kernel.org/r/20241226170710.1159679-1-surenb@google.com
Link: https://lkml.kernel.org/r/20241226170710.1159679-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan
Reviewed-by: Lorenzo Stoakes
Reviewed-by: Davidlohr Bueso
Reviewed-by: Shakeel Butt
Reviewed-by: Vlastimil Babka
Cc: Christian Brauner
Cc: David Hildenbrand
Cc: David Howells
Cc: Hillf Danton
Cc: Hugh Dickins
Cc: Jann Horn
Cc: Johannes Weiner
Cc: Jonathan Corbet
Cc: kernel test robot
Cc: Klara Modin
Cc: Liam R. Howlett
Cc: Lokesh Gidra
Cc: Mateusz Guzik
Cc: Matthew Wilcox (Oracle)
Cc: Mel Gorman
Cc: Michal Hocko
Cc: Minchan Kim
Cc: Oleg Nesterov
Cc: Pasha Tatashin
Cc: Paul E. McKenney
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Sourav Panda
Signed-off-by: Andrew Morton
---

 include/linux/mm.h |   24 ++++++++++++++++++++++++
 mm/userfaultfd.c   |   22 +++++-----------------
 2 files changed, 29 insertions(+), 17 deletions(-)

--- a/include/linux/mm.h~mm-introduce-vma_start_read_locked_nested-helpers
+++ a/include/linux/mm.h
@@ -735,6 +735,30 @@ static inline bool vma_start_read(struct
 	return true;
 }
 
+/*
+ * Use only while holding mmap read lock which guarantees that locking will not
+ * fail (nobody can concurrently write-lock the vma). vma_start_read() should
+ * not be used in such cases because it might fail due to mm_lock_seq overflow.
+ * This functionality is used to obtain vma read lock and drop the mmap read lock.
+ */
+static inline void vma_start_read_locked_nested(struct vm_area_struct *vma, int subclass)
+{
+	mmap_assert_locked(vma->vm_mm);
+	down_read_nested(&vma->vm_lock->lock, subclass);
+}
+
+/*
+ * Use only while holding mmap read lock which guarantees that locking will not
+ * fail (nobody can concurrently write-lock the vma). vma_start_read() should
+ * not be used in such cases because it might fail due to mm_lock_seq overflow.
+ * This functionality is used to obtain vma read lock and drop the mmap read lock.
+ */
+static inline void vma_start_read_locked(struct vm_area_struct *vma)
+{
+	mmap_assert_locked(vma->vm_mm);
+	down_read(&vma->vm_lock->lock);
+}
+
 static inline void vma_end_read(struct vm_area_struct *vma)
 {
 	rcu_read_lock(); /* keeps vma alive till the end of up_read */
--- a/mm/userfaultfd.c~mm-introduce-vma_start_read_locked_nested-helpers
+++ a/mm/userfaultfd.c
@@ -84,16 +84,8 @@ static struct vm_area_struct *uffd_lock_
 
 	mmap_read_lock(mm);
 	vma = find_vma_and_prepare_anon(mm, address);
-	if (!IS_ERR(vma)) {
-		/*
-		 * We cannot use vma_start_read() as it may fail due to
-		 * false locked (see comment in vma_start_read()). We
-		 * can avoid that by directly locking vm_lock under
-		 * mmap_lock, which guarantees that nobody can lock the
-		 * vma for write (vma_start_write()) under us.
-		 */
-		down_read(&vma->vm_lock->lock);
-	}
+	if (!IS_ERR(vma))
+		vma_start_read_locked(vma);
 
 	mmap_read_unlock(mm);
 	return vma;
@@ -1491,14 +1483,10 @@ static int uffd_move_lock(struct mm_stru
 	mmap_read_lock(mm);
 	err = find_vmas_mm_locked(mm, dst_start, src_start, dst_vmap, src_vmap);
 	if (!err) {
-		/*
-		 * See comment in uffd_lock_vma() as to why not using
-		 * vma_start_read() here.
-		 */
-		down_read(&(*dst_vmap)->vm_lock->lock);
+		vma_start_read_locked(*dst_vmap);
 		if (*dst_vmap != *src_vmap)
-			down_read_nested(&(*src_vmap)->vm_lock->lock,
-					 SINGLE_DEPTH_NESTING);
+			vma_start_read_locked_nested(*src_vmap,
+						SINGLE_DEPTH_NESTING);
 	}
 	mmap_read_unlock(mm);
 	return err;
_

Patches currently in -mm which might be from surenb@google.com are

seqlock-add-raw_seqcount_try_begin.patch
mm-convert-mm_lock_seq-to-a-proper-seqcount.patch
mm-introduce-mmap_lock_speculate_try_beginretry.patch
mm-introduce-vma_start_read_locked_nested-helpers.patch
mm-move-per-vma-lock-into-vm_area_struct.patch
mm-mark-vma-as-detached-until-its-added-into-vma-tree.patch
mm-modify-vma_iter_store_gfp-to-indicate-if-its-storing-a-new-vma.patch
mm-mark-vmas-detached-upon-exit.patch
mm-nommu-fix-the-last-places-where-vma-is-not-locked-before-being-attached.patch
types-move-struct-rcuwait-into-typesh.patch
mm-allow-vma_start_read_locked-vma_start_read_locked_nested-to-fail.patch
mm-move-mmap_init_lock-out-of-the-header-file.patch
mm-uninline-the-main-body-of-vma_start_write.patch
refcount-introduce-__refcount_addinc_not_zero_limited.patch
mm-replace-vm_lock-and-detached-flag-with-a-reference-count.patch
mm-debug-print-vm_refcnt-state-when-dumping-the-vma.patch
mm-debug-print-vm_refcnt-state-when-dumping-the-vma-fix.patch
mm-remove-extra-vma_numab_state_init-call.patch
mm-prepare-lock_vma_under_rcu-for-vma-reuse-possibility.patch
mm-make-vma-cache-slab_typesafe_by_rcu.patch
docs-mm-document-latest-changes-to-vm_lock.patch
