Date: Fri, 27 Mar 2026 16:12:59 -0700
To: mm-commits@vger.kernel.org,surenb@google.com,akpm@linux-foundation.org
From: Andrew Morton
Subject: + mm-vma-use-vma_start_write_killable-in-vma-operations.patch added to mm-new branch
Message-Id: <20260327231300.70105C19423@smtp.kernel.org>
Precedence: bulk
X-Mailing-List: mm-commits@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:


The patch titled
     Subject: mm/vma: use vma_start_write_killable() in vma operations
has been added to the -mm mm-new branch.  Its filename is
     mm-vma-use-vma_start_write_killable-in-vma-operations.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-vma-use-vma_start_write_killable-in-vma-operations.patch

This patch will later appear in the mm-new branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews.  Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.

The mm-new branch of mm.git is not included in linux-next.

If a few days of testing in mm-new is successful, the patch will be moved
into mm.git's mm-unstable branch, which is included in linux-next.

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days

------------------------------------------------------
From: Suren Baghdasaryan
Subject: mm/vma: use vma_start_write_killable() in vma operations
Date: Fri, 27 Mar 2026 13:54:55 -0700

Replace vma_start_write() with vma_start_write_killable(), improving
reaction time to the kill signal.  Replace vma_start_write() calls when we
operate on VMAs.

To propagate errors from vma_merge_existing_range() and vma_expand() we
fake an ENOMEM error when we fail due to a pending fatal signal.  This is
a temporary workaround.  Fixing this requires some refactoring and will be
done separately in the future.

In a number of places we now lock the VMA earlier than before to avoid
doing work and undoing it later if a fatal signal is pending.  This is
safe because the moves are happening within sections where we already hold
the mmap_write_lock, so the moves do not change the locking order relative
to other kernel locks.

Link: https://lkml.kernel.org/r/20260327205457.604224-5-surenb@google.com
Signed-off-by: Suren Baghdasaryan
Suggested-by: Matthew Wilcox
Cc: Alexander Gordeev
Cc: Alistair Popple
Cc: Baolin Wang
Cc: Barry Song
Cc: Byungchul Park
Cc: Christian Borntraeger
Cc: Claudio Imbrenda
Cc: David Hildenbrand
Cc: Dev Jain
Cc: Gerald Schaefer
Cc: Gregory Price
Cc: Heiko Carstens
Cc: "Huang, Ying"
Cc: Jann Horn
Cc: Janosch Frank
Cc: Joshua Hahn
Cc: Kees Cook
Cc: Lance Yang
Cc: Liam R. Howlett
Cc: Lorenzo Stoakes
Cc: Lorenzo Stoakes (Oracle)
Cc: Madhavan Srinivasan
Cc: Matthew Brost
Cc: Michael Ellerman
Cc: Michal Hocko
Cc: Mike Rapoport
Cc: Nicholas Piggin
Cc: Nico Pache
Cc: Pedro Falcato
Cc: Rakie Kim
Cc: Ritesh Harjani (IBM)
Cc: Ryan Roberts
Cc: Sven Schnelle
Cc: Vasily Gorbik
Cc: Vlastimil Babka
Cc: Zi Yan
Signed-off-by: Andrew Morton
---

 mm/vma.c      |  146 +++++++++++++++++++++++++++++++++++++-----------
 mm/vma_exec.c |    6 +
 2 files changed, 117 insertions(+), 35 deletions(-)

--- a/mm/vma.c~mm-vma-use-vma_start_write_killable-in-vma-operations
+++ a/mm/vma.c
@@ -524,6 +524,21 @@ __split_vma(struct vma_iterator *vmi, st
 		new->vm_pgoff += ((addr - vma->vm_start) >> PAGE_SHIFT);
 	}
 
+	/*
+	 * Lock VMAs before cloning to avoid extra work if fatal signal
+	 * is pending.
+	 */
+	err = vma_start_write_killable(vma);
+	if (err)
+		goto out_free_vma;
+	/*
+	 * Locking a new detached VMA will always succeed but it's just a
+	 * detail of the current implementation, so handle it all the same.
+	 */
+	err = vma_start_write_killable(new);
+	if (err)
+		goto out_free_vma;
+
 	err = -ENOMEM;
 	vma_iter_config(vmi, new->vm_start, new->vm_end);
 	if (vma_iter_prealloc(vmi, new))
@@ -543,9 +558,6 @@ __split_vma(struct vma_iterator *vmi, st
 	if (new->vm_ops && new->vm_ops->open)
 		new->vm_ops->open(new);
 
-	vma_start_write(vma);
-	vma_start_write(new);
-
 	init_vma_prep(&vp, vma);
 	vp.insert = new;
 	vma_prepare(&vp);
@@ -900,12 +912,22 @@ static __must_check struct vm_area_struc
 	}
 
 	/* No matter what happens, we will be adjusting middle. */
-	vma_start_write(middle);
+	err = vma_start_write_killable(middle);
+	if (err) {
+		/* Ensure error propagates. */
+		vmg->give_up_on_oom = false;
+		goto abort;
+	}
 
 	if (merge_right) {
 		vma_flags_t next_sticky;
 
-		vma_start_write(next);
+		err = vma_start_write_killable(next);
+		if (err) {
+			/* Ensure error propagates. */
+			vmg->give_up_on_oom = false;
+			goto abort;
+		}
 		vmg->target = next;
 		next_sticky = vma_flags_and_mask(&next->flags, VMA_STICKY_FLAGS);
 		vma_flags_set_mask(&sticky_flags, next_sticky);
@@ -914,7 +936,12 @@ static __must_check struct vm_area_struc
 	if (merge_left) {
 		vma_flags_t prev_sticky;
 
-		vma_start_write(prev);
+		err = vma_start_write_killable(prev);
+		if (err) {
+			/* Ensure error propagates. */
+			vmg->give_up_on_oom = false;
+			goto abort;
+		}
 		vmg->target = prev;
 
 		prev_sticky = vma_flags_and_mask(&prev->flags, VMA_STICKY_FLAGS);
@@ -1170,10 +1197,18 @@ int vma_expand(struct vma_merge_struct *
 	vma_flags_t sticky_flags =
 		vma_flags_and_mask(&vmg->vma_flags, VMA_STICKY_FLAGS);
 	vma_flags_t target_sticky;
-	int err = 0;
+	int err;
 
 	mmap_assert_write_locked(vmg->mm);
-	vma_start_write(target);
+	err = vma_start_write_killable(target);
+	if (err) {
+		/*
+		 * Override VMA_MERGE_NOMERGE to prevent callers from
+		 * falling back to a new VMA allocation.
+		 */
+		vmg->state = VMA_MERGE_ERROR_NOMEM;
+		return err;
+	}
 
 	target_sticky = vma_flags_and_mask(&target->flags, VMA_STICKY_FLAGS);
 
@@ -1201,6 +1236,19 @@ int vma_expand(struct vma_merge_struct *
 	 * we don't need to account for vmg->give_up_on_mm here.
 	 */
 	if (remove_next) {
+		/*
+		 * Lock the VMA early to avoid extra work if fatal signal
+		 * is pending.
+		 */
+		err = vma_start_write_killable(next);
+		if (err) {
+			/*
+			 * Override VMA_MERGE_NOMERGE to prevent callers from
+			 * falling back to a new VMA allocation.
+			 */
+			vmg->state = VMA_MERGE_ERROR_NOMEM;
+			return err;
+		}
 		err = dup_anon_vma(target, next, &anon_dup);
 		if (err)
 			return err;
@@ -1214,7 +1262,6 @@ int vma_expand(struct vma_merge_struct *
 	if (remove_next) {
 		vma_flags_t next_sticky;
 
-		vma_start_write(next);
 		vmg->__remove_next = true;
 
 		next_sticky = vma_flags_and_mask(&next->flags, VMA_STICKY_FLAGS);
@@ -1252,9 +1299,14 @@ int vma_shrink(struct vma_iterator *vmi,
 	       unsigned long start, unsigned long end, pgoff_t pgoff)
 {
 	struct vma_prepare vp;
+	int err;
 
 	WARN_ON((vma->vm_start != start) && (vma->vm_end != end));
 
+	err = vma_start_write_killable(vma);
+	if (err)
+		return err;
+
 	if (vma->vm_start < start)
 		vma_iter_config(vmi, vma->vm_start, start);
 	else
@@ -1263,8 +1315,6 @@ int vma_shrink(struct vma_iterator *vmi,
 	if (vma_iter_prealloc(vmi, NULL))
 		return -ENOMEM;
 
-	vma_start_write(vma);
-
 	init_vma_prep(&vp, vma);
 	vma_prepare(&vp);
 	vma_adjust_trans_huge(vma, start, end, NULL);
@@ -1453,7 +1503,9 @@ static int vms_gather_munmap_vmas(struct
 			if (error)
 				goto end_split_failed;
 		}
-		vma_start_write(next);
+		error = vma_start_write_killable(next);
+		if (error)
+			goto munmap_gather_failed;
 		mas_set(mas_detach, vms->vma_count++);
 		error = mas_store_gfp(mas_detach, next, GFP_KERNEL);
 		if (error)
@@ -1848,12 +1900,16 @@ static void vma_link_file(struct vm_area
 static int vma_link(struct mm_struct *mm, struct vm_area_struct *vma)
 {
 	VMA_ITERATOR(vmi, mm, 0);
+	int err;
+
+	err = vma_start_write_killable(vma);
+	if (err)
+		return err;
 
 	vma_iter_config(&vmi, vma->vm_start, vma->vm_end);
 	if (vma_iter_prealloc(&vmi, vma))
 		return -ENOMEM;
 
-	vma_start_write(vma);
 	vma_iter_store_new(&vmi, vma);
 	vma_link_file(vma, /* hold_rmap_lock= */false);
 	mm->map_count++;
@@ -2239,9 +2295,8 @@ int mm_take_all_locks(struct mm_struct *
 	 * is reached.
 	 */
 	for_each_vma(vmi, vma) {
-		if (signal_pending(current))
+		if (signal_pending(current) || vma_start_write_killable(vma))
 			goto out_unlock;
-		vma_start_write(vma);
 	}
 
 	vma_iter_init(&vmi, mm, 0);
@@ -2540,8 +2595,8 @@ static int __mmap_new_vma(struct mmap_st
 	struct mmap_action *action)
 {
 	struct vma_iterator *vmi = map->vmi;
-	int error = 0;
 	struct vm_area_struct *vma;
+	int error;
 
 	/*
 	 * Determine the object being mapped and call the appropriate
@@ -2552,6 +2607,14 @@ static int __mmap_new_vma(struct mmap_st
 	if (!vma)
 		return -ENOMEM;
 
+	/*
+	 * Lock the VMA early to avoid extra work if fatal signal
+	 * is pending.
+	 */
+	error = vma_start_write_killable(vma);
+	if (error)
+		goto free_vma;
+
 	vma_iter_config(vmi, map->addr, map->end);
 	vma_set_range(vma, map->addr, map->end, map->pgoff);
 	vma->flags = map->vma_flags;
@@ -2582,8 +2645,6 @@ static int __mmap_new_vma(struct mmap_st
 	WARN_ON_ONCE(!arch_validate_flags(map->vm_flags));
 #endif
 
-	/* Lock the VMA since it is modified after insertion into VMA tree */
-	vma_start_write(vma);
 	vma_iter_store_new(vmi, vma);
 	map->mm->map_count++;
 	vma_link_file(vma, action->hide_from_rmap_until_complete);
@@ -2878,6 +2939,7 @@ int do_brk_flags(struct vma_iterator *vm
 		 unsigned long addr, unsigned long len, vma_flags_t vma_flags)
 {
 	struct mm_struct *mm = current->mm;
+	int err;
 
 	/*
 	 * Check against address space limits by the changed size
@@ -2910,24 +2972,33 @@ int do_brk_flags(struct vma_iterator *vm
 
 		if (vma_merge_new_range(&vmg))
 			goto out;
-		else if (vmg_nomem(&vmg))
+		if (vmg_nomem(&vmg)) {
+			err = -ENOMEM;
 			goto unacct_fail;
+		}
 	}
 
 	if (vma)
 		vma_iter_next_range(vmi);
 	/* create a vma struct for an anonymous mapping */
 	vma = vm_area_alloc(mm);
-	if (!vma)
+	if (!vma) {
+		err = -ENOMEM;
 		goto unacct_fail;
+	}
 
 	vma_set_anonymous(vma);
 	vma_set_range(vma, addr, addr + len, addr >> PAGE_SHIFT);
 	vma->flags = vma_flags;
 	vma->vm_page_prot = vm_get_page_prot(vma_flags_to_legacy(vma_flags));
-	vma_start_write(vma);
-	if (vma_iter_store_gfp(vmi, vma, GFP_KERNEL))
+	if (vma_start_write_killable(vma)) {
+		err = -EINTR;
+		goto vma_lock_fail;
+	}
+	if (vma_iter_store_gfp(vmi, vma, GFP_KERNEL)) {
+		err = -ENOMEM;
 		goto mas_store_fail;
+	}
 
 	mm->map_count++;
 	validate_mm(mm);
@@ -2942,10 +3013,11 @@ out:
 	return 0;
 
 mas_store_fail:
+vma_lock_fail:
 	vm_area_free(vma);
 unacct_fail:
 	vm_unacct_memory(len >> PAGE_SHIFT);
-	return -ENOMEM;
+	return err;
 }
 
 /**
@@ -3112,8 +3184,8 @@ int expand_upwards(struct vm_area_struct
 	struct mm_struct *mm = vma->vm_mm;
 	struct vm_area_struct *next;
 	unsigned long gap_addr;
-	int error = 0;
 	VMA_ITERATOR(vmi, mm, vma->vm_start);
+	int error;
 
 	if (!vma_test(vma, VMA_GROWSUP_BIT))
 		return -EFAULT;
@@ -3149,12 +3221,14 @@ int expand_upwards(struct vm_area_struct
 
 	/* We must make sure the anon_vma is allocated. */
 	if (unlikely(anon_vma_prepare(vma))) {
-		vma_iter_free(&vmi);
-		return -ENOMEM;
+		error = -ENOMEM;
+		goto vma_prep_fail;
 	}
 
 	/* Lock the VMA before expanding to prevent concurrent page faults */
-	vma_start_write(vma);
+	error = vma_start_write_killable(vma);
+	if (error)
+		goto vma_lock_fail;
 	/* We update the anon VMA tree. */
 	anon_vma_lock_write(vma->anon_vma);
 
@@ -3183,8 +3257,10 @@ int expand_upwards(struct vm_area_struct
 		}
 	}
 	anon_vma_unlock_write(vma->anon_vma);
-	vma_iter_free(&vmi);
 	validate_mm(mm);
+vma_lock_fail:
+vma_prep_fail:
+	vma_iter_free(&vmi);
 	return error;
 }
 #endif /* CONFIG_STACK_GROWSUP */
@@ -3197,8 +3273,8 @@ int expand_downwards(struct vm_area_stru
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct vm_area_struct *prev;
-	int error = 0;
 	VMA_ITERATOR(vmi, mm, vma->vm_start);
+	int error;
 
 	if (!vma_test(vma, VMA_GROWSDOWN_BIT))
 		return -EFAULT;
@@ -3228,12 +3304,14 @@ int expand_downwards(struct vm_area_stru
 
 	/* We must make sure the anon_vma is allocated. */
 	if (unlikely(anon_vma_prepare(vma))) {
-		vma_iter_free(&vmi);
-		return -ENOMEM;
+		error = -ENOMEM;
+		goto vma_prep_fail;
 	}
 
 	/* Lock the VMA before expanding to prevent concurrent page faults */
-	vma_start_write(vma);
+	error = vma_start_write_killable(vma);
+	if (error)
+		goto vma_lock_fail;
 	/* We update the anon VMA tree. */
 	anon_vma_lock_write(vma->anon_vma);
 
@@ -3263,8 +3341,10 @@ int expand_downwards(struct vm_area_stru
 		}
 	}
 	anon_vma_unlock_write(vma->anon_vma);
-	vma_iter_free(&vmi);
 	validate_mm(mm);
+vma_lock_fail:
+vma_prep_fail:
+	vma_iter_free(&vmi);
 	return error;
 }
 
--- a/mm/vma_exec.c~mm-vma-use-vma_start_write_killable-in-vma-operations
+++ a/mm/vma_exec.c
@@ -41,6 +41,7 @@ int relocate_vma_down(struct vm_area_str
 	struct vm_area_struct *next;
 	struct mmu_gather tlb;
 	PAGETABLE_MOVE(pmc, vma, vma, old_start, new_start, length);
+	int err;
 
 	BUG_ON(new_start > new_end);
 
@@ -56,8 +57,9 @@ int relocate_vma_down(struct vm_area_str
 	 * cover the whole range: [new_start, old_end)
 	 */
 	vmg.target = vma;
-	if (vma_expand(&vmg))
-		return -ENOMEM;
+	err = vma_expand(&vmg);
+	if (err)
+		return err;
 
 	/*
 	 * move the page tables downwards, on failure we rely on
_

Patches currently in -mm which might be from surenb@google.com are

mm-vma-cleanup-error-handling-path-in-vma_expand.patch
mm-use-vma_start_write_killable-in-mm-syscalls.patch
mm-khugepaged-use-vma_start_write_killable-in-collapse_huge_page.patch
mm-vma-use-vma_start_write_killable-in-vma-operations.patch
mm-use-vma_start_write_killable-in-process_vma_walk_lock.patch
kvm-ppc-use-vma_start_write_killable-in-kvmppc_memslot_page_merge.patch
mm-vmscan-prevent-mglru-reclaim-from-pinning-address-space.patch
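
For readers skimming the changelog, the "lock early, bail out before doing
work" idea can be sketched in plain userspace C.  This is only an
illustration under stated assumptions, not kernel code: struct toy_vma,
toy_start_write_killable(), toy_prealloc() and toy_link() are hypothetical
stand-ins invented here, and the pending fatal signal is modelled by a
simple flag rather than real task state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a VMA; not the kernel's struct vm_area_struct. */
struct toy_vma {
	bool write_locked;
	bool linked;
};

/* Models "a fatal signal is pending"; an assumption made for this sketch. */
static bool toy_fatal_signal_pending;

/* Counts how often the fallible preparation step actually ran. */
static int toy_prealloc_calls;

/*
 * Hypothetical model of a killable write lock: return -EINTR instead of
 * taking the lock when a fatal signal is pending.
 */
static int toy_start_write_killable(struct toy_vma *vma)
{
	if (toy_fatal_signal_pending)
		return -EINTR;
	vma->write_locked = true;
	return 0;
}

/* Stand-in for a preparation step such as preallocating tree nodes. */
static int toy_prealloc(void)
{
	toy_prealloc_calls++;
	return 0;
}

/*
 * Take the lock first, then do the remaining work.  If locking fails,
 * nothing has been set up yet, so there is nothing to undo.
 */
static int toy_link(struct toy_vma *vma)
{
	int err;

	assert(vma);

	err = toy_start_write_killable(vma);
	if (err)
		return err;

	err = toy_prealloc();
	if (err)
		return err;

	vma->linked = true;
	return 0;
}
```

With the flag clear, toy_link() locks and links the object.  With the flag
set, it returns -EINTR before toy_prealloc() is ever called, which mirrors
the motivation given above for moving the lock ahead of the preparation
work.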