From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 27 Mar 2026 10:00:28 -0700
To: mm-commits@vger.kernel.org, surenb@google.com, akpm@linux-foundation.org
From: Andrew Morton
Subject: + mm-use-vma_start_write_killable-in-mm-syscalls.patch added to mm-new branch
Message-Id: <20260327170028.D319BC19423@smtp.kernel.org>
Precedence: bulk
X-Mailing-List: mm-commits@vger.kernel.org

The patch titled
     Subject: mm: use vma_start_write_killable() in mm syscalls
has been added to the -mm mm-new branch.  Its filename is
     mm-use-vma_start_write_killable-in-mm-syscalls.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-use-vma_start_write_killable-in-mm-syscalls.patch

This patch will later appear in the mm-new branch at
     git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Note, mm-new is a provisional staging ground for work-in-progress patches,
and acceptance into mm-new is a notification for others to take notice and
to finish up reviews.  Please do not hesitate to respond to review feedback
and to post updated versions to replace or incrementally fix up patches in
mm-new.
The mm-new branch of mm.git is not included in linux-next.

If a few days of testing in mm-new is successful, the patch will be moved
into mm.git's mm-unstable branch, which is included in linux-next.

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when
    testing your code ***

The -mm tree is included into linux-next via various branches at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days.

------------------------------------------------------
From: Suren Baghdasaryan
Subject: mm: use vma_start_write_killable() in mm syscalls
Date: Thu, 26 Mar 2026 01:08:32 -0700

Replace vma_start_write() with vma_start_write_killable() in syscalls,
improving reaction time to the kill signal.  In a number of places we now
lock the VMA earlier than before, to avoid doing work and then undoing it
if a fatal signal is pending.  This is safe because the moves happen
within sections where we already hold the mmap_write_lock, so they do not
change the locking order relative to other kernel locks.

Link: https://lkml.kernel.org/r/20260326080836.695207-3-surenb@google.com
Signed-off-by: Suren Baghdasaryan
Suggested-by: Matthew Wilcox
Cc: Alexander Gordeev
Cc: Alistair Popple
Cc: Baolin Wang
Cc: Barry Song
Cc: Byungchul Park
Cc: Christian Borntraeger
Cc: "Christophe Leroy (CS GROUP)"
Cc: Claudio Imbrenda
Cc: David Hildenbrand
Cc: Dev Jain
Cc: Gerald Schaefer
Cc: Gregory Price
Cc: Heiko Carstens
Cc: "Huang, Ying"
Cc: Jann Horn
Cc: Janosch Frank
Cc: Joshua Hahn
Cc: Kees Cook
Cc: Lance Yang
Cc: Liam R. Howlett
Cc: Lorenzo Stoakes
Cc: Lorenzo Stoakes (Oracle)
Cc: Madhavan Srinivasan
Cc: Matthew Brost
Cc: Michael Ellerman
Cc: Michal Hocko
Cc: Mike Rapoport
Cc: Nicholas Piggin
Cc: Nico Pache
Cc: Pedro Falcato
Cc: Rakie Kim
Cc: Ritesh Harjani (IBM)
Cc: Ryan Roberts
Cc: Sven Schnelle
Cc: Vasily Gorbik
Cc: Vlastimil Babka
Cc: Zi Yan
Signed-off-by: Andrew Morton
---

 mm/madvise.c   |    4 +++-
 mm/memory.c    |    2 ++
 mm/mempolicy.c |   11 +++++++++--
 mm/mlock.c     |   28 ++++++++++++++++++++++------
 mm/mprotect.c  |    5 ++++-
 mm/mremap.c    |    8 +++++---
 mm/mseal.c     |    5 ++++-
 7 files changed, 49 insertions(+), 14 deletions(-)

--- a/mm/madvise.c~mm-use-vma_start_write_killable-in-mm-syscalls
+++ a/mm/madvise.c
@@ -175,7 +175,9 @@ static int madvise_update_vma(vm_flags_t
 	madv_behavior->vma = vma;

 	/* vm_flags is protected by the mmap_lock held in write mode. */
-	vma_start_write(vma);
+	if (vma_start_write_killable(vma))
+		return -EINTR;
+
 	vma->flags = new_vma_flags;
 	if (set_new_anon_name)
 		return replace_anon_vma_name(vma, anon_name);
--- a/mm/memory.c~mm-use-vma_start_write_killable-in-mm-syscalls
+++ a/mm/memory.c
@@ -366,6 +366,8 @@ void free_pgd_range(struct mmu_gather *t
  * page tables that should be removed. This can differ from the vma mappings on
  * some archs that may have mappings that need to be removed outside the vmas.
  * Note that the prev->vm_end and next->vm_start are often used.
+ * We don't use vma_start_write_killable() because page tables should be freed
+ * even if the task is being killed.
  *
  * The vma_end differs from the pg_end when a dup_mmap() failed and the tree has
  * unrelated data to the mm_struct being torn down.
--- a/mm/mempolicy.c~mm-use-vma_start_write_killable-in-mm-syscalls
+++ a/mm/mempolicy.c
@@ -1784,7 +1784,8 @@ SYSCALL_DEFINE4(set_mempolicy_home_node,
 		return -EINVAL;
 	if (end == start)
 		return 0;
-	mmap_write_lock(mm);
+	if (mmap_write_lock_killable(mm))
+		return -EINTR;
 	prev = vma_prev(&vmi);
 	for_each_vma_range(vmi, vma, end) {
 		/*
@@ -1801,13 +1802,19 @@ SYSCALL_DEFINE4(set_mempolicy_home_node,
 			err = -EOPNOTSUPP;
 			break;
 		}
+		/*
+		 * Lock the VMA early to avoid extra work if fatal signal
+		 * is pending.
+		 */
+		err = vma_start_write_killable(vma);
+		if (err)
+			break;
 		new = mpol_dup(old);
 		if (IS_ERR(new)) {
 			err = PTR_ERR(new);
 			break;
 		}
-		vma_start_write(vma);
 		new->home_node = home_node;
 		err = mbind_range(&vmi, vma, &prev, start, end, new);
 		mpol_put(new);
--- a/mm/mlock.c~mm-use-vma_start_write_killable-in-mm-syscalls
+++ a/mm/mlock.c
@@ -419,8 +419,10 @@ out:
  *
  * Called for mlock(), mlock2() and mlockall(), to set @vma VM_LOCKED;
  * called for munlock() and munlockall(), to clear VM_LOCKED from @vma.
+ *
+ * Return: 0 on success, -EINTR if fatal signal is pending.
  */
-static void mlock_vma_pages_range(struct vm_area_struct *vma,
+static int mlock_vma_pages_range(struct vm_area_struct *vma,
 		unsigned long start, unsigned long end,
 		vma_flags_t *new_vma_flags)
 {
@@ -442,7 +444,9 @@ static void mlock_vma_pages_range(struct
 	 */
 	if (vma_flags_test(new_vma_flags, VMA_LOCKED_BIT))
 		vma_flags_set(new_vma_flags, VMA_IO_BIT);
-	vma_start_write(vma);
+	if (vma_start_write_killable(vma))
+		return -EINTR;
+
 	vma_flags_reset_once(vma, new_vma_flags);

 	lru_add_drain();
@@ -453,6 +457,7 @@ static void mlock_vma_pages_range(struct
 		vma_flags_clear(new_vma_flags, VMA_IO_BIT);
 		vma_flags_reset_once(vma, new_vma_flags);
 	}
+	return 0;
 }

 /*
@@ -506,11 +511,13 @@ static int mlock_fixup(struct vma_iterat
 	 */
 	if (vma_flags_test(&new_vma_flags, VMA_LOCKED_BIT) &&
 	    vma_flags_test(&old_vma_flags, VMA_LOCKED_BIT)) {
+		ret = vma_start_write_killable(vma);
+		if (ret)
+			goto out;
 		/* mm->locked_vm is fine as nr_pages == 0 */
 		/* No work to do, and mlocking twice would be wrong */
-		vma_start_write(vma);
 		vma->flags = new_vma_flags;
 	} else {
-		mlock_vma_pages_range(vma, start, end, &new_vma_flags);
+		ret = mlock_vma_pages_range(vma, start, end, &new_vma_flags);
 	}
 out:
 	*prev = vma;
@@ -739,9 +746,18 @@ static int apply_mlockall_flags(int flag
 		error = mlock_fixup(&vmi, vma, &prev, vma->vm_start,
 				vma->vm_end, newflags);

-		/* Ignore errors, but prev needs fixing up. */
-		if (error)
+		if (error) {
+			/*
+			 * If we failed due to a pending fatal signal, return
+			 * now. If we locked the vma before signal arrived, it
+			 * will be unlocked when we drop mmap_write_lock.
+			 */
+			if (fatal_signal_pending(current))
+				return -EINTR;
+
+			/* Ignore errors, but prev needs fixing up. */
 			prev = vma;
+		}
 		cond_resched();
 	}
 out:
--- a/mm/mprotect.c~mm-use-vma_start_write_killable-in-mm-syscalls
+++ a/mm/mprotect.c
@@ -784,7 +784,10 @@ mprotect_fixup(struct vma_iterator *vmi,
 	 * vm_flags and vm_page_prot are protected by the mmap_lock
 	 * held in write mode.
 	 */
-	vma_start_write(vma);
+	error = vma_start_write_killable(vma);
+	if (error)
+		goto fail;
+
 	vma_flags_reset_once(vma, &new_vma_flags);
 	if (vma_wants_manual_pte_write_upgrade(vma))
 		mm_cp_flags |= MM_CP_TRY_CHANGE_WRITABLE;
--- a/mm/mremap.c~mm-use-vma_start_write_killable-in-mm-syscalls
+++ a/mm/mremap.c
@@ -1348,6 +1348,11 @@ static unsigned long move_vma(struct vma
 	if (err)
 		return err;

+	/* We don't want racing faults. */
+	err = vma_start_write_killable(vrm->vma);
+	if (err)
+		return err;
+
 	/*
 	 * If accounted, determine the number of bytes the operation will
 	 * charge.
@@ -1355,9 +1360,6 @@ static unsigned long move_vma(struct vma
 	if (!vrm_calc_charge(vrm))
 		return -ENOMEM;

-	/* We don't want racing faults. */
-	vma_start_write(vrm->vma);
-
 	/* Perform copy step. */
 	err = copy_vma_and_data(vrm, &new_vma);
 	/*
--- a/mm/mseal.c~mm-use-vma_start_write_killable-in-mm-syscalls
+++ a/mm/mseal.c
@@ -70,6 +70,7 @@ static int mseal_apply(struct mm_struct
 		if (!vma_test(vma, VMA_SEALED_BIT)) {
 			vma_flags_t vma_flags = vma->flags;
+			int err;

 			vma_flags_set(&vma_flags, VMA_SEALED_BIT);
@@ -77,7 +78,9 @@ static int mseal_apply(struct mm_struct
 					curr_end, &vma_flags);
 			if (IS_ERR(vma))
 				return PTR_ERR(vma);
-			vma_start_write(vma);
+			err = vma_start_write_killable(vma);
+			if (err)
+				return err;
 			vma_set_flags(vma, VMA_SEALED_BIT);
 			curr_end = vma->vm_end;	/* Merge may have updated. */
 		}
_

Patches currently in -mm which might be from surenb@google.com are

mm-vma-cleanup-error-handling-path-in-vma_expand.patch
mm-use-vma_start_write_killable-in-mm-syscalls.patch
mm-khugepaged-use-vma_start_write_killable-in-collapse_huge_page.patch
mm-vma-use-vma_start_write_killable-in-vma-operations.patch
mm-use-vma_start_write_killable-in-process_vma_walk_lock.patch
kvm-ppc-use-vma_start_write_killable-in-kvmppc_memslot_page_merge.patch