From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 706E4C32774 for ; Mon, 22 Aug 2022 19:24:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238700AbiHVTYk (ORCPT ); Mon, 22 Aug 2022 15:24:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54452 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238340AbiHVTWu (ORCPT ); Mon, 22 Aug 2022 15:22:50 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1904252814 for ; Mon, 22 Aug 2022 12:21:50 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id ABF3161243 for ; Mon, 22 Aug 2022 19:21:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id F2CF6C4314A; Mon, 22 Aug 2022 19:21:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1661196109; bh=BAZBEAc8Y2tj3Gc6ETifsvDVd1RS54dcLL8uIY38oFM=; h=Date:To:From:Subject:From; b=wgHzvxChcsbiMjO0P+hbKe9w9lu7sZ8r3jiuQ9T8PBQGKCpCXkJpuARc43zhRs8+u SME7jLF4JaGGCJsnYqPiQvqC9Dr1zrNzfnBXWjZylQW+ORTc8hzxgUFcLIjqcq3Nxe JFKgz9fJT7mlniimlYno6GZFfMZTuibBEWAtqxDM= Date: Mon, 22 Aug 2022 12:21:48 -0700 To: mm-commits@vger.kernel.org, yuzhao@google.com, will@kernel.org, vbabka@suse.cz, svens@linux.ibm.com, sj@kernel.org, Liam.Howlett@Oracle.com, dhowells@redhat.com, david@redhat.com, catalin.marinas@arm.com, willy@infradead.org, akpm@linux-foundation.org From: Andrew Morton Subject: + nommu-remove-uses-of-vma-linked-list.patch added to mm-unstable branch Message-Id: <20220822192148.F2CF6C4314A@smtp.kernel.org> Precedence: bulk Reply-To: linux-kernel@vger.kernel.org List-ID: X-Mailing-List: mm-commits@vger.kernel.org The patch titled Subject: nommu: remove uses of VMA linked list has been added to the -mm mm-unstable branch. Its filename is nommu-remove-uses-of-vma-linked-list.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/nommu-remove-uses-of-vma-linked-list.patch This patch will later appear in the mm-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: "Matthew Wilcox (Oracle)" Subject: nommu: remove uses of VMA linked list Date: Mon, 22 Aug 2022 15:06:32 +0000 Use the maple tree or VMA iterator instead. This is faster and will allow us to shrink the VMA. Link: https://lkml.kernel.org/r/20220822150128.1562046-66-Liam.Howlett@oracle.com Signed-off-by: Matthew Wilcox (Oracle) Signed-off-by: Liam R. Howlett Acked-by: Vlastimil Babka Cc: Catalin Marinas Cc: David Hildenbrand Cc: David Howells Cc: SeongJae Park Cc: Sven Schnelle Cc: Will Deacon Cc: Yu Zhao Signed-off-by: Andrew Morton --- mm/nommu.c | 135 ++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 101 insertions(+), 34 deletions(-) --- a/mm/nommu.c~nommu-remove-uses-of-vma-linked-list +++ a/mm/nommu.c @@ -557,26 +557,14 @@ void vma_mas_remove(struct vm_area_struc mas_store_prealloc(mas, NULL); } -/* - * add a VMA into a process's mm_struct in the appropriate place in the list - * and tree and add to the address space's page tree also if not an anonymous - * page - * - should be called with mm->mmap_lock held writelocked - */ -static void add_vma_to_mm(struct mm_struct *mm, struct vm_area_struct *vma) +static void setup_vma_to_mm(struct vm_area_struct *vma, struct mm_struct *mm) { - struct address_space *mapping; - struct vm_area_struct *prev; - MA_STATE(mas, &mm->mm_mt, vma->vm_start, vma->vm_end); - - BUG_ON(!vma->vm_region); - mm->map_count++; vma->vm_mm = mm; /* add the VMA to the mapping */ if (vma->vm_file) { - mapping = vma->vm_file->f_mapping; + struct address_space *mapping = vma->vm_file->f_mapping; i_mmap_lock_write(mapping); flush_dcache_mmap_lock(mapping); @@ -584,21 +572,52 @@ static void add_vma_to_mm(struct mm_stru flush_dcache_mmap_unlock(mapping); i_mmap_unlock_write(mapping); } +} + +/* + * mas_add_vma_to_mm() - Maple state variant of add_mas_to_mm(). + * @mas: The maple state with preallocations. + * @mm: The mm_struct + * @vma: The vma to add + * + */ +static void mas_add_vma_to_mm(struct ma_state *mas, struct mm_struct *mm, + struct vm_area_struct *vma) +{ + struct vm_area_struct *prev; + + BUG_ON(!vma->vm_region); - prev = mas_prev(&mas, 0); - mas_reset(&mas); + setup_vma_to_mm(vma, mm); + + prev = mas_prev(mas, 0); + mas_reset(mas); /* add the VMA to the tree */ - vma_mas_store(vma, &mas); + vma_mas_store(vma, mas); __vma_link_list(mm, vma, prev); } /* - * delete a VMA from its owning mm_struct and address space + * add a VMA into a process's mm_struct in the appropriate place in the list + * and tree and add to the address space's page tree also if not an anonymous + * page + * - should be called with mm->mmap_lock held writelocked */ -static void delete_vma_from_mm(struct vm_area_struct *vma) +static int add_vma_to_mm(struct mm_struct *mm, struct vm_area_struct *vma) { - MA_STATE(mas, &vma->vm_mm->mm_mt, 0, 0); + MA_STATE(mas, &mm->mm_mt, vma->vm_start, vma->vm_end); + + if (mas_preallocate(&mas, vma, GFP_KERNEL)) { + pr_warn("Allocation of vma tree for process %d failed\n", + current->pid); + return -ENOMEM; + } + mas_add_vma_to_mm(&mas, mm, vma); + return 0; +} +static void cleanup_vma_from_mm(struct vm_area_struct *vma) +{ vma->vm_mm->map_count--; /* remove the VMA from the mapping */ if (vma->vm_file) { @@ -611,10 +630,25 @@ static void delete_vma_from_mm(struct vm flush_dcache_mmap_unlock(mapping); i_mmap_unlock_write(mapping); } +} +/* + * delete a VMA from its owning mm_struct and address space + */ +static int delete_vma_from_mm(struct vm_area_struct *vma) +{ + MA_STATE(mas, &vma->vm_mm->mm_mt, 0, 0); + + if (mas_preallocate(&mas, vma, GFP_KERNEL)) { + pr_warn("Allocation of vma tree for process %d failed\n", + current->pid); + return -ENOMEM; + } + cleanup_vma_from_mm(vma); /* remove from the MM's tree and list */ vma_mas_remove(vma, &mas); __vma_unlink_list(vma->vm_mm, vma); + return 0; } /* @@ -1024,6 +1058,7 @@ unsigned long do_mmap(struct file *file, vm_flags_t vm_flags; unsigned long capabilities, result; int ret; + MA_STATE(mas, ¤t->mm->mm_mt, 0, 0); *populate = 0; @@ -1042,6 +1077,7 @@ unsigned long do_mmap(struct file *file, * now know into VMA flags */ vm_flags = determine_vm_flags(file, prot, flags, capabilities); + /* we're going to need to record the mapping */ region = kmem_cache_zalloc(vm_region_jar, GFP_KERNEL); if (!region) @@ -1051,6 +1087,9 @@ unsigned long do_mmap(struct file *file, if (!vma) goto error_getting_vma; + if (mas_preallocate(&mas, vma, GFP_KERNEL)) + goto error_maple_preallocate; + region->vm_usage = 1; region->vm_flags = vm_flags; region->vm_pgoff = pgoff; @@ -1191,7 +1230,7 @@ unsigned long do_mmap(struct file *file, current->mm->total_vm += len >> PAGE_SHIFT; share: - add_vma_to_mm(current->mm, vma); + mas_add_vma_to_mm(&mas, current->mm, vma); /* we flush the region from the icache only when the first executable * mapping of it is made */ @@ -1217,6 +1256,7 @@ error: sharing_violation: up_write(&nommu_region_sem); + mas_destroy(&mas); pr_warn("Attempt to share mismatched mappings\n"); ret = -EINVAL; goto error; @@ -1233,6 +1273,14 @@ error_getting_region: len, current->pid); show_free_areas(0, NULL); return -ENOMEM; + +error_maple_preallocate: + kmem_cache_free(vm_region_jar, region); + vm_area_free(vma); + pr_warn("Allocation of vma tree for process %d failed\n", current->pid); + show_free_areas(0, NULL); + return -ENOMEM; + } unsigned long ksys_mmap_pgoff(unsigned long addr, unsigned long len, @@ -1298,6 +1346,7 @@ int split_vma(struct mm_struct *mm, stru struct vm_area_struct *new; struct vm_region *region; unsigned long npages; + MA_STATE(mas, &mm->mm_mt, vma->vm_start, vma->vm_end); /* we're only permitted to split anonymous regions (these should have * only a single usage on the region) */ @@ -1333,7 +1382,6 @@ int split_vma(struct mm_struct *mm, stru if (new->vm_ops && new->vm_ops->open) new->vm_ops->open(new); - delete_vma_from_mm(vma); down_write(&nommu_region_sem); delete_nommu_region(vma->vm_region); if (new_below) { @@ -1346,8 +1394,17 @@ int split_vma(struct mm_struct *mm, stru add_nommu_region(vma->vm_region); add_nommu_region(new->vm_region); up_write(&nommu_region_sem); - add_vma_to_mm(mm, vma); - add_vma_to_mm(mm, new); + if (mas_preallocate(&mas, vma, GFP_KERNEL)) { + pr_warn("Allocation of vma tree for process %d failed\n", + current->pid); + return -ENOMEM; + } + + setup_vma_to_mm(vma, mm); + setup_vma_to_mm(new, mm); + mas_set_range(&mas, vma->vm_start, vma->vm_end - 1); + mas_store(&mas, vma); + vma_mas_store(new, &mas); return 0; } @@ -1363,12 +1420,14 @@ static int shrink_vma(struct mm_struct * /* adjust the VMA's pointers, which may reposition it in the MM's tree * and list */ - delete_vma_from_mm(vma); + if (delete_vma_from_mm(vma)) + return -ENOMEM; if (from > vma->vm_start) vma->vm_end = from; else vma->vm_start = to; - add_vma_to_mm(mm, vma); + if (add_vma_to_mm(mm, vma)) + return -ENOMEM; /* cut the backing region down to size */ region = vma->vm_region; @@ -1396,9 +1455,10 @@ static int shrink_vma(struct mm_struct * */ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len, struct list_head *uf) { + MA_STATE(mas, &mm->mm_mt, start, start); struct vm_area_struct *vma; unsigned long end; - int ret; + int ret = 0; len = PAGE_ALIGN(len); if (len == 0) @@ -1407,7 +1467,7 @@ int do_munmap(struct mm_struct *mm, unsi end = start + len; /* find the first potentially overlapping VMA */ - vma = find_vma(mm, start); + vma = mas_find(&mas, end - 1); if (!vma) { static int limit; if (limit < 5) { @@ -1426,7 +1486,7 @@ int do_munmap(struct mm_struct *mm, unsi return -EINVAL; if (end == vma->vm_end) goto erase_whole_vma; - vma = vma->vm_next; + vma = mas_next(&mas, end - 1); } while (vma); return -EINVAL; } else { @@ -1448,9 +1508,10 @@ int do_munmap(struct mm_struct *mm, unsi } erase_whole_vma: - delete_vma_from_mm(vma); + if (delete_vma_from_mm(vma)) + ret = -ENOMEM; delete_vma(mm, vma); - return 0; + return ret; } int vm_munmap(unsigned long addr, size_t len) @@ -1475,6 +1536,7 @@ SYSCALL_DEFINE2(munmap, unsigned long, a */ void exit_mmap(struct mm_struct *mm) { + VMA_ITERATOR(vmi, mm, 0); struct vm_area_struct *vma; if (!mm) @@ -1482,13 +1544,18 @@ void exit_mmap(struct mm_struct *mm) mm->total_vm = 0; - while ((vma = mm->mmap)) { - mm->mmap = vma->vm_next; - delete_vma_from_mm(vma); + /* + * Lock the mm to avoid assert complaining even though this is the only + * user of the mm + */ + mmap_write_lock(mm); + for_each_vma(vmi, vma) { + cleanup_vma_from_mm(vma); delete_vma(mm, vma); cond_resched(); } __mt_destroy(&mm->mm_mt); + mmap_write_unlock(mm); } int vm_brk(unsigned long addr, unsigned long len) _ Patches currently in -mm which might be from willy@infradead.org are support-highmem-pages-in-vmap_pages_range_noflush.patch mm-add-vma-iterator.patch mmap-use-the-vma-iterator-in-count_vma_pages_range.patch proc-remove-vma-rbtree-use-from-nommu.patch arm64-remove-mmap-linked-list-from-vdso.patch parisc-remove-mmap-linked-list-from-cache-handling.patch powerpc-remove-mmap-linked-list-walks.patch s390-remove-vma-linked-list-walks.patch x86-remove-vma-linked-list-walks.patch xtensa-remove-vma-linked-list-walks.patch cxl-remove-vma-linked-list-walk.patch optee-remove-vma-linked-list-walk.patch um-remove-vma-linked-list-walk.patch coredump-remove-vma-linked-list-walk.patch exec-use-vma-iterator-instead-of-linked-list.patch fs-proc-task_mmu-stop-using-linked-list-and-highest_vm_end.patch acct-use-vma-iterator-instead-of-linked-list.patch perf-use-vma-iterator.patch sched-use-maple-tree-iterator-to-walk-vmas.patch fork-use-vma-iterator.patch mm-khugepaged-stop-using-vma-linked-list.patch mm-ksm-use-vma-iterators-instead-of-vma-linked-list.patch mm-mlock-use-vma-iterator-and-maple-state-instead-of-vma-linked-list.patch mm-pagewalk-use-vma_find-instead-of-vma-linked-list.patch i915-use-the-vma-iterator.patch nommu-remove-uses-of-vma-linked-list.patch