All of lore.kernel.org
 help / color / mirror / Atom feed
From: Qian Cai <quic_qiancai@quicinc.com>
To: Liam Howlett <liam.howlett@oracle.com>
Cc: "maple-tree@lists.infradead.org" <maple-tree@lists.infradead.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v9 28/69] mm/mmap: reorganize munmap to use maple states
Date: Mon, 6 Jun 2022 08:09:24 -0400	[thread overview]
Message-ID: <Yp3udPy0vuDK8khc@qian> (raw)
In-Reply-To: <20220504011345.662299-13-Liam.Howlett@oracle.com>

On Wed, May 04, 2022 at 01:13:53AM +0000, Liam Howlett wrote:
> From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
> 
> Remove __do_munmap() in favour of do_munmap(), do_mas_munmap(), and
> do_mas_align_munmap().
> 
> do_munmap() is a wrapper to create a maple state for any callers that have
> not been converted to the maple tree.
> 
> do_mas_munmap() takes a maple state to mumap a range.  This is just a
> small function which checks for error conditions and aligns the end of the
> range.
> 
> do_mas_align_munmap() uses the aligned range to mumap a range.
> do_mas_align_munmap() starts with the first VMA in the range, then finds
> the last VMA in the range.  Both start and end are split if necessary.
> Then the VMAs are removed from the linked list and the mm mlock count is
> updated at the same time.  Followed by a single tree operation of
> overwriting the area in with a NULL.  Finally, the detached list is
> unmapped and freed.
> 
> By reorganizing the munmap calls as outlined, it is now possible to avoid
> extra work of aligning pre-aligned callers which are known to be safe,
> avoid extra VMA lookups or tree walks for modifications.
> 
> detach_vmas_to_be_unmapped() is no longer used, so drop this code.
> 
> vm_brk_flags() can just call the do_mas_munmap() as it checks for
> intersecting VMAs directly.
> 
> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
...
> +/*
> + * do_mas_align_munmap() - munmap the aligned region from @start to @end.
> + * @mas: The maple_state, ideally set up to alter the correct tree location.
> + * @vma: The starting vm_area_struct
> + * @mm: The mm_struct
> + * @start: The aligned start address to munmap.
> + * @end: The aligned end address to munmap.
> + * @uf: The userfaultfd list_head
> + * @downgrade: Set to true to attempt a write downgrade of the mmap_sem
> + *
> + * If @downgrade is true, check return code for potential release of the lock.
> + */
> +static int
> +do_mas_align_munmap(struct ma_state *mas, struct vm_area_struct *vma,
> +		    struct mm_struct *mm, unsigned long start,
> +		    unsigned long end, struct list_head *uf, bool downgrade)
> +{
> +	struct vm_area_struct *prev, *last;
> +	int error = -ENOMEM;
> +	/* we have start < vma->vm_end  */
>  
> -	if (mas_preallocate(&mas, vma, GFP_KERNEL))
> +	if (mas_preallocate(mas, vma, GFP_KERNEL))
>  		return -ENOMEM;
> -	prev = vma->vm_prev;
> -	/* we have start < vma->vm_end  */
>  
> +	mas->last = end - 1;
>  	/*
>  	 * If we need to split any vma, do it now to save pain later.
>  	 *
...
> +/*
> + * do_mas_munmap() - munmap a given range.
> + * @mas: The maple state
> + * @mm: The mm_struct
> + * @start: The start address to munmap
> + * @len: The length of the range to munmap
> + * @uf: The userfaultfd list_head
> + * @downgrade: set to true if the user wants to attempt to write_downgrade the
> + * mmap_sem
> + *
> + * This function takes a @mas that is either pointing to the previous VMA or set
> + * to MA_START and sets it up to remove the mapping(s).  The @len will be
> + * aligned and any arch_unmap work will be preformed.
> + *
> + * Returns: -EINVAL on failure, 1 on success and unlock, 0 otherwise.
> + */
> +int do_mas_munmap(struct ma_state *mas, struct mm_struct *mm,
> +		  unsigned long start, size_t len, struct list_head *uf,
> +		  bool downgrade)
> +{
> +	unsigned long end;
> +	struct vm_area_struct *vma;
> +
> +	if ((offset_in_page(start)) || start > TASK_SIZE || len > TASK_SIZE-start)
> +		return -EINVAL;
> +
> +	end = start + PAGE_ALIGN(len);
> +	if (end == start)
> +		return -EINVAL;
> +
> +	 /* arch_unmap() might do unmaps itself.  */
> +	arch_unmap(mm, start, end);
> +
> +	/* Find the first overlapping VMA */
> +	vma = mas_find(mas, end - 1);
> +	if (!vma)
> +		return 0;
> +
> +	return do_mas_align_munmap(mas, vma, mm, start, end, uf, downgrade);
> +}
> +
...
> @@ -2845,11 +2908,12 @@ static int __vm_munmap(unsigned long start, size_t len, bool downgrade)
>  	int ret;
>  	struct mm_struct *mm = current->mm;
>  	LIST_HEAD(uf);
> +	MA_STATE(mas, &mm->mm_mt, start, start);
>  
>  	if (mmap_write_lock_killable(mm))
>  		return -EINTR;
>  
> -	ret = __do_munmap(mm, start, len, &uf, downgrade);
> +	ret = do_mas_munmap(&mas, mm, start, len, &uf, downgrade);
>  	/*
>  	 * Returning 1 indicates mmap_lock is downgraded.
>  	 * But 1 is not legal return value of vm_munmap() and munmap(), reset

Running a syscall fuzzer for a while could trigger those.

 WARNING: CPU: 95 PID: 1329067 at mm/slub.c:3643 kmem_cache_free_bulk
 CPU: 95 PID: 1329067 Comm: trinity-c32 Not tainted 5.18.0-next-20220603 #137
 pstate: 10400009 (nzcV daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : kmem_cache_free_bulk
 lr : mt_destroy_walk
 sp : ffff80005ed66bf0
 x29: ffff80005ed66bf0 x28: ffff401d6c82f050 x27: 0000000000000000
 x26: dfff800000000000 x25: 0000000000000003 x24: 1ffffa97cc5fb120
 x23: ffffd4be62fd8760 x22: ffff401d6c82f050 x21: 0000000000000003
 x20: 0000000000000000 x19: ffff401d6c82f000 x18: ffffd4be66407d1c
 x17: ffff40297ac21f0c x16: 1fffe8016136146b x15: 1fffe806c7d1ad38
 x14: 1fffe8016136145e x13: 0000000000000004 x12: ffff70000bdacd8d
 x11: 1ffff0000bdacd8c x10: ffff70000bdacd8c x9 : ffffd4be60d633c4
 x8 : ffff80005ed66c63 x7 : 0000000000000001 x6 : 0000000000000003
 x5 : ffff80005ed66c60 x4 : 0000000000000000 x3 : ffff400b09b09a80
 x2 : ffff401d6c82f050 x1 : 0000000000000000 x0 : ffff07ff80014a80
 Call trace:
  kmem_cache_free_bulk
  mt_destroy_walk
  mas_wmb_replace
  mas_spanning_rebalance.isra.0
  mas_wr_spanning_store.isra.0
  mas_wr_store_entry.isra.0
  mas_store_prealloc
  do_mas_align_munmap.constprop.0
  do_mas_munmap
  __vm_munmap
  __arm64_sys_munmap
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync
 irq event stamp: 665580
 hardirqs last  enabled at (665579):  kasan_quarantine_put
 hardirqs last disabled at (665580):  el1_dbg
 softirqs last  enabled at (664048):  __do_softirq
 softirqs last disabled at (663831):  __irq_exit_rcu


 BUG: KASAN: double-free or invalid-free in kmem_cache_free_bulk

 CPU: 95 PID: 1329067 Comm: trinity-c32 Tainted: G        W         5.18.0-next-20220603 #137
 Call trace:
  dump_backtrace
  show_stack
  dump_stack_lvl
  print_address_description.constprop.0
  print_report
  kasan_report_invalid_free
  ____kasan_slab_free
  __kasan_slab_free
  slab_free_freelist_hook
  kmem_cache_free_bulk
  mas_destroy
  mas_store_prealloc
  do_mas_align_munmap.constprop.0
  do_mas_munmap
  __vm_munmap
  __arm64_sys_munmap
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync

 Allocated by task 1329067:
  kasan_save_stack
  __kasan_slab_alloc
  slab_post_alloc_hook
  kmem_cache_alloc_bulk
  mas_alloc_nodes
  mas_preallocate
  __vma_adjust
  vma_merge
  mprotect_fixup
  do_mprotect_pkey.constprop.0
  __arm64_sys_mprotect
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync

 Freed by task 1329067:
  kasan_save_stack
  kasan_set_track
  kasan_set_free_info
  ____kasan_slab_free
  __kasan_slab_free
  slab_free_freelist_hook
  kmem_cache_free
  mt_destroy_walk
  mas_wmb_replace
  mas_spanning_rebalance.isra.0
  mas_wr_spanning_store.isra.0
  mas_wr_store_entry.isra.0
  mas_store_prealloc
  do_mas_align_munmap.constprop.0
  do_mas_munmap
  __vm_munmap
  __arm64_sys_munmap
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync

 The buggy address belongs to the object at ffff401d6c82f000
                which belongs to the cache maple_node of size 256
 The buggy address is located 0 bytes inside of
                256-byte region [ffff401d6c82f000, ffff401d6c82f100)

 The buggy address belongs to the physical page:
 page:fffffd0075b20a00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x401dec828
 head:fffffd0075b20a00 order:3 compound_mapcount:0 compound_pincount:0
 flags: 0x1bfffc0000010200(slab|head|node=1|zone=2|lastcpupid=0xffff)
 raw: 1bfffc0000010200 fffffd00065b2a08 fffffd0006474408 ffff07ff80014a80
 raw: 0000000000000000 00000000002a002a 00000001ffffffff 0000000000000000
 page dumped because: kasan: bad access detected
 page_owner tracks the page as allocated
 page last allocated via order 3, migratetype Unmovable, gfp_mask 0x1d20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL), pid 185514, tgid 185514 (trinity-c15), ts 9791681605400, free_ts 9785882037080
  post_alloc_hook
  get_page_from_freelist
  __alloc_pages
  alloc_pages
  allocate_slab
  new_slab
  ___slab_alloc
  __slab_alloc.constprop.0
  kmem_cache_alloc
  mas_alloc_nodes
  mas_preallocate
  __vma_adjust
  vma_merge
  mlock_fixup
  apply_mlockall_flags
  __arm64_sys_munlockall
 page last free stack trace:
  free_pcp_prepare
  free_unref_page
  __free_pages
  __free_slab
  discard_slab
  __slab_free
  ___cache_free
  qlist_free_all
  kasan_quarantine_reduce
  __kasan_slab_alloc
  __kmalloc_node
  kvmalloc_node
  __slab_free
  ___cache_free
  qlist_free_all
  kasan_quarantine_reduce
  __kasan_slab_alloc
  __kmalloc_node
  kvmalloc_node
  proc_sys_call_handler
  proc_sys_read
  new_sync_read
  vfs_read

 Memory state around the buggy address:
  ffff401d6c82ef00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff401d6c82ef80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 >ffff401d6c82f000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                    ^
  ffff401d6c82f080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff401d6c82f100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc


  reply	other threads:[~2022-06-06 12:09 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-04  0:26 [PATCH 0/1] Prepare for maple tree Liam Howlett
2022-05-04  0:26 ` [PATCH 1/1] mips: rename mt_init to mips_mt_init Liam Howlett
2022-05-12  9:54   ` David Hildenbrand
2022-05-04  1:12 ` [PATCH v9 15/69] damon: Convert __damon_va_three_regions to use the VMA iterator Liam Howlett
2022-05-10 10:44   ` SeongJae Park
2022-05-10 16:27     ` Liam Howlett
2022-05-10 19:13     ` Andrew Morton
2022-05-04  1:13 ` [PATCH v9 16/69] proc: remove VMA rbtree use from nommu Liam Howlett
2022-05-04  1:13   ` [PATCH v9 17/69] mm: remove rb tree Liam Howlett
2022-05-04  1:13   ` [PATCH v9 18/69] mmap: change zeroing of maple tree in __vma_adjust() Liam Howlett
2022-05-04  1:13   ` [PATCH v9 19/69] xen: use vma_lookup() in privcmd_ioctl_mmap() Liam Howlett
2022-05-04  1:13   ` [PATCH v9 20/69] mm: optimize find_exact_vma() to use vma_lookup() Liam Howlett
2022-05-04  1:13   ` [PATCH v9 21/69] mm/khugepaged: optimize collapse_pte_mapped_thp() by using vma_lookup() Liam Howlett
2022-05-04  1:13   ` [PATCH v9 22/69] mm/mmap: change do_brk_flags() to expand existing VMA and add do_brk_munmap() Liam Howlett
2022-05-04  1:13   ` [PATCH v9 23/69] mm: use maple tree operations for find_vma_intersection() Liam Howlett
2022-05-04  1:13   ` [PATCH v9 24/69] mm/mmap: use advanced maple tree API for mmap_region() Liam Howlett
2022-05-04  1:13   ` [PATCH v9 25/69] mm: remove vmacache Liam Howlett
2022-05-04  1:13   ` [PATCH v9 27/69] mm/mmap: move mmap_region() below do_munmap() Liam Howlett
2022-05-04  1:13   ` [PATCH v9 28/69] mm/mmap: reorganize munmap to use maple states Liam Howlett
2022-06-06 12:09     ` Qian Cai [this message]
2022-06-06 16:19       ` Liam Howlett
2022-06-06 16:40         ` Qian Cai
2022-06-11 20:11           ` Yu Zhao
2022-06-11 21:49             ` Yu Zhao
2022-06-12  1:09               ` Liam Howlett
2022-06-15 14:25               ` Liam Howlett
2022-06-15 18:07                 ` Yu Zhao
2022-06-15 18:55                   ` Liam Howlett
2022-06-15 19:05                     ` Yu Zhao
2022-06-15 21:16                       ` Yu Zhao
2022-06-16  1:50                         ` Liam Howlett
2022-06-16  1:58                           ` Yu Zhao
2022-06-16  2:56                             ` Liam Howlett
2022-06-16  3:02                               ` Yu Zhao
2022-06-16  5:45                                 ` Yu Zhao
2022-06-16  5:55                                   ` Yu Zhao
2022-06-16 18:26                                     ` Liam Howlett
2022-06-16 18:34                                       ` Yu Zhao
2022-06-17 13:49                                         ` Liam Howlett
2022-05-04  1:13   ` [PATCH v9 26/69] mm: convert vma_lookup() to use mtree_load() Liam Howlett
2022-05-04  1:13   ` [PATCH v9 30/69] arm64: remove mmap linked list from vdso Liam Howlett
2022-05-04  1:13   ` [PATCH v9 29/69] mm/mmap: change do_brk_munmap() to use do_mas_align_munmap() Liam Howlett
2022-05-04  1:13   ` [PATCH v9 31/69] arm64: Change elfcore for_each_mte_vma() to use VMA iterator Liam Howlett
2022-05-04  1:13   ` [PATCH v9 33/69] powerpc: remove mmap linked list walks Liam Howlett
2022-05-04  1:13   ` [PATCH v9 32/69] parisc: remove mmap linked list from cache handling Liam Howlett
2022-05-04  1:13   ` [PATCH v9 34/69] s390: remove vma linked list walks Liam Howlett
2022-05-04  1:13   ` [PATCH v9 36/69] xtensa: " Liam Howlett
2022-05-04  1:13   ` [PATCH v9 35/69] x86: " Liam Howlett
2022-05-04  1:13   ` [PATCH v9 37/69] cxl: remove vma linked list walk Liam Howlett
2022-05-04  1:13   ` [PATCH v9 39/69] um: " Liam Howlett
2022-05-04  1:13   ` [PATCH v9 38/69] optee: " Liam Howlett
2022-05-04  1:13   ` [PATCH v9 41/69] exec: use VMA iterator instead of linked list Liam Howlett
2022-05-04  1:13   ` [PATCH v9 42/69] fs/proc/base: use maple tree iterators in place " Liam Howlett
2022-05-04  1:13   ` [PATCH v9 40/69] coredump: remove vma linked list walk Liam Howlett
2022-05-04  1:13   ` [PATCH v9 45/69] ipc/shm: use VMA iterator instead of linked list Liam Howlett
2022-05-04  1:13   ` [PATCH v9 43/69] fs/proc/task_mmu: stop using linked list and highest_vm_end Liam Howlett
2022-05-04  1:13   ` [PATCH v9 44/69] userfaultfd: use maple tree iterator to iterate VMAs Liam Howlett
2022-05-04  1:14   ` [PATCH v9 46/69] acct: use VMA iterator instead of linked list Liam Howlett
2022-05-04  1:14   ` [PATCH v9 47/69] perf: use VMA iterator Liam Howlett
2022-05-04  1:14   ` [PATCH v9 48/69] sched: use maple tree iterator to walk VMAs Liam Howlett
2022-05-04  1:14   ` [PATCH v9 49/69] fork: use VMA iterator Liam Howlett
2022-05-04  1:14   ` [PATCH v9 50/69] bpf: remove VMA linked list Liam Howlett
2022-05-04  1:14   ` [PATCH v9 52/69] mm/khugepaged: stop using vma " Liam Howlett
2022-05-04  1:14   ` [PATCH v9 53/69] mm/ksm: use vma iterators instead of " Liam Howlett
2022-05-04  1:14   ` [PATCH v9 51/69] mm/gup: use maple tree navigation instead of " Liam Howlett
2022-05-04  1:14   ` [PATCH v9 56/69] mm/mempolicy: use vma iterator & maple state instead of vma " Liam Howlett
2022-05-04  1:14   ` [PATCH v9 55/69] mm/memcontrol: stop using mm->highest_vm_end Liam Howlett
2022-05-04  1:14   ` [PATCH v9 54/69] mm/madvise: use vma_find() instead of vma linked list Liam Howlett
2022-05-04  1:14   ` [PATCH v9 58/69] mm/mprotect: use maple tree navigation " Liam Howlett
2022-05-04  1:14   ` [PATCH v9 57/69] mm/mlock: use vma iterator and maple state " Liam Howlett
2022-05-04  1:14   ` [PATCH v9 59/69] mm/mremap: use vma_find_intersection() " Liam Howlett
2022-05-04  1:14   ` [PATCH v9 60/69] mm/msync: use vma_find() " Liam Howlett
2022-05-04  1:14   ` [PATCH v9 61/69] mm/oom_kill: use maple tree iterators " Liam Howlett
2022-05-04  1:14   ` [PATCH v9 62/69] mm/pagewalk: use vma_find() " Liam Howlett
2022-05-04  1:14   ` [PATCH v9 63/69] mm/swapfile: use vma iterator " Liam Howlett
2022-05-04  1:14   ` [PATCH v9 64/69] i915: use the VMA iterator Liam Howlett
2022-05-04  1:14   ` [PATCH v9 65/69] nommu: remove uses of VMA linked list Liam Howlett
2022-05-04  1:14   ` [PATCH v9 68/69] mm/mmap: drop range_has_overlap() function Liam Howlett
2022-05-04  1:14   ` [PATCH v9 66/69] riscv: use vma iterator for vdso Liam Howlett
2022-05-04  1:14   ` [PATCH v9 67/69] mm: remove the vma linked list Liam Howlett
2022-05-13 13:30     ` Qian Cai
2022-05-13 14:17       ` Liam Howlett
2022-05-04  1:14   ` [PATCH v9 69/69] mm/mmap.c: pass in mapping to __vma_link_file() Liam Howlett

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yp3udPy0vuDK8khc@qian \
    --to=quic_qiancai@quicinc.com \
    --cc=akpm@linux-foundation.org \
    --cc=liam.howlett@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=maple-tree@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.