From: Qian Cai <quic_qiancai@quicinc.com>
To: Liam Howlett <liam.howlett@oracle.com>
Cc: "maple-tree@lists.infradead.org" <maple-tree@lists.infradead.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v9 28/69] mm/mmap: reorganize munmap to use maple states
Date: Mon, 6 Jun 2022 08:09:24 -0400 [thread overview]
Message-ID: <Yp3udPy0vuDK8khc@qian> (raw)
In-Reply-To: <20220504011345.662299-13-Liam.Howlett@oracle.com>
On Wed, May 04, 2022 at 01:13:53AM +0000, Liam Howlett wrote:
> From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
>
> Remove __do_munmap() in favour of do_munmap(), do_mas_munmap(), and
> do_mas_align_munmap().
>
> do_munmap() is a wrapper to create a maple state for any callers that have
> not been converted to the maple tree.
>
> do_mas_munmap() takes a maple state to mumap a range. This is just a
> small function which checks for error conditions and aligns the end of the
> range.
>
> do_mas_align_munmap() uses the aligned range to mumap a range.
> do_mas_align_munmap() starts with the first VMA in the range, then finds
> the last VMA in the range. Both start and end are split if necessary.
> Then the VMAs are removed from the linked list and the mm mlock count is
> updated at the same time. Followed by a single tree operation of
> overwriting the area in with a NULL. Finally, the detached list is
> unmapped and freed.
>
> By reorganizing the munmap calls as outlined, it is now possible to avoid
> extra work of aligning pre-aligned callers which are known to be safe,
> avoid extra VMA lookups or tree walks for modifications.
>
> detach_vmas_to_be_unmapped() is no longer used, so drop this code.
>
> vm_brk_flags() can just call the do_mas_munmap() as it checks for
> intersecting VMAs directly.
>
> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
...
> +/*
> + * do_mas_align_munmap() - munmap the aligned region from @start to @end.
> + * @mas: The maple_state, ideally set up to alter the correct tree location.
> + * @vma: The starting vm_area_struct
> + * @mm: The mm_struct
> + * @start: The aligned start address to munmap.
> + * @end: The aligned end address to munmap.
> + * @uf: The userfaultfd list_head
> + * @downgrade: Set to true to attempt a write downgrade of the mmap_sem
> + *
> + * If @downgrade is true, check return code for potential release of the lock.
> + */
> +static int
> +do_mas_align_munmap(struct ma_state *mas, struct vm_area_struct *vma,
> + struct mm_struct *mm, unsigned long start,
> + unsigned long end, struct list_head *uf, bool downgrade)
> +{
> + struct vm_area_struct *prev, *last;
> + int error = -ENOMEM;
> + /* we have start < vma->vm_end */
>
> - if (mas_preallocate(&mas, vma, GFP_KERNEL))
> + if (mas_preallocate(mas, vma, GFP_KERNEL))
> return -ENOMEM;
> - prev = vma->vm_prev;
> - /* we have start < vma->vm_end */
>
> + mas->last = end - 1;
> /*
> * If we need to split any vma, do it now to save pain later.
> *
...
> +/*
> + * do_mas_munmap() - munmap a given range.
> + * @mas: The maple state
> + * @mm: The mm_struct
> + * @start: The start address to munmap
> + * @len: The length of the range to munmap
> + * @uf: The userfaultfd list_head
> + * @downgrade: set to true if the user wants to attempt to write_downgrade the
> + * mmap_sem
> + *
> + * This function takes a @mas that is either pointing to the previous VMA or set
> + * to MA_START and sets it up to remove the mapping(s). The @len will be
> + * aligned and any arch_unmap work will be preformed.
> + *
> + * Returns: -EINVAL on failure, 1 on success and unlock, 0 otherwise.
> + */
> +int do_mas_munmap(struct ma_state *mas, struct mm_struct *mm,
> + unsigned long start, size_t len, struct list_head *uf,
> + bool downgrade)
> +{
> + unsigned long end;
> + struct vm_area_struct *vma;
> +
> + if ((offset_in_page(start)) || start > TASK_SIZE || len > TASK_SIZE-start)
> + return -EINVAL;
> +
> + end = start + PAGE_ALIGN(len);
> + if (end == start)
> + return -EINVAL;
> +
> + /* arch_unmap() might do unmaps itself. */
> + arch_unmap(mm, start, end);
> +
> + /* Find the first overlapping VMA */
> + vma = mas_find(mas, end - 1);
> + if (!vma)
> + return 0;
> +
> + return do_mas_align_munmap(mas, vma, mm, start, end, uf, downgrade);
> +}
> +
...
> @@ -2845,11 +2908,12 @@ static int __vm_munmap(unsigned long start, size_t len, bool downgrade)
> int ret;
> struct mm_struct *mm = current->mm;
> LIST_HEAD(uf);
> + MA_STATE(mas, &mm->mm_mt, start, start);
>
> if (mmap_write_lock_killable(mm))
> return -EINTR;
>
> - ret = __do_munmap(mm, start, len, &uf, downgrade);
> + ret = do_mas_munmap(&mas, mm, start, len, &uf, downgrade);
> /*
> * Returning 1 indicates mmap_lock is downgraded.
> * But 1 is not legal return value of vm_munmap() and munmap(), reset
Running a syscall fuzzer for a while could trigger those.
WARNING: CPU: 95 PID: 1329067 at mm/slub.c:3643 kmem_cache_free_bulk
CPU: 95 PID: 1329067 Comm: trinity-c32 Not tainted 5.18.0-next-20220603 #137
pstate: 10400009 (nzcV daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : kmem_cache_free_bulk
lr : mt_destroy_walk
sp : ffff80005ed66bf0
x29: ffff80005ed66bf0 x28: ffff401d6c82f050 x27: 0000000000000000
x26: dfff800000000000 x25: 0000000000000003 x24: 1ffffa97cc5fb120
x23: ffffd4be62fd8760 x22: ffff401d6c82f050 x21: 0000000000000003
x20: 0000000000000000 x19: ffff401d6c82f000 x18: ffffd4be66407d1c
x17: ffff40297ac21f0c x16: 1fffe8016136146b x15: 1fffe806c7d1ad38
x14: 1fffe8016136145e x13: 0000000000000004 x12: ffff70000bdacd8d
x11: 1ffff0000bdacd8c x10: ffff70000bdacd8c x9 : ffffd4be60d633c4
x8 : ffff80005ed66c63 x7 : 0000000000000001 x6 : 0000000000000003
x5 : ffff80005ed66c60 x4 : 0000000000000000 x3 : ffff400b09b09a80
x2 : ffff401d6c82f050 x1 : 0000000000000000 x0 : ffff07ff80014a80
Call trace:
kmem_cache_free_bulk
mt_destroy_walk
mas_wmb_replace
mas_spanning_rebalance.isra.0
mas_wr_spanning_store.isra.0
mas_wr_store_entry.isra.0
mas_store_prealloc
do_mas_align_munmap.constprop.0
do_mas_munmap
__vm_munmap
__arm64_sys_munmap
invoke_syscall
el0_svc_common.constprop.0
do_el0_svc
el0_svc
el0t_64_sync_handler
el0t_64_sync
irq event stamp: 665580
hardirqs last enabled at (665579): kasan_quarantine_put
hardirqs last disabled at (665580): el1_dbg
softirqs last enabled at (664048): __do_softirq
softirqs last disabled at (663831): __irq_exit_rcu
BUG: KASAN: double-free or invalid-free in kmem_cache_free_bulk
CPU: 95 PID: 1329067 Comm: trinity-c32 Tainted: G W 5.18.0-next-20220603 #137
Call trace:
dump_backtrace
show_stack
dump_stack_lvl
print_address_description.constprop.0
print_report
kasan_report_invalid_free
____kasan_slab_free
__kasan_slab_free
slab_free_freelist_hook
kmem_cache_free_bulk
mas_destroy
mas_store_prealloc
do_mas_align_munmap.constprop.0
do_mas_munmap
__vm_munmap
__arm64_sys_munmap
invoke_syscall
el0_svc_common.constprop.0
do_el0_svc
el0_svc
el0t_64_sync_handler
el0t_64_sync
Allocated by task 1329067:
kasan_save_stack
__kasan_slab_alloc
slab_post_alloc_hook
kmem_cache_alloc_bulk
mas_alloc_nodes
mas_preallocate
__vma_adjust
vma_merge
mprotect_fixup
do_mprotect_pkey.constprop.0
__arm64_sys_mprotect
invoke_syscall
el0_svc_common.constprop.0
do_el0_svc
el0_svc
el0t_64_sync_handler
el0t_64_sync
Freed by task 1329067:
kasan_save_stack
kasan_set_track
kasan_set_free_info
____kasan_slab_free
__kasan_slab_free
slab_free_freelist_hook
kmem_cache_free
mt_destroy_walk
mas_wmb_replace
mas_spanning_rebalance.isra.0
mas_wr_spanning_store.isra.0
mas_wr_store_entry.isra.0
mas_store_prealloc
do_mas_align_munmap.constprop.0
do_mas_munmap
__vm_munmap
__arm64_sys_munmap
invoke_syscall
el0_svc_common.constprop.0
do_el0_svc
el0_svc
el0t_64_sync_handler
el0t_64_sync
The buggy address belongs to the object at ffff401d6c82f000
which belongs to the cache maple_node of size 256
The buggy address is located 0 bytes inside of
256-byte region [ffff401d6c82f000, ffff401d6c82f100)
The buggy address belongs to the physical page:
page:fffffd0075b20a00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x401dec828
head:fffffd0075b20a00 order:3 compound_mapcount:0 compound_pincount:0
flags: 0x1bfffc0000010200(slab|head|node=1|zone=2|lastcpupid=0xffff)
raw: 1bfffc0000010200 fffffd00065b2a08 fffffd0006474408 ffff07ff80014a80
raw: 0000000000000000 00000000002a002a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Unmovable, gfp_mask 0x1d20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL), pid 185514, tgid 185514 (trinity-c15), ts 9791681605400, free_ts 9785882037080
post_alloc_hook
get_page_from_freelist
__alloc_pages
alloc_pages
allocate_slab
new_slab
___slab_alloc
__slab_alloc.constprop.0
kmem_cache_alloc
mas_alloc_nodes
mas_preallocate
__vma_adjust
vma_merge
mlock_fixup
apply_mlockall_flags
__arm64_sys_munlockall
page last free stack trace:
free_pcp_prepare
free_unref_page
__free_pages
__free_slab
discard_slab
__slab_free
___cache_free
qlist_free_all
kasan_quarantine_reduce
__kasan_slab_alloc
__kmalloc_node
kvmalloc_node
__slab_free
___cache_free
qlist_free_all
kasan_quarantine_reduce
__kasan_slab_alloc
__kmalloc_node
kvmalloc_node
proc_sys_call_handler
proc_sys_read
new_sync_read
vfs_read
Memory state around the buggy address:
ffff401d6c82ef00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff401d6c82ef80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff401d6c82f000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff401d6c82f080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff401d6c82f100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
next prev parent reply other threads:[~2022-06-06 12:09 UTC|newest]
Thread overview: 83+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-05-04 0:26 [PATCH 0/1] Prepare for maple tree Liam Howlett
2022-05-04 0:26 ` [PATCH 1/1] mips: rename mt_init to mips_mt_init Liam Howlett
2022-05-12 9:54 ` David Hildenbrand
2022-05-04 1:12 ` [PATCH v9 15/69] damon: Convert __damon_va_three_regions to use the VMA iterator Liam Howlett
2022-05-10 10:44 ` SeongJae Park
2022-05-10 16:27 ` Liam Howlett
2022-05-10 19:13 ` Andrew Morton
2022-05-04 1:13 ` [PATCH v9 16/69] proc: remove VMA rbtree use from nommu Liam Howlett
2022-05-04 1:13 ` [PATCH v9 17/69] mm: remove rb tree Liam Howlett
2022-05-04 1:13 ` [PATCH v9 18/69] mmap: change zeroing of maple tree in __vma_adjust() Liam Howlett
2022-05-04 1:13 ` [PATCH v9 19/69] xen: use vma_lookup() in privcmd_ioctl_mmap() Liam Howlett
2022-05-04 1:13 ` [PATCH v9 20/69] mm: optimize find_exact_vma() to use vma_lookup() Liam Howlett
2022-05-04 1:13 ` [PATCH v9 21/69] mm/khugepaged: optimize collapse_pte_mapped_thp() by using vma_lookup() Liam Howlett
2022-05-04 1:13 ` [PATCH v9 22/69] mm/mmap: change do_brk_flags() to expand existing VMA and add do_brk_munmap() Liam Howlett
2022-05-04 1:13 ` [PATCH v9 23/69] mm: use maple tree operations for find_vma_intersection() Liam Howlett
2022-05-04 1:13 ` [PATCH v9 24/69] mm/mmap: use advanced maple tree API for mmap_region() Liam Howlett
2022-05-04 1:13 ` [PATCH v9 25/69] mm: remove vmacache Liam Howlett
2022-05-04 1:13 ` [PATCH v9 27/69] mm/mmap: move mmap_region() below do_munmap() Liam Howlett
2022-05-04 1:13 ` [PATCH v9 28/69] mm/mmap: reorganize munmap to use maple states Liam Howlett
2022-06-06 12:09 ` Qian Cai [this message]
2022-06-06 16:19 ` Liam Howlett
2022-06-06 16:40 ` Qian Cai
2022-06-11 20:11 ` Yu Zhao
2022-06-11 21:49 ` Yu Zhao
2022-06-12 1:09 ` Liam Howlett
2022-06-15 14:25 ` Liam Howlett
2022-06-15 18:07 ` Yu Zhao
2022-06-15 18:55 ` Liam Howlett
2022-06-15 19:05 ` Yu Zhao
2022-06-15 21:16 ` Yu Zhao
2022-06-16 1:50 ` Liam Howlett
2022-06-16 1:58 ` Yu Zhao
2022-06-16 2:56 ` Liam Howlett
2022-06-16 3:02 ` Yu Zhao
2022-06-16 5:45 ` Yu Zhao
2022-06-16 5:55 ` Yu Zhao
2022-06-16 18:26 ` Liam Howlett
2022-06-16 18:34 ` Yu Zhao
2022-06-17 13:49 ` Liam Howlett
2022-05-04 1:13 ` [PATCH v9 26/69] mm: convert vma_lookup() to use mtree_load() Liam Howlett
2022-05-04 1:13 ` [PATCH v9 30/69] arm64: remove mmap linked list from vdso Liam Howlett
2022-05-04 1:13 ` [PATCH v9 29/69] mm/mmap: change do_brk_munmap() to use do_mas_align_munmap() Liam Howlett
2022-05-04 1:13 ` [PATCH v9 31/69] arm64: Change elfcore for_each_mte_vma() to use VMA iterator Liam Howlett
2022-05-04 1:13 ` [PATCH v9 33/69] powerpc: remove mmap linked list walks Liam Howlett
2022-05-04 1:13 ` [PATCH v9 32/69] parisc: remove mmap linked list from cache handling Liam Howlett
2022-05-04 1:13 ` [PATCH v9 34/69] s390: remove vma linked list walks Liam Howlett
2022-05-04 1:13 ` [PATCH v9 36/69] xtensa: " Liam Howlett
2022-05-04 1:13 ` [PATCH v9 35/69] x86: " Liam Howlett
2022-05-04 1:13 ` [PATCH v9 37/69] cxl: remove vma linked list walk Liam Howlett
2022-05-04 1:13 ` [PATCH v9 39/69] um: " Liam Howlett
2022-05-04 1:13 ` [PATCH v9 38/69] optee: " Liam Howlett
2022-05-04 1:13 ` [PATCH v9 41/69] exec: use VMA iterator instead of linked list Liam Howlett
2022-05-04 1:13 ` [PATCH v9 42/69] fs/proc/base: use maple tree iterators in place " Liam Howlett
2022-05-04 1:13 ` [PATCH v9 40/69] coredump: remove vma linked list walk Liam Howlett
2022-05-04 1:13 ` [PATCH v9 45/69] ipc/shm: use VMA iterator instead of linked list Liam Howlett
2022-05-04 1:13 ` [PATCH v9 43/69] fs/proc/task_mmu: stop using linked list and highest_vm_end Liam Howlett
2022-05-04 1:13 ` [PATCH v9 44/69] userfaultfd: use maple tree iterator to iterate VMAs Liam Howlett
2022-05-04 1:14 ` [PATCH v9 46/69] acct: use VMA iterator instead of linked list Liam Howlett
2022-05-04 1:14 ` [PATCH v9 47/69] perf: use VMA iterator Liam Howlett
2022-05-04 1:14 ` [PATCH v9 48/69] sched: use maple tree iterator to walk VMAs Liam Howlett
2022-05-04 1:14 ` [PATCH v9 49/69] fork: use VMA iterator Liam Howlett
2022-05-04 1:14 ` [PATCH v9 50/69] bpf: remove VMA linked list Liam Howlett
2022-05-04 1:14 ` [PATCH v9 52/69] mm/khugepaged: stop using vma " Liam Howlett
2022-05-04 1:14 ` [PATCH v9 53/69] mm/ksm: use vma iterators instead of " Liam Howlett
2022-05-04 1:14 ` [PATCH v9 51/69] mm/gup: use maple tree navigation instead of " Liam Howlett
2022-05-04 1:14 ` [PATCH v9 56/69] mm/mempolicy: use vma iterator & maple state instead of vma " Liam Howlett
2022-05-04 1:14 ` [PATCH v9 55/69] mm/memcontrol: stop using mm->highest_vm_end Liam Howlett
2022-05-04 1:14 ` [PATCH v9 54/69] mm/madvise: use vma_find() instead of vma linked list Liam Howlett
2022-05-04 1:14 ` [PATCH v9 58/69] mm/mprotect: use maple tree navigation " Liam Howlett
2022-05-04 1:14 ` [PATCH v9 57/69] mm/mlock: use vma iterator and maple state " Liam Howlett
2022-05-04 1:14 ` [PATCH v9 59/69] mm/mremap: use vma_find_intersection() " Liam Howlett
2022-05-04 1:14 ` [PATCH v9 60/69] mm/msync: use vma_find() " Liam Howlett
2022-05-04 1:14 ` [PATCH v9 61/69] mm/oom_kill: use maple tree iterators " Liam Howlett
2022-05-04 1:14 ` [PATCH v9 62/69] mm/pagewalk: use vma_find() " Liam Howlett
2022-05-04 1:14 ` [PATCH v9 63/69] mm/swapfile: use vma iterator " Liam Howlett
2022-05-04 1:14 ` [PATCH v9 64/69] i915: use the VMA iterator Liam Howlett
2022-05-04 1:14 ` [PATCH v9 65/69] nommu: remove uses of VMA linked list Liam Howlett
2022-05-04 1:14 ` [PATCH v9 68/69] mm/mmap: drop range_has_overlap() function Liam Howlett
2022-05-04 1:14 ` [PATCH v9 66/69] riscv: use vma iterator for vdso Liam Howlett
2022-05-04 1:14 ` [PATCH v9 67/69] mm: remove the vma linked list Liam Howlett
2022-05-13 13:30 ` Qian Cai
2022-05-13 14:17 ` Liam Howlett
2022-05-04 1:14 ` [PATCH v9 69/69] mm/mmap.c: pass in mapping to __vma_link_file() Liam Howlett
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Yp3udPy0vuDK8khc@qian \
--to=quic_qiancai@quicinc.com \
--cc=akpm@linux-foundation.org \
--cc=liam.howlett@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=maple-tree@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.