* Re: [PATCH v2 15/15] mm/mmap: Move may_expand_vm() check in mmap_region()
@ 2024-06-26 6:56 Bert Karwatzki
0 siblings, 0 replies; 2+ messages in thread
From: Bert Karwatzki @ 2024-06-26 6:56 UTC (permalink / raw)
To: Liam R . Howlett; +Cc: Bert Karwatzki, linux-mm

I have tested the whole set for a few hours now (applied on top of
next-20240625). So far no problems have occurred, and I can open as many
Firefox tabs as I want without crashing.

Bert Karwatzki
* [PATCH v2 00/15] Avoid MAP_FIXED gap exposure
@ 2024-06-25 19:11 Liam R. Howlett
2024-06-25 19:11 ` [PATCH v2 15/15] mm/mmap: Move may_expand_vm() check in mmap_region() Liam R. Howlett
0 siblings, 1 reply; 2+ messages in thread
From: Liam R. Howlett @ 2024-06-25 19:11 UTC (permalink / raw)
To: linux-mm, Andrew Morton
Cc: Suren Baghdasaryan, Vlastimil Babka, Lorenzo Stoakes,
Matthew Wilcox, sidhartha.kumar, Paul E . McKenney,
Bert Karwatzki, Jiri Olsa, linux-kernel, Kees Cook,
Liam R. Howlett
It is now possible to walk the vma tree under the RCU read lock, and
doing so is beneficial because it reduces lock contention. However, an
RCU reader that walks the tree while a MAP_FIXED mapping is being
installed may see a gap in the vma tree that should never logically
exist, and that is never visible when holding the mmap lock in read
mode. The temporal gap exists because mmap_region() calls munmap()
prior to installing the new mapping.
This patch set stops rcu readers from seeing the temporal gap by
splitting up the munmap() function into two parts. The first part
prepares the vma tree for modifications by doing the necessary splits
and tracks the vmas marked for removal in a side tree. The second part
completes the munmapping of the vmas after the vma tree has been
overwritten (either by a MAP_FIXED replacement vma or by a NULL in the
munmap() case).
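The difference between the two orderings can be illustrated with a small
user-space model. This is only a sketch: toy_vma, toy_tree, toy_probe and
the helper functions are hypothetical names invented for illustration, the
"tree" is a flat array of page slots rather than a maple tree, and the
probe stands in for an RCU reader running between two steps of the writer.
None of this is code from the series.

```c
#include <stddef.h>

#define NSLOTS 16	/* toy address space: 16 page slots */

/* Toy stand-in for a vma covering page slots [start, end). */
struct toy_vma {
	size_t start, end;
};

/* Toy stand-in for the vma tree: one pointer per page slot. */
struct toy_tree {
	const struct toy_vma *slot[NSLOTS];
};

/* Models a lockless reader: counts empty slots it sees in [start, end). */
struct toy_probe {
	size_t start, end;
	size_t gaps_seen;
};

static void probe_tree(const struct toy_tree *t, struct toy_probe *p)
{
	for (size_t i = p->start; i < p->end && i < NSLOTS; i++)
		if (!t->slot[i])
			p->gaps_seen++;
}

static void store_range(struct toy_tree *t, size_t start, size_t end,
			const struct toy_vma *v)
{
	for (size_t i = start; i < end && i < NSLOTS; i++)
		t->slot[i] = v;
}

/* Old ordering: clear the range, then install the replacement. */
static void replace_clear_then_store(struct toy_tree *t,
				     const struct toy_vma *repl,
				     struct toy_probe *p)
{
	store_range(t, repl->start, repl->end, NULL);
	probe_tree(t, p);	/* a reader running here sees the gap */
	store_range(t, repl->start, repl->end, repl);
}

/*
 * New ordering: remember the old vmas in a side list, overwrite the range
 * in one step, and leave the teardown of the old vmas to the caller.
 * Returns the number of old vmas recorded in detached[].
 */
static size_t replace_gather_then_store(struct toy_tree *t,
					const struct toy_vma *repl,
					const struct toy_vma **detached,
					size_t max, struct toy_probe *p)
{
	size_t n = 0;

	for (size_t i = repl->start; i < repl->end && i < NSLOTS; i++) {
		const struct toy_vma *old = t->slot[i];

		/* Record each old vma once; neighbours share a pointer. */
		if (old && n < max && (n == 0 || detached[n - 1] != old))
			detached[n++] = old;
	}
	probe_tree(t, p);	/* old vmas are still in the tree */
	store_range(t, repl->start, repl->end, repl);
	probe_tree(t, p);	/* replacement is in the tree */
	return n;
}
```

In this model a probe over the replaced range counts one gap per slot with
the first ordering and none with the second, while the side list still
gives the caller everything it needs to finish tearing down the old vmas.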
Please note that rcu walkers will still be able to see a temporary state
of split vmas that may be in the process of being removed, but the
temporal gap will not be exposed. vma_start_write() is called on both
parts of the split vma, so this state is detectable.
RFC: https://lore.kernel.org/linux-mm/20240531163217.1584450-1-Liam.Howlett@oracle.com/
v1: https://lore.kernel.org/linux-mm/20240611180200.711239-1-Liam.Howlett@oracle.com/
Changes since v1:
- Fixed an issue with expanding vmas clearing the incorrect PTE range.
Thanks to Bert Karwatzki and Jiri Olsa for testing linux-next
- Separated patches into smaller portions for easier reviewing
- Replaced open coded shifting of length with PHYS_PFN()
- Changed security calculation (Cc Kees on this change)
- Changed the vma iterator use to point to the targeted area instead of
the previous vma
- Removed page table entries prior to fully removing the vmas, if a
driver may do mappings
- Tested with stress-ng and splitting/joining of vmas at the start, end,
and middle of a vma.
Liam R. Howlett (15):
mm/mmap: Correctly position vma_iterator in __split_vma()
mm/mmap: Introduce abort_munmap_vmas()
mm/mmap: Introduce vmi_complete_munmap_vmas()
mm/mmap: Extract the gathering of vmas from do_vmi_align_munmap()
mm/mmap: Introduce vma_munmap_struct for use in munmap operations
mm/mmap: Change munmap to use vma_munmap_struct() for accounting and
surrounding vmas
mm/mmap: Extract validate_mm() from vma_complete()
mm/mmap: Inline munmap operation in mmap_region()
mm/mmap: Expand mmap_region() munmap call
mm/mmap: Reposition vma iterator in mmap_region()
mm/mmap: Track start and end of munmap in vma_munmap_struct
mm/mmap: Avoid zeroing vma tree in mmap_region()
mm/mmap: Use PHYS_PFN in mmap_region()
mm/mmap: Use vms accounted pages in mmap_region()
mm/mmap: Move may_expand_vm() check in mmap_region()
mm/internal.h | 25 +++
mm/mmap.c | 449 +++++++++++++++++++++++++++++++-------------------
2 files changed, 304 insertions(+), 170 deletions(-)
--
2.43.0
* [PATCH v2 15/15] mm/mmap: Move may_expand_vm() check in mmap_region()
2024-06-25 19:11 [PATCH v2 00/15] Avoid MAP_FIXED gap exposure Liam R. Howlett
@ 2024-06-25 19:11 ` Liam R. Howlett
0 siblings, 0 replies; 2+ messages in thread
From: Liam R. Howlett @ 2024-06-25 19:11 UTC (permalink / raw)
To: linux-mm, Andrew Morton
Cc: Suren Baghdasaryan, Vlastimil Babka, Lorenzo Stoakes,
Matthew Wilcox, sidhartha.kumar, Paul E . McKenney,
Bert Karwatzki, Jiri Olsa, linux-kernel, Kees Cook,
Liam R. Howlett

From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>

The MAP_FIXED page count is available after the vms_gather_munmap_vmas()
call, so use it instead of looping over the vmas twice.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
---
 mm/mmap.c | 36 ++++--------------------------------
 1 file changed, 4 insertions(+), 32 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index adb0bb5ea344..a310b05a01c2 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -405,27 +405,6 @@ anon_vma_interval_tree_post_update_vma(struct vm_area_struct *vma)
 		anon_vma_interval_tree_insert(avc, &avc->anon_vma->rb_root);
 }
 
-static unsigned long count_vma_pages_range(struct mm_struct *mm,
-		unsigned long addr, unsigned long end,
-		unsigned long *nr_accounted)
-{
-	VMA_ITERATOR(vmi, mm, addr);
-	struct vm_area_struct *vma;
-	unsigned long nr_pages = 0;
-
-	*nr_accounted = 0;
-	for_each_vma_range(vmi, vma, end) {
-		unsigned long vm_start = max(addr, vma->vm_start);
-		unsigned long vm_end = min(end, vma->vm_end);
-
-		nr_pages += PHYS_PFN(vm_end - vm_start);
-		if (vma->vm_flags & VM_ACCOUNT)
-			*nr_accounted += PHYS_PFN(vm_end - vm_start);
-	}
-
-	return nr_pages;
-}
-
 static void __vma_link_file(struct vm_area_struct *vma,
 			    struct address_space *mapping)
 {
@@ -2936,17 +2915,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	pgoff_t vm_pgoff;
 	int error = -ENOMEM;
 	VMA_ITERATOR(vmi, mm, addr);
-	unsigned long nr_pages, nr_accounted;
-
-	nr_pages = count_vma_pages_range(mm, addr, end, &nr_accounted);
-
-	/* Check against address space limit. */
-	/*
-	 * MAP_FIXED may remove pages of mappings that intersects with requested
-	 * mapping. Account for the pages it would unmap.
-	 */
-	if (!may_expand_vm(mm, vm_flags, pglen - nr_pages))
-		return -ENOMEM;
 
 	if (unlikely(!can_modify_mm(mm, addr, end)))
 		return -EPERM;
@@ -2977,6 +2945,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		vma_iter_next_range(&vmi);
 	}
 
+	/* Check against address space limit. */
+	if (!may_expand_vm(mm, vm_flags, pglen - vms.nr_pages))
+		goto abort_munmap;
+
 	/*
 	 * Private writable mapping: check memory availability
 	 */
--
2.43.0
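The arithmetic behind the limit check can be tried out in user space. The
following is a minimal sketch under stated assumptions: toy_range,
overlap_pages() and toy_may_expand() are hypothetical names, PAGE_SHIFT is
assumed to be 12 (4 KiB pages), and toy_may_expand() models only a total
page limit, which is a simplification of what may_expand_vm() checks.
overlap_pages() mirrors the clamping done by the removed
count_vma_pages_range(); in the patch the same count comes from
vms.nr_pages instead of a second walk.

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SHIFT	12	/* assumption: 4 KiB pages */
#define PHYS_PFN(x)	((unsigned long)((x) >> PAGE_SHIFT))

/* Toy stand-in for a vma: the byte range [vm_start, vm_end). */
struct toy_range {
	unsigned long vm_start, vm_end;
};

/* Pages of existing mappings that fall inside [addr, end). */
static unsigned long overlap_pages(const struct toy_range *vmas, size_t n,
				   unsigned long addr, unsigned long end)
{
	unsigned long nr_pages = 0;

	for (size_t i = 0; i < n; i++) {
		/* Clamp each mapping to the requested range. */
		unsigned long s = vmas[i].vm_start > addr ? vmas[i].vm_start : addr;
		unsigned long e = vmas[i].vm_end < end ? vmas[i].vm_end : end;

		if (s < e)
			nr_pages += PHYS_PFN(e - s);
	}
	return nr_pages;
}

/* Simplified limit check: may the mm grow by npages pages? */
static bool toy_may_expand(unsigned long total_vm, unsigned long limit_pages,
			   unsigned long npages)
{
	return total_vm + npages <= limit_pages;
}
```

Because a MAP_FIXED request replaces the pages it overlaps, the net growth
passed to the check is pglen minus the overlapping page count. A request
that would exceed the limit if counted in full can therefore still succeed
when most of its range is already mapped.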