From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9D246ECAAD3 for ; Thu, 1 Sep 2022 17:44:01 +0000 (UTC) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4MJT1X2KW0z3fVV for ; Fri, 2 Sep 2022 03:44:00 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20210112 header.b=ObA20LIT; dkim-atps=neutral Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=flex--surenb.bounces.google.com (client-ip=2607:f8b0:4864:20::1149; helo=mail-yw1-x1149.google.com; envelope-from=3e-0qywykdlgqspclzemmejc.amkjglsvnna-bctjgqrq.mxjyzq.mpe@flex--surenb.bounces.google.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20210112 header.b=ObA20LIT; dkim-atps=neutral Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4MJSrF6BN8z3bxt for ; Fri, 2 Sep 2022 03:35:57 +1000 (AEST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-335420c7bfeso234906847b3.16 for ; Thu, 01 Sep 2022 10:35:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date; bh=r773C9YRjCBNvMT4CzLpPbmrmyQxRNwpck2TtfCck1k=; b=ObA20LITrip87VziaHHJxhsQdt90Sartw38u0QjInvigVBBYp9UQt6FfSRSeXBXEZU PLrSrOLMVOwI3/sz3dUO/NCHhkXbIW3GF8M34MWk3xttvYsGitw637zJhtS5J3KbjVTq FLk9jwJNLpWEznbWA1xdOdydu+93KP79dvrPgyK7xujYPxVhRBYXk7sgB7YC6n2PoUnC kIiWKC2Oq5VXot4jMCKtmju4UEkl0k8b2etD7U9K2bdu4kKMwZb5udL9M4YbR2lJrZYA TRfi2Vr1zk+DfsN0D7bvgFUbxdpPUkljp0W3vF7qidlJ9uVGD7Pq+2IeCcnvt21qYUIk bD3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date; bh=r773C9YRjCBNvMT4CzLpPbmrmyQxRNwpck2TtfCck1k=; b=pgkWD/fEXkbXqxxOIfUZT2FaXWR8WonYyFrIxHlowFUfYEUUNPKoIQ/DjGhJG8CzRJ f5uTKPQkO2hi+u+gY4cmMSSITtBPk+FolORjYc1vJkPfsFwaxX9LieomH8cCQp5v4jOg dT03UQs0wqXkEGHtm7E7sVBvwTOStV4uvGAa5JPZwKgUcZIdSMsT2vw1GBaxioW6IT0D 3H9R+gG94DDcpYdsL5VLVx6EIfycybg6S1e5J6RdGEcEyGyyaOFpVoMzjZuDop9pyxv9 GIFkN0vG49s5xe3nXGFC1fY/xQO/eJWVfEFxJ+2RHWujP94iYM/bseWMsi+DYWwdTTEW Rz4w== X-Gm-Message-State: ACgBeo2Jqa6NDER5afWMfU7MrBP8O4A2yqtw/lZC5r2VEwHbyf6SilWS o0Bxx4cS6UV7XpNUpdB8dU3R/yzIpD8= X-Google-Smtp-Source: AA6agR5CArj0XA3uZUba700jvv45FjGtdD6CkgMKCuV3RkIgq+Sm8gt1hyIVkRx9I0S41UWPyjD3FlSg3pw= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:1bfc:e7ee:6530:4449]) (user=surenb job=sendgmr) by 2002:a25:dd05:0:b0:69c:97aa:f08e with SMTP id u5-20020a25dd05000000b0069c97aaf08emr9069882ybg.583.1662053755821; Thu, 01 Sep 2022 10:35:55 -0700 (PDT) Date: Thu, 1 Sep 2022 10:35:01 -0700 In-Reply-To: <20220901173516.702122-1-surenb@google.com> Mime-Version: 1.0 References: <20220901173516.702122-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.789.g6183377224-goog Message-ID: <20220901173516.702122-14-surenb@google.com> Subject: [RFC PATCH RESEND 13/28] mm: conditionally mark VMA as locked in free_pgtables and unmap_page_range From: Suren Baghdasaryan To: akpm@linux-foundation.org Content-Type: text/plain; charset="UTF-8" X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: michel@lespinasse.org, joelaf@google.com, songliubraving@fb.com, mhocko@suse.com, david@redhat.com, peterz@infradead.org, bigeasy@linutronix.de, peterx@redhat.com, dhowells@redhat.com, linux-mm@kvack.org, jglisse@google.com, dave@stgolabs.net, minchan@google.com, x86@kernel.org, hughd@google.com, willy@infradead.org, laurent.dufour@fr.ibm.com, mgorman@suse.de, rientjes@google.com, axelrasmussen@google.com, kernel-team@android.com, paulmck@kernel.org, liam.howlett@oracle.com, luto@kernel.org, ldufour@linux.ibm.com, surenb@google.com, vbabka@suse.cz, linux-arm-kernel@lists.infradead.org, kent.overstreet@linux.dev, linux-kernel@vger.kernel.org, hannes@cmpxchg.org, linuxppc-dev@lists.ozlabs.org Errors-To: linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Sender: "Linuxppc-dev" free_pgtables and unmap_page_range functions can be called with mmap_lock held for write (e.g. in mmap_region), held for read (e.g in madvise_pageout) or not held at all (e.g in madvise_remove might drop mmap_lock before calling vfs_fallocate, which ends up calling unmap_page_range). Provide free_pgtables and unmap_page_range with additional argument indicating whether to mark the VMA as locked or not based on the usage. The parameter is set based on whether mmap_lock is held in write mode during the call. This ensures no change in behavior between mmap_lock and per-vma locks. Signed-off-by: Suren Baghdasaryan --- include/linux/mm.h | 2 +- mm/internal.h | 4 ++-- mm/memory.c | 32 +++++++++++++++++++++----------- mm/mmap.c | 17 +++++++++-------- mm/oom_kill.c | 3 ++- 5 files changed, 35 insertions(+), 23 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 476bf936c5f0..dc72be923e5b 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1874,7 +1874,7 @@ void zap_vma_ptes(struct vm_area_struct *vma, unsigned long address, void zap_page_range(struct vm_area_struct *vma, unsigned long address, unsigned long size); void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *start_vma, - unsigned long start, unsigned long end); + unsigned long start, unsigned long end, bool lock_vma); struct mmu_notifier_range; diff --git a/mm/internal.h b/mm/internal.h index 785409805ed7..e6c0f999e0cb 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -85,14 +85,14 @@ bool __folio_end_writeback(struct folio *folio); void deactivate_file_folio(struct folio *folio); void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma, - unsigned long floor, unsigned long ceiling); + unsigned long floor, unsigned long ceiling, bool lock_vma); void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte); struct zap_details; void unmap_page_range(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long addr, unsigned long end, - struct zap_details *details); + struct zap_details *details, bool lock_vma); void page_cache_ra_order(struct readahead_control *, struct file_ra_state *, unsigned int order); diff --git a/mm/memory.c b/mm/memory.c index 4ba73f5aa8bb..9ac9944e8c62 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -403,7 +403,7 @@ void free_pgd_range(struct mmu_gather *tlb, } void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma, - unsigned long floor, unsigned long ceiling) + unsigned long floor, unsigned long ceiling, bool lock_vma) { while (vma) { struct vm_area_struct *next = vma->vm_next; @@ -413,6 +413,8 @@ void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma, * Hide vma from rmap and truncate_pagecache before freeing * pgtables */ + if (lock_vma) + vma_mark_locked(vma); unlink_anon_vmas(vma); unlink_file_vma(vma); @@ -427,6 +429,8 @@ void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma, && !is_vm_hugetlb_page(next)) { vma = next; next = vma->vm_next; + if (lock_vma) + vma_mark_locked(vma); unlink_anon_vmas(vma); unlink_file_vma(vma); } @@ -1631,12 +1635,16 @@ static inline unsigned long zap_p4d_range(struct mmu_gather *tlb, void unmap_page_range(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long addr, unsigned long end, - struct zap_details *details) + struct zap_details *details, + bool lock_vma) { pgd_t *pgd; unsigned long next; BUG_ON(addr >= end); + if (lock_vma) + vma_mark_locked(vma); + tlb_start_vma(tlb, vma); pgd = pgd_offset(vma->vm_mm, addr); do { @@ -1652,7 +1660,7 @@ void unmap_page_range(struct mmu_gather *tlb, static void unmap_single_vma(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start_addr, unsigned long end_addr, - struct zap_details *details) + struct zap_details *details, bool lock_vma) { unsigned long start = max(vma->vm_start, start_addr); unsigned long end; @@ -1691,7 +1699,7 @@ static void unmap_single_vma(struct mmu_gather *tlb, i_mmap_unlock_write(vma->vm_file->f_mapping); } } else - unmap_page_range(tlb, vma, start, end, details); + unmap_page_range(tlb, vma, start, end, details, lock_vma); } } @@ -1715,7 +1723,7 @@ static void unmap_single_vma(struct mmu_gather *tlb, */ void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start_addr, - unsigned long end_addr) + unsigned long end_addr, bool lock_vma) { struct mmu_notifier_range range; struct zap_details details = { @@ -1728,7 +1736,8 @@ void unmap_vmas(struct mmu_gather *tlb, start_addr, end_addr); mmu_notifier_invalidate_range_start(&range); for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next) - unmap_single_vma(tlb, vma, start_addr, end_addr, &details); + unmap_single_vma(tlb, vma, start_addr, end_addr, &details, + lock_vma); mmu_notifier_invalidate_range_end(&range); } @@ -1753,7 +1762,7 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start, update_hiwater_rss(vma->vm_mm); mmu_notifier_invalidate_range_start(&range); for ( ; vma && vma->vm_start < range.end; vma = vma->vm_next) - unmap_single_vma(&tlb, vma, start, range.end, NULL); + unmap_single_vma(&tlb, vma, start, range.end, NULL, false); mmu_notifier_invalidate_range_end(&range); tlb_finish_mmu(&tlb); } @@ -1768,7 +1777,7 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start, * The range must fit into one VMA. */ static void zap_page_range_single(struct vm_area_struct *vma, unsigned long address, - unsigned long size, struct zap_details *details) + unsigned long size, struct zap_details *details, bool lock_vma) { struct mmu_notifier_range range; struct mmu_gather tlb; @@ -1779,7 +1788,7 @@ static void zap_page_range_single(struct vm_area_struct *vma, unsigned long addr tlb_gather_mmu(&tlb, vma->vm_mm); update_hiwater_rss(vma->vm_mm); mmu_notifier_invalidate_range_start(&range); - unmap_single_vma(&tlb, vma, address, range.end, details); + unmap_single_vma(&tlb, vma, address, range.end, details, lock_vma); mmu_notifier_invalidate_range_end(&range); tlb_finish_mmu(&tlb); } @@ -1802,7 +1811,7 @@ void zap_vma_ptes(struct vm_area_struct *vma, unsigned long address, !(vma->vm_flags & VM_PFNMAP)) return; - zap_page_range_single(vma, address, size, NULL); + zap_page_range_single(vma, address, size, NULL, true); } EXPORT_SYMBOL_GPL(zap_vma_ptes); @@ -3483,7 +3492,8 @@ static void unmap_mapping_range_vma(struct vm_area_struct *vma, unsigned long start_addr, unsigned long end_addr, struct zap_details *details) { - zap_page_range_single(vma, start_addr, end_addr - start_addr, details); + zap_page_range_single(vma, start_addr, end_addr - start_addr, details, + false); } static inline void unmap_mapping_range_tree(struct rb_root_cached *root, diff --git a/mm/mmap.c b/mm/mmap.c index 121544fd90de..094678b4434b 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -79,7 +79,7 @@ core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644); static void unmap_region(struct mm_struct *mm, struct vm_area_struct *vma, struct vm_area_struct *prev, - unsigned long start, unsigned long end); + unsigned long start, unsigned long end, bool lock_vma); static pgprot_t vm_pgprot_modify(pgprot_t oldprot, unsigned long vm_flags) { @@ -1866,7 +1866,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr, vma->vm_file = NULL; /* Undo any partial mapping done by a device driver. */ - unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end); + unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end, true); if (vm_flags & VM_SHARED) mapping_unmap_writable(file->f_mapping); free_vma: @@ -2626,7 +2626,7 @@ static void remove_vma_list(struct mm_struct *mm, struct vm_area_struct *vma) */ static void unmap_region(struct mm_struct *mm, struct vm_area_struct *vma, struct vm_area_struct *prev, - unsigned long start, unsigned long end) + unsigned long start, unsigned long end, bool lock_vma) { struct vm_area_struct *next = vma_next(mm, prev); struct mmu_gather tlb; @@ -2634,9 +2634,10 @@ static void unmap_region(struct mm_struct *mm, lru_add_drain(); tlb_gather_mmu(&tlb, mm); update_hiwater_rss(mm); - unmap_vmas(&tlb, vma, start, end); + unmap_vmas(&tlb, vma, start, end, lock_vma); free_pgtables(&tlb, vma, prev ? prev->vm_end : FIRST_USER_ADDRESS, - next ? next->vm_start : USER_PGTABLES_CEILING); + next ? next->vm_start : USER_PGTABLES_CEILING, + lock_vma); tlb_finish_mmu(&tlb); } @@ -2849,7 +2850,7 @@ int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len, if (downgrade) mmap_write_downgrade(mm); - unmap_region(mm, vma, prev, start, end); + unmap_region(mm, vma, prev, start, end, !downgrade); /* Fix up all other VM information */ remove_vma_list(mm, vma); @@ -3129,8 +3130,8 @@ void exit_mmap(struct mm_struct *mm) tlb_gather_mmu_fullmm(&tlb, mm); /* update_hiwater_rss(mm) here? but nobody should be looking */ /* Use -1 here to ensure all VMAs in the mm are unmapped */ - unmap_vmas(&tlb, vma, 0, -1); - free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING); + unmap_vmas(&tlb, vma, 0, -1, true); + free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING, true); tlb_finish_mmu(&tlb); /* Walk the list again, actually closing and freeing it. */ diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 3c6cf9e3cd66..6ffa7c511aa3 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -549,7 +549,8 @@ bool __oom_reap_task_mm(struct mm_struct *mm) ret = false; continue; } - unmap_page_range(&tlb, vma, range.start, range.end, NULL); + unmap_page_range(&tlb, vma, range.start, range.end, + NULL, false); mmu_notifier_invalidate_range_end(&range); tlb_finish_mmu(&tlb); } -- 2.37.2.789.g6183377224-goog From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 52602ECAAD3 for ; Thu, 1 Sep 2022 18:07:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:Cc:To:From:Subject:Message-ID: References:Mime-Version:In-Reply-To:Date:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=yd/W78/L/8Yafmn93Jggs0IDF1BolvZ79/Fw0fv3k1Y=; b=dX0ju5Ca1wQkRSeAA0b+bBHvH1 WRSdOdhiOgKe3TyJZriKVyanenCvvYYSBq2Kr1g36l/9qPZj5tTiWqPaNpGrrnB2csctVoolBvZOz NZn8ohWycGONqBPr9MKrbm3rxfyMy/SkTM1nOcpYaZ09szIsgr9eM+yb7NvFqn2fjb07NTgAQwy6B OoW5Ao36wb3Yx8wfnwCPjTlWJ4CSQMJqF0bcLHiF9SoI5XSYn2Pz8NK8n6NLHhRdPJMSAAWum4oTd ieMFykxRIB6v0yt9MY7/u/fIyyFxp7qyNYqIPR4RV21ZhxGml1Nmau3wA5NA3rWpiHOr3iY3o8FCq 77DGvh5Q==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1oToZm-00DnOG-Vg; Thu, 01 Sep 2022 18:06:04 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1oToXS-00Dm5L-TP for linux-arm-kernel@bombadil.infradead.org; Thu, 01 Sep 2022 18:03:39 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:Cc:To:From:Subject: Message-ID:References:Mime-Version:In-Reply-To:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=r773C9YRjCBNvMT4CzLpPbmrmyQxRNwpck2TtfCck1k=; b=Jz/EgR46kmXoChR/R6mTWeExUp 8dFoyKZ0VGQA3HzDRj+0ZCMZKkk+HssKyQOeEMQnHmcGu3ytZgR0bBVmWuWB4XjV7Gm0mC6UjPI5u /vlJaPLtO+8nRdQpTRKuV9vvP47H4z5wr67sCKBMVDL7E+Nh2UBAyneXFFZXMpxwtyK8t6MoheO11 GKbk6viOMuELBGPPL57Dm3XtOi46/OeDvoYI3UfRFQycc6KgJMtKqnfA/2GpYk28x2p98pxJrv1Y0 swJ3XM9Xxw9HsB4SwVkTd4gKXbnqu0945h02e2ef4uXYjAki/HF7VgucJ1oYtoJcsIkfpCTvZK/3X Td4QqWqA==; Received: from mail-yw1-x1149.google.com ([2607:f8b0:4864:20::1149]) by desiato.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1oTo6f-008RiZ-9s for linux-arm-kernel@lists.infradead.org; Thu, 01 Sep 2022 17:35:59 +0000 Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-33dc390f26cso233044717b3.9 for ; Thu, 01 Sep 2022 10:35:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date; bh=r773C9YRjCBNvMT4CzLpPbmrmyQxRNwpck2TtfCck1k=; b=ObA20LITrip87VziaHHJxhsQdt90Sartw38u0QjInvigVBBYp9UQt6FfSRSeXBXEZU PLrSrOLMVOwI3/sz3dUO/NCHhkXbIW3GF8M34MWk3xttvYsGitw637zJhtS5J3KbjVTq FLk9jwJNLpWEznbWA1xdOdydu+93KP79dvrPgyK7xujYPxVhRBYXk7sgB7YC6n2PoUnC kIiWKC2Oq5VXot4jMCKtmju4UEkl0k8b2etD7U9K2bdu4kKMwZb5udL9M4YbR2lJrZYA TRfi2Vr1zk+DfsN0D7bvgFUbxdpPUkljp0W3vF7qidlJ9uVGD7Pq+2IeCcnvt21qYUIk bD3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date; bh=r773C9YRjCBNvMT4CzLpPbmrmyQxRNwpck2TtfCck1k=; b=oYvAvCyK3T6yzKSx1ubxubCU+3IOMLINk20f56J4CHo81Lg+MuSKlVn/xcCX4zRpw1 L7Jm0W0R0fkCaDeh8qYHZDan+KpphC+YrVQ0JfRwXVooBepK+AHbIMPNxft99EXhYDpv xUktGBnUbwtlNZ2qSy2DDgkH9dlMIuWDEfE7R+HPJQ89xOr+o2l4R6n4J7qdDTMT/tAs EdoCZWhDI/PzCd8wKXNbR+qnSzeV4mzB9RSdRJNPnD2rMi6vmQhlP0rWlD93dMKdtmRU mY1ZyWyApVk6cHxGYSw5QItPsoSeHS9QlpdyobLqE5CuiuazNm8UUnOl1b0xbpKRnt+5 Oc3A== X-Gm-Message-State: ACgBeo3IoWMP/y+xhAvDtBv2YFcNnt06TlTIdaZNBvRxjul/9LlbAR2h 7x6w9DDF3pF6Tm3D/JRGG5zVegng9OI= X-Google-Smtp-Source: AA6agR5CArj0XA3uZUba700jvv45FjGtdD6CkgMKCuV3RkIgq+Sm8gt1hyIVkRx9I0S41UWPyjD3FlSg3pw= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:1bfc:e7ee:6530:4449]) (user=surenb job=sendgmr) by 2002:a25:dd05:0:b0:69c:97aa:f08e with SMTP id u5-20020a25dd05000000b0069c97aaf08emr9069882ybg.583.1662053755821; Thu, 01 Sep 2022 10:35:55 -0700 (PDT) Date: Thu, 1 Sep 2022 10:35:01 -0700 In-Reply-To: <20220901173516.702122-1-surenb@google.com> Mime-Version: 1.0 References: <20220901173516.702122-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.789.g6183377224-goog Message-ID: <20220901173516.702122-14-surenb@google.com> Subject: [RFC PATCH RESEND 13/28] mm: conditionally mark VMA as locked in free_pgtables and unmap_page_range From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com, peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev, rientjes@google.com, axelrasmussen@google.com, joelaf@google.com, minchan@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, linux-kernel@vger.kernel.org X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220901_183557_500360_5990771A X-CRM114-Status: GOOD ( 19.33 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org free_pgtables and unmap_page_range functions can be called with mmap_lock held for write (e.g. in mmap_region), held for read (e.g in madvise_pageout) or not held at all (e.g in madvise_remove might drop mmap_lock before calling vfs_fallocate, which ends up calling unmap_page_range). Provide free_pgtables and unmap_page_range with additional argument indicating whether to mark the VMA as locked or not based on the usage. The parameter is set based on whether mmap_lock is held in write mode during the call. This ensures no change in behavior between mmap_lock and per-vma locks. Signed-off-by: Suren Baghdasaryan --- include/linux/mm.h | 2 +- mm/internal.h | 4 ++-- mm/memory.c | 32 +++++++++++++++++++++----------- mm/mmap.c | 17 +++++++++-------- mm/oom_kill.c | 3 ++- 5 files changed, 35 insertions(+), 23 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 476bf936c5f0..dc72be923e5b 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1874,7 +1874,7 @@ void zap_vma_ptes(struct vm_area_struct *vma, unsigned long address, void zap_page_range(struct vm_area_struct *vma, unsigned long address, unsigned long size); void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *start_vma, - unsigned long start, unsigned long end); + unsigned long start, unsigned long end, bool lock_vma); struct mmu_notifier_range; diff --git a/mm/internal.h b/mm/internal.h index 785409805ed7..e6c0f999e0cb 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -85,14 +85,14 @@ bool __folio_end_writeback(struct folio *folio); void deactivate_file_folio(struct folio *folio); void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma, - unsigned long floor, unsigned long ceiling); + unsigned long floor, unsigned long ceiling, bool lock_vma); void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte); struct zap_details; void unmap_page_range(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long addr, unsigned long end, - struct zap_details *details); + struct zap_details *details, bool lock_vma); void page_cache_ra_order(struct readahead_control *, struct file_ra_state *, unsigned int order); diff --git a/mm/memory.c b/mm/memory.c index 4ba73f5aa8bb..9ac9944e8c62 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -403,7 +403,7 @@ void free_pgd_range(struct mmu_gather *tlb, } void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma, - unsigned long floor, unsigned long ceiling) + unsigned long floor, unsigned long ceiling, bool lock_vma) { while (vma) { struct vm_area_struct *next = vma->vm_next; @@ -413,6 +413,8 @@ void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma, * Hide vma from rmap and truncate_pagecache before freeing * pgtables */ + if (lock_vma) + vma_mark_locked(vma); unlink_anon_vmas(vma); unlink_file_vma(vma); @@ -427,6 +429,8 @@ void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma, && !is_vm_hugetlb_page(next)) { vma = next; next = vma->vm_next; + if (lock_vma) + vma_mark_locked(vma); unlink_anon_vmas(vma); unlink_file_vma(vma); } @@ -1631,12 +1635,16 @@ static inline unsigned long zap_p4d_range(struct mmu_gather *tlb, void unmap_page_range(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long addr, unsigned long end, - struct zap_details *details) + struct zap_details *details, + bool lock_vma) { pgd_t *pgd; unsigned long next; BUG_ON(addr >= end); + if (lock_vma) + vma_mark_locked(vma); + tlb_start_vma(tlb, vma); pgd = pgd_offset(vma->vm_mm, addr); do { @@ -1652,7 +1660,7 @@ void unmap_page_range(struct mmu_gather *tlb, static void unmap_single_vma(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start_addr, unsigned long end_addr, - struct zap_details *details) + struct zap_details *details, bool lock_vma) { unsigned long start = max(vma->vm_start, start_addr); unsigned long end; @@ -1691,7 +1699,7 @@ static void unmap_single_vma(struct mmu_gather *tlb, i_mmap_unlock_write(vma->vm_file->f_mapping); } } else - unmap_page_range(tlb, vma, start, end, details); + unmap_page_range(tlb, vma, start, end, details, lock_vma); } } @@ -1715,7 +1723,7 @@ static void unmap_single_vma(struct mmu_gather *tlb, */ void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start_addr, - unsigned long end_addr) + unsigned long end_addr, bool lock_vma) { struct mmu_notifier_range range; struct zap_details details = { @@ -1728,7 +1736,8 @@ void unmap_vmas(struct mmu_gather *tlb, start_addr, end_addr); mmu_notifier_invalidate_range_start(&range); for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next) - unmap_single_vma(tlb, vma, start_addr, end_addr, &details); + unmap_single_vma(tlb, vma, start_addr, end_addr, &details, + lock_vma); mmu_notifier_invalidate_range_end(&range); } @@ -1753,7 +1762,7 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start, update_hiwater_rss(vma->vm_mm); mmu_notifier_invalidate_range_start(&range); for ( ; vma && vma->vm_start < range.end; vma = vma->vm_next) - unmap_single_vma(&tlb, vma, start, range.end, NULL); + unmap_single_vma(&tlb, vma, start, range.end, NULL, false); mmu_notifier_invalidate_range_end(&range); tlb_finish_mmu(&tlb); } @@ -1768,7 +1777,7 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start, * The range must fit into one VMA. */ static void zap_page_range_single(struct vm_area_struct *vma, unsigned long address, - unsigned long size, struct zap_details *details) + unsigned long size, struct zap_details *details, bool lock_vma) { struct mmu_notifier_range range; struct mmu_gather tlb; @@ -1779,7 +1788,7 @@ static void zap_page_range_single(struct vm_area_struct *vma, unsigned long addr tlb_gather_mmu(&tlb, vma->vm_mm); update_hiwater_rss(vma->vm_mm); mmu_notifier_invalidate_range_start(&range); - unmap_single_vma(&tlb, vma, address, range.end, details); + unmap_single_vma(&tlb, vma, address, range.end, details, lock_vma); mmu_notifier_invalidate_range_end(&range); tlb_finish_mmu(&tlb); } @@ -1802,7 +1811,7 @@ void zap_vma_ptes(struct vm_area_struct *vma, unsigned long address, !(vma->vm_flags & VM_PFNMAP)) return; - zap_page_range_single(vma, address, size, NULL); + zap_page_range_single(vma, address, size, NULL, true); } EXPORT_SYMBOL_GPL(zap_vma_ptes); @@ -3483,7 +3492,8 @@ static void unmap_mapping_range_vma(struct vm_area_struct *vma, unsigned long start_addr, unsigned long end_addr, struct zap_details *details) { - zap_page_range_single(vma, start_addr, end_addr - start_addr, details); + zap_page_range_single(vma, start_addr, end_addr - start_addr, details, + false); } static inline void unmap_mapping_range_tree(struct rb_root_cached *root, diff --git a/mm/mmap.c b/mm/mmap.c index 121544fd90de..094678b4434b 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -79,7 +79,7 @@ core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644); static void unmap_region(struct mm_struct *mm, struct vm_area_struct *vma, struct vm_area_struct *prev, - unsigned long start, unsigned long end); + unsigned long start, unsigned long end, bool lock_vma); static pgprot_t vm_pgprot_modify(pgprot_t oldprot, unsigned long vm_flags) { @@ -1866,7 +1866,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr, vma->vm_file = NULL; /* Undo any partial mapping done by a device driver. */ - unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end); + unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end, true); if (vm_flags & VM_SHARED) mapping_unmap_writable(file->f_mapping); free_vma: @@ -2626,7 +2626,7 @@ static void remove_vma_list(struct mm_struct *mm, struct vm_area_struct *vma) */ static void unmap_region(struct mm_struct *mm, struct vm_area_struct *vma, struct vm_area_struct *prev, - unsigned long start, unsigned long end) + unsigned long start, unsigned long end, bool lock_vma) { struct vm_area_struct *next = vma_next(mm, prev); struct mmu_gather tlb; @@ -2634,9 +2634,10 @@ static void unmap_region(struct mm_struct *mm, lru_add_drain(); tlb_gather_mmu(&tlb, mm); update_hiwater_rss(mm); - unmap_vmas(&tlb, vma, start, end); + unmap_vmas(&tlb, vma, start, end, lock_vma); free_pgtables(&tlb, vma, prev ? prev->vm_end : FIRST_USER_ADDRESS, - next ? next->vm_start : USER_PGTABLES_CEILING); + next ? next->vm_start : USER_PGTABLES_CEILING, + lock_vma); tlb_finish_mmu(&tlb); } @@ -2849,7 +2850,7 @@ int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len, if (downgrade) mmap_write_downgrade(mm); - unmap_region(mm, vma, prev, start, end); + unmap_region(mm, vma, prev, start, end, !downgrade); /* Fix up all other VM information */ remove_vma_list(mm, vma); @@ -3129,8 +3130,8 @@ void exit_mmap(struct mm_struct *mm) tlb_gather_mmu_fullmm(&tlb, mm); /* update_hiwater_rss(mm) here? but nobody should be looking */ /* Use -1 here to ensure all VMAs in the mm are unmapped */ - unmap_vmas(&tlb, vma, 0, -1); - free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING); + unmap_vmas(&tlb, vma, 0, -1, true); + free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING, true); tlb_finish_mmu(&tlb); /* Walk the list again, actually closing and freeing it. */ diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 3c6cf9e3cd66..6ffa7c511aa3 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -549,7 +549,8 @@ bool __oom_reap_task_mm(struct mm_struct *mm) ret = false; continue; } - unmap_page_range(&tlb, vma, range.start, range.end, NULL); + unmap_page_range(&tlb, vma, range.start, range.end, + NULL, false); mmu_notifier_invalidate_range_end(&range); tlb_finish_mmu(&tlb); } -- 2.37.2.789.g6183377224-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88696ECAAD3 for ; Thu, 1 Sep 2022 17:35:57 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 2A17F80023; Thu, 1 Sep 2022 13:35:57 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1DB828000D; Thu, 1 Sep 2022 13:35:57 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 054D680023; Thu, 1 Sep 2022 13:35:56 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id E6E938000D for ; Thu, 1 Sep 2022 13:35:56 -0400 (EDT) Received: from smtpin07.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id CBF8E1A1048 for ; Thu, 1 Sep 2022 17:35:56 +0000 (UTC) X-FDA: 79864219512.07.3F6A51E Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) by imf19.hostedemail.com (Postfix) with ESMTP id 7BA691A0040 for ; Thu, 1 Sep 2022 17:35:56 +0000 (UTC) Received: by mail-yw1-f201.google.com with SMTP id 00721157ae682-33daeaa6b8eso233295317b3.7 for ; Thu, 01 Sep 2022 10:35:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date; bh=r773C9YRjCBNvMT4CzLpPbmrmyQxRNwpck2TtfCck1k=; b=ObA20LITrip87VziaHHJxhsQdt90Sartw38u0QjInvigVBBYp9UQt6FfSRSeXBXEZU PLrSrOLMVOwI3/sz3dUO/NCHhkXbIW3GF8M34MWk3xttvYsGitw637zJhtS5J3KbjVTq FLk9jwJNLpWEznbWA1xdOdydu+93KP79dvrPgyK7xujYPxVhRBYXk7sgB7YC6n2PoUnC kIiWKC2Oq5VXot4jMCKtmju4UEkl0k8b2etD7U9K2bdu4kKMwZb5udL9M4YbR2lJrZYA TRfi2Vr1zk+DfsN0D7bvgFUbxdpPUkljp0W3vF7qidlJ9uVGD7Pq+2IeCcnvt21qYUIk bD3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date; bh=r773C9YRjCBNvMT4CzLpPbmrmyQxRNwpck2TtfCck1k=; b=dQha4rBdcD1KnryQOplMJ840yTP11L+3oEnd1yh55cuKqjWlvDs+w9MH8nEuWBGK+F tm361EhJF6EH/82KEZ0LR9CwH/Ndx4DN/J9Gm7DVtri/iYr9fEkGj3kq6v6iBBIw8PiZ nm28aUb85ef8f/JbKzkxyQdVDxOYU+z3d6Q8z3LNcg15Qge3Q6X5E5XWdY9ZY0/0otwv 0JvFcz32wUv55prTtnEoDuarn69lx0ctOjnTc0dz8/flB1k3cu2CfdQy0zsqpfSGHha0 wbJT0HunHG2Cq+gWL7XzgT3vRD5nsseeGioT55R3u9H8LkNfw+zwtAWg5RMCrsk2a5FX jQMQ== X-Gm-Message-State: ACgBeo0fz9hmG8rwpN68yRLxNQoGJJyrckzMKcBnNGevLV4rVQE9FCdo Dehf4Zed30Qk1vlREsgnjDJG/mD8sQM= X-Google-Smtp-Source: AA6agR5CArj0XA3uZUba700jvv45FjGtdD6CkgMKCuV3RkIgq+Sm8gt1hyIVkRx9I0S41UWPyjD3FlSg3pw= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:1bfc:e7ee:6530:4449]) (user=surenb job=sendgmr) by 2002:a25:dd05:0:b0:69c:97aa:f08e with SMTP id u5-20020a25dd05000000b0069c97aaf08emr9069882ybg.583.1662053755821; Thu, 01 Sep 2022 10:35:55 -0700 (PDT) Date: Thu, 1 Sep 2022 10:35:01 -0700 In-Reply-To: <20220901173516.702122-1-surenb@google.com> Mime-Version: 1.0 References: <20220901173516.702122-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.789.g6183377224-goog Message-ID: <20220901173516.702122-14-surenb@google.com> Subject: [RFC PATCH RESEND 13/28] mm: conditionally mark VMA as locked in free_pgtables and unmap_page_range From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com, peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev, rientjes@google.com, axelrasmussen@google.com, joelaf@google.com, minchan@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1662053756; a=rsa-sha256; cv=none; b=ETvp4Hav2rJDP0gXSIi7hfLtH9flJBiBj2cncAXtevvhPOmOKO5m6eWJ53BGEyL23QWK1V 150DY3A+dG3r3Tv+lb3Vz2x4f65KtJCLO0B9D++/NGCVnz2S2Y2BYTBFSBmV0EXuBvN2Qx ZwfWnosXfC6hukYl4BeIXkIyQPoUkYQ= ARC-Authentication-Results: i=1; imf19.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=ObA20LIT; spf=pass (imf19.hostedemail.com: domain of 3e-0QYwYKCLgqspclZemmejc.amkjglsv-kkitYai.mpe@flex--surenb.bounces.google.com designates 209.85.128.201 as permitted sender) smtp.mailfrom=3e-0QYwYKCLgqspclZemmejc.amkjglsv-kkitYai.mpe@flex--surenb.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1662053756; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=r773C9YRjCBNvMT4CzLpPbmrmyQxRNwpck2TtfCck1k=; b=3UectlnCLWvAHh+f06yZAp7LBsRCXpQnEDX5fd+6YdCzedBBQoyGEC7v3YLTmlTep5DNpw PsOBVRjWqQ1/xNsNUvUZ4+bxvu572eVU134zRnoz8CxY68MIorzNBM4VBCtf2PsIEgdy6C 3xoBsYiauygYVZUzI8RiaPPg/tqZJ7U= X-Stat-Signature: e9b7beztx8jasi4zk6dxsbd8rcu9gfjo X-Rspam-User: X-Rspamd-Queue-Id: 7BA691A0040 X-Rspamd-Server: rspam07 Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=ObA20LIT; spf=pass (imf19.hostedemail.com: domain of 3e-0QYwYKCLgqspclZemmejc.amkjglsv-kkitYai.mpe@flex--surenb.bounces.google.com designates 209.85.128.201 as permitted sender) smtp.mailfrom=3e-0QYwYKCLgqspclZemmejc.amkjglsv-kkitYai.mpe@flex--surenb.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com X-HE-Tag: 1662053756-247455 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: free_pgtables and unmap_page_range functions can be called with mmap_lock held for write (e.g. in mmap_region), held for read (e.g in madvise_pageout) or not held at all (e.g in madvise_remove might drop mmap_lock before calling vfs_fallocate, which ends up calling unmap_page_range). Provide free_pgtables and unmap_page_range with additional argument indicating whether to mark the VMA as locked or not based on the usage. The parameter is set based on whether mmap_lock is held in write mode during the call. This ensures no change in behavior between mmap_lock and per-vma locks. Signed-off-by: Suren Baghdasaryan --- include/linux/mm.h | 2 +- mm/internal.h | 4 ++-- mm/memory.c | 32 +++++++++++++++++++++----------- mm/mmap.c | 17 +++++++++-------- mm/oom_kill.c | 3 ++- 5 files changed, 35 insertions(+), 23 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 476bf936c5f0..dc72be923e5b 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1874,7 +1874,7 @@ void zap_vma_ptes(struct vm_area_struct *vma, unsigned long address, void zap_page_range(struct vm_area_struct *vma, unsigned long address, unsigned long size); void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *start_vma, - unsigned long start, unsigned long end); + unsigned long start, unsigned long end, bool lock_vma); struct mmu_notifier_range; diff --git a/mm/internal.h b/mm/internal.h index 785409805ed7..e6c0f999e0cb 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -85,14 +85,14 @@ bool __folio_end_writeback(struct folio *folio); void deactivate_file_folio(struct folio *folio); void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma, - unsigned long floor, unsigned long ceiling); + unsigned long floor, unsigned long ceiling, bool lock_vma); void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte); struct zap_details; void unmap_page_range(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long addr, unsigned long end, - struct zap_details *details); + struct zap_details *details, bool lock_vma); void page_cache_ra_order(struct readahead_control *, struct file_ra_state *, unsigned int order); diff --git a/mm/memory.c b/mm/memory.c index 4ba73f5aa8bb..9ac9944e8c62 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -403,7 +403,7 @@ void free_pgd_range(struct mmu_gather *tlb, } void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma, - unsigned long floor, unsigned long ceiling) + unsigned long floor, unsigned long ceiling, bool lock_vma) { while (vma) { struct vm_area_struct *next = vma->vm_next; @@ -413,6 +413,8 @@ void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma, * Hide vma from rmap and truncate_pagecache before freeing * pgtables */ + if (lock_vma) + vma_mark_locked(vma); unlink_anon_vmas(vma); unlink_file_vma(vma); @@ -427,6 +429,8 @@ void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma, && !is_vm_hugetlb_page(next)) { vma = next; next = vma->vm_next; + if (lock_vma) + vma_mark_locked(vma); unlink_anon_vmas(vma); unlink_file_vma(vma); } @@ -1631,12 +1635,16 @@ static inline unsigned long zap_p4d_range(struct mmu_gather *tlb, void unmap_page_range(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long addr, unsigned long end, - struct zap_details *details) + struct zap_details *details, + bool lock_vma) { pgd_t *pgd; unsigned long next; BUG_ON(addr >= end); + if (lock_vma) + vma_mark_locked(vma); + tlb_start_vma(tlb, vma); pgd = pgd_offset(vma->vm_mm, addr); do { @@ -1652,7 +1660,7 @@ void unmap_page_range(struct mmu_gather *tlb, static void unmap_single_vma(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start_addr, unsigned long end_addr, - struct zap_details *details) + struct zap_details *details, bool lock_vma) { unsigned long start = max(vma->vm_start, start_addr); unsigned long end; @@ -1691,7 +1699,7 @@ static void unmap_single_vma(struct mmu_gather *tlb, i_mmap_unlock_write(vma->vm_file->f_mapping); } } else - unmap_page_range(tlb, vma, start, end, details); + unmap_page_range(tlb, vma, start, end, details, lock_vma); } } @@ -1715,7 +1723,7 @@ static void unmap_single_vma(struct mmu_gather *tlb, */ void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start_addr, - unsigned long end_addr) + unsigned long end_addr, bool lock_vma) { struct mmu_notifier_range range; struct zap_details details = { @@ -1728,7 +1736,8 @@ void unmap_vmas(struct mmu_gather *tlb, start_addr, end_addr); mmu_notifier_invalidate_range_start(&range); for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next) - unmap_single_vma(tlb, vma, start_addr, end_addr, &details); + unmap_single_vma(tlb, vma, start_addr, end_addr, &details, + lock_vma); mmu_notifier_invalidate_range_end(&range); } @@ -1753,7 +1762,7 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start, update_hiwater_rss(vma->vm_mm); mmu_notifier_invalidate_range_start(&range); for ( ; vma && vma->vm_start < range.end; vma = vma->vm_next) - unmap_single_vma(&tlb, vma, start, range.end, NULL); + unmap_single_vma(&tlb, vma, start, range.end, NULL, false); mmu_notifier_invalidate_range_end(&range); tlb_finish_mmu(&tlb); } @@ -1768,7 +1777,7 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start, * The range must fit into one VMA. */ static void zap_page_range_single(struct vm_area_struct *vma, unsigned long address, - unsigned long size, struct zap_details *details) + unsigned long size, struct zap_details *details, bool lock_vma) { struct mmu_notifier_range range; struct mmu_gather tlb; @@ -1779,7 +1788,7 @@ static void zap_page_range_single(struct vm_area_struct *vma, unsigned long addr tlb_gather_mmu(&tlb, vma->vm_mm); update_hiwater_rss(vma->vm_mm); mmu_notifier_invalidate_range_start(&range); - unmap_single_vma(&tlb, vma, address, range.end, details); + unmap_single_vma(&tlb, vma, address, range.end, details, lock_vma); mmu_notifier_invalidate_range_end(&range); tlb_finish_mmu(&tlb); } @@ -1802,7 +1811,7 @@ void zap_vma_ptes(struct vm_area_struct *vma, unsigned long address, !(vma->vm_flags & VM_PFNMAP)) return; - zap_page_range_single(vma, address, size, NULL); + zap_page_range_single(vma, address, size, NULL, true); } EXPORT_SYMBOL_GPL(zap_vma_ptes); @@ -3483,7 +3492,8 @@ static void unmap_mapping_range_vma(struct vm_area_struct *vma, unsigned long start_addr, unsigned long end_addr, struct zap_details *details) { - zap_page_range_single(vma, start_addr, end_addr - start_addr, details); + zap_page_range_single(vma, start_addr, end_addr - start_addr, details, + false); } static inline void unmap_mapping_range_tree(struct rb_root_cached *root, diff --git a/mm/mmap.c b/mm/mmap.c index 121544fd90de..094678b4434b 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -79,7 +79,7 @@ core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644); static void unmap_region(struct mm_struct *mm, struct vm_area_struct *vma, struct vm_area_struct *prev, - unsigned long start, unsigned long end); + unsigned long start, unsigned long end, bool lock_vma); static pgprot_t vm_pgprot_modify(pgprot_t oldprot, unsigned long vm_flags) { @@ -1866,7 +1866,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr, vma->vm_file = NULL; /* Undo any partial mapping done by a device driver. */ - unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end); + unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end, true); if (vm_flags & VM_SHARED) mapping_unmap_writable(file->f_mapping); free_vma: @@ -2626,7 +2626,7 @@ static void remove_vma_list(struct mm_struct *mm, struct vm_area_struct *vma) */ static void unmap_region(struct mm_struct *mm, struct vm_area_struct *vma, struct vm_area_struct *prev, - unsigned long start, unsigned long end) + unsigned long start, unsigned long end, bool lock_vma) { struct vm_area_struct *next = vma_next(mm, prev); struct mmu_gather tlb; @@ -2634,9 +2634,10 @@ static void unmap_region(struct mm_struct *mm, lru_add_drain(); tlb_gather_mmu(&tlb, mm); update_hiwater_rss(mm); - unmap_vmas(&tlb, vma, start, end); + unmap_vmas(&tlb, vma, start, end, lock_vma); free_pgtables(&tlb, vma, prev ? prev->vm_end : FIRST_USER_ADDRESS, - next ? next->vm_start : USER_PGTABLES_CEILING); + next ? next->vm_start : USER_PGTABLES_CEILING, + lock_vma); tlb_finish_mmu(&tlb); } @@ -2849,7 +2850,7 @@ int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len, if (downgrade) mmap_write_downgrade(mm); - unmap_region(mm, vma, prev, start, end); + unmap_region(mm, vma, prev, start, end, !downgrade); /* Fix up all other VM information */ remove_vma_list(mm, vma); @@ -3129,8 +3130,8 @@ void exit_mmap(struct mm_struct *mm) tlb_gather_mmu_fullmm(&tlb, mm); /* update_hiwater_rss(mm) here? but nobody should be looking */ /* Use -1 here to ensure all VMAs in the mm are unmapped */ - unmap_vmas(&tlb, vma, 0, -1); - free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING); + unmap_vmas(&tlb, vma, 0, -1, true); + free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING, true); tlb_finish_mmu(&tlb); /* Walk the list again, actually closing and freeing it. */ diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 3c6cf9e3cd66..6ffa7c511aa3 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -549,7 +549,8 @@ bool __oom_reap_task_mm(struct mm_struct *mm) ret = false; continue; } - unmap_page_range(&tlb, vma, range.start, range.end, NULL); + unmap_page_range(&tlb, vma, range.start, range.end, + NULL, false); mmu_notifier_invalidate_range_end(&range); tlb_finish_mmu(&tlb); } -- 2.37.2.789.g6183377224-goog