From: Dev Jain <dev.jain@arm.com>
To: akpm@linux-foundation.org
Cc: ryan.roberts@arm.com, david@redhat.com, willy@infradead.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	catalin.marinas@arm.com, will@kernel.org, Liam.Howlett@oracle.com,
	lorenzo.stoakes@oracle.com, vbabka@suse.cz, jannh@google.com,
	anshuman.khandual@arm.com, peterx@redhat.com, joey.gouly@arm.com,
	ioworker0@gmail.com, baohua@kernel.org, kevin.brodsky@arm.com,
	quic_zhenhuah@quicinc.com, christophe.leroy@csgroup.eu,
	yangyicong@hisilicon.com, linux-arm-kernel@lists.infradead.org,
	hughd@google.com, yang@os.amperecomputing.com,
	ziy@nvidia.com, Dev Jain <dev.jain@arm.com>
Subject: [PATCH v5 2/7] mm: Optimize mprotect() for MM_CP_PROT_NUMA by batch-skipping PTEs
Date: Fri, 18 Jul 2025 14:32:39 +0530
Message-Id: <20250718090244.21092-3-dev.jain@arm.com>
X-Mailer: git-send-email 2.39.5 (Apple Git-154)
In-Reply-To: <20250718090244.21092-1-dev.jain@arm.com>
References: <20250718090244.21092-1-dev.jain@arm.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

For the MM_CP_PROT_NUMA skipping case, observe that if we skip an iteration because the underlying folio satisfies one of the skip conditions, then the iteration will also be skipped for all subsequent PTEs mapping the same folio. Therefore, we can optimize by using folio_pte_batch() to skip these iterations in a single batch.

Use prot_numa_skip(), introduced in the previous patch, to determine whether the iteration needs to be skipped. Change its signature to take a double pointer to a folio, which mprotect_folio_pte_batch() then uses to determine the number of iterations we can safely skip.
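To illustrate the control flow, here is a minimal userspace sketch of the batch-skip loop (illustration only, not kernel code: should_skip() and pte_batch() are hypothetical stand-ins for prot_numa_skip() and folio_pte_batch(), and PTEs are modelled simply as an array of folio ids). When the skip predicate fires, the loop advances past the whole run of PTEs mapping the same folio instead of one PTE at a time:

/*
 * Userspace model of the batch-skip idea. A large folio appears as a
 * run of consecutive "ptes" carrying the same folio id; when one pte
 * of such a folio is skipped, the whole run is skipped in one step.
 */
#include <stdio.h>

/* Stand-in for prot_numa_skip(): skip folios with an odd id. */
static int should_skip(int folio_id)
{
	return folio_id & 1;
}

/* Stand-in for folio_pte_batch(): count consecutive ptes of one folio. */
static int pte_batch(const int *ptes, int max_nr)
{
	int nr = 1;

	while (nr < max_nr && ptes[nr] == ptes[0])
		nr++;
	return nr;
}

int main(void)
{
	/* folio 1 spans four ptes, folio 2 spans two, folio 3 one */
	int ptes[] = { 1, 1, 1, 1, 2, 2, 3 };
	int nr_total = sizeof(ptes) / sizeof(ptes[0]);
	int i, nr_ptes;

	for (i = 0; i < nr_total; i += nr_ptes) {
		nr_ptes = 1;
		if (should_skip(ptes[i])) {
			/* batch: skip every pte of this folio at once */
			nr_ptes = pte_batch(&ptes[i], nr_total - i);
			printf("skipped %d pte(s) of folio %d\n",
			       nr_ptes, ptes[i]);
			continue;
		}
		printf("changed protection for pte %d (folio %d)\n",
		       i, ptes[i]);
	}
	return 0;
}

The patch below has the same shape in change_pte_range(): nr_ptes is reset to 1 at the top of each iteration, the batch size is computed only on the skip path, and the loop condition advances pte and addr by nr_ptes.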
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 mm/mprotect.c | 55 +++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 42 insertions(+), 13 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 2a9c73bd0778..97adc62c50ab 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -83,28 +83,43 @@ bool can_change_pte_writable(struct vm_area_struct *vma, unsigned long addr,
 	return pte_dirty(pte);
 }
 
+static int mprotect_folio_pte_batch(struct folio *folio, pte_t *ptep,
+		pte_t pte, int max_nr_ptes)
+{
+	/* No underlying folio, so cannot batch */
+	if (!folio)
+		return 1;
+
+	if (!folio_test_large(folio))
+		return 1;
+
+	return folio_pte_batch(folio, ptep, pte, max_nr_ptes);
+}
+
 static bool prot_numa_skip(struct vm_area_struct *vma, unsigned long addr,
-			   pte_t oldpte, pte_t *pte, int target_node)
+			   pte_t oldpte, pte_t *pte, int target_node,
+			   struct folio **foliop)
 {
-	struct folio *folio;
+	struct folio *folio = NULL;
+	bool ret = true;
 	bool toptier;
 	int nid;
 
 	/* Avoid TLB flush if possible */
 	if (pte_protnone(oldpte))
-		return true;
+		goto skip;
 
 	folio = vm_normal_folio(vma, addr, oldpte);
 	if (!folio)
-		return true;
+		goto skip;
 
 	if (folio_is_zone_device(folio) || folio_test_ksm(folio))
-		return true;
+		goto skip;
 
 	/* Also skip shared copy-on-write pages */
 	if (is_cow_mapping(vma->vm_flags) &&
 	    (folio_maybe_dma_pinned(folio) || folio_maybe_mapped_shared(folio)))
-		return true;
+		goto skip;
 
 	/*
 	 * While migration can move some dirty pages,
@@ -112,7 +127,7 @@ static bool prot_numa_skip(struct vm_area_struct *vma, unsigned long addr,
 	 * context.
 	 */
 	if (folio_is_file_lru(folio) && folio_test_dirty(folio))
-		return true;
+		goto skip;
 
 	/*
 	 * Don't mess with PTEs if page is already on the node
@@ -120,7 +135,7 @@ static bool prot_numa_skip(struct vm_area_struct *vma, unsigned long addr,
 	 */
 	nid = folio_nid(folio);
 	if (target_node == nid)
-		return true;
+		goto skip;
 
 	toptier = node_is_toptier(nid);
 
@@ -129,11 +144,15 @@ static bool prot_numa_skip(struct vm_area_struct *vma, unsigned long addr,
 	 * balancing is disabled
 	 */
 	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_NORMAL) && toptier)
-		return true;
+		goto skip;
 
+	ret = false;
 	if (folio_use_access_time(folio))
 		folio_xchg_access_time(folio, jiffies_to_msecs(jiffies));
-	return false;
+
+skip:
+	*foliop = folio;
+	return ret;
 }
 
 static long change_pte_range(struct mmu_gather *tlb,
@@ -147,6 +166,7 @@ static long change_pte_range(struct mmu_gather *tlb,
 	bool prot_numa = cp_flags & MM_CP_PROT_NUMA;
 	bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
 	bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
+	int nr_ptes;
 
 	tlb_change_page_size(tlb, PAGE_SIZE);
 	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
@@ -161,8 +181,11 @@ static long change_pte_range(struct mmu_gather *tlb,
 	flush_tlb_batched_pending(vma->vm_mm);
 	arch_enter_lazy_mmu_mode();
 	do {
+		nr_ptes = 1;
 		oldpte = ptep_get(pte);
 		if (pte_present(oldpte)) {
+			int max_nr_ptes = (end - addr) >> PAGE_SHIFT;
+			struct folio *folio;
 			pte_t ptent;
 
 			/*
@@ -170,9 +193,15 @@ static long change_pte_range(struct mmu_gather *tlb,
 			 * pages. See similar comment in change_huge_pmd.
 			 */
 			if (prot_numa) {
-				if (prot_numa_skip(vma, addr, oldpte, pte,
-						   target_node))
+				int ret = prot_numa_skip(vma, addr, oldpte, pte,
+							 target_node, &folio);
+				if (ret) {
+
+					/* determine batch to skip */
+					nr_ptes = mprotect_folio_pte_batch(folio,
+							pte, oldpte, max_nr_ptes);
 					continue;
+				}
 			}
 
 			oldpte = ptep_modify_prot_start(vma, addr, pte);
@@ -289,7 +318,7 @@ static long change_pte_range(struct mmu_gather *tlb,
 				pages++;
 			}
 		}
-	} while (pte++, addr += PAGE_SIZE, addr != end);
+	} while (pte += nr_ptes, addr += nr_ptes * PAGE_SIZE, addr != end);
 
 	arch_leave_lazy_mmu_mode();
 	pte_unmap_unlock(pte - 1, ptl);
-- 
2.30.2