From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 169E8204012
	for ; Fri, 18 Oct 2024 17:27:05 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1729272426; cv=none;
	b=DSr+7U+D0keuMG5jv/9V/YZf2y0YFozKd1BKhqh02mbSkZReGnq49KZfhfhmhSynDDsnKfGrWMRtzjxOyZvtZtwJZPs9reCkZ+9Y50p+61IKbNAq/VE3qHlDwSW1TCBIiAyL+Tm+9wUmE69qtrF8xZELfMHvVcX4Y9jkM2uYP44=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1729272426; c=relaxed/simple;
	bh=BjT4yUEif/C7ZnutetNR54/CQEs/PxSVu6rHWzDZz40=;
	h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
	 In-Reply-To:To:Cc;
	b=CxlQC+gbUS8zn2KAA5aszBN7bSjgLkd5+1VpidtK/3xQ9hMYmcpw5M+aedmmL219aDVTsslq+efqxdca+KPRinscRh2PydIjIeBfGKWVg5vXctDdPV/5rIt4dOSe6xlkAS3KHTke2y4227wZsGNABNbFalrHN1cAXgoddHZX8AU=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=OiygzGmG;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="OiygzGmG"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7EC77C4CEC3;
	Fri, 18 Oct 2024 17:27:05 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1729272425;
	bh=BjT4yUEif/C7ZnutetNR54/CQEs/PxSVu6rHWzDZz40=;
	h=From:Date:Subject:References:In-Reply-To:To:Cc:From;
	b=OiygzGmG+rsXe5oF+DX14kRuCiXHjanSNf4JE/rhG41Xbo7CUfZqTP6tycZbdqmKy
	 V6kJqLWcJC8uqPNkv/IBsjufGvvfgIpXK6lCREdgTiLEoHBp4S+gBnCT8+O+w/Y/2V
	 VEwLOZDsxbIYZXWNIiATdAShbM54iIO7T8cbZuw2OcijXH0jwlmXDoABx+2g4J4d+1
	 gP3KU+bpzu9Ki4jNgeTfur+LLHCYRVV+r/6UUyYf61jnMYhdp4LFV6RGmEO755q9ob
	 WmhEc7jsSx6XHoiQ3rABvKyYvzzejMGz+0F/R0meGpGMS1ZBnyIP9fFcc2esc6P5j6
	 LA5UZF0NymmcA==
From: chrisl@kernel.org
Date: Fri, 18 Oct 2024 10:27:05 -0700
Subject: [PATCH 6.11.y v2 2/3] mm/codetag: fix pgalloc_tag_split()
Precedence: bulk
X-Mailing-List: stable@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Message-Id: <20241018-stable-yuzhao-v2-2-1fd556716eda@kernel.org>
References: <20241018-stable-yuzhao-v2-0-1fd556716eda@kernel.org>
In-Reply-To: <20241018-stable-yuzhao-v2-0-1fd556716eda@kernel.org>
To: stable@vger.kernel.org
Cc: Greg KH, Muchun Song, Andrew Morton, Yu Zhao, Suren Baghdasaryan,
	Kent Overstreet, Vlastimil Babka, Chris Li
X-Mailer: b4 0.13.0

From: Yu Zhao

[ Upstream commit 95599ef684d01136a8b77c16a7c853496786e173 ]

The current assumption is that a large folio can only be split into
order-0 folios.  That is not the case for hugeTLB demotion, nor for THP
split: see commit c010d47f107f ("mm: thp: split huge page to any lower
order pages").

When a large folio is split into ones of a lower non-zero order, only the
new head pages should be tagged.
Tagging tail pages can cause imbalanced "calls" counters, since only head
pages are untagged by pgalloc_tag_sub() and the "calls" counts on tail
pages are leaked, e.g.,

  # echo 2048kB >/sys/kernel/mm/hugepages/hugepages-1048576kB/demote_size
  # echo 700 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
  # time echo 700 >/sys/kernel/mm/hugepages/hugepages-1048576kB/demote
  # echo 0 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
  # grep alloc_gigantic_folio /proc/allocinfo

Before this patch:
  0  549427200  mm/hugetlb.c:1549 func:alloc_gigantic_folio

  real  0m2.057s
  user  0m0.000s
  sys   0m2.051s

After this patch:
  0          0  mm/hugetlb.c:1549 func:alloc_gigantic_folio

  real  0m1.711s
  user  0m0.000s
  sys   0m1.704s

Not tagging tail pages also improves the splitting time, e.g., by about
15% when demoting 1GB hugeTLB folios to 2MB ones, as shown above.

Link: https://lkml.kernel.org/r/20240906042108.1150526-2-yuzhao@google.com
Fixes: be25d1d4e822 ("mm: create new codetag references during page splitting")
Signed-off-by: Yu Zhao
Acked-by: Suren Baghdasaryan
Cc: Kent Overstreet
Cc: Muchun Song
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Chris Li
---
 include/linux/mm.h          | 30 ++++++++++++++++++++++++++++++
 include/linux/pgalloc_tag.h | 31 -------------------------------
 mm/huge_memory.c            |  2 +-
 mm/page_alloc.c             |  4 ++--
 4 files changed, 33 insertions(+), 34 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1470736017168..8330363126918 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4216,4 +4216,34 @@ void vma_pgtable_walk_end(struct vm_area_struct *vma);
 
 int reserve_mem_find_by_name(const char *name, phys_addr_t *start, phys_addr_t *size);
 
+#ifdef CONFIG_MEM_ALLOC_PROFILING
+static inline void pgalloc_tag_split(struct folio *folio, int old_order, int new_order)
+{
+	int i;
+	struct alloc_tag *tag;
+	unsigned int nr_pages = 1 << new_order;
+
+	if (!mem_alloc_profiling_enabled())
+		return;
+
+	tag = pgalloc_tag_get(&folio->page);
+	if (!tag)
+		return;
+
+	for (i = nr_pages; i < (1 << old_order); i += nr_pages) {
+		union codetag_ref *ref = get_page_tag_ref(folio_page(folio, i));
+
+		if (ref) {
+			/* Set new reference to point to the original tag */
+			alloc_tag_ref_set(ref, tag);
+			put_page_tag_ref(ref);
+		}
+	}
+}
+#else /* !CONFIG_MEM_ALLOC_PROFILING */
+static inline void pgalloc_tag_split(struct folio *folio, int old_order, int new_order)
+{
+}
+#endif /* CONFIG_MEM_ALLOC_PROFILING */
+
 #endif /* _LINUX_MM_H */
diff --git a/include/linux/pgalloc_tag.h b/include/linux/pgalloc_tag.h
index 207f0c83c8e97..59a3deb792a8d 100644
--- a/include/linux/pgalloc_tag.h
+++ b/include/linux/pgalloc_tag.h
@@ -80,36 +80,6 @@ static inline void pgalloc_tag_sub(struct page *page, unsigned int nr)
 	}
 }
 
-static inline void pgalloc_tag_split(struct page *page, unsigned int nr)
-{
-	int i;
-	struct page_ext *first_page_ext;
-	struct page_ext *page_ext;
-	union codetag_ref *ref;
-	struct alloc_tag *tag;
-
-	if (!mem_alloc_profiling_enabled())
-		return;
-
-	first_page_ext = page_ext = page_ext_get(page);
-	if (unlikely(!page_ext))
-		return;
-
-	ref = codetag_ref_from_page_ext(page_ext);
-	if (!ref->ct)
-		goto out;
-
-	tag = ct_to_alloc_tag(ref->ct);
-	page_ext = page_ext_next(page_ext);
-	for (i = 1; i < nr; i++) {
-		/* Set new reference to point to the original tag */
-		alloc_tag_ref_set(codetag_ref_from_page_ext(page_ext), tag);
-		page_ext = page_ext_next(page_ext);
-	}
-out:
-	page_ext_put(first_page_ext);
-}
-
 static inline struct alloc_tag *pgalloc_tag_get(struct page *page)
 {
 	struct alloc_tag *tag = NULL;
@@ -142,7 +112,6 @@ static inline void clear_page_tag_ref(struct page *page) {}
 static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
 				   unsigned int nr) {}
 static inline void pgalloc_tag_sub(struct page *page, unsigned int nr) {}
-static inline void pgalloc_tag_split(struct page *page, unsigned int nr) {}
 static inline struct alloc_tag *pgalloc_tag_get(struct page *page) { return NULL; }
 static inline void pgalloc_tag_sub_pages(struct alloc_tag *tag, unsigned int nr) {}
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 99b146d16a185..837d41906f2ac 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2976,7 +2976,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 	/* Caller disabled irqs, so they are still disabled here */
 
 	split_page_owner(head, order, new_order);
-	pgalloc_tag_split(head, 1 << order);
+	pgalloc_tag_split(folio, order, new_order);
 
 	/* See comment in __split_huge_page_tail() */
 	if (folio_test_anon(folio)) {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 91ace8ca97e21..72b710566cdbc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2764,7 +2764,7 @@ void split_page(struct page *page, unsigned int order)
 	for (i = 1; i < (1 << order); i++)
 		set_page_refcounted(page + i);
 	split_page_owner(page, order, 0);
-	pgalloc_tag_split(page, 1 << order);
+	pgalloc_tag_split(page_folio(page), order, 0);
 	split_page_memcg(page, order, 0);
 }
 EXPORT_SYMBOL_GPL(split_page);
@@ -4950,7 +4950,7 @@ static void *make_alloc_exact(unsigned long addr, unsigned int order,
 		struct page *last = page + nr;
 
 		split_page_owner(page, order, 0);
-		pgalloc_tag_split(page, 1 << order);
+		pgalloc_tag_split(page_folio(page), order, 0);
 		split_page_memcg(page, order, 0);
 		while (page < --last)
 			set_page_refcounted(last);

-- 
2.47.0.rc1.288.g06298d1525-goog
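
For readers who want to check the loop bounds of the new
pgalloc_tag_split() without booting a kernel, here is a minimal userspace
sketch. It is not kernel code and touches no real tag references: it only
counts how many page indices each version of the loop would visit for one
folio. The helper names tagged_before() and tagged_after() are
hypothetical, and the orders used in the note below (18 and 9) assume 4KB
base pages for a 1GB to 2MB demotion.

```c
#include <assert.h>

/*
 * Userspace model only. Counts the pages of a single folio that would
 * receive a new tag reference during a split; no kernel APIs are used.
 */

/* Old loop shape: for (i = 1; i < nr; i++), i.e. every tail page. */
unsigned long tagged_before(int old_order)
{
	assert(old_order >= 0 && old_order < 32);
	return (1UL << old_order) - 1;
}

/*
 * New loop shape: start at the first new head and step by the size of
 * the new folios, so only new head pages are visited.
 */
unsigned long tagged_after(int old_order, int new_order)
{
	unsigned long nr_pages, i, count = 0;

	assert(new_order >= 0 && new_order <= old_order && old_order < 32);
	nr_pages = 1UL << new_order;

	for (i = nr_pages; i < (1UL << old_order); i += nr_pages)
		count++;

	return count;
}
```

Under those assumptions, tagged_after(18, 9) visits indices 512, 1024,
and so on, for 511 new heads, where the old loop visited all 262143 tail
pages. With new_order == 0 the two helpers return the same count, which
matches the split_page() and make_alloc_exact() callers that pass 0.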