From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1A7FDE94131 for ; Fri, 6 Oct 2023 21:49:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233806AbjJFVtE (ORCPT ); Fri, 6 Oct 2023 17:49:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51050 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233864AbjJFVs5 (ORCPT ); Fri, 6 Oct 2023 17:48:57 -0400 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B7746CE for ; Fri, 6 Oct 2023 14:48:55 -0700 (PDT) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0019EC433C7; Fri, 6 Oct 2023 21:48:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1696628935; bh=vmupAjxAk6MVnVWrXSI8JLPtmRuingnkzUbq0vYT2JQ=; h=Date:To:From:Subject:From; b=O7l24zOjNXXGnQBorytBKsNciQXBkIjO1Gk1aaXOSTfTFGNuyTDl0esAL823aUr2H C+gstHr18KEEoAPFVsWS7UD2Sabj4udIJs+tY3MQAhuiNeyBNug2OO3UIBoxD0Pkax R21208jL3O5xfbP3B4tn93skV0KAk8NRXXZmm3a8= Date: Fri, 06 Oct 2023 14:48:52 -0700 To: mm-commits@vger.kernel.org, willy@infradead.org, vbabka@suse.cz, sudeep.holla@arm.com, pasha.tatashin@soleen.com, mhocko@suse.com, mgorman@techsingularity.net, jweiner@redhat.com, david@redhat.com, dave.hansen@linux.intel.com, cl@linux.com, arjan@linux.intel.com, ying.huang@intel.com, akpm@linux-foundation.org From: Andrew Morton Subject: [merged mm-stable] mm-page_alloc-scale-the-number-of-pages-that-are-batch-allocated.patch removed from -mm tree Message-Id: <20231006214855.0019EC433C7@smtp.kernel.org> Precedence: bulk Reply-To: linux-kernel@vger.kernel.org List-ID: X-Mailing-List: mm-commits@vger.kernel.org The quilt patch titled Subject: mm, page_alloc: scale the number of pages that are batch allocated has been removed from the -mm tree. Its filename was mm-page_alloc-scale-the-number-of-pages-that-are-batch-allocated.patch This patch was dropped because it was merged into the mm-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Huang Ying Subject: mm, page_alloc: scale the number of pages that are batch allocated Date: Tue, 26 Sep 2023 14:09:06 +0800 When a task is allocating a large number of order-0 pages, it may acquire the zone->lock multiple times allocating pages in batches. This may unnecessarily contend on the zone lock when allocating very large number of pages. This patch adapts the size of the batch based on the recent pattern to scale the batch size for subsequent allocations. On a 2-socket Intel server with 224 logical CPU, we run 8 kbuild instances in parallel (each with `make -j 28`) in 8 cgroup. This simulates the kbuild server that is used by 0-Day kbuild service. With the patch, the cycles% of the spinlock contention (mostly for zone lock) decreases from 11.7% to 10.0% (with PCP size == 361). Link: https://lkml.kernel.org/r/20230926060911.266511-6-ying.huang@intel.com Signed-off-by: "Huang, Ying" Suggested-by: Mel Gorman Cc: Vlastimil Babka Cc: David Hildenbrand Cc: Johannes Weiner Cc: Dave Hansen Cc: Michal Hocko Cc: Pavel Tatashin Cc: Matthew Wilcox Cc: Christoph Lameter Cc: Arjan van de Ven Cc: Sudeep Holla Signed-off-by: Andrew Morton --- include/linux/mmzone.h | 3 +- mm/page_alloc.c | 52 +++++++++++++++++++++++++++++++-------- 2 files changed, 44 insertions(+), 11 deletions(-) --- a/include/linux/mmzone.h~mm-page_alloc-scale-the-number-of-pages-that-are-batch-allocated +++ a/include/linux/mmzone.h @@ -683,9 +683,10 @@ struct per_cpu_pages { int high; /* high watermark, emptying needed */ int batch; /* chunk size for buddy add/remove */ u8 flags; /* protected by pcp->lock */ + u8 alloc_factor; /* batch scaling factor during allocate */ u8 free_factor; /* batch scaling factor during free */ #ifdef CONFIG_NUMA - short expire; /* When 0, remote pagesets are drained */ + u8 expire; /* When 0, remote pagesets are drained */ #endif /* Lists of pages, one per migrate type stored on the pcp-lists */ --- a/mm/page_alloc.c~mm-page_alloc-scale-the-number-of-pages-that-are-batch-allocated +++ a/mm/page_alloc.c @@ -2376,6 +2376,12 @@ static void free_unref_page_commit(struc int pindex; bool free_high = false; + /* + * On freeing, reduce the number of pages that are batch allocated. + * See nr_pcp_alloc() where alloc_factor is increased for subsequent + * allocations. + */ + pcp->alloc_factor >>= 1; __count_vm_events(PGFREE, 1 << order); pindex = order_to_pindex(migratetype, order); list_add(&page->pcp_list, &pcp->lists[pindex]); @@ -2682,6 +2688,41 @@ struct page *rmqueue_buddy(struct zone * return page; } +static int nr_pcp_alloc(struct per_cpu_pages *pcp, int order) +{ + int high, batch, max_nr_alloc; + + high = READ_ONCE(pcp->high); + batch = READ_ONCE(pcp->batch); + + /* Check for PCP disabled or boot pageset */ + if (unlikely(high < batch)) + return 1; + + /* + * Double the number of pages allocated each time there is subsequent + * refiling of order-0 pages without drain. + */ + if (!order) { + max_nr_alloc = max(high - pcp->count - batch, batch); + batch <<= pcp->alloc_factor; + if (batch <= max_nr_alloc && pcp->alloc_factor < PCP_BATCH_SCALE_MAX) + pcp->alloc_factor++; + batch = min(batch, max_nr_alloc); + } + + /* + * Scale batch relative to order if batch implies free pages + * can be stored on the PCP. Batch can be 1 for small zones or + * for boot pagesets which should never store free pages as + * the pages may belong to arbitrary zones. + */ + if (batch > 1) + batch = max(batch >> order, 2); + + return batch; +} + /* Remove page from the per-cpu list, caller must protect the list */ static inline struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order, @@ -2694,18 +2735,9 @@ struct page *__rmqueue_pcplist(struct zo do { if (list_empty(list)) { - int batch = READ_ONCE(pcp->batch); + int batch = nr_pcp_alloc(pcp, order); int alloced; - /* - * Scale batch relative to order if batch implies - * free pages can be stored on the PCP. Batch can - * be 1 for small zones or for boot pagesets which - * should never store free pages as the pages may - * belong to arbitrary zones. - */ - if (batch > 1) - batch = max(batch >> order, 2); alloced = rmqueue_bulk(zone, order, batch, list, migratetype, alloc_flags); _ Patches currently in -mm which might be from ying.huang@intel.com are mm-fix-draining-remote-pageset.patch