From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 2 Apr 2025 15:50:42 -0400
From: Johannes Weiner
To: kernel test robot
Cc: oe-lkp@lists.linux.dev, lkp@intel.com, Andrew Morton,
	Vlastimil Babka, Brendan Jackman, linux-mm@kvack.org
Subject: Re: [linux-next:master] [mm] c2f6ea38fc: vm-scalability.throughput 56.4% regression
Message-ID: <20250402195042.GC198651@cmpxchg.org>
References: <202503271547.fc08b188-lkp@intel.com>
In-Reply-To: <202503271547.fc08b188-lkp@intel.com>
Precedence: bulk
X-Mailing-List: oe-lkp@lists.linux.dev
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hello,

On Thu, Mar 27, 2025 at 04:20:41PM +0800, kernel test robot wrote:
> kernel test robot noticed a 56.4% regression of vm-scalability.throughput on:
>
> commit: c2f6ea38fc1b640aa7a2e155cc1c0410ff91afa2 ("mm: page_alloc: don't steal single pages from biggest buddy")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>
> testcase: vm-scalability
> config: x86_64-rhel-9.4
> compiler: gcc-12
> test machine: 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory
> parameters:
>
> 	runtime: 300s
> 	test: lru-file-mmap-read
> 	cpufreq_governor: performance

Thanks for the report. Would you be able to re-test with the below
patch applied?
There are more details in the thread here:

https://lore.kernel.org/all/20250402194425.GB198651@cmpxchg.org/

It's on top of the following upstream commit:

commit acc4d5ff0b61eb1715c498b6536c38c1feb7f3c1 (origin/master, origin/HEAD)
Merge: 3491aa04787f f278b6d5bb46
Author: Linus Torvalds
Date:   Tue Apr 1 20:00:51 2025 -0700

    Merge tag 'net-6.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Thanks!

---

>From 13433454403e0c6f99ccc3b76c609034fe47e41c Mon Sep 17 00:00:00 2001
From: Johannes Weiner
Date: Wed, 2 Apr 2025 14:23:53 -0400
Subject: [PATCH] mm: page_alloc: speed up fallbacks in rmqueue_bulk()

Not-yet-signed-off-by: Johannes Weiner
---
 mm/page_alloc.c | 100 +++++++++++++++++++++++++++++++++++-------------
 1 file changed, 74 insertions(+), 26 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f51aa6051a99..03b0d45ed45a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2194,11 +2194,11 @@ try_to_claim_block(struct zone *zone, struct page *page,
  * The use of signed ints for order and current_order is a deliberate
  * deviation from the rest of this file, to make the for loop
  * condition simpler.
- *
- * Return the stolen page, or NULL if none can be found.
  */
+
+/* Try to claim a whole foreign block, take a page, expand the remainder */
 static __always_inline struct page *
-__rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
+__rmqueue_claim(struct zone *zone, int order, int start_migratetype,
 						unsigned int alloc_flags)
 {
 	struct free_area *area;
@@ -2236,14 +2236,26 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
 		page = try_to_claim_block(zone, page, current_order, order,
 					  start_migratetype, fallback_mt,
 					  alloc_flags);
-		if (page)
-			goto got_one;
+		if (page) {
+			trace_mm_page_alloc_extfrag(page, order, current_order,
+						    start_migratetype, fallback_mt);
+			return page;
+		}
 	}
 
-	if (alloc_flags & ALLOC_NOFRAGMENT)
-		return NULL;
+	return NULL;
+}
+
+/* Try to steal a single page from a foreign block */
+static __always_inline struct page *
+__rmqueue_steal(struct zone *zone, int order, int start_migratetype)
+{
+	struct free_area *area;
+	int current_order;
+	struct page *page;
+	int fallback_mt;
+	bool claim_block;
 
-	/* No luck claiming pageblock. Find the smallest fallback page */
 	for (current_order = order; current_order < NR_PAGE_ORDERS; current_order++) {
 		area = &(zone->free_area[current_order]);
 		fallback_mt = find_suitable_fallback(area, current_order,
@@ -2253,25 +2265,28 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
 
 		page = get_page_from_free_area(area, fallback_mt);
 		page_del_and_expand(zone, page, order, current_order, fallback_mt);
-		goto got_one;
+		trace_mm_page_alloc_extfrag(page, order, current_order,
+					    start_migratetype, fallback_mt);
+		return page;
 	}
 
 	return NULL;
-
-got_one:
-	trace_mm_page_alloc_extfrag(page, order, current_order,
-				    start_migratetype, fallback_mt);
-
-	return page;
 }
 
+enum rmqueue_mode {
+	RMQUEUE_NORMAL,
+	RMQUEUE_CMA,
+	RMQUEUE_CLAIM,
+	RMQUEUE_STEAL,
+};
+
 /*
  * Do the hard work of removing an element from the buddy allocator.
  * Call me with the zone->lock already held.
  */
 static __always_inline struct page *
 __rmqueue(struct zone *zone, unsigned int order, int migratetype,
-						unsigned int alloc_flags)
+	  unsigned int alloc_flags, enum rmqueue_mode *mode)
 {
 	struct page *page;
 
@@ -2290,16 +2305,47 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
 		}
 	}
 
-	page = __rmqueue_smallest(zone, order, migratetype);
-	if (unlikely(!page)) {
-		if (alloc_flags & ALLOC_CMA)
+	/*
+	 * Try the different freelists, native then foreign.
+	 *
+	 * The fallback logic is expensive and rmqueue_bulk() calls in
+	 * a loop with the zone->lock held, meaning the freelists are
+	 * not subject to any outside changes. Remember in *mode where
+	 * we found pay dirt, to save us the search on the next call.
+	 */
+	switch (*mode) {
+	case RMQUEUE_NORMAL:
+		page = __rmqueue_smallest(zone, order, migratetype);
+		if (page)
+			return page;
+		fallthrough;
+	case RMQUEUE_CMA:
+		if (alloc_flags & ALLOC_CMA) {
 			page = __rmqueue_cma_fallback(zone, order);
-
-		if (!page)
-			page = __rmqueue_fallback(zone, order, migratetype,
-						  alloc_flags);
+			if (page) {
+				*mode = RMQUEUE_CMA;
+				return page;
+			}
+		}
+		fallthrough;
+	case RMQUEUE_CLAIM:
+		page = __rmqueue_claim(zone, order, migratetype, alloc_flags);
+		if (page) {
+			/* Replenished native freelist, back to normal mode */
+			*mode = RMQUEUE_NORMAL;
+			return page;
+		}
+		fallthrough;
+	case RMQUEUE_STEAL:
+		if (!(alloc_flags & ALLOC_NOFRAGMENT)) {
+			page = __rmqueue_steal(zone, order, migratetype);
+			if (page) {
+				*mode = RMQUEUE_STEAL;
+				return page;
+			}
+		}
 	}
-	return page;
+	return NULL;
 }
 
 /*
@@ -2311,6 +2357,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
 			unsigned long count, struct list_head *list,
 			int migratetype, unsigned int alloc_flags)
 {
+	enum rmqueue_mode rmqm = RMQUEUE_NORMAL;
 	unsigned long flags;
 	int i;
 
@@ -2321,7 +2368,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
 	}
 	for (i = 0; i < count; ++i) {
 		struct page *page = __rmqueue(zone, order, migratetype,
-								alloc_flags);
+					      alloc_flags, &rmqm);
 		if (unlikely(page == NULL))
 			break;
 
@@ -2934,6 +2981,7 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
 {
 	struct page *page;
 	unsigned long flags;
+	enum rmqueue_mode rmqm = RMQUEUE_NORMAL;
 
 	do {
 		page = NULL;
@@ -2945,7 +2993,7 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
 		if (alloc_flags & ALLOC_HIGHATOMIC)
 			page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
 		if (!page) {
-			page = __rmqueue(zone, order, migratetype, alloc_flags);
+			page = __rmqueue(zone, order, migratetype, alloc_flags, &rmqm);
 
 			/*
 			 * If the allocation fails, allow OOM handling and
-- 
2.49.0
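
For anyone who wants to poke at the idea outside the kernel, here is a
rough userspace sketch of the mode caching that the __rmqueue() comment
in the patch describes. It is not kernel code: the toy_* names, the
counters standing in for freelists, and the "searches" probe counter
are all made up for illustration, and the ALLOC_CMA / ALLOC_NOFRAGMENT
gating is left out. Plain fall-through comments are used in place of
the kernel's fallthrough pseudo-keyword.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the patch's rmqueue modes; not kernel code. */
enum toy_mode { TOY_NORMAL, TOY_CMA, TOY_CLAIM, TOY_STEAL };

struct toy_zone {
	int native;      /* pages on the native freelist */
	int cma;         /* pages reachable through the CMA fallback */
	int claimable;   /* whole foreign blocks that can be claimed */
	int block_pages; /* pages per claimed block */
	int stealable;   /* single foreign pages that can be stolen */
	int searches;    /* number of freelist probes performed */
};

/* Probe one pool: count the search, take one unit if available. */
static int toy_probe(struct toy_zone *z, int *pool)
{
	z->searches++;
	if (*pool == 0)
		return 0;
	(*pool)--;
	return 1;
}

/* Returns 1 if a page was obtained, 0 if every remaining pool is empty. */
static int toy_rmqueue(struct toy_zone *z, enum toy_mode *mode)
{
	assert(z != NULL && mode != NULL);

	/* Start at the remembered mode and fall through to later ones. */
	switch (*mode) {
	case TOY_NORMAL:
		if (toy_probe(z, &z->native))
			return 1;
		/* fall through */
	case TOY_CMA:
		if (toy_probe(z, &z->cma)) {
			*mode = TOY_CMA;
			return 1;
		}
		/* fall through */
	case TOY_CLAIM:
		if (toy_probe(z, &z->claimable)) {
			/* One page is handed out, the rest refills native. */
			z->native += z->block_pages - 1;
			*mode = TOY_NORMAL;
			return 1;
		}
		/* fall through */
	case TOY_STEAL:
		if (toy_probe(z, &z->stealable)) {
			*mode = TOY_STEAL;
			return 1;
		}
	}
	return 0;
}

/* Bulk loop: the mode persists across iterations, as in rmqueue_bulk(). */
static int toy_rmqueue_bulk(struct toy_zone *z, int count)
{
	enum toy_mode mode = TOY_NORMAL;
	int i;

	for (i = 0; i < count; i++)
		if (!toy_rmqueue(z, &mode))
			break;
	return i;
}
```

With only stealable pages available, a bulk request for 5 pages calling
toy_rmqueue_bulk() probes all four pools once and then goes straight to
the steal pool on the remaining four calls, instead of re-walking the
empty pools every time. A successful claim resets the mode to normal,
since the native pool has just been refilled.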