Date: Tue, 25 Feb 2025 10:04:41 -0500
From: Johannes Weiner <hannes@cmpxchg.org>
To: Brendan Jackman
Cc: Andrew Morton, Vlastimil Babka, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/3] mm: page_alloc: don't steal single pages from biggest buddy
Message-ID: <20250225150441.GA1499716@cmpxchg.org>
References: <20250225001023.1494422-1-hannes@cmpxchg.org> <20250225001023.1494422-2-hannes@cmpxchg.org>

On Tue, Feb 25, 2025 at 01:34:32PM +0000, Brendan Jackman wrote:
> On Mon, Feb 23, 2025 at 07:08:24PM -0500, Johannes Weiner wrote:
> > The fallback code
> > searches for the biggest buddy first in an attempt
> > to steal the whole block and encourage type grouping down the line.
> >
> > The approach used to be this:
> >
> > - Non-movable requests will split the largest buddy and steal the
> >   remainder. This splits up contiguity, but it allows subsequent
> >   requests of this type to fall back into adjacent space.
> >
> > - Movable requests go and look for the smallest buddy instead. The
> >   thinking is that movable requests can be compacted, so grouping is
> >   less important than retaining contiguity.
> >
> > c0cd6f557b90 ("mm: page_alloc: fix freelist movement during block
> > conversion") enforces freelist type hygiene, which restricts stealing
> > to either claiming the whole block or just taking the requested chunk;
> > no additional pages or buddy remainders can be stolen any more.
> >
> > The patch mishandled when to switch to finding the smallest buddy in
> > that new reality. As a result, it may steal the exact request size,
> > but from the biggest buddy. This causes fracturing for no good reason.
> >
> > Fix this by committing to the new behavior: either steal the whole
> > block, or fall back to the smallest buddy.
> >
> > Remove single-page stealing from steal_suitable_fallback(). Rename it
> > to try_to_steal_block() to make the intentions clear. If this fails,
> > always fall back to the smallest buddy.
>
> Nit - I think the try_to_steal_block() changes could be a separate
> patch, the history might be easier to understand if it went:
>
> [1/N] mm: page_alloc: don't steal single pages from biggest buddy
> [2/N] mm: page_alloc: drop unused logic in steal_suitable_fallback()

There are several ways in which steal_suitable_fallback() could end up
taking a single page, and I'd have to mirror all those conditions in
the caller if I wanted to prevent this. That would be too convoluted.
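As an aside, the policy from the commit message above can be modeled
outside the kernel. The sketch below is a hypothetical simulation, not
mm/page_alloc.c code -- `zone_model`, `fallback_pick_order`, and
`block_order` are made-up names for illustration: a claimable whole
block (found largest-first) wins; otherwise the smallest suitable
buddy is split, and the biggest one is left intact.

```c
#define MAX_ORDER 10

/* Hypothetical model: freelist[o] counts free buddies of order o. */
struct zone_model {
	int freelist[MAX_ORDER + 1];
};

/*
 * Sketch of the fixed fallback policy (illustrative, not kernel code):
 * pass 1 scans down from the largest order for a whole block to claim;
 * if none exists, pass 2 scans *up* from the requested order and picks
 * the smallest suitable buddy instead of fracturing the biggest one.
 */
static int fallback_pick_order(struct zone_model *z, int order,
			       int block_order)
{
	/* Pass 1: only whole blocks (order >= block_order) are claimed. */
	for (int o = MAX_ORDER; o >= block_order; o--)
		if (z->freelist[o])
			return o;

	/* Pass 2: no claimable block; take the smallest buddy that fits. */
	for (int o = order; o <= MAX_ORDER; o++)
		if (z->freelist[o])
			return o;

	return -1; /* nothing available */
}
```

With a free order-8 buddy and a free order-2 buddy but no whole block,
this model splits the order-2 buddy; the pre-fix code would have taken
the request-sized chunk out of the order-8 one.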
> > static __always_inline struct page *
> > __rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
> > @@ -2291,45 +2289,35 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
> >  		if (fallback_mt == -1)
> >  			continue;
> >
> > -		/*
> > -		 * We cannot steal all free pages from the pageblock and the
> > -		 * requested migratetype is movable. In that case it's better to
> > -		 * steal and split the smallest available page instead of the
> > -		 * largest available page, because even if the next movable
> > -		 * allocation falls back into a different pageblock than this
> > -		 * one, it won't cause permanent fragmentation.
> > -		 */
> > -		if (!can_steal && start_migratetype == MIGRATE_MOVABLE
> > -					&& current_order > order)
> > -			goto find_smallest;
> > +		if (!can_steal)
> > +			break;
> >
> > -		goto do_steal;
> > +		page = get_page_from_free_area(area, fallback_mt);
> > +		page = try_to_steal_block(zone, page, current_order, order,
> > +					  start_migratetype, alloc_flags);
> > +		if (page)
> > +			goto got_one;
> > 	}
> >
> > -	return NULL;
> > +	if (alloc_flags & ALLOC_NOFRAGMENT)
> > +		return NULL;
>
> Is this a separate change? Is it a bug that we currently allow
> stealing from a fallback type when ALLOC_NOFRAGMENT? (I wonder if
> the second loop was supposed to start from min_order).

No, I don't see how we could hit that right now.

With NOFRAGMENT, the first loop scans whole free blocks only, which,
if present, are always stealable. If there are no blocks, the loop
continues through all the fallback_mt == -1 cases and then the
function returns NULL.

Only without NOFRAGMENT does it run into !can_steal buddies.

IOW, the control flow implicit in min_order, can_steal and the gotos
would make it honor NOFRAGMENT - albeit in a fairly non-obvious way.

The code is just a bit odd. While the function currently looks like
it's two loops following each other, this isn't how it's actually
executed. Instead, the first loop is the main sequence of the function.
The second loop is entered only from a jump in the main loop under
certain conditions, more akin to a function call.

I'm changing the sequence so that all types fall back to the smallest
buddy if stealing a block fails. The easiest way to express that is
removing the find_smallest jump and having the loops *actually* follow
each other as the main sequence of this function. For that, I need to
make that implicit NOFRAGMENT behavior explicit.
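For illustration, the restructured sequence described here can be
sketched as a standalone model. All names below are made up for the
sketch (`zone_model2`, `rmqueue_fallback_model`, with a plain bool
standing in for ALLOC_NOFRAGMENT); it is not the kernel's actual code.
Loop 1 claims whole blocks largest-first, the previously implicit
NOFRAGMENT bail-out sits explicitly between the loops, and loop 2
splits the smallest suitable buddy.

```c
#include <stdbool.h>

#define MAX_ORDER 10

/* Hypothetical model: freelist[o] counts free buddies of order o. */
struct zone_model2 {
	int freelist[MAX_ORDER + 1];
};

static int rmqueue_fallback_model(struct zone_model2 *z, int order,
				  int block_order, bool nofragment)
{
	/* Loop 1: the main sequence -- whole-block claims, largest-first. */
	for (int o = MAX_ORDER; o >= block_order; o--)
		if (z->freelist[o])
			return o;

	/*
	 * Previously implicit in min_order/can_steal and the gotos, now
	 * explicit: a NOFRAGMENT request never falls through to splitting.
	 */
	if (nofragment)
		return -1;

	/* Loop 2: actually follows loop 1 -- smallest suitable buddy. */
	for (int o = order; o <= MAX_ORDER; o++)
		if (z->freelist[o])
			return o;

	return -1;
}
```

In this model the two loops genuinely run back to back for every
migratetype, which is the shape the patch gives __rmqueue_fallback().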