From: Yu Zhao
Date: Fri, 10 May 2024 23:14:43 -0600
Subject: Re: [PATCH V4 00/10] mm: page_alloc: freelist migratetype hygiene
To: Johannes Weiner, Andrew Morton
Cc: Vlastimil Babka, Mel Gorman, Zi Yan, "Huang, Ying", David Hildenbrand, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Kalesh Singh, Chun-Tse Shao
In-Reply-To: <20240320180429.678181-1-hannes@cmpxchg.org>

On Wed, Mar 20, 2024 at 12:04 PM Johannes Weiner wrote:
>
> V4:
> - fixed !pcp_order_allowed() case in free_unref_folios()
> - reworded the patch 0 changelog a bit for the git log
> - rebased to mm-everything-2024-03-19-23-01
> - runtime-tested again with various CONFIG_DEBUG_FOOs enabled
>
> ---
>
> The page allocator's mobility grouping is intended to keep unmovable
> pages separate from reclaimable/compactable ones to allow on-demand
> defragmentation for
> higher-order allocations and huge pages.
>
> Currently, there are several places where accidental type mixing
> occurs: an allocation asks for a page of a certain migratetype and
> receives another. This ruins pageblocks for compaction, which in turn
> makes allocating huge pages more expensive and less reliable.
>
> The series addresses those causes. The last patch adds type checks on
> all freelist movements to prevent new violations being introduced.
>
> The benefits can be seen in a mixed workload that stresses the machine
> with a memcache-type workload and a kernel build job while
> periodically attempting to allocate batches of THP. The following data
> is aggregated over 50 consecutive defconfig builds:
>
>                                      VANILLA                 PATCHED
> Hugealloc Time mean                165843.93 (   +0.00%)   113025.88 (  -31.85%)
> Hugealloc Time stddev              158957.35 (   +0.00%)   114716.07 (  -27.83%)
> Kbuild Real time                      310.24 (   +0.00%)      300.73 (   -3.06%)
> Kbuild User time                     1271.13 (   +0.00%)     1259.42 (   -0.92%)
> Kbuild System time                    582.02 (   +0.00%)      559.79 (   -3.81%)
> THP fault alloc                     30585.14 (   +0.00%)    40853.62 (  +33.57%)
> THP fault fallback                  36626.46 (   +0.00%)    26357.62 (  -28.04%)
> THP fault fail rate %                  54.49 (   +0.00%)       39.22 (  -27.53%)
> Pagealloc fallback                   1328.00 (   +0.00%)        1.00 (  -99.85%)
> Pagealloc type mismatch            181009.50 (   +0.00%)        0.00 ( -100.00%)
> Direct compact stall                  434.56 (   +0.00%)      257.66 (  -40.61%)
> Direct compact fail                   421.70 (   +0.00%)      249.94 (  -40.63%)
> Direct compact success                 12.86 (   +0.00%)        7.72 (  -37.09%)
> Direct compact success rate %           2.86 (   +0.00%)        2.82 (   -0.96%)
> Compact daemon scanned migrate    3370059.62 (   +0.00%)  3612054.76 (   +7.18%)
> Compact daemon scanned free       7718439.20 (   +0.00%)  5386385.02 (  -30.21%)
> Compact direct scanned migrate     309248.62 (   +0.00%)   176721.04 (  -42.85%)
> Compact direct scanned free        433582.84 (   +0.00%)   315727.66 (  -27.18%)
> Compact migrate scanned daemon %       91.20 (   +0.00%)       94.48 (   +3.56%)
> Compact free scanned daemon %          94.58 (   +0.00%)       94.42 (   -0.16%)
> Compact total migrate scanned     3679308.24 (   +0.00%)  3788775.80 (   +2.98%)
> Compact total free scanned        8152022.04 (   +0.00%)  5702112.68 (  -30.05%)
> Alloc stall                           872.04 (   +0.00%)     5156.12 ( +490.71%)
> Pages kswapd scanned               510645.86 (   +0.00%)     3394.94 (  -99.33%)
> Pages kswapd reclaimed             134811.62 (   +0.00%)     2701.26 (  -98.00%)
> Pages direct scanned                99546.06 (   +0.00%)   376407.52 ( +278.12%)
> Pages direct reclaimed              62123.40 (   +0.00%)   289535.70 ( +366.06%)
> Pages total scanned                610191.92 (   +0.00%)   379802.46 (  -37.76%)
> Pages scanned kswapd %                 76.36 (   +0.00%)        0.10 (  -98.58%)
> Swap out                            12057.54 (   +0.00%)    15022.98 (  +24.59%)
> Swap in                               209.16 (   +0.00%)      256.48 (  +22.52%)
> File refaults                       17701.64 (   +0.00%)    11765.40 (  -33.53%)
>
> Huge page success rate is higher, allocation latencies are shorter and
> more predictable.
>
> Stealing (fallback) rate is drastically reduced. Notably, while the
> vanilla kernel keeps doing fallbacks on an ongoing basis, the patched
> kernel enters a steady state once the distribution of block types is
> adequate for the workload. Steals over 50 runs:
>
>   VANILLA   PATCHED
>    1504.0     227.0
>    1557.0       6.0
>    1391.0      13.0
>    1080.0      26.0
>    1057.0      40.0
>    1156.0       6.0
>     805.0      46.0
>     736.0      20.0
>    1747.0       2.0
>    1699.0      34.0
>    1269.0      13.0
>    1858.0      12.0
>     907.0       4.0
>     727.0       2.0
>     563.0       2.0
>    3094.0       2.0
>   10211.0       3.0
>    2621.0       1.0
>    5508.0       2.0
>    1060.0       2.0
>     538.0       3.0
>    5773.0       2.0
>    2199.0       0.0
>    3781.0       2.0
>    1387.0       1.0
>    4977.0       0.0
>    2865.0       1.0
>    1814.0       1.0
>    3739.0       1.0
>    6857.0       0.0
>     382.0       0.0
>     407.0       1.0
>    3784.0       0.0
>     297.0       0.0
>     298.0       0.0
>    6636.0       0.0
>    4188.0       0.0
>     242.0       0.0
>    9960.0       0.0
>    5816.0       0.0
>     354.0       0.0
>     287.0       0.0
>     261.0       0.0
>     140.0       1.0
>    2065.0       0.0
>     312.0       0.0
>     331.0       0.0
>     164.0       0.0
>     465.0       1.0
>     219.0       0.0
>
> Type mismatches are down too. Those count every time an allocation
> request asks for one migratetype and gets another.
> This can still occur minimally in the patched kernel due to
> non-stealing fallbacks, but it's quite rare and follows the pattern
> of overall fallbacks - once the block type distribution settles,
> mismatches cease as well:
>
>  VANILLA:  PATCHED:
>  182602.0     268.0
>  135794.0      20.0
>   88619.0      19.0
>   95973.0       0.0
>  129590.0       0.0
>  129298.0       0.0
>  147134.0       0.0
>  230854.0       0.0
>  239709.0       0.0
>  137670.0       0.0
>  132430.0       0.0
>   65712.0       0.0
>   57901.0       0.0
>   67506.0       0.0
>   63565.0       4.0
>   34806.0       0.0
>   42962.0       0.0
>   32406.0       0.0
>   38668.0       0.0
>   61356.0       0.0
>   57800.0       0.0
>   41435.0       0.0
>   83456.0       0.0
>   65048.0       0.0
>   28955.0       0.0
>   47597.0       0.0
>   75117.0       0.0
>   55564.0       0.0
>   38280.0       0.0
>   52404.0       0.0
>   26264.0       0.0
>   37538.0       0.0
>   19671.0       0.0
>   30936.0       0.0
>   26933.0       0.0
>   16962.0       0.0
>   44554.0       0.0
>   46352.0       0.0
>   24995.0       0.0
>   35152.0       0.0
>   12823.0       0.0
>   21583.0       0.0
>   18129.0       0.0
>   31693.0       0.0
>   28745.0       0.0
>   33308.0       0.0
>   31114.0       0.0
>   35034.0       0.0
>   12111.0       0.0
>   24885.0       0.0
>
> Compaction work is markedly reduced despite much better THP rates.
>
> In the vanilla kernel, reclaim seems to have been driven primarily by
> watermark boosting that happens as a result of fallbacks. With those
> all but eliminated, watermarks average lower and kswapd does less
> work. The uptick in direct reclaim is because THP requests have to
> fend for themselves more often - which is intended policy right
> now. Aggregate reclaim activity is lowered significantly, though.

This series significantly regresses Android and ChromeOS under memory
pressure. THPs are virtually nonexistent on client devices, and IIRC,
it was mentioned in the early discussions that potential regressions
for such a case are somewhat expected?

On Android (ARMv8.2), app launch time regressed by about 7%; on
ChromeOS (Intel ADL), tab switch time regressed by about 8%. Also, PSI
(full and some) on both platforms increased by over 20%. I could post
the details of the benchmarks and the metrics they measure, but I
doubt they would mean much to you.
I did ask our test teams to save extra kernel logs
that might be more helpful, and I could forward them to you.

Note that the numbers above were from the default LRU, not MGLRU,
which I specifically asked our test teams to disable to double check
the regressions.

Given the merge window will be open soon, I don't plan to stand in its
way. If we can't fix the regression after a reasonable amount of time,
can we find a way to disable this series at runtime or build time?
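
For reference, the "THP fault fail rate %" row in the quoted table appears
to follow from the "THP fault alloc" and "THP fault fallback" rows as
fallback / (alloc + fallback). The sketch below checks that arithmetic
against the quoted means. The function name and the dictionaries are
hypothetical and only for illustration; they are not part of the series or
of any kernel tooling. It also assumes that a ratio of the 50-run means is
close enough to the reported figure, which need not hold for the other
percentage rows.

```python
def thp_fault_fail_rate(alloc: float, fallback: float) -> float:
    """Share of THP faults that fell back to base pages, in percent.

    Hypothetical helper: assumes every THP fault either allocates a huge
    page ("alloc") or falls back ("fallback").
    """
    total = alloc + fallback
    if total == 0:
        return 0.0  # no THP faults at all, so nothing failed
    return 100.0 * fallback / total


# Means copied from the quoted table (aggregated over 50 defconfig builds).
VANILLA = {"alloc": 30585.14, "fallback": 36626.46}
PATCHED = {"alloc": 40853.62, "fallback": 26357.62}

vanilla_rate = thp_fault_fail_rate(**VANILLA)
patched_rate = thp_fault_fail_rate(**PATCHED)

print(f"vanilla: {vanilla_rate:.2f}%  patched: {patched_rate:.2f}%")
```

On a client device where THP faults are virtually nonexistent, both inputs
are near zero, so this metric says little about the app launch and tab
switch regressions reported above.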