From: Yu Zhao
Date: Sat, 2 Mar 2024 20:17:50 -0500
Subject: Re: [Chapter Two] THP shattering: the reverse of collapsing
To: Zi Yan
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org, Jonathan Corbet
References: <20240229183436.4110845-1-yuzhao@google.com> <20240229183436.4110845-3-yuzhao@google.com>

On Thu, Feb 29, 2024 at 4:55 PM Zi Yan wrote:
>
> On 29 Feb 2024, at 13:34, Yu Zhao wrote:
>
> > In contrast to split, shatter migrates occupied pages in a partially
> > mapped THP to a bunch of base folios. IOW, unlike split done in place,
> > shatter is the exact opposite of collapse.
> >
> > The advantage of shattering is that it keeps the original THP intact.
>
> Why keep the THP intact? To prevent the THP from fragmentation, since
> the shattered part will not be returned to buddy allocator for reuse?

There might be some confusion here: there is no "shattered part" -- the
entire THP becomes free after shattering (the occupied part is moved
to a bunch of 4KB pages).

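To make the distinction above concrete, here is a minimal toy model of
the two operations on a half-mapped 2MB THP. It is only a sketch of the
bookkeeping described in this thread, not kernel code: the function
names, the dictionary fields, and the use of a set of subpage indices
are all assumptions made for illustration.

```python
PAGE_SIZE = 4096   # 4KB base page
THP_PAGES = 512    # a 2MB THP spans 512 base pages


def split_in_place(mapped):
    """Toy model of an in-place split: nothing is copied and only the
    unmapped subpages become free."""
    return {
        "copied_bytes": 0,
        "freed_subpages": THP_PAGES - len(mapped),
        # Mapped subpages stay where they are, so the 2MB range is only
        # free as a whole if nothing was mapped to begin with.
        "thp_free_as_whole": len(mapped) == 0,
    }


def shatter(mapped):
    """Toy model of shattering: every mapped subpage is copied to a new
    4KB page, after which the entire original THP is free."""
    return {
        "copied_bytes": len(mapped) * PAGE_SIZE,
        "freed_subpages": THP_PAGES,
        "thp_free_as_whole": True,
    }


# Hypothetical example: every other subpage is still mapped.
half_mapped = set(range(0, THP_PAGES, 2))
print("split:  ", split_in_place(half_mapped))
print("shatter:", shatter(half_mapped))
```

In this model the split costs nothing to perform but leaves the 2MB
range pinned by its 256 remaining subpages, while the shatter pays for
copying 1MB and gets the whole 2MB range back.
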
> I agree with the idea of shattering, but keeping THP intact might
> give us trouble for 1GB THP case when PMD mapping is created after
> shattering. How to update mapcount for a PMD mapping in the middle of
> a 1GB folio? I used head[0], head[512], ... as the PMD mapping head
> page, but that is ugly. For mTHPs, there is no such problem since
> only PTE mappings are involved.

If we don't consider the copying cost during shattering, it can work
for 1GB THPs as it does for 2MB THPs.

> It might be better to just split the THP and move free pages to a
> donot-use free list until the rest are freed too

The main reason we do shattering is, to use a crude analogy, that a
million dollars in $10,000 bills (yes, they exist) is worth a lot more
than the same amount in pennies. You can carry the former in your
pocket, but the latter weighs at least 250 tons. So if we split, we
lose money.

1GB THP is one of the important *end goals* for TAO. But I don't want
to go into details, since we need to focus on the first few steps at
the current stage.

The problem with shattering for 1GB is the copying cost -- if we
shatter a 1GB THP that is half mapped and half unmapped, we'd have to
copy 512MB of data, which is unacceptable. 1GB THP requires something
we call "THP fungibility" (see the epilogue) -- we do split in place,
but we also "collapse" in place (called THP recovery, i.e.,
MADV_RECOVERY). Shattering is for 2MB THPs only.

> if the zone enforces
> a minimal order that is larger than the free pages.
>
> > The cost of copying during the migration is not a side effect, but
> > rather by design, since splitting is considered a discouraged
> > behavior. In retail terms, the return of a purchase is charged with a
> > restocking fee and the original goods can be resold.
> >
> > THPs from ZONE_NOMERGE can only be shattered, since they cannot be
> > split or merged. THPs from ZONE_NOSPLIT can be shattered or split (the
> > latter requires [1]), if they are above the minimum order.
> >
> > [1] https://lore.kernel.org/20240226205534.1603748-1-zi.yan@sent.com/
> >
>
>
> --
> Best Regards,
> Yan, Zi
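
To put numbers on the copying cost discussed above, the following
back-of-the-envelope sketch computes how much data a shatter would have
to migrate for a half-mapped THP of each size. The helper name
shatter_copy_bytes and the mapped_fraction parameter are hypothetical;
the only inputs taken from the thread are the THP sizes and the
half-mapped case.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def shatter_copy_bytes(thp_bytes, mapped_fraction):
    """Bytes that would be migrated to base pages when shattering a THP
    of which `mapped_fraction` (0.0 to 1.0) is still mapped."""
    if not 0.0 <= mapped_fraction <= 1.0:
        raise ValueError("mapped_fraction must be between 0.0 and 1.0")
    return int(thp_bytes * mapped_fraction)


for name, size in (("2MB", 2 * MB), ("1GB", GB)):
    cost = shatter_copy_bytes(size, 0.5)
    print(f"{name} THP, half mapped: copy {cost // MB} MB")
```

The half-mapped 2MB case comes out at 1MB of copying, against 512MB for
the 1GB case, which is the gap behind restricting shattering to 2MB
THPs.
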