From: Ryan Roberts
Date: Tue, 31 Oct 2023 13:12:18 +0000
Subject: Re: [PATCH v6 0/9] variable-order, large folios for anonymous memory
To: David Hildenbrand, Andrew Morton, Matthew Wilcox, Yin Fengwei, Yu Zhao, Catalin Marinas, Anshuman Khandual, Yang Shi, "Huang, Ying", Zi Yan, Luis Chamberlain, Itaru Kitayama, "Kirill A. Shutemov", John Hubbard, David Rientjes, Vlastimil Babka, Hugh Dickins
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org
References: <20230929114421.3761121-1-ryan.roberts@arm.com> <6d89fdc9-ef55-d44e-bf12-fafff318aef8@redhat.com> <7a3a2d49-528d-4297-ae19-56aa9e6c59c6@arm.com>

On 31/10/2023 11:58, David Hildenbrand wrote:
> On 31.10.23 12:50, Ryan Roberts wrote:
>> On 06/10/2023 21:06, David Hildenbrand wrote:
>> [...]
>>>
>>> Change 2: sysfs interface.
>>>
>>> If we call it THP, it shall go under "/sys/kernel/mm/transparent_hugepage/", I
>>> agree.
>>>
>>> What we expose there and how, is TBD. Again, not a friend of "orders" and
>>> bitmaps at all. We can do better if we want to go down that path.
>>>
>>> Maybe we should take a look at hugetlb, and how they added support for multiple
>>> sizes. What *might* make sense could be (depending on which values we actually
>>> support!)
>>>
>>>
>>> /sys/kernel/mm/transparent_hugepage/hugepages-64kB/
>>> /sys/kernel/mm/transparent_hugepage/hugepages-128kB/
>>> /sys/kernel/mm/transparent_hugepage/hugepages-256kB/
>>> /sys/kernel/mm/transparent_hugepage/hugepages-512kB/
>>> /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/
>>> /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/
>>>
>>> Each one would contain an "enabled" and "defrag" file. We want something minimal
>>> first? Start with the "enabled" option.
>>>
>>>
>>> enabled: always [global] madvise never
>>>
>>> Initially, we would set it for PMD-sized THP to "global" and for everything else
>>> to "never".
>>
>> Hi David,
>
> Hi!
>
>>
>> I've just started coding this, and it occurs to me that I might need a small
>> clarification here; the existing global "enabled" control is used to drive
>> decisions for both anonymous memory and (non-shmem) file-backed memory. But the
>> proposed new per-size "enabled" is implicitly only controlling anon memory (for
>> now).
>
> Anon was (way) first, and pagecache later decided to reuse that one as an
> indication whether larger folios are desired.
>
> For the pagecache, it's just a way to enable/disable it globally. As there is no
> memory waste, nobody currently really cares about the exact size the pagecache
> is allocating (maybe that will change at some point, maybe not, who knows).

Yup. It's not _just_ about allocation though; it's also about collapse
(MADV_COLLAPSE, khugepaged) which is supported for pagecache pages. I can
imagine value in collapsing to various sizes that are beneficial for HW...
anyway that's for another day.

>
>>
>> 1) Is this potentially confusing for the user? Should we rename the per-size
>> controls to "anon_enabled"? Or is it preferable to just keep it vague for now so
>> we can reuse the same control for file-backed memory in future?
>
> The latter would be my take. Just like we did with the global toggle.

ACK

>
>>
>> 2) The global control will continue to drive the file-backed memory decision
>> (for now), even when hugepages-2048kB/enabled != "global"; agreed?
>
> That would be my take; it will allocate other sizes already, so just glue it to
> the global toggle and document for the other toggles that they only control
> anonymous THP for now.

ACK
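
To make the semantics agreed above concrete, here is a rough userspace sketch of how the proposed per-size "enabled" value might be resolved against the global toggle for anonymous memory. It only models what is discussed in this thread: the bracketed token is the current selection, and "global" defers to the top-level control. The function names (selected_value, effective_anon_policy) are made up for illustration, and the file contents are passed in as strings rather than read from sysfs, since the layout is still a proposal and may change.

```python
import re


def selected_value(contents: str) -> str:
    """Return the bracketed token from a sysfs-style choice string.

    Example input: "always [global] madvise never".
    """
    match = re.search(r"\[(\w+)\]", contents)
    if match is None:
        raise ValueError(f"no selected value in {contents!r}")
    return match.group(1)


def effective_anon_policy(per_size: str, global_setting: str) -> str:
    """Resolve the anon policy for one size (hypothetical helper).

    Assumption taken from the thread: a per-size value of "global"
    means "follow the top-level enabled control"; any other value
    overrides it for anonymous memory only.
    """
    size_choice = selected_value(per_size)
    if size_choice == "global":
        return selected_value(global_setting)
    return size_choice


if __name__ == "__main__":
    # PMD-sized entry left at "global", top-level control set to madvise.
    print(effective_anon_policy("always [global] madvise never",
                                "always [madvise] never"))
    # A smaller size left at its proposed default of "never".
    print(effective_anon_policy("always global madvise [never]",
                                "[always] madvise never"))
```

Note that this deliberately says nothing about file-backed memory: per point 2 above, that decision stays glued to the global toggle regardless of the per-size values.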