From: Ryan Roberts
Date: Tue, 31 Oct 2023 13:13:10 +0000
Subject: Re: [PATCH v6 0/9] variable-order, large folios for anonymous memory
To: David Hildenbrand, Andrew Morton, Matthew Wilcox, Yin Fengwei, Yu Zhao,
    Catalin Marinas, Anshuman Khandual, Yang Shi, "Huang, Ying", Zi Yan,
    Luis Chamberlain, Itaru Kitayama, "Kirill A. Shutemov", John Hubbard,
    David Rientjes, Vlastimil Babka, Hugh Dickins
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org
Message-ID: <5001e231-795f-4d8c-bd9d-16096e428aef@arm.com>
References: <20230929114421.3761121-1-ryan.roberts@arm.com>
    <6d89fdc9-ef55-d44e-bf12-fafff318aef8@redhat.com>
    <7a3a2d49-528d-4297-ae19-56aa9e6c59c6@arm.com>

On 31/10/2023 12:03, David Hildenbrand wrote:
> On 31.10.23 12:55, Ryan Roberts wrote:
>> On 31/10/2023 11:50, Ryan Roberts wrote:
>>> On 06/10/2023 21:06, David Hildenbrand wrote:
>>> [...]
>>>>
>>>> Change 2: sysfs interface.
>>>>
>>>> If we call it THP, it shall go under "/sys/kernel/mm/transparent_hugepage/", I
>>>> agree.
>>>>
>>>> What we expose there and how, is TBD. Again, not a friend of "orders" and
>>>> bitmaps at all. We can do better if we want to go down that path.
>>>>
>>>> Maybe we should take a look at hugetlb, and how they added support for multiple
>>>> sizes.
>>>> What *might* make sense could be (depending on which values we actually
>>>> support!)
>>>>
>>>>
>>>> /sys/kernel/mm/transparent_hugepage/hugepages-64kB/
>>>> /sys/kernel/mm/transparent_hugepage/hugepages-128kB/
>>>> /sys/kernel/mm/transparent_hugepage/hugepages-256kB/
>>>> /sys/kernel/mm/transparent_hugepage/hugepages-512kB/
>>>> /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/
>>>> /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/
>>>>
>>>> Each one would contain an "enabled" and "defrag" file. We want something
>>>> minimal first? Start with the "enabled" option.
>>>>
>>>>
>>>> enabled: always [global] madvise never
>>>>
>>>> Initially, we would set it for PMD-sized THP to "global" and for everything
>>>> else to "never".
>>>
>>> Hi David,
>>>
>>> I've just started coding this, and it occurs to me that I might need a small
>>> clarification here; the existing global "enabled" control is used to drive
>>> decisions for both anonymous memory and (non-shmem) file-backed memory. But the
>>> proposed new per-size "enabled" is implicitly only controlling anon memory (for
>>> now).
>>>
>>> 1) Is this potentially confusing for the user? Should we rename the per-size
>>> controls to "anon_enabled"? Or is it preferable to just keep it vague for now so
>>> we can reuse the same control for file-backed memory in future?
>>>
>>> 2) The global control will continue to drive the file-backed memory decision
>>> (for now), even when hugepages-2048kB/enabled != "global"; agreed?
>>>
>>> Thanks,
>>> Ryan
>>>
>>
>> Also, an implementation question:
>>
>> hugepage_vma_check() doesn't currently care whether enabled="never" for DAX VMAs
>> (although it does honour MADV_NOHUGEPAGE and the prctl); it will return true
>> regardless. Is that by design? I couldn't fathom any reasoning from the
>> commit log:
>
> The whole DAX "hugepage" and THP mixup is just plain confusing.
> We're simply using PUD/PMD mappings of DAX memory, and PMD/PTE-remap when
> required (VMA split I assume, COW).
>
> It doesn't result in any memory waste, so who really cares how it's mapped?
> Apparently we want individual processes to just disable PMD/PUD mappings of DAX
> using the prctl and madvise. Maybe there are good reasons.
>
> Looks like a design decision, probably some legacy leftovers.

OK, I'll ensure I keep this behaviour. Thanks!
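
For readers following the thread, here is a rough sketch of how the proposed per-size "enabled" values might resolve for anonymous memory. It is only an illustration, not kernel code: it assumes that "global" means "defer to the existing top-level enabled setting", which the thread implies but does not spell out. The names effective_policy(), anon_thp_allowed() and initial_settings() are hypothetical, and the MADV_NOHUGEPAGE and prctl opt-outs are deliberately not modelled.

```python
from enum import Enum


class Policy(Enum):
    """Values the existing top-level 'enabled' control can take."""
    ALWAYS = "always"
    MADVISE = "madvise"
    NEVER = "never"


# Extra per-size value from the proposal. Assumption: it defers to the
# top-level setting; the thread does not define it more precisely.
GLOBAL = "global"


def effective_policy(per_size: str, global_setting: str) -> Policy:
    """Resolve a per-size 'enabled' value to a concrete policy."""
    if per_size == GLOBAL:
        return Policy(global_setting)
    return Policy(per_size)


def anon_thp_allowed(per_size: str, global_setting: str, vma_madvised: bool) -> bool:
    """Hypothetical check: may this size be used for an anonymous VMA?"""
    policy = effective_policy(per_size, global_setting)
    if policy is Policy.ALWAYS:
        return True
    if policy is Policy.MADVISE:
        return vma_madvised
    return False


def initial_settings(sizes_kb, pmd_size_kb=2048):
    """Proposed defaults: PMD-sized THP is 'global', everything else 'never'."""
    return {
        f"hugepages-{kb}kB": (GLOBAL if kb == pmd_size_kb else "never")
        for kb in sizes_kb
    }


if __name__ == "__main__":
    defaults = initial_settings([64, 128, 256, 512, 1024, 2048])
    for name, value in defaults.items():
        print(name, value, anon_thp_allowed(value, "madvise", vma_madvised=True))
```

With these defaults, only hugepages-2048kB follows the top-level control, so behaviour on an existing system would be unchanged until a smaller size is explicitly enabled.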