Date: Mon, 13 Nov 2023 05:18:08 +0000
From: Matthew Wilcox
To: John Hubbard
Cc: Ryan Roberts, Andrew Morton, Yin Fengwei, David Hildenbrand,
    Yu Zhao, Catalin Marinas, Anshuman Khandual, Yang Shi,
    "Huang, Ying", Zi Yan, Luis Chamberlain, Itaru Kitayama,
    "Kirill A. Shutemov", David Rientjes, Vlastimil Babka, Hugh Dickins,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v6 0/9] variable-order, large folios for anonymous memory
References: <20230929114421.3761121-1-ryan.roberts@arm.com>

On Sun, Nov 12, 2023 at 10:57:47PM -0500, John Hubbard wrote:
> I've done some initial performance testing of this patchset on an arm64
> SBSA server. When these patches are combined with the arm64 arch contpte
> patches in Ryan's git tree (he has conveniently combined everything
> here: [1]), we are seeing a remarkable, consistent speedup of 10.5x on
> some memory-intensive workloads. Many test runs, conducted independently
> by different engineers and on different machines, have convinced me and
> my colleagues that this is an accurate result.
>
> In order to achieve that result, we used the git tree in [1] with the
> following settings:
>
>     echo always >/sys/kernel/mm/transparent_hugepage/enabled
>     echo recommend >/sys/kernel/mm/transparent_hugepage/anon_orders
>
> This was on an aarch64 machine configured to use a 64KB base page size.
> That configuration means that the PMD size is 512MB, which is of course
> too large for practical use as a pure PMD-THP. However, with these
> small-size (less than PMD-sized) THPs, we get the improvements in TLB
> coverage, while still getting pages that are small enough to be
> effectively usable.

That is quite remarkable!

My hope is to abolish the 64kB page size configuration. That is, instead
of using the mixture of page sizes that you currently are (64k and 1M,
right? order-0 and order-4), I'm hoping that 4k, 64k and 2MB (order-0,
order-4 and order-9) will provide better performance.

Have you run any experiments with a 4kB page size?
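For anyone following along, the size arithmetic in this exchange can be checked with a short sketch. This is Python written purely for illustration and is not taken from the thread or the patchset. It assumes 8-byte page table entries and that each page table fills exactly one base page, which is the usual arm64 layout for the 4kB and 64kB granules discussed here. The helper names (folio_size, pmd_size, pmd_order) are hypothetical, not kernel APIs.

```python
# Illustrative sketch only; helper names are hypothetical, not kernel APIs.
KIB = 1024
MIB = 1024 * KIB

PTE_BYTES = 8  # assumption: 64-bit page table entries


def folio_size(base_page_size: int, order: int) -> int:
    """Size in bytes of an order-N folio: base page size times 2**order."""
    return base_page_size << order


def pmd_size(base_page_size: int) -> int:
    """Bytes mapped by one PMD entry, assuming a page table fills one base page."""
    ptes_per_table = base_page_size // PTE_BYTES
    return base_page_size * ptes_per_table


def pmd_order(base_page_size: int) -> int:
    """Order of a PMD-sized folio, i.e. log2 of the PTEs per table."""
    ptes_per_table = base_page_size // PTE_BYTES
    return ptes_per_table.bit_length() - 1


if __name__ == "__main__":
    for base in (4 * KIB, 64 * KIB):
        print(f"base page {base // KIB} KiB:")
        print(f"  order-4 folio: {folio_size(base, 4) // KIB} KiB")
        print(f"  PMD size:      {pmd_size(base) // MIB} MiB "
              f"(order-{pmd_order(base)})")
```

Under those assumptions, a 64kB base page gives a 1MB order-4 folio and a 512MB PMD, while a 4kB base page gives a 64kB order-4 folio and a 2MB PMD at order-9, which matches the figures quoted in both messages.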