From: Matthew Wilcox <willy@infradead.org>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Yin Fengwei <fengwei.yin@intel.com>,
David Hildenbrand <david@redhat.com>, Yu Zhao <yuzhao@google.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Yang Shi <shy828301@gmail.com>,
"Huang, Ying" <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
Luis Chamberlain <mcgrof@kernel.org>,
Itaru Kitayama <itaru.kitayama@gmail.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
David Rientjes <rientjes@google.com>,
Vlastimil Babka <vbabka@suse.cz>, Hugh Dickins <hughd@google.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v6 0/9] variable-order, large folios for anonymous memory
Date: Mon, 13 Nov 2023 05:18:08 +0000
Message-ID: <ZVGxkMeY50JSesaj@casper.infradead.org>
In-Reply-To: <c507308d-bdd4-5f9e-d4ff-e96e4520be85@nvidia.com>

On Sun, Nov 12, 2023 at 10:57:47PM -0500, John Hubbard wrote:
> I've done some initial performance testing of this patchset on an arm64
> SBSA server. When these patches are combined with the arm64 arch contpte
> patches in Ryan's git tree (he has conveniently combined everything
> here: [1]), we are seeing a remarkable, consistent speedup of 10.5x on
> some memory-intensive workloads. Many test runs, conducted independently
> by different engineers and on different machines, have convinced me and
> my colleagues that this is an accurate result.
>
> In order to achieve that result, we used the git tree in [1] with the
> following settings:
>
> echo always >/sys/kernel/mm/transparent_hugepage/enabled
> echo recommend >/sys/kernel/mm/transparent_hugepage/anon_orders
>
> This was on an aarch64 machine configured to use a 64KB base page size.
> That configuration means that the PMD size is 512MB, which is of course
> too large for practical use as a pure PMD-THP. However, with these
> small-size (less than PMD-sized) THPs, we get the improvements in TLB
> coverage, while still getting pages that are small enough to be
> effectively usable.

That is quite remarkable!

My hope is to abolish the 64kB page size configuration, i.e. that instead
of the mixture of page sizes you are currently using -- 64k and 1M
(right? order-0 and order-4) -- a mixture of 4k, 64k and 2MB (order-0,
order-4 and order-9) will provide better performance.

Have you run any experiments with a 4kB page size?
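
For reference, the sizes quoted in this exchange can be checked with a
little arithmetic. The sketch below is only an illustration, not code
from the patchset: the helper names are made up, and it assumes 8-byte
page-table entries, so that one page-table page holds
base_page_size / 8 entries and a PMD entry maps that many base pages.

```python
# Illustrative arithmetic only; helper names are hypothetical.
# Assumption: 8-byte page-table entries, so a page-table page holds
# base_page_size // 8 entries and one PMD entry maps that many base pages.

KIB = 1024
MIB = 1024 * KIB


def folio_size(base_page_size, order):
    """Bytes covered by a folio of the given order (2**order base pages)."""
    return base_page_size << order


def pmd_order(base_page_size, pte_bytes=8):
    """Order of a PMD-sized mapping, i.e. log2 of the entries per table."""
    return (base_page_size // pte_bytes).bit_length() - 1


def pmd_size(base_page_size, pte_bytes=8):
    """Bytes mapped by a single PMD entry."""
    return folio_size(base_page_size, pmd_order(base_page_size, pte_bytes))


if __name__ == "__main__":
    for base in (4 * KIB, 64 * KIB):
        print(
            f"base {base // KIB}K: "
            f"order-4 = {folio_size(base, 4) // KIB}K, "
            f"PMD order = {pmd_order(base)}, "
            f"PMD size = {pmd_size(base) // MIB}M"
        )
```

Under those assumptions a 64K base page gives a 1M order-4 folio and a
512M PMD, while a 4K base page gives a 64K order-4 folio and a 2M
(order-9) PMD, matching the figures in the messages above.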