From: John Hubbard <jhubbard@nvidia.com>
To: Ryan Roberts <ryan.roberts@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	Yin Fengwei <fengwei.yin@intel.com>,
	David Hildenbrand <david@redhat.com>, Yu Zhao <yuzhao@google.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Yang Shi <shy828301@gmail.com>,
	"Huang, Ying" <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Itaru Kitayama <itaru.kitayama@gmail.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	David Rientjes <rientjes@google.com>,
	Vlastimil Babka <vbabka@suse.cz>, Hugh Dickins <hughd@google.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v6 0/9] variable-order, large folios for anonymous memory
Date: Sun, 12 Nov 2023 22:57:47 -0500
Message-ID: <c507308d-bdd4-5f9e-d4ff-e96e4520be85@nvidia.com>
In-Reply-To: <20230929114421.3761121-1-ryan.roberts@arm.com>

On 9/29/23 4:44 AM, Ryan Roberts wrote:
> Hi All,
> 
> This is v6 of a series to implement variable order, large folios for anonymous
> memory. (previously called "ANON_LARGE_FOLIO", "LARGE_ANON_FOLIO",
> "FLEXIBLE_THP", but now exposed as an extension to THP; "small-order THP"). The
> objective of this is to improve performance by allocating larger chunks of
> memory during anonymous page faults:
...
> 
> The major change in this revision is the addition of sysfs controls to allow
> this "small-order THP" to be enabled/disabled/configured independently of
> PMD-order THP. The approach I've taken differs a bit from previous discussions;
> instead of creating a whole new interface ("large_folio"), I'm extending THP. I
> personally think this makes things clearer and more extensible. See [6] for
> detailed rationale.
> 

Hi Ryan and all,

I've done some initial performance testing of this patchset on an arm64
SBSA server. When these patches are combined with the arm64 arch contpte
patches in Ryan's git tree (he has conveniently combined everything
here: [1]), we are seeing a remarkable, consistent speedup of 10.5x on
some memory-intensive workloads. Many test runs, conducted independently
by different engineers and on different machines, have convinced me and
my colleagues that this is an accurate result.

In order to achieve that result, we used the git tree in [1] with the
following settings:

     echo always >/sys/kernel/mm/transparent_hugepage/enabled
     echo recommend >/sys/kernel/mm/transparent_hugepage/anon_orders

This was on an aarch64 machine configured to use a 64KB base page size.
That configuration means that the PMD size is 512MB, which is of course
too large for practical use as a pure PMD-THP. However, with these
small-size (less than PMD-sized) THPs, we get the improvements in TLB
coverage, while still getting pages that are small enough to be
effectively usable.
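
As a rough check of the 512MB figure, here is a minimal sketch of the
arithmetic. It assumes 8-byte page table entries, so that a single 64KB
table page holds 8192 of them; the variable names are only illustrative.

```shell
# Assumed: 64KB base pages and 8-byte page table entries.
page_size=$((64 * 1024))
pte_size=8

# Number of PTEs that fit in one page-sized table.
ptes_per_table=$((page_size / pte_size))

# One PMD entry maps a full table of PTEs, each mapping one base page.
pmd_size=$((page_size * ptes_per_table))
pmd_size_mb=$((pmd_size / 1024 / 1024))

echo "PTEs per table: ${ptes_per_table}"
echo "PMD size: ${pmd_size_mb} MB"
```

The same calculation with 4KB base pages gives 512 entries per table and
a 2MB PMD, which is why PMD-sized THP is practical there but not with
64KB pages.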

These results are admittedly limited to aarch64 CPUs so far (because the
contpte TLB coalescing behavior plays a big role), but it's nice to see
real performance numbers from real computers.

Up until now, there has been some healthy discussion and debate about
various aspects of this patchset. This data point shows that at least
for some types of memory-intensive workloads (and I apologize for being
vague, at this point, about exactly *which* workloads), the performance
gains are really worth it: ~10x!

[1] https://gitlab.arm.com/linux-arm/linux-rr.git
         (branch: features/granule_perf/anonfolio-v6-contpte-v2)

thanks,

-- 
John Hubbard
NVIDIA
