public inbox for linux-kernel@vger.kernel.org
From: Bharata B Rao <bharata@amd.com>
To: <linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>
Cc: <Jonathan.Cameron@huawei.com>, <dave.hansen@intel.com>,
	<gourry@gourry.net>, <mgorman@techsingularity.net>,
	<mingo@redhat.com>, <peterz@infradead.org>,
	<raghavendra.kt@amd.com>, <riel@surriel.com>,
	<rientjes@google.com>, <sj@kernel.org>, <weixugc@google.com>,
	<willy@infradead.org>, <ying.huang@linux.alibaba.com>,
	<ziy@nvidia.com>, <dave@stgolabs.net>, <nifan.cxl@gmail.com>,
	<xuezhengchu@huawei.com>, <yiannis@zptcorp.com>,
	<akpm@linux-foundation.org>, <david@kernel.org>,
	<byungchul@sk.com>, <kinseyho@google.com>,
	<joshua.hahnjy@gmail.com>, <yuanchu@google.com>,
	<balbirs@nvidia.com>, <alok.rathore@samsung.com>,
	<shivankg@amd.com>, <donettom@linux.ibm.com>
Subject: Re: [PATCH v7 0/7] mm: Hot page tracking and promotion infrastructure
Date: Tue, 5 May 2026 16:11:43 +0530
Message-ID: <5110e313-8c1e-4f73-b77f-68d20c2046c8@amd.com>
In-Reply-To: <20260504060924.344313-1-bharata@amd.com>

On 04-May-26 11:39 AM, Bharata B Rao wrote:
> Results
> =======
> Posted as replies to this mail thread.

Graph500 benchmark results:

Test system details
-------------------
3 node AMD Zen5 system with 2 regular NUMA nodes (0, 1) and a CXL node (2)

$ numactl -H
available: 3 nodes (0-2)
node 0 cpus: 0-95,192-287
node 0 size: 128460 MB
node 1 cpus: 96-191,288-383
node 1 size: 128893 MB
node 2 cpus:
node 2 size: 257993 MB
node distances:
node   0   1   2
  0:  10  32  50
  1:  32  10  60
  2: 255 255  10

Hotness sources
---------------
NUMAB0 - NUMA Balancing disabled in the base case; no hotness source
         enabled in the pghot case. No migrations occur.
NUMAB2 - Existing hot page promotion in the base case; hint faults
         used as the hotness source in the pghot case.
NUMAB3 - Both regular and tiering modes of NUMA Balancing enabled
         (kernel.numa_balancing=3).

By default pghot promotes a page after two accesses. For the NUMAB2
source, the threshold is lowered to one access to match the base
behaviour (/sys/kernel/debug/pghot/freq_threshold=1).

Graph500 details
----------------
Command:
  mpirun -n 128 --bind-to core --map-by core \
      graph500/src/graph500_reference_bfs 28 16

After graph creation, the processes are stopped and their data is migrated
to CXL node 2 before they are resumed, so that the BFS phase starts out
accessing lower-tier memory.

Total memory usage is slightly over 100GB, which fits within nodes 0 and 1.
Hence there is no memory pressure to induce demotions.

TEPS metrics - Higher is better; mean_time - Lower is better
=====================================================================================
                        Base            Base            pghot-default   pghot-precise
                        NUMAB0          NUMAB2          NUMAB2          NUMAB2
=====================================================================================
harmonic_mean_TEPS      5.08026e+08     7.48633e+08     5.46257e+08     7.45101e+08
mean_time               8.45413         5.73702         7.86245         5.76421
median_TEPS             5.09236e+08     7.25058e+08     5.40525e+08     7.63752e+08
max_TEPS                5.15244e+08     1.03391e+09     8.51317e+08     9.7552e+08

pgpromote_success       0               13809474        13763582        13763155
numa_pte_updates        0               26746117        39502157        36368086
numa_hint_faults        0               13811769        24248272        21172314
=====================================================================================
                                                        pghot-default
                                                        NUMAB3
=====================================================================================
harmonic_mean_TEPS                                      7.00515e+08
mean_time                                               6.13109
median_TEPS                                             7.06813e+08
max_TEPS                                                7.63164e+08

pgpromote_success                                       13762087
numa_pte_updates                                        93632490
numa_hint_faults                                        70566306
=====================================================================================
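
The three counters in the tables (pgpromote_success, numa_pte_updates,
numa_hint_faults) are exposed in /proc/vmstat. A minimal sketch of how one
might collect them, assuming a snapshot is taken before and after the BFS
phase. How the numbers above were actually gathered is not stated in this
mail, and the helper names below are illustrative only.

```python
# Counters reported in the result tables; all three appear in /proc/vmstat
# on kernels built with NUMA balancing support.
COUNTERS = ("pgpromote_success", "numa_pte_updates", "numa_hint_faults")


def parse_vmstat(text):
    """Return {name: value} for the counters of interest from vmstat text."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in COUNTERS:
            stats[parts[0]] = int(parts[1])
    return stats


def vmstat_delta(before, after):
    """Per-counter difference between two snapshots (missing counters count as 0)."""
    b, a = parse_vmstat(before), parse_vmstat(after)
    return {name: a.get(name, 0) - b.get(name, 0) for name in COUNTERS}


def read_vmstat(path="/proc/vmstat"):
    """Read the raw vmstat text; call once before and once after the run."""
    with open(path) as f:
        return f.read()
```

Usage would be along the lines of before = read_vmstat(), run the BFS
phase, after = read_vmstat(), then vmstat_delta(before, after).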
- In the base case, NUMAB2 improves harmonic_mean_TEPS by about 47% over NUMAB0
  (7.49e+08 vs 5.08e+08).
- pghot-precise maintains the same improvement (7.45e+08).
- pghot-default shows little benefit with NUMAB2 (5.46e+08) even though it achieves
  similar page promotion numbers. This mode doesn't track the accessing NID and by
  default promotes to NID=0, which probably isn't all that beneficial as processes
  are running on both Node 0 and Node 1.
- pghot-default recovers most of the performance (7.01e+08) when balancing between
  toptier nodes 0 and 1 is enabled in addition to hot page promotion (NUMAB3).
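
To put the observations above in relative terms, the harmonic_mean_TEPS
values from the tables can be compared against the Base NUMAB0 run. This is
only a sketch of the arithmetic; the dictionary keys and function names are
made up for illustration, while the numbers are copied from the tables.

```python
# harmonic_mean_TEPS values copied from the result tables in this mail.
BASELINE = 5.08026e+08  # Base, NUMAB0 (no migrations)

RESULTS = {
    "Base NUMAB2": 7.48633e+08,
    "pghot-default NUMAB2": 5.46257e+08,
    "pghot-precise NUMAB2": 7.45101e+08,
    "pghot-default NUMAB3": 7.00515e+08,
}


def gain_percent(teps, baseline=BASELINE):
    """Percentage improvement of a TEPS value over the baseline."""
    return (teps / baseline - 1.0) * 100.0


def summarize(results=RESULTS):
    """Map each configuration to its gain over the baseline, in percent."""
    return {name: gain_percent(teps) for name, teps in results.items()}


if __name__ == "__main__":
    for name, gain in summarize().items():
        print(f"{name:24s} {gain:+.1f}%")
```

By this measure Base NUMAB2 and pghot-precise both land around 47% over the
baseline, pghot-default with NUMAB2 around 7.5%, and pghot-default with
NUMAB3 around 38%.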

