From: Bharata B Rao <bharata@amd.com>
To: <linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>
Cc: <Jonathan.Cameron@huawei.com>, <dave.hansen@intel.com>,
<gourry@gourry.net>, <mgorman@techsingularity.net>,
<mingo@redhat.com>, <peterz@infradead.org>,
<raghavendra.kt@amd.com>, <riel@surriel.com>,
<rientjes@google.com>, <sj@kernel.org>, <weixugc@google.com>,
<willy@infradead.org>, <ying.huang@linux.alibaba.com>,
<ziy@nvidia.com>, <dave@stgolabs.net>, <nifan.cxl@gmail.com>,
<xuezhengchu@huawei.com>, <yiannis@zptcorp.com>,
<akpm@linux-foundation.org>, <david@redhat.com>,
<byungchul@sk.com>, <kinseyho@google.com>,
<joshua.hahnjy@gmail.com>, <yuanchu@google.com>,
<balbirs@nvidia.com>, <alok.rathore@samsung.com>,
<shivankg@amd.com>
Subject: Re: [RFC PATCH v6 0/5] mm: Hot page tracking and promotion infrastructure
Date: Mon, 23 Mar 2026 15:28:29 +0530
Message-ID: <957f2242-56d4-4bf0-8aeb-9d60fbea8c8c@amd.com>
In-Reply-To: <20260323095104.238982-1-bharata@amd.com>
Redis-memtier results
Test system details
-------------------
3 node AMD Zen5 system with 2 regular NUMA nodes (0, 1) and a CXL node (2)
$ numactl -H
available: 3 nodes (0-2)
node 0 cpus: 0-95,192-287
node 0 size: 128460 MB
node 1 cpus: 96-191,288-383
node 1 size: 128893 MB
node 2 cpus:
node 2 size: 257993 MB
node distances:
node   0   1   2
  0:  10  32  50
  1:  32  10  60
  2: 255 255  10
Hotness sources
---------------
NUMAB0 - NUMA Balancing disabled in the base case and no hotness source
enabled in the patched case. No migrations occur.
NUMAB2 - Existing hot page promotion in the base case and
hint faults as the hotness source in the patched case.
pghot promotes after two accesses by default, but for the NUMAB2 source
promotion is done after one access to match the base behaviour
(/sys/kernel/debug/pghot/freq_threshold=1).
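The threshold can also be set from a script. A minimal sketch, assuming
a kernel with the pghot patches applied, debugfs mounted at
/sys/kernel/debug, and root privileges. Only the path comes from the
text above; write_knob() is a hypothetical helper name:

```python
from pathlib import Path

# Path quoted in the text above. It exists only on a kernel carrying the
# pghot patches, with debugfs mounted at /sys/kernel/debug.
FREQ_THRESHOLD = Path("/sys/kernel/debug/pghot/freq_threshold")


def write_knob(path: Path, value: int) -> None:
    """Write an integer to a debugfs-style control file (hypothetical helper)."""
    path.write_text(f"{value}\n")


# Usage (as root): write_knob(FREQ_THRESHOLD, 1)
```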
==============================================================
Scenario 1 - Enough memory in toptier and hence only promotion
==============================================================
In the setup phase, a 64GB database is provisioned and explicitly moved
to Node 2 by migrating redis-server's memory to Node 2.
Memtier is run on Node 1.
Parallel distribution, 50% of the keys accessed, each 4 times.
16 Threads
100 Connections per thread
77808 Requests per client
==================================================================================================
Type         Ops/sec   Avg. Latency   p50 Latency   p99 Latency   p99.9 Latency      KB/sec
--------------------------------------------------------------------------------------------------
Base, NUMAB0
Totals     226611.42      225.92873     224.25500     423.93500       454.65500   514886.68
--------------------------------------------------------------------------------------------------
Base, NUMAB2
Totals     257211.48      204.99755     216.06300     370.68700       454.65500   584413.47
--------------------------------------------------------------------------------------------------
pghot-default, NUMAB2
Totals     255631.78      209.20335     216.06300     378.87900       450.55900   580824.22
--------------------------------------------------------------------------------------------------
pghot-precise, NUMAB2
Totals     249494.46      209.31820     212.99100     380.92700       448.51100   566879.53
==================================================================================================
                 pgpromote_success
==================================
Base, NUMAB0                     0
Base, NUMAB2            10,435,176
pghot-default, NUMAB2   10,435,235
pghot-precise, NUMAB2   10,435,294
==================================
- Hot page promotion shows a clear benefit: Ops/sec is about 10% to 13.5%
  higher than with NUMAB0. Base and pghot show similar benefits.
- The number of pages promoted is nearly the same in both cases.
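The relative gains can be recomputed from the Ops/sec column of the
table above. A small Python sketch; gain_over_baseline() is an
illustrative helper, not part of the patch series or of memtier:

```python
# Ops/sec values copied from the Scenario 1 table above.
ops_per_sec = {
    "Base, NUMAB0": 226611.42,
    "Base, NUMAB2": 257211.48,
    "pghot-default, NUMAB2": 255631.78,
    "pghot-precise, NUMAB2": 249494.46,
}


def gain_over_baseline(results, baseline="Base, NUMAB0"):
    """Return the percentage change in Ops/sec relative to the baseline run."""
    base = results[baseline]
    return {
        name: (ops / base - 1.0) * 100.0
        for name, ops in results.items()
        if name != baseline
    }


gains = gain_over_baseline(ops_per_sec)
for name, pct in gains.items():
    print(f"{name:24s} {pct:+.1f}%")
```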
===============================================================
Scenario 2 - Toptier memory overcommitted, promotion + demotion
===============================================================
In the setup phase, a 192GB database is provisioned. The database occupies
Node 1 entirely (~128GB) and spills over to Node 2 (~64GB).
Memtier is run on Node 1.
Parallel distribution, 50% of the keys accessed, each 4 times.
16 Threads
100 Connections per thread
233424 Requests per client
==================================================================================================
Type         Ops/sec   Avg. Latency   p50 Latency   p99 Latency   p99.9 Latency      KB/sec
--------------------------------------------------------------------------------------------------
Base, NUMAB0
Totals     237743.40      217.72842     201.72700     395.26300       440.31900   540389.78
--------------------------------------------------------------------------------------------------
Base, NUMAB2
Totals     235935.72      219.36544     210.94300     411.64700       477.18300   536280.93
--------------------------------------------------------------------------------------------------
pghot-default, NUMAB2
Totals     248283.99      219.74875     211.96700     413.69500       509.95100   564348.49
--------------------------------------------------------------------------------------------------
pghot-precise, NUMAB2
Totals     240529.35      222.11878     215.03900     411.64700       464.89500   546722.22
==================================================================================================
                        pgpromote_success    pgdemote_kswapd
===============================================================
Base, NUMAB0                            0            672,591
Base, NUMAB2                      350,632            689,751
pghot-default, NUMAB2          17,118,987         17,421,474
pghot-precise, NUMAB2          24,030,292         24,342,569
===============================================================
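To put the counters in perspective, the page counts can be converted
to data volume. A rough sketch, assuming the counters are in units of
4 KiB base pages (the x86-64 default; the page size is not stated in
this mail):

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages, not stated in the mail

# pgpromote_success values copied from the Scenario 2 table above.
promoted_pages = {
    "Base, NUMAB2": 350_632,
    "pghot-default, NUMAB2": 17_118_987,
    "pghot-precise, NUMAB2": 24_030_292,
}


def pages_to_gib(pages, page_size=PAGE_SIZE):
    """Convert a page count to GiB for the given page size."""
    return pages * page_size / 2**30


promoted_gib = {name: pages_to_gib(n) for name, n in promoted_pages.items()}
for name, gib in promoted_gib.items():
    print(f"{name:24s} {gib:8.2f} GiB promoted")
```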
- No clear benefit from hot page promotion is seen in either the base or the
  pghot case.
- Most promotion attempts in the base case fail because the NUMA hint fault
  latency exceeds the threshold (default 1000ms) in the majority of attempts.
- In base NUMAB2, the hint fault latency is the difference between the PTE
  update time (during scanning) and the access time (hint fault). pghot instead
  uses a single latency threshold (3000ms in pghot-default and 5000ms in
  pghot-precise) for two purposes:
  1. If the time difference between successive accesses is within the
     threshold, the page is marked as hot.
  2. Later, when kmigrated picks up the page for migration, it migrates the
     page only if the difference between the current time and the time when
     the page was marked hot is within the threshold.
  Because of this difference in behaviour, more pages qualify for promotion
  than in base NUMAB2.
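The two checks described above can be modelled as follows. This is a
simplified Python sketch that follows the description literally, not
the kernel code: PageRecord, record_access() and should_migrate() are
hypothetical names, freq_threshold is ignored, and treating a latency
exactly equal to the threshold as "not within" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

BASE_THRESHOLD_MS = 1000           # base NUMAB2 default, per the text
PGHOT_DEFAULT_THRESHOLD_MS = 3000  # pghot-default, per the text


def base_numab2_is_hot(scan_ms: int, fault_ms: int,
                       threshold_ms: int = BASE_THRESHOLD_MS) -> bool:
    """Base: latency is PTE update (scan) time to hint fault time."""
    return fault_ms - scan_ms < threshold_ms


@dataclass
class PageRecord:
    """Per-page state for the simplified pghot model (hypothetical)."""
    last_access_ms: Optional[int] = None
    hot_since_ms: Optional[int] = None


def record_access(rec: PageRecord, now_ms: int, threshold_ms: int) -> None:
    """Purpose 1: mark hot if successive accesses fall within the threshold."""
    if rec.last_access_ms is not None and now_ms - rec.last_access_ms < threshold_ms:
        rec.hot_since_ms = now_ms
    rec.last_access_ms = now_ms


def should_migrate(rec: PageRecord, now_ms: int, threshold_ms: int) -> bool:
    """Purpose 2: migrate only if the page was marked hot recently enough."""
    return rec.hot_since_ms is not None and now_ms - rec.hot_since_ms < threshold_ms
```

For example, two accesses 1500ms apart mark a page hot under the 3000ms
pghot-default threshold, while a 1500ms scan-to-fault latency is
rejected by the 1000ms base threshold. The two latencies measure
different intervals, so this only illustrates why the larger pghot
window admits more pages.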