The Linux Kernel Mailing List
  • [parent not found: <20260504060924.344313-5-bharata@amd.com>]
  • * Re: [PATCH v7 0/7] mm: Hot page tracking and promotion infrastructure
           [not found] <20260504060924.344313-1-bharata@amd.com>
           [not found] ` <20260504060924.344313-3-bharata@amd.com>
           [not found] ` <20260504060924.344313-5-bharata@amd.com>
    @ 2026-05-04 20:36 ` Matthew Wilcox
      2026-05-05 22:17   ` Balbir Singh
      2026-05-06 15:22   ` Gregory Price
           [not found] ` <20260504060924.344313-6-bharata@amd.com>
                       ` (2 subsequent siblings)
      5 siblings, 2 replies; 19+ messages in thread
    From: Matthew Wilcox @ 2026-05-04 20:36 UTC
      To: Bharata B Rao
      Cc: linux-kernel, linux-mm, Jonathan.Cameron, dave.hansen, gourry,
    	mgorman, mingo, peterz, raghavendra.kt, riel, rientjes, sj,
    	weixugc, ying.huang, ziy, dave, nifan.cxl, xuezhengchu, yiannis,
    	akpm, david, byungchul, kinseyho, joshua.hahnjy, yuanchu, balbirs,
    	alok.rathore, shivankg, donettom
    
    On Mon, May 04, 2026 at 11:39:17AM +0530, Bharata B Rao wrote:
    > This is v7 of pghot, a hot-page tracking and promotion subsystem. The
    
    I continue to think we should not do this.
    
  • [parent not found: <20260504060924.344313-6-bharata@amd.com>]
  • * Re: [PATCH v7 0/7] mm: Hot page tracking and promotion infrastructure
           [not found] <20260504060924.344313-1-bharata@amd.com>
                       ` (3 preceding siblings ...)
           [not found] ` <20260504060924.344313-6-bharata@amd.com>
    @ 2026-05-05 10:41 ` Bharata B Rao
      2026-05-09  1:18   ` Andrew Morton
      2026-05-05 13:42 ` Bharata B Rao
      5 siblings, 1 reply; 19+ messages in thread
    From: Bharata B Rao @ 2026-05-05 10:41 UTC
      To: linux-kernel, linux-mm
      Cc: Jonathan.Cameron, dave.hansen, gourry, mgorman, mingo, peterz,
    	raghavendra.kt, riel, rientjes, sj, weixugc, willy, ying.huang,
    	ziy, dave, nifan.cxl, xuezhengchu, yiannis, akpm, david,
    	byungchul, kinseyho, joshua.hahnjy, yuanchu, balbirs,
    	alok.rathore, shivankg, donettom
    
    On 04-May-26 11:39 AM, Bharata B Rao wrote:
    > Results
    > =======
    > Posted as replies to this mail thread.
    
    Graph500 benchmark results:
    
    Test system details
    -------------------
    3 node AMD Zen5 system with 2 regular NUMA nodes (0, 1) and a CXL node (2)
    
    $ numactl -H
    available: 3 nodes (0-2)
    node 0 cpus: 0-95,192-287
    node 0 size: 128460 MB
    node 1 cpus: 96-191,288-383
    node 1 size: 128893 MB
    node 2 cpus:
    node 2 size: 257993 MB
    node distances:
    node   0   1   2
      0:  10  32  50
      1:  32  10  60
      2:  255  255  10
    
    Hotness sources
    ---------------
    NUMAB0 - Without NUMA Balancing in base case and with no source enabled
             in the pghot case. No migrations occur.
    NUMAB2 - Existing hot page promotion for the base case and
             use of hint faults as source in the pghot case.
    NUMAB3 - Enabled both regular and tiering mode of NUMA Balancing
             (kernel.numa_balancing=3)
    
    Pghot by default promotes after two accesses, but for the NUMAB2 source
    promotion is done after one access to match the base behaviour
    (/sys/kernel/debug/pghot/freq_threshold=1).
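
    A minimal sketch of how the two tunables named above could be applied
    from a script. The paths come from this mail (the kernel.numa_balancing
    sysctl is exposed as /proc/sys/kernel/numa_balancing); the helper names
    and the `root` parameter are hypothetical, added only so the sketch can
    be tried against a scratch directory. Writing the real files needs root,
    a mounted debugfs and a kernel carrying this series.

```python
from pathlib import Path

# Paths taken from the mail; the helpers below are illustrative only.
NUMA_BALANCING = "proc/sys/kernel/numa_balancing"         # sysctl kernel.numa_balancing
FREQ_THRESHOLD = "sys/kernel/debug/pghot/freq_threshold"  # pghot debugfs knob


def set_tunable(relpath: str, value: int, root: str = "/") -> Path:
    """Write an integer tunable below `root` and return the path written."""
    path = Path(root) / relpath
    path.write_text(f"{value}\n")
    return path


def read_tunable(relpath: str, root: str = "/") -> int:
    """Read an integer tunable back from below `root`."""
    return int((Path(root) / relpath).read_text().strip())


# Example (as root, on a kernel with pghot):
#   set_tunable(NUMA_BALANCING, 3)   # regular + tiering mode (NUMAB3)
#   set_tunable(FREQ_THRESHOLD, 1)   # promote after one access
```

    Passing a scratch directory as `root` lets the helpers be exercised
    without touching the live system.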
    
    Graph500 details
    ----------------
    Command: mpirun -n 128 --bind-to core --map-by core
    graph500/src/graph500_reference_bfs 28 16
    
    After the graph creation, the processes are stopped and data is migrated
    to CXL node 2 before continuing so that BFS phase starts accessing lower
    tier memory.
    
    Total memory usage is slightly over 100GB and will fit within Node 0 and 1.
    Hence there is no memory pressure to induce demotions.
    
    harmonic_mean_TEPS - Higher is better
    =====================================================================================
                            Base            Base            pghot-default   pghot-precise
                            NUMAB0          NUMAB2          NUMAB2          NUMAB2
    =====================================================================================
    harmonic_mean_TEPS      5.08026e+08     7.48633e+08     5.46257e+08     7.45101e+08
    mean_time               8.45413         5.73702         7.86245         5.76421
    median_TEPS             5.09236e+08     7.25058e+08     5.40525e+08     7.63752e+08
    max_TEPS                5.15244e+08     1.03391e+09     8.51317e+08     9.7552e+08
    
    pgpromote_success       0               13809474        13763582        13763155
    numa_pte_updates        0               26746117        39502157        36368086
    numa_hint_faults        0               13811769        24248272        21172314
    =====================================================================================
                                                            pghot-default
                                                            NUMAB3
    =====================================================================================
    harmonic_mean_TEPS                                      7.00515e+08
    mean_time                                               6.13109
    median_TEPS                                             7.06813e+08
    max_TEPS                                                7.63164e+08
    
    pgpromote_success                                       13762087
    numa_pte_updates                                        93632490
    numa_hint_faults                                        70566306
    =====================================================================================
    - With NUMAB2, the base case improves harmonic_mean_TEPS by about 47% over
      NUMAB0.
    - pghot-precise maintains that improvement (within 0.5% of base NUMAB2).
    - pghot-default shows little benefit (about 7.5% over NUMAB0) even though it
      achieves similar page promotion numbers. This mode doesn't track the
      accessing NID and by default promotes to NID=0, which probably isn't all
      that beneficial as processes are running on both Node 0 and Node 1.
    - pghot-default recovers most of the performance (about 94% of base NUMAB2)
      when balancing between top-tier nodes 0 and 1 is enabled in addition to
      hot page promotion (NUMAB3).
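
    As a rough cross-check of the observations above, the relative changes
    in harmonic_mean_TEPS can be recomputed from the table. This is an
    illustrative sketch, not part of the original test scripts; the
    dictionary keys and the helper name are made up for the example, and
    the values are copied from the table.

```python
# harmonic_mean_TEPS values copied from the results table.
teps = {
    "Base NUMAB0": 5.08026e+08,
    "Base NUMAB2": 7.48633e+08,
    "pghot-default NUMAB2": 5.46257e+08,
    "pghot-precise NUMAB2": 7.45101e+08,
    "pghot-default NUMAB3": 7.00515e+08,
}


def pct_change(value: float, baseline: float) -> float:
    """Percentage change of `value` relative to `baseline`."""
    return (value / baseline - 1.0) * 100.0


# Change relative to no migration (NUMAB0) and to existing promotion (NUMAB2).
vs_numab0 = {k: pct_change(v, teps["Base NUMAB0"]) for k, v in teps.items()}
vs_numab2 = {k: pct_change(v, teps["Base NUMAB2"]) for k, v in teps.items()}

for name in teps:
    print(f"{name:22s} {vs_numab0[name]:+7.1f}% vs NUMAB0"
          f" {vs_numab2[name]:+7.1f}% vs base NUMAB2")
```

    Comparing against both baselines separates "how much better than no
    migration" from "how close to the existing promotion path".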
    
    
  • * Re: [PATCH v7 0/7] mm: Hot page tracking and promotion infrastructure
           [not found] <20260504060924.344313-1-bharata@amd.com>
                       ` (4 preceding siblings ...)
      2026-05-05 10:41 ` [PATCH v7 0/7] mm: Hot page tracking and promotion infrastructure Bharata B Rao
    @ 2026-05-05 13:42 ` Bharata B Rao
      5 siblings, 0 replies; 19+ messages in thread
    From: Bharata B Rao @ 2026-05-05 13:42 UTC
      To: linux-kernel, linux-mm
      Cc: Jonathan.Cameron, dave.hansen, gourry, mgorman, mingo, peterz,
    	raghavendra.kt, riel, rientjes, sj, weixugc, willy, ying.huang,
    	ziy, dave, nifan.cxl, xuezhengchu, yiannis, akpm, david,
    	byungchul, kinseyho, joshua.hahnjy, yuanchu, balbirs,
    	alok.rathore, shivankg, donettom
    
    On 04-May-26 11:39 AM, Bharata B Rao wrote:
    > Results
    > =======
    > Posted as replies to this mail thread.
    
    Initial Graph500 benchmark numbers for IBS Memory Profiler source:
    
    Test system details
    -------------------
    3 node AMD system with 2 regular NUMA nodes (0, 1) in NPS2 mode and a CXL node (2)
    
    $ numactl -H
    available: 3 nodes (0-2)
    node 0 cpus: 0-63,128-191
    node 0 size: 257715 MB
    node 1 cpus: 64-127,192-255
    node 1 size: 257845 MB
    node 2 cpus:
    node 2 size: 258032 MB
    node distances:
    node   0   1   2
      0:  10  12  50
      1:  12  10  50
      2:  255  255  10
    
    Hotness sources
    ---------------
    NUMAB0 - Without NUMA Balancing in base case and with no source enabled
             in the pghot case. No migrations occur.
    NUMAB2 - Existing hot page promotion for the base case and
             use of hint faults as source in the pghot case.
    HWHINTS - IBS Memory Profiler as source for pghot
    
    Pghot by default promotes after two accesses, but for the NUMAB2 and HWHINTS
    sources promotion is done after one access to match the base behaviour
    (/sys/kernel/debug/pghot/freq_threshold=1).
    
    Graph500 details
    ----------------
    Command: mpirun -n 128 --bind-to core --map-by core
    graph500/src/graph500_reference_bfs 28 16
    
    After the graph creation, the processes are stopped and data is migrated
    to CXL node 2 before continuing so that BFS phase starts accessing lower
    tier memory.
    
    Total memory usage is slightly over 100GB and will fit within Node 0 and 1.
    Hence there is no memory pressure to induce demotions.
    
    harmonic_mean_TEPS - Higher is better
    =============================================================================
                                    Base            Base            pghot-default
                                    NUMAB0          NUMAB2          NUMAB2
    =============================================================================
    harmonic_mean_TEPS              4.09614e+08     1.28401e+09     1.47926e+09
    mean_time                       10.4853         3.34492         2.90342
    median_TEPS                     4.10086e+08     1.44584e+09     1.85957e+09
    max_TEPS                        4.1661e+08      1.79773e+09     1.99242e+09
    
    pgpromote_success               0               13746029        13412213
    numa_hint_faults                0               13753808        26669823
    
    pghot_recorded_accesses         NA              NA              26669551
    pghot_recorded_hintfaults       NA              NA              26669823
    pghot_recorded_hwhints          NA              NA              0
    hwhint_total_events             NA              NA              0
    =============================================================================
                                                                    pghot-default
                                                                    HWHINTS
    =============================================================================
    harmonic_mean_TEPS                                              1.52334e+09
    mean_time                                                       2.81941
    median_TEPS                                                     1.57446e+09
    max_TEPS                                                        1.72014e+09
    
    pgpromote_success                                               3415599
    numa_hint_faults                                                0
    
    pghot_recorded_accesses                                         3440912
    pghot_recorded_hintfaults                                       0
    pghot_recorded_hwhints                                          24475210
    hwhint_total_events                                             24475244
    =============================================================================
    While no migration at all (NUMAB0) hurts Graph500, HWHINTS with pghot provides
    benchmark numbers similar to base NUMAB2 while promoting only about a quarter
    as many pages (pgpromote_success of 3415599 vs 13746029).
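
    The comparison above can be made concrete by normalising each pghot run
    to base NUMAB2. This is an illustrative sketch using the
    harmonic_mean_TEPS and pgpromote_success values from the tables; the
    variable names are hypothetical.

```python
# (harmonic_mean_TEPS, pgpromote_success) copied from the results tables.
rows = {
    "Base NUMAB2": (1.28401e+09, 13746029),
    "pghot-default NUMAB2": (1.47926e+09, 13412213),
    "pghot-default HWHINTS": (1.52334e+09, 3415599),
}

base_teps, base_promoted = rows["Base NUMAB2"]

# For each run: throughput and promotion count as a fraction of base NUMAB2.
ratios = {
    name: (teps / base_teps, promoted / base_promoted)
    for name, (teps, promoted) in rows.items()
}

for name, (teps_ratio, promo_ratio) in ratios.items():
    print(f"{name:22s} TEPS x{teps_ratio:.2f}  promotions x{promo_ratio:.2f}")
```

    A promotion ratio well below 1 alongside a TEPS ratio near or above 1
    is what "not migrating as aggressively" with similar results looks like
    in these numbers.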
    

  • end of thread, other threads:[~2026-05-11 14:38 UTC | newest]
    
    Thread overview: 19+ messages
         [not found] <20260504060924.344313-1-bharata@amd.com>
         [not found] ` <20260504060924.344313-3-bharata@amd.com>
    2026-05-04 18:14   ` [PATCH v7 2/7] mm: migrate: Add promote_misplaced_memcg_folios() Donet Tom
    2026-05-06  6:15     ` Bharata B Rao
         [not found] ` <20260504060924.344313-5-bharata@amd.com>
    2026-05-04 18:41   ` [PATCH v7 4/7] mm: pghot: Precision mode for pghot Donet Tom
    2026-05-06  6:17     ` Bharata B Rao
    2026-05-04 20:36 ` [PATCH v7 0/7] mm: Hot page tracking and promotion infrastructure Matthew Wilcox
    2026-05-05 22:17   ` Balbir Singh
    2026-05-06  3:43     ` Bharata B Rao
    2026-05-06  4:02       ` Balbir Singh
    2026-05-06  5:00         ` Bharata B Rao
    2026-05-06 15:22   ` Gregory Price
    2026-05-11 10:02     ` Bharata B Rao
    2026-05-11 14:27       ` Gregory Price
         [not found] ` <20260504060924.344313-6-bharata@amd.com>
    2026-05-05  4:44   ` [PATCH v7 5/7] mm: sched: move NUMA balancing tiering promotion to pghot Donet Tom
    2026-05-06  6:20     ` Bharata B Rao
    2026-05-05 10:41 ` [PATCH v7 0/7] mm: Hot page tracking and promotion infrastructure Bharata B Rao
    2026-05-09  1:18   ` Andrew Morton
    2026-05-11 10:37     ` Bharata B Rao
    2026-05-11 14:38       ` Gregory Price
    2026-05-05 13:42 ` Bharata B Rao
    
