public inbox for linux-mm@kvack.org
* [PATCH 0/2] mm: batch TLB flushing for dirty folios in vmscan
@ 2026-03-09  8:17 Zhang Peng via B4 Relay
  2026-03-09  8:17 ` [PATCH 1/2] mm/vmscan: refactor shrink_folio_list for readability and maintainability Zhang Peng via B4 Relay
  2026-03-09  8:17 ` [PATCH 2/2] mm, vmscan: flush TLB for every 31 folios evictions Zhang Peng via B4 Relay
  0 siblings, 2 replies; 6+ messages in thread
From: Zhang Peng via B4 Relay @ 2026-03-09  8:17 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
	Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Johannes Weiner, Qi Zheng,
	Shakeel Butt, Axel Rasmussen, Yuanchu Xie, Wei Xu, Michal Hocko
  Cc: linux-mm, linux-kernel, Kairui Song, Zhang Peng

This series introduces a batched TLB flushing optimization for dirty folios
during memory reclaim, aiming to reduce IPI overhead on multi-core systems.

Background
----------
Currently, when performing pageout in memory reclaim, try_to_unmap_flush_dirty()
is called once for each dirty folio. On multi-core systems, this causes
frequent IPIs, which can significantly impact performance.

Approach
--------
This patch series accumulates dirty folios into batches and performs a single
TLB flush for the entire batch, rather than flushing for each individual folio.
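
As a rough illustration of the idea (not the code in patch 2), the
following userspace C model counts how many flushes each strategy issues
for the same number of dirty folios. All names here (reclaim_stats,
reclaim_per_folio, reclaim_batched, drain_batch, DIRTY_BATCH_MAX) are
hypothetical; the batch size of 31 is taken from the subject of patch 2,
and the counters merely stand in for real TLB flushes and pageout calls.

```c
#include <stddef.h>

/* Hypothetical batch capacity; 31 comes from the subject of patch 2. */
#define DIRTY_BATCH_MAX 31

/* Toy counters standing in for real TLB flushes and pageout calls. */
struct reclaim_stats {
	unsigned long tlb_flushes;
	unsigned long pageouts;
};

/* Baseline model: one flush before every dirty folio is written out. */
static void reclaim_per_folio(size_t nr_dirty, struct reclaim_stats *st)
{
	for (size_t i = 0; i < nr_dirty; i++) {
		st->tlb_flushes++;	/* models try_to_unmap_flush_dirty() */
		st->pageouts++;		/* models pageout of this folio */
	}
}

/* Drain a batch: a single flush covers every folio queued in it. */
static void drain_batch(size_t *queued, struct reclaim_stats *st)
{
	if (*queued == 0)
		return;
	st->tlb_flushes++;
	st->pageouts += *queued;
	*queued = 0;
}

/* Batched model: queue dirty folios and flush once per full batch. */
static void reclaim_batched(size_t nr_dirty, struct reclaim_stats *st)
{
	size_t queued = 0;

	for (size_t i = 0; i < nr_dirty; i++) {
		if (++queued == DIRTY_BATCH_MAX)
			drain_batch(&queued, st);
	}
	drain_batch(&queued, st);	/* flush the final partial batch */
}
```

In this model, 100 dirty folios cost 100 flushes in the per-folio case and
4 in the batched case (three full batches of 31 plus one partial batch of
7), while the number of pageouts stays the same.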

Changes
-------
Patch 1: Refactor shrink_folio_list() to improve code readability and
         maintainability by extracting common logic into helper functions:
         - folio_active_bounce(): Handle folio activation logic
         - folio_free(): Handle folio freeing logic
         - pageout_one(): Handle single folio pageout logic

Patch 2: Implement batch TLB flushing logic. Dirty folios are accumulated
         in batches and a single TLB flush is performed for each batch
         before calling pageout.

Testing
-------
The benchmark script uses stress-ng to compare TLB shootdown behavior before and
after this series. It constrains a stress-ng workload via memcg to force reclaim
through shrink_folio_list(), and reports TLB shootdowns and IPIs.

Core benchmark command: stress-ng --vm 16 --vm-bytes 2G --vm-keep --timeout 60

==========================================================================
                 batch_dirty_tlb_flush Benchmark Results
==========================================================================
  Kernel: 7.0.0-rc1+   CPUs: 16
  MemTotal: 31834M   SwapTotal: 8191M
  memcg limit: 512M   alloc: 2G   workers: 16   duration: 60s
--------------------------------------------------------------------------
Metric                 Before        After             Delta (abs / %)
--------------------------------------------------------------------------
bogo ops/s             28238.63      35833.97          +7595.34 (+26.9%)
TLB shootdowns         55428953      17621697          -37807256 (-68.2%)
Function call IPIs     34073695      14498768          -19574927 (-57.4%)
pgscan_anon (pages)    52856224      60252894          +7396670 (+14.0%)
pgsteal_anon (pages)   29004962      34054753          +5049791 (+17.4%)
--------------------------------------------------------------------------
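
For readers who want to re-derive the Delta column, a minimal sketch of
the arithmetic follows. The struct and function names (delta,
metric_delta) are hypothetical and are not part of the benchmark script;
the sketch only assumes that the percentage is taken relative to the
"Before" value.

```c
/* Absolute and relative change between two measurements. */
struct delta {
	double abs;	/* after - before */
	double pct;	/* change as a percentage of "before" */
};

/* Compute the delta; a zero baseline yields 0% to avoid dividing by zero. */
static struct delta metric_delta(double before, double after)
{
	struct delta d;

	d.abs = after - before;
	d.pct = (before != 0.0) ? d.abs / before * 100.0 : 0.0;
	return d;
}
```

For example, metric_delta(55428953, 17621697) gives an absolute change of
-37807256 and a relative change that rounds to -68.2%, matching the TLB
shootdowns row.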

Suggested-by: Kairui Song <kasong@tencent.com>
Signed-off-by: Zhang Peng <bruzzhang@tencent.com>
---
bruzzhang (2):
      mm/vmscan: refactor shrink_folio_list for readability and maintainability
      mm, vmscan: flush TLB for every 31 folios evictions

 include/linux/vmstat.h |   1 +
 mm/vmscan.c            | 387 +++++++++++++++++++++++++++++++------------------
 2 files changed, 245 insertions(+), 143 deletions(-)
---
base-commit: 49cb736d092aaa856283e33b78ec3afb3964d82f
change-id: 20260309-batch-tlb-flush-893f0e56b496

Best regards,
-- 
Zhang Peng <zippermonkey@icloud.com>




Thread overview: 6+ messages (newest: 2026-03-09 14:56 UTC)
2026-03-09  8:17 [PATCH 0/2] mm: batch TLB flushing for dirty folios in vmscan Zhang Peng via B4 Relay
2026-03-09  8:17 ` [PATCH 1/2] mm/vmscan: refactor shrink_folio_list for readability and maintainability Zhang Peng via B4 Relay
2026-03-09  8:17 ` [PATCH 2/2] mm, vmscan: flush TLB for every 31 folios evictions Zhang Peng via B4 Relay
2026-03-09 12:29   ` Usama Arif
2026-03-09 13:19     ` Kairui Song
2026-03-09 14:56     ` Zhang Peng
