linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7] Split list_lru lock into per-cgroup scope
@ 2024-06-24 17:53 Kairui Song
  2024-06-24 17:53 ` [PATCH 1/7] mm/swap, workingset: make anon workingset nodes memcg aware Kairui Song
                   ` (8 more replies)
  0 siblings, 9 replies; 22+ messages in thread
From: Kairui Song @ 2024-06-24 17:53 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Matthew Wilcox, Johannes Weiner, Roman Gushchin,
	Waiman Long, Shakeel Butt, Nhat Pham, Michal Hocko,
	Chengming Zhou, Qi Zheng, Muchun Song, Chris Li, Yosry Ahmed,
	Huang, Ying, Kairui Song

From: Kairui Song <kasong@tencent.com>

Currently, every list_lru has a per-node lock that protects adding,
deletion, isolation, and reparenting of all list_lru_one instances
belonging to this list_lru on this node. This lock contention is heavy
when multiple cgroups modify the same list_lru.

This can be alleviated by splitting the lock into per-cgroup scope.

To achieve this, this series reworks and optimizes the reparenting
process step by step, making it possible to have a stable list_lru_one,
and making it possible to pin the list_lru_one. Then split the lock
into per-cgroup scope.

The result is reduced LOC and better performance: I see a ~25%
improvement for multi-cgroup SWAP over ZRAM and a ~10% improvement for
multi-cgroup inode / dentry workload, as tested in PATCH 6/7:

memhog SWAP test (shadow nodes):
Before:
real    0m20.328s user    0m4.315s sys     10m23.639s
real    0m20.440s user    0m4.142s sys     10m34.756s
real    0m20.381s user    0m4.164s sys     10m29.035s

After:
real    0m15.156s user    0m4.590s sys     7m34.361s
real    0m15.161s user    0m4.776s sys     7m35.086s
real    0m15.429s user    0m4.734s sys     7m42.919s

File read test (inode / dentry):
Before:
real    0m26.939s user    0m36.322s sys     6m30.248s
real    0m15.111s user    0m33.749s sys     5m4.991s
real    0m16.796s user    0m33.438s sys     5m22.865s
real    0m15.256s user    0m34.060s sys     4m56.870s
real    0m14.826s user    0m33.531s sys     4m55.907s
real    0m15.664s user    0m35.619s sys     6m3.638s
real    0m15.746s user    0m34.066s sys     4m56.519s

After:
real    0m22.166s user    0m35.155s sys     6m21.045s
real    0m13.753s user    0m34.554s sys     4m40.982s
real    0m13.815s user    0m34.693s sys     4m39.605s
real    0m13.495s user    0m34.372s sys     4m40.776s
real    0m13.895s user    0m34.005s sys     4m39.061s
real    0m13.629s user    0m33.476s sys     4m43.626s
real    0m14.001s user    0m33.463s sys     4m41.261s

PATCH 1/7: Fixes a long-existing bug, so shadow nodes will be accounted
    to the right cgroup and put into the right list_lru.
PATCH 2/7 - 4/7: Clean up
PATCH 6/7: Reworks and optimizes reparenting process, avoids touching
    kmemcg_id on reparenting as first step.
PATCH 7/7: Makes it possible to pin the list_lru_one and prevent racing
    with reparenting, and splits the lock.

Kairui Song (7):
  mm/swap, workingset: make anon workingset nodes memcg aware
  mm/list_lru: don't pass unnecessary key parameters
  mm/list_lru: don't export list_lru_add
  mm/list_lru: code clean up for reparenting
  mm/list_lru: simplify reparenting and initial allocation
  mm/list_lru: split the lock to per-cgroup scope
  mm/list_lru: Simplify the list_lru walk callback function

 drivers/android/binder_alloc.c |   6 +-
 drivers/android/binder_alloc.h |   2 +-
 fs/dcache.c                    |   4 +-
 fs/gfs2/quota.c                |   2 +-
 fs/inode.c                     |   5 +-
 fs/nfs/nfs42xattr.c            |   4 +-
 fs/nfsd/filecache.c            |   5 +-
 fs/xfs/xfs_buf.c               |   2 -
 fs/xfs/xfs_qm.c                |   6 +-
 include/linux/list_lru.h       |  26 ++-
 mm/list_lru.c                  | 387 +++++++++++++++++----------------
 mm/memcontrol.c                |  10 +-
 mm/swap_state.c                |   3 +-
 mm/workingset.c                |  20 +-
 mm/zswap.c                     |  12 +-
 15 files changed, 246 insertions(+), 248 deletions(-)

-- 
2.45.2



^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2024-08-27 18:36 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-24 17:53 [PATCH 0/7] Split list_lru lock into per-cgroup scope Kairui Song
2024-06-24 17:53 ` [PATCH 1/7] mm/swap, workingset: make anon workingset nodes memcg aware Kairui Song
2024-07-17  3:25   ` Muchun Song
2024-07-18 11:33     ` Kairui Song
2024-07-19  1:34   ` Shakeel Butt
2024-06-24 17:53 ` [PATCH 2/7] mm/list_lru: don't pass unnecessary key parameters Kairui Song
2024-06-24 17:53 ` [PATCH 3/7] mm/list_lru: don't export list_lru_add Kairui Song
2024-07-17  3:12   ` Muchun Song
2024-06-24 17:53 ` [PATCH 4/7] mm/list_lru: code clean up for reparenting Kairui Song
2024-07-15  9:10   ` Muchun Song
2024-07-16  8:15     ` Kairui Song
2024-06-24 17:53 ` [PATCH 5/7] mm/list_lru: simplify reparenting and initial allocation Kairui Song
2024-07-17  3:04   ` Muchun Song
2024-07-18 11:49     ` Kairui Song
2024-07-19  2:45       ` Muchun Song
2024-06-24 17:53 ` [PATCH 6/7] mm/list_lru: split the lock to per-cgroup scope Kairui Song
2024-06-24 17:53 ` [PATCH 7/7] mm/list_lru: Simplify the list_lru walk callback function Kairui Song
2024-06-27 19:58   ` kernel test robot
2024-06-24 21:26 ` [PATCH 0/7] Split list_lru lock into per-cgroup scope Andrew Morton
2024-06-25  7:47   ` Kairui Song
2024-06-25 17:00   ` Shakeel Butt
2024-08-27 18:35 ` Shakeel Butt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).