cgroups.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* drm/ttm/memcg/lru: enable memcg tracking for ttm and amdgpu driver (complete series v4)
@ 2025-10-16  2:31 Dave Airlie
  2025-10-16  2:31 ` [PATCH 01/16] mm: add gpu active/reclaim per-node stat counters (v2) Dave Airlie
                   ` (15 more replies)
  0 siblings, 16 replies; 18+ messages in thread
From: Dave Airlie @ 2025-10-16  2:31 UTC (permalink / raw)
  To: dri-devel, tj, christian.koenig, Johannes Weiner, Michal Hocko,
	Roman Gushchin, Shakeel Butt, Muchun Song
  Cc: cgroups, Dave Chinner, Waiman Long, simona

Hi all,

This is a another repost with some fixes and cleanups. I've added Christian's acks/reviews from the
previous round. I've fixed the obj_cgroup_put into the core, instead of in the drivers.

I'd really like to land this into drm-next, I've added Maarten xe support patch to this. I'd like
to get any missing acks/reviews.

Christian, I think you said patch 4 got lost last time, hopefully you get it this time.

Patches still needing ack/review:
ttm/pool: drop numa specific pools
ttm/pool: track allocated_pages per numa node.
ttm: add objcg pointer to bo and tt (v2)
ttm/pool: enable memcg tracking and shrinker. (v2)
amdgpu: add support for memory cgroups

Differences since v1 posting:
1. added ttm_bo_set_cgroup wrapper - the cgroup reference is passed to the ttm object.
2. put the cgroup reference in ttm object release
3. rebase onto 6.19-rc1
4. added xe support patch from Maarten.

Differences since v2 posting:
1. Squashed exports into where they are used (Shakeel)
2. Fixed bug in uncharge path memcg
3. Fixed config bug in the module option.

Differences since 1st posting:
1. Added patch 18: add a module option to allow pooled pages to not be stored in the lru per-memcg
   (Requested by Christian Konig)
2. Converged the naming and stats between vmstat and memcg (Suggested by Shakeel Butt)
3. Cleaned up the charge/uncharge code and some other bits.

Dave.

Original cover letter:
tl;dr: start using list_lru/numa/memcg in GPU driver core and amdgpu driver for now.

This is a complete series of patches, some of which have been sent before and reviewed,
but I want to get the complete picture for others, and try to figure out how best to land this.

There are 3 pieces to this:
01->02: add support for global gpu stat counters (previously posted, patch 2 is newer)
03->06: port ttm pools to list_lru for numa awareness
07->13: add memcg stats + gpu apis, then port ttm pools to memcg aware list_lru and shrinker
14: enable amdgpu to use new functionality.
15: add a module option to turn it all off.

The biggest difference in the memcg code from previously is I discovered what
obj cgroups were designed for and I'm reusing the page/objcg intergration that 
already exists, to avoid reinventing that wheel right now.

There are some igt-gpu-tools tests I've written at:
https://gitlab.freedesktop.org/airlied/igt-gpu-tools/-/tree/amdgpu-cgroups?ref_type=heads

One problem is there are a lot of delayed action, that probably means the testing
needs a bit more robustness, but the tests validate all the basic paths.

Regards,
Dave.



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2025-10-16  7:48 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-16  2:31 drm/ttm/memcg/lru: enable memcg tracking for ttm and amdgpu driver (complete series v4) Dave Airlie
2025-10-16  2:31 ` [PATCH 01/16] mm: add gpu active/reclaim per-node stat counters (v2) Dave Airlie
2025-10-16  7:48   ` Christian König
2025-10-16  2:31 ` [PATCH 02/16] drm/ttm: use gpu mm stats to track gpu memory allocations. (v4) Dave Airlie
2025-10-16  2:31 ` [PATCH 03/16] ttm/pool: port to list_lru. (v2) Dave Airlie
2025-10-16  2:31 ` [PATCH 04/16] ttm/pool: drop numa specific pools Dave Airlie
2025-10-16  2:31 ` [PATCH 05/16] ttm/pool: make pool shrinker NUMA aware Dave Airlie
2025-10-16  2:31 ` [PATCH 06/16] ttm/pool: track allocated_pages per numa node Dave Airlie
2025-10-16  2:31 ` [PATCH 07/16] memcg: add support for GPU page counters. (v3) Dave Airlie
2025-10-16  2:31 ` [PATCH 08/16] ttm: add a memcg accounting flag to the alloc/populate APIs Dave Airlie
2025-10-16  2:31 ` [PATCH 09/16] ttm/pool: initialise the shrinker earlier Dave Airlie
2025-10-16  2:31 ` [PATCH 10/16] ttm: add objcg pointer to bo and tt (v2) Dave Airlie
2025-10-16  2:31 ` [PATCH 11/16] ttm/pool: enable memcg tracking and shrinker. (v2) Dave Airlie
2025-10-16  2:31 ` [PATCH 12/16] ttm: hook up memcg placement flags Dave Airlie
2025-10-16  2:31 ` [PATCH 13/16] memcontrol: allow objcg api when memcg is config off Dave Airlie
2025-10-16  2:31 ` [PATCH 14/16] amdgpu: add support for memory cgroups Dave Airlie
2025-10-16  2:31 ` [PATCH 15/16] ttm: add support for a module option to disable memcg integration Dave Airlie
2025-10-16  2:31 ` [PATCH 16/16] xe: create a flag to enable memcg accounting for XE as well Dave Airlie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).