public inbox for cgroups@vger.kernel.org
From: Dave Airlie <airlied@gmail.com>
To: dri-devel@lists.freedesktop.org, tj@kernel.org,
	christian.koenig@amd.com, Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Muchun Song <muchun.song@linux.dev>
Cc: cgroups@vger.kernel.org, Dave Chinner <david@fromorbit.com>,
	Waiman Long <longman@redhat.com>,
	simona@ffwll.ch
Subject: [PATCH 07/16] memcg: add support for GPU page counters. (v4)
Date: Tue, 24 Feb 2026 12:06:24 +1000
Message-ID: <20260224020854.791201-8-airlied@gmail.com>
In-Reply-To: <20260224020854.791201-1-airlied@gmail.com>

From: Dave Airlie <airlied@redhat.com>

This introduces two new statistics and three new memcontrol APIs for
dealing with GPU system memory allocations.

The stats correspond to the matching counters in the global vmstat:
the number of active GPU pages, and the number of pages held in pools
that can be reclaimed.

The first API charges an order of pages to an objcg, sets the objcg on
the pages as kmem does, and updates the active/reclaim statistic.

The second API uncharges a page from the obj cgroup it is currently
charged to.

The third API moves a page to/from reclaim and between obj cgroups.
When pages are added to the pool lru, this just updates the accounting.
When pages are removed from a pool lru, they may be taken from the
parent objcg, so this allows them to be uncharged from there and
transferred to a new child objcg.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
---
v2: use memcg_node_stat_items
v3: fix null ptr dereference in uncharge
v4: AI review: fix parameter names; fix reclaim move updating the wrong counters
---
 Documentation/admin-guide/cgroup-v2.rst |   6 ++
 include/linux/memcontrol.h              |  11 +++
 mm/memcontrol.c                         | 104 ++++++++++++++++++++++++
 3 files changed, 121 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 91beaa6798ce..3ea7f1a399e8 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1573,6 +1573,12 @@ The following nested keys are defined.
 	  vmalloc (npn)
 		Amount of memory used for vmap backed memory.
 
+	  gpu_active (npn)
+		Amount of system memory actively in use by GPU devices.
+
+	  gpu_reclaim (npn)
+		Amount of reclaimable system memory cached for GPU devices.
+
 	  shmem
 		Amount of cached filesystem data that is swap-backed,
 		such as tmpfs, shm segments, shared anonymous mmap()s
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 70b685a85bf4..4f75d64f5fca 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1583,6 +1583,17 @@ static inline void mem_cgroup_flush_foreign(struct bdi_writeback *wb)
 #endif	/* CONFIG_CGROUP_WRITEBACK */
 
 struct sock;
+bool mem_cgroup_charge_gpu_page(struct obj_cgroup *objcg, struct page *page,
+			   unsigned int order,
+			   gfp_t gfp_mask, bool reclaim);
+void mem_cgroup_uncharge_gpu_page(struct page *page,
+				  unsigned int order,
+				  bool reclaim);
+bool mem_cgroup_move_gpu_page_reclaim(struct obj_cgroup *objcg,
+				      struct page *page,
+				      unsigned int order,
+				      bool to_reclaim);
+
 #ifdef CONFIG_MEMCG
 extern struct static_key_false memcg_sockets_enabled_key;
 #define mem_cgroup_sockets_enabled static_branch_unlikely(&memcg_sockets_enabled_key)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a52da3a5e4fd..90bb3e00c258 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -333,6 +333,8 @@ static const unsigned int memcg_node_stat_items[] = {
 #ifdef CONFIG_HUGETLB_PAGE
 	NR_HUGETLB,
 #endif
+	NR_GPU_ACTIVE,
+	NR_GPU_RECLAIM,
 };
 
 static const unsigned int memcg_stat_items[] = {
@@ -1360,6 +1362,8 @@ static const struct memory_stat memory_stats[] = {
 	{ "percpu",			MEMCG_PERCPU_B			},
 	{ "sock",			MEMCG_SOCK			},
 	{ "vmalloc",			MEMCG_VMALLOC			},
+	{ "gpu_active",			NR_GPU_ACTIVE			},
+	{ "gpu_reclaim",		NR_GPU_RECLAIM			},
 	{ "shmem",			NR_SHMEM			},
 #ifdef CONFIG_ZSWAP
 	{ "zswap",			MEMCG_ZSWAP_B			},
@@ -5133,6 +5137,106 @@ void mem_cgroup_flush_workqueue(void)
 	flush_workqueue(memcg_wq);
 }
 
+/**
+ * mem_cgroup_charge_gpu_page - charge a page to GPU memory tracking
+ * @objcg: objcg to charge, NULL charges root memcg
+ * @page: page to charge
+ * @order: page allocation order
+ * @gfp_mask: gfp mode
+ * @reclaim: charge the reclaim counter instead of the active one.
+ *
+ * Charge the order-sized @page to @objcg. Returns %true if the charge fits
+ * within @objcg's configured limit, %false otherwise.
+ */
+bool mem_cgroup_charge_gpu_page(struct obj_cgroup *objcg, struct page *page,
+				unsigned int order, gfp_t gfp_mask, bool reclaim)
+{
+	unsigned int nr_pages = 1 << order;
+	struct mem_cgroup *memcg = NULL;
+	struct lruvec *lruvec;
+	int ret;
+
+	if (objcg) {
+		memcg = get_mem_cgroup_from_objcg(objcg);
+
+		ret = try_charge_memcg(memcg, gfp_mask, nr_pages);
+		if (ret) {
+			mem_cgroup_put(memcg);
+			return false;
+		}
+
+		obj_cgroup_get(objcg);
+		page_set_objcg(page, objcg);
+	}
+
+	lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
+	mod_lruvec_state(lruvec, reclaim ? NR_GPU_RECLAIM : NR_GPU_ACTIVE, nr_pages);
+
+	mem_cgroup_put(memcg);
+	return true;
+}
+EXPORT_SYMBOL_GPL(mem_cgroup_charge_gpu_page);
+
+/**
+ * mem_cgroup_uncharge_gpu_page - uncharge a page from GPU memory tracking
+ * @page: page to uncharge
+ * @order: order of the page allocation
+ * @reclaim: uncharge the reclaim counter instead of the active one.
+ */
+void mem_cgroup_uncharge_gpu_page(struct page *page,
+				  unsigned int order, bool reclaim)
+{
+	struct obj_cgroup *objcg = page_objcg(page);
+	struct mem_cgroup *memcg;
+	struct lruvec *lruvec;
+	int nr_pages = 1 << order;
+
+	memcg = objcg ? get_mem_cgroup_from_objcg(objcg) : NULL;
+
+	lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
+	mod_lruvec_state(lruvec, reclaim ? NR_GPU_RECLAIM : NR_GPU_ACTIVE, -nr_pages);
+
+	if (memcg && !mem_cgroup_is_root(memcg))
+		refill_stock(memcg, nr_pages);
+	page->memcg_data = 0;
+	obj_cgroup_put(objcg);
+	mem_cgroup_put(memcg);
+}
+EXPORT_SYMBOL_GPL(mem_cgroup_uncharge_gpu_page);
+
+/**
+ * mem_cgroup_move_gpu_page_reclaim - move a page between gpu active and reclaim
+ * @new_objcg: objcg to move the page to, NULL for a stats-only update
+ * @page: page to move
+ * @order: page allocation order
+ * @to_reclaim: true moves the page into reclaim, false moves it back
+ */
+bool mem_cgroup_move_gpu_page_reclaim(struct obj_cgroup *new_objcg,
+				      struct page *page,
+				      unsigned int order,
+				      bool to_reclaim)
+{
+	struct obj_cgroup *objcg = page_objcg(page);
+
+	if (!objcg || !new_objcg || objcg == new_objcg) {
+		struct mem_cgroup *memcg = objcg ? get_mem_cgroup_from_objcg(objcg) : NULL;
+		struct lruvec *lruvec;
+		unsigned long flags;
+		int nr_pages = 1 << order;
+
+		lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
+		local_irq_save(flags);
+		mod_lruvec_state(lruvec, to_reclaim ? NR_GPU_RECLAIM : NR_GPU_ACTIVE, nr_pages);
+		mod_lruvec_state(lruvec, to_reclaim ? NR_GPU_ACTIVE : NR_GPU_RECLAIM, -nr_pages);
+		local_irq_restore(flags);
+		mem_cgroup_put(memcg);
+		return true;
+	} else {
+		mem_cgroup_uncharge_gpu_page(page, order, true);
+		return mem_cgroup_charge_gpu_page(new_objcg, page, order, 0, false);
+	}
+}
+EXPORT_SYMBOL_GPL(mem_cgroup_move_gpu_page_reclaim);
+
 static int __init cgroup_memory(char *s)
 {
 	char *token;
-- 
2.52.0


Thread overview: 35+ messages
2026-02-24  2:06 drm/ttm/memcg/lru: enable memcg tracking for ttm and amdgpu driver (complete series v5) Dave Airlie
2026-02-24  2:06 ` [PATCH 01/16] mm: add gpu active/reclaim per-node stat counters (v2) Dave Airlie
2026-02-24  2:06 ` [PATCH 02/16] drm/ttm: use gpu mm stats to track gpu memory allocations. (v4) Dave Airlie
2026-02-24  2:06 ` [PATCH 03/16] ttm/pool: port to list_lru. (v2) Dave Airlie
2026-02-24  2:06 ` [PATCH 04/16] ttm/pool: drop numa specific pools Dave Airlie
2026-02-24  2:06 ` [PATCH 05/16] ttm/pool: make pool shrinker NUMA aware (v2) Dave Airlie
2026-02-24  2:06 ` [PATCH 06/16] ttm/pool: track allocated_pages per numa node Dave Airlie
2026-02-24  2:06 ` Dave Airlie [this message]
2026-02-24  7:20   ` [PATCH 07/16] memcg: add support for GPU page counters. (v4) kernel test robot
2026-02-24  7:50   ` Christian König
2026-02-24 19:28     ` Dave Airlie
2026-02-25  9:09       ` Christian König
2026-03-02 14:15         ` Shakeel Butt
2026-03-02 14:37           ` Christian König
2026-03-02 15:40             ` Shakeel Butt
2026-03-02 15:51               ` Christian König
2026-03-02 17:16                 ` Shakeel Butt
2026-03-02 19:36                   ` Christian König
2026-03-05  3:23                     ` Dave Airlie
2026-03-02 19:35                 ` T.J. Mercier
2026-03-03  9:29                   ` Christian König
2026-03-03 17:25                     ` T.J. Mercier
2026-03-05  3:19                   ` Dave Airlie
2026-03-05  9:25                     ` Christian König
2026-03-10  1:27                     ` T.J. Mercier
2026-02-24  2:06 ` [PATCH 08/16] ttm: add a memcg accounting flag to the alloc/populate APIs Dave Airlie
2026-02-24  8:42   ` kernel test robot
2026-02-24  2:06 ` [PATCH 09/16] ttm/pool: initialise the shrinker earlier Dave Airlie
2026-02-24  2:06 ` [PATCH 10/16] ttm: add objcg pointer to bo and tt (v2) Dave Airlie
2026-02-24  2:06 ` [PATCH 11/16] ttm/pool: enable memcg tracking and shrinker. (v3) Dave Airlie
2026-02-24  2:06 ` [PATCH 12/16] ttm: hook up memcg placement flags Dave Airlie
2026-02-24  2:06 ` [PATCH 13/16] memcontrol: allow objcg api when memcg is config off Dave Airlie
2026-02-24  2:06 ` [PATCH 14/16] amdgpu: add support for memory cgroups Dave Airlie
2026-02-24  2:06 ` [PATCH 15/16] ttm: add support for a module option to disable memcg integration Dave Airlie
2026-02-24  2:06 ` [PATCH 16/16] xe: create a flag to enable memcg accounting for XE as well Dave Airlie
