* [PATCH v2 0/2] cgroup/dmem: allow double-charging dmem allocations to memcg
@ 2026-05-19 15:59 Eric Chanudet
  2026-05-19 15:59 ` [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions Eric Chanudet
  2026-05-19 15:59 ` [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg Eric Chanudet
  0 siblings, 2 replies; 8+ messages in thread
From: Eric Chanudet @ 2026-05-19 15:59 UTC (permalink / raw)
  To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
	Muchun Song, Andrew Morton, Maarten Lankhorst, Maxime Ripard,
	Natalie Vock, Tejun Heo, Michal Koutný, Jonathan Corbet,
	Shuah Khan
  Cc: cgroups, linux-mm, linux-kernel, dri-devel, T.J. Mercier,
	Christian König, Maxime Ripard, Albert Esteve, Dave Airlie,
	linux-doc, Eric Chanudet

Following the suggestion in [1], offer a cgroupfs entry that lets an
administrator request that a dmem-controlled region also charge to
the memory controller.

Add mem_cgroup_dmem_charge/uncharge helpers to resolve the effective
cgroup from a dmem pool's cgroup, perform the charge and update a
MEMCG_DMEM stat counter.

Add a "dmem.memcg" control file at the root level to configure memcg
charging per region. The setting is disabled by default and locked on
first charge attempt.
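
As a rough illustration of the "disabled by default, locked on first
charge attempt" behaviour, here is a minimal userspace sketch using C11
atomics. It is not the kernel code: the names ST_*, status_set() and
status_lock() are hypothetical, and it only assumes that an
administrator write may flip off/on while the first charge attempt
freezes whichever value is current.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace mirror of the four states; not the kernel enum. */
enum memcg_status { ST_OFF, ST_ON, ST_LOCKED_OFF, ST_LOCKED_ON };

/*
 * Administrator write: only flips OFF <-> ON. Returns true if the state
 * changed; a locked (or already matching) state is left untouched.
 */
static bool status_set(atomic_int *status, bool enable)
{
	int expected = enable ? ST_OFF : ST_ON;
	int desired = enable ? ST_ON : ST_OFF;

	return atomic_compare_exchange_strong(status, &expected, desired);
}

/*
 * Charge attempt: freeze the current setting and report whether this
 * allocation should also be charged to memcg.
 */
static bool status_lock(atomic_int *status)
{
	int cur = atomic_load(status);

	for (;;) {
		switch (cur) {
		case ST_OFF:
			/* On failure, cur is refreshed with the observed value. */
			if (atomic_compare_exchange_weak(status, &cur, ST_LOCKED_OFF))
				return false;
			break;
		case ST_ON:
			if (atomic_compare_exchange_weak(status, &cur, ST_LOCKED_ON))
				return true;
			break;
		case ST_LOCKED_ON:
			return true;
		default:
			/* ST_LOCKED_OFF or an unexpected value: do not charge. */
			return false;
		}
	}
}
```

Keeping the enable bit and the lock in a single atomic word means a
concurrent write and charge cannot observe a half-updated setting: one
compare-and-exchange wins, and the loser re-reads the resulting state.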

[1] https://lore.kernel.org/all/a446b598-5041-450b-aaa9-3c39a09ff6a0@amd.com/

Signed-off-by: Eric Chanudet <echanude@redhat.com>
---
Changes in v2:
- Use mem_cgroup_dmem_{,un}charge to account for memcg pages instead of
  exposing raw nr_pages functions. Use it to centralize where to find
  the effective cgroup from the pool's cgroup (Johannes)
- Set depends_on for cgrp_memory if CONFIG_MEMCG by having a memory
  controller in children cgroup (Michal)
- Move dmem.memcg to the root level as it applies by region for all
  cgroups
- Add a dmem entry to memory.stat for reporting memcg charges for dmem
  allocations.
- Wrap the memcg enable/disable/lock configuration under a single state
  to avoid toctou races and simplify transitions.
- Link to v1: https://lore.kernel.org/r/20260403-cgroup-dmem-memcg-double-charge-v1-0-c371d155de2a@redhat.com

---
Eric Chanudet (2):
      mm/memcontrol: add dmem charge/uncharge functions
      cgroup/dmem: add dmem.memcg control file for double-charging to memcg

 Documentation/admin-guide/cgroup-v2.rst |  23 +++++
 include/linux/memcontrol.h              |  16 ++++
 kernel/cgroup/dmem.c                    | 158 +++++++++++++++++++++++++++++++-
 mm/memcontrol.c                         |  65 +++++++++++++
 4 files changed, 259 insertions(+), 3 deletions(-)
---
base-commit: d989f135f71699294bb2ffd4726b526456e2db68
change-id: 20260327-cgroup-dmem-memcg-double-charge-0f100a9ffbf2

Best regards,
-- 
Eric Chanudet <echanude@redhat.com>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions
  2026-05-19 15:59 [PATCH v2 0/2] cgroup/dmem: allow double-charging dmem allocations to memcg Eric Chanudet
@ 2026-05-19 15:59 ` Eric Chanudet
  2026-05-20  7:22   ` Albert Esteve
  2026-05-22 15:53   ` Shakeel Butt
  2026-05-19 15:59 ` [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg Eric Chanudet
  1 sibling, 2 replies; 8+ messages in thread
From: Eric Chanudet @ 2026-05-19 15:59 UTC (permalink / raw)
  To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
	Muchun Song, Andrew Morton, Maarten Lankhorst, Maxime Ripard,
	Natalie Vock, Tejun Heo, Michal Koutný, Jonathan Corbet,
	Shuah Khan
  Cc: cgroups, linux-mm, linux-kernel, dri-devel, T.J. Mercier,
	Christian König, Maxime Ripard, Albert Esteve, Dave Airlie,
	linux-doc, Eric Chanudet

Add mem_cgroup_dmem_charge() and mem_cgroup_dmem_uncharge() to allow
dmem pool allocations to optionally be double-charged against the memory
controller. Take the struct cgroup from the dmem pool's css as there is
no convenient object exported to represent these allocations. These will
resolve the effective memory css from that cgroup and perform the
charge.

Introduce a MEMCG_DMEM stat counter to memory.stat to make the cgroup's
dmem charge visible.
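
For readers unfamiliar with "effective css" resolution, the following
is a simplified userspace model of the flow described above, not the
kernel implementation. struct node, effective_memcg() and the model_*
helpers are hypothetical names, and the model assumes a single flat
limit per node, whereas the real memory controller charges
hierarchically and may reclaim.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup; not a kernel structure. */
struct node {
	struct node *parent;	/* NULL for the root */
	bool memory_enabled;	/* memory controller enabled at this level */
	unsigned long usage;	/* pages currently charged (<= limit) */
	unsigned long limit;	/* maximum pages allowed */
	long dmem_stat;		/* pages charged on behalf of dmem */
};

/* Walk up to the nearest node (self included) with memory enabled. */
static struct node *effective_memcg(struct node *n)
{
	while (n->parent && !n->memory_enabled)
		n = n->parent;
	return n;
}

/* Charge nr_pages; fails on the root or when the limit would be exceeded. */
static bool model_dmem_charge(struct node *n, unsigned int nr_pages)
{
	struct node *memcg = effective_memcg(n);

	if (!memcg->parent)
		return false;
	if (nr_pages > memcg->limit - memcg->usage)
		return false;

	memcg->usage += nr_pages;
	memcg->dmem_stat += nr_pages;
	return true;
}

/* Reverse a successful model_dmem_charge() of the same size. */
static void model_dmem_uncharge(struct node *n, unsigned int nr_pages)
{
	struct node *memcg = effective_memcg(n);

	if (!memcg->parent)
		return;

	memcg->usage -= nr_pages;
	memcg->dmem_stat -= nr_pages;
}
```

In this model a child without the memory controller enabled charges its
nearest enabled ancestor, and the dmem counter moves in step with the
usage so the two cannot drift apart.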

Signed-off-by: Eric Chanudet <echanude@redhat.com>
---
 include/linux/memcontrol.h | 16 ++++++++++++
 mm/memcontrol.c            | 65 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 81 insertions(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index dc3fa687759b45748b2acee6d7f43da325eb50c1..8e1d49b87fb64e6114f3eb920293e14920290fe7 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -39,6 +39,7 @@ enum memcg_stat_item {
 	MEMCG_ZSWAP_B,
 	MEMCG_ZSWAPPED,
 	MEMCG_ZSWAP_INCOMP,
+	MEMCG_DMEM,
 	MEMCG_NR_STAT,
 };
 
@@ -1872,6 +1873,21 @@ static inline bool mem_cgroup_zswap_writeback_enabled(struct mem_cgroup *memcg)
 }
 #endif
 
+#if defined(CONFIG_MEMCG) && defined(CONFIG_CGROUP_DMEM)
+bool mem_cgroup_dmem_charge(struct cgroup *cgrp, unsigned int nr_pages,
+			    gfp_t gfp_mask);
+void mem_cgroup_dmem_uncharge(struct cgroup *cgrp, unsigned int nr_pages);
+#else
+static inline bool mem_cgroup_dmem_charge(struct cgroup *cgrp,
+					  unsigned int nr_pages, gfp_t gfp_mask)
+{
+	return true;
+}
+static inline void mem_cgroup_dmem_uncharge(struct cgroup *cgrp,
+					    unsigned int nr_pages)
+{
+}
+#endif
 
 /* Cgroup v1-related declarations */
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c03d4787d466803db49cdaa90e6d6ba426b7afe2..91a7ac16b6eac2d6c3700b6885a068bf8b640706 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -433,6 +433,7 @@ static const unsigned int memcg_stat_items[] = {
 	MEMCG_ZSWAP_B,
 	MEMCG_ZSWAPPED,
 	MEMCG_ZSWAP_INCOMP,
+	MEMCG_DMEM,
 };
 
 #define NR_MEMCG_NODE_STAT_ITEMS ARRAY_SIZE(memcg_node_stat_items)
@@ -1606,6 +1607,9 @@ static const struct memory_stat memory_stats[] = {
 #ifdef CONFIG_NUMA_BALANCING
 	{ "pgpromote_success",		PGPROMOTE_SUCCESS	},
 #endif
+#ifdef CONFIG_CGROUP_DMEM
+	{ "dmem",			MEMCG_DMEM		},
+#endif
 };
 
 /* The actual unit of the state item, not the same as the output unit */
@@ -5909,6 +5913,67 @@ static struct cftype zswap_files[] = {
 };
 #endif /* CONFIG_ZSWAP */
 
+#ifdef CONFIG_CGROUP_DMEM
+/**
+ * mem_cgroup_dmem_charge - charge memcg for a dmem pool allocation
+ * @cgrp: cgroup of the dmem pool
+ * @nr_pages: number of pages to charge
+ * @gfp_mask: reclaim mode
+ *
+ * Charges @nr_pages to @memcg. Returns %true if the charge fit within
+ * @memcg's configured limit, %false if it doesn't.
+ */
+bool mem_cgroup_dmem_charge(struct cgroup *cgrp, unsigned int nr_pages,
+			    gfp_t gfp_mask)
+{
+	struct cgroup_subsys_state *mem_css;
+	struct mem_cgroup *memcg;
+
+	/* CGROUP_DMEM and MEMCG guarantees this cannot be NULL. */
+	mem_css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
+
+	/* Use the memcg, if any, of the dmem cgroup. */
+	memcg = mem_cgroup_from_css(mem_css);
+	if (!memcg || mem_cgroup_is_root(memcg)) {
+		css_put(mem_css);
+		return false;
+	}
+
+	if (try_charge_memcg(memcg, gfp_mask, nr_pages)) {
+		css_put(mem_css);
+		return false;
+	}
+
+	mod_memcg_state(memcg, MEMCG_DMEM, nr_pages);
+	css_put(mem_css);
+	return true;
+}
+
+/**
+ * mem_cgroup_dmem_uncharge - uncharge memcg from a dmem pool allocation
+ * @cgrp: cgroup of the dmem pool
+ * @nr_pages: number of pages to uncharge
+ */
+void mem_cgroup_dmem_uncharge(struct cgroup *cgrp, unsigned int nr_pages)
+{
+	struct cgroup_subsys_state *mem_css;
+	struct mem_cgroup *memcg;
+
+	/* CGROUP_DMEM and MEMCG guarantees this cannot be NULL. */
+	mem_css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
+
+	memcg = mem_cgroup_from_css(mem_css);
+	if (!memcg || mem_cgroup_is_root(memcg)) {
+		css_put(mem_css);
+		return;
+	}
+
+	mod_memcg_state(memcg, MEMCG_DMEM, -nr_pages);
+	refill_stock(memcg, nr_pages);
+	css_put(mem_css);
+}
+#endif /* CONFIG_CGROUP_DMEM */
+
 static int __init mem_cgroup_swap_init(void)
 {
 	if (mem_cgroup_disabled())

-- 
2.52.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg
  2026-05-19 15:59 [PATCH v2 0/2] cgroup/dmem: allow double-charging dmem allocations to memcg Eric Chanudet
  2026-05-19 15:59 ` [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions Eric Chanudet
@ 2026-05-19 15:59 ` Eric Chanudet
  2026-05-22 15:26   ` Michal Koutný
  1 sibling, 1 reply; 8+ messages in thread
From: Eric Chanudet @ 2026-05-19 15:59 UTC (permalink / raw)
  To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
	Muchun Song, Andrew Morton, Maarten Lankhorst, Maxime Ripard,
	Natalie Vock, Tejun Heo, Michal Koutný, Jonathan Corbet,
	Shuah Khan
  Cc: cgroups, linux-mm, linux-kernel, dri-devel, T.J. Mercier,
	Christian König, Maxime Ripard, Albert Esteve, Dave Airlie,
	linux-doc, Eric Chanudet

Add a root-only cgroupfs file "dmem.memcg" that lets an administrator
configure whether allocations in a dmem region should also be charged to
the memory controller.

To handle inheritance, dmem sets depends_on to the memory controller
when MEMCG is configured in.

Double-charging is disabled by default. Once a charge is attempted, a
small 4-state machine (off, on, locked off, locked on) locks the
setting to prevent inconsistent accounting.

The memcg to charge is derived from the pool's cgroup, since the pool
holds a reference to the dmem cgroup state that keeps the cgroup alive
until it gets uncharged.

Signed-off-by: Eric Chanudet <echanude@redhat.com>
---
 Documentation/admin-guide/cgroup-v2.rst |  23 +++++
 kernel/cgroup/dmem.c                    | 158 +++++++++++++++++++++++++++++++-
 2 files changed, 178 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 6efd0095ed995b1550317662bc1b56c7a7f3db23..1d2fa55ddf0faa17baa916a8914d3033e8e42359 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2828,6 +2828,29 @@ DMEM Interface Files
 	  drm/0000:03:00.0/vram0 12550144
 	  drm/0000:03:00.0/stolen 8650752
 
+  dmem.memcg
+	A readwrite nested-keyed file that exists only on the root
+	cgroup. It configures whether allocations in a dmem region
+	should also be charged to the memory controller.
+
+	Upon the first charge to a region, its setting can no longer be changed
+	and is reported as "[true|false] (locked)".
+
+	Charges to the memory controller are visible in ``memory.stat`` as the
+	``dmem`` entry, reported in bytes.
+
+	An example read output follows::
+
+	  drm/0000:03:00.0/vram0 false
+	  drm/0000:03:00.0/stolen false (locked)
+
+	Writing uses the same nested-keyed format::
+
+	  echo "drm/0000:03:00.0/vram0 true" > dmem.memcg
+
+	This file is only available when the kernel is built with
+	``CONFIG_MEMCG``.
+
 HugeTLB
 -------
 
diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index 1ab1fb47f2711ecc60dd13e611a8a4920b48f3e9..e07b20b8025c528f190f84c76b088cb8a32a7f5e 100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -17,6 +17,14 @@
 #include <linux/refcount.h>
 #include <linux/rculist.h>
 #include <linux/slab.h>
+#include <linux/memcontrol.h>
+
+enum dmem_memcg_status {
+	DMEM_MEMCG_OFF,
+	DMEM_MEMCG_ON,
+	DMEM_MEMCG_LOCKED_OFF,
+	DMEM_MEMCG_LOCKED_ON,
+};
 
 struct dmem_cgroup_region {
 	/**
@@ -51,6 +59,14 @@ struct dmem_cgroup_region {
 	 * No new pools should be added to the region afterwards.
 	 */
 	bool unregistered;
+
+	/**
+	 * @memcg_status: Whether allocation in this region should charge memcg.
+	 * DMEM_MEMCG_OFF/DMEM_MEMCG_ON or
+	 * DMEM_MEMCG_LOCKED_OFF/DMEM_MEMCG_LOCKED_ON, frozen after first allocation.
+	 * Transitions to a locked state are one-way.
+	 */
+	atomic_t memcg_status;
 };
 
 struct dmemcg_state {
@@ -609,6 +625,34 @@ get_cg_pool_unlocked(struct dmemcg_state *cg, struct dmem_cgroup_region *region)
 	return pool;
 }
 
+static bool apply_memcg_charge(atomic_t *status)
+{
+	int state = atomic_read(status);
+
+	for (;;) {
+		switch (state) {
+		case DMEM_MEMCG_OFF:
+			state = atomic_cmpxchg(status, DMEM_MEMCG_OFF,
+					       DMEM_MEMCG_LOCKED_OFF);
+			if (state != DMEM_MEMCG_OFF)
+				continue;
+			return false;
+		case DMEM_MEMCG_LOCKED_OFF:
+			return false;
+		case DMEM_MEMCG_ON:
+			state = atomic_cmpxchg(status, DMEM_MEMCG_ON,
+					       DMEM_MEMCG_LOCKED_ON);
+			if (state != DMEM_MEMCG_ON)
+				continue;
+			return true;
+		case DMEM_MEMCG_LOCKED_ON:
+			return true;
+		}
+		WARN_ONCE(1, "Invalid memcg_status (%#x).\n", state);
+		return false;
+	}
+}
+
 /**
  * dmem_cgroup_uncharge() - Uncharge a pool.
  * @pool: Pool to uncharge.
@@ -624,6 +668,12 @@ void dmem_cgroup_uncharge(struct dmem_cgroup_pool_state *pool, u64 size)
 		return;
 
 	page_counter_uncharge(&pool->cnt, size);
+
+	if (atomic_read(&pool->region->memcg_status) == DMEM_MEMCG_LOCKED_ON &&
+	    !WARN_ON_ONCE(size > (u64)UINT_MAX << PAGE_SHIFT))
+		mem_cgroup_dmem_uncharge(pool->cs->css.cgroup,
+					 PAGE_ALIGN(size) >> PAGE_SHIFT);
+
 	css_put(&pool->cs->css);
 	dmemcg_pool_put(pool);
 }
@@ -655,6 +705,8 @@ int dmem_cgroup_try_charge(struct dmem_cgroup_region *region, u64 size,
 	struct dmemcg_state *cg;
 	struct dmem_cgroup_pool_state *pool;
 	struct page_counter *fail;
+	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	bool charge_memcg;
 	int ret;
 
 	*ret_pool = NULL;
@@ -670,7 +722,28 @@ int dmem_cgroup_try_charge(struct dmem_cgroup_region *region, u64 size,
 	pool = get_cg_pool_unlocked(cg, region);
 	if (IS_ERR(pool)) {
 		ret = PTR_ERR(pool);
-		goto err;
+		goto err_css_put;
+	}
+
+	charge_memcg = apply_memcg_charge(&region->memcg_status);
+	if (charge_memcg) {
+		/* mem_cgroup_dmem_charge limitation from try_charge_memcg */
+		if (size > (u64)UINT_MAX << PAGE_SHIFT) {
+			ret = -EINVAL;
+			dmemcg_pool_put(pool);
+			goto err_css_put;
+		}
+
+		if (!mem_cgroup_dmem_charge(pool->cs->css.cgroup, nr_pages,
+					    GFP_KERNEL)) {
+			/*
+			 * No dmem_cgroup_state_evict_valuable() could help,
+			 * there's no ret_limit_pool to return.
+			 */
+			ret = -ENOMEM;
+			dmemcg_pool_put(pool);
+			goto err_css_put;
+		}
 	}
 
 	if (!page_counter_try_charge(&pool->cnt, size, &fail)) {
@@ -681,14 +754,17 @@ int dmem_cgroup_try_charge(struct dmem_cgroup_region *region, u64 size,
 		}
 		dmemcg_pool_put(pool);
 		ret = -EAGAIN;
-		goto err;
+		goto err_uncharge_memcg;
 	}
 
 	/* On success, reference from get_current_dmemcs is transferred to *ret_pool */
 	*ret_pool = pool;
 	return 0;
 
-err:
+err_uncharge_memcg:
+	if (charge_memcg)
+		mem_cgroup_dmem_uncharge(pool->cs->css.cgroup, nr_pages);
+err_css_put:
 	css_put(&cg->css);
 	return ret;
 }
@@ -845,6 +921,71 @@ static ssize_t dmem_cgroup_region_max_write(struct kernfs_open_file *of,
 	return dmemcg_limit_write(of, buf, nbytes, off, set_resource_max);
 }
 
+#ifdef CONFIG_MEMCG
+static int dmem_cgroup_memcg_show(struct seq_file *sf, void *v)
+{
+	struct dmem_cgroup_region *region;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(region, &dmem_cgroup_regions, region_node) {
+		int state = atomic_read(&region->memcg_status);
+
+		seq_printf(sf, "%s %s\n", region->name,
+			   state == DMEM_MEMCG_ON ? "true" :
+			   state == DMEM_MEMCG_OFF ? "false" :
+			   state == DMEM_MEMCG_LOCKED_ON ? "true (locked)" :
+			   state == DMEM_MEMCG_LOCKED_OFF ? "false (locked)" :
+			   "(invalid)");
+	}
+	rcu_read_unlock();
+	return 0;
+}
+
+static ssize_t dmem_cgroup_memcg_write(struct kernfs_open_file *of, char *buf,
+				       size_t nbytes, loff_t off)
+{
+	while (buf) {
+		struct dmem_cgroup_region *region;
+		char *options, *name;
+		bool flag;
+
+		options = buf;
+		buf = strchr(buf, '\n');
+		if (buf)
+			*buf++ = '\0';
+
+		options = strstrip(options);
+		if (!options[0])
+			continue;
+
+		name = strsep(&options, " \t");
+		if (!name[0])
+			continue;
+
+		if (!options || !options[0])
+			return -EINVAL;
+
+		if (kstrtobool(options, &flag))
+			return -EINVAL;
+
+		rcu_read_lock();
+		region = dmemcg_get_region_by_name(name);
+		rcu_read_unlock();
+		if (!region)
+			return -ENODEV;
+
+		atomic_cmpxchg(&region->memcg_status,
+			       flag ? DMEM_MEMCG_OFF : DMEM_MEMCG_ON,
+			       flag ? DMEM_MEMCG_ON : DMEM_MEMCG_OFF);
+		/* Continue if a region is already locked. */
+
+		kref_put(&region->ref, dmemcg_free_region);
+	}
+
+	return nbytes;
+}
+#endif
+
 static struct cftype files[] = {
 	{
 		.name = "capacity",
@@ -873,6 +1014,14 @@ static struct cftype files[] = {
 		.seq_show = dmem_cgroup_region_max_show,
 		.flags = CFTYPE_NOT_ON_ROOT,
 	},
+#ifdef CONFIG_MEMCG
+	{
+		.name = "memcg",
+		.write = dmem_cgroup_memcg_write,
+		.seq_show = dmem_cgroup_memcg_show,
+		.flags = CFTYPE_ONLY_ON_ROOT,
+	},
+#endif
 	{ } /* Zero entry terminates. */
 };
 
@@ -882,4 +1031,7 @@ struct cgroup_subsys dmem_cgrp_subsys = {
 	.css_offline	= dmemcs_offline,
 	.legacy_cftypes	= files,
 	.dfl_cftypes	= files,
+#ifdef CONFIG_MEMCG
+	.depends_on	= 1 << memory_cgrp_id,
+#endif
 };

-- 
2.52.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions
  2026-05-19 15:59 ` [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions Eric Chanudet
@ 2026-05-20  7:22   ` Albert Esteve
  2026-05-22 15:53   ` Shakeel Butt
  1 sibling, 0 replies; 8+ messages in thread
From: Albert Esteve @ 2026-05-20  7:22 UTC (permalink / raw)
  To: Eric Chanudet
  Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
	Muchun Song, Andrew Morton, Maarten Lankhorst, Maxime Ripard,
	Natalie Vock, Tejun Heo, Michal Koutný, Jonathan Corbet,
	Shuah Khan, cgroups, linux-mm, linux-kernel, dri-devel,
	T.J. Mercier, Christian König, Maxime Ripard, Dave Airlie,
	linux-doc

On Tue, May 19, 2026 at 6:01 PM Eric Chanudet <echanude@redhat.com> wrote:
>
> Add mem_cgroup_dmem_charge() and mem_cgroup_dmem_uncharge() to allow
> dmem pool allocations to optionally be double-charged against the memory
> controller. Take the struct cgroup from the dmem pool's css as there is
> no convenient object exported to represent these allocations. These will
> resolve the effective memory css from that cgroup and perform the
> charge.
>
> Introduce a MEMCG_DMEM stat counter to memory.stat to make the cgroup's
> dmem charge visible.
>
> Signed-off-by: Eric Chanudet <echanude@redhat.com>

Reviewed-by: Albert Esteve <aesteve@redhat.com>

> ---
>  include/linux/memcontrol.h | 16 ++++++++++++
>  mm/memcontrol.c            | 65 ++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 81 insertions(+)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index dc3fa687759b45748b2acee6d7f43da325eb50c1..8e1d49b87fb64e6114f3eb920293e14920290fe7 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -39,6 +39,7 @@ enum memcg_stat_item {
>         MEMCG_ZSWAP_B,
>         MEMCG_ZSWAPPED,
>         MEMCG_ZSWAP_INCOMP,
> +       MEMCG_DMEM,
>         MEMCG_NR_STAT,
>  };
>
> @@ -1872,6 +1873,21 @@ static inline bool mem_cgroup_zswap_writeback_enabled(struct mem_cgroup *memcg)
>  }
>  #endif
>
> +#if defined(CONFIG_MEMCG) && defined(CONFIG_CGROUP_DMEM)
> +bool mem_cgroup_dmem_charge(struct cgroup *cgrp, unsigned int nr_pages,
> +                           gfp_t gfp_mask);
> +void mem_cgroup_dmem_uncharge(struct cgroup *cgrp, unsigned int nr_pages);
> +#else
> +static inline bool mem_cgroup_dmem_charge(struct cgroup *cgrp,
> +                                         unsigned int nr_pages, gfp_t gfp_mask)
> +{
> +       return true;
> +}
> +static inline void mem_cgroup_dmem_uncharge(struct cgroup *cgrp,
> +                                           unsigned int nr_pages)
> +{
> +}
> +#endif
>
>  /* Cgroup v1-related declarations */
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c03d4787d466803db49cdaa90e6d6ba426b7afe2..91a7ac16b6eac2d6c3700b6885a068bf8b640706 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -433,6 +433,7 @@ static const unsigned int memcg_stat_items[] = {
>         MEMCG_ZSWAP_B,
>         MEMCG_ZSWAPPED,
>         MEMCG_ZSWAP_INCOMP,
> +       MEMCG_DMEM,
>  };
>
>  #define NR_MEMCG_NODE_STAT_ITEMS ARRAY_SIZE(memcg_node_stat_items)
> @@ -1606,6 +1607,9 @@ static const struct memory_stat memory_stats[] = {
>  #ifdef CONFIG_NUMA_BALANCING
>         { "pgpromote_success",          PGPROMOTE_SUCCESS       },
>  #endif
> +#ifdef CONFIG_CGROUP_DMEM
> +       { "dmem",                       MEMCG_DMEM              },
> +#endif
>  };
>
>  /* The actual unit of the state item, not the same as the output unit */
> @@ -5909,6 +5913,67 @@ static struct cftype zswap_files[] = {
>  };
>  #endif /* CONFIG_ZSWAP */
>
> +#ifdef CONFIG_CGROUP_DMEM
> +/**
> + * mem_cgroup_dmem_charge - charge memcg for a dmem pool allocation
> + * @cgrp: cgroup of the dmem pool
> + * @nr_pages: number of pages to charge
> + * @gfp_mask: reclaim mode
> + *
> + * Charges @nr_pages to @memcg. Returns %true if the charge fit within
> + * @memcg's configured limit, %false if it doesn't.
> + */
> +bool mem_cgroup_dmem_charge(struct cgroup *cgrp, unsigned int nr_pages,
> +                           gfp_t gfp_mask)
> +{
> +       struct cgroup_subsys_state *mem_css;
> +       struct mem_cgroup *memcg;
> +
> +       /* CGROUP_DMEM and MEMCG guarantees this cannot be NULL. */
> +       mem_css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
> +
> +       /* Use the memcg, if any, of the dmem cgroup. */
> +       memcg = mem_cgroup_from_css(mem_css);
> +       if (!memcg || mem_cgroup_is_root(memcg)) {
> +               css_put(mem_css);
> +               return false;
> +       }
> +
> +       if (try_charge_memcg(memcg, gfp_mask, nr_pages)) {
> +               css_put(mem_css);
> +               return false;
> +       }
> +
> +       mod_memcg_state(memcg, MEMCG_DMEM, nr_pages);
> +       css_put(mem_css);
> +       return true;
> +}
> +
> +/**
> + * mem_cgroup_dmem_uncharge - uncharge memcg from a dmem pool allocation
> + * @cgrp: cgroup of the dmem pool
> + * @nr_pages: number of pages to uncharge
> + */
> +void mem_cgroup_dmem_uncharge(struct cgroup *cgrp, unsigned int nr_pages)
> +{
> +       struct cgroup_subsys_state *mem_css;
> +       struct mem_cgroup *memcg;
> +
> +       /* CGROUP_DMEM and MEMCG guarantees this cannot be NULL. */
> +       mem_css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
> +
> +       memcg = mem_cgroup_from_css(mem_css);
> +       if (!memcg || mem_cgroup_is_root(memcg)) {
> +               css_put(mem_css);
> +               return;
> +       }
> +
> +       mod_memcg_state(memcg, MEMCG_DMEM, -nr_pages);
> +       refill_stock(memcg, nr_pages);
> +       css_put(mem_css);
> +}
> +#endif /* CONFIG_CGROUP_DMEM */
> +
>  static int __init mem_cgroup_swap_init(void)
>  {
>         if (mem_cgroup_disabled())
>
> --
> 2.52.0
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg
  2026-05-19 15:59 ` [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg Eric Chanudet
@ 2026-05-22 15:26   ` Michal Koutný
  2026-05-22 16:17     ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread
From: Michal Koutný @ 2026-05-22 15:26 UTC (permalink / raw)
  To: Eric Chanudet
  Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
	Muchun Song, Andrew Morton, Maarten Lankhorst, Maxime Ripard,
	Natalie Vock, Tejun Heo, Jonathan Corbet, Shuah Khan, cgroups,
	linux-mm, linux-kernel, dri-devel, T.J. Mercier,
	Christian König, Maxime Ripard, Albert Esteve, Dave Airlie,
	linux-doc

Hello Eric.

On Tue, May 19, 2026 at 11:59:02AM -0400, Eric Chanudet <echanude@redhat.com> wrote:
> Add a root-only cgroupfs file "dmem.memcg" that lets an administrator
> configure whether allocations in a dmem region should also be charged to
> the memory controller.

This kinda makes sense as it is not unlike io.cost.* device
configurators.

Just for my better understanding -- will there be a space for userspace
to switch this? (No charged dmem allocations happen before responsible
userspace runs, so that the attribute remains unlocked.)

(I'm rather indifferent about the actual double charging/non-charging
matter.)


> 
> To handle inheritance, dmem adds a depends_on the memory controller,
> unless MEMCG isn't configured in.
> 
> Double-charging is disabled by default. Once a charge is attempted, the
> setting is locked to prevent inconsistent accounting by a small 4-state
> machine (off, on, locked off, locked on).
> 
> The memcg to charge is derived from the pool's cgroup, since the pool
> holds a reference to the dmem cgroup state that keeps the cgroup alive
> until it gets uncharged.
> 
> Signed-off-by: Eric Chanudet <echanude@redhat.com>
> ---
>  Documentation/admin-guide/cgroup-v2.rst |  23 +++++
>  kernel/cgroup/dmem.c                    | 158 +++++++++++++++++++++++++++++++-
>  2 files changed, 178 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 6efd0095ed995b1550317662bc1b56c7a7f3db23..1d2fa55ddf0faa17baa916a8914d3033e8e42359 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -2828,6 +2828,29 @@ DMEM Interface Files
>  	  drm/0000:03:00.0/vram0 12550144
>  	  drm/0000:03:00.0/stolen 8650752
>  
> +  dmem.memcg
> +	A readwrite nested-keyed file that exists only on the root
> +	cgroup.

Strictly speaking this is not nested-keyed but flat-keyed [1],
which leads me to the realization that this is the first instance of a boolean.
All in all, such a composition comes to my mind (the latter is RO):

	drm/0000:03:00.0/vram0 enable=0|1 locked=0|1
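
To make the proposed layout concrete, here is a small userspace sketch
that formats and parses one such line. It is only an illustration of
the format suggested above, not code from the series: struct
region_setting, format_region_line() and parse_region_line() are
hypothetical, and the 127-character name bound is an arbitrary
assumption.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical holder for one parsed line. */
struct region_setting {
	char name[128];
	bool enable;
	bool locked;
};

/* Emit "<region> enable=<0|1> locked=<0|1>\n"; returns snprintf()'s result. */
static int format_region_line(char *buf, size_t len, const char *region,
			      bool enable, bool locked)
{
	return snprintf(buf, len, "%s enable=%d locked=%d\n",
			region, enable, locked);
}

/* Parse one line; both keys must be present and hold 0 or 1. */
static bool parse_region_line(const char *line, struct region_setting *out)
{
	int enable, locked;

	if (sscanf(line, "%127s enable=%d locked=%d",
		   out->name, &enable, &locked) != 3)
		return false;
	if ((enable != 0 && enable != 1) || (locked != 0 && locked != 1))
		return false;

	out->enable = enable;
	out->locked = locked;
	return true;
}
```

With this layout the read-only lock state is its own key rather than a
"(locked)" suffix, so a reader can match on key names instead of
splitting free-form text.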




> +static ssize_t dmem_cgroup_memcg_write(struct kernfs_open_file *of, char *buf,
> +				       size_t nbytes, loff_t off)
> +{
> +	while (buf) {
> +		struct dmem_cgroup_region *region;
> +		char *options, *name;
> +		bool flag;
> +
> +		options = buf;
> +		buf = strchr(buf, '\n');
> +		if (buf)
> +			*buf++ = '\0';

I recall there was a discussion about accepting only a single device per
write(2) (at the same time I see this idiom is still present in other
dmem.* files, so this is nothing to change in _this_ patch).

Thanks,
Michal

[1] https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#format

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions
  2026-05-19 15:59 ` [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions Eric Chanudet
  2026-05-20  7:22   ` Albert Esteve
@ 2026-05-22 15:53   ` Shakeel Butt
  2026-05-22 15:55     ` Shakeel Butt
  1 sibling, 1 reply; 8+ messages in thread
From: Shakeel Butt @ 2026-05-22 15:53 UTC (permalink / raw)
  To: Eric Chanudet
  Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
	Andrew Morton, Maarten Lankhorst, Maxime Ripard, Natalie Vock,
	Tejun Heo, Michal Koutný, Jonathan Corbet, Shuah Khan,
	cgroups, linux-mm, linux-kernel, dri-devel, T.J. Mercier,
	Christian König, Maxime Ripard, Albert Esteve, Dave Airlie,
	linux-doc

On Tue, May 19, 2026 at 11:59:01AM -0400, Eric Chanudet wrote:
> Add mem_cgroup_dmem_charge() and mem_cgroup_dmem_uncharge() to allow
> dmem pool allocations to optionally be double-charged against the memory
> controller. Take the struct cgroup from the dmem pool's css as there is
> no convenient object exported to represent these allocations. These will
> resolve the effective memory css from that cgroup and perform the
> charge.
> 
> Introduce a MEMCG_DMEM stat counter to memory.stat to make the cgroup's
> dmem charge visible.
> 
> Signed-off-by: Eric Chanudet <echanude@redhat.com>
> ---
>  include/linux/memcontrol.h | 16 ++++++++++++
>  mm/memcontrol.c            | 65 ++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 81 insertions(+)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index dc3fa687759b45748b2acee6d7f43da325eb50c1..8e1d49b87fb64e6114f3eb920293e14920290fe7 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -39,6 +39,7 @@ enum memcg_stat_item {
>  	MEMCG_ZSWAP_B,
>  	MEMCG_ZSWAPPED,
>  	MEMCG_ZSWAP_INCOMP,
> +	MEMCG_DMEM,
>  	MEMCG_NR_STAT,
>  };
>  
> @@ -1872,6 +1873,21 @@ static inline bool mem_cgroup_zswap_writeback_enabled(struct mem_cgroup *memcg)
>  }
>  #endif
>  
> +#if defined(CONFIG_MEMCG) && defined(CONFIG_CGROUP_DMEM)
> +bool mem_cgroup_dmem_charge(struct cgroup *cgrp, unsigned int nr_pages,
> +			    gfp_t gfp_mask);
> +void mem_cgroup_dmem_uncharge(struct cgroup *cgrp, unsigned int nr_pages);
> +#else
> +static inline bool mem_cgroup_dmem_charge(struct cgroup *cgrp,
> +					  unsigned int nr_pages, gfp_t gfp_mask)

Please follow Johannes's request to pass the actual memory object instead of
naked numbers.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions
  2026-05-22 15:53   ` Shakeel Butt
@ 2026-05-22 15:55     ` Shakeel Butt
  0 siblings, 0 replies; 8+ messages in thread
From: Shakeel Butt @ 2026-05-22 15:55 UTC (permalink / raw)
  To: Eric Chanudet
  Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
	Andrew Morton, Maarten Lankhorst, Maxime Ripard, Natalie Vock,
	Tejun Heo, Michal Koutný, Jonathan Corbet, Shuah Khan,
	cgroups, linux-mm, linux-kernel, dri-devel, T.J. Mercier,
	Christian König, Maxime Ripard, Albert Esteve, Dave Airlie,
	linux-doc

On Fri, May 22, 2026 at 08:53:10AM -0700, Shakeel Butt wrote:
> On Tue, May 19, 2026 at 11:59:01AM -0400, Eric Chanudet wrote:
> > Add mem_cgroup_dmem_charge() and mem_cgroup_dmem_uncharge() to allow
> > dmem pool allocations to optionally be double-charged against the memory
> > controller. Take the struct cgroup from the dmem pool's css as there is
> > no convenient object exported to represent these allocations. These will
> > resolve the effective memory css from that cgroup and perform the
> > charge.
> > 
> > Introduce a MEMCG_DMEM stat counter to memory.stat to make the cgroup's
> > dmem charge visible.
> > 
> > Signed-off-by: Eric Chanudet <echanude@redhat.com>
> > ---
> >  include/linux/memcontrol.h | 16 ++++++++++++
> >  mm/memcontrol.c            | 65 ++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 81 insertions(+)
> > 
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index dc3fa687759b45748b2acee6d7f43da325eb50c1..8e1d49b87fb64e6114f3eb920293e14920290fe7 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -39,6 +39,7 @@ enum memcg_stat_item {
> >  	MEMCG_ZSWAP_B,
> >  	MEMCG_ZSWAPPED,
> >  	MEMCG_ZSWAP_INCOMP,
> > +	MEMCG_DMEM,
> >  	MEMCG_NR_STAT,
> >  };
> >  
> > @@ -1872,6 +1873,21 @@ static inline bool mem_cgroup_zswap_writeback_enabled(struct mem_cgroup *memcg)
> >  }
> >  #endif
> >  
> > +#if defined(CONFIG_MEMCG) && defined(CONFIG_CGROUP_DMEM)
> > +bool mem_cgroup_dmem_charge(struct cgroup *cgrp, unsigned int nr_pages,
> > +			    gfp_t gfp_mask);
> > +void mem_cgroup_dmem_uncharge(struct cgroup *cgrp, unsigned int nr_pages);
> > +#else
> > +static inline bool mem_cgroup_dmem_charge(struct cgroup *cgrp,
> > +					  unsigned int nr_pages, gfp_t gfp_mask)
> 
> Please follow Johannes's request to pass the actual memory object instead of
> naked numbers.
> 

Also what exactly is the backing memory here? Is it system memory? If yes, then
you need to pass struct page. For non-system memory, I am not sure memcg is the
right place to charge such memory.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg
  2026-05-22 15:26   ` Michal Koutný
@ 2026-05-22 16:17     ` Tejun Heo
  0 siblings, 0 replies; 8+ messages in thread
From: Tejun Heo @ 2026-05-22 16:17 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Eric Chanudet, Johannes Weiner, Michal Hocko, Roman Gushchin,
	Shakeel Butt, Muchun Song, Andrew Morton, Maarten Lankhorst,
	Maxime Ripard, Natalie Vock, Jonathan Corbet, Shuah Khan, cgroups,
	linux-mm, linux-kernel, dri-devel, T.J. Mercier,
	Christian König, Maxime Ripard, Albert Esteve, Dave Airlie,
	linux-doc

Hello,

On Fri, May 22, 2026 at 05:26:16PM +0200, Michal Koutný wrote:
> Hello Eric.
> 
> On Tue, May 19, 2026 at 11:59:02AM -0400, Eric Chanudet <echanude@redhat.com> wrote:
> > Add a root-only cgroupfs file "dmem.memcg" that lets an administrator
> > configure whether allocations in a dmem region should also be charged to
> > the memory controller.
> 
> This kinda makes sense as it is not unlike io.cost.* device
> configurators.
> 
> Just for my better understanding -- will there be a space for userspace
> to switch this? (No charged dmem allocations happen before responsible
> userspace runs, so that the attribute remains unlocked.)
> 
> (I'm rather indifferent about the actual double charging/non-charging
> matter.)

I wonder whether this would make more sense as a mount flag? What's the use
case for e.g. having different config for different devices? Wouldn't that
be really confusing?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 8+ messages in thread
