* [PATCH 03/11] mm/zsmalloc: Introduce conditional memcg awareness to zs_pool
[not found] <20260311195153.4013476-1-joshua.hahnjy@gmail.com>
@ 2026-03-11 19:51 ` Joshua Hahn
2026-03-11 20:12 ` Nhat Pham
2026-03-11 20:16 ` Johannes Weiner
2026-03-11 19:51 ` [PATCH 04/11] mm/zsmalloc: Introduce objcgs pointer in struct zspage Joshua Hahn
2026-03-11 19:51 ` [PATCH 05/11] mm/zsmalloc: Store obj_cgroup pointer in zspage Joshua Hahn
2 siblings, 2 replies; 11+ messages in thread
From: Joshua Hahn @ 2026-03-11 19:51 UTC (permalink / raw)
To: Minchan Kim, Sergey Senozhatsky
Cc: Johannes Weiner, Yosry Ahmed, Nhat Pham, Nhat Pham,
Chengming Zhou, Andrew Morton, linux-mm, linux-block,
linux-kernel, kernel-team
Introduce 3 new fields to struct zs_pool to allow individual zpools to
be "memcg-aware": memcg_aware, compressed_stat, and uncompressed_stat.
memcg_aware is used in later patches to determine whether memory
should be allocated to keep track of per-compressed object objcgs.
compressed_stat and uncompressed_stat are enum indices that point into
memcg (node) stats that zsmalloc will account towards.
In reality, these fields help distinguish between the two users of
zsmalloc, zswap and zram. The enum indices compressed_stat and
uncompressed_stat are parametrized to minimize zswap-specific hardcoding
in zsmalloc.
Suggested-by: Yosry Ahmed <yosry@kernel.org>
Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
---
drivers/block/zram/zram_drv.c | 3 ++-
include/linux/zsmalloc.h | 5 ++++-
mm/zsmalloc.c | 13 ++++++++++++-
mm/zswap.c | 3 ++-
4 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index bca33403fc8b..d1eae5c20df7 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1980,7 +1980,8 @@ static bool zram_meta_alloc(struct zram *zram, u64 disksize)
if (!zram->table)
return false;
- zram->mem_pool = zs_create_pool(zram->disk->disk_name);
+ /* zram does not support memcg accounting */
+ zram->mem_pool = zs_create_pool(zram->disk->disk_name, false, 0, 0);
if (!zram->mem_pool) {
vfree(zram->table);
zram->table = NULL;
diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index 478410c880b1..24fb2e0fdf67 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -23,8 +23,11 @@ struct zs_pool_stats {
struct zs_pool;
struct scatterlist;
+enum memcg_stat_item;
-struct zs_pool *zs_create_pool(const char *name);
+struct zs_pool *zs_create_pool(const char *name, bool memcg_aware,
+ enum memcg_stat_item compressed_stat,
+ enum memcg_stat_item uncompressed_stat);
void zs_destroy_pool(struct zs_pool *pool);
unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t flags,
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 7758486e1d06..3f0f42b78314 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -214,6 +214,9 @@ struct zs_pool {
#ifdef CONFIG_COMPACTION
struct work_struct free_work;
#endif
+ bool memcg_aware;
+ enum memcg_stat_item compressed_stat;
+ enum memcg_stat_item uncompressed_stat;
/* protect zspage migration/compaction */
rwlock_t lock;
atomic_t compaction_in_progress;
@@ -2050,6 +2053,9 @@ static int calculate_zspage_chain_size(int class_size)
/**
* zs_create_pool - Creates an allocation pool to work from.
* @name: pool name to be created
+ * @memcg_aware: whether the consumer of this pool will account memcg stats
+ * @compressed_stat: compressed memcontrol stat item to account
+ * @uncompressed_stat: uncompressed memcontrol stat item to account
*
* This function must be called before anything when using
* the zsmalloc allocator.
@@ -2057,7 +2063,9 @@ static int calculate_zspage_chain_size(int class_size)
* On success, a pointer to the newly created pool is returned,
* otherwise NULL.
*/
-struct zs_pool *zs_create_pool(const char *name)
+struct zs_pool *zs_create_pool(const char *name, bool memcg_aware,
+ enum memcg_stat_item compressed_stat,
+ enum memcg_stat_item uncompressed_stat)
{
int i;
struct zs_pool *pool;
@@ -2071,6 +2079,9 @@ struct zs_pool *zs_create_pool(const char *name)
rwlock_init(&pool->lock);
atomic_set(&pool->compaction_in_progress, 0);
+ pool->memcg_aware = memcg_aware;
+ pool->compressed_stat = compressed_stat;
+ pool->uncompressed_stat = uncompressed_stat;
pool->name = kstrdup(name, GFP_KERNEL);
if (!pool->name)
goto err;
diff --git a/mm/zswap.c b/mm/zswap.c
index e6ec3295bdb0..ff9abaa8aa38 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -257,7 +257,8 @@ static struct zswap_pool *zswap_pool_create(char *compressor)
/* unique name for each pool specifically required by zsmalloc */
snprintf(name, 38, "zswap%x", atomic_inc_return(&zswap_pools_count));
- pool->zs_pool = zs_create_pool(name);
+ pool->zs_pool = zs_create_pool(name, true, MEMCG_ZSWAP_B,
+ MEMCG_ZSWAPPED);
if (!pool->zs_pool)
goto error;
--
2.52.0
* [PATCH 04/11] mm/zsmalloc: Introduce objcgs pointer in struct zspage
[not found] <20260311195153.4013476-1-joshua.hahnjy@gmail.com>
2026-03-11 19:51 ` [PATCH 03/11] mm/zsmalloc: Introduce conditional memcg awareness to zs_pool Joshua Hahn
@ 2026-03-11 19:51 ` Joshua Hahn
2026-03-11 20:17 ` Nhat Pham
2026-03-11 19:51 ` [PATCH 05/11] mm/zsmalloc: Store obj_cgroup pointer in zspage Joshua Hahn
2 siblings, 1 reply; 11+ messages in thread
From: Joshua Hahn @ 2026-03-11 19:51 UTC (permalink / raw)
To: Minchan Kim, Sergey Senozhatsky
Cc: Johannes Weiner, Harry Yoo, Yosry Ahmed, Nhat Pham, Nhat Pham,
Chengming Zhou, Andrew Morton, linux-mm, linux-block,
linux-kernel, kernel-team
Introduce an array of struct obj_cgroup pointers to zspage to keep track
of compressed objects' memcg ownership, if the zs_pool has been made to
be memcg-aware at creation time.
Move the error path for alloc_zspage to a jump label to simplify the
growing error handling path for a failed zpdesc allocation.
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Harry Yoo <harry.yoo@oracle.com>
Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
---
mm/zsmalloc.c | 34 ++++++++++++++++++++++++++--------
1 file changed, 26 insertions(+), 8 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 3f0f42b78314..dcf99516227c 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -39,6 +39,7 @@
#include <linux/zsmalloc.h>
#include <linux/fs.h>
#include <linux/workqueue.h>
+#include <linux/memcontrol.h>
#include "zpdesc.h"
#define ZSPAGE_MAGIC 0x58
@@ -273,6 +274,7 @@ struct zspage {
struct zpdesc *first_zpdesc;
struct list_head list; /* fullness list */
struct zs_pool *pool;
+ struct obj_cgroup **objcgs;
struct zspage_lock zsl;
};
@@ -825,6 +827,8 @@ static void __free_zspage(struct zs_pool *pool, struct size_class *class,
zpdesc = next;
} while (zpdesc != NULL);
+ if (pool->memcg_aware)
+ kfree(zspage->objcgs);
cache_free_zspage(zspage);
class_stat_sub(class, ZS_OBJS_ALLOCATED, class->objs_per_zspage);
@@ -946,6 +950,16 @@ static struct zspage *alloc_zspage(struct zs_pool *pool,
if (!IS_ENABLED(CONFIG_COMPACTION))
gfp &= ~__GFP_MOVABLE;
+ if (pool->memcg_aware) {
+ zspage->objcgs = kcalloc(class->objs_per_zspage,
+ sizeof(struct obj_cgroup *),
+ gfp & ~__GFP_HIGHMEM);
+ if (!zspage->objcgs) {
+ cache_free_zspage(zspage);
+ return NULL;
+ }
+ }
+
zspage->magic = ZSPAGE_MAGIC;
zspage->pool = pool;
zspage->class = class->index;
@@ -955,14 +969,8 @@ static struct zspage *alloc_zspage(struct zs_pool *pool,
struct zpdesc *zpdesc;
zpdesc = alloc_zpdesc(gfp, nid);
- if (!zpdesc) {
- while (--i >= 0) {
- zpdesc_dec_zone_page_state(zpdescs[i]);
- free_zpdesc(zpdescs[i]);
- }
- cache_free_zspage(zspage);
- return NULL;
- }
+ if (!zpdesc)
+ goto err;
__zpdesc_set_zsmalloc(zpdesc);
zpdesc_inc_zone_page_state(zpdesc);
@@ -973,6 +981,16 @@ static struct zspage *alloc_zspage(struct zs_pool *pool,
init_zspage(class, zspage);
return zspage;
+
+err:
+ while (--i >= 0) {
+ zpdesc_dec_zone_page_state(zpdescs[i]);
+ free_zpdesc(zpdescs[i]);
+ }
+ if (pool->memcg_aware)
+ kfree(zspage->objcgs);
+ cache_free_zspage(zspage);
+ return NULL;
}
static struct zspage *find_get_zspage(struct size_class *class)
--
2.52.0
* [PATCH 05/11] mm/zsmalloc: Store obj_cgroup pointer in zspage
[not found] <20260311195153.4013476-1-joshua.hahnjy@gmail.com>
2026-03-11 19:51 ` [PATCH 03/11] mm/zsmalloc: Introduce conditional memcg awareness to zs_pool Joshua Hahn
2026-03-11 19:51 ` [PATCH 04/11] mm/zsmalloc: Introduce objcgs pointer in struct zspage Joshua Hahn
@ 2026-03-11 19:51 ` Joshua Hahn
2026-03-11 20:17 ` Yosry Ahmed
2 siblings, 1 reply; 11+ messages in thread
From: Joshua Hahn @ 2026-03-11 19:51 UTC (permalink / raw)
To: Minchan Kim, Sergey Senozhatsky
Cc: Johannes Weiner, Jens Axboe, Yosry Ahmed, Nhat Pham, Nhat Pham,
Chengming Zhou, Andrew Morton, linux-mm, linux-block,
linux-kernel, kernel-team
With each zspage now having an array of obj_cgroup pointers, plumb the
obj_cgroup pointer from the zswap / zram layer down to zsmalloc.
zram still sees no visible change from its end. For the zswap path,
store the obj_cgroup pointer after compression when writing the object,
and erase the pointer when the object gets freed.
The lifetime and charging of the obj_cgroup is still handled in the
zswap layer.
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
---
drivers/block/zram/zram_drv.c | 7 ++++---
include/linux/zsmalloc.h | 3 ++-
mm/zsmalloc.c | 25 ++++++++++++++++++++++++-
mm/zswap.c | 6 +++---
4 files changed, 33 insertions(+), 8 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index d1eae5c20df7..e68e408992e7 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -2232,7 +2232,7 @@ static int write_incompressible_page(struct zram *zram, struct page *page,
}
src = kmap_local_page(page);
- zs_obj_write(zram->mem_pool, handle, src, PAGE_SIZE);
+ zs_obj_write(zram->mem_pool, handle, src, PAGE_SIZE, NULL);
kunmap_local(src);
slot_lock(zram, index);
@@ -2297,7 +2297,7 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
return -ENOMEM;
}
- zs_obj_write(zram->mem_pool, handle, zstrm->buffer, comp_len);
+ zs_obj_write(zram->mem_pool, handle, zstrm->buffer, comp_len, NULL);
zcomp_stream_put(zstrm);
slot_lock(zram, index);
@@ -2521,7 +2521,8 @@ static int recompress_slot(struct zram *zram, u32 index, struct page *page,
return PTR_ERR((void *)handle_new);
}
- zs_obj_write(zram->mem_pool, handle_new, zstrm->buffer, comp_len_new);
+ zs_obj_write(zram->mem_pool, handle_new, zstrm->buffer,
+ comp_len_new, NULL);
zcomp_stream_put(zstrm);
slot_free(zram, index);
diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index 24fb2e0fdf67..645957a156c4 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -23,6 +23,7 @@ struct zs_pool_stats {
struct zs_pool;
struct scatterlist;
+struct obj_cgroup;
enum memcg_stat_item;
struct zs_pool *zs_create_pool(const char *name, bool memcg_aware,
@@ -51,7 +52,7 @@ void zs_obj_read_sg_begin(struct zs_pool *pool, unsigned long handle,
struct scatterlist *sg, size_t mem_len);
void zs_obj_read_sg_end(struct zs_pool *pool, unsigned long handle);
void zs_obj_write(struct zs_pool *pool, unsigned long handle,
- void *handle_mem, size_t mem_len);
+ void *handle_mem, size_t mem_len, struct obj_cgroup *objcg);
extern const struct movable_operations zsmalloc_mops;
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index dcf99516227c..d4735451c273 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1195,7 +1195,7 @@ void zs_obj_read_sg_end(struct zs_pool *pool, unsigned long handle)
EXPORT_SYMBOL_GPL(zs_obj_read_sg_end);
void zs_obj_write(struct zs_pool *pool, unsigned long handle,
- void *handle_mem, size_t mem_len)
+ void *handle_mem, size_t mem_len, struct obj_cgroup *objcg)
{
struct zspage *zspage;
struct zpdesc *zpdesc;
@@ -1216,6 +1216,11 @@ void zs_obj_write(struct zs_pool *pool, unsigned long handle,
class = zspage_class(pool, zspage);
off = offset_in_page(class->size * obj_idx);
+ if (objcg) {
+ WARN_ON_ONCE(!pool->memcg_aware);
+ zspage->objcgs[obj_idx] = objcg;
+ }
+
if (!ZsHugePage(zspage))
off += ZS_HANDLE_SIZE;
@@ -1388,6 +1393,9 @@ static void obj_free(int class_size, unsigned long obj)
f_offset = offset_in_page(class_size * f_objidx);
zspage = get_zspage(f_zpdesc);
+ if (zspage->pool->memcg_aware)
+ zspage->objcgs[f_objidx] = NULL;
+
vaddr = kmap_local_zpdesc(f_zpdesc);
link = (struct link_free *)(vaddr + f_offset);
@@ -1538,6 +1546,16 @@ static unsigned long find_alloced_obj(struct size_class *class,
return handle;
}
+static void zs_migrate_objcg(struct zspage *s_zspage, struct zspage *d_zspage,
+ unsigned long used_obj, unsigned long free_obj)
+{
+ unsigned int s_idx = used_obj & OBJ_INDEX_MASK;
+ unsigned int d_idx = free_obj & OBJ_INDEX_MASK;
+
+ d_zspage->objcgs[d_idx] = s_zspage->objcgs[s_idx];
+ s_zspage->objcgs[s_idx] = NULL;
+}
+
static void migrate_zspage(struct zs_pool *pool, struct zspage *src_zspage,
struct zspage *dst_zspage)
{
@@ -1560,6 +1578,11 @@ static void migrate_zspage(struct zs_pool *pool, struct zspage *src_zspage,
used_obj = handle_to_obj(handle);
free_obj = obj_malloc(pool, dst_zspage, handle);
zs_obj_copy(class, free_obj, used_obj);
+
+ if (pool->memcg_aware)
+ zs_migrate_objcg(src_zspage, dst_zspage,
+ used_obj, free_obj);
+
obj_idx++;
obj_free(class->size, used_obj);
diff --git a/mm/zswap.c b/mm/zswap.c
index ff9abaa8aa38..68b87c3cc326 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -852,7 +852,7 @@ static void acomp_ctx_put_unlock(struct crypto_acomp_ctx *acomp_ctx)
}
static bool zswap_compress(struct page *page, struct zswap_entry *entry,
- struct zswap_pool *pool)
+ struct zswap_pool *pool, struct obj_cgroup *objcg)
{
struct crypto_acomp_ctx *acomp_ctx;
struct scatterlist input, output;
@@ -912,7 +912,7 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
goto unlock;
}
- zs_obj_write(pool->zs_pool, handle, dst, dlen);
+ zs_obj_write(pool->zs_pool, handle, dst, dlen, objcg);
entry->handle = handle;
entry->length = dlen;
@@ -1414,7 +1414,7 @@ static bool zswap_store_page(struct page *page,
return false;
}
- if (!zswap_compress(page, entry, pool))
+ if (!zswap_compress(page, entry, pool, objcg))
goto compress_failed;
old = xa_store(swap_zswap_tree(page_swpentry),
--
2.52.0
* Re: [PATCH 03/11] mm/zsmalloc: Introduce conditional memcg awareness to zs_pool
2026-03-11 19:51 ` [PATCH 03/11] mm/zsmalloc: Introduce conditional memcg awareness to zs_pool Joshua Hahn
@ 2026-03-11 20:12 ` Nhat Pham
2026-03-11 20:16 ` Johannes Weiner
1 sibling, 0 replies; 11+ messages in thread
From: Nhat Pham @ 2026-03-11 20:12 UTC (permalink / raw)
To: Joshua Hahn
Cc: Minchan Kim, Sergey Senozhatsky, Johannes Weiner, Yosry Ahmed,
Nhat Pham, Chengming Zhou, Andrew Morton, linux-mm, linux-block,
linux-kernel, kernel-team
On Wed, Mar 11, 2026 at 12:51 PM Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
>
> Introduce 3 new fields to struct zs_pool to allow individual zpools to
> be "memcg-aware": memcg_aware, compressed_stat, and uncompressed_stat.
>
> memcg_aware is used in later patches to determine whether memory
> should be allocated to keep track of per-compressed object objcgs.
> compressed_stat and uncompressed_stat are enum indices that point into
> memcg (node) stats that zsmalloc will account towards.
>
> In reality, these fields help distinguish between the two users of
> zsmalloc, zswap and zram. The enum indices compressed_stat and
> uncompressed_stat are parametrized to minimize zswap-specific hardcoding
> in zsmalloc.
>
> Suggested-by: Yosry Ahmed <yosry@kernel.org>
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Zswap side LGTM :) And for that:
Acked-by: Nhat Pham <nphamcs@gmail.com>
* Re: [PATCH 03/11] mm/zsmalloc: Introduce conditional memcg awareness to zs_pool
2026-03-11 19:51 ` [PATCH 03/11] mm/zsmalloc: Introduce conditional memcg awareness to zs_pool Joshua Hahn
2026-03-11 20:12 ` Nhat Pham
@ 2026-03-11 20:16 ` Johannes Weiner
2026-03-11 20:19 ` Yosry Ahmed
2026-03-11 20:20 ` Joshua Hahn
1 sibling, 2 replies; 11+ messages in thread
From: Johannes Weiner @ 2026-03-11 20:16 UTC (permalink / raw)
To: Joshua Hahn
Cc: Minchan Kim, Sergey Senozhatsky, Yosry Ahmed, Nhat Pham,
Nhat Pham, Chengming Zhou, Andrew Morton, linux-mm, linux-block,
linux-kernel, kernel-team
On Wed, Mar 11, 2026 at 12:51:40PM -0700, Joshua Hahn wrote:
> Introduce 3 new fields to struct zs_pool to allow individual zpools to
> be "memcg-aware": memcg_aware, compressed_stat, and uncompressed_stat.
>
> memcg_aware is used in later patches to determine whether memory
> should be allocated to keep track of per-compressed object objcgs.
> compressed_stat and uncompressed_stat are enum indices that point into
> memcg (node) stats that zsmalloc will account towards.
>
> In reality, these fields help distinguish between the two users of
> zsmalloc, zswap and zram. The enum indices compressed_stat and
> uncompressed_stat are parametrized to minimize zswap-specific hardcoding
> in zsmalloc.
>
> Suggested-by: Yosry Ahmed <yosry@kernel.org>
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> ---
> drivers/block/zram/zram_drv.c | 3 ++-
> include/linux/zsmalloc.h | 5 ++++-
> mm/zsmalloc.c | 13 ++++++++++++-
> mm/zswap.c | 3 ++-
> 4 files changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index bca33403fc8b..d1eae5c20df7 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -1980,7 +1980,8 @@ static bool zram_meta_alloc(struct zram *zram, u64 disksize)
> if (!zram->table)
> return false;
>
> - zram->mem_pool = zs_create_pool(zram->disk->disk_name);
> + /* zram does not support memcg accounting */
> + zram->mem_pool = zs_create_pool(zram->disk->disk_name, false, 0, 0);
It's a bit awkward that 0 is valid (MEMCG_SWAP). Plus you store these
values in every pool, even though they're always the same for all
zswap pools.
How about:
/* zsmalloc.h */
struct zs_memcg_params {
enum memcg_stat_item compressed;
enum memcg_stat_item uncompressed;
};
struct zs_pool *zs_create_pool(const char *name, struct zs_memcg_params *memcg_params);
/* zswap.c */
static struct zs_memcg_params zswap_memcg_params = {
.compressed = MEMCG_ZSWAP_B,
.uncompressed = MEMCG_ZSWAPPED,
};
then pass &zswap_memcg_params from zswap and NULL from zram.
> @@ -2071,6 +2079,9 @@ struct zs_pool *zs_create_pool(const char *name)
> rwlock_init(&pool->lock);
> atomic_set(&pool->compaction_in_progress, 0);
>
> + pool->memcg_aware = memcg_aware;
> + pool->compressed_stat = compressed_stat;
> + pool->uncompressed_stat = uncompressed_stat;
pool->memcg_params = memcg_params;
And then use if (pool->memcg_params) to gate in zsmalloc.c.
* Re: [PATCH 04/11] mm/zsmalloc: Introduce objcgs pointer in struct zspage
2026-03-11 19:51 ` [PATCH 04/11] mm/zsmalloc: Introduce objcgs pointer in struct zspage Joshua Hahn
@ 2026-03-11 20:17 ` Nhat Pham
2026-03-11 20:22 ` Joshua Hahn
0 siblings, 1 reply; 11+ messages in thread
From: Nhat Pham @ 2026-03-11 20:17 UTC (permalink / raw)
To: Joshua Hahn
Cc: Minchan Kim, Sergey Senozhatsky, Johannes Weiner, Harry Yoo,
Yosry Ahmed, Nhat Pham, Chengming Zhou, Andrew Morton, linux-mm,
linux-block, linux-kernel, kernel-team
On Wed, Mar 11, 2026 at 12:52 PM Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
>
> Introduce an array of struct obj_cgroup pointers to zspage to keep track
> of compressed objects' memcg ownership, if the zs_pool has been made to
> be memcg-aware at creation time.
>
> Move the error path for alloc_zspage to a jump label to simplify the
> growing error handling path for a failed zpdesc allocation.
>
> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> Suggested-by: Harry Yoo <harry.yoo@oracle.com>
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> ---
> mm/zsmalloc.c | 34 ++++++++++++++++++++++++++--------
> 1 file changed, 26 insertions(+), 8 deletions(-)
>
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 3f0f42b78314..dcf99516227c 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -39,6 +39,7 @@
> #include <linux/zsmalloc.h>
> #include <linux/fs.h>
> #include <linux/workqueue.h>
> +#include <linux/memcontrol.h>
> #include "zpdesc.h"
>
> #define ZSPAGE_MAGIC 0x58
> @@ -273,6 +274,7 @@ struct zspage {
> struct zpdesc *first_zpdesc;
> struct list_head list; /* fullness list */
> struct zs_pool *pool;
> + struct obj_cgroup **objcgs;
> struct zspage_lock zsl;
> };
>
> @@ -825,6 +827,8 @@ static void __free_zspage(struct zs_pool *pool, struct size_class *class,
> zpdesc = next;
> } while (zpdesc != NULL);
>
> + if (pool->memcg_aware)
> + kfree(zspage->objcgs);
> cache_free_zspage(zspage);
>
> class_stat_sub(class, ZS_OBJS_ALLOCATED, class->objs_per_zspage);
> @@ -946,6 +950,16 @@ static struct zspage *alloc_zspage(struct zs_pool *pool,
> if (!IS_ENABLED(CONFIG_COMPACTION))
> gfp &= ~__GFP_MOVABLE;
>
> + if (pool->memcg_aware) {
> + zspage->objcgs = kcalloc(class->objs_per_zspage,
> + sizeof(struct obj_cgroup *),
> + gfp & ~__GFP_HIGHMEM);
> I remember asking this, so my apologies if I missed/forgot your
> response - but would vmalloc work here? i.e., kvcalloc to fall back to
> vmalloc, etc.?
> + if (!zspage->objcgs) {
> + cache_free_zspage(zspage);
> + return NULL;
> + }
> + }
> +
> zspage->magic = ZSPAGE_MAGIC;
> zspage->pool = pool;
> zspage->class = class->index;
> @@ -955,14 +969,8 @@ static struct zspage *alloc_zspage(struct zs_pool *pool,
> struct zpdesc *zpdesc;
>
> zpdesc = alloc_zpdesc(gfp, nid);
> - if (!zpdesc) {
> - while (--i >= 0) {
> - zpdesc_dec_zone_page_state(zpdescs[i]);
> - free_zpdesc(zpdescs[i]);
> - }
> - cache_free_zspage(zspage);
> - return NULL;
> - }
> + if (!zpdesc)
> + goto err;
> __zpdesc_set_zsmalloc(zpdesc);
>
> zpdesc_inc_zone_page_state(zpdesc);
> @@ -973,6 +981,16 @@ static struct zspage *alloc_zspage(struct zs_pool *pool,
> init_zspage(class, zspage);
>
> return zspage;
> +
> +err:
> + while (--i >= 0) {
> + zpdesc_dec_zone_page_state(zpdescs[i]);
> + free_zpdesc(zpdescs[i]);
> + }
> + if (pool->memcg_aware)
> + kfree(zspage->objcgs);
> + cache_free_zspage(zspage);
> + return NULL;
> }
>
> static struct zspage *find_get_zspage(struct size_class *class)
> --
> 2.52.0
>
* Re: [PATCH 05/11] mm/zsmalloc: Store obj_cgroup pointer in zspage
2026-03-11 19:51 ` [PATCH 05/11] mm/zsmalloc: Store obj_cgroup pointer in zspage Joshua Hahn
@ 2026-03-11 20:17 ` Yosry Ahmed
2026-03-11 20:24 ` Joshua Hahn
0 siblings, 1 reply; 11+ messages in thread
From: Yosry Ahmed @ 2026-03-11 20:17 UTC (permalink / raw)
To: Joshua Hahn
Cc: Minchan Kim, Sergey Senozhatsky, Johannes Weiner, Jens Axboe,
Yosry Ahmed, Nhat Pham, Nhat Pham, Chengming Zhou, Andrew Morton,
linux-mm, linux-block, linux-kernel, kernel-team
[..]
> @@ -1216,6 +1216,11 @@ void zs_obj_write(struct zs_pool *pool, unsigned long handle,
> class = zspage_class(pool, zspage);
> off = offset_in_page(class->size * obj_idx);
>
> + if (objcg) {
> + WARN_ON_ONCE(!pool->memcg_aware);
> + zspage->objcgs[obj_idx] = objcg;
> + }
If pool->memcg_aware is not set, the warning will fire, but the
following line will write to uninitialized memory and probably crash.
We should avoid the write if the warning fires.
Maybe:
if (objcg && !WARN_ON_ONCE(!pool->memcg_aware))
zspage->objcgs[obj_idx] = objcg;
Not pretty, but the same pattern is followed in many places in the kernel.
> +
> if (!ZsHugePage(zspage))
> off += ZS_HANDLE_SIZE;
>
* Re: [PATCH 03/11] mm/zsmalloc: Introduce conditional memcg awareness to zs_pool
2026-03-11 20:16 ` Johannes Weiner
@ 2026-03-11 20:19 ` Yosry Ahmed
2026-03-11 20:20 ` Joshua Hahn
1 sibling, 0 replies; 11+ messages in thread
From: Yosry Ahmed @ 2026-03-11 20:19 UTC (permalink / raw)
To: Johannes Weiner
Cc: Joshua Hahn, Minchan Kim, Sergey Senozhatsky, Yosry Ahmed,
Nhat Pham, Nhat Pham, Chengming Zhou, Andrew Morton, linux-mm,
linux-block, linux-kernel, kernel-team
> It's a bit awkward that 0 is valid (MEMCG_SWAP). Plus you store these
> values in every pool, even though they're always the same for all
> zswap pools.
>
> How about:
>
> /* zsmalloc.h */
> struct zs_memcg_params {
> enum memcg_stat_item compressed;
> enum memcg_stat_item uncompressed;
> };
> struct zs_pool *zs_create_pool(const char *name, struct zs_memcg_params *memcg_params);
>
> /* zswap.c */
> static struct zs_memcg_params zswap_memcg_params = {
> .compressed = MEMCG_ZSWAP_B,
> .uncompressed = MEMCG_ZSWAPPED,
> };
>
> then pass &zswap_memcg_params from zswap and NULL from zram.
>
> > @@ -2071,6 +2079,9 @@ struct zs_pool *zs_create_pool(const char *name)
> > rwlock_init(&pool->lock);
> > atomic_set(&pool->compaction_in_progress, 0);
> >
> > + pool->memcg_aware = memcg_aware;
> > + pool->compressed_stat = compressed_stat;
> > + pool->uncompressed_stat = uncompressed_stat;
>
> pool->memcg_params = memcg_params;
>
> And then use if (pool->memcg_params) to gate in zsmalloc.c.
I like this.
I also didn't like the suggested prototype for zs_create_pool(), and
didn't like that compressed_stat and uncompressed_stat did not have
memcg anywhere in the name. I was going to suggest adding a warning if
memcg_aware=false but the stat indices are non-zero, but your
suggestion is so much cleaner.
* Re: [PATCH 03/11] mm/zsmalloc: Introduce conditional memcg awareness to zs_pool
2026-03-11 20:16 ` Johannes Weiner
2026-03-11 20:19 ` Yosry Ahmed
@ 2026-03-11 20:20 ` Joshua Hahn
1 sibling, 0 replies; 11+ messages in thread
From: Joshua Hahn @ 2026-03-11 20:20 UTC (permalink / raw)
To: Johannes Weiner
Cc: Minchan Kim, Sergey Senozhatsky, Yosry Ahmed, Nhat Pham,
Nhat Pham, Chengming Zhou, Andrew Morton, linux-mm, linux-block,
linux-kernel, kernel-team
On Wed, 11 Mar 2026 16:16:34 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:
> On Wed, Mar 11, 2026 at 12:51:40PM -0700, Joshua Hahn wrote:
> > Introduce 3 new fields to struct zs_pool to allow individual zpools to
> > be "memcg-aware": memcg_aware, compressed_stat, and uncompressed_stat.
> >
> > memcg_aware is used in later patches to determine whether memory
> > should be allocated to keep track of per-compressed object objcgs.
> > compressed_stat and uncompressed_stat are enum indices that point into
> > memcg (node) stats that zsmalloc will account towards.
> >
> > In reality, these fields help distinguish between the two users of
> > zsmalloc, zswap and zram. The enum indices compressed_stat and
> > uncompressed_stat are parametrized to minimize zswap-specific hardcoding
> > in zsmalloc.
> >
> > Suggested-by: Yosry Ahmed <yosry@kernel.org>
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> > ---
> > drivers/block/zram/zram_drv.c | 3 ++-
> > include/linux/zsmalloc.h | 5 ++++-
> > mm/zsmalloc.c | 13 ++++++++++++-
> > mm/zswap.c | 3 ++-
> > 4 files changed, 20 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index bca33403fc8b..d1eae5c20df7 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -1980,7 +1980,8 @@ static bool zram_meta_alloc(struct zram *zram, u64 disksize)
> > if (!zram->table)
> > return false;
> >
> > - zram->mem_pool = zs_create_pool(zram->disk->disk_name);
> > + /* zram does not support memcg accounting */
> > + zram->mem_pool = zs_create_pool(zram->disk->disk_name, false, 0, 0);
Hello Johannes,
Thank you for your review! I hope you are doing well :-)
> It's a bit awkward that 0 is valid (MEMCG_SWAP). Plus you store these
> values in every pool, even though they're always the same for all
> zswap pools.
Agreed. Originally I thought of removing memcg_aware and doing a bitwise
OR check of the two stats, but that also felt a bit strange
(and 0 is not a valid enum state for memcg_stat_item anyway).
> How about:
>
> /* zsmalloc.h */
> struct zs_memcg_params {
> enum memcg_stat_item compressed;
> enum memcg_stat_item uncompressed;
> };
> struct zs_pool *zs_create_pool(const char *name, struct zs_memcg_params *memcg_params);
>
> /* zswap.c */
> static struct zs_memcg_params zswap_memcg_params = {
> .compressed = MEMCG_ZSWAP_B,
> .uncompressed = MEMCG_ZSWAPPED,
> };
>
> then pass &zswap_memcg_params from zswap and NULL from zram.
>
> > @@ -2071,6 +2079,9 @@ struct zs_pool *zs_create_pool(const char *name)
> > rwlock_init(&pool->lock);
> > atomic_set(&pool->compaction_in_progress, 0);
> >
> > + pool->memcg_aware = memcg_aware;
> > + pool->compressed_stat = compressed_stat;
> > + pool->uncompressed_stat = uncompressed_stat;
>
> pool->memcg_params = memcg_params;
>
> And then use if (pool->memcg_params) to gate in zsmalloc.c.
These definitely look a lot cleaner. Will make these changes in v3!
Thanks again. I hope you have a great day!
Joshua
* Re: [PATCH 04/11] mm/zsmalloc: Introduce objcgs pointer in struct zspage
2026-03-11 20:17 ` Nhat Pham
@ 2026-03-11 20:22 ` Joshua Hahn
0 siblings, 0 replies; 11+ messages in thread
From: Joshua Hahn @ 2026-03-11 20:22 UTC (permalink / raw)
To: Nhat Pham
Cc: Minchan Kim, Sergey Senozhatsky, Johannes Weiner, Harry Yoo,
Yosry Ahmed, Nhat Pham, Chengming Zhou, Andrew Morton, linux-mm,
linux-block, linux-kernel, kernel-team
On Wed, 11 Mar 2026 13:17:22 -0700 Nhat Pham <nphamcs@gmail.com> wrote:
> On Wed, Mar 11, 2026 at 12:52 PM Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
> >
> > Introduce an array of struct obj_cgroup pointers to zspage to keep track
> > of compressed objects' memcg ownership, if the zs_pool has been made to
> > be memcg-aware at creation time.
> >
> > Move the error path for alloc_zspage to a jump label to simplify the
> > growing error handling path for a failed zpdesc allocation.
> >
> > Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> > Suggested-by: Harry Yoo <harry.yoo@oracle.com>
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> > ---
> > mm/zsmalloc.c | 34 ++++++++++++++++++++++++++--------
> > 1 file changed, 26 insertions(+), 8 deletions(-)
> >
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index 3f0f42b78314..dcf99516227c 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -39,6 +39,7 @@
> > #include <linux/zsmalloc.h>
> > #include <linux/fs.h>
> > #include <linux/workqueue.h>
> > +#include <linux/memcontrol.h>
> > #include "zpdesc.h"
> >
> > #define ZSPAGE_MAGIC 0x58
> > @@ -273,6 +274,7 @@ struct zspage {
> > struct zpdesc *first_zpdesc;
> > struct list_head list; /* fullness list */
> > struct zs_pool *pool;
> > + struct obj_cgroup **objcgs;
> > struct zspage_lock zsl;
> > };
> >
> > @@ -825,6 +827,8 @@ static void __free_zspage(struct zs_pool *pool, struct size_class *class,
> > zpdesc = next;
> > } while (zpdesc != NULL);
> >
> > + if (pool->memcg_aware)
> > + kfree(zspage->objcgs);
> > cache_free_zspage(zspage);
> >
> > class_stat_sub(class, ZS_OBJS_ALLOCATED, class->objs_per_zspage);
> > @@ -946,6 +950,16 @@ static struct zspage *alloc_zspage(struct zs_pool *pool,
> > if (!IS_ENABLED(CONFIG_COMPACTION))
> > gfp &= ~__GFP_MOVABLE;
> >
> > + if (pool->memcg_aware) {
> > + zspage->objcgs = kcalloc(class->objs_per_zspage,
> > + sizeof(struct obj_cgroup *),
> > + gfp & ~__GFP_HIGHMEM);
>
> I remembered asking this, so my apologies if I missed/forgot your
> response - but would vmalloc work here? i.e kvcalloc to fallback to
> vmalloc etc.?
Hello Nhat :-)
Thank you for reviewing, and for your acks on the other parts!
You're right, I missed changing that on my end after v1. No reason
vmalloc shouldn't work here; let me make that change in v3.
Thanks, I hope you have a great day!
Joshua
* Re: [PATCH 05/11] mm/zsmalloc: Store obj_cgroup pointer in zspage
2026-03-11 20:17 ` Yosry Ahmed
@ 2026-03-11 20:24 ` Joshua Hahn
0 siblings, 0 replies; 11+ messages in thread
From: Joshua Hahn @ 2026-03-11 20:24 UTC (permalink / raw)
To: Yosry Ahmed
Cc: Minchan Kim, Sergey Senozhatsky, Johannes Weiner, Jens Axboe,
Yosry Ahmed, Nhat Pham, Nhat Pham, Chengming Zhou, Andrew Morton,
linux-mm, linux-block, linux-kernel, kernel-team
On Wed, 11 Mar 2026 13:17:26 -0700 Yosry Ahmed <yosry@kernel.org> wrote:
> [..]
> > @@ -1216,6 +1216,11 @@ void zs_obj_write(struct zs_pool *pool, unsigned long handle,
> > class = zspage_class(pool, zspage);
> > off = offset_in_page(class->size * obj_idx);
> >
> > + if (objcg) {
> > + WARN_ON_ONCE(!pool->memcg_aware);
> > + zspage->objcgs[obj_idx] = objcg;
> > + }
Hello Yosry,
I hope you are doing well. Thank you for reviewing this series! :-)
> If pool->memcg_aware is not set the warning will fire, but the
> following line will write to uninitialized memory and probably crash.
> We should avoid the write if the warning fires.
>
> Maybe:
>
> if (objcg && !WARN_ON_ONCE(!pool->memcg_aware))
> zspage->objcgs[obj_idx] = objcg;
Ack.
> Not pretty, but the same pattern is followed in many places in the kernel.
>
> > +
> > if (!ZsHugePage(zspage))
> > off += ZS_HANDLE_SIZE;
> >
Definitely better than writing garbage and crashing :-)
I'll make this change in the next version. I think I should also
sprinkle these WARN_ON_ONCEs in a few other places; I'll be mindful of
this pattern for those cases too.
Thank you again Yosry, I hope you have a great day!
Joshua