linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm/mglru: Update MG-LRU proactive reclaim statistics only to memcg
@ 2025-07-17  8:28 Hao Jia
  2025-07-17 19:47 ` Shakeel Butt
  2025-07-17 20:18 ` Yuanchu Xie
  0 siblings, 2 replies; 5+ messages in thread
From: Hao Jia @ 2025-07-17  8:28 UTC (permalink / raw)
  To: akpm, yuzhao, yuanchu, shakeel.butt, mhocko, lorenzo.stoakes,
	kinseyho, hannes, gthelen, david, axelrasmussen, zhengqi.arch
  Cc: linux-mm, linux-kernel, Hao Jia

From: Hao Jia <jiahao1@lixiang.com>

Users can use /sys/kernel/debug/lru_gen to trigger proactive memory reclaim
of a specified memcg. Currently, statistics such as pgrefill, pgscan and
pgsteal will be updated to the /proc/vmstat system memory statistics.

This will confuse some system memory pressure monitoring tools, making
it difficult to determine whether pgscan and pgsteal are caused by
system-level pressure or by proactive memory reclaim of some specific
memory cgroup.

Therefore, make this interface behave similarly to memory.reclaim.
Update proactive memory reclaim statistics only to its memory cgroup.

Signed-off-by: Hao Jia <jiahao1@lixiang.com>
---
 mm/vmscan.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index f8dfd2864bbf..bc92ec338065 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5545,6 +5545,7 @@ static int run_cmd(char cmd, int memcg_id, int nid, unsigned long seq,
 	if (memcg_id != mem_cgroup_id(memcg))
 		goto done;
 
+	sc->target_mem_cgroup = memcg;
 	lruvec = get_lruvec(memcg, nid);
 
 	if (swappiness < MIN_SWAPPINESS)
@@ -5581,6 +5582,7 @@ static ssize_t lru_gen_seq_write(struct file *file, const char __user *src,
 		.may_swap = true,
 		.reclaim_idx = MAX_NR_ZONES - 1,
 		.gfp_mask = GFP_KERNEL,
+		.proactive = true,
 	};
 
 	buf = kvmalloc(len + 1, GFP_KERNEL);
-- 
2.34.1
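
The effect of the two added lines can be modeled in user space. The sketch below is not kernel code: `struct scan_control_model`, `struct counters` and `account_scan()` are hypothetical names. It assumes that the reclaim paths skip the system-wide /proc/vmstat counters whenever `target_mem_cgroup` is set, and that `proactive` selects a separate per-memcg proactive bucket, which is the memory.reclaim behaviour the commit message refers to.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two scan_control fields the patch sets. */
struct scan_control_model {
	const void *target_mem_cgroup;	/* non-NULL: reclaim targets one memcg */
	bool proactive;			/* reclaim was requested by user space */
};

/* Hypothetical counters: one system-wide, two per-memcg buckets. */
struct counters {
	unsigned long vmstat_pgscan;		/* models /proc/vmstat */
	unsigned long memcg_pgscan_direct;	/* models memory.stat, direct */
	unsigned long memcg_pgscan_proactive;	/* models memory.stat, proactive */
};

/*
 * Assumed accounting rule: the global counter is bumped only when the
 * reclaim is not targeted at a memcg; the memcg is always charged, in the
 * bucket chosen by the proactive flag.
 */
static void account_scan(const struct scan_control_model *sc,
			 struct counters *c, unsigned long nr_scanned)
{
	if (sc->target_mem_cgroup == NULL)
		c->vmstat_pgscan += nr_scanned;

	if (sc->proactive)
		c->memcg_pgscan_proactive += nr_scanned;
	else
		c->memcg_pgscan_direct += nr_scanned;
}
```

With both fields left unset, as before the patch, a scan is charged to the global counter and to the direct bucket. With both set, only the memcg's proactive bucket changes.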



* Re: [PATCH] mm/mglru: Update MG-LRU proactive reclaim statistics only to memcg
  2025-07-17  8:28 [PATCH] mm/mglru: Update MG-LRU proactive reclaim statistics only to memcg Hao Jia
@ 2025-07-17 19:47 ` Shakeel Butt
  2025-07-18  3:09   ` Hao Jia
  2025-07-17 20:18 ` Yuanchu Xie
  1 sibling, 1 reply; 5+ messages in thread
From: Shakeel Butt @ 2025-07-17 19:47 UTC (permalink / raw)
  To: Hao Jia
  Cc: akpm, yuzhao, yuanchu, mhocko, lorenzo.stoakes, kinseyho, hannes,
	gthelen, david, axelrasmussen, zhengqi.arch, linux-mm,
	linux-kernel, Hao Jia

Hi Hao,

On Thu, Jul 17, 2025 at 04:28:45PM +0800, Hao Jia wrote:
> From: Hao Jia <jiahao1@lixiang.com>
> 
> Users can use /sys/kernel/debug/lru_gen to trigger proactive memory reclaim
> of a specified memcg.

Are you using this interface for proactively reclaiming a specific
memcg? I see run_cmd() using mem_cgroup_from_id() to get memcg from a
given id but I don't think we expose ids from mem_cgroup_ids to the
userspace. Usually we use cgroup_id which is just an inode number for
the cgroup folder. I wonder if the current users of this interface are
providing memcg id.

> Currently, statistics such as pgrefill, pgscan and
> pgsteal will be updated to the /proc/vmstat system memory statistics.
> 
> This will confuse some system memory pressure monitoring tools, making
> it difficult to determine whether pgscan and pgsteal are caused by
> system-level pressure or by proactive memory reclaim of some specific
> memory cgroup.
> 
> Therefore, make this interface behave similarly to memory.reclaim.
> Update proactive memory reclaim statistics only to its memory cgroup.
> 
> Signed-off-by: Hao Jia <jiahao1@lixiang.com>

The patch looks fine though.

> ---
>  mm/vmscan.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index f8dfd2864bbf..bc92ec338065 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -5545,6 +5545,7 @@ static int run_cmd(char cmd, int memcg_id, int nid, unsigned long seq,
>  	if (memcg_id != mem_cgroup_id(memcg))
>  		goto done;
>  
> +	sc->target_mem_cgroup = memcg;
>  	lruvec = get_lruvec(memcg, nid);
>  
>  	if (swappiness < MIN_SWAPPINESS)
> @@ -5581,6 +5582,7 @@ static ssize_t lru_gen_seq_write(struct file *file, const char __user *src,
>  		.may_swap = true,
>  		.reclaim_idx = MAX_NR_ZONES - 1,
>  		.gfp_mask = GFP_KERNEL,
> +		.proactive = true,
>  	};
>  
>  	buf = kvmalloc(len + 1, GFP_KERNEL);
> -- 
> 2.34.1
> 


* Re: [PATCH] mm/mglru: Update MG-LRU proactive reclaim statistics only to memcg
  2025-07-17  8:28 [PATCH] mm/mglru: Update MG-LRU proactive reclaim statistics only to memcg Hao Jia
  2025-07-17 19:47 ` Shakeel Butt
@ 2025-07-17 20:18 ` Yuanchu Xie
  2025-07-18  3:30   ` Hao Jia
  1 sibling, 1 reply; 5+ messages in thread
From: Yuanchu Xie @ 2025-07-17 20:18 UTC (permalink / raw)
  To: Hao Jia
  Cc: akpm, yuzhao, shakeel.butt, mhocko, lorenzo.stoakes, kinseyho,
	hannes, gthelen, david, axelrasmussen, zhengqi.arch, linux-mm,
	linux-kernel, Hao Jia

Hi Hao,

On Thu, Jul 17, 2025 at 1:29 AM Hao Jia <jiahao.kernel@gmail.com> wrote:
>
> From: Hao Jia <jiahao1@lixiang.com>
>
> Users can use /sys/kernel/debug/lru_gen to trigger proactive memory reclaim
> of a specified memcg. Currently, statistics such as pgrefill, pgscan and
> pgsteal will be updated to the /proc/vmstat system memory statistics.

This is a debugfs interface and it's not meant for use in production
or provide a stable ABI. Does memory.reclaim not work for your needs?

I'm not against the change; I just hope you don't depend on it
continuing to exist/behave a certain way.

Shakeel's comment is accurate. The lru_gen interface uses the internal
memcg id which is not usually used to interface with the userspace.
Reading this file does show the cgroup path and memcg id association.

>
> This will confuse some system memory pressure monitoring tools, making
> it difficult to determine whether pgscan and pgsteal are caused by
> system-level pressure or by proactive memory reclaim of some specific
> memory cgroup.
>
> Therefore, make this interface behave similarly to memory.reclaim.
> Update proactive memory reclaim statistics only to its memory cgroup.
>
> Signed-off-by: Hao Jia <jiahao1@lixiang.com>

The patch looks okay to me too.

Thanks,
Yuanchu


* Re: [PATCH] mm/mglru: Update MG-LRU proactive reclaim statistics only to memcg
  2025-07-17 19:47 ` Shakeel Butt
@ 2025-07-18  3:09   ` Hao Jia
  0 siblings, 0 replies; 5+ messages in thread
From: Hao Jia @ 2025-07-18  3:09 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: akpm, yuzhao, yuanchu, mhocko, lorenzo.stoakes, kinseyho, hannes,
	gthelen, david, axelrasmussen, zhengqi.arch, linux-mm,
	linux-kernel, Hao Jia



On 2025/7/18 03:47, Shakeel Butt wrote:
> Hi Hao,
> 
> On Thu, Jul 17, 2025 at 04:28:45PM +0800, Hao Jia wrote:
>> From: Hao Jia <jiahao1@lixiang.com>
>>
>> Users can use /sys/kernel/debug/lru_gen to trigger proactive memory reclaim
>> of a specified memcg.
> 

Hi Shakeel,

> Are you using this interface for proactively reclaiming a specific
> memcg? 

I am comparing /sys/kernel/debug/lru_gen with memory.reclaim as ways to
trigger memory reclaim when MG-LRU is enabled.
For a user-space agent, the two interfaces can achieve the same result.


> I see run_cmd() using mem_cgroup_from_id() to get memcg from a
> given id but I don't think we expose ids from mem_cgroup_ids to the
> userspace. Usually we use cgroup_id which is just an inode number for
> the cgroup folder. I wonder if the current users of this interface are
> providing memcg id.

We can get memcg id through ` cat /sys/kernel/debug/lru_gen `.
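
A minimal sketch of that lookup, assuming each memcg appears in the file as a header line of the form "memcg <id> <path>" followed by its node and generation lines. `parse_memcg_line()` and `find_memcg_id()` are hypothetical helper names, and the path in the usage note is only an example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one "memcg <id> <path>" header line. Returns 1 and fills *id and
 * path on a match, 0 for any other line (node and generation lines).
 */
static int parse_memcg_line(const char *line, int *id, char path[256])
{
	return sscanf(line, "memcg %d %255[^\n]", id, path) == 2;
}

/*
 * Scan an already opened lru_gen stream for cgroup_path.
 * Returns the memcg id, or -1 if the path is not listed.
 */
static int find_memcg_id(FILE *f, const char *cgroup_path)
{
	char line[512], path[256];
	int id;

	while (fgets(line, sizeof(line), f)) {
		if (parse_memcg_line(line, &id, path) &&
		    strcmp(path, cgroup_path) == 0)
			return id;
	}
	return -1;
}
```

An agent would open /sys/kernel/debug/lru_gen with fopen() and call, for example, find_memcg_id(f, "/user.slice/app"). Because this is a debugfs file, the line format is not a stable ABI and the parser may need adjusting between kernel versions.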


Thanks,
Hao
> 
>> Currently, statistics such as pgrefill, pgscan and
>> pgsteal will be updated to the /proc/vmstat system memory statistics.
>>
>> This will confuse some system memory pressure monitoring tools, making
>> it difficult to determine whether pgscan and pgsteal are caused by
>> system-level pressure or by proactive memory reclaim of some specific
>> memory cgroup.
>>
>> Therefore, make this interface behave similarly to memory.reclaim.
>> Update proactive memory reclaim statistics only to its memory cgroup.
>>



* Re: [PATCH] mm/mglru: Update MG-LRU proactive reclaim statistics only to memcg
  2025-07-17 20:18 ` Yuanchu Xie
@ 2025-07-18  3:30   ` Hao Jia
  0 siblings, 0 replies; 5+ messages in thread
From: Hao Jia @ 2025-07-18  3:30 UTC (permalink / raw)
  To: Yuanchu Xie
  Cc: akpm, yuzhao, shakeel.butt, mhocko, lorenzo.stoakes, kinseyho,
	hannes, gthelen, david, axelrasmussen, zhengqi.arch, linux-mm,
	linux-kernel, Hao Jia



On 2025/7/18 04:18, Yuanchu Xie wrote:

Hi Yuanchu,

> Hi Hao,
> 
> On Thu, Jul 17, 2025 at 1:29 AM Hao Jia <jiahao.kernel@gmail.com> wrote:
>>
>> From: Hao Jia <jiahao1@lixiang.com>
>>
>> Users can use /sys/kernel/debug/lru_gen to trigger proactive memory reclaim
>> of a specified memcg. Currently, statistics such as pgrefill, pgscan and
>> pgsteal will be updated to the /proc/vmstat system memory statistics.
> 
> This is a debugfs interface and it's not meant for use in production
> or provide a stable ABI. Does memory.reclaim not work for your needs?
> 

memory.reclaim does work for us; I am comparing the two interfaces.

Thanks for the reminder. What I want from this interface is run_aging():
use it to age folios and, by toggling BIT(LRU_GEN_MM_WALK), keep the
repeated walk_mm() calls out of proactive memory reclaim.

For example, a user-space agent enables LRU_GEN_MM_WALK and triggers
run_aging(), then turns LRU_GEN_MM_WALK off and uses memory.reclaim to
trigger proactive reclaim. This avoids the long latency caused by walk_mm().
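
The agent steps above can be sketched as two small helpers. This assumes the component bits documented for /sys/kernel/mm/lru_gen/enabled (0x0001 for the main switch, 0x0002 for the page-table walk) and the debugfs aging command "+ memcg_id node_id max_gen_nr". The macro and function names are hypothetical, and the actual file writes are left to the caller.

```c
#include <stdio.h>

/* Component bits assumed from the lru_gen "enabled" file documentation. */
#define LRU_GEN_CORE_BIT	0x0001u	/* main switch */
#define LRU_GEN_MM_WALK_BIT	0x0002u	/* page-table walks during aging */

/* Return the "enabled" mask with the MM walk component switched on or off. */
static unsigned int enabled_mask(unsigned int cur, int mm_walk)
{
	return mm_walk ? (cur | LRU_GEN_MM_WALK_BIT)
		       : (cur & ~LRU_GEN_MM_WALK_BIT);
}

/*
 * Format the aging command for /sys/kernel/debug/lru_gen. max_gen must be
 * the max_gen_nr currently reported for that memcg and node.
 * Returns the snprintf() result.
 */
static int format_aging_cmd(char *buf, size_t len, int memcg_id, int nid,
			    unsigned long max_gen)
{
	return snprintf(buf, len, "+ %d %d %lu", memcg_id, nid, max_gen);
}
```

The agent would write enabled_mask(cur, 1) to the enabled file, write the formatted aging command to the debugfs file, write enabled_mask(cur, 0) back, and finally write the reclaim amount to the cgroup's memory.reclaim.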


Maybe it would be more reasonable to put walk_mm() in workqueues?

I don't know whether this idea is reasonable; any suggestions are welcome.

Thanks,
Hao

> I'm not against the change; I just hope you don't depend on it
> continuing to exist/behave a certain way.
> 
> Shakeel's comment is accurate. The lru_gen interface uses the internal
> memcg id which is not usually used to interface with the userspace.
> Reading this file does show the cgroup path and memcg id association.
> 
>>
>> This will confuse some system memory pressure monitoring tools, making
>> it difficult to determine whether pgscan and pgsteal are caused by
>> system-level pressure or by proactive memory reclaim of some specific
>> memory cgroup.
>>
>> Therefore, make this interface behave similarly to memory.reclaim.
>> Update proactive memory reclaim statistics only to its memory cgroup.
>>
>> Signed-off-by: Hao Jia <jiahao1@lixiang.com>
> 
> The patch looks okay to me too.
> 
> Thanks,
> Yuanchu


