public inbox for linux-mm@kvack.org
From: Bing Jiao <bingjiao@google.com>
To: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Muchun Song <muchun.song@linux.dev>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: Re: [RFC PATCH 6/6] mm/memcontrol: Make memory.high tier-aware
Date: Wed, 11 Mar 2026 22:05:16 +0000
Message-ID: <abHnHN74V3okn28D@google.com>
In-Reply-To: <20260223223830.586018-7-joshua.hahnjy@gmail.com>

On Mon, Feb 23, 2026 at 02:38:29PM -0800, Joshua Hahn wrote:
> @@ -4485,15 +4527,22 @@ static ssize_t memory_high_write(struct kernfs_open_file *of,
>  		return err;
>
>  	page_counter_set_high(&memcg->memory, high);
> +	toptier_high = page_counter_toptier_high(&memcg->memory);
>
>  	if (of->file->f_flags & O_NONBLOCK)
>  		goto out;
>
>  	for (;;) {
>  		unsigned long nr_pages = page_counter_read(&memcg->memory);
> +		unsigned long toptier_pages = mem_cgroup_toptier_usage(memcg);
>  		unsigned long reclaimed;
> +		unsigned long to_free;
> +		nodemask_t toptier_nodes, *reclaim_nodes;
> +		bool mem_high_ok = nr_pages <= high;
> +		bool toptier_high_ok = !(tier_aware_memcg_limits &&
> +					 toptier_pages > toptier_high);
>
> -		if (nr_pages <= high)
> +		if (mem_high_ok && toptier_high_ok)
>  			break;
>
>  		if (signal_pending(current))
> @@ -4505,8 +4554,17 @@ static ssize_t memory_high_write(struct kernfs_open_file *of,
>  			continue;
>  		}
>
> -		reclaimed = try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
> -					GFP_KERNEL, MEMCG_RECLAIM_MAY_SWAP, NULL);
> +		mt_get_toptier_nodemask(&toptier_nodes, NULL);
> +		if (mem_high_ok && !toptier_high_ok) {
> +			reclaim_nodes = &toptier_nodes;
> +			to_free = toptier_pages - toptier_high;
> +		} else {
> +			reclaim_nodes = NULL;
> +			to_free = nr_pages - high;
> +		}
> +		reclaimed = try_to_free_mem_cgroup_pages(memcg, to_free,
> +					GFP_KERNEL, MEMCG_RECLAIM_MAY_SWAP,
> +					NULL, reclaim_nodes);
>
>  		if (!reclaimed && !nr_retries--)
>  			break;

Hi Joshua, thanks for the patch.

I have a concern regarding the system behavior when both the total
memory.high limit and the new toptier_high limit are breached.

If both mem_high_ok and toptier_high_ok are false, memory_high_write()
invokes try_to_free_mem_cgroup_pages() with reclaim_nodes set to NULL,
which targets all nodes. In that case the reclaimer may try to meet the
to_free target by demoting pages from the top tier to lower tiers. That
satisfies the toptier_high requirement, but it does not reduce the
cgroup's total memory charge, because the counter tracks the sum across
all tiers. Since total memory usage is unchanged, the reclaimer will
likely stay in the loop until it exhausts MAX_RECLAIM_RETRIES (the
!reclaimed && !nr_retries-- check) or hits another exit condition. The
result is excessive CPU consumption without bringing the cgroup below
its total memory limit, demotion of all top-tier pages to the far tier,
or premature OOM kills.
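
A toy user-space model may make the concern concrete. It is not kernel
code: struct memcg_model and the model_*() helpers are hypothetical names
invented for illustration, and the model assumes demotion is the only
thing the reclaimer does. It only shows that moving pages between tiers
leaves the summed charge unchanged, while swapping out of the lower tier
shrinks it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one cgroup's charge, split by tier. Not kernel code. */
struct memcg_model {
	unsigned long toptier_pages;	/* pages resident on top-tier nodes */
	unsigned long lowtier_pages;	/* pages resident on lower-tier nodes */
};

/* The total limit is checked against the sum across all tiers. */
static unsigned long model_total(const struct memcg_model *m)
{
	return m->toptier_pages + m->lowtier_pages;
}

/* Demotion migrates pages between tiers; the cgroup stays charged. */
static unsigned long model_demote(struct memcg_model *m, unsigned long nr)
{
	if (nr > m->toptier_pages)
		nr = m->toptier_pages;
	m->toptier_pages -= nr;
	m->lowtier_pages += nr;
	return nr;
}

/* Swapping out of the lower tier uncharges pages, so the total shrinks. */
static unsigned long model_swap_out_lowtier(struct memcg_model *m,
					    unsigned long nr)
{
	if (nr > m->lowtier_pages)
		nr = m->lowtier_pages;
	m->lowtier_pages -= nr;
	return nr;
}

/*
 * Demotion-only loop: returns true if the total drops to 'high' or below
 * within max_rounds. Demotion never changes the total, so this can only
 * succeed if the total was already within the limit.
 */
static bool model_demote_until_ok(struct memcg_model *m, unsigned long high,
				  unsigned long batch, int max_rounds)
{
	assert(batch > 0);
	while (max_rounds-- > 0) {
		if (model_total(m) <= high)
			return true;
		model_demote(m, batch);
	}
	return model_total(m) <= high;
}
```

Starting from 800 top-tier and 400 lower-tier pages with a total limit of
1000, the demotion-only loop empties the top tier and still ends at a
total of 1200 pages; one model_swap_out_lowtier() call for 200 pages is
what actually brings the total down to the limit.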

Given your tier-aware memcg limits, I think it would be better to first
satisfy mem_high_ok by reclaiming from the lower tiers to swap, with the
allowed nodemask set to the far-tier nodes, and then demote pages from
the top tier until toptier_high_ok holds. This also avoids reclaiming
pages directly from the top tier to swap, and it ensures that demotion
actually contributes to reaching the targeted memory state without
unnecessary performance penalties.
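
The ordering suggested above can be sketched as a small decision
function. This is a user-space sketch under stated assumptions, not a
proposed patch: enum reclaim_step, struct reclaim_plan and plan_reclaim()
are hypothetical names, and the sketch does not handle the case where the
lower tiers hold fewer pages than the total limit requires to be freed.

```c
#include <assert.h>
#include <stdbool.h>

/* Which kind of reclaim one loop iteration should request. */
enum reclaim_step {
	RECLAIM_DONE,		/* both limits are satisfied */
	RECLAIM_LOWTIER_SWAP,	/* total over high: swap out of lower tiers */
	RECLAIM_TOPTIER_DEMOTE,	/* only top tier over: demote from top tier */
};

struct reclaim_plan {
	enum reclaim_step step;
	unsigned long to_free;	/* pages to request in this iteration */
};

/*
 * Handle the total limit first, since only uncharging pages can fix it.
 * Demotion is requested only once the total is within 'high'.
 */
static struct reclaim_plan plan_reclaim(unsigned long nr_pages,
					unsigned long high,
					unsigned long toptier_pages,
					unsigned long toptier_high,
					bool tier_aware)
{
	struct reclaim_plan plan = { RECLAIM_DONE, 0 };

	/* Top-tier usage is a subset of the total charge. */
	assert(toptier_pages <= nr_pages);

	if (nr_pages > high) {
		plan.step = RECLAIM_LOWTIER_SWAP;
		plan.to_free = nr_pages - high;
	} else if (tier_aware && toptier_pages > toptier_high) {
		plan.step = RECLAIM_TOPTIER_DEMOTE;
		plan.to_free = toptier_pages - toptier_high;
	}
	return plan;
}
```

In terms of the quoted loop, RECLAIM_LOWTIER_SWAP would correspond to
passing a far-tier nodemask as reclaim_nodes, and RECLAIM_TOPTIER_DEMOTE
to the existing &toptier_nodes branch.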

To address the case where a memcg exceeds its total limit and demotion
cannot help relieve the memcg's memory pressure, I am considering
introducing a reclaim_options setting that prevents page demotion by
setting sc.no_demote = 1. I have a local patch for this and am preparing
it for submission.
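
As a rough illustration of the idea only (the local patch itself is not
shown here), a caller-supplied option bit could be translated into a
per-reclaim "do not demote" setting along these lines. Everything in the
sketch is hypothetical: the SKETCH_* flags, their bit positions, struct
scan_control_sketch and both helpers are stand-ins, not the kernel's
definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical option bits; the positions are arbitrary. */
#define SKETCH_RECLAIM_MAY_SWAP		(1u << 0)
#define SKETCH_RECLAIM_NO_DEMOTION	(1u << 1)
#define SKETCH_RECLAIM_ALL \
	(SKETCH_RECLAIM_MAY_SWAP | SKETCH_RECLAIM_NO_DEMOTION)

/* Stand-in for the per-reclaim control state. */
struct scan_control_sketch {
	bool may_swap;
	bool no_demote;
};

/* Translate caller options into the control state for one reclaim pass. */
static struct scan_control_sketch sketch_init_sc(unsigned int reclaim_options)
{
	struct scan_control_sketch sc;

	/* Reject bits this sketch does not know about. */
	assert((reclaim_options & ~SKETCH_RECLAIM_ALL) == 0);

	sc.may_swap = reclaim_options & SKETCH_RECLAIM_MAY_SWAP;
	sc.no_demote = reclaim_options & SKETCH_RECLAIM_NO_DEMOTION;
	return sc;
}

/* Demotion is attempted only if the system allows it and the caller did
 * not opt out for this pass. */
static bool sketch_can_demote(const struct scan_control_sketch *sc,
			      bool demotion_enabled)
{
	return demotion_enabled && !sc->no_demote;
}
```

A caller that is over its total limit would pass both bits, so the pass
may swap but never counts a tier-to-tier migration as progress.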

Please let me know if I have misunderstood any part of your
implementation or if you see any issues with this proposed adjustment.

Best,
Bing



