From: Waiman Long <longman@redhat.com>
To: "Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Hocko" <mhocko@kernel.org>,
"Roman Gushchin" <roman.gushchin@linux.dev>,
"Shakeel Butt" <shakeel.butt@linux.dev>,
"Muchun Song" <muchun.song@linux.dev>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Tejun Heo" <tj@kernel.org>, "Michal Koutný" <mkoutny@suse.com>,
"Shuah Khan" <shuah@kernel.org>
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
Waiman Long <longman@redhat.com>
Subject: [PATCH v6 1/2] mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()
Date: Sun, 13 Apr 2025 22:12:48 -0400
Message-ID: <20250414021249.3232315-2-longman@redhat.com>
In-Reply-To: <20250414021249.3232315-1-longman@redhat.com>
The test_memcontrol selftest consistently fails its test_memcg_low
sub-test because two of its test child cgroups, which have a memory.low
of 0 or an effective memory.low of 0, still have low events generated
for them, since mem_cgroup_below_low() uses the ">=" operator when
comparing to elow.
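
The effect of the ">=" comparison can be illustrated with a minimal
userspace sketch. The struct and function names here are hypothetical
stand-ins, not the kernel's definitions, and the model ignores the other
conditions the real helper checks before comparing:

```c
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's struct mem_cgroup. */
struct memcg_model {
	unsigned long usage;	/* pages charged, analogous to memory.current */
	unsigned long elow;	/* effective low protection, in pages */
};

/* Models only the ">=" comparison described in the commit message. */
static bool model_below_low(const struct memcg_model *cg)
{
	return cg->elow >= cg->usage;
}
```

With usage == 0 and elow == 0, the comparison is 0 >= 0, so an empty
cgroup is treated as below its low protection even though it has none.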
The two failing cases are as follows:
1) memory.low is set to 0, but low events can still be triggered and
so the cgroup may have a non-zero low event count.
2) memory.low is set to a non-zero value but the cgroup has no task in
it so that it has an effective low value of 0. Again it may have a
non-zero low event count if memory reclaim happens. This is probably
not a result expected by users, and it is doubtful that users will
check an empty cgroup with no task in it and expect some non-zero
event counts.
In the first case, even though memory.low isn't set, the cgroup may still
have some low protection if memory.low is set in the parent and the cgroup2
memory_recursiveprot mount option is enabled. So low events may still
be recorded. The test_memcontrol.c test has to be modified to account
for that.
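
The test-side adjustment can be sketched as a small predicate. The
helper name is hypothetical, and the "index <= no_low_events_index"
form of the check is an assumption about how the selftest consumes the
value; only the has_recursiveprot ? 2 : 1 selection comes from the diff
in this patch:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if the child at child_index
 * (0 = C, 1 = D, 2 = E, 3 = F) is expected to report low events.
 */
static bool low_events_expected(int child_index, bool has_recursiveprot)
{
	/* Highest child index that should still see low events. */
	int no_low_events_index = has_recursiveprot ? 2 : 1;

	return child_index <= no_low_events_index;
}
```

Under this sketch, child E (index 2) is expected to see low events only
when memory_recursiveprot is enabled, and child F (index 3) never is.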
For the second case, it really doesn't make sense to have a non-zero
low event count if the cgroup has 0 usage. So we need to handle this
corner case in shrink_node_memcgs() by skipping memcgs with !usage.
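
A simplified userspace model of the skip is sketched below. All names
are hypothetical, and the model records a low event whenever the
comparison holds, which is a simplification of the real reclaim path:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one cgroup visited during reclaim. */
struct reclaim_cg {
	unsigned long usage;		/* pages charged */
	unsigned long elow;		/* effective low protection */
	unsigned long low_events;	/* modeled memory.events:low */
};

/* Walk the cgroups; optionally skip those with no usage first. */
static void model_shrink(struct reclaim_cg *cgs, size_t n, bool skip_empty)
{
	for (size_t i = 0; i < n; i++) {
		/* Models the "!usage" skip added by this patch. */
		if (skip_empty && cgs[i].usage == 0)
			continue;

		/* Same ">=" comparison as the low protection check. */
		if (cgs[i].elow >= cgs[i].usage)
			cgs[i].low_events++;
	}
}
```

Without the skip, an entry with usage == 0 and elow == 0 gains a low
event on every pass; with the skip, its count stays unchanged.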
With this patch applied, the test_memcg_low sub-test finishes
successfully without failure in most cases. However, both the
test_memcg_low and test_memcg_min sub-tests may still fail occasionally
if the memory.current values fall outside of the expected ranges.
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Waiman Long <longman@redhat.com>
---
mm/internal.h | 9 +++++++++
mm/memcontrol-v1.h | 2 --
mm/vmscan.c | 4 ++++
tools/testing/selftests/cgroup/test_memcontrol.c | 16 +++++++++++-----
4 files changed, 24 insertions(+), 7 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index 50c2f590b2d0..c06fb0e8d75c 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1535,6 +1535,15 @@ void __meminit __init_page_from_nid(unsigned long pfn, int nid);
unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg,
int priority);
+#ifdef CONFIG_MEMCG
+unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap);
+#else
+static inline unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
+{
+ return 1UL;
+}
+#endif
+
#ifdef CONFIG_SHRINKER_DEBUG
static inline __printf(2, 0) int shrinker_debugfs_name_alloc(
struct shrinker *shrinker, const char *fmt, va_list ap)
diff --git a/mm/memcontrol-v1.h b/mm/memcontrol-v1.h
index 6358464bb416..e92b21af92b1 100644
--- a/mm/memcontrol-v1.h
+++ b/mm/memcontrol-v1.h
@@ -22,8 +22,6 @@
iter != NULL; \
iter = mem_cgroup_iter(NULL, iter, NULL))
-unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap);
-
void drain_all_stock(struct mem_cgroup *root_memcg);
unsigned long memcg_events(struct mem_cgroup *memcg, int event);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b620d74b0f66..a771a0145a12 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5963,6 +5963,10 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
mem_cgroup_calculate_protection(target_memcg, memcg);
+ /* Skip memcg with no usage */
+ if (!mem_cgroup_usage(memcg, false))
+ continue;
+
if (mem_cgroup_below_min(target_memcg, memcg)) {
/*
* Hard protection.
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 16f5d74ae762..5a5dcbe57b56 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -380,10 +380,10 @@ static bool reclaim_until(const char *memcg, long goal);
*
* Then it checks actual memory usages and expects that:
* A/B memory.current ~= 50M
- * A/B/C memory.current ~= 29M
- * A/B/D memory.current ~= 21M
- * A/B/E memory.current ~= 0
- * A/B/F memory.current = 0
+ * A/B/C memory.current ~= 29M [memory.events:low > 0]
+ * A/B/D memory.current ~= 21M [memory.events:low > 0]
+ * A/B/E memory.current ~= 0 [memory.events:low == 0 if !memory_recursiveprot, > 0 otherwise]
+ * A/B/F memory.current = 0 [memory.events:low == 0]
* (for origin of the numbers, see model in memcg_protection.m.)
*
* After that it tries to allocate more than there is
@@ -525,8 +525,14 @@ static int test_memcg_protection(const char *root, bool min)
goto cleanup;
}
+ /*
+ * Child 2 has memory.low=0, but some low protection is still being
+ * distributed down from its parent with memory.low=50M if cgroup2
+ * memory_recursiveprot mount option is enabled. So the low event
+ * count will be non-zero in this case.
+ */
for (i = 0; i < ARRAY_SIZE(children); i++) {
- int no_low_events_index = 1;
+ int no_low_events_index = has_recursiveprot ? 2 : 1;
long low, oom;
oom = cg_read_key_long(children[i], "memory.events", "oom ");
--
2.48.1