* [PATCH v3 1/3] mm/memcontrol: respect zswap.writeback setting from parent cg too
@ 2024-08-23 16:27 Mike Yuan
2024-08-23 16:27 ` [PATCH v3 2/3] selftests: test_zswap: add test for hierarchical zswap.writeback Mike Yuan
2024-08-23 16:27 ` [PATCH v3 3/3] Documentation/cgroup-v2: clarify that zswap.writeback is ignored if zswap is disabled Mike Yuan
0 siblings, 2 replies; 5+ messages in thread
From: Mike Yuan @ 2024-08-23 16:27 UTC (permalink / raw)
To: linux-kernel
Cc: Mike Yuan, linux-mm, cgroups, Nhat Pham, Yosry Ahmed,
Johannes Weiner, Andrew Morton, Muchun Song, Shakeel Butt,
Roman Gushchin, Michal Hocko, Michal Koutný, stable
Currently, the behavior of zswap.writeback wrt.
the cgroup hierarchy seems a bit odd. Unlike zswap.max,
it doesn't honor the value from parent cgroups. This
surfaced when people tried to globally disable zswap writeback,
i.e. reserve physical swap space only for hibernation [1] -
disabling zswap.writeback only for the root cgroup results
in subcgroups with zswap.writeback=1 still performing writeback.
The inconsistency became more noticeable after I introduced
the MemoryZSwapWriteback= systemd unit setting [2] for
controlling the knob. The patch assumed that the kernel would
enforce the value of parent cgroups. It could probably be
workarounded from systemd's side, by going up the slice unit
tree and inheriting the value. Yet I think it's more sensible
to make it behave consistently with zswap.max and friends.
[1] https://wiki.archlinux.org/title/Power_management/Suspend_and_hibernate#Disable_zswap_writeback_to_use_the_swap_space_only_for_hibernation
[2] https://github.com/systemd/systemd/pull/31734
Changes in v3:
- Additionally drop inheritance of zswap.writeback setting
on cgroup creation, which is no longer needed
Link to v2: https://lore.kernel.org/linux-kernel/20240816144344.18135-1-me@yhndnzj.com/
Changes in v2:
- Actually base on latest tree (is_zswap_enabled() -> zswap_is_enabled())
- Update Documentation/admin-guide/cgroup-v2.rst to reflect the change
Link to v1: https://lore.kernel.org/linux-kernel/20240814171800.23558-1-me@yhndnzj.com/
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Koutný <mkoutny@suse.com>
Fixes: 501a06fe8e4c ("zswap: memcontrol: implement zswap writeback disabling")
Cc: <stable@vger.kernel.org>
Signed-off-by: Mike Yuan <me@yhndnzj.com>
Reviewed-by: Nhat Pham <nphamcs@gmail.com>
Acked-by: Yosry Ahmed <yosryahmed@google.com>
---
Documentation/admin-guide/cgroup-v2.rst | 7 ++++---
mm/memcontrol.c | 12 +++++++++---
2 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 86311c2907cd..95c18bc17083 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1717,9 +1717,10 @@ The following nested keys are defined.
entries fault back in or are written out to disk.
memory.zswap.writeback
- A read-write single value file. The default value is "1". The
- initial value of the root cgroup is 1, and when a new cgroup is
- created, it inherits the current value of its parent.
+ A read-write single value file. The default value is "1".
+ Note that this setting is hierarchical, i.e. the writeback would be
+ implicitly disabled for child cgroups if the upper hierarchy
+ does so.
When this is set to 0, all swapping attempts to swapping devices
are disabled. This included both zswap writebacks, and swapping due
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f29157288b7d..d563fb515766 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3613,8 +3613,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
memcg1_soft_limit_reset(memcg);
#ifdef CONFIG_ZSWAP
memcg->zswap_max = PAGE_COUNTER_MAX;
- WRITE_ONCE(memcg->zswap_writeback,
- !parent || READ_ONCE(parent->zswap_writeback));
+ WRITE_ONCE(memcg->zswap_writeback, true);
#endif
page_counter_set_high(&memcg->swap, PAGE_COUNTER_MAX);
if (parent) {
@@ -5320,7 +5319,14 @@ void obj_cgroup_uncharge_zswap(struct obj_cgroup *objcg, size_t size)
bool mem_cgroup_zswap_writeback_enabled(struct mem_cgroup *memcg)
{
/* if zswap is disabled, do not block pages going to the swapping device */
- return !zswap_is_enabled() || !memcg || READ_ONCE(memcg->zswap_writeback);
+ if (!zswap_is_enabled())
+ return true;
+
+ for (; memcg; memcg = parent_mem_cgroup(memcg))
+ if (!READ_ONCE(memcg->zswap_writeback))
+ return false;
+
+ return true;
}
static u64 zswap_current_read(struct cgroup_subsys_state *css,
base-commit: 47ac09b91befbb6a235ab620c32af719f8208399
--
2.46.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v3 2/3] selftests: test_zswap: add test for hierarchical zswap.writeback
2024-08-23 16:27 [PATCH v3 1/3] mm/memcontrol: respect zswap.writeback setting from parent cg too Mike Yuan
@ 2024-08-23 16:27 ` Mike Yuan
2024-08-23 16:27 ` [PATCH v3 3/3] Documentation/cgroup-v2: clarify that zswap.writeback is ignored if zswap is disabled Mike Yuan
1 sibling, 0 replies; 5+ messages in thread
From: Mike Yuan @ 2024-08-23 16:27 UTC (permalink / raw)
To: linux-kernel
Cc: Mike Yuan, linux-mm, cgroups, Nhat Pham, Yosry Ahmed,
Johannes Weiner, Andrew Morton, Muchun Song, Shakeel Butt,
Roman Gushchin, Michal Hocko
Ensure that zswap.writeback check goes up the cgroup tree, i.e.
is hierarchical. Create a subcgroup which has zswap.writeback
set to 1, and the upper hierarchy's restrictions shall apply.
Changes in v3:
- Skip the zswap.writeback test if root cgroup has it disabled
(I still think adding a whole new test is a bit of overkill,
though, instead came up with this alternative approach)
- Add comment about hardcoding child cg's zswap.writeback=1
which might not be obvious
Link to v2: https://lore.kernel.org/all/20240816144344.18135-2-me@yhndnzj.com/
Signed-off-by: Mike Yuan <me@yhndnzj.com>
---
tools/testing/selftests/cgroup/test_zswap.c | 75 +++++++++++++++------
1 file changed, 54 insertions(+), 21 deletions(-)
diff --git a/tools/testing/selftests/cgroup/test_zswap.c b/tools/testing/selftests/cgroup/test_zswap.c
index 190096017f80..40de679248b8 100644
--- a/tools/testing/selftests/cgroup/test_zswap.c
+++ b/tools/testing/selftests/cgroup/test_zswap.c
@@ -263,15 +263,13 @@ static int test_zswapin(const char *root)
static int attempt_writeback(const char *cgroup, void *arg)
{
long pagesize = sysconf(_SC_PAGESIZE);
- char *test_group = arg;
size_t memsize = MB(4);
char buf[pagesize];
long zswap_usage;
- bool wb_enabled;
+ bool wb_enabled = *(bool *) arg;
int ret = -1;
char *mem;
- wb_enabled = cg_read_long(test_group, "memory.zswap.writeback");
mem = (char *)malloc(memsize);
if (!mem)
return ret;
@@ -288,12 +286,12 @@ static int attempt_writeback(const char *cgroup, void *arg)
memcpy(&mem[i], buf, pagesize);
/* Try and reclaim allocated memory */
- if (cg_write_numeric(test_group, "memory.reclaim", memsize)) {
+ if (cg_write_numeric(cgroup, "memory.reclaim", memsize)) {
ksft_print_msg("Failed to reclaim all of the requested memory\n");
goto out;
}
- zswap_usage = cg_read_long(test_group, "memory.zswap.current");
+ zswap_usage = cg_read_long(cgroup, "memory.zswap.current");
/* zswpin */
for (int i = 0; i < memsize; i += pagesize) {
@@ -303,7 +301,7 @@ static int attempt_writeback(const char *cgroup, void *arg)
}
}
- if (cg_write_numeric(test_group, "memory.zswap.max", zswap_usage/2))
+ if (cg_write_numeric(cgroup, "memory.zswap.max", zswap_usage/2))
goto out;
/*
@@ -312,7 +310,7 @@ static int attempt_writeback(const char *cgroup, void *arg)
* If writeback is disabled, memory reclaim will fail as zswap is limited and
* it can't writeback to swap.
*/
- ret = cg_write_numeric(test_group, "memory.reclaim", memsize);
+ ret = cg_write_numeric(cgroup, "memory.reclaim", memsize);
if (!wb_enabled)
ret = (ret == -EAGAIN) ? 0 : -1;
@@ -321,12 +319,41 @@ static int attempt_writeback(const char *cgroup, void *arg)
return ret;
}
+static int test_zswap_writeback_one(const char *cgroup, bool wb)
+{
+ long zswpwb_before, zswpwb_after;
+
+ zswpwb_before = get_cg_wb_count(cgroup);
+ if (zswpwb_before != 0) {
+ ksft_print_msg("zswpwb_before = %ld instead of 0\n", zswpwb_before);
+ return -1;
+ }
+
+ if (cg_run(cgroup, attempt_writeback, (void *) &wb))
+ return -1;
+
+ /* Verify that zswap writeback occurred only if writeback was enabled */
+ zswpwb_after = get_cg_wb_count(cgroup);
+ if (zswpwb_after < 0)
+ return -1;
+
+ if (wb != !!zswpwb_after) {
+ ksft_print_msg("zswpwb_after is %ld while wb is %s",
+ zswpwb_after, wb ? "enabled" : "disabled");
+ return -1;
+ }
+
+ return 0;
+}
+
/* Test to verify the zswap writeback path */
static int test_zswap_writeback(const char *root, bool wb)
{
- long zswpwb_before, zswpwb_after;
int ret = KSFT_FAIL;
- char *test_group;
+ char *test_group, *test_group_child = NULL;
+
+ if (cg_read_strcmp(root, "memory.zswap.writeback", "1"))
+ return KSFT_SKIP;
test_group = cg_name(root, "zswap_writeback_test");
if (!test_group)
@@ -336,29 +363,35 @@ static int test_zswap_writeback(const char *root, bool wb)
if (cg_write(test_group, "memory.zswap.writeback", wb ? "1" : "0"))
goto out;
- zswpwb_before = get_cg_wb_count(test_group);
- if (zswpwb_before != 0) {
- ksft_print_msg("zswpwb_before = %ld instead of 0\n", zswpwb_before);
+ if (test_zswap_writeback_one(test_group, wb))
goto out;
- }
- if (cg_run(test_group, attempt_writeback, (void *) test_group))
+ /* Reset memory.zswap.max to max (modified by attempt_writeback), and
+ * set up child cgroup, whose memory.zswap.writeback is hardcoded to 1.
+ * Thus, the parent's setting shall be what's in effect. */
+ if (cg_write(test_group, "memory.zswap.max", "max"))
+ goto out;
+ if (cg_write(test_group, "cgroup.subtree_control", "+memory"))
goto out;
- /* Verify that zswap writeback occurred only if writeback was enabled */
- zswpwb_after = get_cg_wb_count(test_group);
- if (zswpwb_after < 0)
+ test_group_child = cg_name(test_group, "zswap_writeback_test_child");
+ if (!test_group_child)
+ goto out;
+ if (cg_create(test_group_child))
+ goto out;
+ if (cg_write(test_group_child, "memory.zswap.writeback", "1"))
goto out;
- if (wb != !!zswpwb_after) {
- ksft_print_msg("zswpwb_after is %ld while wb is %s",
- zswpwb_after, wb ? "enabled" : "disabled");
+ if (test_zswap_writeback_one(test_group_child, wb))
goto out;
- }
ret = KSFT_PASS;
out:
+ if (test_group_child) {
+ cg_destroy(test_group_child);
+ free(test_group_child);
+ }
cg_destroy(test_group);
free(test_group);
return ret;
--
2.46.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v3 3/3] Documentation/cgroup-v2: clarify that zswap.writeback is ignored if zswap is disabled
2024-08-23 16:27 [PATCH v3 1/3] mm/memcontrol: respect zswap.writeback setting from parent cg too Mike Yuan
2024-08-23 16:27 ` [PATCH v3 2/3] selftests: test_zswap: add test for hierarchical zswap.writeback Mike Yuan
@ 2024-08-23 16:27 ` Mike Yuan
2024-09-02 14:13 ` Mike Yuan
1 sibling, 1 reply; 5+ messages in thread
From: Mike Yuan @ 2024-08-23 16:27 UTC (permalink / raw)
To: linux-kernel
Cc: Mike Yuan, linux-mm, cgroups, Nhat Pham, Yosry Ahmed,
Johannes Weiner, Andrew Morton, Muchun Song, Shakeel Butt,
Roman Gushchin, Michal Hocko
As discussed in [1], zswap-related settings natually
lose their effect when zswap is disabled, specifically
zswap.writeback here. Be explicit about this behavior.
[1] https://lore.kernel.org/linux-kernel/CAKEwX=Mhbwhh-=xxCU-RjMXS_n=RpV3Gtznb2m_3JgL+jzz++g@mail.gmail.com/
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Mike Yuan <me@yhndnzj.com>
---
Documentation/admin-guide/cgroup-v2.rst | 2 ++
1 file changed, 2 insertions(+)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 95c18bc17083..a1723e402596 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1731,6 +1731,8 @@ The following nested keys are defined.
Note that this is subtly different from setting memory.swap.max to
0, as it still allows for pages to be written to the zswap pool.
+ This setting has no effect if zswap is disabled, and swapping
+ would be allowed unless memory.swap.max is set to 0.
memory.pressure
A read-only nested-keyed file.
--
2.46.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v3 3/3] Documentation/cgroup-v2: clarify that zswap.writeback is ignored if zswap is disabled
2024-08-23 16:27 ` [PATCH v3 3/3] Documentation/cgroup-v2: clarify that zswap.writeback is ignored if zswap is disabled Mike Yuan
@ 2024-09-02 14:13 ` Mike Yuan
2024-09-02 21:02 ` Andrew Morton
0 siblings, 1 reply; 5+ messages in thread
From: Mike Yuan @ 2024-09-02 14:13 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, linux-mm, cgroups, Nhat Pham, Yosry Ahmed,
Johannes Weiner, Muchun Song, Shakeel Butt, Roman Gushchin,
Michal Hocko
> As discussed in [1], zswap-related settings natually
> lose their effect when zswap is disabled, specifically
> zswap.writeback here. Be explicit about this behavior.
>
> [1]
> https://lore.kernel.org/linux-kernel/CAKEwX=Mhbwhh-=xxCU-RjMXS_n=RpV3Gtznb2m_3JgL+jzz++g@mail.gmail.com/
>
> Cc: Nhat Pham <nphamcs@gmail.com>
> Cc: Yosry Ahmed <yosryahmed@google.com>
>
> Signed-off-by: Mike Yuan <me@yhndnzj.com>
> ---
> Documentation/admin-guide/cgroup-v2.rst | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/Documentation/admin-guide/cgroup-v2.rst
> b/Documentation/admin-guide/cgroup-v2.rst
> index 95c18bc17083..a1723e402596 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1731,6 +1731,8 @@ The following nested keys are defined.
>
> Note that this is subtly different from setting
> memory.swap.max to
> 0, as it still allows for pages to be written to the zswap
> pool.
> + This setting has no effect if zswap is disabled, and
> swapping
> + would be allowed unless memory.swap.max is set to 0.
>
> memory.pressure
> A read-only nested-keyed file.
Hmm, Andrew, it seems that the commit messages of this and the previous
patch are somehow reversed/mismatched? [1][2] Could you please confirm
and fix it?
[1]:
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-unstable&id=eef275964326760bb55803b167854981cab97e55
[2]:
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-unstable&id=42c3628a37400c2bc4199b9f8423be701646d4e0
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v3 3/3] Documentation/cgroup-v2: clarify that zswap.writeback is ignored if zswap is disabled
2024-09-02 14:13 ` Mike Yuan
@ 2024-09-02 21:02 ` Andrew Morton
0 siblings, 0 replies; 5+ messages in thread
From: Andrew Morton @ 2024-09-02 21:02 UTC (permalink / raw)
To: Mike Yuan
Cc: linux-kernel, linux-mm, cgroups, Nhat Pham, Yosry Ahmed,
Johannes Weiner, Muchun Song, Shakeel Butt, Roman Gushchin,
Michal Hocko
On Mon, 02 Sep 2024 14:13:43 +0000 Mike Yuan <me@yhndnzj.com> wrote:
> > + This setting has no effect if zswap is disabled, and
> > swapping
> > + would be allowed unless memory.swap.max is set to 0.
> >
> > memory.pressure
> > A read-only nested-keyed file.
>
> Hmm, Andrew, it seems that the commit messages of this and the previous
> patch are somehow reversed/mismatched? [1][2] Could you please confirm
> and fix it?
Yup thanks, a side-effect of turning a three-patch series into a
one-patch hotfix and a two-patch series.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-09-02 21:02 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-23 16:27 [PATCH v3 1/3] mm/memcontrol: respect zswap.writeback setting from parent cg too Mike Yuan
2024-08-23 16:27 ` [PATCH v3 2/3] selftests: test_zswap: add test for hierarchical zswap.writeback Mike Yuan
2024-08-23 16:27 ` [PATCH v3 3/3] Documentation/cgroup-v2: clarify that zswap.writeback is ignored if zswap is disabled Mike Yuan
2024-09-02 14:13 ` Mike Yuan
2024-09-02 21:02 ` Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).