* [nacked] mm-add-swapiness=-arg-to-memoryreclaim.patch removed from -mm tree
From: Andrew Morton @ 2024-06-25 21:03 UTC (permalink / raw)
To: mm-commits, yosryahmed, willy, wangkefeng.wang, tj, shakeel.butt,
shakeelb, roman.gushchin, rientjes, nphamcs, muchun.song, mhocko,
lizefan.x, hughd, hannes, findns94, david, corbet, chrisl,
schatzberg.dan, akpm
The quilt patch titled
Subject: mm: add swappiness= arg to memory.reclaim
has been removed from the -mm tree. Its filename was
mm-add-swapiness=-arg-to-memoryreclaim.patch
This patch was dropped because it was nacked
------------------------------------------------------
From: Dan Schatzberg <schatzberg.dan@gmail.com>
Subject: mm: add swappiness= arg to memory.reclaim
Date: Wed, 3 Jan 2024 08:48:37 -0800
Allow proactive reclaimers to submit an additional swappiness=<val>
argument to memory.reclaim. This overrides the global or per-memcg
swappiness setting for that reclaim attempt.

For example:

  echo "2M swappiness=0" > /sys/fs/cgroup/memory.reclaim

will perform reclaim on the root cgroup with a swappiness setting of 0 (no
swap) regardless of the vm.swappiness sysctl setting.

Userspace proactive reclaimers use the memory.reclaim interface to trigger
reclaim. The memory.reclaim interface provides no way to affect the
balance of file vs anon during proactive reclaim. The only approach is to
adjust the vm.swappiness setting. However, there are a few reasons we want
to control the balance of file vs anon during proactive reclaim,
separately from reactive reclaim:

* Swapout should be limited to manage SSD write endurance. In near-OOM
  situations we are fine with lots of swap-out to avoid OOMs. As these
  are typically rare events, they have relatively little impact on write
  endurance. However, proactive reclaim runs continuously and so its
  impact on SSD write endurance is more significant. Therefore it is
  desirable to control swap-out for proactive reclaim separately from
  reactive reclaim.

* Some userspace OOM killers like systemd-oomd[1] support OOM killing on
  swap exhaustion. This makes sense if the swap exhaustion is triggered
  due to reactive reclaim but less so if it is triggered due to proactive
  reclaim (e.g. one could see OOMs when free memory is ample but anon is
  just particularly cold). Therefore, it's desirable to have proactive
  reclaim reduce or stop swap-out before the threshold at which OOM
  killing occurs.

In the case of Meta's Senpai proactive reclaimer, we adjust vm.swappiness
before writes to memory.reclaim[2]. This has been in production for
nearly two years and has addressed our needs to control proactive vs
reactive reclaim behavior but is still not ideal for a number of reasons:

* vm.swappiness is a global setting; adjusting it can race/interfere
  with other system administration that wishes to control vm.swappiness.
  In our case, we need to disable Senpai before adjusting vm.swappiness.

* vm.swappiness is stateful, so a crash or restart of Senpai can leave
  a misconfigured setting. This requires some additional management to
  record the "desired" setting and ensure Senpai always adjusts to it.

With this patch, we avoid these downsides of adjusting vm.swappiness
globally.
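
As an illustration of the interface described above, here is a minimal userspace sketch in C of how a proactive reclaimer might issue a single request with a per-request swappiness. It is not taken from Senpai or from this patch: the helper names format_reclaim_request() and reclaim_once() are hypothetical, and the sketch assumes a cgroup v2 mount and a kernel that accepts the swappiness= key. A negative swappiness is used here as a local convention for "no override".

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: format a memory.reclaim request such as
 * "2097152 swappiness=0". A negative swappiness means "no override",
 * in which case only the byte count is written.
 * Returns the string length, or -1 if the buffer is too small.
 */
int format_reclaim_request(char *out, size_t len,
			   unsigned long long bytes, int swappiness)
{
	int n;

	if (swappiness < 0)
		n = snprintf(out, len, "%llu", bytes);
	else
		n = snprintf(out, len, "%llu swappiness=%d", bytes, swappiness);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Hypothetical helper: write one request to <cgroup_dir>/memory.reclaim.
 * Returns 0 on success or a negative errno. Per the cgroup v2
 * documentation, -EAGAIN means fewer bytes were reclaimed than requested.
 */
int reclaim_once(const char *cgroup_dir, unsigned long long bytes,
		 int swappiness)
{
	char path[4096], req[64];
	int fd, len, ret = 0;

	len = format_reclaim_request(req, sizeof(req), bytes, swappiness);
	if (len < 0)
		return -EINVAL;
	if (snprintf(path, sizeof(path), "%s/memory.reclaim", cgroup_dir) >=
	    (int)sizeof(path))
		return -ENAMETOOLONG;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;
	if (write(fd, req, len) < 0)
		ret = -errno;
	close(fd);
	return ret;
}
```

Under these assumptions, reclaim_once("/sys/fs/cgroup", 2ULL << 20, 0) would correspond to the echo example above. Because the swappiness travels with each write, nothing global has to be saved or restored if the reclaimer crashes.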
[1]https://www.freedesktop.org/software/systemd/man/latest/systemd-oomd.service.html
[2]https://github.com/facebookincubator/oomd/blob/main/src/oomd/plugins/Senpai.cpp#L585-L598
Link: https://lkml.kernel.org/r/20240103164841.2800183-3-schatzberg.dan@gmail.com
Signed-off-by: Dan Schatzberg <schatzberg.dan@gmail.com>
Suggested-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Yue Zhao <findns94@gmail.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 Documentation/admin-guide/cgroup-v2.rst |   18 ++++--
 include/linux/swap.h                    |    3 -
 mm/memcontrol.c                         |   57 +++++++++++++++++-----
 mm/vmscan.c                             |   25 ++++++++-
 4 files changed, 81 insertions(+), 22 deletions(-)

--- a/Documentation/admin-guide/cgroup-v2.rst~mm-add-swapiness=-arg-to-memoryreclaim
+++ a/Documentation/admin-guide/cgroup-v2.rst
@@ -1299,17 +1299,10 @@ PAGE_SIZE multiple when read back.
 	This is a simple interface to trigger memory reclaim in the
 	target cgroup.
 
-	This file accepts a single key, the number of bytes to reclaim.
-	No nested keys are currently supported.
-
 	Example::
 
 	  echo "1G" > memory.reclaim
 
-	The interface can be later extended with nested keys to
-	configure the reclaim behavior. For example, specify the
-	type of memory to reclaim from (anon, file, ..).
-
 	Please note that the kernel can over or under reclaim from
 	the target cgroup. If less bytes are reclaimed than the
 	specified amount, -EAGAIN is returned.
@@ -1321,6 +1314,17 @@ PAGE_SIZE multiple when read back.
 	This means that the networking layer will not adapt based on
 	reclaim induced by memory.reclaim.
 
+The following nested keys are defined.
+
+	  ========== ================================
+	  swappiness Swappiness value to reclaim with
+	  ========== ================================
+
+	Specifying a swappiness value instructs the kernel to perform
+	the reclaim with that swappiness value. Note that this has the
+	same semantics as vm.swappiness applied to memcg reclaim with
+	all the existing limitations and potential future extensions.
+
   memory.peak
 	A read-only single value file which exists on non-root
 	cgroups.
--- a/include/linux/swap.h~mm-add-swapiness=-arg-to-memoryreclaim
+++ a/include/linux/swap.h
@@ -410,7 +410,8 @@ extern unsigned long try_to_free_pages(s
 extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
 						  unsigned long nr_pages,
 						  gfp_t gfp_mask,
-						  unsigned int reclaim_options);
+						  unsigned int reclaim_options,
+						  int *swappiness);
 extern unsigned long mem_cgroup_shrink_node(struct mem_cgroup *mem,
 						gfp_t gfp_mask, bool noswap,
 						pg_data_t *pgdat,
--- a/mm/memcontrol.c~mm-add-swapiness=-arg-to-memoryreclaim
+++ a/mm/memcontrol.c
@@ -53,6 +53,7 @@
 #include <linux/sort.h>
 #include <linux/fs.h>
 #include <linux/seq_file.h>
+#include <linux/parser.h>
 #include <linux/vmpressure.h>
 #include <linux/memremap.h>
 #include <linux/mm_inline.h>
@@ -2636,7 +2637,8 @@ static unsigned long reclaim_high(struct
 		psi_memstall_enter(&pflags);
 		nr_reclaimed += try_to_free_mem_cgroup_pages(memcg, nr_pages,
 							gfp_mask,
-							MEMCG_RECLAIM_MAY_SWAP);
+							MEMCG_RECLAIM_MAY_SWAP,
+							NULL);
 		psi_memstall_leave(&pflags);
 	} while ((memcg = parent_mem_cgroup(memcg)) &&
 		 !mem_cgroup_is_root(memcg));
@@ -2942,7 +2944,7 @@ retry:
 
 	psi_memstall_enter(&pflags);
 	nr_reclaimed = try_to_free_mem_cgroup_pages(mem_over_limit, nr_pages,
-						    gfp_mask, reclaim_options);
+						    gfp_mask, reclaim_options, NULL);
 	psi_memstall_leave(&pflags);
 
 	if (mem_cgroup_margin(mem_over_limit) >= nr_pages)
@@ -3911,7 +3913,7 @@ static int mem_cgroup_resize_max(struct
 		}
 
 		if (!try_to_free_mem_cgroup_pages(memcg, 1, GFP_KERNEL,
-					memsw ? 0 : MEMCG_RECLAIM_MAY_SWAP)) {
+					memsw ? 0 : MEMCG_RECLAIM_MAY_SWAP, NULL)) {
 			ret = -EBUSY;
 			break;
 		}
@@ -4025,7 +4027,7 @@ static int mem_cgroup_force_empty(struct
 			return -EINTR;
 
 		if (!try_to_free_mem_cgroup_pages(memcg, 1, GFP_KERNEL,
-						  MEMCG_RECLAIM_MAY_SWAP))
+						  MEMCG_RECLAIM_MAY_SWAP, NULL))
 			nr_retries--;
 	}
 
@@ -7000,7 +7002,7 @@ static ssize_t memory_high_write(struct
 		}
 
 		reclaimed = try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
-					GFP_KERNEL, MEMCG_RECLAIM_MAY_SWAP);
+					GFP_KERNEL, MEMCG_RECLAIM_MAY_SWAP, NULL);
 
 		if (!reclaimed && !nr_retries--)
 			break;
@@ -7049,7 +7051,7 @@ static ssize_t memory_max_write(struct k
 
 		if (nr_reclaims) {
 			if (!try_to_free_mem_cgroup_pages(memcg, nr_pages - max,
-					GFP_KERNEL, MEMCG_RECLAIM_MAY_SWAP))
+					GFP_KERNEL, MEMCG_RECLAIM_MAY_SWAP, NULL))
 				nr_reclaims--;
 			continue;
 		}
@@ -7179,19 +7181,50 @@ static ssize_t memory_oom_group_write(st
 	return nbytes;
 }
 
+enum {
+	MEMORY_RECLAIM_SWAPPINESS = 0,
+	MEMORY_RECLAIM_NULL,
+};
+
+static const match_table_t tokens = {
+	{ MEMORY_RECLAIM_SWAPPINESS, "swappiness=%d"},
+	{ MEMORY_RECLAIM_NULL, NULL },
+};
+
 static ssize_t memory_reclaim(struct kernfs_open_file *of, char *buf,
 			      size_t nbytes, loff_t off)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
 	unsigned int nr_retries = MAX_RECLAIM_RETRIES;
 	unsigned long nr_to_reclaim, nr_reclaimed = 0;
+	int swappiness = -1;
 	unsigned int reclaim_options;
-	int err;
+	char *old_buf, *start;
+	substring_t args[MAX_OPT_ARGS];
 
 	buf = strstrip(buf);
-	err = page_counter_memparse(buf, "", &nr_to_reclaim);
-	if (err)
-		return err;
+
+	old_buf = buf;
+	nr_to_reclaim = memparse(buf, &buf) / PAGE_SIZE;
+	if (buf == old_buf)
+		return -EINVAL;
+
+	buf = strstrip(buf);
+
+	while ((start = strsep(&buf, " ")) != NULL) {
+		if (!strlen(start))
+			continue;
+		switch (match_token(start, tokens, args)) {
+		case MEMORY_RECLAIM_SWAPPINESS:
+			if (match_int(&args[0], &swappiness))
+				return -EINVAL;
+			if (swappiness < MIN_SWAPPINESS || swappiness > MAX_SWAPPINESS)
+				return -EINVAL;
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
 
 	reclaim_options = MEMCG_RECLAIM_MAY_SWAP | MEMCG_RECLAIM_PROACTIVE;
 	while (nr_reclaimed < nr_to_reclaim) {
@@ -7211,7 +7244,9 @@ static ssize_t memory_reclaim(struct ker
 			lru_add_drain_all();
 
 		reclaimed = try_to_free_mem_cgroup_pages(memcg,
-					batch_size, GFP_KERNEL, reclaim_options);
+					batch_size, GFP_KERNEL,
+					reclaim_options,
+					swappiness == -1 ? NULL : &swappiness);
 
 		if (!reclaimed && !nr_retries--)
 			return -EAGAIN;
--- a/mm/vmscan.c~mm-add-swapiness=-arg-to-memoryreclaim
+++ a/mm/vmscan.c
@@ -92,6 +92,11 @@ struct scan_control {
 	unsigned long	anon_cost;
 	unsigned long	file_cost;
 
+#ifdef CONFIG_MEMCG
+	/* Swappiness value for proactive reclaim. Always use sc_swappiness()! */
+	int *proactive_swappiness;
+#endif
+
 	/* Can active folios be deactivated as part of reclaim? */
 #define DEACTIVATE_ANON 1
 #define DEACTIVATE_FILE 2
@@ -236,6 +241,13 @@ static bool writeback_throttling_sane(st
 #endif
 	return false;
 }
+
+static int sc_swappiness(struct scan_control *sc, struct mem_cgroup *memcg)
+{
+	if (sc->proactive && sc->proactive_swappiness)
+		return *sc->proactive_swappiness;
+	return mem_cgroup_swappiness(memcg);
+}
 #else
 static bool cgroup_reclaim(struct scan_control *sc)
 {
@@ -251,6 +263,11 @@ static bool writeback_throttling_sane(st
 {
 	return true;
 }
+
+static int sc_swappiness(struct scan_control *sc, struct mem_cgroup *memcg)
+{
+	return READ_ONCE(vm_swappiness);
+}
 #endif
 
 static void set_task_reclaim_state(struct task_struct *task,
@@ -2351,7 +2368,7 @@ static void get_scan_count(struct lruvec
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
 	struct mem_cgroup *memcg = lruvec_memcg(lruvec);
 	unsigned long anon_cost, file_cost, total_cost;
-	int swappiness = mem_cgroup_swappiness(memcg);
+	int swappiness = sc_swappiness(sc, memcg);
 	u64 fraction[ANON_AND_FILE];
 	u64 denominator = 0;	/* gcc */
 	enum scan_balance scan_balance;
@@ -2632,7 +2649,7 @@ static int get_swappiness(struct lruvec
 	    mem_cgroup_get_nr_swap_pages(memcg) < MIN_LRU_BATCH)
 		return 0;
 
-	return mem_cgroup_swappiness(memcg);
+	return sc_swappiness(sc, memcg);
 }
 
 static int get_nr_gens(struct lruvec *lruvec, int type)
@@ -6549,12 +6566,14 @@ unsigned long mem_cgroup_shrink_node(str
 unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
 					   unsigned long nr_pages,
 					   gfp_t gfp_mask,
-					   unsigned int reclaim_options)
+					   unsigned int reclaim_options,
+					   int *swappiness)
 {
 	unsigned long nr_reclaimed;
 	unsigned int noreclaim_flag;
 	struct scan_control sc = {
 		.nr_to_reclaim = max(nr_pages, SWAP_CLUSTER_MAX),
+		.proactive_swappiness = swappiness,
 		.gfp_mask = (current_gfp_context(gfp_mask) & GFP_RECLAIM_MASK) |
 				(GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK),
 		.reclaim_idx = MAX_NR_ZONES - 1,
_
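
The argument handling and override logic in the patch above can be approximated outside the kernel. The following is a rough userspace sketch in C, not the kernel code itself. parse_size() is a simplified stand-in for memparse(), strtok() and sscanf() replace strsep() and match_token(), and the function names parse_reclaim_request() and effective_swappiness() are hypothetical. The bounds 0 and 200 are assumed to mirror the kernel's MIN_SWAPPINESS and MAX_SWAPPINESS. Unlike the patch, the sketch keeps the size in bytes rather than dividing by PAGE_SIZE.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MIN_SWAPPINESS	0	/* assumed to match the kernel's bounds */
#define MAX_SWAPPINESS	200

/* Simplified stand-in for memparse(): a number plus an optional K/M/G. */
static unsigned long long parse_size(char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	if (*end == s)
		return 0;		/* no digits consumed */

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Parse "<size> [swappiness=<0..200>]" in place (buf is modified).
 * On success returns 0 and stores the byte count in *bytes and the
 * override in *swappiness (-1 when no override was given).
 * Returns -1 for a missing size, an unknown key or an out-of-range value.
 */
int parse_reclaim_request(char *buf, unsigned long long *bytes, int *swappiness)
{
	char *rest, *tok;

	*swappiness = -1;
	*bytes = parse_size(buf, &rest);
	if (rest == buf)
		return -1;

	for (tok = strtok(rest, " \t\n"); tok; tok = strtok(NULL, " \t\n")) {
		int val;
		char extra;

		/* Accept exactly "swappiness=<int>" with no trailing characters. */
		if (sscanf(tok, "swappiness=%d%c", &val, &extra) != 1)
			return -1;
		if (val < MIN_SWAPPINESS || val > MAX_SWAPPINESS)
			return -1;
		*swappiness = val;
	}
	return 0;
}

/*
 * Same precedence as sc_swappiness() in the patch: the per-request value
 * is used only for proactive reclaim and only when one was supplied
 * (non-NULL pointer); otherwise the cgroup's own setting applies.
 */
int effective_swappiness(int proactive, const int *override, int memcg_default)
{
	if (proactive && override)
		return *override;
	return memcg_default;
}
```

The -1 sentinel combined with a NULL pointer follows the patch's own convention: every caller other than memory.reclaim passes NULL, so their behavior is unchanged.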
Patches currently in -mm which might be from schatzberg.dan@gmail.com are
* Re: [nacked] mm-add-swapiness=-arg-to-memoryreclaim.patch removed from -mm tree
From: Michal Hocko @ 2024-06-26 6:12 UTC (permalink / raw)
To: Andrew Morton
Cc: mm-commits, yosryahmed, willy, wangkefeng.wang, tj, shakeel.butt,
shakeelb, roman.gushchin, rientjes, nphamcs, muchun.song,
lizefan.x, hughd, hannes, findns94, david, corbet, chrisl,
schatzberg.dan
On Tue 25-06-24 14:03:33, Andrew Morton wrote:
>
> The quilt patch titled
> Subject: mm: add swappiness= arg to memory.reclaim
> has been removed from the -mm tree. Its filename was
> mm-add-swapiness=-arg-to-memoryreclaim.patch
>
> This patch was dropped because it was nacked
I do not see this being nacked in the original email thread.
--
Michal Hocko
SUSE Labs
* Re: [nacked] mm-add-swapiness=-arg-to-memoryreclaim.patch removed from -mm tree
From: Andrew Morton @ 2024-06-26 17:11 UTC (permalink / raw)
To: Michal Hocko
Cc: mm-commits, yosryahmed, willy, wangkefeng.wang, tj, shakeel.butt,
shakeelb, roman.gushchin, rientjes, nphamcs, muchun.song,
lizefan.x, hughd, hannes, findns94, david, corbet, chrisl,
schatzberg.dan
On Wed, 26 Jun 2024 08:12:28 +0200 Michal Hocko <mhocko@suse.com> wrote:
> On Tue 25-06-24 14:03:33, Andrew Morton wrote:
> >
> > The quilt patch titled
> > Subject: mm: add swappiness= arg to memory.reclaim
> > has been removed from the -mm tree. Its filename was
> > mm-add-swapiness=-arg-to-memoryreclaim.patch
> >
> > This patch was dropped because it was nacked
>
> I do not see this being nacked in the original email thread.
Sorry, I thought I sent an email about this.

Yu Zhao didn't like it, and the submitter (Dan) didn't participate in the
review discussion, so the patchset wasn't going forward.

Dan, please prepare a new revision, attempt to address the review thus far
and let's fire it up again.
* Re: [nacked] mm-add-swapiness=-arg-to-memoryreclaim.patch removed from -mm tree
From: Dan Schatzberg @ 2024-06-26 17:31 UTC (permalink / raw)
To: Andrew Morton
Cc: Michal Hocko, mm-commits, yosryahmed, willy, wangkefeng.wang, tj,
shakeel.butt, shakeelb, roman.gushchin, rientjes, nphamcs,
muchun.song, lizefan.x, hughd, hannes, findns94, david, corbet,
chrisl
On Wed, Jun 26, 2024 at 10:11:37AM -0700, Andrew Morton wrote:
> On Wed, 26 Jun 2024 08:12:28 +0200 Michal Hocko <mhocko@suse.com> wrote:
>
> > On Tue 25-06-24 14:03:33, Andrew Morton wrote:
> > >
> > > The quilt patch titled
> > > Subject: mm: add swappiness= arg to memory.reclaim
> > > has been removed from the -mm tree. Its filename was
> > > mm-add-swapiness=-arg-to-memoryreclaim.patch
> > >
> > > This patch was dropped because it was nacked
> >
> > I do not see this being nacked in the original email thread.
>
> Sorry, I thought I sent an email about this.
>
> Yu Zhao didn't like it, and the submitter (Dan) didn't participate in the
> review discussion, so the patchset wasn't going forward.
>
> Dan, please prepare a new revision, attempt to address review thus far
> and let's fire it up again.
>
Hi Andrew,

I'm pretty inexperienced with upstream development, so probably this
is my own misunderstanding. After the patch was accepted in
mm-unstable, I contacted you directly a month or two ago asking if
there were any next steps or if it would be sent to Linus and you
indicated that you would send it to Linus as-is. More recently,
Shakeel asked about the status of this patch[1] and Yu Zhao said he'd
take another look but that he didn't remember having any strong
concerns.

Perhaps I missed some other communication - what changes would you
like me to make for the patch?

[1]: https://patchwork.kernel.org/project/linux-mm/cover/20240103164841.2800183-1-schatzberg.dan@gmail.com/#25887688
* Re: [nacked] mm-add-swapiness=-arg-to-memoryreclaim.patch removed from -mm tree
From: Andrew Morton @ 2024-06-26 17:43 UTC (permalink / raw)
To: Dan Schatzberg
Cc: Michal Hocko, mm-commits, yosryahmed, willy, wangkefeng.wang, tj,
shakeel.butt, shakeelb, roman.gushchin, rientjes, nphamcs,
muchun.song, lizefan.x, hughd, hannes, findns94, david, corbet,
chrisl
On Wed, 26 Jun 2024 13:31:18 -0400 Dan Schatzberg <schatzberg.dan@gmail.com> wrote:
> After the patch was accepted in
> mm-unstable, I contacted you directly a month or two ago asking if
> there were any next steps or if it would be sent to Linus and you
> indicated that you would send it to Linus as-is. More recently,
> Shakeel asked about the status of this patch[1] and Yu Zhao said he'd
> take another look but that he didn't remember having any strong
> concerns.
Ah, gotcha, thanks. I must have failed to update my notes at the time.
I restored the two patches and queued them for the next merge window.
* Re: [nacked] mm-add-swapiness=-arg-to-memoryreclaim.patch removed from -mm tree
From: Michal Hocko @ 2024-06-27 6:59 UTC (permalink / raw)
To: Andrew Morton
Cc: Dan Schatzberg, mm-commits, yosryahmed, willy, wangkefeng.wang,
tj, shakeel.butt, shakeelb, roman.gushchin, rientjes, nphamcs,
muchun.song, lizefan.x, hughd, hannes, findns94, david, corbet,
chrisl
On Wed 26-06-24 10:43:25, Andrew Morton wrote:
> On Wed, 26 Jun 2024 13:31:18 -0400 Dan Schatzberg <schatzberg.dan@gmail.com> wrote:
>
> > After the patch was accepted in
> > mm-unstable, I contacted you directly a month or two ago asking if
> > there were any next steps or if it would be sent to Linus and you
> > indicated that you would send it to Linus as-is. More recently,
> > Shakeel asked about the status of this patch[1] and Yu Zhao said he'd
> > take another look but that he didn't remember having any strong
> > concerns.
>
> Ah, gotcha, thanks. I must have failed to update my notes at the time.
>
> I restored the two patches and queued them for the next merge window.
Oops, missed this reply. Thanks!
--
Michal Hocko
SUSE Labs
* Re: [nacked] mm-add-swapiness=-arg-to-memoryreclaim.patch removed from -mm tree
From: Michal Hocko @ 2024-06-27 6:56 UTC (permalink / raw)
To: Andrew Morton
Cc: mm-commits, yosryahmed, willy, wangkefeng.wang, tj, shakeel.butt,
shakeelb, roman.gushchin, rientjes, nphamcs, muchun.song,
lizefan.x, hughd, hannes, findns94, david, corbet, chrisl,
schatzberg.dan
On Wed 26-06-24 10:11:37, Andrew Morton wrote:
> On Wed, 26 Jun 2024 08:12:28 +0200 Michal Hocko <mhocko@suse.com> wrote:
>
> > On Tue 25-06-24 14:03:33, Andrew Morton wrote:
> > >
> > > The quilt patch titled
> > > Subject: mm: add swappiness= arg to memory.reclaim
> > > has been removed from the -mm tree. Its filename was
> > > mm-add-swapiness=-arg-to-memoryreclaim.patch
> > >
> > > This patch was dropped because it was nacked
> >
> > I do not see this being nacked in the original email thread.
>
> Sorry, I thought I sent an email about this.
>
> Yu Zhao didn't like it, and the submitter (Dan) didn't participate in the
> review discussion, so the patchset wasn't going forward.
Yu said he doesn't remember his concerns anymore [1] and this has been
acked by several people.
> Dan, please prepare a new revision, attempt to address review thus far
> and let's fire it up again.
Can we just take the patch as it was please?
[1] https://lore.kernel.org/all/20240103164841.2800183-3-schatzberg.dan@gmail.com/T/#m0e4834bb45b975f518f297370edccafa8740a0ad
--
Michal Hocko
SUSE Labs
* Re: [nacked] mm-add-swapiness=-arg-to-memoryreclaim.patch removed from -mm tree
From: Johannes Weiner @ 2024-06-26 17:36 UTC (permalink / raw)
To: Michal Hocko
Cc: Andrew Morton, mm-commits, yosryahmed, willy, wangkefeng.wang, tj,
shakeel.butt, shakeelb, roman.gushchin, rientjes, nphamcs,
muchun.song, lizefan.x, hughd, findns94, david, corbet, chrisl,
schatzberg.dan
On Wed, Jun 26, 2024 at 08:12:28AM +0200, Michal Hocko wrote:
> On Tue 25-06-24 14:03:33, Andrew Morton wrote:
> >
> > The quilt patch titled
> > Subject: mm: add swappiness= arg to memory.reclaim
> > has been removed from the -mm tree. Its filename was
> > mm-add-swapiness=-arg-to-memoryreclaim.patch
> >
> > This patch was dropped because it was nacked
>
> I do not see this being nacked in the original email thread.
Yes, me neither.
And IIRC, the "disagreement" came down to one person's coding style
opinion (one which didn't resonate with other reviewers).