* [PATCH v6 1/7] mm: make alloc_demote_folio externally invokable for migration
2024-06-14 3:00 [PATCH v6 0/7] DAMON based tiered memory management for CXL memory Honggyu Kim
@ 2024-06-14 3:00 ` Honggyu Kim
2024-06-14 3:00 ` [PATCH v6 2/7] mm: rename alloc_demote_folio to alloc_migrate_folio Honggyu Kim
` (6 subsequent siblings)
7 siblings, 0 replies; 10+ messages in thread
From: Honggyu Kim @ 2024-06-14 3:00 UTC (permalink / raw)
To: SeongJae Park, damon
Cc: Andrew Morton, Masami Hiramatsu, Mathieu Desnoyers,
Steven Rostedt, Gregory Price, linux-mm, linux-kernel,
linux-trace-kernel, 42.hyeyoo, art.jeongseob, kernel_team,
Honggyu Kim
alloc_demote_folio() can also be used outside of vmscan.c, so remove the
static keyword from it.
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
---
mm/internal.h | 1 +
mm/vmscan.c | 3 +--
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index b2c75b12014e..b3ca996a4efc 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1052,6 +1052,7 @@ extern unsigned long __must_check vm_mmap_pgoff(struct file *, unsigned long,
unsigned long, unsigned long);
extern void set_pageblock_order(void);
+struct folio *alloc_demote_folio(struct folio *src, unsigned long private);
unsigned long reclaim_pages(struct list_head *folio_list);
unsigned int reclaim_clean_pages_from_list(struct zone *zone,
struct list_head *folio_list);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2e34de9cd0d4..2f4406872f43 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -916,8 +916,7 @@ static void folio_check_dirty_writeback(struct folio *folio,
mapping->a_ops->is_dirty_writeback(folio, dirty, writeback);
}
-static struct folio *alloc_demote_folio(struct folio *src,
- unsigned long private)
+struct folio *alloc_demote_folio(struct folio *src, unsigned long private)
{
struct folio *dst;
nodemask_t *allowed_mask;
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v6 2/7] mm: rename alloc_demote_folio to alloc_migrate_folio
2024-06-14 3:00 [PATCH v6 0/7] DAMON based tiered memory management for CXL memory Honggyu Kim
2024-06-14 3:00 ` [PATCH v6 1/7] mm: make alloc_demote_folio externally invokable for migration Honggyu Kim
@ 2024-06-14 3:00 ` Honggyu Kim
2024-06-14 3:00 ` [PATCH v6 3/7] mm/damon/sysfs-schemes: add target_nid on sysfs-schemes Honggyu Kim
` (5 subsequent siblings)
7 siblings, 0 replies; 10+ messages in thread
From: Honggyu Kim @ 2024-06-14 3:00 UTC (permalink / raw)
To: SeongJae Park, damon
Cc: Andrew Morton, Masami Hiramatsu, Mathieu Desnoyers,
Steven Rostedt, Gregory Price, linux-mm, linux-kernel,
linux-trace-kernel, 42.hyeyoo, art.jeongseob, kernel_team,
Honggyu Kim
alloc_demote_folio() can also be used for general migration, including
both demotion and promotion, so rename it from alloc_demote_folio to
alloc_migrate_folio.
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
---
mm/internal.h | 2 +-
mm/vmscan.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index b3ca996a4efc..9f967842f636 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1052,7 +1052,7 @@ extern unsigned long __must_check vm_mmap_pgoff(struct file *, unsigned long,
unsigned long, unsigned long);
extern void set_pageblock_order(void);
-struct folio *alloc_demote_folio(struct folio *src, unsigned long private);
+struct folio *alloc_migrate_folio(struct folio *src, unsigned long private);
unsigned long reclaim_pages(struct list_head *folio_list);
unsigned int reclaim_clean_pages_from_list(struct zone *zone,
struct list_head *folio_list);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2f4406872f43..f5414b101909 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -916,7 +916,7 @@ static void folio_check_dirty_writeback(struct folio *folio,
mapping->a_ops->is_dirty_writeback(folio, dirty, writeback);
}
-struct folio *alloc_demote_folio(struct folio *src, unsigned long private)
+struct folio *alloc_migrate_folio(struct folio *src, unsigned long private)
{
struct folio *dst;
nodemask_t *allowed_mask;
@@ -979,7 +979,7 @@ static unsigned int demote_folio_list(struct list_head *demote_folios,
node_get_allowed_targets(pgdat, &allowed_mask);
/* Demotion ignores all cpuset and mempolicy settings */
- migrate_pages(demote_folios, alloc_demote_folio, NULL,
+ migrate_pages(demote_folios, alloc_migrate_folio, NULL,
(unsigned long)&mtc, MIGRATE_ASYNC, MR_DEMOTION,
&nr_succeeded);
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v6 3/7] mm/damon/sysfs-schemes: add target_nid on sysfs-schemes
2024-06-14 3:00 [PATCH v6 0/7] DAMON based tiered memory management for CXL memory Honggyu Kim
2024-06-14 3:00 ` [PATCH v6 1/7] mm: make alloc_demote_folio externally invokable for migration Honggyu Kim
2024-06-14 3:00 ` [PATCH v6 2/7] mm: rename alloc_demote_folio to alloc_migrate_folio Honggyu Kim
@ 2024-06-14 3:00 ` Honggyu Kim
2024-06-14 3:00 ` [PATCH v6 4/7] mm/migrate: add MR_DAMON to migrate_reason Honggyu Kim
` (4 subsequent siblings)
7 siblings, 0 replies; 10+ messages in thread
From: Honggyu Kim @ 2024-06-14 3:00 UTC (permalink / raw)
To: SeongJae Park, damon
Cc: Andrew Morton, Masami Hiramatsu, Mathieu Desnoyers,
Steven Rostedt, Gregory Price, linux-mm, linux-kernel,
linux-trace-kernel, 42.hyeyoo, art.jeongseob, kernel_team,
Hyeongtak Ji, Honggyu Kim
From: Hyeongtak Ji <hyeongtak.ji@sk.com>
This patch adds target_nid under
/sys/kernel/mm/damon/admin/kdamonds/<N>/contexts/<N>/schemes/<N>/
The 'target_nid' can be used as the destination node for DAMOS actions
such as DAMOS_MIGRATE_{HOT,COLD} in the follow up patches.
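
For illustration only, a userspace program could set the new knob roughly
as sketched below. This is a minimal sketch and not part of the patch:
damos_target_nid_path() and damos_set_target_nid() are hypothetical helper
names, and the kdamond, context and scheme indices stand in for the <N>
placeholders in the path above. Writing the file needs root because the
attribute is created with mode 0600.

```c
#include <assert.h>
#include <stdio.h>

/* Sysfs root taken from the path in the patch description. */
#define DAMON_SYSFS_KDAMONDS "/sys/kernel/mm/damon/admin/kdamonds"

/*
 * Hypothetical helper (not part of this patch): format the path of the
 * target_nid file for the given kdamond, context and scheme indices.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int damos_target_nid_path(char *buf, size_t len, unsigned int kdamond,
			  unsigned int ctx, unsigned int scheme)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "%s/%u/contexts/%u/schemes/%u/target_nid",
		     DAMON_SYSFS_KDAMONDS, kdamond, ctx, scheme);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: write a node ID to the target_nid file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int damos_set_target_nid(const char *path, int nid)
{
	FILE *f = fopen(path, "w");
	int ret;

	if (!f)
		return -1;
	ret = fprintf(f, "%d\n", nid) < 0 ? -1 : 0;
	if (fclose(f) != 0)
		ret = -1;
	return ret;
}
```

The value is stored as written; this patch does not yet validate the node
range (see the TODO in target_nid_store()), so callers of such a helper
would have to pass a valid node ID themselves.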
Signed-off-by: Hyeongtak Ji <hyeongtak.ji@sk.com>
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
---
include/linux/damon.h | 11 ++++++++++-
mm/damon/core.c | 5 ++++-
mm/damon/dbgfs.c | 2 +-
mm/damon/lru_sort.c | 3 ++-
mm/damon/reclaim.c | 3 ++-
mm/damon/sysfs-schemes.c | 33 ++++++++++++++++++++++++++++++++-
6 files changed, 51 insertions(+), 6 deletions(-)
diff --git a/include/linux/damon.h b/include/linux/damon.h
index f7da65e1ac04..21d6b69a015c 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -374,6 +374,7 @@ struct damos_access_pattern {
* @apply_interval_us: The time between applying the @action.
* @quota: Control the aggressiveness of this scheme.
* @wmarks: Watermarks for automated (in)activation of this scheme.
+ * @target_nid: Destination node if @action is "migrate_{hot,cold}".
* @filters: Additional set of &struct damos_filter for &action.
* @stat: Statistics of this scheme.
* @list: List head for siblings.
@@ -389,6 +390,10 @@ struct damos_access_pattern {
* monitoring context are inactive, DAMON stops monitoring either, and just
* repeatedly checks the watermarks.
*
+ * @target_nid is used to set the migration target node for migrate_hot or
+ * migrate_cold actions, which means it's only meaningful when @action is either
+ * "migrate_hot" or "migrate_cold".
+ *
* Before applying the &action to a memory region, &struct damon_operations
* implementation could check pages of the region and skip &action to respect
* &filters
@@ -410,6 +415,9 @@ struct damos {
/* public: */
struct damos_quota quota;
struct damos_watermarks wmarks;
+ union {
+ int target_nid;
+ };
struct list_head filters;
struct damos_stat stat;
struct list_head list;
@@ -726,7 +734,8 @@ struct damos *damon_new_scheme(struct damos_access_pattern *pattern,
enum damos_action action,
unsigned long apply_interval_us,
struct damos_quota *quota,
- struct damos_watermarks *wmarks);
+ struct damos_watermarks *wmarks,
+ int target_nid);
void damon_add_scheme(struct damon_ctx *ctx, struct damos *s);
void damon_destroy_scheme(struct damos *s);
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 6392f1cc97a3..c0ec5be4f56e 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -354,7 +354,8 @@ struct damos *damon_new_scheme(struct damos_access_pattern *pattern,
enum damos_action action,
unsigned long apply_interval_us,
struct damos_quota *quota,
- struct damos_watermarks *wmarks)
+ struct damos_watermarks *wmarks,
+ int target_nid)
{
struct damos *scheme;
@@ -381,6 +382,8 @@ struct damos *damon_new_scheme(struct damos_access_pattern *pattern,
scheme->wmarks = *wmarks;
scheme->wmarks.activated = true;
+ scheme->target_nid = target_nid;
+
return scheme;
}
diff --git a/mm/damon/dbgfs.c b/mm/damon/dbgfs.c
index 2461cfe2e968..51a6f1cac385 100644
--- a/mm/damon/dbgfs.c
+++ b/mm/damon/dbgfs.c
@@ -281,7 +281,7 @@ static struct damos **str_to_schemes(const char *str, ssize_t len,
pos += parsed;
scheme = damon_new_scheme(&pattern, action, 0, &quota,
- &wmarks);
+ &wmarks, NUMA_NO_NODE);
if (!scheme)
goto fail;
diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
index 3de2916a65c3..3775f0f2743d 100644
--- a/mm/damon/lru_sort.c
+++ b/mm/damon/lru_sort.c
@@ -163,7 +163,8 @@ static struct damos *damon_lru_sort_new_scheme(
/* under the quota. */
&quota,
/* (De)activate this according to the watermarks. */
- &damon_lru_sort_wmarks);
+ &damon_lru_sort_wmarks,
+ NUMA_NO_NODE);
}
/* Create a DAMON-based operation scheme for hot memory regions */
diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
index 9bd341d62b4c..a05ccb41749b 100644
--- a/mm/damon/reclaim.c
+++ b/mm/damon/reclaim.c
@@ -177,7 +177,8 @@ static struct damos *damon_reclaim_new_scheme(void)
/* under the quota. */
&damon_reclaim_quota,
/* (De)activate this according to the watermarks. */
- &damon_reclaim_wmarks);
+ &damon_reclaim_wmarks,
+ NUMA_NO_NODE);
}
static void damon_reclaim_copy_quota_status(struct damos_quota *dst,
diff --git a/mm/damon/sysfs-schemes.c b/mm/damon/sysfs-schemes.c
index bea5bc52846a..0632d28b67f8 100644
--- a/mm/damon/sysfs-schemes.c
+++ b/mm/damon/sysfs-schemes.c
@@ -6,6 +6,7 @@
*/
#include <linux/slab.h>
+#include <linux/numa.h>
#include "sysfs-common.h"
@@ -1445,6 +1446,7 @@ struct damon_sysfs_scheme {
struct damon_sysfs_scheme_filters *filters;
struct damon_sysfs_stats *stats;
struct damon_sysfs_scheme_regions *tried_regions;
+ int target_nid;
};
/* This should match with enum damos_action */
@@ -1470,6 +1472,7 @@ static struct damon_sysfs_scheme *damon_sysfs_scheme_alloc(
scheme->kobj = (struct kobject){};
scheme->action = action;
scheme->apply_interval_us = apply_interval_us;
+ scheme->target_nid = NUMA_NO_NODE;
return scheme;
}
@@ -1692,6 +1695,28 @@ static ssize_t apply_interval_us_store(struct kobject *kobj,
return err ? err : count;
}
+static ssize_t target_nid_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ struct damon_sysfs_scheme *scheme = container_of(kobj,
+ struct damon_sysfs_scheme, kobj);
+
+ return sysfs_emit(buf, "%d\n", scheme->target_nid);
+}
+
+static ssize_t target_nid_store(struct kobject *kobj,
+ struct kobj_attribute *attr, const char *buf, size_t count)
+{
+ struct damon_sysfs_scheme *scheme = container_of(kobj,
+ struct damon_sysfs_scheme, kobj);
+ int err = 0;
+
+ /* TODO: error handling for target_nid range. */
+ err = kstrtoint(buf, 0, &scheme->target_nid);
+
+ return err ? err : count;
+}
+
static void damon_sysfs_scheme_release(struct kobject *kobj)
{
kfree(container_of(kobj, struct damon_sysfs_scheme, kobj));
@@ -1703,9 +1728,13 @@ static struct kobj_attribute damon_sysfs_scheme_action_attr =
static struct kobj_attribute damon_sysfs_scheme_apply_interval_us_attr =
__ATTR_RW_MODE(apply_interval_us, 0600);
+static struct kobj_attribute damon_sysfs_scheme_target_nid_attr =
+ __ATTR_RW_MODE(target_nid, 0600);
+
static struct attribute *damon_sysfs_scheme_attrs[] = {
&damon_sysfs_scheme_action_attr.attr,
&damon_sysfs_scheme_apply_interval_us_attr.attr,
+ &damon_sysfs_scheme_target_nid_attr.attr,
NULL,
};
ATTRIBUTE_GROUPS(damon_sysfs_scheme);
@@ -2031,7 +2060,8 @@ static struct damos *damon_sysfs_mk_scheme(
};
scheme = damon_new_scheme(&pattern, sysfs_scheme->action,
- sysfs_scheme->apply_interval_us, &quota, &wmarks);
+ sysfs_scheme->apply_interval_us, &quota, &wmarks,
+ sysfs_scheme->target_nid);
if (!scheme)
return NULL;
@@ -2068,6 +2098,7 @@ static void damon_sysfs_update_scheme(struct damos *scheme,
scheme->action = sysfs_scheme->action;
scheme->apply_interval_us = sysfs_scheme->apply_interval_us;
+ scheme->target_nid = sysfs_scheme->target_nid;
scheme->quota.ms = sysfs_quotas->ms;
scheme->quota.sz = sysfs_quotas->sz;
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v6 4/7] mm/migrate: add MR_DAMON to migrate_reason
2024-06-14 3:00 [PATCH v6 0/7] DAMON based tiered memory management for CXL memory Honggyu Kim
` (2 preceding siblings ...)
2024-06-14 3:00 ` [PATCH v6 3/7] mm/damon/sysfs-schemes: add target_nid on sysfs-schemes Honggyu Kim
@ 2024-06-14 3:00 ` Honggyu Kim
2024-06-14 3:00 ` [PATCH v6 5/7] mm/damon/paddr: introduce DAMOS_MIGRATE_COLD action for demotion Honggyu Kim
` (3 subsequent siblings)
7 siblings, 0 replies; 10+ messages in thread
From: Honggyu Kim @ 2024-06-14 3:00 UTC (permalink / raw)
To: SeongJae Park, damon
Cc: Andrew Morton, Masami Hiramatsu, Mathieu Desnoyers,
Steven Rostedt, Gregory Price, linux-mm, linux-kernel,
linux-trace-kernel, 42.hyeyoo, art.jeongseob, kernel_team,
Honggyu Kim
The current patch series introduces DAMON based migration across NUMA
nodes so it'd be better to have a new migrate_reason in trace events.
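
As a side note on why the trace header hunk touches two lines: the reason
list is written once and expanded several times with different definitions
of EM() and EMe(), and EMe() must always mark the last entry. The userspace
sketch below shows that pattern in a simplified form. It is an illustration
under assumptions, not the kernel code: the list is cut down to the entries
visible in this diff, so the enum values differ from the kernel's, and the
kernel keeps the enum itself in migrate_mode.h rather than generating it
from the list. migrate_reason_name() is a hypothetical helper.

```c
#include <stdio.h>

/* Reduced list: only the entries visible in the diff, ending in MR_DAMON. */
#define MIGRATE_REASON					\
	EM( MR_NUMA_MISPLACED,	"numa_misplaced")	\
	EM( MR_CONTIG_RANGE,	"contig_range")		\
	EM( MR_LONGTERM_PIN,	"longterm_pin")		\
	EM( MR_DEMOTION,	"demotion")		\
	EMe(MR_DAMON,		"damon")

/* First expansion: build the enum; EMe() also appends the terminator. */
#define EM(sym, str)	sym,
#define EMe(sym, str)	sym, MR_TYPES
enum migrate_reason { MIGRATE_REASON };
#undef EM
#undef EMe

/* Second expansion: build the name table; EMe() omits the trailing comma. */
#define EM(sym, str)	[sym] = str,
#define EMe(sym, str)	[sym] = str
static const char * const migrate_reason_names[MR_TYPES] = { MIGRATE_REASON };
#undef EM
#undef EMe

/* Hypothetical helper: map a reason to its trace string. */
const char *migrate_reason_name(int reason)
{
	if (reason < 0 || reason >= MR_TYPES)
		return "unknown";
	return migrate_reason_names[reason];
}
```

Because EMe() is the terminator, appending MR_DAMON means the previous last
entry, MR_DEMOTION, has to change from EMe() to EM().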
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
---
include/linux/migrate_mode.h | 1 +
include/trace/events/migrate.h | 3 ++-
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
index f37cc03f9369..cec36b7e7ced 100644
--- a/include/linux/migrate_mode.h
+++ b/include/linux/migrate_mode.h
@@ -29,6 +29,7 @@ enum migrate_reason {
MR_CONTIG_RANGE,
MR_LONGTERM_PIN,
MR_DEMOTION,
+ MR_DAMON,
MR_TYPES
};
diff --git a/include/trace/events/migrate.h b/include/trace/events/migrate.h
index 0190ef725b43..cd01dd7b3640 100644
--- a/include/trace/events/migrate.h
+++ b/include/trace/events/migrate.h
@@ -22,7 +22,8 @@
EM( MR_NUMA_MISPLACED, "numa_misplaced") \
EM( MR_CONTIG_RANGE, "contig_range") \
EM( MR_LONGTERM_PIN, "longterm_pin") \
- EMe(MR_DEMOTION, "demotion")
+ EM( MR_DEMOTION, "demotion") \
+ EMe(MR_DAMON, "damon")
/*
* First define the enums in the above macros to be exported to userspace
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v6 5/7] mm/damon/paddr: introduce DAMOS_MIGRATE_COLD action for demotion
2024-06-14 3:00 [PATCH v6 0/7] DAMON based tiered memory management for CXL memory Honggyu Kim
` (3 preceding siblings ...)
2024-06-14 3:00 ` [PATCH v6 4/7] mm/migrate: add MR_DAMON to migrate_reason Honggyu Kim
@ 2024-06-14 3:00 ` Honggyu Kim
2024-06-14 3:00 ` [PATCH v6 6/7] mm/damon/paddr: introduce DAMOS_MIGRATE_HOT action for promotion Honggyu Kim
` (2 subsequent siblings)
7 siblings, 0 replies; 10+ messages in thread
From: Honggyu Kim @ 2024-06-14 3:00 UTC (permalink / raw)
To: SeongJae Park, damon
Cc: Andrew Morton, Masami Hiramatsu, Mathieu Desnoyers,
Steven Rostedt, Gregory Price, linux-mm, linux-kernel,
linux-trace-kernel, 42.hyeyoo, art.jeongseob, kernel_team,
Honggyu Kim, Hyeongtak Ji
This patch introduces the DAMOS_MIGRATE_COLD action, which is similar to
DAMOS_PAGEOUT, but migrates folios to the node given by the 'target_nid'
sysfs file instead of swapping them out.
The 'target_nid' sysfs knob specifies the migration target node ID.
Here is an example usage of this 'migrate_cold' action.
$ cd /sys/kernel/mm/damon/admin/kdamonds/<N>
$ cat contexts/<N>/schemes/<N>/action
migrate_cold
$ echo 2 > contexts/<N>/schemes/<N>/target_nid
$ echo commit > state
$ numactl -p 0 ./hot_cold 500M 600M &
$ numastat -c -p hot_cold
Per-node process memory usage (in MBs)
PID Node 0 Node 1 Node 2 Total
-------------- ------ ------ ------ -----
701 (hot_cold) 501 0 601 1101
Since pageout and migrate_cold share some common routines, many of their
functions have similar logic.
damon_pa_migrate_folio_list() is a minimized version of
shrink_folio_list().
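
The batching done by damon_pa_migrate_pages() in the diff below can be
pictured with the following userspace sketch. It is a simplified model under
assumptions, not the kernel code: folios are reduced to an array of source
node IDs, every migration in a batch is assumed to succeed, and the
toy_migrate_batch() and toy_migrate_pages() names are hypothetical. What it
keeps from the patch is the grouping of consecutive folios from the same
node into one batch, and the early return when the source node equals
target_nid or target_nid is NUMA_NO_NODE.

```c
#include <assert.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)

/*
 * Toy stand-in for __damon_pa_migrate_folio_list(): nothing is migrated
 * when the batch already sits on the target node or no target is set.
 * Otherwise every folio in the batch is assumed to migrate successfully.
 */
size_t toy_migrate_batch(size_t nr_folios, int src_nid, int target_nid)
{
	if (src_nid == target_nid || target_nid == NUMA_NO_NODE)
		return 0;
	return nr_folios;
}

/*
 * Toy stand-in for damon_pa_migrate_pages(): split the input into runs of
 * consecutive folios that share a source node and migrate each run as one
 * batch. The number of batches is reported through @nr_batches.
 */
size_t toy_migrate_pages(const int *folio_nids, size_t nr_folios,
			 int target_nid, size_t *nr_batches)
{
	size_t nr_migrated = 0, start = 0, i;

	assert(nr_batches != NULL);
	*nr_batches = 0;
	if (nr_folios == 0)
		return 0;
	assert(folio_nids != NULL);

	for (i = 1; i <= nr_folios; i++) {
		/* Still inside the current run of same-node folios. */
		if (i < nr_folios && folio_nids[i] == folio_nids[start])
			continue;
		nr_migrated += toy_migrate_batch(i - start, folio_nids[start],
						 target_nid);
		(*nr_batches)++;
		start = i;
	}
	return nr_migrated;
}
```

In the real code the batches are lists of isolated folios, migration is
attempted with MIGRATE_ASYNC and may fail, and folios that were not migrated
are put back on the LRU.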
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Signed-off-by: Hyeongtak Ji <hyeongtak.ji@sk.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
---
include/linux/damon.h | 2 +
mm/damon/paddr.c | 154 +++++++++++++++++++++++++++++++++++++++
mm/damon/sysfs-schemes.c | 1 +
3 files changed, 157 insertions(+)
diff --git a/include/linux/damon.h b/include/linux/damon.h
index 21d6b69a015c..56714b6eb0d7 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -105,6 +105,7 @@ struct damon_target {
* @DAMOS_NOHUGEPAGE: Call ``madvise()`` for the region with MADV_NOHUGEPAGE.
* @DAMOS_LRU_PRIO: Prioritize the region on its LRU lists.
* @DAMOS_LRU_DEPRIO: Deprioritize the region on its LRU lists.
+ * @DAMOS_MIGRATE_COLD: Migrate the regions prioritizing colder regions.
* @DAMOS_STAT: Do nothing but count the stat.
* @NR_DAMOS_ACTIONS: Total number of DAMOS actions
*
@@ -122,6 +123,7 @@ enum damos_action {
DAMOS_NOHUGEPAGE,
DAMOS_LRU_PRIO,
DAMOS_LRU_DEPRIO,
+ DAMOS_MIGRATE_COLD,
DAMOS_STAT, /* Do nothing but only record the stat */
NR_DAMOS_ACTIONS,
};
diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
index 18797c1b419b..882ae54af829 100644
--- a/mm/damon/paddr.c
+++ b/mm/damon/paddr.c
@@ -12,6 +12,9 @@
#include <linux/pagemap.h>
#include <linux/rmap.h>
#include <linux/swap.h>
+#include <linux/memory-tiers.h>
+#include <linux/migrate.h>
+#include <linux/mm_inline.h>
#include "../internal.h"
#include "ops-common.h"
@@ -325,6 +328,153 @@ static unsigned long damon_pa_deactivate_pages(struct damon_region *r,
return damon_pa_mark_accessed_or_deactivate(r, s, false);
}
+static unsigned int __damon_pa_migrate_folio_list(
+ struct list_head *migrate_folios, struct pglist_data *pgdat,
+ int target_nid)
+{
+ unsigned int nr_succeeded;
+ nodemask_t allowed_mask = NODE_MASK_NONE;
+ struct migration_target_control mtc = {
+ /*
+ * Allocate from 'node', or fail quickly and quietly.
+ * When this happens, 'page' will likely just be discarded
+ * instead of migrated.
+ */
+ .gfp_mask = (GFP_HIGHUSER_MOVABLE & ~__GFP_RECLAIM) |
+ __GFP_NOWARN | __GFP_NOMEMALLOC | GFP_NOWAIT,
+ .nid = target_nid,
+ .nmask = &allowed_mask
+ };
+
+ if (pgdat->node_id == target_nid || target_nid == NUMA_NO_NODE)
+ return 0;
+
+ if (list_empty(migrate_folios))
+ return 0;
+
+ /* Migration ignores all cpuset and mempolicy settings */
+ migrate_pages(migrate_folios, alloc_migrate_folio, NULL,
+ (unsigned long)&mtc, MIGRATE_ASYNC, MR_DAMON,
+ &nr_succeeded);
+
+ return nr_succeeded;
+}
+
+static unsigned int damon_pa_migrate_folio_list(struct list_head *folio_list,
+ struct pglist_data *pgdat,
+ int target_nid)
+{
+ unsigned int nr_migrated = 0;
+ struct folio *folio;
+ LIST_HEAD(ret_folios);
+ LIST_HEAD(migrate_folios);
+
+ while (!list_empty(folio_list)) {
+ struct folio *folio;
+
+ cond_resched();
+
+ folio = lru_to_folio(folio_list);
+ list_del(&folio->lru);
+
+ if (!folio_trylock(folio))
+ goto keep;
+
+ /* Relocate its contents to another node. */
+ list_add(&folio->lru, &migrate_folios);
+ folio_unlock(folio);
+ continue;
+keep:
+ list_add(&folio->lru, &ret_folios);
+ }
+ /* 'folio_list' is always empty here */
+
+ /* Migrate folios selected for migration */
+ nr_migrated += __damon_pa_migrate_folio_list(
+ &migrate_folios, pgdat, target_nid);
+ /*
+ * Folios that could not be migrated are still in @migrate_folios. Add
+ * those back on @folio_list
+ */
+ if (!list_empty(&migrate_folios))
+ list_splice_init(&migrate_folios, folio_list);
+
+ try_to_unmap_flush();
+
+ list_splice(&ret_folios, folio_list);
+
+ while (!list_empty(folio_list)) {
+ folio = lru_to_folio(folio_list);
+ list_del(&folio->lru);
+ folio_putback_lru(folio);
+ }
+
+ return nr_migrated;
+}
+
+static unsigned long damon_pa_migrate_pages(struct list_head *folio_list,
+ int target_nid)
+{
+ int nid;
+ unsigned long nr_migrated = 0;
+ LIST_HEAD(node_folio_list);
+ unsigned int noreclaim_flag;
+
+ if (list_empty(folio_list))
+ return nr_migrated;
+
+ noreclaim_flag = memalloc_noreclaim_save();
+
+ nid = folio_nid(lru_to_folio(folio_list));
+ do {
+ struct folio *folio = lru_to_folio(folio_list);
+
+ if (nid == folio_nid(folio)) {
+ list_move(&folio->lru, &node_folio_list);
+ continue;
+ }
+
+ nr_migrated += damon_pa_migrate_folio_list(&node_folio_list,
+ NODE_DATA(nid),
+ target_nid);
+ nid = folio_nid(lru_to_folio(folio_list));
+ } while (!list_empty(folio_list));
+
+ nr_migrated += damon_pa_migrate_folio_list(&node_folio_list,
+ NODE_DATA(nid),
+ target_nid);
+
+ memalloc_noreclaim_restore(noreclaim_flag);
+
+ return nr_migrated;
+}
+
+static unsigned long damon_pa_migrate(struct damon_region *r, struct damos *s)
+{
+ unsigned long addr, applied;
+ LIST_HEAD(folio_list);
+
+ for (addr = r->ar.start; addr < r->ar.end; addr += PAGE_SIZE) {
+ struct folio *folio = damon_get_folio(PHYS_PFN(addr));
+
+ if (!folio)
+ continue;
+
+ if (damos_pa_filter_out(s, folio))
+ goto put_folio;
+
+ if (!folio_isolate_lru(folio))
+ goto put_folio;
+ list_add(&folio->lru, &folio_list);
+put_folio:
+ folio_put(folio);
+ }
+ applied = damon_pa_migrate_pages(&folio_list, s->target_nid);
+ cond_resched();
+ return applied * PAGE_SIZE;
+}
+
+
static unsigned long damon_pa_apply_scheme(struct damon_ctx *ctx,
struct damon_target *t, struct damon_region *r,
struct damos *scheme)
@@ -336,6 +486,8 @@ static unsigned long damon_pa_apply_scheme(struct damon_ctx *ctx,
return damon_pa_mark_accessed(r, scheme);
case DAMOS_LRU_DEPRIO:
return damon_pa_deactivate_pages(r, scheme);
+ case DAMOS_MIGRATE_COLD:
+ return damon_pa_migrate(r, scheme);
case DAMOS_STAT:
break;
default:
@@ -356,6 +508,8 @@ static int damon_pa_scheme_score(struct damon_ctx *context,
return damon_hot_score(context, r, scheme);
case DAMOS_LRU_DEPRIO:
return damon_cold_score(context, r, scheme);
+ case DAMOS_MIGRATE_COLD:
+ return damon_cold_score(context, r, scheme);
default:
break;
}
diff --git a/mm/damon/sysfs-schemes.c b/mm/damon/sysfs-schemes.c
index 0632d28b67f8..880015d5b5ea 100644
--- a/mm/damon/sysfs-schemes.c
+++ b/mm/damon/sysfs-schemes.c
@@ -1458,6 +1458,7 @@ static const char * const damon_sysfs_damos_action_strs[] = {
"nohugepage",
"lru_prio",
"lru_deprio",
+ "migrate_cold",
"stat",
};
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v6 6/7] mm/damon/paddr: introduce DAMOS_MIGRATE_HOT action for promotion
2024-06-14 3:00 [PATCH v6 0/7] DAMON based tiered memory management for CXL memory Honggyu Kim
` (4 preceding siblings ...)
2024-06-14 3:00 ` [PATCH v6 5/7] mm/damon/paddr: introduce DAMOS_MIGRATE_COLD action for demotion Honggyu Kim
@ 2024-06-14 3:00 ` Honggyu Kim
2024-06-14 3:00 ` [PATCH v6 7/7] Docs/damon: document damos_migrate_{hot,cold} Honggyu Kim
2024-06-14 16:39 ` [PATCH v6 0/7] DAMON based tiered memory management for CXL memory SeongJae Park
7 siblings, 0 replies; 10+ messages in thread
From: Honggyu Kim @ 2024-06-14 3:00 UTC (permalink / raw)
To: SeongJae Park, damon
Cc: Andrew Morton, Masami Hiramatsu, Mathieu Desnoyers,
Steven Rostedt, Gregory Price, linux-mm, linux-kernel,
linux-trace-kernel, 42.hyeyoo, art.jeongseob, kernel_team,
Hyeongtak Ji, Honggyu Kim
From: Hyeongtak Ji <hyeongtak.ji@sk.com>
This patch introduces the DAMOS_MIGRATE_HOT action, which is similar to
DAMOS_MIGRATE_COLD, but prioritizes hot pages.
It migrates pages inside the given region to the NUMA node given by the
'target_nid' sysfs file.
Here is an example usage of this 'migrate_hot' action.
$ cd /sys/kernel/mm/damon/admin/kdamonds/<N>
$ cat contexts/<N>/schemes/<N>/action
migrate_hot
$ echo 0 > contexts/<N>/schemes/<N>/target_nid
$ echo commit > state
$ numactl -p 2 ./hot_cold 500M 600M &
$ numastat -c -p hot_cold
Per-node process memory usage (in MBs)
PID Node 0 Node 1 Node 2 Total
-------------- ------ ------ ------ -----
701 (hot_cold) 501 0 601 1101
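
As the diff below shows, migrate_hot and migrate_cold share the same
migration routine and differ only in which score function prioritizes
regions under the quota. The userspace sketch below illustrates that split.
It is a toy model under assumptions: the toy_* names and TOY_MAX_SCORE are
hypothetical, and the hotness score here depends only on access frequency,
whereas the in-kernel damon_hot_score() is not reproduced.

```c
#include <assert.h>

/* Hypothetical score ceiling for this sketch. */
#define TOY_MAX_SCORE 99

enum toy_action { TOY_MIGRATE_HOT, TOY_MIGRATE_COLD, TOY_STAT };

/* Toy hotness: scale the access count into [0, TOY_MAX_SCORE]. */
int toy_hot_score(unsigned int nr_accesses, unsigned int max_nr_accesses)
{
	assert(max_nr_accesses > 0);
	if (nr_accesses > max_nr_accesses)
		nr_accesses = max_nr_accesses;
	return (int)(nr_accesses * TOY_MAX_SCORE / max_nr_accesses);
}

/* Toy coldness: the complement of the hotness score. */
int toy_cold_score(unsigned int nr_accesses, unsigned int max_nr_accesses)
{
	return TOY_MAX_SCORE - toy_hot_score(nr_accesses, max_nr_accesses);
}

/*
 * Dispatch mirroring the switch added to damon_pa_scheme_score(): the hot
 * action ranks frequently accessed regions first, the cold action ranks
 * rarely accessed regions first. Other actions get the ceiling here, which
 * is a choice made for this sketch.
 */
int toy_scheme_score(enum toy_action action, unsigned int nr_accesses,
		     unsigned int max_nr_accesses)
{
	switch (action) {
	case TOY_MIGRATE_HOT:
		return toy_hot_score(nr_accesses, max_nr_accesses);
	case TOY_MIGRATE_COLD:
		return toy_cold_score(nr_accesses, max_nr_accesses);
	default:
		break;
	}
	return TOY_MAX_SCORE;
}
```

With this model a region accessed in every sampling interval gets the top
score for migrate_hot and the bottom score for migrate_cold, so the same
region is promoted first by one scheme and demoted last by the other.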
Signed-off-by: Hyeongtak Ji <hyeongtak.ji@sk.com>
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
---
include/linux/damon.h | 2 ++
mm/damon/paddr.c | 3 +++
mm/damon/sysfs-schemes.c | 1 +
3 files changed, 6 insertions(+)
diff --git a/include/linux/damon.h b/include/linux/damon.h
index 56714b6eb0d7..3d62d98d6359 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -105,6 +105,7 @@ struct damon_target {
* @DAMOS_NOHUGEPAGE: Call ``madvise()`` for the region with MADV_NOHUGEPAGE.
* @DAMOS_LRU_PRIO: Prioritize the region on its LRU lists.
* @DAMOS_LRU_DEPRIO: Deprioritize the region on its LRU lists.
+ * @DAMOS_MIGRATE_HOT: Migrate the regions prioritizing warmer regions.
* @DAMOS_MIGRATE_COLD: Migrate the regions prioritizing colder regions.
* @DAMOS_STAT: Do nothing but count the stat.
* @NR_DAMOS_ACTIONS: Total number of DAMOS actions
@@ -123,6 +124,7 @@ enum damos_action {
DAMOS_NOHUGEPAGE,
DAMOS_LRU_PRIO,
DAMOS_LRU_DEPRIO,
+ DAMOS_MIGRATE_HOT,
DAMOS_MIGRATE_COLD,
DAMOS_STAT, /* Do nothing but only record the stat */
NR_DAMOS_ACTIONS,
diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
index 882ae54af829..af6aac388a43 100644
--- a/mm/damon/paddr.c
+++ b/mm/damon/paddr.c
@@ -486,6 +486,7 @@ static unsigned long damon_pa_apply_scheme(struct damon_ctx *ctx,
return damon_pa_mark_accessed(r, scheme);
case DAMOS_LRU_DEPRIO:
return damon_pa_deactivate_pages(r, scheme);
+ case DAMOS_MIGRATE_HOT:
case DAMOS_MIGRATE_COLD:
return damon_pa_migrate(r, scheme);
case DAMOS_STAT:
@@ -508,6 +509,8 @@ static int damon_pa_scheme_score(struct damon_ctx *context,
return damon_hot_score(context, r, scheme);
case DAMOS_LRU_DEPRIO:
return damon_cold_score(context, r, scheme);
+ case DAMOS_MIGRATE_HOT:
+ return damon_hot_score(context, r, scheme);
case DAMOS_MIGRATE_COLD:
return damon_cold_score(context, r, scheme);
default:
diff --git a/mm/damon/sysfs-schemes.c b/mm/damon/sysfs-schemes.c
index 880015d5b5ea..66fccfa776d7 100644
--- a/mm/damon/sysfs-schemes.c
+++ b/mm/damon/sysfs-schemes.c
@@ -1458,6 +1458,7 @@ static const char * const damon_sysfs_damos_action_strs[] = {
"nohugepage",
"lru_prio",
"lru_deprio",
+ "migrate_hot",
"migrate_cold",
"stat",
};
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v6 7/7] Docs/damon: document damos_migrate_{hot,cold}
2024-06-14 3:00 [PATCH v6 0/7] DAMON based tiered memory management for CXL memory Honggyu Kim
` (5 preceding siblings ...)
2024-06-14 3:00 ` [PATCH v6 6/7] mm/damon/paddr: introduce DAMOS_MIGRATE_HOT action for promotion Honggyu Kim
@ 2024-06-14 3:00 ` Honggyu Kim
2024-06-14 16:36 ` SeongJae Park
2024-06-14 16:39 ` [PATCH v6 0/7] DAMON based tiered memory management for CXL memory SeongJae Park
7 siblings, 1 reply; 10+ messages in thread
From: Honggyu Kim @ 2024-06-14 3:00 UTC (permalink / raw)
To: SeongJae Park, damon
Cc: Andrew Morton, Masami Hiramatsu, Mathieu Desnoyers,
Steven Rostedt, Gregory Price, linux-mm, linux-kernel,
linux-trace-kernel, 42.hyeyoo, art.jeongseob, kernel_team,
Honggyu Kim
This patch adds DAMON descriptions of the "migrate_hot" and "migrate_cold"
actions to both the usage and design documents, as well as of the new
"target_nid" knob that sets the migration target node.
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
---
Documentation/admin-guide/mm/damon/usage.rst | 4 ++++
Documentation/mm/damon/design.rst | 4 ++++
2 files changed, 8 insertions(+)
diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst
index e58ceb89ea2a..98804e34448b 100644
--- a/Documentation/admin-guide/mm/damon/usage.rst
+++ b/Documentation/admin-guide/mm/damon/usage.rst
@@ -300,6 +300,10 @@ from the file and their meaning are same to those of the list on
The ``apply_interval_us`` file is for setting and getting the scheme's
:ref:`apply_interval <damon_design_damos>` in microseconds.
+The ``target_nid`` file is for setting the migration target node, which is
+only meaningful when the ``action`` is either ``migrate_hot`` or
+``migrate_cold``.
+
.. _sysfs_access_pattern:
schemes/<N>/access_pattern/
diff --git a/Documentation/mm/damon/design.rst b/Documentation/mm/damon/design.rst
index 3df387249937..3f12c884eb3a 100644
--- a/Documentation/mm/damon/design.rst
+++ b/Documentation/mm/damon/design.rst
@@ -325,6 +325,10 @@ that supports each action are as below.
Supported by ``paddr`` operations set.
- ``lru_deprio``: Deprioritize the region on its LRU lists.
Supported by ``paddr`` operations set.
+ - ``migrate_hot``: Migrate the regions prioritizing warmer regions.
+ Supported by ``paddr`` operations set.
+ - ``migrate_cold``: Migrate the regions prioritizing colder regions.
+ Supported by ``paddr`` operations set.
- ``stat``: Do nothing but count the statistics.
Supported by all operations sets.
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v6 7/7] Docs/damon: document damos_migrate_{hot,cold}
2024-06-14 3:00 ` [PATCH v6 7/7] Docs/damon: document damos_migrate_{hot,cold} Honggyu Kim
@ 2024-06-14 16:36 ` SeongJae Park
0 siblings, 0 replies; 10+ messages in thread
From: SeongJae Park @ 2024-06-14 16:36 UTC (permalink / raw)
To: Honggyu Kim
Cc: SeongJae Park, damon, Andrew Morton, Masami Hiramatsu,
Mathieu Desnoyers, Steven Rostedt, Gregory Price, linux-mm,
linux-kernel, linux-trace-kernel, 42.hyeyoo, art.jeongseob,
kernel_team
On Fri, 14 Jun 2024 12:00:09 +0900 Honggyu Kim <honggyu.kim@sk.com> wrote:
> This patch adds DAMON descriptions of the "migrate_hot" and "migrate_cold"
> actions to both the usage and design documents, as well as of the new
> "target_nid" knob that sets the migration target node.
>
> Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Thanks,
SJ
[...]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v6 0/7] DAMON based tiered memory management for CXL memory
2024-06-14 3:00 [PATCH v6 0/7] DAMON based tiered memory management for CXL memory Honggyu Kim
` (6 preceding siblings ...)
2024-06-14 3:00 ` [PATCH v6 7/7] Docs/damon: document damos_migrate_{hot,cold} Honggyu Kim
@ 2024-06-14 16:39 ` SeongJae Park
7 siblings, 0 replies; 10+ messages in thread
From: SeongJae Park @ 2024-06-14 16:39 UTC (permalink / raw)
To: Honggyu Kim, Andrew Morton
Cc: SeongJae Park, damon, Masami Hiramatsu, Mathieu Desnoyers,
Steven Rostedt, Gregory Price, linux-mm, linux-kernel,
linux-trace-kernel, 42.hyeyoo, art.jeongseob, kernel_team,
Hyeongtak Ji, Rakie Kim, Yunjeong Mun
On Fri, 14 Jun 2024 12:00:02 +0900 Honggyu Kim <honggyu.kim@sk.com> wrote:
> There was an RFC IDEA "DAMOS-based Tiered-Memory Management" previously
> posted at [1].
>
> It says that no implementation of the demote/promote DAMOS actions has
> been made. This patch series implements them for the physical address
> space so that the scheme can be applied at the system-wide level.
>
> Changes from v5:
> https://lore.kernel.org/20240613132056.608-1-honggyu.kim@sk.com
> 1. Remove new actions in usage document as it's for debugfs
Thank you, I confirmed this and gave you my Reviewed-by: tag.
> 2. Apply minor fixes on cover letter
But...
[...]
> 2. YCSB zipfian distribution read only workload (with demotion_enabled true)
> memory pressure with cold memory on node0 with 512GB of local DRAM.
> ====================+================================================+=========
> | cold memory occupied by mmap and memset |
> | 0G 440G 450G 460G 470G 480G 490G 500G |
> ====================+================================================+=========
> Execution time normalized to DRAM-only values | GEOMEAN
> --------------------+------------------------------------------------+---------
> DAMON tiered | - 1.03 1.03 1.03 1.03 1.03 1.07 1.05 | 1.04
> DAMON lazy | - 1.04 1.03 1.04 1.05 1.06 1.06 1.06 | 1.05
> DAMON tiered kswapd | - 1.03 1.03 1.03 1.03 1.02 1.02 1.03 | 1.03
> DAMON lazy kswapd | - 1.04 1.04 1.04 1.03 1.05 1.04 1.05 | 1.04
> ====================+================================================+=========
> CXL usage of redis-server in GB | AVERAGE
> --------------------+------------------------------------------------+---------
> DAMON tiered | - 0.6 0.5 0.4 0.7 0.8 7.1 5.6 | 2.2
> DAMON lazy | - 0.5 3.0 4.5 5.4 6.4 9.4 9.1 | 5.5
> DAMON tiered kswapd | - 0.0 0.0 0.4 0.5 0.1 0.8 1.0 | 0.4
> DAMON lazy kswapd | - 4.2 4.6 5.3 1.7 6.8 8.1 5.8 | 5.2
> ====================+================================================+=========
>
> Each test result is based on the exeuction environment as follows.
Seems the typo is not fixed?
I don't want to delay this work for such a trivial thing, though. For the patch
series,
Acked-by: SeongJae Park <sj@kernel.org>
Thanks,
SJ
[...]
^ permalink raw reply [flat|nested] 10+ messages in thread