* [PATCH 0/8] mm/damon: provide pseudo-moving sum based access rate
@ 2023-09-15  2:52 SeongJae Park
  2023-09-15  2:52 ` [PATCH 1/8] mm/damon/core: define and use a dedicated function for region access rate update SeongJae Park
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: SeongJae Park @ 2023-09-15  2:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: SeongJae Park, Brendan Higgins, damon, linux-mm, kunit-dev,
	linux-kselftest, linux-kernel

Changes from RFC
(https://lore.kernel.org/damon/20230909033711.55794-1-sj@kernel.org/)
- Rebase on latest mm-unstable
- Minor wordsmithing of the cover letter

DAMON checks for access to each region once per sampling interval and
increases the region's access rate counter, nr_accesses, if an access was
made.  The counter is reset every aggregation interval.  The counter is
exposed to users as a metric showing the relative access rate (frequency) of
each region.  In other words, DAMON provides the access rate of each region
once per aggregation interval.  The aggregation keeps temporal access pattern
changes from confusing the results.  However, it also forces a few
DAMON-related operations to be aligned to the aggregation interval
unnecessarily.  This can restrict the flexibility of DAMON applications,
especially when the aggregation interval is huge.

To provide the monitoring results at finer-grained timing while still
handling temporal access pattern changes, this patchset implements a
pseudo-moving sum based access rate metric.  It is a pseudo-moving sum because
a strict moving sum implementation would need to keep all values of the last
time window, and that could incur high overhead if there could be an arbitrary
number of values in a time window.  This is especially risky for nr_accesses,
since the sampling interval and aggregation interval can be set arbitrarily
and the past values would have to be maintained for every region.  The
pseudo-moving sum assumes there was no temporal access pattern change in the
last discrete time window, which removes the need to keep the list of the last
time window's values.  As a result, it is not a strict moving sum
implementation, but it provides reasonable accuracy.

It also keeps an important property of the moving sum: the moving sum becomes
the same as the discrete-window based sum at times that align with the time
window.  This means using the pseudo-moving sum based nr_accesses makes no
difference to users who read the value once per aggregation interval.
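
The window-alignment property can be checked with a small userspace sketch.
This is not the kernel code: pseudo_moving_sum() and run_window() are
hypothetical names used only for illustration, and the sketch assumes the
previous window's sum is divisible by the window length (otherwise integer
division leaves a small residue).

```c
#include <stddef.h>

/* Drop the assumed per-sample share of the last window, add the new value. */
static unsigned int pseudo_moving_sum(unsigned int mvsum, unsigned int nomvsum,
		unsigned int len_window, unsigned int new_value)
{
	return mvsum - nomvsum / len_window + new_value;
}

/*
 * Feed one full window of new values, starting from the previous window's
 * discrete sum, and return the pseudo-moving sum after the last value.
 */
static unsigned int run_window(unsigned int prev_window_sum,
		const unsigned int *values, size_t len_window)
{
	unsigned int mvsum = prev_window_sum;
	size_t i;

	for (i = 0; i < len_window; i++)
		mvsum = pseudo_moving_sum(mvsum, prev_window_sum,
				(unsigned int)len_window, values[i]);
	return mvsum;
}
```

After len_window calls, the previous window's sum has been dropped in full
(len_window times nomvsum / len_window), so what remains is exactly the plain
sum of the new window's values, whatever the previous window's sum was.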

Patches Sequence
----------------

The sequence of the patches is as follows.  The first four patches
prepare for the change.  The first two (patches 1 and 2) implement a
helper function for nr_accesses updates and eliminate a corner case
that skips use of the function, respectively.  The following two
(patches 3 and 4) respectively implement the pseudo-moving sum function
and a simple unit test for it.

Two patches that make DAMON use the pseudo-moving sum follow.  The
fifth one (patch 5) introduces a new field for representing the
pseudo-moving sum-based access rate of each region, and the sixth one
(patch 6) makes the new field actually get updated with the
pseudo-moving sum function.

The last two patches (patches 7 and 8) make follow-up fixes that skip
unnecessary updates and mark the moving sum function as static,
respectively.

SeongJae Park (8):
  mm/damon/core: define and use a dedicated function for region access
    rate update
  mm/damon/vaddr: call damon_update_region_access_rate() always
  mm/damon/core: implement a pseudo-moving sum function
  mm/damon/core-test: add a unit test for damon_moving_sum()
  mm/damon/core: introduce nr_accesses_bp
  mm/damon/core: use pseudo-moving sum for nr_accesses_bp
  mm/damon/core: skip updating nr_accesses_bp for each aggregation
    interval
  mm/damon/core: mark damon_moving_sum() as a static function

 include/linux/damon.h | 16 +++++++++-
 mm/damon/core-test.h  | 21 ++++++++++++
 mm/damon/core.c       | 74 +++++++++++++++++++++++++++++++++++++++++++
 mm/damon/paddr.c      | 11 +++----
 mm/damon/vaddr.c      | 22 +++++++------
 5 files changed, 128 insertions(+), 16 deletions(-)


base-commit: a5b7405a0eaa74d23547ede9c3820f01ee0a2c13
-- 
2.25.1




* [PATCH 1/8] mm/damon/core: define and use a dedicated function for region access rate update
  2023-09-15  2:52 [PATCH 0/8] mm/damon: provide pseudo-moving sum based access rate SeongJae Park
@ 2023-09-15  2:52 ` SeongJae Park
  2023-09-15  2:52 ` [PATCH 2/8] mm/damon/vaddr: call damon_update_region_access_rate() always SeongJae Park
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2023-09-15  2:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: SeongJae Park, damon, linux-mm, linux-kernel

Each DAMON operations set updates the nr_accesses field of each
damon_region for each of its access check results, from the
check_accesses() callback.  Directly accessing the field could make
things complex to manage and change in the future.  Define and use a
dedicated function for the purpose.

Signed-off-by: SeongJae Park <sj@kernel.org>
---
 include/linux/damon.h |  5 ++++-
 mm/damon/core.c       | 16 ++++++++++++++++
 mm/damon/paddr.c      |  6 ++----
 mm/damon/vaddr.c      |  6 ++----
 4 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 9a32b8fd0bd3..17c504d236b9 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -45,7 +45,9 @@ struct damon_addr_range {
  *
  * @nr_accesses is reset to zero for every &damon_attrs->aggr_interval and be
  * increased for every &damon_attrs->sample_interval if an access to the region
- * during the last sampling interval is found.
+ * during the last sampling interval is found.  The update of this field should
+ * not be done with direct access but with the helper function,
+ * damon_update_region_access_rate().
  *
  * @age is initially zero, increased for each aggregation interval, and reset
  * to zero again if the access frequency is significantly changed.  If two
@@ -620,6 +622,7 @@ void damon_add_region(struct damon_region *r, struct damon_target *t);
 void damon_destroy_region(struct damon_region *r, struct damon_target *t);
 int damon_set_regions(struct damon_target *t, struct damon_addr_range *ranges,
 		unsigned int nr_ranges);
+void damon_update_region_access_rate(struct damon_region *r, bool accessed);
 
 struct damos_filter *damos_new_filter(enum damos_filter_type type,
 		bool matching);
diff --git a/mm/damon/core.c b/mm/damon/core.c
index c5b7296c69a0..10532159323a 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1549,6 +1549,22 @@ int damon_set_region_biggest_system_ram_default(struct damon_target *t,
 	return damon_set_regions(t, &addr_range, 1);
 }
 
+/**
+ * damon_update_region_access_rate() - Update the access rate of a region.
+ * @r:		The DAMON region to update for its access check result.
+ * @accessed:	Whether the region has accessed during last sampling interval.
+ *
+ * Update the access rate of a region with the region's last sampling interval
+ * access check result.
+ *
+ * Usually this will be called by &damon_operations->check_accesses callback.
+ */
+void damon_update_region_access_rate(struct damon_region *r, bool accessed)
+{
+	if (accessed)
+		r->nr_accesses++;
+}
+
 static int __init damon_init(void)
 {
 	damon_region_cache = KMEM_CACHE(damon_region, 0);
diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
index 909db25efb35..44f21860b555 100644
--- a/mm/damon/paddr.c
+++ b/mm/damon/paddr.c
@@ -157,14 +157,12 @@ static void __damon_pa_check_access(struct damon_region *r)
 	/* If the region is in the last checked page, reuse the result */
 	if (ALIGN_DOWN(last_addr, last_folio_sz) ==
 				ALIGN_DOWN(r->sampling_addr, last_folio_sz)) {
-		if (last_accessed)
-			r->nr_accesses++;
+		damon_update_region_access_rate(r, last_accessed);
 		return;
 	}
 
 	last_accessed = damon_pa_young(r->sampling_addr, &last_folio_sz);
-	if (last_accessed)
-		r->nr_accesses++;
+	damon_update_region_access_rate(r, last_accessed);
 
 	last_addr = r->sampling_addr;
 }
diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index 4c81a9dbd044..7fc0bda73b4c 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -566,14 +566,12 @@ static void __damon_va_check_access(struct mm_struct *mm,
 	/* If the region is in the last checked page, reuse the result */
 	if (same_target && (ALIGN_DOWN(last_addr, last_folio_sz) ==
 				ALIGN_DOWN(r->sampling_addr, last_folio_sz))) {
-		if (last_accessed)
-			r->nr_accesses++;
+		damon_update_region_access_rate(r, last_accessed);
 		return;
 	}
 
 	last_accessed = damon_va_young(mm, r->sampling_addr, &last_folio_sz);
-	if (last_accessed)
-		r->nr_accesses++;
+	damon_update_region_access_rate(r, last_accessed);
 
 	last_addr = r->sampling_addr;
 }
-- 
2.25.1




* [PATCH 2/8] mm/damon/vaddr: call damon_update_region_access_rate() always
  2023-09-15  2:52 [PATCH 0/8] mm/damon: provide pseudo-moving sum based access rate SeongJae Park
  2023-09-15  2:52 ` [PATCH 1/8] mm/damon/core: define and use a dedicated function for region access rate update SeongJae Park
@ 2023-09-15  2:52 ` SeongJae Park
  2023-09-15  2:52 ` [PATCH 3/8] mm/damon/core: implement a pseudo-moving sum function SeongJae Park
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2023-09-15  2:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: SeongJae Park, damon, linux-mm, linux-kernel

When getting the mm_struct of the monitoring target process fails, there
is no need to increase the access rate counter (nr_accesses) of the
regions for the process.  Hence, damon_va_check_accesses() skips calling
damon_update_region_access_rate() in that case.  This breaks the
assumption that damon_update_region_access_rate() is called for every
region, for every sampling interval.  Call the function for every region
even in that case.  This might increase the overhead in some cases, but
such cases would not be frequent, so no significant impact is expected.

Signed-off-by: SeongJae Park <sj@kernel.org>
---
 mm/damon/vaddr.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index 7fc0bda73b4c..e36303271f9d 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -563,6 +563,11 @@ static void __damon_va_check_access(struct mm_struct *mm,
 	static unsigned long last_folio_sz = PAGE_SIZE;
 	static bool last_accessed;
 
+	if (!mm) {
+		damon_update_region_access_rate(r, false);
+		return;
+	}
+
 	/* If the region is in the last checked page, reuse the result */
 	if (same_target && (ALIGN_DOWN(last_addr, last_folio_sz) ==
 				ALIGN_DOWN(r->sampling_addr, last_folio_sz))) {
@@ -586,15 +591,14 @@ static unsigned int damon_va_check_accesses(struct damon_ctx *ctx)
 
 	damon_for_each_target(t, ctx) {
 		mm = damon_get_mm(t);
-		if (!mm)
-			continue;
 		same_target = false;
 		damon_for_each_region(r, t) {
 			__damon_va_check_access(mm, r, same_target);
 			max_nr_accesses = max(r->nr_accesses, max_nr_accesses);
 			same_target = true;
 		}
-		mmput(mm);
+		if (mm)
+			mmput(mm);
 	}
 
 	return max_nr_accesses;
-- 
2.25.1




* [PATCH 3/8] mm/damon/core: implement a pseudo-moving sum function
  2023-09-15  2:52 [PATCH 0/8] mm/damon: provide pseudo-moving sum based access rate SeongJae Park
  2023-09-15  2:52 ` [PATCH 1/8] mm/damon/core: define and use a dedicated function for region access rate update SeongJae Park
  2023-09-15  2:52 ` [PATCH 2/8] mm/damon/vaddr: call damon_update_region_access_rate() always SeongJae Park
@ 2023-09-15  2:52 ` SeongJae Park
  2023-09-15  2:52 ` [PATCH 4/8] mm/damon/core-test: add a unit test for damon_moving_sum() SeongJae Park
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2023-09-15  2:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: SeongJae Park, damon, linux-mm, linux-kernel

For values that continuously change, a moving average or sum is a good way
to provide fast updates while handling temporal and erroneous
variability of the value.  For example, the access rate counter
(nr_accesses) is calculated as the sum of the number of positive sampled
access check results collected during a discrete time window
(aggregation interval).  Hence it handles temporal and erroneous
access check results, but provides an update only once per aggregation
interval.  Using a moving sum method for that could allow providing the
value for every sampling interval.  That could be useful for getting a
monitoring results snapshot or running DAMOS at fine-grained timing.

However, supporting the moving sum for cases where the number of samples in
the time window is arbitrary could impose high overhead, since the number
of past values that it needs to keep could be too high.  nr_accesses
is one such case.  To mitigate the overhead, implement a
pseudo-moving sum function that only provides an estimated moving
sum.  It assumes there was no error in the last discrete time window and
subtracts a constant portion of the last discrete time window's sum.

Note that the function does not strictly implement the moving sum, but
it keeps a property of the moving sum: the value is the same as the
discrete-window based sum at each time window-aligned timing.  Hence,
people collecting the value at the old timings would see no difference.
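
To illustrate the trade-off, here is a userspace sketch that contrasts a
strict moving sum, which has to remember every value of the last window, with
the history-free pseudo-moving sum.  struct strict_mvsum, strict_mvsum_add()
and pseudo_mvsum_add() are hypothetical names for this sketch only, and the
fixed LEN_WINDOW is an assumption made to keep it short; in DAMON the window
length is arbitrary and such a history would be needed per region.

```c
#include <stddef.h>

#define LEN_WINDOW 10

/* Strict moving sum: must remember every value of the last window. */
struct strict_mvsum {
	unsigned int values[LEN_WINDOW];	/* per-window history */
	size_t oldest;				/* index of the oldest value */
	unsigned int sum;
};

static unsigned int strict_mvsum_add(struct strict_mvsum *s,
		unsigned int new_value)
{
	/* Drop the real oldest value and overwrite it with the new one. */
	s->sum = s->sum - s->values[s->oldest] + new_value;
	s->values[s->oldest] = new_value;
	s->oldest = (s->oldest + 1) % LEN_WINDOW;
	return s->sum;
}

/* Pseudo-moving sum: keeps no history, assumes a flat last window. */
static unsigned int pseudo_mvsum_add(unsigned int mvsum, unsigned int nomvsum,
		unsigned int new_value)
{
	return mvsum - nomvsum / LEN_WINDOW + new_value;
}
```

With a last window of 0, 10, 0, 10, 0, 10, 0, 0, 0, 20 (sum 50) and a new
value of 7, the strict version drops the real oldest value (0) while the
pseudo version drops the assumed value (50 / 10 = 5), so the two differ in
the middle of a window.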

Signed-off-by: SeongJae Park <sj@kernel.org>
---
 include/linux/damon.h |  2 ++
 mm/damon/core.c       | 40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 17c504d236b9..487a545a11b4 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -622,6 +622,8 @@ void damon_add_region(struct damon_region *r, struct damon_target *t);
 void damon_destroy_region(struct damon_region *r, struct damon_target *t);
 int damon_set_regions(struct damon_target *t, struct damon_addr_range *ranges,
 		unsigned int nr_ranges);
+unsigned int damon_moving_sum(unsigned int mvsum, unsigned int nomvsum,
+		unsigned int len_window, unsigned int new_value);
 void damon_update_region_access_rate(struct damon_region *r, bool accessed);
 
 struct damos_filter *damos_new_filter(enum damos_filter_type type,
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 10532159323a..b005dc15009f 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1549,6 +1549,46 @@ int damon_set_region_biggest_system_ram_default(struct damon_target *t,
 	return damon_set_regions(t, &addr_range, 1);
 }
 
+/*
+ * damon_moving_sum() - Calculate an inferred moving sum value.
+ * @mvsum:	Inferred sum of the last @len_window values.
+ * @nomvsum:	Non-moving sum of the last discrete @len_window window values.
+ * @len_window:	The number of last values to take care of.
+ * @new_value:	New value that will be added to the pseudo moving sum.
+ *
+ * Moving sum (moving average * window size) is good for handling noise, but
+ * the cost of keeping past values can be high for arbitrary window size.  This
+ * function implements a lightweight pseudo moving sum function that doesn't
+ * keep the past window values.
+ *
+ * It simply assumes there was no noise in the past, and derives the no-noise
+ * assumed past value to drop from @nomvsum and @len_window.  @nomvsum is a
+ * non-moving sum of the last window.  For example, if @len_window is 10 and we
+ * have 25 values, @nomvsum is the sum of the 11th to 20th values of the 25
+ * values.  Hence, this function simply drops @nomvsum / @len_window from
+ * given @mvsum and adds @new_value.
+ *
+ * For example, if @len_window is 10 and @nomvsum is 50, the last 10 values for
+ * the last window could vary, e.g., 0, 10, 0, 10, 0, 10, 0, 0, 0, 20.  For
+ * calculating next moving sum with a new value, we should drop 0 from 50 and
+ * add the new value.  However, this function assumes it got value 5 for each
+ * of the last ten times.  Based on the assumption, when the next value is
+ * measured, it drops the assumed past value, 5, from the current sum, and adds
+ * the new value to get the updated pseudo-moving sum.
+ *
+ * This means the value could have errors, but the errors disappear at
+ * every @len_window aligned call.  For example, if @len_window is 10, the
+ * pseudo moving sum with 11th value to 19th value would have an error.  But
+ * the sum with 20th value will not have the error.
+ *
+ * Return: Pseudo-moving sum after getting the @new_value.
+ */
+unsigned int damon_moving_sum(unsigned int mvsum, unsigned int nomvsum,
+		unsigned int len_window, unsigned int new_value)
+{
+	return mvsum - nomvsum / len_window + new_value;
+}
+
 /**
  * damon_update_region_access_rate() - Update the access rate of a region.
  * @r:		The DAMON region to update for its access check result.
-- 
2.25.1




* [PATCH 4/8] mm/damon/core-test: add a unit test for damon_moving_sum()
  2023-09-15  2:52 [PATCH 0/8] mm/damon: provide pseudo-moving sum based access rate SeongJae Park
                   ` (2 preceding siblings ...)
  2023-09-15  2:52 ` [PATCH 3/8] mm/damon/core: implement a pseudo-moving sum function SeongJae Park
@ 2023-09-15  2:52 ` SeongJae Park
  2023-09-15  2:52 ` [PATCH 5/8] mm/damon/core: introduce nr_accesses_bp SeongJae Park
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2023-09-15  2:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: SeongJae Park, Brendan Higgins, damon, linux-mm, kunit-dev,
	linux-kselftest, linux-kernel

Add a simple unit test for the pseudo moving-sum function
(damon_moving_sum()).

Signed-off-by: SeongJae Park <sj@kernel.org>
---
 mm/damon/core-test.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/mm/damon/core-test.h b/mm/damon/core-test.h
index 6cc8b245586d..c539f0e8377e 100644
--- a/mm/damon/core-test.h
+++ b/mm/damon/core-test.h
@@ -341,6 +341,21 @@ static void damon_test_set_attrs(struct kunit *test)
 	KUNIT_EXPECT_EQ(test, damon_set_attrs(c, &invalid_attrs), -EINVAL);
 }
 
+static void damon_test_moving_sum(struct kunit *test)
+{
+	unsigned int mvsum = 50000, nomvsum = 50000, len_window = 10;
+	unsigned int new_values[] = {10000, 0, 10000, 0, 0, 0, 10000, 0, 0, 0};
+	unsigned int expects[] = {55000, 50000, 55000, 50000, 45000, 40000,
+		45000, 40000, 35000, 30000};
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(new_values); i++) {
+		mvsum = damon_moving_sum(mvsum, nomvsum, len_window,
+				new_values[i]);
+		KUNIT_EXPECT_EQ(test, mvsum, expects[i]);
+	}
+}
+
 static void damos_test_new_filter(struct kunit *test)
 {
 	struct damos_filter *filter;
@@ -425,6 +440,7 @@ static struct kunit_case damon_test_cases[] = {
 	KUNIT_CASE(damon_test_set_regions),
 	KUNIT_CASE(damon_test_update_monitoring_result),
 	KUNIT_CASE(damon_test_set_attrs),
+	KUNIT_CASE(damon_test_moving_sum),
 	KUNIT_CASE(damos_test_new_filter),
 	KUNIT_CASE(damos_test_filter_out),
 	{},
-- 
2.25.1




* [PATCH 5/8] mm/damon/core: introduce nr_accesses_bp
  2023-09-15  2:52 [PATCH 0/8] mm/damon: provide pseudo-moving sum based access rate SeongJae Park
                   ` (3 preceding siblings ...)
  2023-09-15  2:52 ` [PATCH 4/8] mm/damon/core-test: add a unit test for damon_moving_sum() SeongJae Park
@ 2023-09-15  2:52 ` SeongJae Park
  2023-09-15  2:52 ` [PATCH 6/8] mm/damon/core: use pseudo-moving sum for nr_accesses_bp SeongJae Park
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2023-09-15  2:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: SeongJae Park, damon, linux-mm, linux-kernel

Add yet another representation of the access rate of each region, namely
nr_accesses_bp.  It is the same as nr_accesses but represents the
value in basis points (1 in 10,000), and is updated once per
aggregation interval.  That is, nr_accesses_bp is just nr_accesses *
10000.  This may seem useless at the moment.  However, it will be
useful for representing values smaller than one nr_accesses, which is
needed to make a moving sum-based nr_accesses.
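
A small userspace sketch of why the scaling matters: the pseudo-moving sum
drops (last window sum) / (window length) per sample, and with plain integer
counts that quotient is often truncated to zero.  drop_per_sample() and
drop_per_sample_bp() are hypothetical helpers for illustration, not functions
from this series.

```c
/* Share of the last window's sum that is dropped per sample. */
static unsigned int drop_per_sample(unsigned int last_window_sum,
		unsigned int len_window)
{
	return last_window_sum / len_window;
}

/* Same share, but with the count scaled to basis points first. */
static unsigned int drop_per_sample_bp(unsigned int last_nr_accesses,
		unsigned int len_window)
{
	return last_nr_accesses * 10000 / len_window;
}
```

For example, with 5 accesses in a window of 20 samples, the unscaled share is
5 / 20 = 0, so nothing would ever be dropped, while the basis point version
gives 50000 / 20 = 2500.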

Signed-off-by: SeongJae Park <sj@kernel.org>
---
 include/linux/damon.h | 5 +++++
 mm/damon/core-test.h  | 5 +++++
 mm/damon/core.c       | 6 ++++++
 3 files changed, 16 insertions(+)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 487a545a11b4..15f24b23c9a0 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -40,6 +40,7 @@ struct damon_addr_range {
  * @ar:			The address range of the region.
  * @sampling_addr:	Address of the sample for the next access check.
  * @nr_accesses:	Access frequency of this region.
+ * @nr_accesses_bp:	@nr_accesses in basis point (0.01%).
  * @list:		List head for siblings.
  * @age:		Age of this region.
  *
@@ -49,6 +50,9 @@ struct damon_addr_range {
  * not be done with direct access but with the helper function,
  * damon_update_region_access_rate().
  *
+ * @nr_accesses_bp is another representation of @nr_accesses in basis point
+ * (1 in 10,000) that updated every aggregation interval.
+ *
  * @age is initially zero, increased for each aggregation interval, and reset
  * to zero again if the access frequency is significantly changed.  If two
  * regions are merged into a new region, both @nr_accesses and @age of the new
@@ -58,6 +62,7 @@ struct damon_region {
 	struct damon_addr_range ar;
 	unsigned long sampling_addr;
 	unsigned int nr_accesses;
+	unsigned int nr_accesses_bp;
 	struct list_head list;
 
 	unsigned int age;
diff --git a/mm/damon/core-test.h b/mm/damon/core-test.h
index c539f0e8377e..79f1f12e0dd5 100644
--- a/mm/damon/core-test.h
+++ b/mm/damon/core-test.h
@@ -94,6 +94,7 @@ static void damon_test_aggregate(struct kunit *test)
 		for (ir = 0; ir < 3; ir++) {
 			r = damon_new_region(saddr[it][ir], eaddr[it][ir]);
 			r->nr_accesses = accesses[it][ir];
+			r->nr_accesses_bp = accesses[it][ir] * 10000;
 			damon_add_region(r, t);
 		}
 		it++;
@@ -147,9 +148,11 @@ static void damon_test_merge_two(struct kunit *test)
 	t = damon_new_target();
 	r = damon_new_region(0, 100);
 	r->nr_accesses = 10;
+	r->nr_accesses_bp = 100000;
 	damon_add_region(r, t);
 	r2 = damon_new_region(100, 300);
 	r2->nr_accesses = 20;
+	r2->nr_accesses_bp = 200000;
 	damon_add_region(r2, t);
 
 	damon_merge_two_regions(t, r, r2);
@@ -196,6 +199,7 @@ static void damon_test_merge_regions_of(struct kunit *test)
 	for (i = 0; i < ARRAY_SIZE(sa); i++) {
 		r = damon_new_region(sa[i], ea[i]);
 		r->nr_accesses = nrs[i];
+		r->nr_accesses_bp = nrs[i] * 10000;
 		damon_add_region(r, t);
 	}
 
@@ -297,6 +301,7 @@ static void damon_test_update_monitoring_result(struct kunit *test)
 	struct damon_region *r = damon_new_region(3, 7);
 
 	r->nr_accesses = 15;
+	r->nr_accesses_bp = 150000;
 	r->age = 20;
 
 	new_attrs = (struct damon_attrs){
diff --git a/mm/damon/core.c b/mm/damon/core.c
index b005dc15009f..ce85c00b0a4c 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -128,6 +128,7 @@ struct damon_region *damon_new_region(unsigned long start, unsigned long end)
 	region->ar.start = start;
 	region->ar.end = end;
 	region->nr_accesses = 0;
+	region->nr_accesses_bp = 0;
 	INIT_LIST_HEAD(&region->list);
 
 	region->age = 0;
@@ -508,6 +509,7 @@ static void damon_update_monitoring_result(struct damon_region *r,
 {
 	r->nr_accesses = damon_nr_accesses_for_new_attrs(r->nr_accesses,
 			old_attrs, new_attrs);
+	r->nr_accesses_bp = r->nr_accesses * 10000;
 	r->age = damon_age_for_new_attrs(r->age, old_attrs, new_attrs);
 }
 
@@ -1115,6 +1117,7 @@ static void damon_merge_two_regions(struct damon_target *t,
 
 	l->nr_accesses = (l->nr_accesses * sz_l + r->nr_accesses * sz_r) /
 			(sz_l + sz_r);
+	l->nr_accesses_bp = l->nr_accesses * 10000;
 	l->age = (l->age * sz_l + r->age * sz_r) / (sz_l + sz_r);
 	l->ar.end = r->ar.end;
 	damon_destroy_region(r, t);
@@ -1138,6 +1141,8 @@ static void damon_merge_regions_of(struct damon_target *t, unsigned int thres,
 		else
 			r->age++;
 
+		r->nr_accesses_bp = r->nr_accesses * 10000;
+
 		if (prev && prev->ar.end == r->ar.start &&
 		    abs(prev->nr_accesses - r->nr_accesses) <= thres &&
 		    damon_sz_region(prev) + damon_sz_region(r) <= sz_limit)
@@ -1186,6 +1191,7 @@ static void damon_split_region_at(struct damon_target *t,
 
 	new->age = r->age;
 	new->last_nr_accesses = r->last_nr_accesses;
+	new->nr_accesses_bp = r->nr_accesses_bp;
 
 	damon_insert_region(new, r, damon_next_region(r), t);
 }
-- 
2.25.1




* [PATCH 6/8] mm/damon/core: use pseudo-moving sum for nr_accesses_bp
  2023-09-15  2:52 [PATCH 0/8] mm/damon: provide pseudo-moving sum based access rate SeongJae Park
                   ` (4 preceding siblings ...)
  2023-09-15  2:52 ` [PATCH 5/8] mm/damon/core: introduce nr_accesses_bp SeongJae Park
@ 2023-09-15  2:52 ` SeongJae Park
  2023-09-15  2:52 ` [PATCH 7/8] mm/damon/core: skip updating nr_accesses_bp for each aggregation interval SeongJae Park
  2023-09-15  2:52 ` [PATCH 8/8] mm/damon/core: mark damon_moving_sum() as a static function SeongJae Park
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2023-09-15  2:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: SeongJae Park, damon, linux-mm, linux-kernel

Let nr_accesses_bp be calculated as a pseudo-moving sum that is updated
every sampling interval, using damon_moving_sum().  This is expected to
be useful for cases where the aggregation interval is set quite large but
the monitoring results need to be collected before the next
aggregation interval has passed.
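
The per-sample update can be sketched in userspace as follows.  The structs
below are simplified stand-ins (hypothetical, with only the fields the update
needs, and interval types assumed to be unsigned long), and window_len() and
update_access_rate() are illustrative names rather than the kernel functions.

```c
/* Userspace stand-ins for the fields used by the update (hypothetical). */
struct attrs_sketch {
	unsigned long sample_interval;
	unsigned long aggr_interval;
};

struct region_sketch {
	unsigned int nr_accesses;
	unsigned int nr_accesses_bp;
	unsigned int last_nr_accesses;
};

/* Number of samples per aggregation window; 1 if sample_interval is 0. */
static unsigned int window_len(const struct attrs_sketch *attrs)
{
	if (!attrs->sample_interval)
		return 1;
	return attrs->aggr_interval / attrs->sample_interval;
}

/* One sampling-interval update, following the shape of the patch. */
static void update_access_rate(struct region_sketch *r, int accessed,
		const struct attrs_sketch *attrs)
{
	unsigned int len_window = window_len(attrs);

	/* Drop the assumed per-sample share, add this sample in bp. */
	r->nr_accesses_bp = r->nr_accesses_bp -
		r->last_nr_accesses * 10000 / len_window +
		(accessed ? 10000 : 0);
	if (accessed)
		r->nr_accesses++;
}
```

With a sampling interval of 5000 and an aggregation interval of 100000, the
window holds 20 samples; a region that had 10 accesses in the last window
drops 5000 basis points per sample and gains 10000 for each positive check.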

Signed-off-by: SeongJae Park <sj@kernel.org>
---
 include/linux/damon.h | 12 +++++++++---
 mm/damon/core.c       | 16 +++++++++++++++-
 mm/damon/paddr.c      |  9 +++++----
 mm/damon/vaddr.c      | 12 +++++++-----
 4 files changed, 36 insertions(+), 13 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 15f24b23c9a0..0fe13482df63 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -40,7 +40,8 @@ struct damon_addr_range {
  * @ar:			The address range of the region.
  * @sampling_addr:	Address of the sample for the next access check.
  * @nr_accesses:	Access frequency of this region.
- * @nr_accesses_bp:	@nr_accesses in basis point (0.01%).
+ * @nr_accesses_bp:	@nr_accesses in basis point (0.01%) that updated for
+ *			each sampling interval.
  * @list:		List head for siblings.
  * @age:		Age of this region.
  *
@@ -51,7 +52,11 @@ struct damon_addr_range {
  * damon_update_region_access_rate().
  *
  * @nr_accesses_bp is another representation of @nr_accesses in basis point
- * (1 in 10,000) that updated every aggregation interval.
+ * (1 in 10,000) that updated for every &damon_attrs->sample_interval in a
+ * manner similar to moving sum.  By the algorithm, this value becomes
+ * @nr_accesses * 10000 for every &struct damon_attrs->aggr_interval.  This can
+ * be used when the aggregation interval is too huge and therefore cannot wait
+ * for it before getting the access monitoring results.
  *
  * @age is initially zero, increased for each aggregation interval, and reset
  * to zero again if the access frequency is significantly changed.  If two
@@ -629,7 +634,8 @@ int damon_set_regions(struct damon_target *t, struct damon_addr_range *ranges,
 		unsigned int nr_ranges);
 unsigned int damon_moving_sum(unsigned int mvsum, unsigned int nomvsum,
 		unsigned int len_window, unsigned int new_value);
-void damon_update_region_access_rate(struct damon_region *r, bool accessed);
+void damon_update_region_access_rate(struct damon_region *r, bool accessed,
+		struct damon_attrs *attrs);
 
 struct damos_filter *damos_new_filter(enum damos_filter_type type,
 		bool matching);
diff --git a/mm/damon/core.c b/mm/damon/core.c
index ce85c00b0a4c..29ee1fc18393 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1599,14 +1599,28 @@ unsigned int damon_moving_sum(unsigned int mvsum, unsigned int nomvsum,
  * damon_update_region_access_rate() - Update the access rate of a region.
  * @r:		The DAMON region to update for its access check result.
  * @accessed:	Whether the region has accessed during last sampling interval.
+ * @attrs:	The damon_attrs of the DAMON context.
  *
  * Update the access rate of a region with the region's last sampling interval
  * access check result.
  *
  * Usually this will be called by &damon_operations->check_accesses callback.
  */
-void damon_update_region_access_rate(struct damon_region *r, bool accessed)
+void damon_update_region_access_rate(struct damon_region *r, bool accessed,
+		struct damon_attrs *attrs)
 {
+	unsigned int len_window = 1;
+
+	/*
+	 * sample_interval can be zero, but cannot be larger than
+	 * aggr_interval, owing to validation of damon_set_attrs().
+	 */
+	if (attrs->sample_interval)
+		len_window = attrs->aggr_interval / attrs->sample_interval;
+	r->nr_accesses_bp = damon_moving_sum(r->nr_accesses_bp,
+			r->last_nr_accesses * 10000, len_window,
+			accessed ? 10000 : 0);
+
 	if (accessed)
 		r->nr_accesses++;
 }
diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
index 44f21860b555..081e2a325778 100644
--- a/mm/damon/paddr.c
+++ b/mm/damon/paddr.c
@@ -148,7 +148,8 @@ static bool damon_pa_young(unsigned long paddr, unsigned long *folio_sz)
 	return accessed;
 }
 
-static void __damon_pa_check_access(struct damon_region *r)
+static void __damon_pa_check_access(struct damon_region *r,
+		struct damon_attrs *attrs)
 {
 	static unsigned long last_addr;
 	static unsigned long last_folio_sz = PAGE_SIZE;
@@ -157,12 +158,12 @@ static void __damon_pa_check_access(struct damon_region *r)
 	/* If the region is in the last checked page, reuse the result */
 	if (ALIGN_DOWN(last_addr, last_folio_sz) ==
 				ALIGN_DOWN(r->sampling_addr, last_folio_sz)) {
-		damon_update_region_access_rate(r, last_accessed);
+		damon_update_region_access_rate(r, last_accessed, attrs);
 		return;
 	}
 
 	last_accessed = damon_pa_young(r->sampling_addr, &last_folio_sz);
-	damon_update_region_access_rate(r, last_accessed);
+	damon_update_region_access_rate(r, last_accessed, attrs);
 
 	last_addr = r->sampling_addr;
 }
@@ -175,7 +176,7 @@ static unsigned int damon_pa_check_accesses(struct damon_ctx *ctx)
 
 	damon_for_each_target(t, ctx) {
 		damon_for_each_region(r, t) {
-			__damon_pa_check_access(r);
+			__damon_pa_check_access(r, &ctx->attrs);
 			max_nr_accesses = max(r->nr_accesses, max_nr_accesses);
 		}
 	}
diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index e36303271f9d..af2cb82e1fad 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -557,26 +557,27 @@ static bool damon_va_young(struct mm_struct *mm, unsigned long addr,
  * r	the region to be checked
  */
 static void __damon_va_check_access(struct mm_struct *mm,
-				struct damon_region *r, bool same_target)
+				struct damon_region *r, bool same_target,
+				struct damon_attrs *attrs)
 {
 	static unsigned long last_addr;
 	static unsigned long last_folio_sz = PAGE_SIZE;
 	static bool last_accessed;
 
 	if (!mm) {
-		damon_update_region_access_rate(r, false);
+		damon_update_region_access_rate(r, false, attrs);
 		return;
 	}
 
 	/* If the region is in the last checked page, reuse the result */
 	if (same_target && (ALIGN_DOWN(last_addr, last_folio_sz) ==
 				ALIGN_DOWN(r->sampling_addr, last_folio_sz))) {
-		damon_update_region_access_rate(r, last_accessed);
+		damon_update_region_access_rate(r, last_accessed, attrs);
 		return;
 	}
 
 	last_accessed = damon_va_young(mm, r->sampling_addr, &last_folio_sz);
-	damon_update_region_access_rate(r, last_accessed);
+	damon_update_region_access_rate(r, last_accessed, attrs);
 
 	last_addr = r->sampling_addr;
 }
@@ -593,7 +594,8 @@ static unsigned int damon_va_check_accesses(struct damon_ctx *ctx)
 		mm = damon_get_mm(t);
 		same_target = false;
 		damon_for_each_region(r, t) {
-			__damon_va_check_access(mm, r, same_target);
+			__damon_va_check_access(mm, r, same_target,
+					&ctx->attrs);
 			max_nr_accesses = max(r->nr_accesses, max_nr_accesses);
 			same_target = true;
 		}
-- 
2.25.1




* [PATCH 7/8] mm/damon/core: skip updating nr_accesses_bp for each aggregation interval
  2023-09-15  2:52 [PATCH 0/8] mm/damon: provide pseudo-moving sum based access rate SeongJae Park
                   ` (5 preceding siblings ...)
  2023-09-15  2:52 ` [PATCH 6/8] mm/damon/core: use pseudo-moving sum for nr_accesses_bp SeongJae Park
@ 2023-09-15  2:52 ` SeongJae Park
  2023-09-15  2:52 ` [PATCH 8/8] mm/damon/core: mark damon_moving_sum() as a static function SeongJae Park
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2023-09-15  2:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: SeongJae Park, damon, linux-mm, linux-kernel

damon_merge_regions_of(), which is called once per aggregation interval,
sets nr_accesses_bp to nr_accesses * 10000.  However, nr_accesses_bp is
already updated every sampling interval via damon_moving_sum(), using
the aggregation interval as the moving time window.  By the definition
of the algorithm, the value equals the discrete-window based sum at
every window-aligned time.  Hence, nr_accesses_bp is already equal to
nr_accesses * 10000 at each aggregation interval, without the explicit
update.  Remove the unnecessary update of nr_accesses_bp in
damon_merge_regions_of().
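
The equivalence claimed above can be checked with a small user-space
sketch.  It is not kernel code: struct region_counters, sample(),
end_window() and window_is_aligned() are hypothetical names made up for
illustration, and the sketch assumes that the caller passes the previous
window's access count scaled by 10000 as nomvsum and that the previous
count is saved at each window boundary.  Only the arithmetic of
moving_sum() is taken from damon_moving_sum() in this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Same arithmetic as damon_moving_sum() in this series. */
static unsigned int moving_sum(unsigned int mvsum, unsigned int nomvsum,
		unsigned int len_window, unsigned int new_value)
{
	assert(len_window > 0);
	return mvsum - nomvsum / len_window + new_value;
}

/* Hypothetical stand-in for the relevant counters of a region. */
struct region_counters {
	unsigned int nr_accesses;	/* discrete per-window count */
	unsigned int nr_accesses_bp;	/* pseudo-moving sum, basis points */
	unsigned int last_nr_accesses;	/* count of the previous window */
};

/* One sampling step: update both counters. */
static void sample(struct region_counters *c, bool accessed,
		unsigned int len_window)
{
	c->nr_accesses_bp = moving_sum(c->nr_accesses_bp,
			c->last_nr_accesses * 10000, len_window,
			accessed ? 10000 : 0);
	if (accessed)
		c->nr_accesses++;
}

/* Window boundary: remember the finished count and reset it. */
static void end_window(struct region_counters *c)
{
	c->last_nr_accesses = c->nr_accesses;
	c->nr_accesses = 0;
}

/*
 * Run one window of len_window samples, the first nr_hits of which are
 * accesses.  Return whether nr_accesses_bp == nr_accesses * 10000 holds
 * at the end of the window, before the discrete counter is reset.
 */
static bool window_is_aligned(struct region_counters *c,
		unsigned int len_window, unsigned int nr_hits)
{
	unsigned int i;
	bool aligned;

	for (i = 0; i < len_window; i++)
		sample(c, i < nr_hits, len_window);
	aligned = c->nr_accesses_bp == c->nr_accesses * 10000;
	end_window(c);
	return aligned;
}
```

With a window of 20 samples, every subtraction of nomvsum / len_window
is exact, so the two counters agree at each window boundary.  For window
lengths that do not divide nomvsum evenly, the integer division in this
sketch can leave a small residue in the moving sum.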

Signed-off-by: SeongJae Park <sj@kernel.org>
---
 mm/damon/core.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/damon/core.c b/mm/damon/core.c
index 29ee1fc18393..45cc108c0fe1 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1141,8 +1141,6 @@ static void damon_merge_regions_of(struct damon_target *t, unsigned int thres,
 		else
 			r->age++;
 
-		r->nr_accesses_bp = r->nr_accesses * 10000;
-
 		if (prev && prev->ar.end == r->ar.start &&
 		    abs(prev->nr_accesses - r->nr_accesses) <= thres &&
 		    damon_sz_region(prev) + damon_sz_region(r) <= sz_limit)
-- 
2.25.1




* [PATCH 8/8] mm/damon/core: mark damon_moving_sum() as a static function
  2023-09-15  2:52 [PATCH 0/8] mm/damon: provide pseudo-moving sum based access rate SeongJae Park
                   ` (6 preceding siblings ...)
  2023-09-15  2:52 ` [PATCH 7/8] mm/damon/core: skip updating nr_accesses_bp for each aggregation interval SeongJae Park
@ 2023-09-15  2:52 ` SeongJae Park
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2023-09-15  2:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: SeongJae Park, damon, linux-mm, linux-kernel

damon_moving_sum() is used only in mm/damon/core.c.  Mark it as a
static function and remove its declaration from include/linux/damon.h.

Signed-off-by: SeongJae Park <sj@kernel.org>
---
 include/linux/damon.h | 2 --
 mm/damon/core.c       | 2 +-
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 0fe13482df63..491fdd3e4c76 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -632,8 +632,6 @@ void damon_add_region(struct damon_region *r, struct damon_target *t);
 void damon_destroy_region(struct damon_region *r, struct damon_target *t);
 int damon_set_regions(struct damon_target *t, struct damon_addr_range *ranges,
 		unsigned int nr_ranges);
-unsigned int damon_moving_sum(unsigned int mvsum, unsigned int nomvsum,
-		unsigned int len_window, unsigned int new_value);
 void damon_update_region_access_rate(struct damon_region *r, bool accessed,
 		struct damon_attrs *attrs);
 
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 45cc108c0fe1..b15cf47d2d29 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1587,7 +1587,7 @@ int damon_set_region_biggest_system_ram_default(struct damon_target *t,
  *
  * Return: Pseudo-moving average after getting the @new_value.
  */
-unsigned int damon_moving_sum(unsigned int mvsum, unsigned int nomvsum,
+static unsigned int damon_moving_sum(unsigned int mvsum, unsigned int nomvsum,
 		unsigned int len_window, unsigned int new_value)
 {
 	return mvsum - nomvsum / len_window + new_value;
-- 
2.25.1


