* [RFC PATCH v1.1 00/13] mm/damon: optimize out nr_accesses_bp
From: SeongJae Park @ 2026-06-20 17:22 UTC
Cc: SeongJae Park, Andrew Morton, Brendan Higgins, David Gow,
	Masami Hiramatsu, Mathieu Desnoyers, Shuah Khan, Steven Rostedt,
	damon, kunit-dev, linux-kernel, linux-kselftest, linux-mm,
	linux-trace-kernel

TLDR: Replace damon_region->nr_accesses_bp, which is easy to get wrong,
with a simpler on-demand moving sum function,
damon_nr_accesses_mvsum().

Background
==========

DAMON's monitoring output (access pattern snapshot, or more technically
speaking, damon_region->nr_accesses) is completed once per aggregation
interval, which is 100 ms by default.  Users can arbitrarily increase
the interval as needed.  Under the suggested intervals auto-tuning
setup, it can span up to 200 seconds.  If the aggregation interval is
too long, snapshot users cannot get the snapshot within a reasonable
time.

To mitigate this, we introduced a new field of damon_region, namely
nr_accesses_bp.  It contains a pseudo moving sum of nr_accesses in bp
units and is updated for each sampling interval.

It turned out that keeping it correctly updated every sampling interval
is not that easy.  From online parameter update feature development and
more experimental hacks, we found that it is easily corrupted.  Once it
is corrupted, DAMON's monitoring outputs become quite insane.  Hence we
added a few validation checks.  It is easily corrupted because it
requires every per-sampling-interval update to be correct.

Solution
========

There is no real reason to keep it updated every sampling interval.
Due to the simple pseudo-moving sum mechanism and existing helper field
(last_nr_accesses), we can also calculate the pseudo moving sum on
demand in a much simpler way.
Implement a function for getting the pseudo moving sum on demand, and
replace nr_accesses_bp uses with the new function.  Also remove the no
longer needed tests for nr_accesses_bp and the per-sampling interval
update functions.  Finally, remove the nr_accesses_bp field.  The new
function is quite simple.

Discussion
==========

Depending on the use case, multiple nr_accesses readers could be
executed in the same kdamond_fn() main loop iteration, which is executed
once per sampling interval.  Such readers include DAMON region exporting
tracepoints (damon_[region_]aggregated and damos_before_apply), DAMOS,
and DAMON sysfs interface logic for the update_schemes_tried_regions
command.  In this case, the new function will be called multiple times,
and this could add overhead compared to the old logic, which simply
reads the field without any additional work.

Nonetheless, the new function is quite simple.  And the new approach
does nothing while there is no need to read.  The old approach had to
execute its update function for each region for every sampling interval.
Hence the new approach is believed to be even more lightweight in the
common case, and the overhead is negligible anyway.

One more advantage of this change is that one field is removed from the
damon_region struct.  On setups that use a high number of DAMON
regions, this could be a potential memory space benefit.

Patches Sequence
================

Patch 1 introduces the new function for getting the pseudo moving sum of
nr_accesses on demand.  Patch 2 implements a unit test for the new
function's internal logic.  Patches 3-5 replace uses of nr_accesses_bp
in DAMOS, tracepoints and DAMON sysfs interface with the new function,
respectively.  Patches 6-8 remove nr_accesses_bp validation functions
in DAMON core, one by one.  Patches 9 and 10 further remove tests and
test helper for nr_accesses_bp, respectively.  Patch 11 removes the
setups and updates of the nr_accesses_bp field.
Patch 12 removes the function that was used for updating the
nr_accesses_bp field, together with its unit test, which is the single
remaining caller of the function.  Finally, patch 13 removes the
damon_region->nr_accesses_bp field.

Changes from RFC v1
- RFC v1: https://lore.kernel.org/20260619193415.73833-1-sj@kernel.org
- Avoid divide-by-zero from zero aggregation interval.
- Call damon_nr_accesses_mvsum() for damos tracing only when it is
  enabled.
- Remove obsolete mentions of nr_accesses_bp in comments.

SeongJae Park (13):
  mm/damon: introduce damon_nr_accesses_mvsum()
  mm/damon/tests/core-kunit: test damon_mvsum()
  mm/damon/core: use damon_nr_accesses_mvsum() in __damos_valid_target()
  mm/damon/core: use damon_nr_accesses_mvsum() for damos region tracing
  mm/damon/sysfs-schemes: use damon_nr_accesses_mvsum() for damo regions
  mm/damon/core: remove damon_warn_fix_nr_accesses_corruption()
  mm/damon/core: remove damon_verify_reset_aggregated()
  mm/damon/core: remove damon_verify_merge_regions_of()
  mm/damon/tests/core-kunit: remove nr_accesses_bp setup and tests
  selftests/damon/drgn_dump_damon_status: do not dump nr_accesses_bp
  mm/damon/core: remove nr_accesses_bp setups and updates
  mm/damon/core: remove damon_moving_sum() and its unit test
  mm/damon: remove damon_region->nr_accesses_bp

 include/linux/damon.h                         |  12 +-
 include/trace/events/damon.h                  |   8 +-
 mm/damon/core.c                               | 180 +++++++-----------
 mm/damon/sysfs-schemes.c                      |   6 +-
 mm/damon/tests/core-kunit.h                   |  37 ++--
 .../selftests/damon/drgn_dump_damon_status.py |   1 -
 6 files changed, 96 insertions(+), 148 deletions(-)

base-commit: a74bff7aaa4b3a64070425b4b367a459388a8233
--
2.47.3
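
The cover letter states that the pseudo moving sum can be derived on
demand from nr_accesses and last_nr_accesses, but this thread does not
include the body of damon_nr_accesses_mvsum().  The following userspace
sketch is therefore only an illustration under stated assumptions, not
the patch's implementation: struct region, mvsum_step(),
mvsum_on_demand(), count_mismatches() and the 'passed' argument (the
number of sampling intervals elapsed in the current aggregation window)
are hypothetical names, and the result is kept in bp units.  It replays
one window with a per-sample update and checks that a closed-form
calculation gives the same value at every sample.

```c
#include <assert.h>
#include <stdbool.h>

#define BP_PER_ACCESS 10000u	/* one access expressed in basis points */

/* Hypothetical, simplified stand-in for struct damon_region. */
struct region {
	unsigned int nr_accesses;	/* positive samples in this window */
	unsigned int last_nr_accesses;	/* nr_accesses of previous window */
};

/* Per-sample style: must run once per sampling interval to stay correct. */
static unsigned int mvsum_step(unsigned int mvsum, unsigned int nomvsum,
		unsigned int len_window, unsigned int new_value)
{
	return mvsum - nomvsum / len_window + new_value;
}

/*
 * On-demand style (assumed): derive the same value only when a reader
 * asks.  'passed' is the number of samples elapsed in the current window.
 */
static unsigned int mvsum_on_demand(const struct region *r,
		unsigned int len_window, unsigned int passed)
{
	unsigned int last_bp = r->last_nr_accesses * BP_PER_ACCESS;

	if (!len_window)	/* avoid divide-by-zero */
		return r->nr_accesses * BP_PER_ACCESS;
	return last_bp - passed * (last_bp / len_window) +
		r->nr_accesses * BP_PER_ACCESS;
}

/* Replay one window both ways; return the number of mismatching samples. */
static unsigned int count_mismatches(unsigned int last_nr_accesses,
		unsigned int len_window, const bool *accessed)
{
	struct region r = { .nr_accesses = 0,
			    .last_nr_accesses = last_nr_accesses };
	unsigned int bp = last_nr_accesses * BP_PER_ACCESS;
	unsigned int i, mismatches = 0;

	/* A window cannot record more accesses than it has samples. */
	assert(last_nr_accesses <= len_window);

	for (i = 0; i < len_window; i++) {
		bp = mvsum_step(bp, last_nr_accesses * BP_PER_ACCESS,
				len_window, accessed[i] ? BP_PER_ACCESS : 0);
		if (accessed[i])
			r.nr_accesses++;
		if (bp != mvsum_on_demand(&r, len_window, i + 1))
			mismatches++;
	}
	return mismatches;
}
```

In this model the per-sample version is only correct if no update is
ever missed or applied twice, while the on-demand version depends only
on the two counters and the elapsed sample count.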
* [RFC PATCH v1.1 04/13] mm/damon/core: use damon_nr_accesses_mvsum() for damos region tracing
From: SeongJae Park @ 2026-06-20 17:22 UTC
Cc: SeongJae Park, Andrew Morton, Masami Hiramatsu, Mathieu Desnoyers,
	Steven Rostedt, damon, linux-kernel, linux-mm, linux-trace-kernel

damon_nr_accesses_mvsum() returns the same value as nr_accesses_bp.
The function is also simpler and therefore less error-prone.  Executing
the function is more expensive than simply reading the field, but
because the function is quite simple, the overhead should be negligible.
Use it in the DAMON region exporting tracepoints instead of
nr_accesses_bp.

Signed-off-by: SeongJae Park <sj@kernel.org>
---
 include/trace/events/damon.h | 8 +++++---
 mm/damon/core.c              | 5 +++--
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/include/trace/events/damon.h b/include/trace/events/damon.h
index 78388538acf44..8851727ae1627 100644
--- a/include/trace/events/damon.h
+++ b/include/trace/events/damon.h
@@ -78,9 +78,11 @@ TRACE_EVENT_CONDITION(damos_before_apply,
 
 	TP_PROTO(unsigned int context_idx, unsigned int scheme_idx,
 		unsigned int target_idx, struct damon_region *r,
-		unsigned int nr_regions, bool do_trace),
+		unsigned int nr_accesses, unsigned int nr_regions,
+		bool do_trace),
 
-	TP_ARGS(context_idx, scheme_idx, target_idx, r, nr_regions, do_trace),
+	TP_ARGS(context_idx, scheme_idx, target_idx, r, nr_accesses,
+		nr_regions, do_trace),
 
 	TP_CONDITION(do_trace),
 
@@ -101,7 +103,7 @@ TRACE_EVENT_CONDITION(damos_before_apply,
 		__entry->target_idx = target_idx;
 		__entry->start = r->ar.start;
 		__entry->end = r->ar.end;
-		__entry->nr_accesses = r->nr_accesses_bp / 10000;
+		__entry->nr_accesses = nr_accesses;
 		__entry->age = r->age;
 		__entry->nr_regions = nr_regions;
 	),
diff --git a/mm/damon/core.c b/mm/damon/core.c
index ce0e2a4c1d523..710ec13e98281 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -2434,7 +2434,7 @@ static void damos_apply_scheme(struct damon_ctx *c, struct damon_target *t,
 	struct damos *siter;		/* schemes iterator */
 	unsigned int sidx = 0;
 	struct damon_target *titer;	/* targets iterator */
-	unsigned int tidx = 0;
+	unsigned int tidx = 0, nr_accesses = 0;
 	bool do_trace = false;
 
 	/* get indices for trace_damos_before_apply() */
@@ -2449,6 +2449,7 @@ static void damos_apply_scheme(struct damon_ctx *c, struct damon_target *t,
 				break;
 			tidx++;
 		}
+		nr_accesses = damon_nr_accesses_mvsum(r, c);
 		do_trace = true;
 	}
 
@@ -2464,7 +2465,7 @@ static void damos_apply_scheme(struct damon_ctx *c, struct damon_target *t,
 		if (damos_core_filter_out(c, t, r, s))
 			return;
 		ktime_get_coarse_ts64(&begin);
-		trace_damos_before_apply(cidx, sidx, tidx, r,
+		trace_damos_before_apply(cidx, sidx, tidx, r, nr_accesses,
 				damon_nr_regions(t), do_trace);
 		sz_applied = c->ops.apply_scheme(c, t, r, s,
 				&sz_ops_filter_passed);
--
2.47.3
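
The diff above computes nr_accesses inside the block that already
gathers the trace indices, so the helper runs only when the tracepoint
is enabled.  The following userspace sketch models just that control
flow.  All names in it (fake_mvsum(), fake_trace(), apply_scheme(),
nr_mvsum_calls, last_traced) are hypothetical stand-ins for
illustration, not kernel APIs, and the helper body is a placeholder.

```c
#include <stdbool.h>

static unsigned int nr_mvsum_calls;	/* how often the helper ran */
static unsigned int last_traced;	/* last value the "tracepoint" saw */

/* Hypothetical stand-in for damon_nr_accesses_mvsum(). */
static unsigned int fake_mvsum(unsigned int nr_accesses)
{
	nr_mvsum_calls++;
	return nr_accesses;
}

/* Hypothetical stand-in for a conditional tracepoint. */
static void fake_trace(unsigned int nr_accesses, bool do_trace)
{
	if (do_trace)
		last_traced = nr_accesses;
}

/* Mirrors the control flow of the patched damos_apply_scheme(). */
static void apply_scheme(unsigned int region_nr_accesses, bool trace_enabled)
{
	unsigned int nr_accesses = 0;
	bool do_trace = false;

	if (trace_enabled) {
		/* Pay for the calculation only when someone will read it. */
		nr_accesses = fake_mvsum(region_nr_accesses);
		do_trace = true;
	}
	fake_trace(nr_accesses, do_trace);
}
```

With tracing disabled, the helper is never called and the zero
initializer is passed to a tracepoint that discards it.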