From: Liew Rui Yan <aethernet65535@gmail.com>
To: sj@kernel.org
Cc: damon@lists.linux.dev, linux-mm@kvack.org,
Liew Rui Yan <aethernet65535@gmail.com>
Subject: [RFC PATCH] mm/damon/ops-common: optimize damon_hot_score() using fls()
Date: Fri, 20 Mar 2026 15:24:31 +0800
Message-ID: <20260320072431.248235-1-aethernet65535@gmail.com>

The current implementation of damon_hot_score() uses a manual for-loop
to calculate the value of 'age_in_log'. The loop can be replaced by a
single fls() call, capped at DAMON_MAX_AGE_IN_LOG.

In a simulated performance test with 10,000,000 iterations, the
fls-based version showed lower latency:

- Average Latency: Reduced from ~9ns to ~1ns.
- P99 Latency: Reduced from ~60ns to ~41ns.
- Distribution: About 98% of loop-based samples fell into the 40-59ns
  range, while about 35% of fls-based samples moved down into the
  20-39ns range in the test environment.

These results come from a simulated kernel module test environment [1],
not from a live DAMON context, but they suggest that the fls-based
version does less work per call.

[1] https://github.com/aethernet65535/damon-hot-score-fls-optimize/blob/master/test-kernel-module/fls.c

Signed-off-by: Liew Rui Yan <aethernet65535@gmail.com>
---
Note on testing methodology:

I attempted to measure the performance directly within the kernel using
bpftrace, perf, and ktime inside damon_hot_score(). However, the ktime
results were highly unstable, and with perf and bpftrace the function
was difficult to trace reliably (likely due to my own limited tracing
experience).

Despite the instability of the in-kernel ktime measurements, one thing
remained consistent: the fls-based version significantly improves the
"long tail" latency compared to the for-loop.

Test results from the simulated module:

- fls-based:

  DAMON Perf Test: Starting 10000000 iterations
  =============================================
  Total Iterations : 10000000
  Average Latency  : 1 ns
  P95 Latency      : 40 ns
  P99 Latency      : 41 ns
  ---------------------------------------------
  Range (ns)  | Count      | Percent
  ---------------------------------------------
  20-39       | 3522000    | 35%
  40-59       | 6478000    | 64%
  60-79       | 0          | 0%
  =============================================

- for-loop:

  DAMON Perf Test: Starting 10000000 iterations
  =============================================
  Total Iterations : 10000000
  Average Latency  : 9 ns
  P95 Latency      : 51 ns
  P99 Latency      : 60 ns
  ---------------------------------------------
  Range (ns)  | Count      | Percent
  ---------------------------------------------
  20-39       | 0          | 0%
  40-59       | 9894000    | 98%
  60-79       | 98000      | 0%
  =============================================
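
A note on reading the Percent column: the values look like truncated
integer percentages, which would explain why the 60-79 bucket of the
for-loop run shows 0% with 98000 samples (98000 out of 10000000 is
0.98%). The following is a minimal sketch of that arithmetic, assuming
the test module computes count * 100 / total with integer division.
That is an assumption, not something checked against the module source,
and the helper names are hypothetical.

```c
#include <stdint.h>

/* Integer percentage with truncation, as the table appears to use
 * (assumption: the module computes count * 100 / total). */
static uint64_t percent_trunc(uint64_t count, uint64_t total)
{
	return total ? count * 100 / total : 0;
}

/* Same ratio in hundredths of a percent, so that small buckets stay
 * visible. Print as "%llu.%02llu%%" using value / 100 and value % 100. */
static uint64_t percent_x100(uint64_t count, uint64_t total)
{
	return total ? count * 10000 / total : 0;
}
```

With the counts above, percent_x100(98000, 10000000) yields 98, which
would print as 0.98% instead of 0%.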

Full raw benchmark results can be found at [2].

If anyone could suggest a more robust way to profile this specific
function within a live DAMON context, I would greatly appreciate the
guidance.

[2] https://github.com/aethernet65535/damon-hot-score-fls-optimize/tree/master/result-raw

 mm/damon/ops-common.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/mm/damon/ops-common.c b/mm/damon/ops-common.c
index 8c6d613425c1..0294de61a23a 100644
--- a/mm/damon/ops-common.c
+++ b/mm/damon/ops-common.c
@@ -117,9 +117,7 @@ int damon_hot_score(struct damon_ctx *c, struct damon_region *r,
 		damon_max_nr_accesses(&c->attrs);
 
 	age_in_sec = (unsigned long)r->age * c->attrs.aggr_interval / 1000000;
-	for (age_in_log = 0; age_in_log < DAMON_MAX_AGE_IN_LOG && age_in_sec;
-			age_in_log++, age_in_sec >>= 1)
-		;
+	age_in_log = min_t(int, fls(age_in_sec), DAMON_MAX_AGE_IN_LOG);
 
 	/* If frequency is 0, higher age means it's colder */
 	if (freq_subscore == 0)
-- 
2.53.0
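
The equivalence between the removed loop and the capped fls() call can
be checked outside the kernel. The following is a user-space sketch, not
the test module referenced in [1]. It assumes DAMON_MAX_AGE_IN_LOG is
32, and it emulates fls() with the GCC/Clang builtins __builtin_clz()
and __builtin_clzll(); all helper names are hypothetical. It also
illustrates one caveat: the kernel's fls() takes an unsigned int, while
age_in_sec is an unsigned long, so on 64-bit builds a value of 2^32 or
more would be truncated before the bit scan. Whether age_in_sec can
reach that range in practice has not been verified here.

```c
#include <assert.h>
#include <limits.h>

#define MAX_AGE_IN_LOG 32	/* assumed to mirror DAMON_MAX_AGE_IN_LOG */

/* Stand-in for fls(): 1-based index of the highest set bit, 0 for 0. */
static int fls_u32(unsigned int x)
{
	return x ? (int)(sizeof(x) * CHAR_BIT) - __builtin_clz(x) : 0;
}

/* 64-bit stand-in, comparable to fls_long() on a 64-bit kernel. */
static int fls_u64(unsigned long long x)
{
	return x ? 64 - __builtin_clzll(x) : 0;
}

/* The loop removed by the patch, unchanged apart from the cap name. */
static int age_in_log_loop(unsigned long long age_in_sec)
{
	int age_in_log;

	for (age_in_log = 0; age_in_log < MAX_AGE_IN_LOG && age_in_sec;
			age_in_log++, age_in_sec >>= 1)
		;
	return age_in_log;
}

/* Capped fls() on the value truncated to 32 bits, as in the patch. */
static int age_in_log_fls32(unsigned long long age_in_sec)
{
	int v = fls_u32((unsigned int)age_in_sec);

	return v < MAX_AGE_IN_LOG ? v : MAX_AGE_IN_LOG;
}

/* Capped 64-bit bit scan, which avoids the truncation. */
static int age_in_log_fls64(unsigned long long age_in_sec)
{
	int v = fls_u64(age_in_sec);

	return v < MAX_AGE_IN_LOG ? v : MAX_AGE_IN_LOG;
}

/* All three variants agree for every power of two below 2^32. */
static void check_age_in_log_variants(void)
{
	int shift;

	assert(age_in_log_loop(0) == 0);
	assert(age_in_log_fls32(0) == 0);
	assert(age_in_log_fls64(0) == 0);
	for (shift = 0; shift < 32; shift++) {
		unsigned long long v = 1ULL << shift;

		assert(age_in_log_loop(v) == shift + 1);
		assert(age_in_log_fls32(v) == shift + 1);
		assert(age_in_log_fls64(v) == shift + 1);
	}
}
```

For an input of 2^32 the loop and the 64-bit variant both return the cap
of 32, while the truncating variant returns 0. If such values are
reachable, a 64-bit bit scan such as fls_long() would keep the original
behavior.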