* [RFC PATCH v3 07/10] mm/damon/tests/core-kunit: test fail_charge_{num,denom} committing
2026-04-07 1:05 [RFC PATCH v3 00/10] mm/damon: introduce DAMOS failed region quota charge ratio SeongJae Park
@ 2026-04-07 1:05 ` SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 08/10] selftests/damon/_damon_sysfs: support failed region quota charge ratio SeongJae Park
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: SeongJae Park @ 2026-04-07 1:05 UTC (permalink / raw)
Cc: SeongJae Park, Andrew Morton, Brendan Higgins, David Gow, damon,
kunit-dev, linux-kernel, linux-kselftest, linux-mm
Extend damos_test_commit_quota() kunit test to ensure
damos_commit_quota() handles fail_charge_{num,denom} parameters.
Signed-off-by: SeongJae Park <sj@kernel.org>
---
mm/damon/tests/core-kunit.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/mm/damon/tests/core-kunit.h b/mm/damon/tests/core-kunit.h
index 0030f682b23b7..1b23a22ac04c4 100644
--- a/mm/damon/tests/core-kunit.h
+++ b/mm/damon/tests/core-kunit.h
@@ -694,6 +694,8 @@ static void damos_test_commit_quota(struct kunit *test)
.ms = 2,
.sz = 3,
.goal_tuner = DAMOS_QUOTA_GOAL_TUNER_CONSIST,
+ .fail_charge_num = 2,
+ .fail_charge_denom = 3,
.weight_sz = 4,
.weight_nr_accesses = 5,
.weight_age = 6,
@@ -703,6 +705,8 @@ static void damos_test_commit_quota(struct kunit *test)
.ms = 8,
.sz = 9,
.goal_tuner = DAMOS_QUOTA_GOAL_TUNER_TEMPORAL,
+ .fail_charge_num = 1,
+ .fail_charge_denom = 1024,
.weight_sz = 10,
.weight_nr_accesses = 11,
.weight_age = 12,
@@ -717,6 +721,8 @@ static void damos_test_commit_quota(struct kunit *test)
KUNIT_EXPECT_EQ(test, dst.ms, src.ms);
KUNIT_EXPECT_EQ(test, dst.sz, src.sz);
KUNIT_EXPECT_EQ(test, dst.goal_tuner, src.goal_tuner);
+ KUNIT_EXPECT_EQ(test, dst.fail_charge_num, src.fail_charge_num);
+ KUNIT_EXPECT_EQ(test, dst.fail_charge_denom, src.fail_charge_denom);
KUNIT_EXPECT_EQ(test, dst.weight_sz, src.weight_sz);
KUNIT_EXPECT_EQ(test, dst.weight_nr_accesses, src.weight_nr_accesses);
KUNIT_EXPECT_EQ(test, dst.weight_age, src.weight_age);
--
2.47.3
* [RFC PATCH v3 08/10] selftests/damon/_damon_sysfs: support failed region quota charge ratio
2026-04-07 1:05 [RFC PATCH v3 00/10] mm/damon: introduce DAMOS failed region quota charge ratio SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 07/10] mm/damon/tests/core-kunit: test fail_charge_{num,denom} committing SeongJae Park
@ 2026-04-07 1:05 ` SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 09/10] selftests/damon/drgn_dump_damon_status: " SeongJae Park
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: SeongJae Park @ 2026-04-07 1:05 UTC (permalink / raw)
Cc: SeongJae Park, Shuah Khan, damon, linux-kernel, linux-kselftest,
linux-mm
Extend _damon_sysfs.py for DAMOS action failed regions quota charge
ratio setup, so that we can add kselftest for the new feature.
Signed-off-by: SeongJae Park <sj@kernel.org>
---
tools/testing/selftests/damon/_damon_sysfs.py | 21 +++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/damon/_damon_sysfs.py b/tools/testing/selftests/damon/_damon_sysfs.py
index 120b96ecbd741..8b12cc0484405 100644
--- a/tools/testing/selftests/damon/_damon_sysfs.py
+++ b/tools/testing/selftests/damon/_damon_sysfs.py
@@ -132,14 +132,17 @@ class DamosQuota:
goals = None # quota goals
goal_tuner = None # quota goal tuner
reset_interval_ms = None # quota reset interval
+ fail_charge_num = None
+ fail_charge_denom = None
weight_sz_permil = None
weight_nr_accesses_permil = None
weight_age_permil = None
scheme = None # owner scheme
def __init__(self, sz=0, ms=0, goals=None, goal_tuner='consist',
- reset_interval_ms=0, weight_sz_permil=0,
- weight_nr_accesses_permil=0, weight_age_permil=0):
+ reset_interval_ms=0, fail_charge_num=0, fail_charge_denom=0,
+ weight_sz_permil=0, weight_nr_accesses_permil=0,
+ weight_age_permil=0):
self.sz = sz
self.ms = ms
self.reset_interval_ms = reset_interval_ms
@@ -151,6 +154,8 @@ class DamosQuota:
for idx, goal in enumerate(self.goals):
goal.idx = idx
goal.quota = self
+ self.fail_charge_num = fail_charge_num
+ self.fail_charge_denom = fail_charge_denom
def sysfs_dir(self):
return os.path.join(self.scheme.sysfs_dir(), 'quotas')
@@ -197,6 +202,18 @@ class DamosQuota:
os.path.join(self.sysfs_dir(), 'goal_tuner'), self.goal_tuner)
if err is not None:
return err
+
+ err = write_file(
+ os.path.join(self.sysfs_dir(), 'fail_charge_num'),
+ self.fail_charge_num)
+ if err is not None:
+ return err
+ err = write_file(
+ os.path.join(self.sysfs_dir(), 'fail_charge_denom'),
+ self.fail_charge_denom)
+ if err is not None:
+ return err
+
return None
class DamosWatermarks:
--
2.47.3
* [RFC PATCH v3 09/10] selftests/damon/drgn_dump_damon_status: support failed region quota charge ratio
2026-04-07 1:05 [RFC PATCH v3 00/10] mm/damon: introduce DAMOS failed region quota charge ratio SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 07/10] mm/damon/tests/core-kunit: test fail_charge_{num,denom} committing SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 08/10] selftests/damon/_damon_sysfs: support failed region quota charge ratio SeongJae Park
@ 2026-04-07 1:05 ` SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 10/10] selftests/damon/sysfs.py: test " SeongJae Park
2026-04-08 16:48 ` [RFC PATCH v3 00/10] mm/damon: introduce DAMOS " Bijan Tabatabai
4 siblings, 0 replies; 7+ messages in thread
From: SeongJae Park @ 2026-04-07 1:05 UTC (permalink / raw)
Cc: SeongJae Park, Shuah Khan, damon, linux-kernel, linux-kselftest,
linux-mm
Extend drgn_dump_damon_status.py to dump DAMON internal state for DAMOS
action failed regions quota charge ratio, to be able to show if the
internal state for the feature is working, with future DAMON selftests.
Signed-off-by: SeongJae Park <sj@kernel.org>
---
tools/testing/selftests/damon/drgn_dump_damon_status.py | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/damon/drgn_dump_damon_status.py b/tools/testing/selftests/damon/drgn_dump_damon_status.py
index 5b90eb8e7ef88..972948e6215f1 100755
--- a/tools/testing/selftests/damon/drgn_dump_damon_status.py
+++ b/tools/testing/selftests/damon/drgn_dump_damon_status.py
@@ -112,6 +112,8 @@ def damos_quota_to_dict(quota):
['goals', damos_quota_goals_to_list],
['goal_tuner', int],
['esz', int],
+ ['fail_charge_num', int],
+ ['fail_charge_denom', int],
['weight_sz', int],
['weight_nr_accesses', int],
['weight_age', int],
--
2.47.3
* [RFC PATCH v3 10/10] selftests/damon/sysfs.py: test failed region quota charge ratio
2026-04-07 1:05 [RFC PATCH v3 00/10] mm/damon: introduce DAMOS failed region quota charge ratio SeongJae Park
` (2 preceding siblings ...)
2026-04-07 1:05 ` [RFC PATCH v3 09/10] selftests/damon/drgn_dump_damon_status: " SeongJae Park
@ 2026-04-07 1:05 ` SeongJae Park
2026-04-08 16:48 ` [RFC PATCH v3 00/10] mm/damon: introduce DAMOS " Bijan Tabatabai
4 siblings, 0 replies; 7+ messages in thread
From: SeongJae Park @ 2026-04-07 1:05 UTC (permalink / raw)
Cc: SeongJae Park, Shuah Khan, damon, linux-kernel, linux-kselftest,
linux-mm
Extend sysfs.py DAMON selftest to set up DAMOS action failed region quota
charge ratio and assert the setup is made into DAMON internal state.
Signed-off-by: SeongJae Park <sj@kernel.org>
---
tools/testing/selftests/damon/sysfs.py | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/tools/testing/selftests/damon/sysfs.py b/tools/testing/selftests/damon/sysfs.py
index b8d6e0f8fd906..d8d4717128898 100755
--- a/tools/testing/selftests/damon/sysfs.py
+++ b/tools/testing/selftests/damon/sysfs.py
@@ -73,6 +73,10 @@ def assert_quota_committed(quota, dump):
}
assert_true(dump['goal_tuner'] == tuner_val[quota.goal_tuner],
'goal_tuner', dump)
+ assert_true(dump['fail_charge_num'] == quota.fail_charge_num,
+ 'fail_charge_num', dump)
+ assert_true(dump['fail_charge_denom'] == quota.fail_charge_denom,
+ 'fail_charge_denom', dump)
assert_true(dump['weight_sz'] == quota.weight_sz_permil, 'weight_sz', dump)
assert_true(dump['weight_nr_accesses'] == quota.weight_nr_accesses_permil,
'weight_nr_accesses', dump)
@@ -277,6 +281,8 @@ def main():
nid=1)],
goal_tuner='temporal',
reset_interval_ms=1500,
+ fail_charge_num=1,
+ fail_charge_denom=4096,
weight_sz_permil=20,
weight_nr_accesses_permil=200,
weight_age_permil=1000),
--
2.47.3
* Re: [RFC PATCH v3 00/10] mm/damon: introduce DAMOS failed region quota charge ratio
2026-04-07 1:05 [RFC PATCH v3 00/10] mm/damon: introduce DAMOS failed region quota charge ratio SeongJae Park
` (3 preceding siblings ...)
2026-04-07 1:05 ` [RFC PATCH v3 10/10] selftests/damon/sysfs.py: test " SeongJae Park
@ 2026-04-08 16:48 ` Bijan Tabatabai
2026-04-09 0:00 ` SeongJae Park
4 siblings, 1 reply; 7+ messages in thread
From: Bijan Tabatabai @ 2026-04-08 16:48 UTC (permalink / raw)
To: SeongJae Park
Cc: Bijan Tabatabai, Liam R. Howlett, Andrew Morton, Brendan Higgins,
David Gow, David Hildenbrand, Jonathan Corbet, Lorenzo Stoakes,
Michal Hocko, Mike Rapoport, Shuah Khan, Shuah Khan,
Suren Baghdasaryan, Vlastimil Babka, damon, kunit-dev, linux-doc,
linux-kernel, linux-kselftest, linux-mm
On Mon, 6 Apr 2026 18:05:22 -0700 SeongJae Park <sj@kernel.org> wrote:
Hi SJ,
> TL; DR: Let users set different DAMOS quota charge ratios for DAMOS
> action failed regions, for deterministic and consistent DAMOS action
> progress.
>
> Common Reports: Unexpectedly Slow DAMOS
> =======================================
>
> One common issue report that we get from DAMON users is that DAMOS
> action applying progress speed is sometimes much slower than expected.
> And one common root cause is that the DAMOS quota is exceeded by the
> action applying failed memory regions.
>
> For example, a group of users tried to run DAMOS-based proactive memory
> reclamation (DAMON_RECLAIM) with 100 MiB per second DAMOS quota. They
> ran it on a system having no active workload which means all memory of
> the system is cold. The expectation was that the system will show 100
> MiB per second reclamation until (nearly) all memory is reclaimed. But
> what they found is that the speed is quite inconsistent and sometimes it
> becomes much slower than expected, sometimes even showing no reclamation
> at all for tens of seconds. The upper limit of the speed (100 MiB
> per second) was being kept as expected, though.
>
> By monitoring the qt_exceeds (number of DAMOS quota exceed events) DAMOS
> stat, we found DAMOS quota is always exceeded when the speed is slow. By
> monitoring sz_tried and sz_applied (the total amount of DAMOS action
> tried memory and succeeded memory) DAMOS stats together, we found the
> reclamation attempts nearly always failed when the speed is slow.
>
> DAMOS quota charges DAMOS action tried regions regardless of whether the
> try succeeded. In the reported example, unreclaimable memory was spread
> around the system memory. Sometimes nearly 100 MiB of memory that DAMOS
> tried to reclaim in the given quota interval was reclaimable, and
> therefore it showed nearly 100 MiB per second speed. Sometimes nearly 99
> MiB of memory that DAMOS tried to reclaim in the given quota interval was
> unreclaimable, and therefore it showed only about 1 MiB per second
> reclaim speed.
>
> We explained it is an expected behavior of the feature rather than a
> bug, as DAMOS quota is there for only the upper-limit of the speed. The
> users agreed and later reported a huge win from the adoption of
> DAMON_RECLAIM on their products.
Thanks for this series. This is a problem I have come across and am looking
forward to seeing this land.
> It is Not a Bug but a Feature; But...
> =====================================
>
> So nothing is broken. DAMOS quota is working as intended, as the upper
> limit of the speed. It also provides observability of its behavior via
> DAMOS stats. In real-world production environments that run long-term
> active workloads and where stability matters, the speed sometimes being
> slow is not a real problem.
>
> But the non-deterministic behavior is sometimes annoying, especially in
> lab environments. Even in a realistic production environment, when
> there is a huge amount of memory that the DAMOS action cannot be applied
> to, the speed could be problematically slow. Suppose a virtual machine
> provider sets up 99% of the host memory as hugetlb pages that cannot be
> reclaimed, to give it to virtual machines. Also, when aim-oriented
> DAMOS auto-tuning is applied, this could confuse the internal feedback
> loop.
>
> The intention of the current behavior was that trying a DAMOS action on
> regions would impose some overhead anyway, and therefore should somehow
> be charged. But in the real world, the overhead of a failed action is
> much lighter than that of a successful action. Charging those at the
> same ratio may be unfair, or at least suboptimal in some environments.
>
> DAMOS Action Failed Region Quota Charge Ratio
> =============================================
>
> Let users set the charge ratio for the action-failed memory, for more
> optimal and deterministic use of DAMOS. It allows users to specify the
> numerator and the denominator of the ratio for flexible setup. For
> example, let's suppose the numerator and the denominator are set to 1
> and 4,096, respectively. The ratio is 1 / 4,096. A DAMOS scheme action
> is applied to 5 GiB of memory. For 1 GiB of the memory, the action
> succeeds. For the rest (4 GiB), the action fails. Then, only 1 GiB +
> 1 MiB of quota is charged.
>
> The optimal charge ratio will depend on the use case and
> system/workload. I'd recommend starting by setting the numerator to 1
> and the denominator to PAGE_SIZE, then tuning based on the results, because
> many DAMOS actions are applied at page level.
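The arithmetic in the quoted example can be checked with a minimal sketch like the one below. Note that charged_quota() is a hypothetical helper written only for this illustration, not a kernel or selftest function, and the integer rounding is an assumption. The sketch also does not model what a zero denominator means; it simply rejects it.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def charged_quota(sz_applied, sz_failed, fail_charge_num, fail_charge_denom):
    """Return the bytes charged to the quota (hypothetical helper).

    Successfully applied bytes are charged in full. Failed bytes are
    charged at fail_charge_num / fail_charge_denom. Integer division is
    an assumption of this sketch.
    """
    if fail_charge_denom <= 0:
        # This sketch does not model a zero or negative denominator.
        raise ValueError('fail_charge_denom must be positive in this sketch')
    return sz_applied + sz_failed * fail_charge_num // fail_charge_denom


# The example from the cover letter: 5 GiB tried, 1 GiB succeeded,
# 4 GiB failed, ratio 1 / 4096.
charged = charged_quota(1 * GIB, 4 * GIB, 1, 4096)
print(charged == 1 * GIB + 1 * MIB)
```

With a ratio of 1 / 1, the same helper charges the full 5 GiB, which corresponds to the behavior the cover letter describes for the current code.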
This makes sense, but the quota is also considered when setting the minimum
allowable score in damos_adjust_quota(), which, to my understanding, assumes
that the action will be applied to all of a region's data. If an action fails for
a significant amount of the memory, a lower score than what was calculated in
damos_adjust_quota() could be valid. If that's the case, the scheme would be
applied to fewer regions than strictly necessary.
As you mention above, this is not a correctness issue because the quota only
guarantees an upper limit on the amount of data the scheme is applied to.
Additionally, it may very well be true that what I listed above would not be
very noticeable in practice. I just thought this was worth pointing out as
something to think about.
Thanks,
Bijan
<snip>
Sent using hkml (https://github.com/sjp38/hackermail)
* Re: [RFC PATCH v3 00/10] mm/damon: introduce DAMOS failed region quota charge ratio
2026-04-08 16:48 ` [RFC PATCH v3 00/10] mm/damon: introduce DAMOS " Bijan Tabatabai
@ 2026-04-09 0:00 ` SeongJae Park
0 siblings, 0 replies; 7+ messages in thread
From: SeongJae Park @ 2026-04-09 0:00 UTC (permalink / raw)
To: Bijan Tabatabai
Cc: SeongJae Park, Liam R. Howlett, Andrew Morton, Brendan Higgins,
David Gow, David Hildenbrand, Jonathan Corbet, Lorenzo Stoakes,
Michal Hocko, Mike Rapoport, Shuah Khan, Shuah Khan,
Suren Baghdasaryan, Vlastimil Babka, damon, kunit-dev, linux-doc,
linux-kernel, linux-kselftest, linux-mm
On Wed, 8 Apr 2026 11:48:27 -0500 Bijan Tabatabai <bijan311@gmail.com> wrote:
> On Mon, 6 Apr 2026 18:05:22 -0700 SeongJae Park <sj@kernel.org> wrote:
>
> Hi SJ,
>
> > TL; DR: Let users set different DAMOS quota charge ratios for DAMOS
> > action failed regions, for deterministic and consistent DAMOS action
> > progress.
> >
> > Common Reports: Unexpectedly Slow DAMOS
> > =======================================
> >
> > One common issue report that we get from DAMON users is that DAMOS
> > action applying progress speed is sometimes much slower than expected.
> > And one common root cause is that the DAMOS quota is exceeded by the
> > action applying failed memory regions.
> >
> > For example, a group of users tried to run DAMOS-based proactive memory
> > reclamation (DAMON_RECLAIM) with 100 MiB per second DAMOS quota. They
> > ran it on a system having no active workload which means all memory of
> > the system is cold. The expectation was that the system will show 100
> > MiB per second reclamation until (nearly) all memory is reclaimed. But
> > what they found is that the speed is quite inconsistent and sometimes it
> > becomes much slower than expected, sometimes even showing no reclamation
> > at all for tens of seconds. The upper limit of the speed (100 MiB
> > per second) was being kept as expected, though.
> >
> > By monitoring the qt_exceeds (number of DAMOS quota exceed events) DAMOS
> > stat, we found DAMOS quota is always exceeded when the speed is slow. By
> > monitoring sz_tried and sz_applied (the total amount of DAMOS action
> > tried memory and succeeded memory) DAMOS stats together, we found the
> > reclamation attempts nearly always failed when the speed is slow.
> >
> > DAMOS quota charges DAMOS action tried regions regardless of whether the
> > try succeeded. In the reported example, unreclaimable memory was spread
> > around the system memory. Sometimes nearly 100 MiB of memory that DAMOS
> > tried to reclaim in the given quota interval was reclaimable, and
> > therefore it showed nearly 100 MiB per second speed. Sometimes nearly 99
> > MiB of memory that DAMOS tried to reclaim in the given quota interval was
> > unreclaimable, and therefore it showed only about 1 MiB per second
> > reclaim speed.
> >
> > We explained it is an expected behavior of the feature rather than a
> > bug, as DAMOS quota is there for only the upper-limit of the speed. The
> > users agreed and later reported a huge win from the adoption of
> > DAMON_RECLAIM on their products.
>
> Thanks for this series. This is a problem I have come across and am looking
> forward to seeing this land.
Thank you for acknowledging. I'm hoping this will land in 7.2-rc1.
[...]
> > DAMOS Action Failed Region Quota Charge Ratio
> > =============================================
> >
> > Let users set the charge ratio for the action-failed memory, for more
> > optimal and deterministic use of DAMOS. It allows users to specify the
> > numerator and the denominator of the ratio for flexible setup. For
> > example, let's suppose the numerator and the denominator are set to 1
> > and 4,096, respectively. The ratio is 1 / 4,096. A DAMOS scheme action
> > is applied to 5 GiB of memory. For 1 GiB of the memory, the action
> > succeeds. For the rest (4 GiB), the action fails. Then, only 1 GiB +
> > 1 MiB of quota is charged.
> >
> > The optimal charge ratio will depend on the use case and
> > system/workload. I'd recommend starting by setting the numerator to 1
> > and the denominator to PAGE_SIZE, then tuning based on the results, because
> > many DAMOS actions are applied at page level.
>
> This makes sense, but the quota is also considered when setting the minimum
> allowable score in damos_adjust_quota(), which, to my understanding, assumes
> that the action will be applied to all of a region's data. If an action fails for
> a significant amount of the memory, a lower score than what was calculated in
> damos_adjust_quota() could be valid. If that's the case, the scheme would be
> applied to fewer regions than strictly necessary.
Good point, you are right.
>
> As you mention above, this is not a correctness issue because the quota only
> guarantees an upper limit on the amount of data the scheme is applied to.
I agree.
> Additionally, it may very well be true that what I listed above would not be
> very noticeable in practice.
I hope that is true, for the following reason.
The score for each region is calculated as a weighted sum of the access
frequency and the age of the region. To avoid the DAMOS action being repeatedly
applied to only a few regions, we reset the age of a region after a DAMOS action
is applied to it, regardless of whether the action failed. So the score of
regions containing memory that the action cannot be applied to will periodically
get low, and have no big impact on the minimum score threshold calculation.
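To illustrate the reasoning above, here is a toy sketch of how resetting the age lowers a region's score. It is not the kernel's scoring code: cold_score(), its normalization to [0, 100], and the equal default weights are all assumptions made for this example, modeling a reclaim-like scheme where colder and older regions score higher.

```python
def cold_score(nr_accesses, age, max_nr_accesses, max_age,
               weight_freq=1, weight_age=1):
    """Toy coldness score in [0, 100] (hypothetical, not the kernel formula).

    The score is a weighted sum of how rarely the region is accessed and
    how long it has kept that access pattern.
    """
    freq_part = (max_nr_accesses - nr_accesses) * 100 // max_nr_accesses
    age_part = min(age, max_age) * 100 // max_age
    total_weight = weight_freq + weight_age
    return (weight_freq * freq_part + weight_age * age_part) // total_weight


# A never-accessed region that has been cold for a long time.
before = cold_score(nr_accesses=0, age=100, max_nr_accesses=20, max_age=100)
# The same region right after an action attempt, which resets its age
# whether or not the action succeeded.
after = cold_score(nr_accesses=0, age=0, max_nr_accesses=20, max_age=100)
print(before, after)
```

In this toy model the region drops from the top score to the middle of the range after the age reset, so it competes less strongly in the next round until its age builds up again.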
But real data could say something different. I will be happy to be proven
wrong by real data. :)
> I just thought this was worth pointing out as
> something to think about.
Indeed. Thank you for pointing it out. Nonetheless, this is not a new issue
introduced by this patch series. And the impact is not clear at the moment. I
will be happy to revisit this in parallel with this patch series.
Thanks,
SJ
[...]