From: SeongJae Park <sj@kernel.org>
Cc: SeongJae Park, "Liam R. Howlett", Andrew Morton, Brendan Higgins,
 David Gow, David Hildenbrand, Jonathan Corbet, Lorenzo Stoakes,
 Michal Hocko, Mike Rapoport, Shuah Khan, Suren Baghdasaryan,
 Vlastimil Babka, damon@lists.linux.dev, kunit-dev@googlegroups.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v3 00/10] mm/damon: introduce DAMOS failed region quota charge ratio
Date: Mon, 6 Apr 2026 18:05:22 -0700
Message-ID: <20260407010536.83603-1-sj@kernel.org>

TL;DR: Let users set different DAMOS quota charge ratios for DAMOS
action-failed regions, for deterministic and consistent DAMOS
action progress.

Common Reports: Unexpectedly Slow DAMOS
=======================================

One common issue report that we receive from DAMON users is that the
DAMOS action applying progress is sometimes much slower than expected.
And one common root cause is that the DAMOS quota is consumed by memory
regions on which the action failed to apply.

For example, a group of users tried to run DAMOS-based proactive memory
reclamation (DAMON_RECLAIM) with a 100 MiB per second DAMOS quota.  They
ran it on a system having no active workload, which means all memory of
the system is cold.  The expectation was that the system would show
100 MiB per second reclamation until (nearly) all memory is reclaimed.
But what they found is that the speed was quite inconsistent, sometimes
much slower than expected, and sometimes there was even no reclamation
at all for tens of seconds.  The upper limit of the speed (100 MiB per
second) was being kept as expected, though.

By monitoring the qt_exceeds (number of DAMOS quota exceed events) DAMOS
stat, we found that the DAMOS quota was always exceeded when the speed
was slow.  By also monitoring the sz_tried and sz_applied (the total
amount of memory that the DAMOS action was tried on and succeeded on)
DAMOS stats, we found that the reclamation attempts nearly always failed
when the speed was slow.

DAMOS quota charges DAMOS action tried regions regardless of whether the
attempt succeeds.  Hence, in the reported case, there was unreclaimable
memory spread around the system memory.  Sometimes nearly 100 MiB of the
memory that DAMOS tried to reclaim in the given quota interval was
reclaimable, and the speed was therefore nearly 100 MiB per second.
Sometimes nearly 99 MiB of the memory that DAMOS tried to reclaim in the
given quota interval was unreclaimable, and the speed was therefore only
about 1 MiB per second.

We explained that this is an expected behavior of the feature rather
than a bug, as the DAMOS quota is there only as the upper limit of the
speed.  The users agreed, and later reported a huge win from the
adoption of DAMON_RECLAIM on their products.

It is Not a Bug but a Feature; But...
=====================================

So nothing is broken.  DAMOS quota is working as intended, as the upper
limit of the speed.  It also provides observability of its behavior via
DAMOS stats.  In real world production environments that run long-term
active workloads and care about stability, the speed sometimes being
slow is not a real problem.

But the non-deterministic behavior is sometimes annoying, especially in
lab environments.  Even in a realistic production environment, when
there is a huge amount of memory that the DAMOS action cannot be applied
to, the speed could be problematically slow.  Suppose, for example, a
virtual machine provider that sets up 99% of the host memory as hugetlb
pages, which cannot be reclaimed, to give the memory to virtual
machines.  Also, when aim-oriented DAMOS auto-tuning is applied, this
behavior could confuse the internal feedback loop.

The intention of the current behavior was that trying a DAMOS action on
a region imposes some overhead anyway, and should therefore somehow be
charged.  But in the real world, the overhead of a failed action is much
lighter than that of a successful one.  Charging both at the same ratio
may be unfair, or at least suboptimal in some environments.

DAMOS Action Failed Region Quota Charge Ratio
=============================================

Let users set the charge ratio for the action-failed memory, for more
optimal and deterministic use of DAMOS.  It allows users to specify the
numerator and the denominator of the ratio for flexible setup.

For example, suppose the numerator and the denominator are set to 1 and
4,096, respectively, so the ratio is 1 / 4,096.  A DAMOS scheme action
is applied to 5 GiB of memory.  For 1 GiB of the memory, the action
succeeds.  For the rest (4 GiB), the action fails.  Then, only 1 GiB
plus 1 MiB of quota is charged.
The optimal charge ratio will depend on the use case and the
system/workload.  I'd recommend starting by setting the numerator to 1
and the denominator to PAGE_SIZE, and tuning based on the results, since
many DAMOS actions are applied at page granularity.

Tests
=====

I tested this feature in the steps below.

1. Allocate 50% of system memory and mlock() it using a test program.
2. Fill up the page cache to exhaust nearly all free memory.
3. Start DAMON-based proactive reclamation with a 100 MiB/second DAMOS
   hard quota.  Auto-tune the DAMOS soft quota under the hard quota,
   aiming for 40% free memory of the system, using the 'temporal' tuner.

For step 1, I ran a simple C program that was written by Gemini.  It is
quite straightforward, so I'm not sharing the code here.

For step 2, I used a dd command like below:

    dd if=/dev/zero of=foo bs=1M count=$50_percent_of_system_memory

For step 3, I used the latest version of the DAMON user-space tool
(damo), like below.

    sudo damo start --damos_action pageout \
        ` # Do the pageout only up to 100 MiB per second ` \
        --damos_quota_space 100M --damos_quota_interval 1s \
        ` # Auto-tune the quota below the hard quota, aiming for ` \
        ` # 40% free memory of node 0 ` \
        ` # (the entire node of the test system) ` \
        --damos_quota_goal node_mem_free_bp 40% 0 \
        ` # use the temporal tuner, which is easy to understand ` \
        --damos_quota_goal_tuner temporal

As expected, the progress of the reclamation was not consistent, because
the quota was exceeded by the failed reclamation attempts on the
unreclaimable memory.

I did this again, but with the failed region charge ratio feature.  For
this, the above 'damo' command was used, after appending the command
line option for setting the charge ratio, like below.  Note that the
option was added to 'damo' after v3.1.9.

    sudo ./damo start --damos_action pageout \
        [...]
        ` # quota-charge only 1/4096 for pageout-failed regions ` \
        --damos_quota_fail_charge_ratio 1 4096

The progress of the reclamation was nearly 100 MiB per second until the
goal was achieved, meeting the expectation.

Patches Sequence
================

Patch 1 updates the fully-charged quota check to handle