public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed
* [Question] mm/damon: min_nr_regions constraint and inconsistent error handling
@ 2026-03-18 15:37 Liew Rui Yan
  2026-03-19  1:23 ` SeongJae Park
  0 siblings, 1 reply; 5+ messages in thread
From: Liew Rui Yan @ 2026-03-18 15:37 UTC (permalink / raw)
  To: sj; +Cc: damon, linux-mm

Hi SeongJae,

TL;DR
=====
1. Why does DAMON require min_nr_regions >= 3? Is there a technical
   rationale?
2. Could we improve error reporting when sysfs parameter validation
   fails? Or document it?
3. [Bug?] DAMON_LRU_SORT doesn't disable itself when commit_inputs fails,
   contrary to documentation.

Details below. Happy to help with patches if helpful.

Details
=======

1. Error feedback for min_nr_regions validation
-----------------------------------------------

While configuring DAMON_LRU_SORT via sysfs, I noticed that setting
'min_nr_regions' to a value < 3 (e.g., 1) succeeds at write time, but
the error only appears later when writing 'Y' to the 'enabled' file,
which returns -EINVAL.

Since multiple parameters (intervals, watermarks, etc.) are often
configured together, a generic -EINVAL at enablement makes it difficult
to identify which specific parameter violated constraints.

Code reference: mm/damon/core.c:damon_set_attrs()

    if (attrs->min_nr_regions < 3)
        return -EINVAL;

Question:
1. What is the design rationale for requiring min_nr_regions >= 3? (eg.,
   algorithmic requirements, sampling accuracy, etc.)
2. Would it be acceptable to improve user feedback? For example:
   - Document this lower bound explicitly in
     Documentation/admin-guide/mm/damon/lru_sort.rst
   - Add pr_warn()/pr_debug() in damon_set_attrs() to log which
     parameter failed validation and why

I believe clearer errors or document would make DAMON tuning more
intuitive for sysadmins and developers.

---

2. Potential inconsistency in DAMON_LRU_SORT error handling
-----------------------------------------------------------

The documentation [1] states:
> "If invalid parameters are found while the re-reading, DAMON_LRU_SORT
 will be disabled."

However, testing on v7.0-rc4 shows that when commit_inputs fails (e.g.,
due to min_nr_regions < 3):
- The 'enabled' parameter remains 'Y'
- The kdamond thread continues running

This behavior may confuse users who expect the module to safely disable
itself upon configuration errors. Should commit_inputs failures trigger
an explicit disable (enabled=N)?

Reproduction steps:
-------------------
VM: virtme-ng 1.40 (x86_64)
Kernel: v7.0-rc4

    cd /sys/module/damon_lru_sort/parameters
    cat enabled 
    N

    # Initial setup - works fine
    echo 65535 > max_nr_regions 
    echo 3 > min_nr_regions 
    echo Y > enabled 
    cat kdamond_pid 
    91

    # Observed behavior:
    echo 1 > min_nr_regions 
    echo Y > commit_inputs 
    cat kdamond_pid 
    91
    cat enabled 
    Y

    # [kdamond.0] still alive
    ps aux | rg "kdamond"
    root        91  0.0  0.0      0     0 ?        I    22:03   0:00 [kdamond.0]

If this is confirmed as unintended behavior, I'd be glad to help
investigate and submit a fix.

---

Thanks for your time reviewing this!

Best regards,
Rui Yan

[1] https://docs.kernel.org/admin-guide/mm/damon/lru_sort.html

[...]


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Question] mm/damon: min_nr_regions constraint and inconsistent error handling
  2026-03-18 15:37 [Question] mm/damon: min_nr_regions constraint and inconsistent error handling Liew Rui Yan
@ 2026-03-19  1:23 ` SeongJae Park
  2026-03-19  1:46   ` SeongJae Park
  0 siblings, 1 reply; 5+ messages in thread
From: SeongJae Park @ 2026-03-19  1:23 UTC (permalink / raw)
  To: Liew Rui Yan; +Cc: SeongJae Park, damon, linux-mm

Hi Liew,

On Wed, 18 Mar 2026 23:37:31 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:

> Hi SeongJae,
> 
> TL;DR
> =====
> 1. Why does DAMON require min_nr_regions >= 3? Is there a technical
>    rationale?
> 2. Could we improve error reporting when sysfs parameter validation
>    fails? Or document it?
> 3. [Bug?] DAMON_LRU_SORT doesn't disable itself when commit_inputs fails,
>    contrary to documentation.

Thank you for nice questions, I will reply below in line.

> 
> Details below. Happy to help with patches if helpful.
> 
> Details
> =======
> 
> 1. Error feedback for min_nr_regions validation
> -----------------------------------------------
> 
> While configuring DAMON_LRU_SORT via sysfs, I noticed that setting
> 'min_nr_regions' to a value < 3 (e.g., 1) succeeds at write time, but
> the error only appears later when writing 'Y' to the 'enabled' file,
> which returns -EINVAL.
> 
> Since multiple parameters (intervals, watermarks, etc.) are often
> configured together, a generic -EINVAL at enablement makes it difficult
> to identify which specific parameter violated constraints.
> 
> Code reference: mm/damon/core.c:damon_set_attrs()
> 
>     if (attrs->min_nr_regions < 3)
>         return -EINVAL;
> 
> Question:
> 1. What is the design rationale for requiring min_nr_regions >= 3? (eg.,
>    algorithmic requirements, sampling accuracy, etc.)

The initial version of DAMON was supporting only 'vaddr' operation set.  And
'vaddr' is designed to have at least three regions, for the two large unmapped
areas on normal virtual address spaces.  Hence we enforced min_nr_regions to be
three or larger.

There is no real reason to keep it for 'fvaddr' and 'paddr'.  Maybe we can lift
the restriction for 'fvaddr' and 'paddr' operation sets.  But, we found no real
reason to lift it, either.  As a result, the restriction is still there.  If
you find a reason to lift the restriction, please feel free to let me know, or
send patches for that.

The VMA-based target address range construction desgin doc [1] should explain
more details about the two large unmapped areas.  Please feel free to ask more
clarifications if it doesn't give you enough details.

> 2. Would it be acceptable to improve user feedback? For example:
>    - Document this lower bound explicitly in
>      Documentation/admin-guide/mm/damon/lru_sort.rst

Yes, I think this is a nice idea.  Maybe adding a comment on damon_set_attrs(),
or additional sentences to 'Monitoring' section of design.rst [2] would be
appropriate.

>    - Add pr_warn()/pr_debug() in damon_set_attrs() to log which
>      parameter failed validation and why

I'd like to be cautious about adding logging, as it might be a spam.  I'd
prefer making it documented well.  Then, users could find what is wrong.

> 
> I believe clearer errors or document would make DAMON tuning more
> intuitive for sysadmins and developers.

I agree, the documentation should be helpful.

> 
> ---
> 
> 2. Potential inconsistency in DAMON_LRU_SORT error handling
> -----------------------------------------------------------
> 
> The documentation [1] states:
> > "If invalid parameters are found while the re-reading, DAMON_LRU_SORT
>  will be disabled."
> 
> However, testing on v7.0-rc4 shows that when commit_inputs fails (e.g.,
> due to min_nr_regions < 3):
> - The 'enabled' parameter remains 'Y'
> - The kdamond thread continues running
> 
> This behavior may confuse users who expect the module to safely disable
> itself upon configuration errors. Should commit_inputs failures trigger
> an explicit disable (enabled=N)?

Nice finding!  In short, it is an intended behavior, but the document is
outdated.

In the past, the documentation was correct.  When a wrong input is committed,
DAMON_LRU_SORT was stopped.  That's because it was using damon_callback, which
is now removed [3].  The callback was executed inside DAMON main loop, and when
the callback returns an error, the loop was broken.  As a result, DAMON
execution was stopped.  DAMON_LRU_SORT was setting the parameter inside the
callback, and was returning an error when the parameter is invalid.  Hence the
behavior was documented as you noticed.

Commit a30969436428 ("mm/damon/lru_sort: use damon_commit_ctx()") changed
the behavior.  It changed DAMON_LRU_SORT to use damon_call() instead of
damon_callback.  Because damon_call()'s callback function failure doesn't stop
DAMON, the current behavior is made.

Still, committing wrong DAMON_LRU_SORT parameters returns an error to the user.
Hence user can notify it was failed.  The invalid parameters are not really be
committed to DAMON, so DAMON continues running with valid parameters.  So this
is technically speaking a visible user behavior change.  But we believed it is
ok because we were unable to expect users who depend on the old behavior.  I
think I discussed this somewhere, but I cannot find the link right now...

> 
> Reproduction steps:
> -------------------
> VM: virtme-ng 1.40 (x86_64)
> Kernel: v7.0-rc4
> 
>     cd /sys/module/damon_lru_sort/parameters
>     cat enabled 
>     N
> 
>     # Initial setup - works fine
>     echo 65535 > max_nr_regions 
>     echo 3 > min_nr_regions 
>     echo Y > enabled 
>     cat kdamond_pid 
>     91
> 
>     # Observed behavior:
>     echo 1 > min_nr_regions 
>     echo Y > commit_inputs 

I believe the above command should returned an error.  If not, please let me
know.

>     cat kdamond_pid 
>     91
>     cat enabled 
>     Y
> 
>     # [kdamond.0] still alive
>     ps aux | rg "kdamond"
>     root        91  0.0  0.0      0     0 ?        I    22:03   0:00 [kdamond.0]
> 
> If this is confirmed as unintended behavior, I'd be glad to help
> investigate and submit a fix.

So, this is an intended behavior.  But the document is outdated.  It would be
nice if you could send a patch for fixing the document.

Also, if you know users who depend on the old behavior and therefore hoping the
old behavior back, please let me know.

> 
> ---
> 
> Thanks for your time reviewing this!

Thank you for sharing your findings and questions!

> 
> Best regards,
> Rui Yan
> 
> [1] https://docs.kernel.org/admin-guide/mm/damon/lru_sort.html
> 
> [...]

[1] https://docs.kernel.org/mm/damon/design.html#vma-based-target-address-range-construction
[2] https://docs.kernel.org/mm/damon/design.html#monitoring
[3] commit 5add26c0a186 ("mm/damon/core: remove damon_callback")


Thanks,
SJ


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Question] mm/damon: min_nr_regions constraint and inconsistent error handling
  2026-03-19  1:23 ` SeongJae Park
@ 2026-03-19  1:46   ` SeongJae Park
  2026-03-19  8:36     ` Liew Rui Yan
  0 siblings, 1 reply; 5+ messages in thread
From: SeongJae Park @ 2026-03-19  1:46 UTC (permalink / raw)
  To: SeongJae Park; +Cc: Liew Rui Yan, damon, linux-mm

On Wed, 18 Mar 2026 18:23:36 -0700 SeongJae Park <sj@kernel.org> wrote:

> Hi Liew,
> 
> On Wed, 18 Mar 2026 23:37:31 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:
[...]
> > 2. Potential inconsistency in DAMON_LRU_SORT error handling
> > -----------------------------------------------------------
> > 
> > The documentation [1] states:
> > > "If invalid parameters are found while the re-reading, DAMON_LRU_SORT
> >  will be disabled."
> > 
> > However, testing on v7.0-rc4 shows that when commit_inputs fails (e.g.,
> > due to min_nr_regions < 3):
> > - The 'enabled' parameter remains 'Y'
> > - The kdamond thread continues running
> > 
> > This behavior may confuse users who expect the module to safely disable
> > itself upon configuration errors. Should commit_inputs failures trigger
> > an explicit disable (enabled=N)?
> 
> Nice finding!  In short, it is an intended behavior, but the document is
> outdated.
> 
> In the past, the documentation was correct.  When a wrong input is committed,
> DAMON_LRU_SORT was stopped.  That's because it was using damon_callback, which
> is now removed [3].  The callback was executed inside DAMON main loop, and when
> the callback returns an error, the loop was broken.  As a result, DAMON
> execution was stopped.  DAMON_LRU_SORT was setting the parameter inside the
> callback, and was returning an error when the parameter is invalid.  Hence the
> behavior was documented as you noticed.
> 
> Commit a30969436428 ("mm/damon/lru_sort: use damon_commit_ctx()") changed
> the behavior.  It changed DAMON_LRU_SORT to use damon_call() instead of
> damon_callback.  Because damon_call()'s callback function failure doesn't stop
> DAMON, the current behavior is made.
> 
> Still, committing wrong DAMON_LRU_SORT parameters returns an error to the user.
> Hence user can notify it was failed.  The invalid parameters are not really be
> committed to DAMON, so DAMON continues running with valid parameters.  So this
> is technically speaking a visible user behavior change.  But we believed it is
> ok because we were unable to expect users who depend on the old behavior.  I
> think I discussed this somewhere, but I cannot find the link right now...

I found it.  Commit 4c9ea539ad59 ("mm/damon/sysfs: validate user inputs from
damon_sysfs_commit_input()") discuss this.  To quote:

    Online DAMON parameters commit via DAMON sysfs interface can make kdamond
    stop.  This behavior was made because it can make the implementation
    simpler.  The implementation tries committing the parameter without
    validation.  If it finds something wrong in the middle of the parameters
    update, it returns error without reverting the partially committed
    parameters back.  It is safe though, since it immediately breaks kdamond
    main loop in the case of the error return.

    Users can make the wrong parameters by mistake, though.  Stopping kdamond
    in the case is not very useful behavior.  Also this makes it difficult to
    utilize damon_call() instead of damon_callback hook for online parameters
    update, since damon_call() cannot immediately break kdamond main loop in
    the middle.

    Validate the input parameters and return error when it fails before
    starting parameters updates.  In case of mistakenly wrong parameters,
    kdamond can continue running with the old and valid parameters.


Thanks,
SJ

[...]


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Question] mm/damon: min_nr_regions constraint and inconsistent error handling
  2026-03-19  1:46   ` SeongJae Park
@ 2026-03-19  8:36     ` Liew Rui Yan
  2026-03-19 15:15       ` SeongJae Park
  0 siblings, 1 reply; 5+ messages in thread
From: Liew Rui Yan @ 2026-03-19  8:36 UTC (permalink / raw)
  To: sj; +Cc: aethernet65535, damon, linux-mm

On Thu, Mar 19, 2026 at 9:46 AM SeongJae Park <sj@kernel.org> wrote:

> Hi Liew,

Hi SeongJae,

> The initial version of DAMON was supporting only 'vaddr' operation set.  And
> 'vaddr' is designed to have at least three regions, for the two large unmapped
> areas on normal virtual address spaces.  Hence we enforced min_nr_regions to be
> three or larger.

Thanks for your explanation! It makes perfect sense.

> Yes, I think this is a nice idea.  Maybe adding a comment on damon_set_attrs(),
> or additional sentences to 'Monitoring' section of design.rst [2] would be
> appropriate.

Agreed. I will try to figure out which method is better, and I might
send out a patch in the near future.

> I'd like to be cautious about adding logging, as it might be a spam.  I'd
> prefer making it documented well.  Then, users could find what is wrong.

I see. I understand your concern about log spam, I will focus on
documentation.

> I believe the above command should returned an error.  If not, please let me
> know.

I re-tested it on v7.0-rc4 and it did NOT return any error when writing
'Y' to 'commit_inputs' with an invalid 'min_nr_regions=1'.

This might be unintended. Here is the full log of my session:

    [root@virtme-ng x86_64]# cd /sys/module/damon_lru_sort/parameters/
    [root@virtme-ng parameters]# cat enabled 
    N
    [root@virtme-ng parameters]# echo 65535 > max_nr_regions
    [root@virtme-ng parameters]# echo 3 > min_nr_regions 
    [root@virtme-ng parameters]# echo Y > enabled 
    [root@virtme-ng parameters]# cat kdamond_pid 
    89
    [root@virtme-ng parameters]# echo 1 > min_nr_regions 
    [root@virtme-ng parameters]# echo Y > commit_inputs 
    [root@virtme-ng parameters]# cat kdamond_pid 
    89
    [root@virtme-ng parameters]# cat enabled 
    Y
    [root@virtme-ng parameters]# ps aux | rg "kdamond" | rg -v "rg"
    root        89  0.0  0.0      0     0 ?        I    10:15   0:00 [kdamond.0]

> Also, if you know users who depend on the old behavior and therefore hoping the
> old behavior back, please let me know.

I don't know of anyone who depends on the old behavior.
I am a student exploring the kernel (DAMON) as a hobby, so my
perspective is limited to individual research rather than large-scale
production environments. However, the current behavior (staying alive
with valid params) feels much more sensible to me. But it should return
an error when 'commit_inputs=Y'.

---

Plan
====
Based on our discussion, my current plan is to:
1. Document the 'min_nr_regions' constraint in admin-guide.
2. Add the design rationale for the "at least 3 regions" rule in
   design.rst.
3. Remove the outdated sentences claiming DAMON will disable itself on
   invalid parameters input.
4. Investigate and try to fix the missing error return for
   'commit_inputs'.

Does this plan cover everything you'd expect?

Best regards,
Rui Yan

[...]


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Question] mm/damon: min_nr_regions constraint and inconsistent error handling
  2026-03-19  8:36     ` Liew Rui Yan
@ 2026-03-19 15:15       ` SeongJae Park
  0 siblings, 0 replies; 5+ messages in thread
From: SeongJae Park @ 2026-03-19 15:15 UTC (permalink / raw)
  To: Liew Rui Yan; +Cc: SeongJae Park, damon, linux-mm

On Thu, 19 Mar 2026 16:36:39 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:

> On Thu, Mar 19, 2026 at 9:46 AM SeongJae Park <sj@kernel.org> wrote:
> 
> > Hi Liew,
> 
> Hi SeongJae,
> 
> > The initial version of DAMON was supporting only 'vaddr' operation set.  And
> > 'vaddr' is designed to have at least three regions, for the two large unmapped
> > areas on normal virtual address spaces.  Hence we enforced min_nr_regions to be
> > three or larger.
> 
> Thanks for your explanation! It makes perfect sense.
> 
> > Yes, I think this is a nice idea.  Maybe adding a comment on damon_set_attrs(),
> > or additional sentences to 'Monitoring' section of design.rst [2] would be
> > appropriate.
> 
> Agreed. I will try to figure out which method is better,

Doing both may also make sense.

> and I might
> send out a patch in the near future.

Looking forward!

> 
> > I'd like to be cautious about adding logging, as it might be a spam.  I'd
> > prefer making it documented well.  Then, users could find what is wrong.
> 
> I see. I understand your concern about log spam, I will focus on
> documentation.
> 
> > I believe the above command should returned an error.  If not, please let me
> > know.
> 
> I re-tested it on v7.0-rc4 and it did NOT return any error when writing
> 'Y' to 'commit_inputs' with an invalid 'min_nr_regions=1'.
> 
> This might be unintended. Here is the full log of my session:
> 
>     [root@virtme-ng x86_64]# cd /sys/module/damon_lru_sort/parameters/
>     [root@virtme-ng parameters]# cat enabled 
>     N
>     [root@virtme-ng parameters]# echo 65535 > max_nr_regions
>     [root@virtme-ng parameters]# echo 3 > min_nr_regions 
>     [root@virtme-ng parameters]# echo Y > enabled 
>     [root@virtme-ng parameters]# cat kdamond_pid 
>     89
>     [root@virtme-ng parameters]# echo 1 > min_nr_regions 
>     [root@virtme-ng parameters]# echo Y > commit_inputs 
>     [root@virtme-ng parameters]# cat kdamond_pid 
>     89
>     [root@virtme-ng parameters]# cat enabled 
>     Y
>     [root@virtme-ng parameters]# ps aux | rg "kdamond" | rg -v "rg"
>     root        89  0.0  0.0      0     0 ?        I    10:15   0:00 [kdamond.0]

Thank you for sharing this test results.  I also confirmed you are right.  My
above assumption (error on commit_inputs writing) is simply wrong.  In the case
of wrong argument writing, it just silently fails because the commit_inputs
handling is done in background.  So, not a bug.  But maybe something better to
be improved.  I think it is better to return an explicit error and/or document
the behavior in near future.

> 
> > Also, if you know users who depend on the old behavior and therefore hoping the
> > old behavior back, please let me know.
> 
> I don't know of anyone who depends on the old behavior.
> I am a student exploring the kernel (DAMON) as a hobby, so my
> perspective is limited to individual research rather than large-scale
> production environments.

Thank you for sharing the context.  People from different background gives us
very different and important perspectives.  I appreciate all such voices.  And
knowing the context of from where it was made is very helpful at discussing
suggested changes.  Thank you again.

> However, the current behavior (staying alive
> with valid params) feels much more sensible to me. But it should return
> an error when 'commit_inputs=Y'.

I agree.  Maybe something that we can improve.

> 
> ---
> 
> Plan
> ====
> Based on our discussion, my current plan is to:
> 1. Document the 'min_nr_regions' constraint in admin-guide.
> 2. Add the design rationale for the "at least 3 regions" rule in
>    design.rst.
> 3. Remove the outdated sentences claiming DAMON will disable itself on
>    invalid parameters input.

I found an issue in the current behavior, and just posted a fix.  Assuming the
fix is merged, the behavior will again be changed.  So, please do this only
after the fix is merged or completely abandoned.

If the fix is merged, the changed behavior will be something like,

"It will continue running with old valid parameters if the parameters are
wrong.  It will stop running if the parameters update was unable to be
completed due to internal error."

> 4. Investigate and try to fix the missing error return for
>    'commit_inputs'.
> 
> Does this plan cover everything you'd expect?

Yes, this sounds like a very nice plan.  I'm looking forward to your next mail!
:)


Thanks,
SJ

[...]


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-03-19 15:15 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-18 15:37 [Question] mm/damon: min_nr_regions constraint and inconsistent error handling Liew Rui Yan
2026-03-19  1:23 ` SeongJae Park
2026-03-19  1:46   ` SeongJae Park
2026-03-19  8:36     ` Liew Rui Yan
2026-03-19 15:15       ` SeongJae Park

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox