Re: Two simple ideas for DAMON accuracy improvement

From: SeongJae Park <sj@kernel.org>
To: SeongJae Park <sj@kernel.org>
Cc: damon@lists.linux.dev, kernel-team@meta.com
Subject: Re: Two simple ideas for DAMON accuracy improvement
Date: Thu, 13 Feb 2025 14:23:03 -0800
Message-ID: <20250213222303.244724-1-sj@kernel.org>
In-Reply-To: <20241026215311.148363-1-sj@kernel.org>

On Sat, 26 Oct 2024 14:53:11 -0700 SeongJae Park <sj@kernel.org> wrote:

> Hello DAMON community,
> 
> 
> There were a number of grateful questions, concerns, and improvement ideas
> around monitoring output accuracy of DAMON.  I always admitted the fact that
> DAMON has much room for improvement, but was a bit wary of changes for some
> reasons.  Now I think it caused some unnecessarily long delay.  Sorry about
> that.  Now I want to invest some time on the topic.  So I am starting by
> sharing the two simple ideas below first.
[...]
> 
> Periodic Fine-grain Split of Aged Regions
> -----------------------------------------
> 
> If a region is continuously changing its boundary and access temperature, it
> means it is converging, or the access pattern of the workload is not
> stabilized.  In either case, this is a healthy signal.
> 
> If a region is consistently showing the same access pattern for a long time,
> it may be because the access pattern is stabilized, and the region is
> correctly converged.  However, it might be because the access pattern has
> changed, but the converging is slow.
> 
> To avoid too slow converging of aged regions, we will let users
> periodically increase the split factor for regions that kept the current
> access pattern for a long time (high 'age').  Users will be able to set the
> 'age' offset, the split factor for the aged regions, and the time interval
> between the periodic fine-grain splits of the regions.  For example, users can
> ask DAMON to "split regions keeping the current access pattern for ten minutes
> or longer into five sub-regions every minute".

This means that users need to answer three questions: 1) how frequently, 2)
for regions of what age, and 3) into how many sub-regions the splitting should
be done.  That seems too difficult to answer.  To make it simpler to answer,
while still preserving the effect of the original idea, I'd like to adjust the
idea to: "Periodically split regions without limiting the number of resulting
sub-regions per region, while keeping the aimed total number of regions after
the split."  For example, if there are three regions of different sizes, slice
any region any number of times, as long as the total number of regions becomes
six.  Where to slice will be random, but should be uniformly distributed at
large scale, to avoid too much bias.  Keeping the distance between the slicing
lines the same and randomizing only the first line's position could be the
simplest implementation.
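
To illustrate the "same distance, random first line" idea, below is a rough
user-space sketch in Python.  It is not DAMON code.  The function name
split_evenly(), the (start, end) tuple representation of regions, and the
first_offset parameter (only for making the example reproducible) are all
assumptions made for this illustration.  It lays the regions side by side,
places the slicing lines at a fixed stride over the summed size, and skips
lines that land exactly on an existing boundary, so it yields at most the aimed
number of regions.

```python
import random


def split_evenly(regions, target_nr, first_offset=None):
    """Slice sorted, non-overlapping [start, end) regions with equidistant
    lines so that the total number of regions becomes at most target_nr."""
    nr_cuts = target_nr - len(regions)
    if nr_cuts <= 0:
        return list(regions)

    total = sum(end - start for start, end in regions)
    stride = total / nr_cuts  # distance between two slicing lines
    if first_offset is None:
        # Only the first line's position is randomized, in [0, stride).
        first_offset = random.random() * stride

    result = []
    base = 0  # offset of the current region in the concatenated space
    k = 0     # index of the next slicing line
    for start, end in regions:
        size = end - start
        left = start
        # Consume every slicing line that falls inside this region.
        while k < nr_cuts and first_offset + k * stride < base + size:
            addr = start + int(first_offset + k * stride - base)
            # Skip lines on an existing boundary or on a previous cut.
            if left < addr < end:
                result.append((left, addr))
                left = addr
            k += 1
        result.append((left, end))
        base += size
    return result


# Three regions of different sizes, aiming at six regions in total.
regions = [(0, 100), (200, 250), (1000, 1450)]
print(split_evenly(regions, 6, first_offset=50))
```

With first_offset=50 the stride is 200, so the big third region gets two lines,
the first region gets one, and the small second region is left untouched.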

The updated scheme asks users only how frequently the new split method needs to
be used, reducing the number of questions from three to one.  Obviously, one
question is easier to answer than three.

Huge regions will be split finer than now, so what we wanted to achieve with
the original version of this idea is still kept.  Unlike the original version
of this idea, it will do the fine splitting even for young huge regions.  But
the unnecessary splits will be reverted by the following regions merging.  Small
regions may not be split with the new approach, and that can slow down the
converging of small regions.  But the next split operation will still do the
per-region split, and users can set the frequency of the new split method.

This will make micro-targeted region splitting difficult.  But such
micro-targeting is challenging for users anyway.  If they really know the
answers, they can reshape regions as they want by online-committing the target
regions.

Please let me know if you have any concerns or questions about this updated idea.

Thanks,
SJ

[...]

Thread overview: 3+ messages
2024-10-26 21:53 Two simple ideas for DAMON accuracy improvement SeongJae Park
2025-01-18  1:47 ` SeongJae Park
2025-02-13 22:23 ` SeongJae Park [this message]