public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed
From: SeongJae Park <sj@kernel.org>
To: Josh Law <objecting@objecting.org>
Cc: SeongJae Park <sj@kernel.org>,
	akpm@linux-foundation.org, damon@lists.linux.dev,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/2] mm/damon/core: eliminate hot-path integer division in damon_max_nr_accesses()
Date: Tue, 24 Mar 2026 00:19:08 -0700	[thread overview]
Message-ID: <20260324071908.89152-1-sj@kernel.org> (raw)
In-Reply-To: <20260322214325.260007-3-objecting@objecting.org>

On Sun, 22 Mar 2026 21:43:25 +0000 Josh Law <objecting@objecting.org> wrote:

> Hardware integer division is slow. The function damon_max_nr_accesses(),
> which is called very frequently (e.g., once per region per sample
> interval inside damon_update_region_access_rate), performs an integer
> division: attrs->aggr_interval / attrs->sample_interval.
> 
> However, the struct damon_attrs already caches this exact ratio in the
> internal field aggr_samples (since earlier commits). We can eliminate
> the hardware division in the hot path by simply returning aggr_samples.
> 
> This significantly reduces the CPU cycle overhead of updating the access
> rates for thousands of regions.
> 
> Signed-off-by: Josh Law <objecting@objecting.org>
> ---
>  include/linux/damon.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/include/linux/damon.h b/include/linux/damon.h
> index 6bd71546f7b2..438fe6f3eab4 100644
> --- a/include/linux/damon.h
> +++ b/include/linux/damon.h
> @@ -960,8 +960,7 @@ static inline bool damon_target_has_pid(const struct damon_ctx *ctx)
>  static inline unsigned int damon_max_nr_accesses(const struct damon_attrs *attrs)
>  {
>  	/* {aggr,sample}_interval are unsigned long, hence could overflow */
> -	return min(attrs->aggr_interval / attrs->sample_interval,
> -			(unsigned long)UINT_MAX);
> +	return min_t(unsigned long, attrs->aggr_samples, UINT_MAX);
>  }

I just found this patch causes below divide-by-zero when
tools/testing/selftets/damon/sysfs.py is executed.

'''
[   42.462039] Oops: divide error: 0000 [#1] SMP NOPTI
[   42.463673] CPU: 4 UID: 0 PID: 2044 Comm: kdamond.0 Not tainted 7.0.0-rc4-mm-new-damon+ #354 PREEMPT(full)
[   42.465193] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   42.466590] RIP: 0010:damon_set_attrs (mm/damon/core.c:769 (discriminator 1) mm/damon/core.c:775 (discriminator 1) mm/damon/core.c:786 (discriminator 1) mm/damon/core.c:827 (discriminator 1) mm/damon/core.c:897 (discriminator 1))
[   42.467287] Code: 48 39 c5 0f 84 dd 00 00 00 41 bb 59 17 b7 d1 48 8b 43 10 4c 8d 43 10 48 8d 48 e0 49 39 c0 75 5b e9 b0 00 00 00 8b 41 18 31 d2 <41> f7 f6 41 89 c5 69 c2 10 27 00 00 31 d2 45 69 ed 10 27 00 00 41
All code
========
   0:   48 39 c5                cmp    %rax,%rbp
   3:   0f 84 dd 00 00 00       je     0xe6
   9:   41 bb 59 17 b7 d1       mov    $0xd1b71759,%r11d
   f:   48 8b 43 10             mov    0x10(%rbx),%rax
  13:   4c 8d 43 10             lea    0x10(%rbx),%r8
  17:   48 8d 48 e0             lea    -0x20(%rax),%rcx
  1b:   49 39 c0                cmp    %rax,%r8
  1e:   75 5b                   jne    0x7b
  20:   e9 b0 00 00 00          jmp    0xd5
  25:   8b 41 18                mov    0x18(%rcx),%eax
  28:   31 d2                   xor    %edx,%edx
  2a:*  41 f7 f6                div    %r14d            <-- trapping instruction
  2d:   41 89 c5                mov    %eax,%r13d
  30:   69 c2 10 27 00 00       imul   $0x2710,%edx,%eax
  36:   31 d2                   xor    %edx,%edx
  38:   45 69 ed 10 27 00 00    imul   $0x2710,%r13d,%r13d
  3f:   41                      rex.B

Code starting with the faulting instruction
===========================================
   0:   41 f7 f6                div    %r14d
   3:   41 89 c5                mov    %eax,%r13d
   6:   69 c2 10 27 00 00       imul   $0x2710,%edx,%eax
   c:   31 d2                   xor    %edx,%edx
   e:   45 69 ed 10 27 00 00    imul   $0x2710,%r13d,%r13d
  15:   41                      rex.B
[   42.470046] RSP: 0018:ffffd25c4586bcb0 EFLAGS: 00010246
[   42.470818] RAX: 0000000000000000 RBX: ffff891346919400 RCX: ffff8913502dd040
[   42.471923] RDX: 0000000000000000 RSI: ffff891348527600 RDI: ffff891344d94400
[   42.472972] RBP: ffff891344d94598 R08: ffff891346919410 R09: 0000000000000000
[   42.474028] R10: 0000000000000000 R11: 00000000d1b71759 R12: 0000000000000014
[   42.475104] R13: ffff891348527778 R14: 0000000000000000 R15: ffff891348527798
[   42.476191] FS:  0000000000000000(0000) GS:ffff89149efd8000(0000) knlGS:0000000000000000
[   42.477375] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   42.478235] CR2: 000000003a84f080 CR3: 000000004a824000 CR4: 00000000000006f0
[   42.479291] Call Trace:
[   42.479670]  <TASK>
[   42.480044]  damon_commit_ctx (mm/damon/core.c:1538)
[   42.480660]  damon_sysfs_commit_input (mm/damon/sysfs.c:2153 mm/damon/sysfs.c:2181)
[   42.481389]  kdamond_call (mm/damon/core.c:3186)
[   42.482492]  kdamond_fn (mm/damon/core.c:3428)
[   42.483041]  ? kthread_affine_node (kernel/kthread.c:377)
[   42.483766]  ? kfree (include/linux/kmemleak.h:50 mm/slub.c:2610 mm/slub.c:6165 mm/slub.c:6483)
[   42.484257]  ? __pfx_kdamond_fn (mm/damon/core.c:3368)
[   42.484855]  ? __pfx_kdamond_fn (mm/damon/core.c:3368)
[   42.485459]  kthread (kernel/kthread.c:436)
[   42.485959]  ? __pfx_kthread (kernel/kthread.c:381)
[   42.486524]  ret_from_fork (arch/x86/kernel/process.c:164)
[   42.487105]  ? __pfx_kthread (kernel/kthread.c:381)
[   42.487668]  ret_from_fork_asm (arch/x86/entry/entry_64.S:258)
[   42.488304]  </TASK>
'''

That's because damon_commit_ctx() is called to a context that just generated
using damon_new_ctx(), which doesn't set the aggr_samples.  After applying
below change, the divide-by-zero is gone.

'''
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -676,6 +676,7 @@ struct damon_ctx *damon_new_ctx(void)
        ctx->attrs.sample_interval = 5 * 1000;
        ctx->attrs.aggr_interval = 100 * 1000;
        ctx->attrs.ops_update_interval = 60 * 1000 * 1000;
+       ctx->attrs.aggr_samples = 20;

        ctx->passed_sample_intervals = 0;
        /* These will be set from kdamond_init_ctx() */
'''

Also, kunit crashes like below.

'''
$ ./tools/testing/kunit/kunit.py run --kunitconfig mm/damon/tests/
[00:07:19] Configuring KUnit Kernel ...
[00:07:19] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=8
[00:08:11] Starting KUnit Kernel (1/1)...
[00:08:11] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[00:08:11] =================== damon (28 subtests) ====================
[00:08:11] [PASSED] damon_test_target
[00:08:11] [PASSED] damon_test_regions
[00:08:11] [PASSED] damon_test_aggregate
[00:08:11] [PASSED] damon_test_split_at
[00:08:11] [PASSED] damon_test_merge_two
[00:08:11] [PASSED] damon_test_merge_regions_of
[00:08:11] [PASSED] damon_test_split_regions_of
[00:08:11] [PASSED] damon_test_ops_registration
[00:08:11] [PASSED] damon_test_set_regions
[00:08:11] [ERROR] Test: damon: missing expected subtest!
[00:08:11] Kernel panic - not syncing: Kernel mode signal 4
'''

It can run without the error after below changes are applied:

'''
--- a/mm/damon/tests/core-kunit.h
+++ b/mm/damon/tests/core-kunit.h
@@ -514,6 +514,8 @@ static void damon_test_nr_accesses_to_accesses_bp(struct kunit *test)
                .aggr_interval = ((unsigned long)UINT_MAX + 1) * 10
        };

+       attrs.aggr_samples = attrs.aggr_interval / attrs.sample_interval;
+
        /*
         * In some cases such as 32bit architectures where UINT_MAX is
         * ULONG_MAX, attrs.aggr_interval becomes zero.  Calling
@@ -532,7 +534,8 @@ static void damon_test_nr_accesses_to_accesses_bp(struct kunit *test)
 static void damon_test_update_monitoring_result(struct kunit *test)
 {
        struct damon_attrs old_attrs = {
-               .sample_interval = 10, .aggr_interval = 1000,};
+               .sample_interval = 10, .aggr_interval = 1000,
+               .aggr_samples = 100,};
        struct damon_attrs new_attrs;
        struct damon_region *r = damon_new_region(3, 7);

@@ -544,19 +547,24 @@ static void damon_test_update_monitoring_result(struct kunit *test)
        r->age = 20;

        new_attrs = (struct damon_attrs){
-               .sample_interval = 100, .aggr_interval = 10000,};
+               .sample_interval = 100, .aggr_interval = 10000,
+               .aggr_samples = 100,};
        damon_update_monitoring_result(r, &old_attrs, &new_attrs, false);
        KUNIT_EXPECT_EQ(test, r->nr_accesses, 15);
        KUNIT_EXPECT_EQ(test, r->age, 2);

        new_attrs = (struct damon_attrs){
-               .sample_interval = 1, .aggr_interval = 1000};
+               .sample_interval = 1, .aggr_interval = 1000,
+               .aggr_samples = 1000,
+       };
        damon_update_monitoring_result(r, &old_attrs, &new_attrs, false);
        KUNIT_EXPECT_EQ(test, r->nr_accesses, 150);
        KUNIT_EXPECT_EQ(test, r->age, 2);

        new_attrs = (struct damon_attrs){
-               .sample_interval = 1, .aggr_interval = 100};
+               .sample_interval = 1, .aggr_interval = 100,
+               .aggr_samples = 100,
+       };
        damon_update_monitoring_result(r, &old_attrs, &new_attrs, false);
        KUNIT_EXPECT_EQ(test, r->nr_accesses, 150);
        KUNIT_EXPECT_EQ(test, r->age, 20);
'''

Josh, could you please also take a look if these fixups are sufficient?  And
once the sufficient fixes are found from your side, could you please post a new
version of this patch after applying the fixees?  Also, please drop my
Reviewed-by: from the new version.  I will review it again.


Thanks,
SJ

[...]


  parent reply	other threads:[~2026-03-24  7:19 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-22 21:43 [PATCH v3 0/2] mm/damon/core: Performance optimizations for the kdamond hot path Josh Law
2026-03-22 21:43 ` [PATCH v3 1/2] mm/damon/core: optimize kdamond_apply_schemes() by inverting scheme and region loops Josh Law
2026-03-23 14:07   ` SeongJae Park
2026-03-22 21:43 ` [PATCH v3 2/2] mm/damon/core: eliminate hot-path integer division in damon_max_nr_accesses() Josh Law
2026-03-23 14:10   ` SeongJae Park
2026-03-24  7:19   ` SeongJae Park [this message]
2026-03-24  7:22     ` Josh Law
2026-03-23 14:06 ` [PATCH v3 0/2] mm/damon/core: Performance optimizations for the kdamond hot path SeongJae Park

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260324071908.89152-1-sj@kernel.org \
    --to=sj@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=damon@lists.linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=objecting@objecting.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox