From: sashiko-bot@kernel.org
To: "Akinobu Mita" <akinobu.mita@gmail.com>
Cc: damon@lists.linux.dev, linux-perf-users@vger.kernel.org
Subject: Re: [RFC PATCH v3 3/4] mm/damon/vaddr: support perf event based access check
Date: Thu, 23 Apr 2026 02:48:26 +0000
Message-ID: <20260423024827.498B1C19425@smtp.kernel.org>
In-Reply-To: <20260423004211.7037-4-akinobu.mita@gmail.com>
Sashiko AI review found 13 potential issues:
- [Critical] Compilation Failure: Use of undefined API `kmalloc_obj()`.
- [Critical] Compilation Failure: Non-existent field `sample_phys_addr` in `perf_event_attr`.
- [Critical] Compilation Failure: Typo in the `CONFIG_HUGETLB_PAGE` fallback macro.
- [High] Deadlock Risk: Recursive acquisition of `mmap_read_lock`.
- [High] Infinite Loop: Returning `ACTION_AGAIN` on `pte_offset_map_lock()` failure.
- [High] Memory Leak: `damon_vaddr_histogram_per_target` structs are never freed.
- [High] Memory Leak: Dynamically allocated `buffer->reports` arrays are orphaned.
- [High] Memory Leak: Orphaned `perf` structs when the `kdamond` thread is restarted.
- [High] Data Truncation: 64-bit physical addresses are silently truncated when used as xarray indices on 32-bit platforms.
- [High] Logic Error: Unbounded perf events for physical memory targets (`DAMON_OPS_PADDR`).
- [High] Logic Error: Invalid PFN extraction from non-present PMD migration entries.
- [High] Severe Performance Flaw: Full page table walks per sampling interval.
- [High] Severe Performance Flaw: O(N) iteration by `PAGE_SIZE` over massive virtual memory regions.
--
commit 04697c66c594dde51627f9cdaf4c7d11b08fe1fe
Author: Akinobu Mita <akinobu.mita@gmail.com>
mm/damon/vaddr: support perf event based access check
This commit introduces a mechanism to use perf events for virtual and
physical address space monitoring in DAMON. It buffers perf samples during
the sampling interval and generates a histogram of memory accesses. During
the access check phase, this histogram is used to update the access counts
for each monitoring target region.
> diff --git a/mm/damon/core.c b/mm/damon/core.c
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
> @@ -3014,6 +3014,40 @@ static void kdamond_init_ctx(struct damon_ctx *ctx)
> }
> }
When the kdamond thread begins, kdamond_init_ctx() unconditionally initializes
the perf events. If a user stops and restarts the thread via sysfs, does this
leak the previous allocations? Calling damon_perf_init() again would allocate
new perf structures, hardware counters, and per-CPU buffers, permanently
leaking the previous ones.
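One way to avoid that, as a rough sketch (assuming damon_perf_cleanup() can
safely run on a context whose perf state was already torn down or never set
up), is to release the previous state before re-initializing:

        static void kdamond_init_ctx(struct damon_ctx *ctx)
        {
                ...
                /*
                 * Drop perf state left over from a previous kdamond run
                 * before allocating new events and per-CPU buffers.
                 */
                damon_perf_cleanup(ctx);
                damon_perf_init(ctx);
        }

Freeing the state on kdamond exit instead would also work and keeps
init/teardown symmetric; either way the old allocations need an owner.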
> +static void kdamond_prepare_perf_event_report(struct damon_ctx *ctx)
> +{
> + struct damon_perf_event *event;
> +
> + list_for_each_entry(event, &ctx->perf_events, list)
> + damon_perf_prepare_access_checks(ctx, event);
> +}
> +
> +static unsigned int kdamond_check_perf_event_reported_accesses(struct damon_ctx *ctx)
> +{
> + struct damon_target *t;
> + struct damon_perf_event *event;
> + unsigned int max_nr_accesses = 0;
> +
> + if (damon_target_has_pid(ctx)) {
> + list_for_each_entry(event, &ctx->perf_events, list)
> + damon_va_perf_check_accesses(ctx, event);
> + }
For physical memory targets (which have no PIDs), the processing code inside
this if block is skipped. However, kdamond_prepare_perf_event_report()
unconditionally enables perf events. Does this leave perf events permanently
enabled and continuously firing for physical memory targets?
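If the paddr case is intentionally left to a later patch, a minimal sketch
(under that assumption) is to make the prepare side symmetric with this
check:

        static void kdamond_prepare_perf_event_report(struct damon_ctx *ctx)
        {
                struct damon_perf_event *event;

                /* Do not arm events whose samples nothing will consume. */
                if (!damon_target_has_pid(ctx))
                        return;

                list_for_each_entry(event, &ctx->perf_events, list)
                        damon_perf_prepare_access_checks(ctx, event);
        }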
[ ... ]
> diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
> --- a/mm/damon/vaddr.c
> +++ b/mm/damon/vaddr.c
[ ... ]
> +static void damon_vaddr_histogram_add(struct damon_vaddr_histogram *histogram, u32 pid,
> + u64 vaddr)
> +{
> + struct damon_vaddr_histogram_per_target *target;
> + unsigned long nr_accesses;
> +
> + while (!(target = xa_load(&histogram->targets, pid))) {
> + target = kmalloc_obj(*target);
Can this code compile? There is no kmalloc_obj() macro or function in the
Linux kernel. A standard kmalloc() could be used instead:
target = kmalloc(sizeof(*target), GFP_KERNEL);
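More completely, the lookup-or-allocate path might look like the following
sketch (xa_store() failure handling elided, and assuming this only runs from
the single kdamond thread):

        target = xa_load(&histogram->targets, pid);
        if (!target) {
                target = kmalloc(sizeof(*target), GFP_KERNEL);
                if (!target)
                        return;
                xa_init(&target->accesses);
                xa_store(&histogram->targets, pid, target, GFP_KERNEL);
        }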
[ ... ]
> +static void damon_vaddr_histogram_destroy(struct damon_vaddr_histogram *histogram)
> +{
> + unsigned long index;
> + struct damon_vaddr_histogram_per_target *target;
> +
> + xa_for_each(&histogram->targets, index, target)
> + xa_destroy(&target->accesses);
Does this code leak the damon_vaddr_histogram_per_target structure? The nested
accesses xarray is destroyed via xa_destroy(), but the target struct itself is
not freed with kfree().
> + xa_destroy(&histogram->targets);
> +}
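A minimal fix would free each per-target struct after tearing down its
nested xarray, e.g.:

        xa_for_each(&histogram->targets, index, target) {
                xa_destroy(&target->accesses);
                kfree(target);
        }
        xa_destroy(&histogram->targets);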
[ ... ]
> +static void damon_va_perf_check_accesses_by_vaddr(struct damon_ctx *ctx,
> + struct damon_perf_event *event)
> +{
[ ... ]
> + if ((count == buffer->size - 1) && (buffer->size < DAMON_PERF_MAX_RECORDS)) {
> + void *new_reports = kvcalloc_node(buffer->size * 2,
> + sizeof(buffer->reports[0]), GFP_KERNEL,
> + cpu_to_node(cpu));
Are these dynamically allocated buffer->reports arrays freed during cleanup?
When perf monitoring stops, damon_perf_cleanup() frees the top-level per-CPU
pointer wrapper but appears to orphan these inner reports arrays.
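If so, the teardown would also need to walk the per-CPU buffers and kvfree()
each grown reports array before releasing the wrappers. A sketch, with
placeholder names (perf->buffers and struct damon_perf_buffer are guesses at
this patch's actual per-CPU layout):

        int cpu;

        for_each_possible_cpu(cpu) {
                struct damon_perf_buffer *buffer = per_cpu_ptr(perf->buffers, cpu);

                kvfree(buffer->reports);        /* pairs with kvcalloc_node() */
        }
        free_percpu(perf->buffers);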
[ ... ]
> + damon_for_each_target(t, ctx) {
> + struct damon_region *r;
> + u32 pid = pid_nr(t->pid);
> +
> + damon_for_each_region(r, t) {
> + unsigned long addr;
> +
> + if (r->access_reported)
> + continue;
> +
> + for (addr = r->ar.start; addr < r->ar.end; addr += PAGE_SIZE) {
Can this loop stall the kernel thread? Since DAMON regions can merge to
encompass hundreds of gigabytes, iterating from r->ar.start to r->ar.end by
PAGE_SIZE could result in hundreds of millions of iterations per interval.
Could this logic iterate over the populated xarray entries instead?
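A sketch of that alternative, assuming the per-target accesses xarray is
indexed by page-aligned virtual address (target_hist stands in for the
per-target histogram looked up by pid):

        unsigned long index = r->ar.start;

        if (xa_find(&target_hist->accesses, &index, r->ar.end - 1, XA_PRESENT)) {
                damon_update_region_access_rate(r, true, &ctx->attrs);
                r->access_reported = true;
        }

That bounds the per-region work by the number of samples actually recorded
rather than by the region size.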
> + if (damon_vaddr_histogram_count(&perf->vaddr_histogram, pid,
> + addr & PAGE_MASK)) {
> + damon_update_region_access_rate(r, true, &ctx->attrs);
> + r->access_reported = true;
> + break;
> + }
> + }
> + }
> + tidx++;
> + }
[ ... ]
> +static void damon_va_perf_check_accesses_by_paddr(struct damon_ctx *ctx,
> + struct damon_perf_event *event)
> +{
[ ... ]
> + damon_for_each_target(t, ctx) {
> + struct damon_region *r;
> + struct mm_struct *mm = damon_get_mm(t);
> +
> + if (!mm)
> + continue;
> +
> + mmap_read_lock(mm);
> + damon_for_each_region(r, t) {
> + struct damon_paddr_walk walk_private = {
> + .histogram = &perf->paddr_histogram,
> + .accessed = false,
> + };
> +
> + if (r->access_reported)
> + continue;
> +
> + walk_page_range(mm, r->ar.start, r->ar.end, &damon_paddr_ops,
> + &walk_private);
Can this sequence cause a recursive read lock deadlock? The mmap_read_lock is
acquired directly, but damon_paddr_ops is configured with
.walk_lock = PGWALK_RDLOCK, which instructs the page walker to also acquire
the mmap_read_lock. If a writer is pending, the second read lock will block
and self-deadlock.
Is it too expensive to call walk_page_range() across the target's entire
mapped virtual memory space on every sampling interval? For large memory
footprints, walking millions of page table entries every few milliseconds
could create severe CPU overhead and defeat the adaptive sampling design.
> + if (walk_private.accessed) {
> + damon_update_region_access_rate(r, true, &ctx->attrs);
> + r->access_reported = true;
> + }
> + }
> + mmap_read_unlock(mm);
> + mmput(mm);
> + tidx++;
> + }
> +
> + damon_paddr_histogram_destroy(&perf->paddr_histogram);
> +}
> +
> +void damon_va_perf_check_accesses(struct damon_ctx *ctx, struct damon_perf_event *event)
> +{
> + if (event->attr.sample_phys_addr)
Does struct perf_event_attr contain a sample_phys_addr field? No such field
exists; physical address sampling is requested by setting the
PERF_SAMPLE_PHYS_ADDR flag in the sample_type bitmask.
> + return damon_va_perf_check_accesses_by_paddr(ctx, event);
> + else
> + return damon_va_perf_check_accesses_by_vaddr(ctx, event);
> +}
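With the existing ABI, the dispatch would test the bitmask instead, e.g.:

        if (event->attr.sample_type & PERF_SAMPLE_PHYS_ADDR)
                damon_va_perf_check_accesses_by_paddr(ctx, event);
        else
                damon_va_perf_check_accesses_by_vaddr(ctx, event);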
[ ... ]
> +static void damon_paddr_histogram_add(struct damon_paddr_histogram *histogram,
> + u64 paddr)
> +{
> + unsigned long nr_accesses;
> +
> + nr_accesses = xa_to_value(xa_load(&histogram->accesses, paddr));
> + xa_store(&histogram->accesses, paddr, xa_mk_value(nr_accesses + 1), GFP_KERNEL);
Can this silently truncate physical addresses on 32-bit platforms? The paddr
argument is a u64, but it is passed directly as the index to xa_load() and
xa_store(), which take an unsigned long. On 32-bit systems with PAE/LPAE,
this will drop the upper 32 bits. Could the physical address be shifted by
PAGE_SHIFT to store the PFN instead?
> +}
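A sketch of the PFN-keyed variant (the lookup side,
damon_paddr_histogram_count(), would need the same PHYS_PFN() conversion):

        unsigned long pfn = PHYS_PFN(paddr);

        nr_accesses = xa_to_value(xa_load(&histogram->accesses, pfn));
        xa_store(&histogram->accesses, pfn, xa_mk_value(nr_accesses + 1),
                 GFP_KERNEL);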
[ ... ]
> +static int damon_paddr_pmd_entry(pmd_t *pmd, unsigned long addr,
> + unsigned long next, struct mm_walk *walk)
> +{
> + pte_t *pte;
> + spinlock_t *ptl;
> + struct damon_paddr_walk *paddr_walk = walk->private;
> +
> + ptl = pmd_trans_huge_lock(pmd, walk->vma);
> + if (ptl) {
> + pmd_t pmde = pmdp_get(pmd);
> +
> + if (pmd_present(pmde)) {
> + for (; addr < next && !paddr_walk->accessed; addr += PAGE_SIZE) {
> + u64 frame = pmd_pfn(pmde) +
Can pmd_present() evaluate to true for non-present leaf entries like migration
entries? Extracting the PFN via pmd_pfn() on a migration entry would treat
swap offset metadata as a physical address, resulting in invalid calculations.
Does this code need to verify the PMD is a valid THP first?
> + ((addr & ~HPAGE_PMD_MASK) >> PAGE_SHIFT);
> +
> + if (damon_paddr_histogram_count(paddr_walk->histogram,
> + PFN_PHYS(frame))) {
> + paddr_walk->accessed = true;
> + break;
> + }
> + }
> + }
> + spin_unlock(ptl);
> + return paddr_walk->accessed ? 1 : 0;
> + }
> +
> + pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
> + if (!pte) {
> + walk->action = ACTION_AGAIN;
> + return 0;
Will setting walk->action = ACTION_AGAIN on a pte_offset_map_lock() failure
cause an infinite loop? Migration entries cause this lock to return NULL
non-transiently, meaning the page walker will continually retry the exact same
PMD without making progress. Returning 0 without ACTION_AGAIN skips it
gracefully.
> + }
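A sketch of that simpler failure handling (skipping the unmappable PMD
rather than retrying it):

        pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
        if (!pte)
                return 0;       /* e.g. a migration entry; skip this PMD */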
[ ... ]
> +#ifdef CONFIG_HUGETLB_PAGE
> +static int damon_paddr_hugetlb_entry(pte_t *pte, unsigned long hmask,
[ ... ]
> + return paddr_walk->accessed ? 1 : 0;
> +}
> +#else
> +#define damon_perf_hugetlb_entry NULL
> +#endif /* CONFIG_HUGETLB_PAGE */
> +
> +static const struct mm_walk_ops damon_paddr_ops = {
> + .pmd_entry = damon_paddr_pmd_entry,
> + .hugetlb_entry = damon_paddr_hugetlb_entry,
Does this cause a compilation failure when CONFIG_HUGETLB_PAGE is disabled?
The fallback macro is defined as damon_perf_hugetlb_entry, but the struct
initializes .hugetlb_entry with damon_paddr_hugetlb_entry, leaving it
undeclared.
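The simplest fix keeps the names consistent, e.g.:

        #else
        #define damon_paddr_hugetlb_entry       NULL
        #endif /* CONFIG_HUGETLB_PAGE */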
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260423004211.7037-1-akinobu.mita@gmail.com?part=3