Re: [RFC PATCH bpf-next v7 11/11] samples/bpf: Add memcg priority control and async reclaim example

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Hui Zhu" <hui.zhu@linux.dev>
Cc: tj@kernel.org, hannes@cmpxchg.org, bpf@vger.kernel.org, mkoutny@suse.com
Subject: Re: [RFC PATCH bpf-next v7 11/11] samples/bpf: Add memcg priority control and async reclaim example
Date: Tue, 26 May 2026 03:10:57 +0000	[thread overview]
Message-ID: <20260526031057.DAF2D1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <39784d6dba757f4fb82134192419094ce42c5af5.1779760876.git.zhuhui@kylinos.cn>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [Medium] Non-atomic updates to the global `aggregation_map` in the `handle_count_memcg_events` tracepoint lead to data races and lost PGFAULT updates.
- [Medium] Atomic one-shot guards permanently leak in error paths, deadlocking the async reclaim functionality.
- [Low] `parse_u64` relies entirely on `errno` and fails to detect completely non-numeric input strings, silently returning 0.
- [Low] Ignoring the return value of `__atomic_add_fetch` and using a subsequent `READ_ONCE` introduces an unnecessary race window.
--

commit 0eac8db0526f6826e19ed8ba2b922d7d358b6032
Author: Hui Zhu <zhuhui@kylinos.cn>
Subject: samples/bpf: Add memcg priority control and async reclaim example

This commit adds a sample program demonstrating the memcg_bpf_ops feature.
It provides priority-based memory throttling and workqueue-driven
asynchronous page reclaim using a BPF program and userspace loader.

> diff --git a/samples/bpf/memcg.bpf.c b/samples/bpf/memcg.bpf.c
> new file mode 100644
> index 0000000000000..0995284794ac7
> --- /dev/null
> +++ b/samples/bpf/memcg.bpf.c
[ ... ]
> +static int async_free(void *map, int *key, void *value)
> +{
> +	struct cgroup_memcg cm;
> +	bool started_wq = false;
> +	int i;
> +
> +	if (get_cgroup_memcg_from_id(local_config.low_cgroup_id, &cm) != 0)
> +		return 0;

[Severity: Medium]
Can the async_free_run guard permanently leak if get_cgroup_memcg_from_id
fails? Returning early here skips the lock release at the end of the
function, which would deadlock the async reclaim functionality.

> +
> +	for (i = 0; i < ASYNC_FREE_LOOP_MAX; i++) {
[ ... ]
> +	}
> +
> +	put_cgroup_memcg(&cm);
> +
> +	if (!started_wq)
> +		__atomic_exchange_n(&async_free_run, 0, __ATOMIC_RELEASE);
> +	return 0;
> +}
[ ... ]
> +SEC("tp/memcg/count_memcg_events")
> +int handle_count_memcg_events(
> +		struct trace_event_raw_memcg_rstat_events *ctx)
> +{
[ ... ]
> +	current_ts = bpf_ktime_get_ns();
> +
> +	if (current_ts - data->window_start_ts < ONE_SECOND_NS) {
> +		data->sum += ctx->val;

[Severity: Medium]
Can this non-atomic update to the global aggregation_map lead to data races
and lost PGFAULT updates? Since this map is shared across all CPUs,
concurrent PGFAULT events executing the tracepoint might perform read,
modify, write operations simultaneously.

Should BPF atomic instructions or a bpf_spin_lock be used here?

> +	} else {
> +		data->window_start_ts = current_ts;
> +		data->sum = ctx->val;
> +	}
[ ... ]
> +static __always_inline unsigned int
> +__memcg_charged_impl(struct mem_cgroup *memcg, unsigned int nr_pages)
> +{
> +	struct wq_elem *elem;
> +	__u32 key = 0;
> +	u64 expected = 0;
> +	s64 cur_allocated;
> +	s64 cur_old_allocated;
> +
> +	__atomic_add_fetch(&allocated, nr_pages, __ATOMIC_RELAXED);
> +	cur_allocated = READ_ONCE(allocated);

[Severity: Low]
Is there a race window introduced by ignoring the return value of
__atomic_add_fetch and performing a separate READ_ONCE? A concurrent
increment by another CPU between the add and the read could cause
cur_allocated to observe an inconsistent or skipped value. 

Could the return value of __atomic_add_fetch be assigned directly to
eliminate this window?

> +	cur_old_allocated = READ_ONCE(old_allocated);
> +	if (abs_diff(cur_allocated, cur_old_allocated) < ONE_MB_PAGE_COUNT)
> +		goto out;
> +	WRITE_ONCE(old_allocated, cur_allocated);
> +
> +	if (get_usage() < local_config.async_trigger_bytes)
> +		goto out;
> +
> +	if (__atomic_compare_exchange_n(&async_free_run,
> +					&expected, 1,
> +					false,
> +					__ATOMIC_ACQ_REL,
> +					__ATOMIC_RELAXED)) {
> +		elem = bpf_map_lookup_elem(&wq_map, &key);
> +		if (!elem)
> +			return 0;

[Severity: Medium]
Does this early return also leak the async_free_run guard if the map lookup
fails? Like the early return in async_free, this skips resetting the guard
and could deadlock the state machine.

> +
> +		bpf_wq_start(&elem->work, 0);
> +	}
> +
> +out:
> +	return 0;
> +}
[ ... ]
> diff --git a/samples/bpf/memcg.c b/samples/bpf/memcg.c
> new file mode 100644
> index 0000000000000..0929d868e6d8f
> --- /dev/null
> +++ b/samples/bpf/memcg.c
[ ... ]
> +static uint64_t parse_u64(const char *str, const char *prog)
> +{
> +	uint64_t value;
> +
> +	errno = 0;
> +	value = strtoull(str, NULL, 10);
> +	if (errno != 0) {

[Severity: Low]
Does this code silently return 0 for completely non-numeric input strings?

The strtoull function does not set errno if the input string contains no
valid digits. Since the endptr argument is NULL, could malformed arguments
like --async_trigger_bytes abc silently return 0 instead of raising an
error?

> +		fprintf(stderr, "ERROR: strtoull '%s' failed: %d\n",
> +			str, errno);
> +		usage(prog);
> +		exit(-errno);
> +	}
> +	return value;
> +}
[ ... ]

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1779760876.git.zhuhui@kylinos.cn?part=11

next prev parent reply	other threads:[~2026-05-26  3:10 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-26  2:20 [RFC PATCH bpf-next v7 00/11] mm: BPF struct_ops for dynamic memory protection and async reclaim Hui Zhu
2026-05-26  2:20 ` [RFC PATCH bpf-next v7 01/11] bpf: move bpf_struct_ops_link into bpf.h Hui Zhu
2026-05-26  2:20 ` [RFC PATCH bpf-next v7 02/11] bpf: allow attaching struct_ops to cgroups Hui Zhu
2026-05-26  3:00   ` sashiko-bot
2026-05-26  3:19   ` bot+bpf-ci
2026-05-26  3:19     ` bot+bpf-ci
2026-05-26  2:20 ` [RFC PATCH bpf-next v7 03/11] libbpf: fix return value on memory allocation failure Hui Zhu
2026-05-26  2:41   ` sashiko-bot
2026-05-26  3:06   ` bot+bpf-ci
2026-05-26  3:06     ` bot+bpf-ci
2026-05-26  2:20 ` [RFC PATCH bpf-next v7 04/11] libbpf: introduce bpf_map__attach_struct_ops_opts() Hui Zhu
2026-05-26  2:58   ` sashiko-bot
2026-05-26  3:06   ` bot+bpf-ci
2026-05-26  3:06     ` bot+bpf-ci
2026-05-27 15:43   ` Yonghong Song
2026-05-28  5:53     ` Leon Hwang
2026-05-26  2:20 ` [RFC PATCH bpf-next v7 05/11] bpf: Pass flags in bpf_link_create for struct_ops Hui Zhu
2026-05-26  3:04   ` sashiko-bot
2026-05-26  2:24 ` [RFC PATCH bpf-next v7 06/11] mm: memcontrol: Add BPF struct_ops for memory controller Hui Zhu
2026-05-26  3:04   ` sashiko-bot
2026-05-26  3:19   ` bot+bpf-ci
2026-05-26  3:19     ` bot+bpf-ci
2026-05-26  2:24 ` [RFC PATCH bpf-next v7 07/11] mm/bpf: Add bpf_try_to_free_mem_cgroup_pages kfunc Hui Zhu
2026-05-26  2:52   ` sashiko-bot
2026-05-26  3:06   ` bot+bpf-ci
2026-05-26  3:06     ` bot+bpf-ci
2026-05-26  2:24 ` [RFC PATCH bpf-next v7 08/11] selftests/bpf: Add tests for memcg_bpf_ops Hui Zhu
2026-05-26  2:50   ` sashiko-bot
2026-05-26  2:27 ` [RFC PATCH bpf-next v7 09/11] selftests/bpf: Add test for memcg_bpf_ops hierarchies Hui Zhu
2026-05-26  2:52   ` sashiko-bot
2026-05-26  2:27 ` [RFC PATCH bpf-next v7 10/11] selftests/bpf: Add selftest for memcg async reclaim via BPF Hui Zhu
2026-05-26  3:06   ` sashiko-bot
2026-05-26  3:06   ` bot+bpf-ci
2026-05-26  3:06     ` bot+bpf-ci
2026-05-26  2:27 ` [RFC PATCH bpf-next v7 11/11] samples/bpf: Add memcg priority control and async reclaim example Hui Zhu
2026-05-26  3:10   ` sashiko-bot [this message]
2026-05-26 13:41 ` [RFC PATCH bpf-next v7 00/11] mm: BPF struct_ops for dynamic memory protection and async reclaim Usama Arif
2026-05-27  9:31   ` teawater
2026-05-27  8:47 ` Michal Hocko
2026-05-28  8:27   ` teawater

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260526031057.DAF2D1F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=hui.zhu@linux.dev \
    --cc=mkoutny@suse.com \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.