From: hui.zhu@linux.dev
To: "Roman Gushchin" <roman.gushchin@linux.dev>,
	"Andrew Morton" <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Shakeel Butt" <shakeel.butt@linux.dev>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"JP Kobryn" <inwardvessel@gmail.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org, bpf@vger.kernel.org,
	"Martin KaFai Lau" <martin.lau@kernel.org>,
	"Song Liu" <song@kernel.org>,
	"Kumar Kartikeya Dwivedi" <memxor@gmail.com>,
	"Tejun Heo" <tj@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>
Subject: Re: [PATCH v2 21/23] sched: psi: implement bpf_psi_create_trigger() kfunc
Date: Mon, 08 Dec 2025 08:49:34 +0000
Message-ID: <1d9a162605a3f32ac215430131f7745488deaa34@linux.dev>
In-Reply-To: <20251027232206.473085-11-roman.gushchin@linux.dev>

On 2025-10-28 07:22, "Roman Gushchin" <roman.gushchin@linux.dev> wrote:


> 
> Implement a new bpf_psi_create_trigger() BPF kfunc, which allows
> to create new PSI triggers and attach them to cgroups or be
> system-wide.
> 
> Created triggers will exist until the struct ops is loaded and
> if they are attached to a cgroup until the cgroup exists.
> 
> Due to a limitation of 5 arguments, the resource type and the "full"
> bit are squeezed into a single u32.
> 
> Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>

Hi Roman,

I wrote an eBPF program that uses the bpf_psi struct ops and
bpf_psi_create_trigger() to continuously receive memory-related PSI
events, but I only ever received one event.

Looking at the implementation, when an event occurs the notification
is guarded by:

    if (cmpxchg(&t->event, 0, 1) == 0) {

However, from eBPF there appears to be no way to do the equivalent
of what psi_trigger_poll() does to reset the event back to 0:

    if (cmpxchg(&t->event, 1, 0) == 1)

Would it be possible to add an additional BPF helper function to
handle this? Without a way to acknowledge/reset the event flag,
the trigger only fires once and cannot be reused for continuous
monitoring.
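
To illustrate the behaviour I mean, here is a minimal userspace sketch
of the latch. It is only a model, not the kernel code: model_trigger,
model_trigger_fire() and model_trigger_ack() are hypothetical names of
mine, and C11 atomics stand in for the kernel's cmpxchg().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the trigger's event latch. */
struct model_trigger {
	atomic_int event;	/* 0 = idle, 1 = event pending */
	int notifications;	/* how many times the handler was notified */
};

/* Models the update path: notify only on a 0 -> 1 transition. */
static bool model_trigger_fire(struct model_trigger *t)
{
	int expected = 0;

	if (atomic_compare_exchange_strong(&t->event, &expected, 1)) {
		t->notifications++;
		return true;
	}
	return false;	/* flag already set: the event is swallowed */
}

/* Models the reset done in psi_trigger_poll(): consume the event, 1 -> 0. */
static bool model_trigger_ack(struct model_trigger *t)
{
	int expected = 1;

	return atomic_compare_exchange_strong(&t->event, &expected, 0);
}
```

In this model, calling model_trigger_fire() twice in a row notifies
only once; a second notification happens only after model_trigger_ack()
has run in between. That acknowledge step is the one I cannot find a
way to perform from a BPF program.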

Best,
Hui



> ---
>  include/linux/cgroup.h | 4 ++
>  include/linux/psi.h | 6 +++
>  kernel/sched/bpf_psi.c | 94 ++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 104 insertions(+)
> 
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 6ed477338b16..1a99da44999e 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -707,6 +707,10 @@ static inline bool task_under_cgroup_hierarchy(struct task_struct *task,
>  
>  static inline void cgroup_path_from_kernfs_id(u64 id, char *buf, size_t buflen)
>  {}
> +static inline struct cgroup *cgroup_get_from_id(u64 id)
> +{
> + return NULL;
> +}
>  #endif /* !CONFIG_CGROUPS */
>  
>  #ifdef CONFIG_CGROUPS
> diff --git a/include/linux/psi.h b/include/linux/psi.h
> index 8178e998d94b..8ffe84cd8571 100644
> --- a/include/linux/psi.h
> +++ b/include/linux/psi.h
> @@ -50,6 +50,12 @@ int psi_cgroup_alloc(struct cgroup *cgrp);
>  void psi_cgroup_free(struct cgroup *cgrp);
>  void cgroup_move_task(struct task_struct *p, struct css_set *to);
>  void psi_cgroup_restart(struct psi_group *group);
> +
> +#else
> +static inline struct psi_group *cgroup_psi(struct cgroup *cgrp)
> +{
> + return &psi_system;
> +}
>  #endif
>  
>  #else /* CONFIG_PSI */
> diff --git a/kernel/sched/bpf_psi.c b/kernel/sched/bpf_psi.c
> index c383a20119a6..7974de56594f 100644
> --- a/kernel/sched/bpf_psi.c
> +++ b/kernel/sched/bpf_psi.c
> @@ -8,6 +8,7 @@
>  #include <linux/bpf_psi.h>
>  #include <linux/cgroup-defs.h>
>  
> +struct bpf_struct_ops bpf_psi_bpf_ops;
>  static struct workqueue_struct *bpf_psi_wq;
>  
>  static DEFINE_MUTEX(bpf_psi_lock);
> @@ -186,6 +187,92 @@ static const struct bpf_verifier_ops bpf_psi_verifier_ops = {
>  .is_valid_access = bpf_psi_ops_is_valid_access,
>  };
>  
> +__bpf_kfunc_start_defs();
> +
> +/**
> + * bpf_psi_create_trigger - Create a PSI trigger
> + * @bpf_psi: bpf_psi struct to attach the trigger to
> + * @cgroup_id: cgroup Id to attach the trigger; 0 for system-wide scope
> + * @resource: resource to monitor (PSI_MEM, PSI_IO, etc) and the full bit.
> + * @threshold_us: threshold in us
> + * @window_us: window in us
> + *
> + * Creates a PSI trigger and attached is to bpf_psi. The trigger will be
> + * active unless bpf struct ops is unloaded or the corresponding cgroup
> + * is deleted.
> + *
> + * Resource's most significant bit encodes whether "some" or "full"
> + * PSI state should be tracked.
> + *
> + * Returns 0 on success and the error code on failure.
> + */
> +__bpf_kfunc int bpf_psi_create_trigger(struct bpf_psi *bpf_psi,
> + u64 cgroup_id, u32 resource,
> + u32 threshold_us, u32 window_us)
> +{
> + enum psi_res res = resource & ~BPF_PSI_FULL;
> + bool full = resource & BPF_PSI_FULL;
> + struct psi_trigger_params params;
> + struct cgroup *cgroup __maybe_unused = NULL;
> + struct psi_group *group;
> + struct psi_trigger *t;
> + int ret = 0;
> +
> + if (res >= NR_PSI_RESOURCES)
> + return -EINVAL;
> +
> + if (IS_ENABLED(CONFIG_CGROUPS) && cgroup_id) {
> + cgroup = cgroup_get_from_id(cgroup_id);
> + if (IS_ERR_OR_NULL(cgroup))
> + return PTR_ERR(cgroup);
> +
> + group = cgroup_psi(cgroup);
> + } else {
> + group = &psi_system;
> + }
> +
> + params.type = PSI_BPF;
> + params.bpf_psi = bpf_psi;
> + params.privileged = capable(CAP_SYS_RESOURCE);
> + params.res = res;
> + params.full = full;
> + params.threshold_us = threshold_us;
> + params.window_us = window_us;
> +
> + t = psi_trigger_create(group, &params);
> + if (IS_ERR(t))
> + ret = PTR_ERR(t);
> + else
> + t->cgroup_id = cgroup_id;
> +
> +#ifdef CONFIG_CGROUPS
> + if (cgroup)
> + cgroup_put(cgroup);
> +#endif
> +
> + return ret;
> +}
> +__bpf_kfunc_end_defs();
> +
> +BTF_KFUNCS_START(bpf_psi_kfuncs)
> +BTF_ID_FLAGS(func, bpf_psi_create_trigger, KF_TRUSTED_ARGS)
> +BTF_KFUNCS_END(bpf_psi_kfuncs)
> +
> +static int bpf_psi_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id)
> +{
> + if (btf_id_set8_contains(&bpf_psi_kfuncs, kfunc_id) &&
> + prog->aux->st_ops != &bpf_psi_bpf_ops)
> + return -EACCES;
> +
> + return 0;
> +}
> +
> +static const struct btf_kfunc_id_set bpf_psi_kfunc_set = {
> + .owner = THIS_MODULE,
> + .set = &bpf_psi_kfuncs,
> + .filter = bpf_psi_kfunc_filter,
> +};
> +
>  static int bpf_psi_ops_reg(void *kdata, struct bpf_link *link)
>  {
>  struct bpf_psi_ops *ops = kdata;
> @@ -287,6 +374,13 @@ static int __init bpf_psi_struct_ops_init(void)
>  if (!bpf_psi_wq)
>  return -ENOMEM;
>  
> + err = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
> + &bpf_psi_kfunc_set);
> + if (err) {
> + pr_warn("error while registering bpf psi kfuncs: %d", err);
> + goto err;
> + }
> +
>  err = register_bpf_struct_ops(&bpf_psi_bpf_ops, bpf_psi_ops);
>  if (err) {
>  pr_warn("error while registering bpf psi struct ops: %d", err);
> -- 
> 2.51.0
>
