public inbox for bpf@vger.kernel.org
* [RFC PATCH bpf-next 0/3] bpf: Add new bpf helper bpf_for_each_cpu
@ 2023-08-01 14:29 Yafang Shao
  2023-08-01 14:29 ` [RFC PATCH bpf-next 1/3] bpf: Add bpf_for_each_cpu helper Yafang Shao
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Yafang Shao @ 2023-08-01 14:29 UTC
  To: ast, daniel, john.fastabend, andrii, martin.lau, song, yhs,
	kpsingh, sdf, haoluo, jolsa
  Cc: bpf, Yafang Shao

Some statistical data is stored in percpu variables, and the kernel does
not aggregate it into a single value. One example is the data in struct
psi_group_cpu.

Currently, we can traverse percpu data with bpf_loop() and
bpf_per_cpu_ptr():

  bpf_loop(nr_cpus, callback_fn, callback_ctx, 0)

In callback_fn, we retrieve the percpu pointer with bpf_per_cpu_ptr().
The drawback is that 'nr_cpus' cannot be a variable; otherwise, the
verifier rejects the program. This hinders deployment, because servers may
have different numbers of CPUs. Using CONFIG_NR_CPUS as the bound is not
ideal either.

Alternatively, with the bpf_cpumask family, we can obtain a task's cpumask.
However, that requires creating a bpf_cpumask, copying the cpumask from the
task, and then parsing the CPU IDs out of it, which is inefficient. It
might also require introducing new kfuncs such as bpf_cpumask_next.

This series introduces a new bpf helper, bpf_for_each_cpu, to traverse
percpu data conveniently. It covers the
for_each_{possible, present, online}_cpu cases. The user can also traverse
the CPUs of a specific task, for example to walk the CPUs of a cpuset
cgroup when the task is in that cgroup.
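
To illustrate the traversal pattern described above, here is a minimal
user-space model in plain C. It is not the helper from this series and it
is not BPF code: the names model_for_each_cpu, model_sum, cpu_cb_t and
MODEL_NR_CPUS are hypothetical, the CPU set is modelled as a 64-bit
bitmask, and the percpu data is modelled as a flat array with one slot per
CPU. It only shows the idea of invoking a callback once per CPU in a given
set and aggregating the per-CPU values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed size, used by this user-space model only. */
#define MODEL_NR_CPUS 8

/* Callback invoked once per selected CPU; return nonzero to stop early. */
typedef int (*cpu_cb_t)(unsigned int cpu, void *percpu_slot, void *ctx);

/*
 * Walk every CPU whose bit is set in 'mask' and hand the callback a
 * pointer to that CPU's slot. Returns the number of CPUs visited.
 */
static unsigned int model_for_each_cpu(uint64_t mask, uint64_t *percpu_base,
				       cpu_cb_t cb, void *ctx)
{
	unsigned int visited = 0;

	assert(cb != NULL);
	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!(mask & (1ULL << cpu)))
			continue;	/* CPU is not in the requested set */
		visited++;
		if (cb(cpu, &percpu_base[cpu], ctx))
			break;		/* callback asked to stop */
	}
	return visited;
}

/* Example callback: add this CPU's counter to the running total in ctx. */
static int sum_cb(unsigned int cpu, void *percpu_slot, void *ctx)
{
	(void)cpu;
	*(uint64_t *)ctx += *(uint64_t *)percpu_slot;
	return 0;
}

/* Aggregate the per-CPU counters of all CPUs in 'mask' into one value. */
static uint64_t model_sum(uint64_t mask, uint64_t *percpu_base)
{
	uint64_t total = 0;

	model_for_each_cpu(mask, percpu_base, sum_cb, &total);
	return total;
}
```

In this model the caller never passes a CPU count; the set of CPUs to
visit comes entirely from the mask, which is the property the text above
asks for.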

In our use case, we use this new helper to traverse percpu psi data. This
helps us understand why CPU, memory, and IO pressure is high on a server
or in a container.

Due to the __percpu annotation, clang-14+ and pahole-1.23+ are required.

Yafang Shao (3):
  bpf: Add bpf_for_each_cpu helper
  cgroup, psi: Init root cgroup psi to psi_system
  selftests/bpf: Add selftest for for_each_cpu

 include/linux/bpf.h                                |   1 +
 include/linux/psi.h                                |   2 +-
 include/uapi/linux/bpf.h                           |  32 +++++
 kernel/bpf/bpf_iter.c                              |  72 +++++++++++
 kernel/bpf/helpers.c                               |   2 +
 kernel/bpf/verifier.c                              |  29 ++++-
 kernel/cgroup/cgroup.c                             |   5 +-
 tools/include/uapi/linux/bpf.h                     |  32 +++++
 .../selftests/bpf/prog_tests/for_each_cpu.c        | 137 +++++++++++++++++++++
 .../selftests/bpf/progs/test_for_each_cpu.c        |  63 ++++++++++
 10 files changed, 372 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/for_each_cpu.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_for_each_cpu.c

-- 
1.8.3.1



Thread overview: 18+ messages
2023-08-01 14:29 [RFC PATCH bpf-next 0/3] bpf: Add new bpf helper bpf_for_each_cpu Yafang Shao
2023-08-01 14:29 ` [RFC PATCH bpf-next 1/3] bpf: Add bpf_for_each_cpu helper Yafang Shao
2023-08-01 14:29 ` [RFC PATCH bpf-next 2/3] cgroup, psi: Init root cgroup psi to psi_system Yafang Shao
2023-08-01 14:29 ` [RFC PATCH bpf-next 3/3] selftests/bpf: Add selftest for for_each_cpu Yafang Shao
2023-08-01 17:53 ` [RFC PATCH bpf-next 0/3] bpf: Add new bpf helper bpf_for_each_cpu Yonghong Song
2023-08-02  2:33   ` Yafang Shao
2023-08-02  2:45     ` Alexei Starovoitov
2023-08-02  2:57       ` Yafang Shao
2023-08-02  3:29       ` David Vernet
2023-08-02  6:54         ` Yonghong Song
2023-08-02 15:46           ` David Vernet
2023-08-02 16:23             ` Alexei Starovoitov
2023-08-02 16:33         ` Alexei Starovoitov
2023-08-02 17:06           ` David Vernet
2023-08-02 18:13             ` Alexei Starovoitov
2023-08-03  8:21           ` Alan Maguire
2023-08-03 15:22             ` Yonghong Song
2023-08-03 16:10               ` Alan Maguire
