Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4

public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
       [not found]         ` <ZFPw0scDq1eIzfHr@kernel.org>
@ 2023-05-04 18:50           ` Andrii Nakryiko
  2023-05-04 19:07             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 30+ messages in thread
From: Andrii Nakryiko @ 2023-05-04 18:50 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Linus Torvalds, Song Liu, Andrii Nakryiko, Ingo Molnar,
	Thomas Gleixner, Jiri Olsa, Namhyung Kim, Clark Williams,
	Kate Carcia, linux-kernel, linux-perf-users, Adrian Hunter,
	Changbin Du, Hao Luo, Ian Rogers, James Clark, Kan Liang,
	Roman Lozko, Stephane Eranian, Thomas Richter,
	Arnaldo Carvalho de Melo, bpf

On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Thu, May 04, 2023 at 10:25:30AM -0700, Linus Torvalds escreveu:
> > On Thu, May 4, 2023 at 4:09 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > Does building runqslower works for you in this same environment where
> > > building perf failed?
>
> > I don't know, and I don't care. I've never used that thing, and I'm
> > not going to.
>
> > And it's irrelevant. Two wrongs do not make a right.
>
> > I'm going to ignore perf tools pulls going forward if this is the kind
> > of argument for garbage that you use.
>
> > Because a billion flies *can* be wrong.
>
> I pushed two reverts there that make this back into a
> opt-in/experimental feature till we fix the issue you reported:
>
> ⬢[acme@toolbox perf-tools]$ git log --oneline -3
> e7b7a54767a71c67 (HEAD -> perf-tools, acme/perf-tools) Revert "perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL"
> 6957bdf37a1e6eca Revert "perf build: Warn for BPF skeletons if endian mismatches"
> 1f85d016768ff19f (tag: perf-tools-for-v6.4-1-2023-05-03) perf test record+probe_libc_inet_pton: Fix call chain match on x86_64
> ⬢[acme@toolbox perf-tools]$
>
> Its in:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf-tools
>
> Using a vmlinux.h file built by bpftool from the BTF info, be it in a
> vmlinux file or in /sys/kernel/btf/vmlinux (a RAW BTF file) is used for
> building the BPF bytecode, using clang:
>
> ⬢[acme@toolbox perf-tools]$ head tools/perf/util/bpf_skel/sample_filter.bpf.c
> // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> // Copyright (c) 2023 Google
> #include "vmlinux.h"
> #include <bpf/bpf_helpers.h>
> #include <bpf/bpf_tracing.h>
> #include <bpf/bpf_core_read.h>
>
> #include "sample-filter.h"
>
> /* BPF map that will be filled by user space */
> ⬢[acme@toolbox perf-tools]$
>
> So that it can access kernel types and store the type info for those
> types together with the BPF bytecode, as BTF info, and later use this
> and relocation records for libbpf to be able to adjust things when
> accessed data structures change in the kernel and needs adjustments
> based in both the kernel BTF info (/sys/kernel/btf/vmlinux) and the
> BPF bytecode being loaded (in its .BTF ELF section).
>
> Andrii, can you add some more information about the usage of vmlinux.h
> instead of using kernel headers?
>

I'll just say that vmlinux.h is not a hard requirement to build BPF
programs, it's more a convenience allowing easy access to definitions
of both UAPI and kernel-internal structures for tracing needs and
marking them relocatable using BPF CO-RE machinery. Lots of real-world
applications just check-in pregenerated vmlinux.h to avoid build-time
dependency on up-to-date host kernel and such.

If vmlinux.h generation and usage is causing issues, though, given
that perf's BPF programs don't seem to be using many different kernel
types, it might be a better option to just use UAPI headers for public
kernel type definitions, and just define CO-RE-relocatable minimal
definitions locally in perf's BPF code for the other types necessary.
E.g., if perf needs only pid and tgid from task_struct, this would
suffice:

struct task_struct {
    int pid;
    int tgid;
} __attribute__((preserve_access_index));

> - Arnaldo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-04 18:50           ` BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4 Andrii Nakryiko
@ 2023-05-04 19:07             ` Arnaldo Carvalho de Melo
  2023-05-04 21:48               ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-05-04 19:07 UTC (permalink / raw)
  To: Andrii Nakryiko, Linus Torvalds
  Cc: Song Liu, Andrii Nakryiko, Ingo Molnar, Thomas Gleixner,
	Jiri Olsa, Namhyung Kim, Clark Williams, Kate Carcia,
	linux-kernel, linux-perf-users, Adrian Hunter, Changbin Du,
	Hao Luo, Ian Rogers, James Clark, Kan Liang, Roman Lozko,
	Stephane Eranian, Thomas Richter, Arnaldo Carvalho de Melo, bpf

Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > Andrii, can you add some more information about the usage of vmlinux.h
> > instead of using kernel headers?
 
> I'll just say that vmlinux.h is not a hard requirement to build BPF
> programs, it's more a convenience allowing easy access to definitions
> of both UAPI and kernel-internal structures for tracing needs and
> marking them relocatable using BPF CO-RE machinery. Lots of real-world
> applications just check-in pregenerated vmlinux.h to avoid build-time
> dependency on up-to-date host kernel and such.
 
> If vmlinux.h generation and usage is causing issues, though, given
> that perf's BPF programs don't seem to be using many different kernel
> types, it might be a better option to just use UAPI headers for public
> kernel type definitions, and just define CO-RE-relocatable minimal
> definitions locally in perf's BPF code for the other types necessary.
> E.g., if perf needs only pid and tgid from task_struct, this would
> suffice:
 
> struct task_struct {
>     int pid;
>     int tgid;
> } __attribute__((preserve_access_index));

Yeah, that seems like a way better approach, no vmlinux involved, libbpf
CO-RE notices that task_struct changed from this two integers version
(of course) and does the relocation to where it is in the running kernel
by using /sys/kernel/btf/vmlinux.

I looked and the creation of vmlinux.h was introduced in:

commit 944138f048f7d7591ec7568c94b21de8df2724d4
Author: Namhyung Kim <namhyung@kernel.org>
Date:   Thu Jul 1 14:12:27 2021 -0700

    perf stat: Enable BPF counter with --for-each-cgroup

    Recently bperf was added to use BPF to count perf events for various
    purposes.  This is an extension for the approach and targetting to
    cgroup usages.

    Unlike the other bperf, it doesn't share the events with other
    processes but it'd reduces unnecessary events (and the overhead of
    multiplexing) for each monitored cgroup within the perf session.

    When --for-each-cgroup is used with --bpf-counters, it will open
    cgroup-switches event per cpu internally and attach the new BPF
    program to read given perf_events and to aggregate the results for
    cgroups.  It's only called when task is switched to a task in a
    different cgroup.

    Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    Acked-by: Song Liu <songliubraving@fb.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Stephane Eranian <eranian@google.com>
    Link: http://lore.kernel.org/lkml/20210701211227.1403788-1-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Which I think was the first BPF skel to access a kernel data structure,
yeah:

tools/perf/util/bpf_skel/bperf_cgroup.bpf.c

For things like:

+static inline int get_cgroup_v1_idx(__u32 *cgrps, int size)
+{
+       struct task_struct *p = (void *)bpf_get_current_task();
+       struct cgroup *cgrp;
+       register int i = 0;
+       __u32 *elem;
+       int level;
+       int cnt;
+
+       cgrp = BPF_CORE_READ(p, cgroups, subsys[perf_event_cgrp_id], cgroup);
+       level = BPF_CORE_READ(cgrp, level);

So we can completely remove touching vmlinux from the perf building
process.

If we can get the revert of the patches making BPF skels to build by
default for v6.4 then we would do this work, test it thorougly and have
it available for v6.5.

Linus, would that be a way forward?

- Arnaldo

For reference, here is the definition for BPF_CORE_READ() from tools/lib/bpf/bpf_core_read.h

/*
 * BPF_CORE_READ() is used to simplify BPF CO-RE relocatable read, especially
 * when there are few pointer chasing steps.
 * E.g., what in non-BPF world (or in BPF w/ BCC) would be something like:
 *      int x = s->a.b.c->d.e->f->g;
 * can be succinctly achieved using BPF_CORE_READ as:
 *      int x = BPF_CORE_READ(s, a.b.c, d.e, f, g);
 *
 * BPF_CORE_READ will decompose above statement into 4 bpf_core_read (BPF
 * CO-RE relocatable bpf_probe_read_kernel() wrapper) calls, logically
 * equivalent to:
 * 1. const void *__t = s->a.b.c;
 * 2. __t = __t->d.e;
 * 3. __t = __t->f;
 * 4. return __t->g;
 *
 * Equivalence is logical, because there is a heavy type casting/preservation
 * involved, as well as all the reads are happening through
 * bpf_probe_read_kernel() calls using __builtin_preserve_access_index() to
 * emit CO-RE relocations.
 *
 * N.B. Only up to 9 "field accessors" are supported, which should be more
 * than enough for any practical purpose.
 */
#define BPF_CORE_READ(src, a, ...) ({                                       \
        ___type((src), a, ##__VA_ARGS__) __r;                               \
        BPF_CORE_READ_INTO(&__r, (src), a, ##__VA_ARGS__);                  \
        __r;                                                                \
})

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-04 19:07             ` Arnaldo Carvalho de Melo
@ 2023-05-04 21:48               ` Arnaldo Carvalho de Melo
  2023-05-04 22:01                 ` Arnaldo Carvalho de Melo
                                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-05-04 21:48 UTC (permalink / raw)
  To: Andrii Nakryiko, Namhyung Kim, Linus Torvalds
  Cc: Song Liu, Andrii Nakryiko, Ingo Molnar, Thomas Gleixner,
	Jiri Olsa, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Adrian Hunter, Changbin Du, Hao Luo, Ian Rogers,
	James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
	Thomas Richter, Arnaldo Carvalho de Melo, bpf

Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > Andrii, can you add some more information about the usage of vmlinux.h
> > > instead of using kernel headers?
>  
> > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > programs, it's more a convenience allowing easy access to definitions
> > of both UAPI and kernel-internal structures for tracing needs and
> > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > applications just check-in pregenerated vmlinux.h to avoid build-time
> > dependency on up-to-date host kernel and such.
>  
> > If vmlinux.h generation and usage is causing issues, though, given
> > that perf's BPF programs don't seem to be using many different kernel
> > types, it might be a better option to just use UAPI headers for public
> > kernel type definitions, and just define CO-RE-relocatable minimal
> > definitions locally in perf's BPF code for the other types necessary.
> > E.g., if perf needs only pid and tgid from task_struct, this would
> > suffice:
>  
> > struct task_struct {
> >     int pid;
> >     int tgid;
> > } __attribute__((preserve_access_index));
> 
> Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> CO-RE notices that task_struct changed from this two integers version
> (of course) and does the relocation to where it is in the running kernel
> by using /sys/kernel/btf/vmlinux.

Doing it for one of the skels, build tested, runtime untested, but not
using any vmlinux, BTF to help, not that bad, more verbose, but at least
we state what are the fields we actually use, have those attribute
documenting that those offsets will be recorded for future use, etc.

Namhyung, can you please check that this works?

Thanks,

- Arnaldo

diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
index 6a438e0102c5a2cb..f376d162549ebd74 100644
--- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
+++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
@@ -1,11 +1,40 @@
 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 // Copyright (c) 2021 Facebook
 // Copyright (c) 2021 Google
-#include "vmlinux.h"
+#include <linux/types.h>
+#include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
 #include <bpf/bpf_core_read.h>
 
+// libbpf's CO-RE will take care of the relocations so that these fields match
+// the layout of these structs in the kernel where this ends up running on.
+
+struct cgroup_subsys_state {
+	struct cgroup *cgroup;
+} __attribute__((preserve_access_index));
+
+struct css_set {
+	struct cgroup_subsys_state *subsys[13];
+} __attribute__((preserve_access_index));
+
+struct task_struct {
+	struct css_set *cgroups;
+} __attribute__((preserve_access_index));
+
+struct kernfs_node {
+	__u64 id;
+}  __attribute__((preserve_access_index));
+
+struct cgroup {
+	struct kernfs_node *kn;
+	int                level;
+}  __attribute__((preserve_access_index));
+
+enum cgroup_subsys_id {
+	perf_event_cgrp_id  = 8,
+};
+
 #define MAX_LEVELS  10  // max cgroup hierarchy level: arbitrary
 #define MAX_EVENTS  32  // max events per cgroup: arbitrary
 
@@ -52,7 +81,7 @@ struct cgroup___new {
 /* old kernel cgroup definition */
 struct cgroup___old {
 	int level;
-	u64 ancestor_ids[];
+	__u64 ancestor_ids[];
 } __attribute__((preserve_access_index));
 
 const volatile __u32 num_events = 1;

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-04 21:48               ` Arnaldo Carvalho de Melo
@ 2023-05-04 22:01                 ` Arnaldo Carvalho de Melo
  2023-05-05 13:18                   ` Arnaldo Carvalho de Melo
  2023-05-05 13:20                   ` Arnaldo Carvalho de Melo
  2023-05-04 22:03                 ` Ian Rogers
  2023-05-04 22:46                 ` Namhyung Kim
  2 siblings, 2 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-05-04 22:01 UTC (permalink / raw)
  To: Andrii Nakryiko, Namhyung Kim, Linus Torvalds
  Cc: Song Liu, Andrii Nakryiko, Ingo Molnar, Thomas Gleixner,
	Jiri Olsa, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Adrian Hunter, Changbin Du, Hao Luo, Ian Rogers,
	James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
	Thomas Richter, Arnaldo Carvalho de Melo, bpf

Em Thu, May 04, 2023 at 06:48:50PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > instead of using kernel headers?
> >  
> > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > programs, it's more a convenience allowing easy access to definitions
> > > of both UAPI and kernel-internal structures for tracing needs and
> > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > dependency on up-to-date host kernel and such.
> >  
> > > If vmlinux.h generation and usage is causing issues, though, given
> > > that perf's BPF programs don't seem to be using many different kernel
> > > types, it might be a better option to just use UAPI headers for public
> > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > definitions locally in perf's BPF code for the other types necessary.
> > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > suffice:
> >  
> > > struct task_struct {
> > >     int pid;
> > >     int tgid;
> > > } __attribute__((preserve_access_index));
> > 
> > Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> > CO-RE notices that task_struct changed from this two integers version
> > (of course) and does the relocation to where it is in the running kernel
> > by using /sys/kernel/btf/vmlinux.
> 
> Doing it for one of the skels, build tested, runtime untested, but not
> using any vmlinux, BTF to help, not that bad, more verbose, but at least
> we state what are the fields we actually use, have those attribute
> documenting that those offsets will be recorded for future use, etc.
> 
> Namhyung, can you please check that this works?

Second case was simpler:

diff --git a/tools/perf/util/bpf_skel/bperf_follower.bpf.c b/tools/perf/util/bpf_skel/bperf_follower.bpf.c
index f193998530d431d8..1ab06f2ff5ad7548 100644
--- a/tools/perf/util/bpf_skel/bperf_follower.bpf.c
+++ b/tools/perf/util/bpf_skel/bperf_follower.bpf.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 // Copyright (c) 2021 Facebook
-#include "vmlinux.h"
+#include <linux/types.h>
+#include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
 #include "bperf_u.h"

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-04 21:48               ` Arnaldo Carvalho de Melo
  2023-05-04 22:01                 ` Arnaldo Carvalho de Melo
@ 2023-05-04 22:03                 ` Ian Rogers
  2023-05-04 23:03                   ` Jiri Olsa
  2023-05-04 22:46                 ` Namhyung Kim
  2 siblings, 1 reply; 30+ messages in thread
From: Ian Rogers @ 2023-05-04 22:03 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Namhyung Kim, Linus Torvalds, Song Liu,
	Andrii Nakryiko, Ingo Molnar, Thomas Gleixner, Jiri Olsa,
	Clark Williams, Kate Carcia, linux-kernel, linux-perf-users,
	Adrian Hunter, Changbin Du, Hao Luo, James Clark, Kan Liang,
	Roman Lozko, Stephane Eranian, Thomas Richter,
	Arnaldo Carvalho de Melo, bpf

On Thu, May 4, 2023 at 2:48 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > instead of using kernel headers?
> >
> > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > programs, it's more a convenience allowing easy access to definitions
> > > of both UAPI and kernel-internal structures for tracing needs and
> > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > dependency on up-to-date host kernel and such.
> >
> > > If vmlinux.h generation and usage is causing issues, though, given
> > > that perf's BPF programs don't seem to be using many different kernel
> > > types, it might be a better option to just use UAPI headers for public
> > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > definitions locally in perf's BPF code for the other types necessary.
> > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > suffice:
> >
> > > struct task_struct {
> > >     int pid;
> > >     int tgid;
> > > } __attribute__((preserve_access_index));
> >
> > Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> > CO-RE notices that task_struct changed from this two integers version
> > (of course) and does the relocation to where it is in the running kernel
> > by using /sys/kernel/btf/vmlinux.
>
> Doing it for one of the skels, build tested, runtime untested, but not
> using any vmlinux, BTF to help, not that bad, more verbose, but at least
> we state what are the fields we actually use, have those attribute
> documenting that those offsets will be recorded for future use, etc.
>
> Namhyung, can you please check that this works?
>
> Thanks,
>
> - Arnaldo
>
> diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> index 6a438e0102c5a2cb..f376d162549ebd74 100644
> --- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> +++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> @@ -1,11 +1,40 @@
>  // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>  // Copyright (c) 2021 Facebook
>  // Copyright (c) 2021 Google
> -#include "vmlinux.h"
> +#include <linux/types.h>
> +#include <linux/bpf.h>

Compared to vmlinux.h here be dragons. It is easy to start dragging in
all of libc and that may not work due to missing #ifdefs, etc.. Could
we check in a vmlinux.h like libbpf-tools does?
https://github.com/iovisor/bcc/tree/master/libbpf-tools#vmlinuxh-generation
https://github.com/iovisor/bcc/tree/master/libbpf-tools/arm64

This would also remove some of the errors that could be introduced by
copy+pasting enums, etc. and also highlight issues with things being
renamed as build time rather than runtime failures.
Could this be some shared resource for the different linux tools
projects using a vmlinux.h? e.g. tools/lib/vmlinuxh with an
install_headers target that builds a vmlinux.h.

Thanks,
Ian

>  #include <bpf/bpf_helpers.h>
>  #include <bpf/bpf_tracing.h>
>  #include <bpf/bpf_core_read.h>
>
> +// libbpf's CO-RE will take care of the relocations so that these fields match
> +// the layout of these structs in the kernel where this ends up running on.
> +
> +struct cgroup_subsys_state {
> +       struct cgroup *cgroup;
> +} __attribute__((preserve_access_index));
> +
> +struct css_set {
> +       struct cgroup_subsys_state *subsys[13];
> +} __attribute__((preserve_access_index));
> +
> +struct task_struct {
> +       struct css_set *cgroups;
> +} __attribute__((preserve_access_index));
> +
> +struct kernfs_node {
> +       __u64 id;
> +}  __attribute__((preserve_access_index));
> +
> +struct cgroup {
> +       struct kernfs_node *kn;
> +       int                level;
> +}  __attribute__((preserve_access_index));
> +
> +enum cgroup_subsys_id {
> +       perf_event_cgrp_id  = 8,
> +};
> +
>  #define MAX_LEVELS  10  // max cgroup hierarchy level: arbitrary
>  #define MAX_EVENTS  32  // max events per cgroup: arbitrary
>
> @@ -52,7 +81,7 @@ struct cgroup___new {
>  /* old kernel cgroup definition */
>  struct cgroup___old {
>         int level;
> -       u64 ancestor_ids[];
> +       __u64 ancestor_ids[];
>  } __attribute__((preserve_access_index));
>
>  const volatile __u32 num_events = 1;

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-04 21:48               ` Arnaldo Carvalho de Melo
  2023-05-04 22:01                 ` Arnaldo Carvalho de Melo
  2023-05-04 22:03                 ` Ian Rogers
@ 2023-05-04 22:46                 ` Namhyung Kim
  2 siblings, 0 replies; 30+ messages in thread
From: Namhyung Kim @ 2023-05-04 22:46 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Linus Torvalds, Song Liu, Andrii Nakryiko,
	Ingo Molnar, Thomas Gleixner, Jiri Olsa, Clark Williams,
	Kate Carcia, linux-kernel, linux-perf-users, Adrian Hunter,
	Changbin Du, Hao Luo, Ian Rogers, James Clark, Kan Liang,
	Roman Lozko, Stephane Eranian, Thomas Richter,
	Arnaldo Carvalho de Melo, bpf

On Thu, May 04, 2023 at 06:48:50PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > instead of using kernel headers?
> >  
> > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > programs, it's more a convenience allowing easy access to definitions
> > > of both UAPI and kernel-internal structures for tracing needs and
> > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > dependency on up-to-date host kernel and such.
> >  
> > > If vmlinux.h generation and usage is causing issues, though, given
> > > that perf's BPF programs don't seem to be using many different kernel
> > > types, it might be a better option to just use UAPI headers for public
> > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > definitions locally in perf's BPF code for the other types necessary.
> > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > suffice:
> >  
> > > struct task_struct {
> > >     int pid;
> > >     int tgid;
> > > } __attribute__((preserve_access_index));
> > 
> > Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> > CO-RE notices that task_struct changed from this two integers version
> > (of course) and does the relocation to where it is in the running kernel
> > by using /sys/kernel/btf/vmlinux.
> 
> Doing it for one of the skels, build tested, runtime untested, but not
> using any vmlinux, BTF to help, not that bad, more verbose, but at least
> we state what are the fields we actually use, have those attribute
> documenting that those offsets will be recorded for future use, etc.
> 
> Namhyung, can you please check that this works?

Yep, it works great!

  $ sudo ./perf stat -a --bpf-counters --for-each-cgroup /,user.slice,system.slice sleep 1
  
   Performance counter stats for 'system wide':
  
           64,110.41 msec cpu-clock                        /                #   64.004 CPUs utilized
              15,787      context-switches                 /                #  246.247 /sec
                  72      cpu-migrations                   /                #    1.123 /sec
               1,236      page-faults                      /                #   19.279 /sec
         848,608,137      cycles                           /                #    0.013 GHz                         (83.23%)
         106,928,070      stalled-cycles-frontend          /                #   12.60% frontend cycles idle        (83.23%)
         209,204,795      stalled-cycles-backend           /                #   24.65% backend cycles idle         (83.23%)
         645,183,025      instructions                     /                #    0.76  insn per cycle
                                                    #    0.32  stalled cycles per insn     (83.24%)
         141,776,876      branches                         /                #    2.211 M/sec                       (83.63%)
           3,001,078      branch-misses                    /                #    2.12% of all branches             (83.44%)
               66.67 msec cpu-clock                        user.slice       #    0.067 CPUs utilized
                 695      context-switches                 user.slice       #   10.424 K/sec
                  22      cpu-migrations                   user.slice       #  329.966 /sec
               1,202      page-faults                      user.slice       #   18.028 K/sec
         150,514,330      cycles                           user.slice       #    2.257 GHz                         (90.17%)
          13,504,605      stalled-cycles-frontend          user.slice       #    8.97% frontend cycles idle        (69.71%)
          38,859,376      stalled-cycles-backend           user.slice       #   25.82% backend cycles idle         (95.28%)
         189,382,145      instructions                     user.slice       #    1.26  insn per cycle
                                                    #    0.21  stalled cycles per insn     (88.92%)
          36,019,878      branches                         user.slice       #  540.242 M/sec                       (90.16%)
             697,723      branch-misses                    user.slice       #    1.94% of all branches             (65.77%)
               44.33 msec cpu-clock                        system.slice     #    0.044 CPUs utilized
               2,382      context-switches                 system.slice     #   53.732 K/sec
                  42      cpu-migrations                   system.slice     #  947.418 /sec
                  34      page-faults                      system.slice     #  766.958 /sec
         100,383,549      cycles                           system.slice     #    2.264 GHz                         (87.27%)
          10,165,225      stalled-cycles-frontend          system.slice     #   10.13% frontend cycles idle        (71.73%)
          29,964,682      stalled-cycles-backend           system.slice     #   29.85% backend cycles idle         (84.94%)
         101,210,743      instructions                     system.slice     #    1.01  insn per cycle
                                                    #    0.30  stalled cycles per insn     (80.68%)
          19,893,831      branches                         system.slice     #  448.757 M/sec                       (86.94%)
             397,854      branch-misses                    system.slice     #    2.00% of all branches             (88.42%)
  
         1.001667221 seconds time elapsed
  
Thanks,
Namhyung


> diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> index 6a438e0102c5a2cb..f376d162549ebd74 100644
> --- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> +++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> @@ -1,11 +1,40 @@
>  // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>  // Copyright (c) 2021 Facebook
>  // Copyright (c) 2021 Google
> -#include "vmlinux.h"
> +#include <linux/types.h>
> +#include <linux/bpf.h>
>  #include <bpf/bpf_helpers.h>
>  #include <bpf/bpf_tracing.h>
>  #include <bpf/bpf_core_read.h>
>  
> +// libbpf's CO-RE will take care of the relocations so that these fields match
> +// the layout of these structs in the kernel where this ends up running on.
> +
> +struct cgroup_subsys_state {
> +	struct cgroup *cgroup;
> +} __attribute__((preserve_access_index));
> +
> +struct css_set {
> +	struct cgroup_subsys_state *subsys[13];
> +} __attribute__((preserve_access_index));
> +
> +struct task_struct {
> +	struct css_set *cgroups;
> +} __attribute__((preserve_access_index));
> +
> +struct kernfs_node {
> +	__u64 id;
> +}  __attribute__((preserve_access_index));
> +
> +struct cgroup {
> +	struct kernfs_node *kn;
> +	int                level;
> +}  __attribute__((preserve_access_index));
> +
> +enum cgroup_subsys_id {
> +	perf_event_cgrp_id  = 8,
> +};
> +
>  #define MAX_LEVELS  10  // max cgroup hierarchy level: arbitrary
>  #define MAX_EVENTS  32  // max events per cgroup: arbitrary
>  
> @@ -52,7 +81,7 @@ struct cgroup___new {
>  /* old kernel cgroup definition */
>  struct cgroup___old {
>  	int level;
> -	u64 ancestor_ids[];
> +	__u64 ancestor_ids[];
>  } __attribute__((preserve_access_index));
>  
>  const volatile __u32 num_events = 1;
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-04 22:03                 ` Ian Rogers
@ 2023-05-04 23:03                   ` Jiri Olsa
  2023-05-04 23:15                     ` Namhyung Kim
                                       ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Jiri Olsa @ 2023-05-04 23:03 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Namhyung Kim,
	Linus Torvalds, Song Liu, Andrii Nakryiko, Ingo Molnar,
	Thomas Gleixner, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Adrian Hunter, Changbin Du, Hao Luo,
	James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
	Thomas Richter, Arnaldo Carvalho de Melo, bpf

On Thu, May 04, 2023 at 03:03:42PM -0700, Ian Rogers wrote:
> On Thu, May 4, 2023 at 2:48 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >
> > Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> > > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > > instead of using kernel headers?
> > >
> > > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > > programs, it's more a convenience allowing easy access to definitions
> > > > of both UAPI and kernel-internal structures for tracing needs and
> > > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > > dependency on up-to-date host kernel and such.
> > >
> > > > If vmlinux.h generation and usage is causing issues, though, given
> > > > that perf's BPF programs don't seem to be using many different kernel
> > > > types, it might be a better option to just use UAPI headers for public
> > > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > > definitions locally in perf's BPF code for the other types necessary.
> > > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > > suffice:
> > >
> > > > struct task_struct {
> > > >     int pid;
> > > >     int tgid;
> > > > } __attribute__((preserve_access_index));
> > >
> > > Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> > > CO-RE notices that task_struct changed from this two integers version
> > > (of course) and does the relocation to where it is in the running kernel
> > > by using /sys/kernel/btf/vmlinux.
> >
> > Doing it for one of the skels, build tested, runtime untested, but not
> > using any vmlinux, BTF to help, not that bad, more verbose, but at least
> > we state what are the fields we actually use, have those attribute
> > documenting that those offsets will be recorded for future use, etc.
> >
> > Namhyung, can you please check that this works?
> >
> > Thanks,
> >
> > - Arnaldo
> >
> > diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > index 6a438e0102c5a2cb..f376d162549ebd74 100644
> > --- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > +++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > @@ -1,11 +1,40 @@
> >  // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> >  // Copyright (c) 2021 Facebook
> >  // Copyright (c) 2021 Google
> > -#include "vmlinux.h"
> > +#include <linux/types.h>
> > +#include <linux/bpf.h>
> 
> Compared to vmlinux.h here be dragons. It is easy to start dragging in
> all of libc and that may not work due to missing #ifdefs, etc.. Could
> we check in a vmlinux.h like libbpf-tools does?
> https://github.com/iovisor/bcc/tree/master/libbpf-tools#vmlinuxh-generation
> https://github.com/iovisor/bcc/tree/master/libbpf-tools/arm64
> 
> This would also remove some of the errors that could be introduced by
> copy+pasting enums, etc. and also highlight issues with things being
> renamed as build time rather than runtime failures.

we already have to deal with that, right? doing checks on fields in
structs like mm_struct___old

> Could this be some shared resource for the different linux tools
> projects using a vmlinux.h? e.g. tools/lib/vmlinuxh with an
> install_headers target that builds a vmlinux.h.

I tried to do the minimal header and it's not too big,
I pushed it in here:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=perf/vmlinux_h

compile tested so far

jirka

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-04 23:03                   ` Jiri Olsa
@ 2023-05-04 23:15                     ` Namhyung Kim
  2023-05-05  9:36                       ` Jiri Olsa
  2023-05-04 23:19                     ` Ian Rogers
  2023-05-05 13:33                     ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 30+ messages in thread
From: Namhyung Kim @ 2023-05-04 23:15 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Ian Rogers, Arnaldo Carvalho de Melo, Andrii Nakryiko,
	Linus Torvalds, Song Liu, Andrii Nakryiko, Ingo Molnar,
	Thomas Gleixner, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Adrian Hunter, Changbin Du, Hao Luo,
	James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
	Thomas Richter, Arnaldo Carvalho de Melo, bpf

Hi Jiri,

On Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa wrote:
> On Thu, May 04, 2023 at 03:03:42PM -0700, Ian Rogers wrote:
> > On Thu, May 4, 2023 at 2:48 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > >
> > > Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> > > > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > > > instead of using kernel headers?
> > > >
> > > > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > > > programs, it's more a convenience allowing easy access to definitions
> > > > > of both UAPI and kernel-internal structures for tracing needs and
> > > > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > > > dependency on up-to-date host kernel and such.
> > > >
> > > > > If vmlinux.h generation and usage is causing issues, though, given
> > > > > that perf's BPF programs don't seem to be using many different kernel
> > > > > types, it might be a better option to just use UAPI headers for public
> > > > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > > > definitions locally in perf's BPF code for the other types necessary.
> > > > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > > > suffice:
> > > >
> > > > > struct task_struct {
> > > > >     int pid;
> > > > >     int tgid;
> > > > > } __attribute__((preserve_access_index));
> > > >
> > > > Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> > > > CO-RE notices that task_struct changed from this two integers version
> > > > (of course) and does the relocation to where it is in the running kernel
> > > > by using /sys/kernel/btf/vmlinux.
> > >
> > > Doing it for one of the skels, build tested, runtime untested, but not
> > > using any vmlinux, BTF to help, not that bad, more verbose, but at least
> > > we state what are the fields we actually use, have those attribute
> > > documenting that those offsets will be recorded for future use, etc.
> > >
> > > Namhyung, can you please check that this works?
> > >
> > > Thanks,
> > >
> > > - Arnaldo
> > >
> > > diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > index 6a438e0102c5a2cb..f376d162549ebd74 100644
> > > --- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > +++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > @@ -1,11 +1,40 @@
> > >  // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > >  // Copyright (c) 2021 Facebook
> > >  // Copyright (c) 2021 Google
> > > -#include "vmlinux.h"
> > > +#include <linux/types.h>
> > > +#include <linux/bpf.h>
> > 
> > Compared to vmlinux.h here be dragons. It is easy to start dragging in
> > all of libc and that may not work due to missing #ifdefs, etc.. Could
> > we check in a vmlinux.h like libbpf-tools does?
> > https://github.com/iovisor/bcc/tree/master/libbpf-tools#vmlinuxh-generation
> > https://github.com/iovisor/bcc/tree/master/libbpf-tools/arm64
> > 
> > This would also remove some of the errors that could be introduced by
> > copy+pasting enums, etc. and also highlight issues with things being
> > renamed as build time rather than runtime failures.
> 
> we already have to deal with that, right? doing checks on fields in
> structs like mm_struct___old
> 
> > Could this be some shared resource for the different linux tools
> > projects using a vmlinux.h? e.g. tools/lib/vmlinuxh with an
> > install_headers target that builds a vmlinux.h.
> 
> I tried to do the minimal header and it's not too big,
> I pushed it in here:
>   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=perf/vmlinux_h
> 
> compile tested so far
 
Cool.  But I think you missed this.

diff --git a/tools/perf/util/bpf_skel/perf-defs.h b/tools/perf/util/bpf_skel/perf-defs.h
index 1320e1be03b8..4cfa8a9fce39 100644
--- a/tools/perf/util/bpf_skel/perf-defs.h
+++ b/tools/perf/util/bpf_skel/perf-defs.h
@@ -253,6 +253,7 @@ typedef struct {
 } atomic64_t;
 
 struct rw_semaphore {
+	atomic_long_t owner;
 } __attribute__((preserve_access_index));
 
 typedef atomic64_t atomic_long_t;


Thanks,
Namhyung

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-04 23:03                   ` Jiri Olsa
  2023-05-04 23:15                     ` Namhyung Kim
@ 2023-05-04 23:19                     ` Ian Rogers
  2023-05-05  9:39                       ` Jiri Olsa
  2023-05-05 13:33                     ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 30+ messages in thread
From: Ian Rogers @ 2023-05-04 23:19 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Namhyung Kim,
	Linus Torvalds, Song Liu, Andrii Nakryiko, Ingo Molnar,
	Thomas Gleixner, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Adrian Hunter, Changbin Du, Hao Luo,
	James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
	Thomas Richter, Arnaldo Carvalho de Melo, bpf

On Thu, May 4, 2023 at 4:03 PM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Thu, May 04, 2023 at 03:03:42PM -0700, Ian Rogers wrote:
> > On Thu, May 4, 2023 at 2:48 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > >
> > > Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> > > > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > > > instead of using kernel headers?
> > > >
> > > > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > > > programs, it's more a convenience allowing easy access to definitions
> > > > > of both UAPI and kernel-internal structures for tracing needs and
> > > > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > > > dependency on up-to-date host kernel and such.
> > > >
> > > > > If vmlinux.h generation and usage is causing issues, though, given
> > > > > that perf's BPF programs don't seem to be using many different kernel
> > > > > types, it might be a better option to just use UAPI headers for public
> > > > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > > > definitions locally in perf's BPF code for the other types necessary.
> > > > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > > > suffice:
> > > >
> > > > > struct task_struct {
> > > > >     int pid;
> > > > >     int tgid;
> > > > > } __attribute__((preserve_access_index));
> > > >
> > > > Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> > > > CO-RE notices that task_struct changed from this two integers version
> > > > (of course) and does the relocation to where it is in the running kernel
> > > > by using /sys/kernel/btf/vmlinux.
> > >
> > > Doing it for one of the skels, build tested, runtime untested, but not
> > > using any vmlinux, BTF to help, not that bad, more verbose, but at least
> > > we state what are the fields we actually use, have those attribute
> > > documenting that those offsets will be recorded for future use, etc.
> > >
> > > Namhyung, can you please check that this works?
> > >
> > > Thanks,
> > >
> > > - Arnaldo
> > >
> > > diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > index 6a438e0102c5a2cb..f376d162549ebd74 100644
> > > --- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > +++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > @@ -1,11 +1,40 @@
> > >  // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > >  // Copyright (c) 2021 Facebook
> > >  // Copyright (c) 2021 Google
> > > -#include "vmlinux.h"
> > > +#include <linux/types.h>
> > > +#include <linux/bpf.h>
> >
> > Compared to vmlinux.h here be dragons. It is easy to start dragging in
> > all of libc and that may not work due to missing #ifdefs, etc.. Could
> > we check in a vmlinux.h like libbpf-tools does?
> > https://github.com/iovisor/bcc/tree/master/libbpf-tools#vmlinuxh-generation
> > https://github.com/iovisor/bcc/tree/master/libbpf-tools/arm64
> >
> > This would also remove some of the errors that could be introduced by
> > copy+pasting enums, etc. and also highlight issues with things being
> > renamed as build time rather than runtime failures.
>
> we already have to deal with that, right? doing checks on fields in
> structs like mm_struct___old

We do, but the way I detected the problems in the first place was by
building against older kernels. Now the build will always succeed but
fail at runtime.

> > Could this be some shared resource for the different linux tools
> > projects using a vmlinux.h? e.g. tools/lib/vmlinuxh with an
> > install_headers target that builds a vmlinux.h.
>
> I tried to do the minimal header and it's not too big,
> I pushed it in here:
>   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=perf/vmlinux_h
>
> compile tested so far
>
> jirka

Cool, could we just call it vmlinux.h rather than perf-defs.h?

I notice cgroup_subsys_id is in there which is called out in Andrii's
CO-RE  guide/blog:
https://nakryiko.com/posts/bpf-core-reference-guide/#relocatable-enums
perhaps we can do something with names/types to make sure a helper is
being used for these enum values.

Thanks,
Ian

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-04 23:15                     ` Namhyung Kim
@ 2023-05-05  9:36                       ` Jiri Olsa
  0 siblings, 0 replies; 30+ messages in thread
From: Jiri Olsa @ 2023-05-05  9:36 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Ian Rogers, Arnaldo Carvalho de Melo, Andrii Nakryiko,
	Linus Torvalds, Song Liu, Andrii Nakryiko, Ingo Molnar,
	Thomas Gleixner, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Adrian Hunter, Changbin Du, Hao Luo,
	James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
	Thomas Richter, Arnaldo Carvalho de Melo, bpf

On Thu, May 04, 2023 at 04:15:08PM -0700, Namhyung Kim wrote:
> Hi Jiri,
> 
> On Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa wrote:
> > On Thu, May 04, 2023 at 03:03:42PM -0700, Ian Rogers wrote:
> > > On Thu, May 4, 2023 at 2:48 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > >
> > > > Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> > > > > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > > > > instead of using kernel headers?
> > > > >
> > > > > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > > > > programs, it's more a convenience allowing easy access to definitions
> > > > > > of both UAPI and kernel-internal structures for tracing needs and
> > > > > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > > > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > > > > dependency on up-to-date host kernel and such.
> > > > >
> > > > > > If vmlinux.h generation and usage is causing issues, though, given
> > > > > > that perf's BPF programs don't seem to be using many different kernel
> > > > > > types, it might be a better option to just use UAPI headers for public
> > > > > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > > > > definitions locally in perf's BPF code for the other types necessary.
> > > > > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > > > > suffice:
> > > > >
> > > > > > struct task_struct {
> > > > > >     int pid;
> > > > > >     int tgid;
> > > > > > } __attribute__((preserve_access_index));
> > > > >
> > > > > Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> > > > > CO-RE notices that task_struct changed from this two integers version
> > > > > (of course) and does the relocation to where it is in the running kernel
> > > > > by using /sys/kernel/btf/vmlinux.
> > > >
> > > > Doing it for one of the skels, build tested, runtime untested, but not
> > > > using any vmlinux, BTF to help, not that bad, more verbose, but at least
> > > > we state what are the fields we actually use, have those attribute
> > > > documenting that those offsets will be recorded for future use, etc.
> > > >
> > > > Namhyung, can you please check that this works?
> > > >
> > > > Thanks,
> > > >
> > > > - Arnaldo
> > > >
> > > > diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > index 6a438e0102c5a2cb..f376d162549ebd74 100644
> > > > --- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > +++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > @@ -1,11 +1,40 @@
> > > >  // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > > >  // Copyright (c) 2021 Facebook
> > > >  // Copyright (c) 2021 Google
> > > > -#include "vmlinux.h"
> > > > +#include <linux/types.h>
> > > > +#include <linux/bpf.h>
> > > 
> > > Compared to vmlinux.h here be dragons. It is easy to start dragging in
> > > all of libc and that may not work due to missing #ifdefs, etc.. Could
> > > we check in a vmlinux.h like libbpf-tools does?
> > > https://github.com/iovisor/bcc/tree/master/libbpf-tools#vmlinuxh-generation
> > > https://github.com/iovisor/bcc/tree/master/libbpf-tools/arm64
> > > 
> > > This would also remove some of the errors that could be introduced by
> > > copy+pasting enums, etc. and also highlight issues with things being
> > > renamed as build time rather than runtime failures.
> > 
> > we already have to deal with that, right? doing checks on fields in
> > structs like mm_struct___old
> > 
> > > Could this be some shared resource for the different linux tools
> > > projects using a vmlinux.h? e.g. tools/lib/vmlinuxh with an
> > > install_headers target that builds a vmlinux.h.
> > 
> > I tried to do the minimal header and it's not too big,
> > I pushed it in here:
> >   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=perf/vmlinux_h
> > 
> > compile tested so far
>  
> Cool.  But I think you missed this.
> 
> diff --git a/tools/perf/util/bpf_skel/perf-defs.h b/tools/perf/util/bpf_skel/perf-defs.h
> index 1320e1be03b8..4cfa8a9fce39 100644
> --- a/tools/perf/util/bpf_skel/perf-defs.h
> +++ b/tools/perf/util/bpf_skel/perf-defs.h
> @@ -253,6 +253,7 @@ typedef struct {
>  } atomic64_t;
>  
>  struct rw_semaphore {
> +	atomic_long_t owner;
>  } __attribute__((preserve_access_index));

ah right, I did not see that because my clang took another #ifdef leg

thanks,
jirka

>  
>  typedef atomic64_t atomic_long_t;
> 
> 
> Thanks,
> Namhyung

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-04 23:19                     ` Ian Rogers
@ 2023-05-05  9:39                       ` Jiri Olsa
  2023-05-05 11:42                         ` Jiri Olsa
  0 siblings, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2023-05-05  9:39 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, Andrii Nakryiko,
	Namhyung Kim, Linus Torvalds, Song Liu, Andrii Nakryiko,
	Ingo Molnar, Thomas Gleixner, Clark Williams, Kate Carcia,
	linux-kernel, linux-perf-users, Adrian Hunter, Changbin Du,
	Hao Luo, James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
	Thomas Richter, Arnaldo Carvalho de Melo, bpf

On Thu, May 04, 2023 at 04:19:47PM -0700, Ian Rogers wrote:
> On Thu, May 4, 2023 at 4:03 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Thu, May 04, 2023 at 03:03:42PM -0700, Ian Rogers wrote:
> > > On Thu, May 4, 2023 at 2:48 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > >
> > > > Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> > > > > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > > > > instead of using kernel headers?
> > > > >
> > > > > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > > > > programs, it's more a convenience allowing easy access to definitions
> > > > > > of both UAPI and kernel-internal structures for tracing needs and
> > > > > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > > > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > > > > dependency on up-to-date host kernel and such.
> > > > >
> > > > > > If vmlinux.h generation and usage is causing issues, though, given
> > > > > > that perf's BPF programs don't seem to be using many different kernel
> > > > > > types, it might be a better option to just use UAPI headers for public
> > > > > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > > > > definitions locally in perf's BPF code for the other types necessary.
> > > > > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > > > > suffice:
> > > > >
> > > > > > struct task_struct {
> > > > > >     int pid;
> > > > > >     int tgid;
> > > > > > } __attribute__((preserve_access_index));
> > > > >
> > > > > Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> > > > > CO-RE notices that task_struct changed from this two integers version
> > > > > (of course) and does the relocation to where it is in the running kernel
> > > > > by using /sys/kernel/btf/vmlinux.
> > > >
> > > > Doing it for one of the skels, build tested, runtime untested, but not
> > > > using any vmlinux, BTF to help, not that bad, more verbose, but at least
> > > > we state what are the fields we actually use, have those attribute
> > > > documenting that those offsets will be recorded for future use, etc.
> > > >
> > > > Namhyung, can you please check that this works?
> > > >
> > > > Thanks,
> > > >
> > > > - Arnaldo
> > > >
> > > > diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > index 6a438e0102c5a2cb..f376d162549ebd74 100644
> > > > --- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > +++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > @@ -1,11 +1,40 @@
> > > >  // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > > >  // Copyright (c) 2021 Facebook
> > > >  // Copyright (c) 2021 Google
> > > > -#include "vmlinux.h"
> > > > +#include <linux/types.h>
> > > > +#include <linux/bpf.h>
> > >
> > > Compared to vmlinux.h here be dragons. It is easy to start dragging in
> > > all of libc and that may not work due to missing #ifdefs, etc.. Could
> > > we check in a vmlinux.h like libbpf-tools does?
> > > https://github.com/iovisor/bcc/tree/master/libbpf-tools#vmlinuxh-generation
> > > https://github.com/iovisor/bcc/tree/master/libbpf-tools/arm64
> > >
> > > This would also remove some of the errors that could be introduced by
> > > copy+pasting enums, etc. and also highlight issues with things being
> > > renamed as build time rather than runtime failures.
> >
> > we already have to deal with that, right? doing checks on fields in
> > structs like mm_struct___old
> 
> We do, but the way I detected the problems in the first place was by
> building against older kernels. Now the build will always succeed but
> fail at runtime.
> 
> > > Could this be some shared resource for the different linux tools
> > > projects using a vmlinux.h? e.g. tools/lib/vmlinuxh with an
> > > install_headers target that builds a vmlinux.h.
> >
> > I tried to do the minimal header and it's not too big,
> > I pushed it in here:
> >   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=perf/vmlinux_h
> >
> > compile tested so far
> >
> > jirka
> 
> Cool, could we just call it vmlinux.h rather than perf-defs.h?

right, it also makes the change smaller

> 
> I notice cgroup_subsys_id is in there which is called out in Andrii's
> CO-RE  guide/blog:
> https://nakryiko.com/posts/bpf-core-reference-guide/#relocatable-enums
> perhaps we can do something with names/types to make sure a helper is
> being used for these enum values.

ok, I'll check on that..  so far I made some clean ups and updated the branch

thanks,
jirka

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05  9:39                       ` Jiri Olsa
@ 2023-05-05 11:42                         ` Jiri Olsa
  0 siblings, 0 replies; 30+ messages in thread
From: Jiri Olsa @ 2023-05-05 11:42 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, Andrii Nakryiko,
	Namhyung Kim, Linus Torvalds, Song Liu, Andrii Nakryiko,
	Ingo Molnar, Thomas Gleixner, Clark Williams, Kate Carcia,
	linux-kernel, linux-perf-users, Adrian Hunter, Changbin Du,
	Hao Luo, James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
	Thomas Richter, Arnaldo Carvalho de Melo, bpf

On Fri, May 05, 2023 at 11:39:19AM +0200, Jiri Olsa wrote:
> On Thu, May 04, 2023 at 04:19:47PM -0700, Ian Rogers wrote:
> > On Thu, May 4, 2023 at 4:03 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Thu, May 04, 2023 at 03:03:42PM -0700, Ian Rogers wrote:
> > > > On Thu, May 4, 2023 at 2:48 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > >
> > > > > Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > > Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> > > > > > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > > > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > > > > > instead of using kernel headers?
> > > > > >
> > > > > > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > > > > > programs, it's more a convenience allowing easy access to definitions
> > > > > > > of both UAPI and kernel-internal structures for tracing needs and
> > > > > > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > > > > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > > > > > dependency on up-to-date host kernel and such.
> > > > > >
> > > > > > > If vmlinux.h generation and usage is causing issues, though, given
> > > > > > > that perf's BPF programs don't seem to be using many different kernel
> > > > > > > types, it might be a better option to just use UAPI headers for public
> > > > > > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > > > > > definitions locally in perf's BPF code for the other types necessary.
> > > > > > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > > > > > suffice:
> > > > > >
> > > > > > > struct task_struct {
> > > > > > >     int pid;
> > > > > > >     int tgid;
> > > > > > > } __attribute__((preserve_access_index));
> > > > > >
> > > > > > Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> > > > > > CO-RE notices that task_struct changed from this two integers version
> > > > > > (of course) and does the relocation to where it is in the running kernel
> > > > > > by using /sys/kernel/btf/vmlinux.
> > > > >
> > > > > Doing it for one of the skels, build tested, runtime untested, but not
> > > > > using any vmlinux, BTF to help, not that bad, more verbose, but at least
> > > > > we state what are the fields we actually use, have those attribute
> > > > > documenting that those offsets will be recorded for future use, etc.
> > > > >
> > > > > Namhyung, can you please check that this works?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > - Arnaldo
> > > > >
> > > > > diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > > index 6a438e0102c5a2cb..f376d162549ebd74 100644
> > > > > --- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > > +++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > > @@ -1,11 +1,40 @@
> > > > >  // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > > > >  // Copyright (c) 2021 Facebook
> > > > >  // Copyright (c) 2021 Google
> > > > > -#include "vmlinux.h"
> > > > > +#include <linux/types.h>
> > > > > +#include <linux/bpf.h>
> > > >
> > > > Compared to vmlinux.h here be dragons. It is easy to start dragging in
> > > > all of libc and that may not work due to missing #ifdefs, etc.. Could
> > > > we check in a vmlinux.h like libbpf-tools does?
> > > > https://github.com/iovisor/bcc/tree/master/libbpf-tools#vmlinuxh-generation
> > > > https://github.com/iovisor/bcc/tree/master/libbpf-tools/arm64
> > > >
> > > > This would also remove some of the errors that could be introduced by
> > > > copy+pasting enums, etc. and also highlight issues with things being
> > > > renamed as build time rather than runtime failures.
> > >
> > > we already have to deal with that, right? doing checks on fields in
> > > structs like mm_struct___old
> > 
> > We do, but the way I detected the problems in the first place was by
> > building against older kernels. Now the build will always succeed but
> > fail at runtime.
> > 
> > > > Could this be some shared resource for the different linux tools
> > > > projects using a vmlinux.h? e.g. tools/lib/vmlinuxh with an
> > > > install_headers target that builds a vmlinux.h.
> > >
> > > I tried to do the minimal header and it's not too big,
> > > I pushed it in here:
> > >   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=perf/vmlinux_h
> > >
> > > compile tested so far
> > >
> > > jirka
> > 
> > Cool, could we just call it vmlinux.h rather than perf-defs.h?
> 
> right, it also makes the change smaller
> 
> > 
> > I notice cgroup_subsys_id is in there which is called out in Andrii's
> > CO-RE  guide/blog:
> > https://nakryiko.com/posts/bpf-core-reference-guide/#relocatable-enums
> > perhaps we can do something with names/types to make sure a helper is
> > being used for these enum values.

both bperf_cgroup and off_cpu programs use bpf_core_enum_value, so we should be fine

jirka

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-04 22:01                 ` Arnaldo Carvalho de Melo
@ 2023-05-05 13:18                   ` Arnaldo Carvalho de Melo
  2023-05-06  1:13                     ` Yang Jihong
  2023-05-05 13:20                   ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-05-05 13:18 UTC (permalink / raw)
  To: Yang Jihong, Andrii Nakryiko
  Cc: Linus Torvalds, Song Liu, Andrii Nakryiko, Ingo Molnar,
	Thomas Gleixner, Jiri Olsa, Clark Williams, Kate Carcia,
	linux-kernel, linux-perf-users, Adrian Hunter, Changbin Du,
	Hao Luo, Ian Rogers, James Clark, Kan Liang, Roman Lozko,
	Stephane Eranian, Thomas Richter, Arnaldo Carvalho de Melo, bpf

Em Thu, May 04, 2023 at 07:01:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, May 04, 2023 at 06:48:50PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> > > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > > instead of using kernel headers?
> > >  
> > > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > > programs, it's more a convenience allowing easy access to definitions
> > > > of both UAPI and kernel-internal structures for tracing needs and
> > > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > > dependency on up-to-date host kernel and such.
> > >  
> > > > If vmlinux.h generation and usage is causing issues, though, given
> > > > that perf's BPF programs don't seem to be using many different kernel
> > > > types, it might be a better option to just use UAPI headers for public
> > > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > > definitions locally in perf's BPF code for the other types necessary.
> > > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > > suffice:
> > >  
> > > > struct task_struct {
> > > >     int pid;
> > > >     int tgid;
> > > > } __attribute__((preserve_access_index));
> > > 
> > > Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> > > CO-RE notices that task_struct changed from this two integers version
> > > (of course) and does the relocation to where it is in the running kernel
> > > by using /sys/kernel/btf/vmlinux.
> > 
> > Doing it for one of the skels, build tested, runtime untested, but not
> > using any vmlinux, BTF to help, not that bad, more verbose, but at least
> > we state what are the fields we actually use, have those attribute
> > documenting that those offsets will be recorded for future use, etc.

Yang, can you please check that this works?


From bd6289bc3ffc89aecad3bd8798d76626c8c16d39 Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Fri, 5 May 2023 10:13:09 -0300
Subject: [PATCH 1/1] perf kwork_trace.bpf: Stop using vmlinux.h, grab copies
 of used structs

And mark them with __attribute__((preserve_access_index)) so that
libbpf's CO-RE code can fixup offsets if they differ with the kernel
data structure.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/bpf_skel/kwork_trace.bpf.c | 70 +++++++++++++++++++++-
 1 file changed, 69 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/bpf_skel/kwork_trace.bpf.c b/tools/perf/util/bpf_skel/kwork_trace.bpf.c
index 063c124e099938ed..e38fe54c7667fa74 100644
--- a/tools/perf/util/bpf_skel/kwork_trace.bpf.c
+++ b/tools/perf/util/bpf_skel/kwork_trace.bpf.c
@@ -1,13 +1,81 @@
 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 // Copyright (c) 2022, Huawei
 
-#include "vmlinux.h"
+#include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
 
 #define KWORK_COUNT 100
 #define MAX_KWORKNAME 128
 
+
+// non-UAPI kernel data structures, just the fields used in this tool,
+// preserving the access index so that libbpf can fixup offsets with the ones
+// used in the kernel when loading the BPF bytecode, if they differ from what
+// is used here.
+
+enum {
+	HI_SOFTIRQ = 0,
+	TIMER_SOFTIRQ,
+	NET_TX_SOFTIRQ,
+	NET_RX_SOFTIRQ,
+	BLOCK_SOFTIRQ,
+	IRQ_POLL_SOFTIRQ,
+	TASKLET_SOFTIRQ,
+	SCHED_SOFTIRQ,
+	HRTIMER_SOFTIRQ,
+	RCU_SOFTIRQ,    /* Preferable RCU should always be the last softirq */
+
+	NR_SOFTIRQS
+};
+
+struct trace_entry {
+	short unsigned int type;
+	unsigned char	   flags;
+	unsigned char	   preempt_count;
+	int		   pid;
+} __attribute__((preserve_access_index));
+
+struct trace_event_raw_irq_handler_entry {
+	struct trace_entry ent;
+	int		   irq;
+	__u32		   __data_loc_name;
+	char		   __data[];
+} __attribute__((preserve_access_index));
+
+struct trace_event_raw_irq_handler_exit {
+	struct trace_entry ent;
+	int		   irq;
+	int		   ret;
+	char		   __data[];
+} __attribute__((preserve_access_index));
+
+struct trace_event_raw_softirq {
+	struct trace_entry ent;
+	unsigned int	   vec;
+	char		   __data[];
+} __attribute__((preserve_access_index));
+
+struct trace_event_raw_workqueue_execute_start {
+	struct trace_entry ent;
+	void		   *work;
+	void		   *function;
+	char		   __data[];
+} __attribute__((preserve_access_index));
+
+struct trace_event_raw_workqueue_execute_end {
+	struct trace_entry ent;
+	void		   *work;
+	void		   *function;
+	char		  __data[];
+} __attribute__((preserve_access_index));
+
+struct trace_event_raw_workqueue_activate_work {
+	struct trace_entry ent;
+	void		   *work;
+	char		   __data[];
+} __attribute__((preserve_access_index));
+
 /*
  * This should be in sync with "util/kwork.h"
  */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-04 22:01                 ` Arnaldo Carvalho de Melo
  2023-05-05 13:18                   ` Arnaldo Carvalho de Melo
@ 2023-05-05 13:20                   ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-05-05 13:20 UTC (permalink / raw)
  To: Namhyung Kim, Andrii Nakryiko, Linus Torvalds
  Cc: Song Liu, Andrii Nakryiko, Ingo Molnar, Thomas Gleixner,
	Jiri Olsa, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Adrian Hunter, Changbin Du, Hao Luo, Ian Rogers,
	James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
	Thomas Richter, Arnaldo Carvalho de Melo, bpf

Em Thu, May 04, 2023 at 07:01:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, May 04, 2023 at 06:48:50PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> > > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > > instead of using kernel headers?
> > >  
> > > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > > programs, it's more a convenience allowing easy access to definitions
> > > > of both UAPI and kernel-internal structures for tracing needs and
> > > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > > dependency on up-to-date host kernel and such.
> > >  
> > > > If vmlinux.h generation and usage is causing issues, though, given
> > > > that perf's BPF programs don't seem to be using many different kernel
> > > > types, it might be a better option to just use UAPI headers for public
> > > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > > definitions locally in perf's BPF code for the other types necessary.
> > > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > > suffice:
> > >  
> > > > struct task_struct {
> > > >     int pid;
> > > >     int tgid;
> > > > } __attribute__((preserve_access_index));
> > > 
> > > Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> > > CO-RE notices that task_struct changed from this two integers version
> > > (of course) and does the relocation to where it is in the running kernel
> > > by using /sys/kernel/btf/vmlinux.
> > 
> > Doing it for one of the skels, build tested, runtime untested, but not
> > using any vmlinux, BTF to help, not that bad, more verbose, but at least
> > we state what are the fields we actually use, have those attribute
> > documenting that those offsets will be recorded for future use, etc.
> > 

Namhyung, can you please check that this one for the recent sample works?

From c6972dae6c962d7be5ba006ab90c9955268debc5 Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Fri, 5 May 2023 09:55:18 -0300
Subject: [PATCH 1/2] perf sample_filter.bpf: Stop using vmlinux.h generated by
 bpftool, use CO-RE

Including linux/bpf.h and linux/perf_events.h we get the UAPI structs
and then define a subset  'struct perf_sample_data' with the fields we
use in this tool while using __attribute__((preserve_access_index)) so
that at libbpf load time it can fixup the offsets according to the
'struct perf_data_sample' obtained from the running kernel BTF
(/sys/kernel/btf/vmlinux).

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/bpf_skel/sample_filter.bpf.c | 37 +++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/bpf_skel/sample_filter.bpf.c b/tools/perf/util/bpf_skel/sample_filter.bpf.c
index cffe493af1ed5f31..045532c2366d74ef 100644
--- a/tools/perf/util/bpf_skel/sample_filter.bpf.c
+++ b/tools/perf/util/bpf_skel/sample_filter.bpf.c
@@ -1,12 +1,47 @@
 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 // Copyright (c) 2023 Google
-#include "vmlinux.h"
+#include <linux/bpf.h>
+#include <linux/perf_event.h>
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
 #include <bpf/bpf_core_read.h>
 
 #include "sample-filter.h"
 
+// non-UAPI kernel data structures, just the fields used in this tool,
+// preserving the access index so that libbpf can fixup offsets with the ones
+// used in the kernel when loading the BPF bytecode, if they differ from what
+// is used here.
+
+struct perf_sample_data {
+	__u64			 addr;
+	__u64			 period;
+	union perf_sample_weight weight;
+	__u64			 txn;
+	union perf_mem_data_src  data_src;
+	__u64			 ip;
+	struct {
+		__u32		 pid;
+		__u32		 tid;
+	} tid_entry;
+	__u64			 time;
+	__u64			 id;
+	struct {
+		__u32		 cpu;
+	} cpu_entry;
+	__u64			 phys_addr;
+	__u64			 data_page_size;
+	__u64			 code_page_size;
+} __attribute__((__aligned__(64))) __attribute__((preserve_access_index));
+
+struct bpf_perf_event_data_kern {
+	struct perf_sample_data *  data;
+	struct perf_event *        event;
+
+	/* size: 24, cachelines: 1, members: 3 */
+	/* last cacheline: 24 bytes */
+} __attribute__((preserve_access_index));
+
 /* BPF map that will be filled by user space */
 struct filters {
 	__uint(type, BPF_MAP_TYPE_ARRAY);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-04 23:03                   ` Jiri Olsa
  2023-05-04 23:15                     ` Namhyung Kim
  2023-05-04 23:19                     ` Ian Rogers
@ 2023-05-05 13:33                     ` Arnaldo Carvalho de Melo
  2023-05-05 15:14                       ` Alexei Starovoitov
  2023-05-05 16:56                       ` [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was " Arnaldo Carvalho de Melo
  2 siblings, 2 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-05-05 13:33 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Ian Rogers, Andrii Nakryiko, Namhyung Kim, Linus Torvalds,
	Song Liu, Andrii Nakryiko, Ingo Molnar, Thomas Gleixner,
	Clark Williams, Kate Carcia, linux-kernel, linux-perf-users,
	Adrian Hunter, Changbin Du, Hao Luo, James Clark, Kan Liang,
	Roman Lozko, Stephane Eranian, Thomas Richter,
	Arnaldo Carvalho de Melo, bpf

Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> On Thu, May 04, 2023 at 03:03:42PM -0700, Ian Rogers wrote:
> > On Thu, May 4, 2023 at 2:48 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > >
> > > Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> > > > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > > > instead of using kernel headers?
> > > >
> > > > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > > > programs, it's more a convenience allowing easy access to definitions
> > > > > of both UAPI and kernel-internal structures for tracing needs and
> > > > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > > > dependency on up-to-date host kernel and such.
> > > >
> > > > > If vmlinux.h generation and usage is causing issues, though, given
> > > > > that perf's BPF programs don't seem to be using many different kernel
> > > > > types, it might be a better option to just use UAPI headers for public
> > > > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > > > definitions locally in perf's BPF code for the other types necessary.
> > > > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > > > suffice:
> > > >
> > > > > struct task_struct {
> > > > >     int pid;
> > > > >     int tgid;
> > > > > } __attribute__((preserve_access_index));
> > > >
> > > > Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> > > > CO-RE notices that task_struct changed from this two integers version
> > > > (of course) and does the relocation to where it is in the running kernel
> > > > by using /sys/kernel/btf/vmlinux.
> > >
> > > Doing it for one of the skels, build tested, runtime untested, but not
> > > using any vmlinux, BTF to help, not that bad, more verbose, but at least
> > > we state what are the fields we actually use, have those attribute
> > > documenting that those offsets will be recorded for future use, etc.
> > >
> > > Namhyung, can you please check that this works?
> > >
> > > Thanks,
> > >
> > > - Arnaldo
> > >
> > > diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > index 6a438e0102c5a2cb..f376d162549ebd74 100644
> > > --- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > +++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > @@ -1,11 +1,40 @@
> > >  // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > >  // Copyright (c) 2021 Facebook
> > >  // Copyright (c) 2021 Google
> > > -#include "vmlinux.h"
> > > +#include <linux/types.h>
> > > +#include <linux/bpf.h>
> > 
> > Compared to vmlinux.h here be dragons. It is easy to start dragging in
> > all of libc and that may not work due to missing #ifdefs, etc.. Could
> > we check in a vmlinux.h like libbpf-tools does?
> > https://github.com/iovisor/bcc/tree/master/libbpf-tools#vmlinuxh-generation
> > https://github.com/iovisor/bcc/tree/master/libbpf-tools/arm64
> > 
> > This would also remove some of the errors that could be introduced by
> > copy+pasting enums, etc. and also highlight issues with things being
> > renamed as build time rather than runtime failures.
> 
> we already have to deal with that, right? doing checks on fields in
> structs like mm_struct___old
> 
> > Could this be some shared resource for the different linux tools
> > projects using a vmlinux.h? e.g. tools/lib/vmlinuxh with an
> > install_headers target that builds a vmlinux.h.
> 
> I tried to do the minimal header and it's not too big,
> I pushed it in here:
>   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=perf/vmlinux_h
> 
> compile tested so far

I see it and it makes the change to be minimal, which is good at the
current stage, but I wonder if it wouldn't be better for us to define
just the ones not in UAPI and use the #include <linux/bpf.h>,
<linux/perf_event.h> as I did in the patches I posted here and Namhyung
tested at least one, this way the added vmlinux.h file get even smaller
by not including things like:

[acme@quaco perf-tools]$ egrep -w '(perf_event_sample_format|bpf_perf_event_value|perf_sample_weight|perf_mem_data_src) {' include/uapi/linux/*.h
include/uapi/linux/bpf.h:struct bpf_perf_event_value {
include/uapi/linux/perf_event.h:enum perf_event_sample_format {
include/uapi/linux/perf_event.h:union perf_mem_data_src {
include/uapi/linux/perf_event.h:union perf_mem_data_src {
include/uapi/linux/perf_event.h:union perf_sample_weight {
[acme@quaco perf-tools]$

Also why do we need these:

+struct mm_struct {
+} __attribute__((preserve_access_index));
+
+struct raw_spinlock {
+} __attribute__((preserve_access_index));
+
+typedef struct raw_spinlock raw_spinlock_t;
+
+struct spinlock {
+} __attribute__((preserve_access_index));
+
+typedef struct spinlock spinlock_t;
+
+struct sighand_struct {
+	spinlock_t siglock;
+} __attribute__((preserve_access_index));

We don't use them, they're just pointers you kept on:

+struct task_struct {
+	struct css_set *cgroups;
+	pid_t pid;
+	pid_t tgid;
+	char comm[16];
+	struct mm_struct *mm;
+	struct sighand_struct *sighand;
+	unsigned int flags;
+} __attribute__((preserve_access_index));

That with the preserve_access_index isn't needed, we need just the
fields that we access in the tools, right?

- Arnaldo


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 13:33                     ` Arnaldo Carvalho de Melo
@ 2023-05-05 15:14                       ` Alexei Starovoitov
  2023-05-05 16:56                       ` [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was " Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2023-05-05 15:14 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Quentin Monnet
  Cc: Jiri Olsa, Ian Rogers, Andrii Nakryiko, Namhyung Kim,
	Linus Torvalds, Song Liu, Andrii Nakryiko, Ingo Molnar,
	Thomas Gleixner, Clark Williams, Kate Carcia, LKML,
	linux-perf-use., Adrian Hunter, Changbin Du, Hao Luo, James Clark,
	Kan Liang, Roman Lozko, Stephane Eranian, Thomas Richter,
	Arnaldo Carvalho de Melo, bpf

On Fri, May 5, 2023 at 6:33 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> > On Thu, May 04, 2023 at 03:03:42PM -0700, Ian Rogers wrote:
> > > On Thu, May 4, 2023 at 2:48 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > >
> > > > Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
> > > > > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > > > > instead of using kernel headers?
> > > > >
> > > > > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > > > > programs, it's more a convenience allowing easy access to definitions
> > > > > > of both UAPI and kernel-internal structures for tracing needs and
> > > > > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > > > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > > > > dependency on up-to-date host kernel and such.
> > > > >
> > > > > > If vmlinux.h generation and usage is causing issues, though, given
> > > > > > that perf's BPF programs don't seem to be using many different kernel
> > > > > > types, it might be a better option to just use UAPI headers for public
> > > > > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > > > > definitions locally in perf's BPF code for the other types necessary.
> > > > > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > > > > suffice:
> > > > >
> > > > > > struct task_struct {
> > > > > >     int pid;
> > > > > >     int tgid;
> > > > > > } __attribute__((preserve_access_index));
> > > > >
> > > > > Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> > > > > CO-RE notices that task_struct changed from this two integers version
> > > > > (of course) and does the relocation to where it is in the running kernel
> > > > > by using /sys/kernel/btf/vmlinux.
> > > >
> > > > Doing it for one of the skels, build tested, runtime untested, but not
> > > > using any vmlinux, BTF to help, not that bad, more verbose, but at least
> > > > we state what are the fields we actually use, have those attribute
> > > > documenting that those offsets will be recorded for future use, etc.
> > > >
> > > > Namhyung, can you please check that this works?
> > > >
> > > > Thanks,
> > > >
> > > > - Arnaldo
> > > >
> > > > diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > index 6a438e0102c5a2cb..f376d162549ebd74 100644
> > > > --- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > +++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > @@ -1,11 +1,40 @@
> > > >  // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > > >  // Copyright (c) 2021 Facebook
> > > >  // Copyright (c) 2021 Google
> > > > -#include "vmlinux.h"
> > > > +#include <linux/types.h>
> > > > +#include <linux/bpf.h>
> > >
> > > Compared to vmlinux.h here be dragons. It is easy to start dragging in
> > > all of libc and that may not work due to missing #ifdefs, etc.. Could
> > > we check in a vmlinux.h like libbpf-tools does?
> > > https://github.com/iovisor/bcc/tree/master/libbpf-tools#vmlinuxh-generation
> > > https://github.com/iovisor/bcc/tree/master/libbpf-tools/arm64
> > >
> > > This would also remove some of the errors that could be introduced by
> > > copy+pasting enums, etc. and also highlight issues with things being
> > > renamed as build time rather than runtime failures.
> >
> > we already have to deal with that, right? doing checks on fields in
> > structs like mm_struct___old
> >
> > > Could this be some shared resource for the different linux tools
> > > projects using a vmlinux.h? e.g. tools/lib/vmlinuxh with an
> > > install_headers target that builds a vmlinux.h.
> >
> > I tried to do the minimal header and it's not too big,
> > I pushed it in here:
> >   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=perf/vmlinux_h
> >
> > compile tested so far
>
> I see it and it makes the change to be minimal, which is good at the
> current stage, but I wonder if it wouldn't be better for us to define
> just the ones not in UAPI and use the #include <linux/bpf.h>,
> <linux/perf_event.h> as I did in the patches I posted here and Namhyung
> tested at least one, this way the added vmlinux.h file get even smaller
> by not including things like:
>
> [acme@quaco perf-tools]$ egrep -w '(perf_event_sample_format|bpf_perf_event_value|perf_sample_weight|perf_mem_data_src) {' include/uapi/linux/*.h
> include/uapi/linux/bpf.h:struct bpf_perf_event_value {
> include/uapi/linux/perf_event.h:enum perf_event_sample_format {
> include/uapi/linux/perf_event.h:union perf_mem_data_src {
> include/uapi/linux/perf_event.h:union perf_mem_data_src {
> include/uapi/linux/perf_event.h:union perf_sample_weight {
> [acme@quaco perf-tools]$
>
> Also why do we need these:
>
> +struct mm_struct {
> +} __attribute__((preserve_access_index));
> +
> +struct raw_spinlock {
> +} __attribute__((preserve_access_index));
> +
> +typedef struct raw_spinlock raw_spinlock_t;
> +
> +struct spinlock {
> +} __attribute__((preserve_access_index));
> +
> +typedef struct spinlock spinlock_t;
> +
> +struct sighand_struct {
> +       spinlock_t siglock;
> +} __attribute__((preserve_access_index));
>
> We don't use them, they're just pointers you kept on:
>
> +struct task_struct {
> +       struct css_set *cgroups;
> +       pid_t pid;
> +       pid_t tgid;
> +       char comm[16];
> +       struct mm_struct *mm;
> +       struct sighand_struct *sighand;
> +       unsigned int flags;
> +} __attribute__((preserve_access_index));
>
> That with the preserve_access_index isn't needed, we need just the
> fields that we access in the tools, right?

Aside from that you probably want to take a look at BTFgen.
Old doc:
https://github.com/aquasecurity/btfhub/blob/main/docs/btfgen-internals.md
which landed as
"bpftool gen min_core_btf"
man bpftool-gen

It addresses the use case for kernels _without_ CONFIG_DEBUG_INFO_BTF.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 13:33                     ` Arnaldo Carvalho de Melo
  2023-05-05 15:14                       ` Alexei Starovoitov
@ 2023-05-05 16:56                       ` Arnaldo Carvalho de Melo
  2023-05-05 17:04                         ` Ian Rogers
  2023-05-08 21:53                         ` Ian Rogers
  1 sibling, 2 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-05-05 16:56 UTC (permalink / raw)
  To: Jiri Olsa, Linus Torvalds
  Cc: Ian Rogers, Andrii Nakryiko, Namhyung Kim, Song Liu,
	Andrii Nakryiko, Ingo Molnar, Thomas Gleixner, Clark Williams,
	Kate Carcia, linux-kernel, linux-perf-users, Adrian Hunter,
	Changbin Du, Hao Luo, James Clark, Kan Liang, Roman Lozko,
	Stephane Eranian, Thomas Richter, Arnaldo Carvalho de Melo, bpf,
	Alexei Starovoitov, Yang Jihong, Mark Rutland, Paul Clarke

Em Fri, May 05, 2023 at 10:33:15AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> That with the preserve_access_index isn't needed, we need just the
> fields that we access in the tools, right?

I'm now doing build test this in many distro containers, without the two
reverts, i.e. BPF skels continue as opt-out as in my pull request, to
test build and also for the functionality tests on the tools using such
bpf skels, see below, no touching of vmlinux nor BTF data during the
build.

- Arnaldo

From 882adaee50bc27f85374aeb2fbaa5b76bef60d05 Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Thu, 4 May 2023 19:03:51 -0300
Subject: [PATCH 1/1] perf bpf skels: Stop using vmlinux.h generated from BTF,
 use subset of used structs + CO-RE

Linus reported a build break due to using a vmlinux without a BTF elf
section to generate the vmlinux.h header with bpftool for use in the BPF
tools in tools/perf/util/bpf_skel/*.bpf.c.

Instead add a vmlinux.h file with the structs needed with the fields the
tools need, marking the structs with __attribute__((preserve_access_index)),
so that libbpf's CO-RE code can fixup the struct field offsets.

In some cases the vmlinux.h file that was being generated by bpftool
from the kernel BTF information was not needed at all, just including
linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI
types were not being used.

To keep te patch small, include those UAPI headers from the trimmed down
vmlinux.h file, that then provides the tools with just the structs and
the subset of its fields needed for them.

Testing it:

  # perf lock contention -b find / > /dev/null
  ^C contended   total wait     max wait     avg wait         type   caller

           7     53.59 us     10.86 us      7.66 us     rwlock:R   start_this_handle+0xa0
           2     30.35 us     21.99 us     15.17 us      rwsem:R   iterate_dir+0x52
           1      9.04 us      9.04 us      9.04 us     rwlock:W   start_this_handle+0x291
           1      8.73 us      8.73 us      8.73 us     spinlock   raw_spin_rq_lock_nested+0x1e
  #
  # perf lock contention -abl find / > /dev/null
  ^C contended   total wait     max wait     avg wait            address   symbol

           1    262.96 ms    262.96 ms    262.96 ms   ffff8e67502d0170    (mutex)
          12    244.24 us     39.91 us     20.35 us   ffff8e6af56f8070   mmap_lock (rwsem)
           7     30.28 us      6.85 us      4.33 us   ffff8e6c865f1d40   rq_lock (spinlock)
           3      7.42 us      4.03 us      2.47 us   ffff8e6c864b1d40   rq_lock (spinlock)
           2      3.72 us      2.19 us      1.86 us   ffff8e6c86571d40   rq_lock (spinlock)
           1      2.42 us      2.42 us      2.42 us   ffff8e6c86471d40   rq_lock (spinlock)
           4      2.11 us       559 ns       527 ns   ffffffff9a146c80   rcu_state (spinlock)
           3      1.45 us       818 ns       482 ns   ffff8e674ae8384c    (rwlock)
           1       870 ns       870 ns       870 ns   ffff8e68456ee060    (rwlock)
           1       663 ns       663 ns       663 ns   ffff8e6c864f1d40   rq_lock (spinlock)
           1       573 ns       573 ns       573 ns   ffff8e6c86531d40   rq_lock (spinlock)
           1       472 ns       472 ns       472 ns   ffff8e6c86431740    (spinlock)
           1       397 ns       397 ns       397 ns   ffff8e67413a4f04    (spinlock)
  #
  # perf test offcpu
  95: perf record offcpu profiling tests                              : Ok
  #
  # perf kwork latency --use-bpf
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)flush_memcg_stats_dwork     | 0000 |   1056.212 ms |         2 |   2112.345 ms |     550113.229573 s |     550115.341919 s |
    (w)toggle_allocation_gate      | 0000 |     10.144 ms |        62 |    416.389 ms |     550113.453518 s |     550113.869907 s |
    (w)0xffff8e6748e28080          | 0002 |      0.623 ms |         1 |      0.623 ms |     550110.989841 s |     550110.990464 s |
    (w)vmstat_shepherd             | 0000 |      0.586 ms |        10 |      2.828 ms |     550111.971536 s |     550111.974364 s |
    (w)vmstat_update               | 0007 |      0.363 ms |         5 |      1.634 ms |     550113.222520 s |     550113.224154 s |
    (w)vmstat_update               | 0000 |      0.324 ms |        10 |      2.827 ms |     550111.971526 s |     550111.974354 s |
    (w)0xffff8e674c5f4a58          | 0002 |      0.102 ms |         5 |      0.134 ms |     550110.989839 s |     550110.989972 s |
    (w)psi_avgs_work               | 0001 |      0.086 ms |         3 |      0.107 ms |     550114.957852 s |     550114.957959 s |
    (w)psi_avgs_work               | 0000 |      0.079 ms |         5 |      0.100 ms |     550118.605668 s |     550118.605768 s |
    (w)kfree_rcu_monitor           | 0006 |      0.079 ms |         1 |      0.079 ms |     550110.925821 s |     550110.925900 s |
    (w)psi_avgs_work               | 0004 |      0.079 ms |         1 |      0.079 ms |     550109.581835 s |     550109.581914 s |
    (w)psi_avgs_work               | 0001 |      0.078 ms |         1 |      0.078 ms |     550109.197809 s |     550109.197887 s |
    (w)psi_avgs_work               | 0002 |      0.077 ms |         5 |      0.086 ms |     550110.669819 s |     550110.669905 s |
  <SNIP>
  # strace -e bpf -o perf-stat-bpf-counters.output perf stat -e cycles --bpf-counters sleep 1

   Performance counter stats for 'sleep 1':

           6,197,983      cycles

         1.003922848 seconds time elapsed

         0.000000000 seconds user
         0.002032000 seconds sys

  # head -7 perf-stat-bpf-counters.output
  bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 16) = 3
  bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=3, info_len=88, info=0x7ffcead64990}}, 16) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x24129e0, value=0x7ffcead65a48, flags=BPF_ANY}, 32) = 0
  bpf(BPF_LINK_GET_FD_BY_ID, {link_id=1252}, 12) = -1 ENOENT (No such file or directory)
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffcead65780, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 116) = 4
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffcead65920, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0, fd_array=NULL}, 128) = 4
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\20\0\0\0\20\0\0\0\5\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=45, btf_log_size=0, btf_log_level=0}, 28) = 4
  #

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Co-developed-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Makefile.perf            |  20 +---
 tools/perf/util/bpf_skel/.gitignore |   1 -
 tools/perf/util/bpf_skel/vmlinux.h  | 173 ++++++++++++++++++++++++++++
 3 files changed, 174 insertions(+), 20 deletions(-)
 create mode 100644 tools/perf/util/bpf_skel/vmlinux.h

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 48aba186ceb50792..61c33d100b2bcc90 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -1063,25 +1063,7 @@ $(BPFTOOL): | $(SKEL_TMP_OUT)
 	$(Q)CFLAGS= $(MAKE) -C ../bpf/bpftool \
 		OUTPUT=$(SKEL_TMP_OUT)/ bootstrap
 
-VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux)				\
-		     $(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux)	\
-		     ../../vmlinux					\
-		     /sys/kernel/btf/vmlinux				\
-		     /boot/vmlinux-$(shell uname -r)
-VMLINUX_BTF ?= $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS))))
-
-$(SKEL_OUT)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL)
-ifeq ($(VMLINUX_H),)
-	$(QUIET_GEN)$(BPFTOOL) btf dump file $< format c > $@ || \
-	(echo "Failure to generate vmlinux.h needed for the recommended BPF skeleton support." && \
-	echo "To disable this use the build option NO_BPF_SKEL=1." && \
-	echo "Alternatively point at a pre-generated vmlinux.h with VMLINUX_H=<path>." && \
-	false)
-else
-	$(Q)cp "$(VMLINUX_H)" $@
-endif
-
-$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) $(SKEL_OUT)/vmlinux.h | $(SKEL_TMP_OUT)
+$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) | $(SKEL_TMP_OUT)
 	$(QUIET_CLANG)$(CLANG) -g -O2 -target bpf -Wall -Werror $(BPF_INCLUDE) \
 	  -c $(filter util/bpf_skel/%.bpf.c,$^) -o $@ && $(LLVM_STRIP) -g $@
 
diff --git a/tools/perf/util/bpf_skel/.gitignore b/tools/perf/util/bpf_skel/.gitignore
index cd01455e1b53c3d9..7a1c832825de8445 100644
--- a/tools/perf/util/bpf_skel/.gitignore
+++ b/tools/perf/util/bpf_skel/.gitignore
@@ -1,4 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0-only
 .tmp
 *.skel.h
-vmlinux.h
diff --git a/tools/perf/util/bpf_skel/vmlinux.h b/tools/perf/util/bpf_skel/vmlinux.h
new file mode 100644
index 0000000000000000..449b1ea91fc48143
--- /dev/null
+++ b/tools/perf/util/bpf_skel/vmlinux.h
@@ -0,0 +1,173 @@
+#ifndef __VMLINUX_H
+#define __VMLINUX_H
+
+#include <linux/bpf.h>
+#include <linux/types.h>
+#include <linux/perf_event.h>
+#include <stdbool.h>
+
+// non-UAPI kernel data structures, used in the .bpf.c BPF tool component.
+
+// Just the fields used in these tools preserving the access index so that
+// libbpf can fixup offsets with the ones used in the kernel when loading the
+// BPF bytecode, if they differ from what is used here.
+
+typedef __u8 u8;
+typedef __u32 u32;
+typedef __u64 u64;
+typedef __s64 s64;
+
+typedef int pid_t;
+
+enum cgroup_subsys_id {
+	perf_event_cgrp_id  = 8,
+};
+
+enum {
+	HI_SOFTIRQ = 0,
+	TIMER_SOFTIRQ,
+	NET_TX_SOFTIRQ,
+	NET_RX_SOFTIRQ,
+	BLOCK_SOFTIRQ,
+	IRQ_POLL_SOFTIRQ,
+	TASKLET_SOFTIRQ,
+	SCHED_SOFTIRQ,
+	HRTIMER_SOFTIRQ,
+	RCU_SOFTIRQ,    /* Preferable RCU should always be the last softirq */
+
+	NR_SOFTIRQS
+};
+
+typedef struct {
+	s64	counter;
+} __attribute__((preserve_access_index)) atomic64_t;
+
+typedef atomic64_t atomic_long_t;
+
+struct raw_spinlock {
+	int rawlock;
+} __attribute__((preserve_access_index));
+
+typedef struct raw_spinlock raw_spinlock_t;
+
+typedef struct {
+	struct raw_spinlock rlock;
+} __attribute__((preserve_access_index)) spinlock_t;
+
+struct sighand_struct {
+	spinlock_t siglock;
+} __attribute__((preserve_access_index));
+
+struct rw_semaphore {
+	atomic_long_t owner;
+} __attribute__((preserve_access_index));
+
+struct mutex {
+	atomic_long_t owner;
+} __attribute__((preserve_access_index));
+
+struct kernfs_node {
+	u64 id;
+} __attribute__((preserve_access_index));
+
+struct cgroup {
+	struct kernfs_node *kn;
+	int                level;
+}  __attribute__((preserve_access_index));
+
+struct cgroup_subsys_state {
+	struct cgroup *cgroup;
+} __attribute__((preserve_access_index));
+
+struct css_set {
+	struct cgroup_subsys_state *subsys[13];
+	struct cgroup *dfl_cgrp;
+} __attribute__((preserve_access_index));
+
+struct mm_struct {
+	struct rw_semaphore mmap_lock;
+} __attribute__((preserve_access_index));
+
+struct task_struct {
+	unsigned int	      flags;
+	struct mm_struct      *mm;
+	pid_t		      pid;
+	pid_t		      tgid;
+	char		      comm[16];
+	struct sighand_struct *sighand;
+	struct css_set	      *cgroups;
+} __attribute__((preserve_access_index));
+
+struct trace_entry {
+	short unsigned int type;
+	unsigned char	   flags;
+	unsigned char	   preempt_count;
+	int		   pid;
+} __attribute__((preserve_access_index));
+
+struct trace_event_raw_irq_handler_entry {
+	struct trace_entry ent;
+	int		   irq;
+	u32		   __data_loc_name;
+	char		   __data[];
+} __attribute__((preserve_access_index));
+
+struct trace_event_raw_irq_handler_exit {
+	struct trace_entry ent;
+	int		   irq;
+	int		   ret;
+	char		   __data[];
+} __attribute__((preserve_access_index));
+
+struct trace_event_raw_softirq {
+	struct trace_entry ent;
+	unsigned int	   vec;
+	char		   __data[];
+} __attribute__((preserve_access_index));
+
+struct trace_event_raw_workqueue_execute_start {
+	struct trace_entry ent;
+	void		   *work;
+	void		   *function;
+	char		   __data[];
+} __attribute__((preserve_access_index));
+
+struct trace_event_raw_workqueue_execute_end {
+	struct trace_entry ent;
+	void		   *work;
+	void		   *function;
+	char		  __data[];
+} __attribute__((preserve_access_index));
+
+struct trace_event_raw_workqueue_activate_work {
+	struct trace_entry ent;
+	void		   *work;
+	char		   __data[];
+} __attribute__((preserve_access_index));
+
+struct perf_sample_data {
+	u64			 addr;
+	u64			 period;
+	union perf_sample_weight weight;
+	u64			 txn;
+	union perf_mem_data_src	 data_src;
+	u64			 ip;
+	struct {
+		u32		 pid;
+		u32		 tid;
+	} tid_entry;
+	u64			 time;
+	u64			 id;
+	struct {
+		u32		 cpu;
+	} cpu_entry;
+	u64			 phys_addr;
+	u64			 data_page_size;
+	u64			 code_page_size;
+} __attribute__((__aligned__(64))) __attribute__((preserve_access_index));
+
+struct bpf_perf_event_data_kern {
+	struct perf_sample_data *data;
+	struct perf_event	*event;
+} __attribute__((preserve_access_index));
+#endif // __VMLINUX_H
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 16:56                       ` [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was " Arnaldo Carvalho de Melo
@ 2023-05-05 17:04                         ` Ian Rogers
  2023-05-05 20:43                           ` Jiri Olsa
  2023-05-08 21:53                         ` Ian Rogers
  1 sibling, 1 reply; 30+ messages in thread
From: Ian Rogers @ 2023-05-05 17:04 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Linus Torvalds, Andrii Nakryiko, Namhyung Kim,
	Song Liu, Andrii Nakryiko, Ingo Molnar, Thomas Gleixner,
	Clark Williams, Kate Carcia, linux-kernel, linux-perf-users,
	Adrian Hunter, Changbin Du, Hao Luo, James Clark, Kan Liang,
	Roman Lozko, Stephane Eranian, Thomas Richter,
	Arnaldo Carvalho de Melo, bpf, Alexei Starovoitov, Yang Jihong,
	Mark Rutland, Paul Clarke

On Fri, May 5, 2023 at 9:56 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> Em Fri, May 05, 2023 at 10:33:15AM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> > That with the preserve_access_index isn't needed, we need just the
> > fields that we access in the tools, right?
>
> I'm now doing build test this in many distro containers, without the two
> reverts, i.e. BPF skels continue as opt-out as in my pull request, to
> test build and also for the functionality tests on the tools using such
> bpf skels, see below, no touching of vmlinux nor BTF data during the
> build.
>
> - Arnaldo
>
> From 882adaee50bc27f85374aeb2fbaa5b76bef60d05 Mon Sep 17 00:00:00 2001
> From: Arnaldo Carvalho de Melo <acme@redhat.com>
> Date: Thu, 4 May 2023 19:03:51 -0300
> Subject: [PATCH 1/1] perf bpf skels: Stop using vmlinux.h generated from BTF,
>  use subset of used structs + CO-RE
>
> Linus reported a build break due to using a vmlinux without a BTF elf
> section to generate the vmlinux.h header with bpftool for use in the BPF
> tools in tools/perf/util/bpf_skel/*.bpf.c.
>
> Instead add a vmlinux.h file with the structs needed with the fields the
> tools need, marking the structs with __attribute__((preserve_access_index)),
> so that libbpf's CO-RE code can fixup the struct field offsets.
>
> In some cases the vmlinux.h file that was being generated by bpftool
> from the kernel BTF information was not needed at all, just including
> linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI
> types were not being used.
>
> To keep te patch small, include those UAPI headers from the trimmed down
> vmlinux.h file, that then provides the tools with just the structs and
> the subset of its fields needed for them.
>
> Testing it:
>
>   # perf lock contention -b find / > /dev/null
>   ^C contended   total wait     max wait     avg wait         type   caller
>
>            7     53.59 us     10.86 us      7.66 us     rwlock:R   start_this_handle+0xa0
>            2     30.35 us     21.99 us     15.17 us      rwsem:R   iterate_dir+0x52
>            1      9.04 us      9.04 us      9.04 us     rwlock:W   start_this_handle+0x291
>            1      8.73 us      8.73 us      8.73 us     spinlock   raw_spin_rq_lock_nested+0x1e
>   #
>   # perf lock contention -abl find / > /dev/null
>   ^C contended   total wait     max wait     avg wait            address   symbol
>
>            1    262.96 ms    262.96 ms    262.96 ms   ffff8e67502d0170    (mutex)
>           12    244.24 us     39.91 us     20.35 us   ffff8e6af56f8070   mmap_lock (rwsem)
>            7     30.28 us      6.85 us      4.33 us   ffff8e6c865f1d40   rq_lock (spinlock)
>            3      7.42 us      4.03 us      2.47 us   ffff8e6c864b1d40   rq_lock (spinlock)
>            2      3.72 us      2.19 us      1.86 us   ffff8e6c86571d40   rq_lock (spinlock)
>            1      2.42 us      2.42 us      2.42 us   ffff8e6c86471d40   rq_lock (spinlock)
>            4      2.11 us       559 ns       527 ns   ffffffff9a146c80   rcu_state (spinlock)
>            3      1.45 us       818 ns       482 ns   ffff8e674ae8384c    (rwlock)
>            1       870 ns       870 ns       870 ns   ffff8e68456ee060    (rwlock)
>            1       663 ns       663 ns       663 ns   ffff8e6c864f1d40   rq_lock (spinlock)
>            1       573 ns       573 ns       573 ns   ffff8e6c86531d40   rq_lock (spinlock)
>            1       472 ns       472 ns       472 ns   ffff8e6c86431740    (spinlock)
>            1       397 ns       397 ns       397 ns   ffff8e67413a4f04    (spinlock)
>   #
>   # perf test offcpu
>   95: perf record offcpu profiling tests                              : Ok
>   #
>   # perf kwork latency --use-bpf
>   Starting trace, Hit <Ctrl+C> to stop and report
>   ^C
>     Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
>    --------------------------------------------------------------------------------------------------------------------------------
>     (w)flush_memcg_stats_dwork     | 0000 |   1056.212 ms |         2 |   2112.345 ms |     550113.229573 s |     550115.341919 s |
>     (w)toggle_allocation_gate      | 0000 |     10.144 ms |        62 |    416.389 ms |     550113.453518 s |     550113.869907 s |
>     (w)0xffff8e6748e28080          | 0002 |      0.623 ms |         1 |      0.623 ms |     550110.989841 s |     550110.990464 s |
>     (w)vmstat_shepherd             | 0000 |      0.586 ms |        10 |      2.828 ms |     550111.971536 s |     550111.974364 s |
>     (w)vmstat_update               | 0007 |      0.363 ms |         5 |      1.634 ms |     550113.222520 s |     550113.224154 s |
>     (w)vmstat_update               | 0000 |      0.324 ms |        10 |      2.827 ms |     550111.971526 s |     550111.974354 s |
>     (w)0xffff8e674c5f4a58          | 0002 |      0.102 ms |         5 |      0.134 ms |     550110.989839 s |     550110.989972 s |
>     (w)psi_avgs_work               | 0001 |      0.086 ms |         3 |      0.107 ms |     550114.957852 s |     550114.957959 s |
>     (w)psi_avgs_work               | 0000 |      0.079 ms |         5 |      0.100 ms |     550118.605668 s |     550118.605768 s |
>     (w)kfree_rcu_monitor           | 0006 |      0.079 ms |         1 |      0.079 ms |     550110.925821 s |     550110.925900 s |
>     (w)psi_avgs_work               | 0004 |      0.079 ms |         1 |      0.079 ms |     550109.581835 s |     550109.581914 s |
>     (w)psi_avgs_work               | 0001 |      0.078 ms |         1 |      0.078 ms |     550109.197809 s |     550109.197887 s |
>     (w)psi_avgs_work               | 0002 |      0.077 ms |         5 |      0.086 ms |     550110.669819 s |     550110.669905 s |
>   <SNIP>
>   # strace -e bpf -o perf-stat-bpf-counters.output perf stat -e cycles --bpf-counters sleep 1
>
>    Performance counter stats for 'sleep 1':
>
>            6,197,983      cycles
>
>          1.003922848 seconds time elapsed
>
>          0.000000000 seconds user
>          0.002032000 seconds sys
>
>   # head -7 perf-stat-bpf-counters.output
>   bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 16) = 3
>   bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=3, info_len=88, info=0x7ffcead64990}}, 16) = 0
>   bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x24129e0, value=0x7ffcead65a48, flags=BPF_ANY}, 32) = 0
>   bpf(BPF_LINK_GET_FD_BY_ID, {link_id=1252}, 12) = -1 ENOENT (No such file or directory)
>   bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffcead65780, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 116) = 4
>   bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffcead65920, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0, fd_array=NULL}, 128) = 4
>   bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\20\0\0\0\20\0\0\0\5\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=45, btf_log_size=0, btf_log_level=0}, 28) = 4
>   #
>
> Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Ian Rogers <irogers@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Co-developed-by: Jiri Olsa <jolsa@kernel.org>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
>  tools/perf/Makefile.perf            |  20 +---
>  tools/perf/util/bpf_skel/.gitignore |   1 -
>  tools/perf/util/bpf_skel/vmlinux.h  | 173 ++++++++++++++++++++++++++++
>  3 files changed, 174 insertions(+), 20 deletions(-)
>  create mode 100644 tools/perf/util/bpf_skel/vmlinux.h
>
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 48aba186ceb50792..61c33d100b2bcc90 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -1063,25 +1063,7 @@ $(BPFTOOL): | $(SKEL_TMP_OUT)
>         $(Q)CFLAGS= $(MAKE) -C ../bpf/bpftool \
>                 OUTPUT=$(SKEL_TMP_OUT)/ bootstrap
>
> -VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux)                           \
> -                    $(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux)    \
> -                    ../../vmlinux                                      \
> -                    /sys/kernel/btf/vmlinux                            \
> -                    /boot/vmlinux-$(shell uname -r)
> -VMLINUX_BTF ?= $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS))))
> -
> -$(SKEL_OUT)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL)
> -ifeq ($(VMLINUX_H),)
> -       $(QUIET_GEN)$(BPFTOOL) btf dump file $< format c > $@ || \
> -       (echo "Failure to generate vmlinux.h needed for the recommended BPF skeleton support." && \
> -       echo "To disable this use the build option NO_BPF_SKEL=1." && \
> -       echo "Alternatively point at a pre-generated vmlinux.h with VMLINUX_H=<path>." && \
> -       false)
> -else
> -       $(Q)cp "$(VMLINUX_H)" $@
> -endif

Perhaps we can define VMLINUX_H in the Makefile.config to be
tools/perf/util/bpf_skel/vmlinux.h, then someone can clear it like:
make -C tools/perf VMLINIX_H=
to trigger generation, etc. Such thoughts may be better kept for 6.5 :-)

Thanks,
Ian

> -$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) $(SKEL_OUT)/vmlinux.h | $(SKEL_TMP_OUT)
> +$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) | $(SKEL_TMP_OUT)
>         $(QUIET_CLANG)$(CLANG) -g -O2 -target bpf -Wall -Werror $(BPF_INCLUDE) \
>           -c $(filter util/bpf_skel/%.bpf.c,$^) -o $@ && $(LLVM_STRIP) -g $@
>
> diff --git a/tools/perf/util/bpf_skel/.gitignore b/tools/perf/util/bpf_skel/.gitignore
> index cd01455e1b53c3d9..7a1c832825de8445 100644
> --- a/tools/perf/util/bpf_skel/.gitignore
> +++ b/tools/perf/util/bpf_skel/.gitignore
> @@ -1,4 +1,3 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  .tmp
>  *.skel.h
> -vmlinux.h
> diff --git a/tools/perf/util/bpf_skel/vmlinux.h b/tools/perf/util/bpf_skel/vmlinux.h
> new file mode 100644
> index 0000000000000000..449b1ea91fc48143
> --- /dev/null
> +++ b/tools/perf/util/bpf_skel/vmlinux.h
> @@ -0,0 +1,173 @@
> +#ifndef __VMLINUX_H
> +#define __VMLINUX_H
> +
> +#include <linux/bpf.h>
> +#include <linux/types.h>
> +#include <linux/perf_event.h>
> +#include <stdbool.h>
> +
> +// non-UAPI kernel data structures, used in the .bpf.c BPF tool component.
> +
> +// Just the fields used in these tools preserving the access index so that
> +// libbpf can fixup offsets with the ones used in the kernel when loading the
> +// BPF bytecode, if they differ from what is used here.
> +
> +typedef __u8 u8;
> +typedef __u32 u32;
> +typedef __u64 u64;
> +typedef __s64 s64;
> +
> +typedef int pid_t;
> +
> +enum cgroup_subsys_id {
> +       perf_event_cgrp_id  = 8,
> +};
> +
> +enum {
> +       HI_SOFTIRQ = 0,
> +       TIMER_SOFTIRQ,
> +       NET_TX_SOFTIRQ,
> +       NET_RX_SOFTIRQ,
> +       BLOCK_SOFTIRQ,
> +       IRQ_POLL_SOFTIRQ,
> +       TASKLET_SOFTIRQ,
> +       SCHED_SOFTIRQ,
> +       HRTIMER_SOFTIRQ,
> +       RCU_SOFTIRQ,    /* Preferable RCU should always be the last softirq */
> +
> +       NR_SOFTIRQS
> +};
> +
> +typedef struct {
> +       s64     counter;
> +} __attribute__((preserve_access_index)) atomic64_t;
> +
> +typedef atomic64_t atomic_long_t;
> +
> +struct raw_spinlock {
> +       int rawlock;
> +} __attribute__((preserve_access_index));
> +
> +typedef struct raw_spinlock raw_spinlock_t;
> +
> +typedef struct {
> +       struct raw_spinlock rlock;
> +} __attribute__((preserve_access_index)) spinlock_t;
> +
> +struct sighand_struct {
> +       spinlock_t siglock;
> +} __attribute__((preserve_access_index));
> +
> +struct rw_semaphore {
> +       atomic_long_t owner;
> +} __attribute__((preserve_access_index));
> +
> +struct mutex {
> +       atomic_long_t owner;
> +} __attribute__((preserve_access_index));
> +
> +struct kernfs_node {
> +       u64 id;
> +} __attribute__((preserve_access_index));
> +
> +struct cgroup {
> +       struct kernfs_node *kn;
> +       int                level;
> +}  __attribute__((preserve_access_index));
> +
> +struct cgroup_subsys_state {
> +       struct cgroup *cgroup;
> +} __attribute__((preserve_access_index));
> +
> +struct css_set {
> +       struct cgroup_subsys_state *subsys[13];
> +       struct cgroup *dfl_cgrp;
> +} __attribute__((preserve_access_index));
> +
> +struct mm_struct {
> +       struct rw_semaphore mmap_lock;
> +} __attribute__((preserve_access_index));
> +
> +struct task_struct {
> +       unsigned int          flags;
> +       struct mm_struct      *mm;
> +       pid_t                 pid;
> +       pid_t                 tgid;
> +       char                  comm[16];
> +       struct sighand_struct *sighand;
> +       struct css_set        *cgroups;
> +} __attribute__((preserve_access_index));
> +
> +struct trace_entry {
> +       short unsigned int type;
> +       unsigned char      flags;
> +       unsigned char      preempt_count;
> +       int                pid;
> +} __attribute__((preserve_access_index));
> +
> +struct trace_event_raw_irq_handler_entry {
> +       struct trace_entry ent;
> +       int                irq;
> +       u32                __data_loc_name;
> +       char               __data[];
> +} __attribute__((preserve_access_index));
> +
> +struct trace_event_raw_irq_handler_exit {
> +       struct trace_entry ent;
> +       int                irq;
> +       int                ret;
> +       char               __data[];
> +} __attribute__((preserve_access_index));
> +
> +struct trace_event_raw_softirq {
> +       struct trace_entry ent;
> +       unsigned int       vec;
> +       char               __data[];
> +} __attribute__((preserve_access_index));
> +
> +struct trace_event_raw_workqueue_execute_start {
> +       struct trace_entry ent;
> +       void               *work;
> +       void               *function;
> +       char               __data[];
> +} __attribute__((preserve_access_index));
> +
> +struct trace_event_raw_workqueue_execute_end {
> +       struct trace_entry ent;
> +       void               *work;
> +       void               *function;
> +       char              __data[];
> +} __attribute__((preserve_access_index));
> +
> +struct trace_event_raw_workqueue_activate_work {
> +       struct trace_entry ent;
> +       void               *work;
> +       char               __data[];
> +} __attribute__((preserve_access_index));
> +
> +struct perf_sample_data {
> +       u64                      addr;
> +       u64                      period;
> +       union perf_sample_weight weight;
> +       u64                      txn;
> +       union perf_mem_data_src  data_src;
> +       u64                      ip;
> +       struct {
> +               u32              pid;
> +               u32              tid;
> +       } tid_entry;
> +       u64                      time;
> +       u64                      id;
> +       struct {
> +               u32              cpu;
> +       } cpu_entry;
> +       u64                      phys_addr;
> +       u64                      data_page_size;
> +       u64                      code_page_size;
> +} __attribute__((__aligned__(64))) __attribute__((preserve_access_index));
> +
> +struct bpf_perf_event_data_kern {
> +       struct perf_sample_data *data;
> +       struct perf_event       *event;
> +} __attribute__((preserve_access_index));
> +#endif // __VMLINUX_H
> --
> 2.39.2
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 17:04                         ` Ian Rogers
@ 2023-05-05 20:43                           ` Jiri Olsa
  2023-05-05 20:46                             ` Ian Rogers
  2023-05-05 21:33                             ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 30+ messages in thread
From: Jiri Olsa @ 2023-05-05 20:43 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Linus Torvalds,
	Andrii Nakryiko, Namhyung Kim, Song Liu, Andrii Nakryiko,
	Ingo Molnar, Thomas Gleixner, Clark Williams, Kate Carcia,
	linux-kernel, linux-perf-users, Adrian Hunter, Changbin Du,
	Hao Luo, James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
	Thomas Richter, Arnaldo Carvalho de Melo, bpf, Alexei Starovoitov,
	Yang Jihong, Mark Rutland, Paul Clarke

On Fri, May 05, 2023 at 10:04:47AM -0700, Ian Rogers wrote:
> On Fri, May 5, 2023 at 9:56 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >
> > Em Fri, May 05, 2023 at 10:33:15AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> > > That with the preserve_access_index isn't needed, we need just the
> > > fields that we access in the tools, right?
> >
> > I'm now doing build test this in many distro containers, without the two
> > reverts, i.e. BPF skels continue as opt-out as in my pull request, to
> > test build and also for the functionality tests on the tools using such
> > bpf skels, see below, no touching of vmlinux nor BTF data during the
> > build.
> >
> > - Arnaldo
> >
> > From 882adaee50bc27f85374aeb2fbaa5b76bef60d05 Mon Sep 17 00:00:00 2001
> > From: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Date: Thu, 4 May 2023 19:03:51 -0300
> > Subject: [PATCH 1/1] perf bpf skels: Stop using vmlinux.h generated from BTF,
> >  use subset of used structs + CO-RE
> >
> > Linus reported a build break due to using a vmlinux without a BTF elf
> > section to generate the vmlinux.h header with bpftool for use in the BPF
> > tools in tools/perf/util/bpf_skel/*.bpf.c.
> >
> > Instead add a vmlinux.h file with the structs needed with the fields the
> > tools need, marking the structs with __attribute__((preserve_access_index)),
> > so that libbpf's CO-RE code can fixup the struct field offsets.
> >
> > In some cases the vmlinux.h file that was being generated by bpftool
> > from the kernel BTF information was not needed at all, just including
> > linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI
> > types were not being used.
> >
> > To keep te patch small, include those UAPI headers from the trimmed down
> > vmlinux.h file, that then provides the tools with just the structs and
> > the subset of its fields needed for them.
> >
> > Testing it:
> >
> >   # perf lock contention -b find / > /dev/null

I tested perf lock con -abv -L rcu_state sleep 1
and needed fix below

jirka


---
From b12aea55f1171dc09cde2957f9019c84bda7adbb Mon Sep 17 00:00:00 2001
From: Jiri Olsa <jolsa@kernel.org>
Date: Fri, 5 May 2023 13:28:46 +0200
Subject: [PATCH] perf tools: Fix lock_contention bpf program

We need to define empty 'struct rq' so the runqueues gets
resolved properly:

  # ./perf lock con -b
  libbpf: extern (var ksym) 'runqueues': incompatible types, expected [99] fwd rq, but kernel has [19783] struct rq
  libbpf: failed to load object 'lock_contention_bpf'
  libbpf: failed to load BPF skeleton 'lock_contention_bpf': -22
  Failed to load lock-contention BPF skeleton

Also rq__old/rq__new need additional '_' so the suffix is ignored
properly.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/bpf_skel/lock_contention.bpf.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/bpf_skel/lock_contention.bpf.c b/tools/perf/util/bpf_skel/lock_contention.bpf.c
index 8911e2a077d8..c2bf24c68c14 100644
--- a/tools/perf/util/bpf_skel/lock_contention.bpf.c
+++ b/tools/perf/util/bpf_skel/lock_contention.bpf.c
@@ -416,13 +416,15 @@ int contention_end(u64 *ctx)
 	return 0;
 }
 
+struct rq {};
+
 extern struct rq runqueues __ksym;
 
-struct rq__old {
+struct rq___old {
 	raw_spinlock_t lock;
 } __attribute__((preserve_access_index));
 
-struct rq__new {
+struct rq___new {
 	raw_spinlock_t __lock;
 } __attribute__((preserve_access_index));
 
@@ -434,8 +436,8 @@ int BPF_PROG(collect_lock_syms)
 
 	for (int i = 0; i < MAX_CPUS; i++) {
 		struct rq *rq = bpf_per_cpu_ptr(&runqueues, i);
-		struct rq__new *rq_new = (void *)rq;
-		struct rq__old *rq_old = (void *)rq;
+		struct rq___new *rq_new = (void *)rq;
+		struct rq___old *rq_old = (void *)rq;
 
 		if (rq == NULL)
 			break;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 20:43                           ` Jiri Olsa
@ 2023-05-05 20:46                             ` Ian Rogers
  2023-05-05 20:48                               ` Namhyung Kim
                                                 ` (2 more replies)
  2023-05-05 21:33                             ` Arnaldo Carvalho de Melo
  1 sibling, 3 replies; 30+ messages in thread
From: Ian Rogers @ 2023-05-05 20:46 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Linus Torvalds, Andrii Nakryiko,
	Namhyung Kim, Song Liu, Andrii Nakryiko, Ingo Molnar,
	Thomas Gleixner, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Adrian Hunter, Changbin Du, Hao Luo,
	James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
	Thomas Richter, Arnaldo Carvalho de Melo, bpf, Alexei Starovoitov,
	Yang Jihong, Mark Rutland, Paul Clarke

On Fri, May 5, 2023 at 1:43 PM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Fri, May 05, 2023 at 10:04:47AM -0700, Ian Rogers wrote:
> > On Fri, May 5, 2023 at 9:56 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > >
> > > Em Fri, May 05, 2023 at 10:33:15AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> > > > That with the preserve_access_index isn't needed, we need just the
> > > > fields that we access in the tools, right?
> > >
> > > I'm now doing build test this in many distro containers, without the two
> > > reverts, i.e. BPF skels continue as opt-out as in my pull request, to
> > > test build and also for the functionality tests on the tools using such
> > > bpf skels, see below, no touching of vmlinux nor BTF data during the
> > > build.
> > >
> > > - Arnaldo
> > >
> > > From 882adaee50bc27f85374aeb2fbaa5b76bef60d05 Mon Sep 17 00:00:00 2001
> > > From: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > Date: Thu, 4 May 2023 19:03:51 -0300
> > > Subject: [PATCH 1/1] perf bpf skels: Stop using vmlinux.h generated from BTF,
> > >  use subset of used structs + CO-RE
> > >
> > > Linus reported a build break due to using a vmlinux without a BTF elf
> > > section to generate the vmlinux.h header with bpftool for use in the BPF
> > > tools in tools/perf/util/bpf_skel/*.bpf.c.
> > >
> > > Instead add a vmlinux.h file with the structs needed with the fields the
> > > tools need, marking the structs with __attribute__((preserve_access_index)),
> > > so that libbpf's CO-RE code can fixup the struct field offsets.
> > >
> > > In some cases the vmlinux.h file that was being generated by bpftool
> > > from the kernel BTF information was not needed at all, just including
> > > linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI
> > > types were not being used.
> > >
> > > To keep te patch small, include those UAPI headers from the trimmed down
> > > vmlinux.h file, that then provides the tools with just the structs and
> > > the subset of its fields needed for them.
> > >
> > > Testing it:
> > >
> > >   # perf lock contention -b find / > /dev/null
>
> I tested perf lock con -abv -L rcu_state sleep 1
> and needed fix below
>
> jirka

I thought this was fixed by:
https://lore.kernel.org/lkml/20230427234833.1576130-1-namhyung@kernel.org/
but I think that is just in perf-tools-next.

Thanks,
Ian

> ---
> From b12aea55f1171dc09cde2957f9019c84bda7adbb Mon Sep 17 00:00:00 2001
> From: Jiri Olsa <jolsa@kernel.org>
> Date: Fri, 5 May 2023 13:28:46 +0200
> Subject: [PATCH] perf tools: Fix lock_contention bpf program
>
> We need to define empty 'struct rq' so the runqueues gets
> resolved properly:
>
>   # ./perf lock con -b
>   libbpf: extern (var ksym) 'runqueues': incompatible types, expected [99] fwd rq, but kernel has [19783] struct rq
>   libbpf: failed to load object 'lock_contention_bpf'
>   libbpf: failed to load BPF skeleton 'lock_contention_bpf': -22
>   Failed to load lock-contention BPF skeleton
>
> Also rq__old/rq__new need additional '_' so the suffix is ignored
> properly.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/perf/util/bpf_skel/lock_contention.bpf.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/util/bpf_skel/lock_contention.bpf.c b/tools/perf/util/bpf_skel/lock_contention.bpf.c
> index 8911e2a077d8..c2bf24c68c14 100644
> --- a/tools/perf/util/bpf_skel/lock_contention.bpf.c
> +++ b/tools/perf/util/bpf_skel/lock_contention.bpf.c
> @@ -416,13 +416,15 @@ int contention_end(u64 *ctx)
>         return 0;
>  }
>
> +struct rq {};
> +
>  extern struct rq runqueues __ksym;
>
> -struct rq__old {
> +struct rq___old {
>         raw_spinlock_t lock;
>  } __attribute__((preserve_access_index));
>
> -struct rq__new {
> +struct rq___new {
>         raw_spinlock_t __lock;
>  } __attribute__((preserve_access_index));
>
> @@ -434,8 +436,8 @@ int BPF_PROG(collect_lock_syms)
>
>         for (int i = 0; i < MAX_CPUS; i++) {
>                 struct rq *rq = bpf_per_cpu_ptr(&runqueues, i);
> -               struct rq__new *rq_new = (void *)rq;
> -               struct rq__old *rq_old = (void *)rq;
> +               struct rq___new *rq_new = (void *)rq;
> +               struct rq___old *rq_old = (void *)rq;
>
>                 if (rq == NULL)
>                         break;
> --
> 2.40.1
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 20:46                             ` Ian Rogers
@ 2023-05-05 20:48                               ` Namhyung Kim
  2023-05-10 18:56                                 ` Arnaldo Carvalho de Melo
  2023-05-05 20:49                               ` Arnaldo Carvalho de Melo
  2023-05-05 21:15                               ` Jiri Olsa
  2 siblings, 1 reply; 30+ messages in thread
From: Namhyung Kim @ 2023-05-05 20:48 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, Linus Torvalds,
	Andrii Nakryiko, Song Liu, Andrii Nakryiko, Ingo Molnar,
	Thomas Gleixner, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Adrian Hunter, Changbin Du, Hao Luo,
	James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
	Thomas Richter, Arnaldo Carvalho de Melo, bpf, Alexei Starovoitov,
	Yang Jihong, Mark Rutland, Paul Clarke

On Fri, May 5, 2023 at 1:46 PM Ian Rogers <irogers@google.com> wrote:
>
> On Fri, May 5, 2023 at 1:43 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Fri, May 05, 2023 at 10:04:47AM -0700, Ian Rogers wrote:
> > > On Fri, May 5, 2023 at 9:56 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > >
> > > > Em Fri, May 05, 2023 at 10:33:15AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> > > > > That with the preserve_access_index isn't needed, we need just the
> > > > > fields that we access in the tools, right?
> > > >
> > > > I'm now doing build test this in many distro containers, without the two
> > > > reverts, i.e. BPF skels continue as opt-out as in my pull request, to
> > > > test build and also for the functionality tests on the tools using such
> > > > bpf skels, see below, no touching of vmlinux nor BTF data during the
> > > > build.
> > > >
> > > > - Arnaldo
> > > >
> > > > From 882adaee50bc27f85374aeb2fbaa5b76bef60d05 Mon Sep 17 00:00:00 2001
> > > > From: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > > Date: Thu, 4 May 2023 19:03:51 -0300
> > > > Subject: [PATCH 1/1] perf bpf skels: Stop using vmlinux.h generated from BTF,
> > > >  use subset of used structs + CO-RE
> > > >
> > > > Linus reported a build break due to using a vmlinux without a BTF elf
> > > > section to generate the vmlinux.h header with bpftool for use in the BPF
> > > > tools in tools/perf/util/bpf_skel/*.bpf.c.
> > > >
> > > > Instead add a vmlinux.h file with the structs needed with the fields the
> > > > tools need, marking the structs with __attribute__((preserve_access_index)),
> > > > so that libbpf's CO-RE code can fixup the struct field offsets.
> > > >
> > > > In some cases the vmlinux.h file that was being generated by bpftool
> > > > from the kernel BTF information was not needed at all, just including
> > > > linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI
> > > > types were not being used.
> > > >
> > > > To keep te patch small, include those UAPI headers from the trimmed down
> > > > vmlinux.h file, that then provides the tools with just the structs and
> > > > the subset of its fields needed for them.
> > > >
> > > > Testing it:
> > > >
> > > >   # perf lock contention -b find / > /dev/null
> >
> > I tested perf lock con -abv -L rcu_state sleep 1
> > and needed fix below
> >
> > jirka
>
> I thought this was fixed by:
> https://lore.kernel.org/lkml/20230427234833.1576130-1-namhyung@kernel.org/
> but I think that is just in perf-tools-next.

Right, but we might still need the empty rq definition.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 20:46                             ` Ian Rogers
  2023-05-05 20:48                               ` Namhyung Kim
@ 2023-05-05 20:49                               ` Arnaldo Carvalho de Melo
  2023-05-05 21:15                               ` Jiri Olsa
  2 siblings, 0 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-05-05 20:49 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Jiri Olsa, Linus Torvalds, Andrii Nakryiko, Namhyung Kim,
	Song Liu, Andrii Nakryiko, Ingo Molnar, Thomas Gleixner,
	Clark Williams, Kate Carcia, linux-kernel, linux-perf-users,
	Adrian Hunter, Changbin Du, Hao Luo, James Clark, Kan Liang,
	Roman Lozko, Stephane Eranian, Thomas Richter,
	Arnaldo Carvalho de Melo, bpf, Alexei Starovoitov, Yang Jihong,
	Mark Rutland, Paul Clarke

Em Fri, May 05, 2023 at 01:46:30PM -0700, Ian Rogers escreveu:
> On Fri, May 5, 2023 at 1:43 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Fri, May 05, 2023 at 10:04:47AM -0700, Ian Rogers wrote:
> > > On Fri, May 5, 2023 at 9:56 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > >
> > > > Em Fri, May 05, 2023 at 10:33:15AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> > > > > That with the preserve_access_index isn't needed, we need just the
> > > > > fields that we access in the tools, right?
> > > >
> > > > I'm now doing build test this in many distro containers, without the two
> > > > reverts, i.e. BPF skels continue as opt-out as in my pull request, to
> > > > test build and also for the functionality tests on the tools using such
> > > > bpf skels, see below, no touching of vmlinux nor BTF data during the
> > > > build.
> > > >
> > > > - Arnaldo
> > > >
> > > > From 882adaee50bc27f85374aeb2fbaa5b76bef60d05 Mon Sep 17 00:00:00 2001
> > > > From: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > > Date: Thu, 4 May 2023 19:03:51 -0300
> > > > Subject: [PATCH 1/1] perf bpf skels: Stop using vmlinux.h generated from BTF,
> > > >  use subset of used structs + CO-RE
> > > >
> > > > Linus reported a build break due to using a vmlinux without a BTF elf
> > > > section to generate the vmlinux.h header with bpftool for use in the BPF
> > > > tools in tools/perf/util/bpf_skel/*.bpf.c.
> > > >
> > > > Instead add a vmlinux.h file with the structs needed with the fields the
> > > > tools need, marking the structs with __attribute__((preserve_access_index)),
> > > > so that libbpf's CO-RE code can fixup the struct field offsets.
> > > >
> > > > In some cases the vmlinux.h file that was being generated by bpftool
> > > > from the kernel BTF information was not needed at all, just including
> > > > linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI
> > > > types were not being used.
> > > >
> > > > To keep te patch small, include those UAPI headers from the trimmed down
> > > > vmlinux.h file, that then provides the tools with just the structs and
> > > > the subset of its fields needed for them.
> > > >
> > > > Testing it:
> > > >
> > > >   # perf lock contention -b find / > /dev/null
> >
> > I tested perf lock con -abv -L rcu_state sleep 1
> > and needed fix below
> >
> > jirka
> 
> I thought this was fixed by:
> https://lore.kernel.org/lkml/20230427234833.1576130-1-namhyung@kernel.org/
> but I think that is just in perf-tools-next.

Nope, we have it in perf-tools:

commit e53de7b65a3ca59af268c78df2d773f277f717fd
Author: Namhyung Kim <namhyung@kernel.org>
Date:   Thu Apr 27 16:48:32 2023 -0700

    perf lock contention: Fix struct rq lock access

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 20:46                             ` Ian Rogers
  2023-05-05 20:48                               ` Namhyung Kim
  2023-05-05 20:49                               ` Arnaldo Carvalho de Melo
@ 2023-05-05 21:15                               ` Jiri Olsa
  2023-05-05 21:21                                 ` Andrii Nakryiko
  2 siblings, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2023-05-05 21:15 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, Linus Torvalds,
	Andrii Nakryiko, Namhyung Kim, Song Liu, Andrii Nakryiko,
	Ingo Molnar, Thomas Gleixner, Clark Williams, Kate Carcia,
	linux-kernel, linux-perf-users, Adrian Hunter, Changbin Du,
	Hao Luo, James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
	Thomas Richter, Arnaldo Carvalho de Melo, bpf, Alexei Starovoitov,
	Yang Jihong, Mark Rutland, Paul Clarke

On Fri, May 05, 2023 at 01:46:30PM -0700, Ian Rogers wrote:
> On Fri, May 5, 2023 at 1:43 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Fri, May 05, 2023 at 10:04:47AM -0700, Ian Rogers wrote:
> > > On Fri, May 5, 2023 at 9:56 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > >
> > > > Em Fri, May 05, 2023 at 10:33:15AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> > > > > That with the preserve_access_index isn't needed, we need just the
> > > > > fields that we access in the tools, right?
> > > >
> > > > I'm now doing build test this in many distro containers, without the two
> > > > reverts, i.e. BPF skels continue as opt-out as in my pull request, to
> > > > test build and also for the functionality tests on the tools using such
> > > > bpf skels, see below, no touching of vmlinux nor BTF data during the
> > > > build.
> > > >
> > > > - Arnaldo
> > > >
> > > > From 882adaee50bc27f85374aeb2fbaa5b76bef60d05 Mon Sep 17 00:00:00 2001
> > > > From: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > > Date: Thu, 4 May 2023 19:03:51 -0300
> > > > Subject: [PATCH 1/1] perf bpf skels: Stop using vmlinux.h generated from BTF,
> > > >  use subset of used structs + CO-RE
> > > >
> > > > Linus reported a build break due to using a vmlinux without a BTF elf
> > > > section to generate the vmlinux.h header with bpftool for use in the BPF
> > > > tools in tools/perf/util/bpf_skel/*.bpf.c.
> > > >
> > > > Instead add a vmlinux.h file with the structs needed with the fields the
> > > > tools need, marking the structs with __attribute__((preserve_access_index)),
> > > > so that libbpf's CO-RE code can fixup the struct field offsets.
> > > >
> > > > In some cases the vmlinux.h file that was being generated by bpftool
> > > > from the kernel BTF information was not needed at all, just including
> > > > linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI
> > > > types were not being used.
> > > >
> > > > To keep te patch small, include those UAPI headers from the trimmed down
> > > > vmlinux.h file, that then provides the tools with just the structs and
> > > > the subset of its fields needed for them.
> > > >
> > > > Testing it:
> > > >
> > > >   # perf lock contention -b find / > /dev/null
> >
> > I tested perf lock con -abv -L rcu_state sleep 1
> > and needed fix below
> >
> > jirka
> 
> I thought this was fixed by:
> https://lore.kernel.org/lkml/20230427234833.1576130-1-namhyung@kernel.org/
> but I think that is just in perf-tools-next.

ah ok, missed that one

thanks,
jirka

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 21:15                               ` Jiri Olsa
@ 2023-05-05 21:21                                 ` Andrii Nakryiko
  2023-05-05 21:52                                   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 30+ messages in thread
From: Andrii Nakryiko @ 2023-05-05 21:21 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Ian Rogers, Arnaldo Carvalho de Melo, Linus Torvalds,
	Namhyung Kim, Song Liu, Andrii Nakryiko, Ingo Molnar,
	Thomas Gleixner, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Adrian Hunter, Changbin Du, Hao Luo,
	James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
	Thomas Richter, Arnaldo Carvalho de Melo, bpf, Alexei Starovoitov,
	Yang Jihong, Mark Rutland, Paul Clarke

On Fri, May 5, 2023 at 2:15 PM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Fri, May 05, 2023 at 01:46:30PM -0700, Ian Rogers wrote:
> > On Fri, May 5, 2023 at 1:43 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Fri, May 05, 2023 at 10:04:47AM -0700, Ian Rogers wrote:
> > > > On Fri, May 5, 2023 at 9:56 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > >
> > > > > Em Fri, May 05, 2023 at 10:33:15AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > > Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> > > > > > That with the preserve_access_index isn't needed, we need just the
> > > > > > fields that we access in the tools, right?
> > > > >
> > > > > I'm now doing build test this in many distro containers, without the two
> > > > > reverts, i.e. BPF skels continue as opt-out as in my pull request, to
> > > > > test build and also for the functionality tests on the tools using such
> > > > > bpf skels, see below, no touching of vmlinux nor BTF data during the
> > > > > build.
> > > > >
> > > > > - Arnaldo
> > > > >
> > > > > From 882adaee50bc27f85374aeb2fbaa5b76bef60d05 Mon Sep 17 00:00:00 2001
> > > > > From: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > > > Date: Thu, 4 May 2023 19:03:51 -0300
> > > > > Subject: [PATCH 1/1] perf bpf skels: Stop using vmlinux.h generated from BTF,
> > > > >  use subset of used structs + CO-RE
> > > > >
> > > > > Linus reported a build break due to using a vmlinux without a BTF elf
> > > > > section to generate the vmlinux.h header with bpftool for use in the BPF
> > > > > tools in tools/perf/util/bpf_skel/*.bpf.c.
> > > > >
> > > > > Instead add a vmlinux.h file with the structs needed with the fields the
> > > > > tools need, marking the structs with __attribute__((preserve_access_index)),
> > > > > so that libbpf's CO-RE code can fixup the struct field offsets.
> > > > >
> > > > > In some cases the vmlinux.h file that was being generated by bpftool
> > > > > from the kernel BTF information was not needed at all, just including
> > > > > linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI
> > > > > types were not being used.
> > > > >
> > > > > To keep te patch small, include those UAPI headers from the trimmed down
> > > > > vmlinux.h file, that then provides the tools with just the structs and
> > > > > the subset of its fields needed for them.
> > > > >
> > > > > Testing it:
> > > > >
> > > > >   # perf lock contention -b find / > /dev/null
> > >
> > > I tested perf lock con -abv -L rcu_state sleep 1
> > > and needed fix below
> > >
> > > jirka
> >
> > I thought this was fixed by:
> > https://lore.kernel.org/lkml/20230427234833.1576130-1-namhyung@kernel.org/
> > but I think that is just in perf-tools-next.
>
> ah ok, missed that one

Please try validating with veristat to check if all of perf's .bpf.o
files are successful. Veristat is part of selftests and can be built
with just `make -C tools/testing/selftests/bpf veristat`. After that;

 sudo ~/bin/veristat tools/perf/util/bpf_skel/.tmp/*.bpf.o

This is a surer way to check that BPF object files are ok at least on
your currently running kernel, than trying to exercise each BPF
program through perf commands.

>
> thanks,
> jirka

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 20:43                           ` Jiri Olsa
  2023-05-05 20:46                             ` Ian Rogers
@ 2023-05-05 21:33                             ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-05-05 21:33 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Ian Rogers, Linus Torvalds, Andrii Nakryiko, Namhyung Kim,
	Song Liu, Andrii Nakryiko, Ingo Molnar, Thomas Gleixner,
	Clark Williams, Kate Carcia, linux-kernel, linux-perf-users,
	Adrian Hunter, Changbin Du, Hao Luo, James Clark, Kan Liang,
	Roman Lozko, Stephane Eranian, Thomas Richter,
	Arnaldo Carvalho de Melo, bpf, Alexei Starovoitov, Yang Jihong,
	Mark Rutland, Paul Clarke

Em Fri, May 05, 2023 at 10:43:36PM +0200, Jiri Olsa escreveu:
> On Fri, May 05, 2023 at 10:04:47AM -0700, Ian Rogers wrote:
> > On Fri, May 5, 2023 at 9:56 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > >
> > > Em Fri, May 05, 2023 at 10:33:15AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> > > > That with the preserve_access_index isn't needed, we need just the
> > > > fields that we access in the tools, right?
> > >
> > > I'm now doing build test this in many distro containers, without the two
> > > reverts, i.e. BPF skels continue as opt-out as in my pull request, to
> > > test build and also for the functionality tests on the tools using such
> > > bpf skels, see below, no touching of vmlinux nor BTF data during the
> > > build.
> > >
> > > - Arnaldo
> > >
> > > From 882adaee50bc27f85374aeb2fbaa5b76bef60d05 Mon Sep 17 00:00:00 2001
> > > From: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > Date: Thu, 4 May 2023 19:03:51 -0300
> > > Subject: [PATCH 1/1] perf bpf skels: Stop using vmlinux.h generated from BTF,
> > >  use subset of used structs + CO-RE
> > >
> > > Linus reported a build break due to using a vmlinux without a BTF elf
> > > section to generate the vmlinux.h header with bpftool for use in the BPF
> > > tools in tools/perf/util/bpf_skel/*.bpf.c.
> > >
> > > Instead add a vmlinux.h file with the structs needed with the fields the
> > > tools need, marking the structs with __attribute__((preserve_access_index)),
> > > so that libbpf's CO-RE code can fixup the struct field offsets.
> > >
> > > In some cases the vmlinux.h file that was being generated by bpftool
> > > from the kernel BTF information was not needed at all, just including
> > > linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI
> > > types were not being used.
> > >
> > > To keep te patch small, include those UAPI headers from the trimmed down
> > > vmlinux.h file, that then provides the tools with just the structs and
> > > the subset of its fields needed for them.
> > >
> > > Testing it:
> > >
> > >   # perf lock contention -b find / > /dev/null
> 
> I tested perf lock con -abv -L rcu_state sleep 1
> and needed fix below
> 
> jirka

patch not applying trying to do it manually.

- Arnaldo
 
> 
> ---
> From b12aea55f1171dc09cde2957f9019c84bda7adbb Mon Sep 17 00:00:00 2001
> From: Jiri Olsa <jolsa@kernel.org>
> Date: Fri, 5 May 2023 13:28:46 +0200
> Subject: [PATCH] perf tools: Fix lock_contention bpf program
> 
> We need to define empty 'struct rq' so the runqueues gets
> resolved properly:
> 
>   # ./perf lock con -b
>   libbpf: extern (var ksym) 'runqueues': incompatible types, expected [99] fwd rq, but kernel has [19783] struct rq
>   libbpf: failed to load object 'lock_contention_bpf'
>   libbpf: failed to load BPF skeleton 'lock_contention_bpf': -22
>   Failed to load lock-contention BPF skeleton
> 
> Also rq__old/rq__new need additional '_' so the suffix is ignored
> properly.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/perf/util/bpf_skel/lock_contention.bpf.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/perf/util/bpf_skel/lock_contention.bpf.c b/tools/perf/util/bpf_skel/lock_contention.bpf.c
> index 8911e2a077d8..c2bf24c68c14 100644
> --- a/tools/perf/util/bpf_skel/lock_contention.bpf.c
> +++ b/tools/perf/util/bpf_skel/lock_contention.bpf.c
> @@ -416,13 +416,15 @@ int contention_end(u64 *ctx)
>  	return 0;
>  }
>  
> +struct rq {};
> +
>  extern struct rq runqueues __ksym;
>  
> -struct rq__old {
> +struct rq___old {
>  	raw_spinlock_t lock;
>  } __attribute__((preserve_access_index));
>  
> -struct rq__new {
> +struct rq___new {
>  	raw_spinlock_t __lock;
>  } __attribute__((preserve_access_index));
>  
> @@ -434,8 +436,8 @@ int BPF_PROG(collect_lock_syms)
>  
>  	for (int i = 0; i < MAX_CPUS; i++) {
>  		struct rq *rq = bpf_per_cpu_ptr(&runqueues, i);
> -		struct rq__new *rq_new = (void *)rq;
> -		struct rq__old *rq_old = (void *)rq;
> +		struct rq___new *rq_new = (void *)rq;
> +		struct rq___old *rq_old = (void *)rq;
>  
>  		if (rq == NULL)
>  			break;
> -- 
> 2.40.1
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 21:21                                 ` Andrii Nakryiko
@ 2023-05-05 21:52                                   ` Arnaldo Carvalho de Melo
  2023-05-05 21:55                                     ` Andrii Nakryiko
  0 siblings, 1 reply; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-05-05 21:52 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Ian Rogers, Linus Torvalds, Namhyung Kim, Song Liu,
	Andrii Nakryiko, Ingo Molnar, Thomas Gleixner, Clark Williams,
	Kate Carcia, linux-kernel, linux-perf-users, Adrian Hunter,
	Changbin Du, Hao Luo, James Clark, Kan Liang, Roman Lozko,
	Stephane Eranian, Thomas Richter, Arnaldo Carvalho de Melo, bpf,
	Alexei Starovoitov, Yang Jihong, Mark Rutland, Paul Clarke

Em Fri, May 05, 2023 at 02:21:56PM -0700, Andrii Nakryiko escreveu:
> On Fri, May 5, 2023 at 2:15 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Fri, May 05, 2023 at 01:46:30PM -0700, Ian Rogers wrote:
> > > On Fri, May 5, 2023 at 1:43 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > >
> > > > On Fri, May 05, 2023 at 10:04:47AM -0700, Ian Rogers wrote:
> > > > > On Fri, May 5, 2023 at 9:56 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > > >
> > > > > > Em Fri, May 05, 2023 at 10:33:15AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > > > Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> > > > > > > That with the preserve_access_index isn't needed, we need just the
> > > > > > > fields that we access in the tools, right?
> > > > > >
> > > > > > I'm now doing build test this in many distro containers, without the two
> > > > > > reverts, i.e. BPF skels continue as opt-out as in my pull request, to
> > > > > > test build and also for the functionality tests on the tools using such
> > > > > > bpf skels, see below, no touching of vmlinux nor BTF data during the
> > > > > > build.
> > > > > >
> > > > > > - Arnaldo
> > > > > >
> > > > > > From 882adaee50bc27f85374aeb2fbaa5b76bef60d05 Mon Sep 17 00:00:00 2001
> > > > > > From: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > > > > Date: Thu, 4 May 2023 19:03:51 -0300
> > > > > > Subject: [PATCH 1/1] perf bpf skels: Stop using vmlinux.h generated from BTF,
> > > > > >  use subset of used structs + CO-RE
> > > > > >
> > > > > > Linus reported a build break due to using a vmlinux without a BTF elf
> > > > > > section to generate the vmlinux.h header with bpftool for use in the BPF
> > > > > > tools in tools/perf/util/bpf_skel/*.bpf.c.
> > > > > >
> > > > > > Instead add a vmlinux.h file with the structs needed with the fields the
> > > > > > tools need, marking the structs with __attribute__((preserve_access_index)),
> > > > > > so that libbpf's CO-RE code can fixup the struct field offsets.
> > > > > >
> > > > > > In some cases the vmlinux.h file that was being generated by bpftool
> > > > > > from the kernel BTF information was not needed at all, just including
> > > > > > linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI
> > > > > > types were not being used.
> > > > > >
> > > > > > To keep te patch small, include those UAPI headers from the trimmed down
> > > > > > vmlinux.h file, that then provides the tools with just the structs and
> > > > > > the subset of its fields needed for them.
> > > > > >
> > > > > > Testing it:
> > > > > >
> > > > > >   # perf lock contention -b find / > /dev/null
> > > >
> > > > I tested perf lock con -abv -L rcu_state sleep 1
> > > > and needed fix below
> > > >
> > > > jirka
> > >
> > > I thought this was fixed by:
> > > https://lore.kernel.org/lkml/20230427234833.1576130-1-namhyung@kernel.org/
> > > but I think that is just in perf-tools-next.
> >
> > ah ok, missed that one
> 
> Please try validating with veristat to check if all of perf's .bpf.o
> files are successful. Veristat is part of selftests and can be built
> with just `make -C tools/testing/selftests/bpf veristat`. After that;
> 
>  sudo ~/bin/veristat tools/perf/util/bpf_skel/.tmp/*.bpf.o
> 
> This is a surer way to check that BPF object files are ok at least on
> your currently running kernel, than trying to exercise each BPF
> program through perf commands.

[acme@quaco perf-tools]$ sudo tools/testing/selftests/bpf/veristat /tmp/build/perf-tools/util/bpf_skel/.tmp/*.bpf.o
Processing 'bperf_cgroup.bpf.o'...
Processing 'bperf_follower.bpf.o'...
Processing 'bperf_leader.bpf.o'...
Processing 'bpf_prog_profiler.bpf.o'...
Processing 'func_latency.bpf.o'...
Processing 'kwork_trace.bpf.o'...
Processing 'lock_contention.bpf.o'...
Processing 'off_cpu.bpf.o'...
Processing 'sample_filter.bpf.o'...
File                     Program                          Verdict  Duration (us)   Insns  States  Peak states
-----------------------  -------------------------------  -------  -------------  ------  ------  -----------
bperf_cgroup.bpf.o       on_cgrp_switch                   success           6479   17025     417          174
bperf_cgroup.bpf.o       trigger_read                     success           6370   17025     417          174
bperf_follower.bpf.o     fexit_XXX                        failure              0       0       0            0
bperf_leader.bpf.o       on_switch                        success            360      49       3            3
bpf_prog_profiler.bpf.o  fentry_XXX                       failure              0       0       0            0
bpf_prog_profiler.bpf.o  fexit_XXX                        failure              0       0       0            0
func_latency.bpf.o       func_begin                       success            351      69       6            6
func_latency.bpf.o       func_end                         success            318     158      15           15
kwork_trace.bpf.o        latency_softirq_entry            success            334     108      10           10
kwork_trace.bpf.o        latency_softirq_raise            success            896    1993      34           34
kwork_trace.bpf.o        latency_workqueue_activate_work  success            333      46       4            4
kwork_trace.bpf.o        latency_workqueue_execute_start  success           1112    2219      41           41
kwork_trace.bpf.o        report_irq_handler_entry         success           1067    2118      34           34
kwork_trace.bpf.o        report_irq_handler_exit          success            334     110      10           10
kwork_trace.bpf.o        report_softirq_entry             success            897    1993      34           34
kwork_trace.bpf.o        report_softirq_exit              success            329     108      10           10
kwork_trace.bpf.o        report_workqueue_execute_end     success           1124    2219      41           41
kwork_trace.bpf.o        report_workqueue_execute_start   success            295      46       4            4
lock_contention.bpf.o    collect_lock_syms                failure              0       0       0            0
lock_contention.bpf.o    contention_begin                 failure              0       0       0            0
lock_contention.bpf.o    contention_end                   failure              0       0       0            0
off_cpu.bpf.o            on_newtask                       success            387      37       3            3
off_cpu.bpf.o            on_switch                        success            536     220      20           20
sample_filter.bpf.o      perf_sample_filter               success         190443  190237   11173          923
-----------------------  -------------------------------  -------  -------------  ------  ------  -----------
Done. Processed 9 files, 0 programs. Skipped 24 files, 0 programs.
[acme@quaco perf-tools]$

What extra info can we get from these "failure" lines?

- Arnaldo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 21:52                                   ` Arnaldo Carvalho de Melo
@ 2023-05-05 21:55                                     ` Andrii Nakryiko
  0 siblings, 0 replies; 30+ messages in thread
From: Andrii Nakryiko @ 2023-05-05 21:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Linus Torvalds, Namhyung Kim, Song Liu,
	Andrii Nakryiko, Ingo Molnar, Thomas Gleixner, Clark Williams,
	Kate Carcia, linux-kernel, linux-perf-users, Adrian Hunter,
	Changbin Du, Hao Luo, James Clark, Kan Liang, Roman Lozko,
	Stephane Eranian, Thomas Richter, Arnaldo Carvalho de Melo, bpf,
	Alexei Starovoitov, Yang Jihong, Mark Rutland, Paul Clarke

On Fri, May 5, 2023 at 2:52 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> Em Fri, May 05, 2023 at 02:21:56PM -0700, Andrii Nakryiko escreveu:
> > On Fri, May 5, 2023 at 2:15 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Fri, May 05, 2023 at 01:46:30PM -0700, Ian Rogers wrote:
> > > > On Fri, May 5, 2023 at 1:43 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > > >
> > > > > On Fri, May 05, 2023 at 10:04:47AM -0700, Ian Rogers wrote:
> > > > > > On Fri, May 5, 2023 at 9:56 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > > > >
> > > > > > > Em Fri, May 05, 2023 at 10:33:15AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > > > > Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> > > > > > > > That with the preserve_access_index isn't needed, we need just the
> > > > > > > > fields that we access in the tools, right?
> > > > > > >
> > > > > > > I'm now doing build test this in many distro containers, without the two
> > > > > > > reverts, i.e. BPF skels continue as opt-out as in my pull request, to
> > > > > > > test build and also for the functionality tests on the tools using such
> > > > > > > bpf skels, see below, no touching of vmlinux nor BTF data during the
> > > > > > > build.
> > > > > > >
> > > > > > > - Arnaldo
> > > > > > >
> > > > > > > From 882adaee50bc27f85374aeb2fbaa5b76bef60d05 Mon Sep 17 00:00:00 2001
> > > > > > > From: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > > > > > Date: Thu, 4 May 2023 19:03:51 -0300
> > > > > > > Subject: [PATCH 1/1] perf bpf skels: Stop using vmlinux.h generated from BTF,
> > > > > > >  use subset of used structs + CO-RE
> > > > > > >
> > > > > > > Linus reported a build break due to using a vmlinux without a BTF elf
> > > > > > > section to generate the vmlinux.h header with bpftool for use in the BPF
> > > > > > > tools in tools/perf/util/bpf_skel/*.bpf.c.
> > > > > > >
> > > > > > > Instead add a vmlinux.h file with the structs needed with the fields the
> > > > > > > tools need, marking the structs with __attribute__((preserve_access_index)),
> > > > > > > so that libbpf's CO-RE code can fixup the struct field offsets.
> > > > > > >
> > > > > > > In some cases the vmlinux.h file that was being generated by bpftool
> > > > > > > from the kernel BTF information was not needed at all, just including
> > > > > > > linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI
> > > > > > > types were not being used.
> > > > > > >
> > > > > > > To keep te patch small, include those UAPI headers from the trimmed down
> > > > > > > vmlinux.h file, that then provides the tools with just the structs and
> > > > > > > the subset of its fields needed for them.
> > > > > > >
> > > > > > > Testing it:
> > > > > > >
> > > > > > >   # perf lock contention -b find / > /dev/null
> > > > >
> > > > > I tested perf lock con -abv -L rcu_state sleep 1
> > > > > and needed fix below
> > > > >
> > > > > jirka
> > > >
> > > > I thought this was fixed by:
> > > > https://lore.kernel.org/lkml/20230427234833.1576130-1-namhyung@kernel.org/
> > > > but I think that is just in perf-tools-next.
> > >
> > > ah ok, missed that one
> >
> > Please try validating with veristat to check if all of perf's .bpf.o
> > files are successful. Veristat is part of selftests and can be built
> > with just `make -C tools/testing/selftests/bpf veristat`. After that;
> >
> >  sudo ~/bin/veristat tools/perf/util/bpf_skel/.tmp/*.bpf.o
> >
> > This is a surer way to check that BPF object files are ok at least on
> > your currently running kernel, than trying to exercise each BPF
> > program through perf commands.
>
> [acme@quaco perf-tools]$ sudo tools/testing/selftests/bpf/veristat /tmp/build/perf-tools/util/bpf_skel/.tmp/*.bpf.o
> Processing 'bperf_cgroup.bpf.o'...
> Processing 'bperf_follower.bpf.o'...
> Processing 'bperf_leader.bpf.o'...
> Processing 'bpf_prog_profiler.bpf.o'...
> Processing 'func_latency.bpf.o'...
> Processing 'kwork_trace.bpf.o'...
> Processing 'lock_contention.bpf.o'...
> Processing 'off_cpu.bpf.o'...
> Processing 'sample_filter.bpf.o'...
> File                     Program                          Verdict  Duration (us)   Insns  States  Peak states
> -----------------------  -------------------------------  -------  -------------  ------  ------  -----------
> bperf_cgroup.bpf.o       on_cgrp_switch                   success           6479   17025     417          174
> bperf_cgroup.bpf.o       trigger_read                     success           6370   17025     417          174
> bperf_follower.bpf.o     fexit_XXX                        failure              0       0       0            0
> bperf_leader.bpf.o       on_switch                        success            360      49       3            3
> bpf_prog_profiler.bpf.o  fentry_XXX                       failure              0       0       0            0
> bpf_prog_profiler.bpf.o  fexit_XXX                        failure              0       0       0            0
> func_latency.bpf.o       func_begin                       success            351      69       6            6
> func_latency.bpf.o       func_end                         success            318     158      15           15
> kwork_trace.bpf.o        latency_softirq_entry            success            334     108      10           10
> kwork_trace.bpf.o        latency_softirq_raise            success            896    1993      34           34
> kwork_trace.bpf.o        latency_workqueue_activate_work  success            333      46       4            4
> kwork_trace.bpf.o        latency_workqueue_execute_start  success           1112    2219      41           41
> kwork_trace.bpf.o        report_irq_handler_entry         success           1067    2118      34           34
> kwork_trace.bpf.o        report_irq_handler_exit          success            334     110      10           10
> kwork_trace.bpf.o        report_softirq_entry             success            897    1993      34           34
> kwork_trace.bpf.o        report_softirq_exit              success            329     108      10           10
> kwork_trace.bpf.o        report_workqueue_execute_end     success           1124    2219      41           41
> kwork_trace.bpf.o        report_workqueue_execute_start   success            295      46       4            4
> lock_contention.bpf.o    collect_lock_syms                failure              0       0       0            0
> lock_contention.bpf.o    contention_begin                 failure              0       0       0            0
> lock_contention.bpf.o    contention_end                   failure              0       0       0            0
> off_cpu.bpf.o            on_newtask                       success            387      37       3            3
> off_cpu.bpf.o            on_switch                        success            536     220      20           20
> sample_filter.bpf.o      perf_sample_filter               success         190443  190237   11173          923
> -----------------------  -------------------------------  -------  -------------  ------  ------  -----------
> Done. Processed 9 files, 0 programs. Skipped 24 files, 0 programs.
> [acme@quaco perf-tools]$
>
> What extra info can we get from these "failure" lines?

you can ask for verifier log with -v (or very detailed verifier log
with -vl2). And you can narrow down to single program at a time with
-f <prog_name>:

sudo ./veristat -v <path-to-bpf.o> -f <prog_name>

Not every possible program can be correctly loaded by veristat (e.g.,
if fentry/fexit doesn't specify target function). But in the above
case everything from lock_contention.bpf.o is definitely worth
checking.

>
> - Arnaldo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 13:18                   ` Arnaldo Carvalho de Melo
@ 2023-05-06  1:13                     ` Yang Jihong
  0 siblings, 0 replies; 30+ messages in thread
From: Yang Jihong @ 2023-05-06  1:13 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Andrii Nakryiko
  Cc: Linus Torvalds, Song Liu, Andrii Nakryiko, Ingo Molnar,
	Thomas Gleixner, Jiri Olsa, Clark Williams, Kate Carcia,
	linux-kernel, linux-perf-users, Adrian Hunter, Changbin Du,
	Hao Luo, Ian Rogers, James Clark, Kan Liang, Roman Lozko,
	Stephane Eranian, Thomas Richter, Arnaldo Carvalho de Melo, bpf

Hello,

On 2023/5/5 21:18, Arnaldo Carvalho de Melo wrote:
> Em Thu, May 04, 2023 at 07:01:51PM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Thu, May 04, 2023 at 06:48:50PM -0300, Arnaldo Carvalho de Melo escreveu:
>>> Em Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>> Em Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko escreveu:
>>>>> On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>>>>>> Andrii, can you add some more information about the usage of vmlinux.h
>>>>>> instead of using kernel headers?
>>>>   
>>>>> I'll just say that vmlinux.h is not a hard requirement to build BPF
>>>>> programs, it's more a convenience allowing easy access to definitions
>>>>> of both UAPI and kernel-internal structures for tracing needs and
>>>>> marking them relocatable using BPF CO-RE machinery. Lots of real-world
>>>>> applications just check-in pregenerated vmlinux.h to avoid build-time
>>>>> dependency on up-to-date host kernel and such.
>>>>   
>>>>> If vmlinux.h generation and usage is causing issues, though, given
>>>>> that perf's BPF programs don't seem to be using many different kernel
>>>>> types, it might be a better option to just use UAPI headers for public
>>>>> kernel type definitions, and just define CO-RE-relocatable minimal
>>>>> definitions locally in perf's BPF code for the other types necessary.
>>>>> E.g., if perf needs only pid and tgid from task_struct, this would
>>>>> suffice:
>>>>   
>>>>> struct task_struct {
>>>>>      int pid;
>>>>>      int tgid;
>>>>> } __attribute__((preserve_access_index));
>>>>
>>>> Yeah, that seems like a way better approach, no vmlinux involved, libbpf
>>>> CO-RE notices that task_struct changed from this two integers version
>>>> (of course) and does the relocation to where it is in the running kernel
>>>> by using /sys/kernel/btf/vmlinux.
>>>
>>> Doing it for one of the skels, build tested, runtime untested, but not
>>> using any vmlinux, BTF to help, not that bad, more verbose, but at least
>>> we state what are the fields we actually use, have those attribute
>>> documenting that those offsets will be recorded for future use, etc.
> 
> Yang, can you please check that this works?
> 
Yes, I've tested this patch and it works :)

Tested-by: Yang Jihong <yangjihong1@huawei.com>

# perf kwork report -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
   Kwork Name                     | Cpu  | Total Runtime | Count     | 
Max runtime   | Max runtime start   | Max runtime end     |
 
--------------------------------------------------------------------------------------------------------------------------------
   (s)SCHED:7                     | 0005 |      1.440 ms |         3 | 
     1.377 ms |     165822.963188 s |     165822.964565 s |
   (s)SCHED:7                     | 0001 |      1.388 ms |         2 | 
     1.377 ms |     165822.963188 s |     165822.964565 s |
   (w)e1000_watchdog              | 0002 |      0.532 ms |         1 | 
     0.532 ms |     165823.806777 s |     165823.807309 s |
   (w)flush_to_ldisc              | 0005 |      0.524 ms |         1 | 
     0.524 ms |     165824.255219 s |     165824.255743 s |
   (s)NET_RX:3                    | 0002 |      0.512 ms |         4 | 
     0.247 ms |     165824.254103 s |     165824.254350 s |
   (w)wq_barrier_func             | 0003 |      0.394 ms |         1 | 
     0.394 ms |     165822.964179 s |     165822.964573 s |
   eth0:10                        | 0002 |      0.250 ms |         4 | 
     0.115 ms |     165822.964814 s |     165822.964929 s |
   (w)wb_workfn                   | 0005 |      0.208 ms |         1 | 
     0.208 ms |     165823.998558 s |     165823.998766 s |
   (s)TIMER:1                     | 0002 |      0.205 ms |         4 | 
     0.077 ms |     165823.806548 s |     165823.806626 s |
   (s)SCHED:7                     | 0002 |      0.185 ms |         5 | 
     0.049 ms |     165824.255119 s |     165824.255168 s |
   (s)TIMER:1                     | 0003 |      0.167 ms |         4 | 
     0.079 ms |     165823.998326 s |     165823.998405 s |
   (s)SCHED:7                     | 0000 |      0.165 ms |         4 | 
     0.073 ms |     165823.807235 s |     165823.807308 s |
   (w)flush_to_ldisc              | 0003 |      0.156 ms |         1 | 
     0.156 ms |     165824.255723 s |     165824.255879 s |
   (s)TIMER:1                     | 0007 |      0.152 ms |         2 | 
     0.100 ms |     165823.358416 s |     165823.358517 s |
   (w)vmstat_update               | 0001 |      0.094 ms |         1 | 
     0.094 ms |     165824.256460 s |     165824.256554 s |
   (s)RCU:9                       | 0001 |      0.088 ms |         4 | 
     0.040 ms |     165822.964728 s |     165822.964768 s |
   (w)vmstat_shepherd             | 0000 |      0.078 ms |         1 | 
     0.078 ms |     165824.256321 s |     165824.256399 s |
   (s)SCHED:7                     | 0007 |      0.066 ms |         2 | 
     0.038 ms |     165823.358528 s |     165823.358566 s |
   (s)SCHED:7                     | 0003 |      0.057 ms |         2 | 
     0.034 ms |     165823.998412 s |     165823.998446 s |
   virtio0-requests:25            | 0000 |      0.056 ms |         1 | 
     0.056 ms |     165824.255959 s |     165824.256016 s |
   (s)TIMER:1                     | 0000 |      0.055 ms |         1 | 
     0.055 ms |     165824.256164 s |     165824.256219 s |
   (s)RCU:9                       | 0005 |      0.045 ms |         3 | 
     0.023 ms |     165822.964728 s |     165822.964751 s |
   (s)RCU:9                       | 0002 |      0.028 ms |         2 | 
     0.017 ms |     165823.174539 s |     165823.174556 s |
   (s)RCU:9                       | 0007 |      0.016 ms |         1 | 
     0.016 ms |     165823.358571 s |     165823.358587 s |
   (s)RCU:9                       | 0000 |      0.013 ms |         1 | 
     0.013 ms |     165824.256264 s |     165824.256278 s |
   (s)RCU:9                       | 0003 |      0.011 ms |         1 | 
     0.011 ms |     165822.973142 s |     165822.973153 s |
 
--------------------------------------------------------------------------------------------------------------------------------

# perf kwork lat -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
   Kwork Name                     | Cpu  | Avg delay     | Count     | 
Max delay     | Max delay start     | Max delay end       |
 
--------------------------------------------------------------------------------------------------------------------------------
   (w)neigh_periodic_work         | 0001 |      0.568 ms |         1 | 
     0.568 ms |     165839.038372 s |     165839.038941 s |
   (s)RCU:9                       | 0000 |      0.292 ms |         1 | 
     0.292 ms |     165840.176224 s |     165840.176515 s |
   (s)TIMER:1                     | 0000 |      0.260 ms |         1 | 
     0.260 ms |     165840.176206 s |     165840.176466 s |
   (w)disk_events_workfn          | 0001 |      0.256 ms |         1 | 
     0.256 ms |     165839.038327 s |     165839.038583 s |
   (s)RCU:9                       | 0001 |      0.204 ms |         1 | 
     0.204 ms |     165839.038213 s |     165839.038417 s |
   (s)SCHED:7                     | 0001 |      0.153 ms |         1 | 
     0.153 ms |     165839.038231 s |     165839.038384 s |
   (s)NET_RX:3                    | 0002 |      0.132 ms |         3 | 
     0.184 ms |     165840.175984 s |     165840.176168 s |
   (w)ata_sff_pio_task            | 0001 |      0.124 ms |         1 | 
     0.124 ms |     165839.038852 s |     165839.038976 s |
   (s)SCHED:7                     | 0000 |      0.122 ms |         2 | 
     0.193 ms |     165840.176244 s |     165840.176437 s |
   (s)RCU:9                       | 0007 |      0.106 ms |         1 | 
     0.106 ms |     165838.982220 s |     165838.982326 s |
   (s)RCU:9                       | 0003 |      0.092 ms |         1 | 
     0.092 ms |     165839.040098 s |     165839.040189 s |
   (s)TIMER:1                     | 0001 |      0.085 ms |         1 | 
     0.085 ms |     165839.038179 s |     165839.038264 s |
   (s)SCHED:7                     | 0007 |      0.078 ms |         3 | 
     0.086 ms |     165839.990180 s |     165839.990265 s |
   (s)TIMER:1                     | 0007 |      0.077 ms |         3 | 
     0.081 ms |     165839.990136 s |     165839.990216 s |
   (s)TIMER:1                     | 0003 |      0.076 ms |         1 | 
     0.076 ms |     165839.040066 s |     165839.040142 s |
   (s)RCU:9                       | 0002 |      0.075 ms |         2 | 
     0.088 ms |     165839.118209 s |     165839.118297 s |
   (w)flush_to_ldisc              | 0006 |      0.070 ms |         1 | 
     0.070 ms |     165840.175332 s |     165840.175402 s |
   (s)TIMER:1                     | 0006 |      0.067 ms |         2 | 
     0.085 ms |     165838.908056 s |     165838.908142 s |
   (s)TIMER:1                     | 0002 |      0.067 ms |         2 | 
     0.074 ms |     165840.175076 s |     165840.175149 s |
   (s)SCHED:7                     | 0006 |      0.065 ms |         1 | 
     0.065 ms |     165838.908107 s |     165838.908171 s |
   (s)SCHED:7                     | 0003 |      0.051 ms |         1 | 
     0.051 ms |     165839.040115 s |     165839.040166 s |
   (s)SCHED:7                     | 0002 |      0.046 ms |         2 | 
     0.048 ms |     165840.175123 s |     165840.175171 s |
   (s)RCU:9                       | 0006 |      0.035 ms |         2 | 
     0.036 ms |     165838.913051 s |     165838.913087 s |
   (s)BLOCK:4                     | 0001 |      0.023 ms |         1 | 
     0.023 ms |     165839.039281 s |     165839.039303 s |
 
--------------------------------------------------------------------------------------------------------------------------------

Thanks,
Yang

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 16:56                       ` [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was " Arnaldo Carvalho de Melo
  2023-05-05 17:04                         ` Ian Rogers
@ 2023-05-08 21:53                         ` Ian Rogers
  1 sibling, 0 replies; 30+ messages in thread
From: Ian Rogers @ 2023-05-08 21:53 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Linus Torvalds, Andrii Nakryiko, Namhyung Kim,
	Song Liu, Andrii Nakryiko, Ingo Molnar, Thomas Gleixner,
	Clark Williams, Kate Carcia, linux-kernel, linux-perf-users,
	Adrian Hunter, Changbin Du, Hao Luo, James Clark, Kan Liang,
	Roman Lozko, Stephane Eranian, Thomas Richter,
	Arnaldo Carvalho de Melo, bpf, Alexei Starovoitov, Yang Jihong,
	Mark Rutland, Paul Clarke

On Fri, May 5, 2023 at 9:56 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> Em Fri, May 05, 2023 at 10:33:15AM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> > That with the preserve_access_index isn't needed, we need just the
> > fields that we access in the tools, right?
>
> I'm now doing build test this in many distro containers, without the two
> reverts, i.e. BPF skels continue as opt-out as in my pull request, to
> test build and also for the functionality tests on the tools using such
> bpf skels, see below, no touching of vmlinux nor BTF data during the
> build.
>
> - Arnaldo
>
> From 882adaee50bc27f85374aeb2fbaa5b76bef60d05 Mon Sep 17 00:00:00 2001
> From: Arnaldo Carvalho de Melo <acme@redhat.com>
> Date: Thu, 4 May 2023 19:03:51 -0300
> Subject: [PATCH 1/1] perf bpf skels: Stop using vmlinux.h generated from BTF,
>  use subset of used structs + CO-RE
>
> Linus reported a build break due to using a vmlinux without a BTF elf
> section to generate the vmlinux.h header with bpftool for use in the BPF
> tools in tools/perf/util/bpf_skel/*.bpf.c.
>
> Instead add a vmlinux.h file with the structs needed with the fields the
> tools need, marking the structs with __attribute__((preserve_access_index)),
> so that libbpf's CO-RE code can fixup the struct field offsets.
>
> In some cases the vmlinux.h file that was being generated by bpftool
> from the kernel BTF information was not needed at all, just including
> linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI
> types were not being used.
>
> To keep te patch small, include those UAPI headers from the trimmed down
> vmlinux.h file, that then provides the tools with just the structs and
> the subset of its fields needed for them.
>
> Testing it:
>
>   # perf lock contention -b find / > /dev/null
>   ^C contended   total wait     max wait     avg wait         type   caller
>
>            7     53.59 us     10.86 us      7.66 us     rwlock:R   start_this_handle+0xa0
>            2     30.35 us     21.99 us     15.17 us      rwsem:R   iterate_dir+0x52
>            1      9.04 us      9.04 us      9.04 us     rwlock:W   start_this_handle+0x291
>            1      8.73 us      8.73 us      8.73 us     spinlock   raw_spin_rq_lock_nested+0x1e
>   #
>   # perf lock contention -abl find / > /dev/null
>   ^C contended   total wait     max wait     avg wait            address   symbol
>
>            1    262.96 ms    262.96 ms    262.96 ms   ffff8e67502d0170    (mutex)
>           12    244.24 us     39.91 us     20.35 us   ffff8e6af56f8070   mmap_lock (rwsem)
>            7     30.28 us      6.85 us      4.33 us   ffff8e6c865f1d40   rq_lock (spinlock)
>            3      7.42 us      4.03 us      2.47 us   ffff8e6c864b1d40   rq_lock (spinlock)
>            2      3.72 us      2.19 us      1.86 us   ffff8e6c86571d40   rq_lock (spinlock)
>            1      2.42 us      2.42 us      2.42 us   ffff8e6c86471d40   rq_lock (spinlock)
>            4      2.11 us       559 ns       527 ns   ffffffff9a146c80   rcu_state (spinlock)
>            3      1.45 us       818 ns       482 ns   ffff8e674ae8384c    (rwlock)
>            1       870 ns       870 ns       870 ns   ffff8e68456ee060    (rwlock)
>            1       663 ns       663 ns       663 ns   ffff8e6c864f1d40   rq_lock (spinlock)
>            1       573 ns       573 ns       573 ns   ffff8e6c86531d40   rq_lock (spinlock)
>            1       472 ns       472 ns       472 ns   ffff8e6c86431740    (spinlock)
>            1       397 ns       397 ns       397 ns   ffff8e67413a4f04    (spinlock)
>   #
>   # perf test offcpu
>   95: perf record offcpu profiling tests                              : Ok
>   #
>   # perf kwork latency --use-bpf
>   Starting trace, Hit <Ctrl+C> to stop and report
>   ^C
>     Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
>    --------------------------------------------------------------------------------------------------------------------------------
>     (w)flush_memcg_stats_dwork     | 0000 |   1056.212 ms |         2 |   2112.345 ms |     550113.229573 s |     550115.341919 s |
>     (w)toggle_allocation_gate      | 0000 |     10.144 ms |        62 |    416.389 ms |     550113.453518 s |     550113.869907 s |
>     (w)0xffff8e6748e28080          | 0002 |      0.623 ms |         1 |      0.623 ms |     550110.989841 s |     550110.990464 s |
>     (w)vmstat_shepherd             | 0000 |      0.586 ms |        10 |      2.828 ms |     550111.971536 s |     550111.974364 s |
>     (w)vmstat_update               | 0007 |      0.363 ms |         5 |      1.634 ms |     550113.222520 s |     550113.224154 s |
>     (w)vmstat_update               | 0000 |      0.324 ms |        10 |      2.827 ms |     550111.971526 s |     550111.974354 s |
>     (w)0xffff8e674c5f4a58          | 0002 |      0.102 ms |         5 |      0.134 ms |     550110.989839 s |     550110.989972 s |
>     (w)psi_avgs_work               | 0001 |      0.086 ms |         3 |      0.107 ms |     550114.957852 s |     550114.957959 s |
>     (w)psi_avgs_work               | 0000 |      0.079 ms |         5 |      0.100 ms |     550118.605668 s |     550118.605768 s |
>     (w)kfree_rcu_monitor           | 0006 |      0.079 ms |         1 |      0.079 ms |     550110.925821 s |     550110.925900 s |
>     (w)psi_avgs_work               | 0004 |      0.079 ms |         1 |      0.079 ms |     550109.581835 s |     550109.581914 s |
>     (w)psi_avgs_work               | 0001 |      0.078 ms |         1 |      0.078 ms |     550109.197809 s |     550109.197887 s |
>     (w)psi_avgs_work               | 0002 |      0.077 ms |         5 |      0.086 ms |     550110.669819 s |     550110.669905 s |
>   <SNIP>
>   # strace -e bpf -o perf-stat-bpf-counters.output perf stat -e cycles --bpf-counters sleep 1
>
>    Performance counter stats for 'sleep 1':
>
>            6,197,983      cycles
>
>          1.003922848 seconds time elapsed
>
>          0.000000000 seconds user
>          0.002032000 seconds sys
>
>   # head -7 perf-stat-bpf-counters.output
>   bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 16) = 3
>   bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=3, info_len=88, info=0x7ffcead64990}}, 16) = 0
>   bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x24129e0, value=0x7ffcead65a48, flags=BPF_ANY}, 32) = 0
>   bpf(BPF_LINK_GET_FD_BY_ID, {link_id=1252}, 12) = -1 ENOENT (No such file or directory)
>   bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffcead65780, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 116) = 4
>   bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffcead65920, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0, fd_array=NULL}, 128) = 4
>   bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\20\0\0\0\20\0\0\0\5\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=45, btf_log_size=0, btf_log_level=0}, 28) = 4
>   #
>
> Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Ian Rogers <irogers@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Co-developed-by: Jiri Olsa <jolsa@kernel.org>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
>  tools/perf/Makefile.perf            |  20 +---
>  tools/perf/util/bpf_skel/.gitignore |   1 -
>  tools/perf/util/bpf_skel/vmlinux.h  | 173 ++++++++++++++++++++++++++++
>  3 files changed, 174 insertions(+), 20 deletions(-)
>  create mode 100644 tools/perf/util/bpf_skel/vmlinux.h
>
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 48aba186ceb50792..61c33d100b2bcc90 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -1063,25 +1063,7 @@ $(BPFTOOL): | $(SKEL_TMP_OUT)
>         $(Q)CFLAGS= $(MAKE) -C ../bpf/bpftool \
>                 OUTPUT=$(SKEL_TMP_OUT)/ bootstrap
>
> -VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux)                           \
> -                    $(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux)    \
> -                    ../../vmlinux                                      \
> -                    /sys/kernel/btf/vmlinux                            \
> -                    /boot/vmlinux-$(shell uname -r)
> -VMLINUX_BTF ?= $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS))))
> -
> -$(SKEL_OUT)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL)
> -ifeq ($(VMLINUX_H),)
> -       $(QUIET_GEN)$(BPFTOOL) btf dump file $< format c > $@ || \
> -       (echo "Failure to generate vmlinux.h needed for the recommended BPF skeleton support." && \
> -       echo "To disable this use the build option NO_BPF_SKEL=1." && \
> -       echo "Alternatively point at a pre-generated vmlinux.h with VMLINUX_H=<path>." && \
> -       false)
> -else
> -       $(Q)cp "$(VMLINUX_H)" $@
> -endif
> -
> -$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) $(SKEL_OUT)/vmlinux.h | $(SKEL_TMP_OUT)
> +$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) | $(SKEL_TMP_OUT)
>         $(QUIET_CLANG)$(CLANG) -g -O2 -target bpf -Wall -Werror $(BPF_INCLUDE) \
>           -c $(filter util/bpf_skel/%.bpf.c,$^) -o $@ && $(LLVM_STRIP) -g $@
>
> diff --git a/tools/perf/util/bpf_skel/.gitignore b/tools/perf/util/bpf_skel/.gitignore
> index cd01455e1b53c3d9..7a1c832825de8445 100644
> --- a/tools/perf/util/bpf_skel/.gitignore
> +++ b/tools/perf/util/bpf_skel/.gitignore
> @@ -1,4 +1,3 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  .tmp
>  *.skel.h
> -vmlinux.h
> diff --git a/tools/perf/util/bpf_skel/vmlinux.h b/tools/perf/util/bpf_skel/vmlinux.h
> new file mode 100644
> index 0000000000000000..449b1ea91fc48143
> --- /dev/null
> +++ b/tools/perf/util/bpf_skel/vmlinux.h
> @@ -0,0 +1,173 @@
> +#ifndef __VMLINUX_H
> +#define __VMLINUX_H
> +
> +#include <linux/bpf.h>
> +#include <linux/types.h>

Is this inclusion tools/include/linux/types.h or
tools/include/uapi/linux/types.h? The former is the norm in the perf
tree:
https://lore.kernel.org/linux-perf-users/CAP-5=fXKi+VAr-_n5tAaJ7Z2fvU7jc5N-CKCjkCAh_01_pzMfA@mail.gmail.com/
and that has the definitions:
typedef uint64_t u64;
typedef int64_t s64;

> +#include <linux/perf_event.h>
> +#include <stdbool.h>
> +
> +// non-UAPI kernel data structures, used in the .bpf.c BPF tool component.
> +
> +// Just the fields used in these tools preserving the access index so that
> +// libbpf can fixup offsets with the ones used in the kernel when loading the
> +// BPF bytecode, if they differ from what is used here.
> +
> +typedef __u8 u8;
> +typedef __u32 u32;
> +typedef __u64 u64;
> +typedef __s64 s64;

which then collide with these two definitions. On my builds this triggers:
error: typedef redefinition with different types ('__u64' (aka
'unsigned long long') vs 'uint64_t' (aka 'unsigned long'))
I'm working around the issue by going back to using a generated vmlinux.h.

Thanks,
Ian

> +
> +typedef int pid_t;
> +
> +enum cgroup_subsys_id {
> +       perf_event_cgrp_id  = 8,
> +};
> +
> +enum {
> +       HI_SOFTIRQ = 0,
> +       TIMER_SOFTIRQ,
> +       NET_TX_SOFTIRQ,
> +       NET_RX_SOFTIRQ,
> +       BLOCK_SOFTIRQ,
> +       IRQ_POLL_SOFTIRQ,
> +       TASKLET_SOFTIRQ,
> +       SCHED_SOFTIRQ,
> +       HRTIMER_SOFTIRQ,
> +       RCU_SOFTIRQ,    /* Preferable RCU should always be the last softirq */
> +
> +       NR_SOFTIRQS
> +};
> +
> +typedef struct {
> +       s64     counter;
> +} __attribute__((preserve_access_index)) atomic64_t;
> +
> +typedef atomic64_t atomic_long_t;
> +
> +struct raw_spinlock {
> +       int rawlock;
> +} __attribute__((preserve_access_index));
> +
> +typedef struct raw_spinlock raw_spinlock_t;
> +
> +typedef struct {
> +       struct raw_spinlock rlock;
> +} __attribute__((preserve_access_index)) spinlock_t;
> +
> +struct sighand_struct {
> +       spinlock_t siglock;
> +} __attribute__((preserve_access_index));
> +
> +struct rw_semaphore {
> +       atomic_long_t owner;
> +} __attribute__((preserve_access_index));
> +
> +struct mutex {
> +       atomic_long_t owner;
> +} __attribute__((preserve_access_index));
> +
> +struct kernfs_node {
> +       u64 id;
> +} __attribute__((preserve_access_index));
> +
> +struct cgroup {
> +       struct kernfs_node *kn;
> +       int                level;
> +}  __attribute__((preserve_access_index));
> +
> +struct cgroup_subsys_state {
> +       struct cgroup *cgroup;
> +} __attribute__((preserve_access_index));
> +
> +struct css_set {
> +       struct cgroup_subsys_state *subsys[13];
> +       struct cgroup *dfl_cgrp;
> +} __attribute__((preserve_access_index));
> +
> +struct mm_struct {
> +       struct rw_semaphore mmap_lock;
> +} __attribute__((preserve_access_index));
> +
> +struct task_struct {
> +       unsigned int          flags;
> +       struct mm_struct      *mm;
> +       pid_t                 pid;
> +       pid_t                 tgid;
> +       char                  comm[16];
> +       struct sighand_struct *sighand;
> +       struct css_set        *cgroups;
> +} __attribute__((preserve_access_index));
> +
> +struct trace_entry {
> +       short unsigned int type;
> +       unsigned char      flags;
> +       unsigned char      preempt_count;
> +       int                pid;
> +} __attribute__((preserve_access_index));
> +
> +struct trace_event_raw_irq_handler_entry {
> +       struct trace_entry ent;
> +       int                irq;
> +       u32                __data_loc_name;
> +       char               __data[];
> +} __attribute__((preserve_access_index));
> +
> +struct trace_event_raw_irq_handler_exit {
> +       struct trace_entry ent;
> +       int                irq;
> +       int                ret;
> +       char               __data[];
> +} __attribute__((preserve_access_index));
> +
> +struct trace_event_raw_softirq {
> +       struct trace_entry ent;
> +       unsigned int       vec;
> +       char               __data[];
> +} __attribute__((preserve_access_index));
> +
> +struct trace_event_raw_workqueue_execute_start {
> +       struct trace_entry ent;
> +       void               *work;
> +       void               *function;
> +       char               __data[];
> +} __attribute__((preserve_access_index));
> +
> +struct trace_event_raw_workqueue_execute_end {
> +       struct trace_entry ent;
> +       void               *work;
> +       void               *function;
> +       char              __data[];
> +} __attribute__((preserve_access_index));
> +
> +struct trace_event_raw_workqueue_activate_work {
> +       struct trace_entry ent;
> +       void               *work;
> +       char               __data[];
> +} __attribute__((preserve_access_index));
> +
> +struct perf_sample_data {
> +       u64                      addr;
> +       u64                      period;
> +       union perf_sample_weight weight;
> +       u64                      txn;
> +       union perf_mem_data_src  data_src;
> +       u64                      ip;
> +       struct {
> +               u32              pid;
> +               u32              tid;
> +       } tid_entry;
> +       u64                      time;
> +       u64                      id;
> +       struct {
> +               u32              cpu;
> +       } cpu_entry;
> +       u64                      phys_addr;
> +       u64                      data_page_size;
> +       u64                      code_page_size;
> +} __attribute__((__aligned__(64))) __attribute__((preserve_access_index));
> +
> +struct bpf_perf_event_data_kern {
> +       struct perf_sample_data *data;
> +       struct perf_event       *event;
> +} __attribute__((preserve_access_index));
> +#endif // __VMLINUX_H
> --
> 2.39.2
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
  2023-05-05 20:48                               ` Namhyung Kim
@ 2023-05-10 18:56                                 ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-05-10 18:56 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ian Rogers, Jiri Olsa, Linus Torvalds, Andrii Nakryiko, Song Liu,
	Andrii Nakryiko, Ingo Molnar, Thomas Gleixner, Clark Williams,
	Kate Carcia, linux-kernel, linux-perf-users, Adrian Hunter,
	Changbin Du, Hao Luo, James Clark, Kan Liang, Roman Lozko,
	Stephane Eranian, Thomas Richter, Arnaldo Carvalho de Melo, bpf,
	Alexei Starovoitov, Yang Jihong, Mark Rutland, Paul Clarke

Em Fri, May 05, 2023 at 01:48:52PM -0700, Namhyung Kim escreveu:
> On Fri, May 5, 2023 at 1:46 PM Ian Rogers <irogers@google.com> wrote:
> >
> > On Fri, May 5, 2023 at 1:43 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Fri, May 05, 2023 at 10:04:47AM -0700, Ian Rogers wrote:
> > > > On Fri, May 5, 2023 at 9:56 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > >
> > > > > Em Fri, May 05, 2023 at 10:33:15AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > > Em Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa escreveu:
> > > > > > That with the preserve_access_index isn't needed, we need just the
> > > > > > fields that we access in the tools, right?
> > > > >
> > > > > I'm now doing build test this in many distro containers, without the two
> > > > > reverts, i.e. BPF skels continue as opt-out as in my pull request, to
> > > > > test build and also for the functionality tests on the tools using such
> > > > > bpf skels, see below, no touching of vmlinux nor BTF data during the
> > > > > build.
> > > > >
> > > > > - Arnaldo
> > > > >
> > > > > From 882adaee50bc27f85374aeb2fbaa5b76bef60d05 Mon Sep 17 00:00:00 2001
> > > > > From: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > > > Date: Thu, 4 May 2023 19:03:51 -0300
> > > > > Subject: [PATCH 1/1] perf bpf skels: Stop using vmlinux.h generated from BTF,
> > > > >  use subset of used structs + CO-RE
> > > > >
> > > > > Linus reported a build break due to using a vmlinux without a BTF elf
> > > > > section to generate the vmlinux.h header with bpftool for use in the BPF
> > > > > tools in tools/perf/util/bpf_skel/*.bpf.c.
> > > > >
> > > > > Instead add a vmlinux.h file with the structs needed with the fields the
> > > > > tools need, marking the structs with __attribute__((preserve_access_index)),
> > > > > so that libbpf's CO-RE code can fixup the struct field offsets.
> > > > >
> > > > > In some cases the vmlinux.h file that was being generated by bpftool
> > > > > from the kernel BTF information was not needed at all, just including
> > > > > linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI
> > > > > types were not being used.
> > > > >
> > > > > To keep te patch small, include those UAPI headers from the trimmed down
> > > > > vmlinux.h file, that then provides the tools with just the structs and
> > > > > the subset of its fields needed for them.
> > > > >
> > > > > Testing it:
> > > > >
> > > > >   # perf lock contention -b find / > /dev/null
> > >
> > > I tested perf lock con -abv -L rcu_state sleep 1
> > > and needed fix below
> > >
> > > jirka
> >
> > I thought this was fixed by:
> > https://lore.kernel.org/lkml/20230427234833.1576130-1-namhyung@kernel.org/

Those are upstream already:

⬢[acme@toolbox perf-tools]$ git log --oneline torvalds/master | grep -m1 -B1 "perf lock contention: Fix struct rq lock access"
b9f82b5c63bf5390 perf lock contention: Rework offset calculation with BPF CO-RE
e53de7b65a3ca59a perf lock contention: Fix struct rq lock access
⬢[acme@toolbox perf-tools]$

> > but I think that is just in perf-tools-next.
 
> Right, but we might still need the empty rq definition.

Yeah, without the empty struct diff libbpf complains about a mismatch of
just a forward declaration as the type for 'runqueues' on the
lock_contention.bpf.c file while the kernel has a the type as 'struct
rq':

[root@quaco ~]# perf lock con -ab sleep 1
libbpf: extern (var ksym) 'runqueues': incompatible types, expected [95] fwd rq, but kernel has [55509] struct rq
libbpf: failed to load object 'lock_contention_bpf'
libbpf: failed to load BPF skeleton 'lock_contention_bpf': -22
Failed to load lock-contention BPF skeleton
lock contention BPF setup failed
[root@quaco ~]#

Adding:

struct rq {};

libbpf is happy:

[root@quaco ~]# perf lock con -ab sleep 1
 contended   total wait     max wait     avg wait         type   caller

         2     50.64 us     25.38 us     25.32 us     spinlock   tick_do_update_jiffies64+0x25
         1     26.18 us     26.18 us     26.18 us     spinlock   tick_do_update_jiffies64+0x25
[root@quaco ~]#

- Arnaldo

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2023-05-10 18:56 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20230503211801.897735-1-acme@kernel.org>
     [not found] ` <CAHk-=wjY_3cBELRSLMpqCt6Eb71Qei2agfKSNsrr5KcpdEQCaA@mail.gmail.com>
     [not found]   ` <CAHk-=wgci+OTRacQZcvvapRcWkoiTFJ=VTe_JYtabGgZ9refmg@mail.gmail.com>
     [not found]     ` <ZFOSUab5XEJD0kxj@kernel.org>
     [not found]       ` <CAHk-=wgv1sKTdLWPC7XR1Px=pDNrDPDTKdX-T_2AQOwgkpWB2A@mail.gmail.com>
     [not found]         ` <ZFPw0scDq1eIzfHr@kernel.org>
2023-05-04 18:50           ` BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4 Andrii Nakryiko
2023-05-04 19:07             ` Arnaldo Carvalho de Melo
2023-05-04 21:48               ` Arnaldo Carvalho de Melo
2023-05-04 22:01                 ` Arnaldo Carvalho de Melo
2023-05-05 13:18                   ` Arnaldo Carvalho de Melo
2023-05-06  1:13                     ` Yang Jihong
2023-05-05 13:20                   ` Arnaldo Carvalho de Melo
2023-05-04 22:03                 ` Ian Rogers
2023-05-04 23:03                   ` Jiri Olsa
2023-05-04 23:15                     ` Namhyung Kim
2023-05-05  9:36                       ` Jiri Olsa
2023-05-04 23:19                     ` Ian Rogers
2023-05-05  9:39                       ` Jiri Olsa
2023-05-05 11:42                         ` Jiri Olsa
2023-05-05 13:33                     ` Arnaldo Carvalho de Melo
2023-05-05 15:14                       ` Alexei Starovoitov
2023-05-05 16:56                       ` [PATCH RFC/RFT] perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE. was " Arnaldo Carvalho de Melo
2023-05-05 17:04                         ` Ian Rogers
2023-05-05 20:43                           ` Jiri Olsa
2023-05-05 20:46                             ` Ian Rogers
2023-05-05 20:48                               ` Namhyung Kim
2023-05-10 18:56                                 ` Arnaldo Carvalho de Melo
2023-05-05 20:49                               ` Arnaldo Carvalho de Melo
2023-05-05 21:15                               ` Jiri Olsa
2023-05-05 21:21                                 ` Andrii Nakryiko
2023-05-05 21:52                                   ` Arnaldo Carvalho de Melo
2023-05-05 21:55                                     ` Andrii Nakryiko
2023-05-05 21:33                             ` Arnaldo Carvalho de Melo
2023-05-08 21:53                         ` Ian Rogers
2023-05-04 22:46                 ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox