BPF List
 help / color / mirror / Atom feed
From: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
	daniel@iogearbox.net, kafai@meta.com, kernel-team@meta.com,
	Mykyta Yatsenko <yatsenko@meta.com>
Subject: Re: [PATCH bpf-next] selftests/bpf: emit top frequent C code lines in veristat
Date: Fri, 30 Aug 2024 21:31:02 +0100	[thread overview]
Message-ID: <86ff6bd1-6b26-47c0-9f5b-294a8609c0d4@gmail.com> (raw)
In-Reply-To: <CAEf4BzYjzfK9ppcrLg6As_8egG4f49HjeG-UGzNgBg9qtao0Uw@mail.gmail.com>

On 22/08/2024 23:29, Andrii Nakryiko wrote:
> On Tue, Aug 20, 2024 at 8:24 AM Mykyta Yatsenko
> <mykyta.yatsenko5@gmail.com> wrote:
>> From: Mykyta Yatsenko <yatsenko@meta.com>
>>
>> Production BPF programs are increasing in number of instructions and states
>> to the point, where optimising verification process for them is necessary
>> to avoid running into instruction limit. Authors of those BPF programs
>> need to analyze verifier output, for example, collecting the most
>> frequent C source code lines to understand which part of the program has
>> the biggest verification cost.
>>
>> This patch introduces `--top-lines` and `--include-instructions` flags in
>> veristat.
>> `--top-lines=N` makes veristat output N the most popular C sorce code
>> lines, parsed from verification log. `--include-instructions` enables
>> printing BPF instructions along with C source code.
> Hm... I think --include-instructions needs a bit more thought to be
> useful. Just one assembly instruction isn't all that useful, we should
> be thinking in terms of blocks of assembly instruction, probably...
> But then not sure how to take that into account when calculating top N
> frequencies...
>
> Not sure about all that. For v2, let's drop the assembly instructions
> parts and try to get --top-lines logic right. We can then see how it
> works in practice and adjust and extend as necessary.
--include-instructions outputs a block of the instructions that go after
the source code line in the verifier buffer, an example output:
```
processed 5549 insns (limit 1000000) max_states_per_insn 8 total_states 
226 peak_states 198 mark_read 29

Top C source code lines:
; [Count: 114]  ctx->off = off + *opsize; @ xdp_synproxy_kern.c:260
1033: (71) r5 = *(u8 *)(r5 +0)        ; frame1: 
R5_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 
0xff)) cb
1034: (0c) w4 += w5                   ; frame1: 
R4_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=0x100fd,var_off=(0x0; 
0x1ffff)) 
R5_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 
0xff)) cb
1035: (63) *(u32 *)(r2 +28) = r4      ; frame1: R2=fp[0]-64 
R4_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=0x100fd,var_off=(0x0; 
0x1ffff)) cb
1036: (bc) w7 = w4                    ; frame1: 
R4=scalar(id=75,smin=smin32=0,smax=umax=smax32=umax32=0x100fd,var_off=(0x0; 
0x1ffff)) 
R7=scalar(id=75,smin=smin32=0,smax=umax=smax32=umax32=0x100fd,var_off=(0x0; 
0x1ffff)) cb
1037: (b4) w0 = 0                     ; frame1: R0_w=0 cb
; [Count: 108]  for (i = 0; i < 7; i++) @ xdp_synproxy_kern.c:269
1038: (04) w1 += -1                   ; frame1: R1_w=6 cb
1039: (bc) w4 = w7                    ; frame1: 
R4_w=scalar(id=75,smin=smin32=0,smax=umax=smax32=umax32=0x100fd,var_off=(0x0; 
0x1ffff)) 
R7=scalar(id=75,smin=smin32=0,smax=umax=smax32=umax32=0x100fd,var_off=(0x0; 
0x1ffff)) cb
1040: (56) if w1 != 0x0 goto pc+1
; [Count: 106]  if (data + sz >= ctx->data_end) @ xdp_synproxy_kern.c:211
1046: (bf) r5 = r6                    ; frame1: 
R5_w=pkt(id=72,r=0,smin=smin32=0,smax=umax=smax32=umax32=0xfffe,var_off=(0x0; 
0xffff)) 
R6_w=pkt(id=72,r=0,smin=smin32=0,smax=umax=smax32=umax32=0xfffe,var_off=(0x0; 
0xffff)) cb
1047: (07) r5 += 1                    ; frame1: 
R5_w=pkt(id=72,off=1,r=0,smin=smin32=0,smax=umax=smax32=umax32=0xfffe,var_off=(0x0; 
0xffff)) cb
1048: (79) r7 = *(u64 *)(r2 +8)       ; frame1: R2=fp[0]-64 
R7_w=pkt_end() cb
1049: (3d) if r5 >= r7 goto pc-9      ; frame1: 
R5_w=pkt(id=72,off=1,r=2,smin=smin32=0,smax=umax=smax32=umax32=0xfffe,var_off=(0x0; 
0xffff)) R7_w=pkt_end() cb

File                     Program Verdict  Duration (us)  Insns  States  
Peak states
-----------------------  ------------- -------  -------------  -----  
------  -----------
xdp_synproxy_kern.bpf.o  syncookie_tc success          37106   5549     
226          198
xdp_synproxy_kern.bpf.o  syncookie_xdp success          40206   5826     
228          201
-----------------------  ------------- -------  -------------  -----  
------  -----------
Done. Processed 1 files, 0 programs. Skipped 2 files, 0 programs.
```

Thanks for all the findings, I'll apply them for v2.

>> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
>> ---
>>   tools/testing/selftests/bpf/veristat.c | 160 +++++++++++++++++++++++++
>>   1 file changed, 160 insertions(+)
>>
> Ok, #1 problem is that --top-lines is useless without -vl2, so we
> should either check that this is specified. Or maybe better force
> verbose log internally without actually emitting it, probably it's
> better.
>
>> diff --git a/tools/testing/selftests/bpf/veristat.c b/tools/testing/selftests/bpf/veristat.c
>> index 1ec5c4c47235..977ab54cba83 100644
>> --- a/tools/testing/selftests/bpf/veristat.c
>> +++ b/tools/testing/selftests/bpf/veristat.c
>> @@ -16,10 +16,12 @@
>>   #include <sys/stat.h>
>>   #include <bpf/libbpf.h>
>>   #include <bpf/btf.h>
>> +#include <bpf/hashmap.h>
> well, great for veristat in kernel repo, but this is libbpf-internal
> thing and I'd like to avoid relying on it in veristat to make Github
> sync simple.
>
>>   #include <libelf.h>
>>   #include <gelf.h>
>>   #include <float.h>
>>   #include <math.h>
>> +#include <linux/err.h>
>>
>>   #ifndef ARRAY_SIZE
>>   #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
>> @@ -179,8 +181,16 @@ static struct env {
>>          int files_skipped;
>>          int progs_processed;
>>          int progs_skipped;
>> +       int top_lines;
>> +       bool include_insn;
>>   } env;
>>
>> +struct line_cnt {
>> +       long cnt;
>> +       char *line;
>> +       char *insn;
>> +};
>> +
>>   static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args)
>>   {
>>          if (!env.verbose)
>> @@ -206,6 +216,8 @@ const char argp_program_doc[] =
>>   enum {
>>          OPT_LOG_FIXED = 1000,
>>          OPT_LOG_SIZE = 1001,
>> +       OPT_TOP_LINES = 1002,
>> +       OPT_INCLUDE_INSN = 1003,
>>   };
>>
>>   static const struct argp_option opts[] = {
>> @@ -228,6 +240,9 @@ static const struct argp_option opts[] = {
>>            "Force frequent BPF verifier state checkpointing (set BPF_F_TEST_STATE_FREQ program flag)" },
>>          { "test-reg-invariants", 'r', NULL, 0,
>>            "Force BPF verifier failure on register invariant violation (BPF_F_TEST_REG_INVARIANTS program flag)" },
>> +       { "top-lines", OPT_TOP_LINES, "N", 0, "Emit N the most frequent C source code lines." },
> "Emit N most frequent source code lines."
>
> Doesn't have to be C, in general.
>
> maybe let's call it "--top-src-lines" to be a bit more specific?
>
>> +       { "include-instructions", OPT_INCLUDE_INSN, NULL, OPTION_HIDDEN,
>> +         "If emitting the most frequent C source code lines, include their BPF instructions." },
>>          {},
>>   };
>>
>> @@ -337,6 +352,17 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state)
>>                          return -ENOMEM;
>>                  env.filename_cnt++;
>>                  break;
>> +       case OPT_TOP_LINES:
>> +               errno = 0;
>> +               env.top_lines = strtol(arg, NULL, 10);
>> +               if (errno) {
>> +                       fprintf(stderr, "invalid top lines N specifier: %s\n", arg);
>> +                       argp_usage(state);
>> +               }
>> +               break;
>> +       case OPT_INCLUDE_INSN:
>> +               env.include_insn = true;
>> +               break;
>>          default:
>>                  return ARGP_ERR_UNKNOWN;
>>          }
>> @@ -817,6 +843,24 @@ static void free_verif_stats(struct verif_stats *stats, size_t stat_cnt)
>>          free(stats);
>>   }
>>
>> +static int line_cnt_cmp(const void *a, const void *b)
>> +{
>> +       const struct line_cnt **a_cnt = (const struct line_cnt **)a;
>> +       const struct line_cnt **b_cnt = (const struct line_cnt **)b;
>> +
>> +       return (*b_cnt)->cnt - (*a_cnt)->cnt;
>> +}
>> +
>> +static size_t str_hash_fn(long key, void *ctx)
>> +{
>> +       return str_hash((void *)key);
>> +}
>> +
>> +static bool str_equal_fn(long a, long b, void *ctx)
>> +{
>> +       return strcmp((void *)a, (void *)b) == 0;
>> +}
>> +
>>   static char verif_log_buf[64 * 1024];
>>
>>   #define MAX_PARSED_LOG_LINES 100
>> @@ -854,6 +898,120 @@ static int parse_verif_log(char * const buf, size_t buf_sz, struct verif_stats *
>>          return 0;
>>   }
>>
>> +static char *parse_instructions(char *buf, char *buf_end)
>> +{
>> +       char *start = buf;
>> +
>> +       while (buf && buf < buf_end && *buf && *buf != ';') {
>> +               char *num_end = NULL;
>> +
>> +               strtol(buf, &num_end, 10);
>> +               if (!num_end || *num_end != ':')
>> +                       break;
>> +
>> +               buf = strchr(num_end, '\n');
>> +       }
>> +
>> +       return start == buf ? NULL : strndup(start, buf - start);
>> +}
>> +
>> +static int print_top_lines(char * const buf, size_t buf_sz)
>> +{
>> +       struct hashmap *lines_map;
>> +       struct line_cnt **lines_cnt = NULL;
>> +       struct hashmap_entry *entry;
>> +       char *buf_end;
>> +       char *line;
>> +       int err = 0;
>> +       int unique_lines;
>> +       int bkt;
>> +       int i;
>> +
>> +       if (!buf || !buf_sz)
>> +               return -EINVAL;
>> +
> let's make sure we don't call print_top_lines with not buffer instead ?
>
>> +       buf_end = buf + buf_sz - 1;
>> +       *buf_end = '\0';
> and buffer should be valid, so no need to zero-terminate it (verifier
> guarantees that)
>
>> +       lines_map = hashmap__new(str_hash_fn, str_equal_fn, NULL);
>> +       if (IS_ERR(lines_map))
>> +               return PTR_ERR(lines_map);
>> +
>> +       for (char *line_start = buf; line_start < buf_end;) {
>> +               char *line_end = strchr(line_start, '\n');
> any reason we can't use strtok_r() for this?
>
>> +
>> +               if (!line_end)
>> +                       line_end = buf_end;
>> +
>> +               if (*line_start == ';') {
> let's check that it starts with "; " with strncmp() and skip space as well?
>
>> +                       struct line_cnt *line_cnt;
>> +
>> +                       line_start++; /* skip semicolon */
>> +                       *line_end = '\0';
>> +                       if (hashmap__find(lines_map, line_start, &line_cnt)) {
>> +                               line_cnt->cnt++;
> so as I mentioned, I'd like to avoid the use of libbpf's hashmap. How
> about we just add each string's offset within the buffer into a
> (rather long sometimes) array of u32s. Then implement custom
> comparator that would compare actual strings within log buffer by its
> offset. Sort such indices this way, and then (reusing this comparator)
> implement "unique" operation just like std::unique. Then we'll only
> need to re-sort indices (but now taking their total counts), and emit
> first/last N items.
>
> Basically, keep it cheap by using offsets, but otherwise rely on NlogN
> sorting to avoid hashmaps.
>
>> +                       } else {
>> +                               char *insn = NULL;
>> +
>> +                               line_cnt = malloc(sizeof(struct line_cnt));
>> +                               if (!line_cnt) {
>> +                                       *line_end = '\n';
>> +                                       goto cleanup;
>> +                               }
>> +                               line = strdup(line_start);
>> +                               if (!line) {
>> +                                       *line_end = '\n';
>> +                                       free(line_cnt);
>> +                                       goto cleanup;
>> +                               }
>> +                               if (env.include_insn)
>> +                                       insn = parse_instructions(line_end + 1, buf_end);
>> +                               line_cnt->insn = insn;
>> +                               line_cnt->line = line;
>> +                               line_cnt->cnt = 1;
>> +                               err = hashmap__add(lines_map, line, line_cnt);
>> +                       }
>> +                       *line_end = '\n';
>> +
>> +                       if (err)
>> +                               goto cleanup;
>> +               }
>> +               line_start = line_end + 1;
>> +       }
>> +       unique_lines = hashmap__size(lines_map);
>> +       if (!unique_lines)
>> +               goto cleanup;
>> +
>> +       lines_cnt = calloc(unique_lines, sizeof(struct line_cnt *));
>> +       if (!lines_cnt)
>> +               goto cleanup;
>> +
>> +       i = 0;
>> +       hashmap__for_each_entry(lines_map, entry, bkt)
>> +               lines_cnt[i++] = (struct line_cnt *)entry->value;
>> +
>> +       qsort(lines_cnt, unique_lines, sizeof(struct line_cnt *), line_cnt_cmp);
>> +
>> +       printf("Top C source code lines:\n");
> nit: there is no need to say "C source code", it's just "source code"
>
>> +       for (i = 0; i  < min(unique_lines, env.top_lines); i++) {
>> +               printf("; [Count: %ld] %s\n", lines_cnt[i]->cnt, lines_cnt[i]->line);
> [Count: %ld] prefix is super verbose. Let's just emit a number of
> occurrences without any extra "Count" text.
>
> BTW, newer verifiers emit file location information now which looks
> like " @ test_vmlinux.c:82", it would be nice if we could detect that,
> parse it out separately (at the very last moment, during output) and
> reformat everything to something like:
>
> 123: (test_vmlinux.c:82) <the rest of source code line>
>
> Make sure to use something like %5d for frequency, so at least that
> part is nicely aligned
>
>> +               if (env.include_insn)
>> +                       printf("%s\n", lines_cnt[i]->insn);
>> +       }
>> +       printf("\n");
>> +
>> +cleanup:
>> +       hashmap__for_each_entry(lines_map, entry, bkt) {
>> +               struct line_cnt *line_cnt = (struct line_cnt *)entry->value;
>> +
>> +               free(line_cnt->insn);
>> +               free(line_cnt->line);
>> +               free(line_cnt);
>> +       }
> we really shouldn't be allocating so much for strings, we have one
> huge string and we should be dealing with indices into the buffer
>
> pw-bot: cr
>
>> +       hashmap__free(lines_map);
>> +       free(lines_cnt);
>> +       return err;
>> +}
>> +
>>   static int guess_prog_type_by_ctx_name(const char *ctx_name,
>>                                         enum bpf_prog_type *prog_type,
>>                                         enum bpf_attach_type *attach_type)
>> @@ -1048,6 +1206,8 @@ static int process_prog(const char *filename, struct bpf_object *obj, struct bpf
>>                         filename, prog_name, stats->stats[DURATION],
>>                         err ? "failure" : "success", buf);
>>          }
>> +       if (env.top_lines)
>> +               print_top_lines(buf, buf_sz);
>>
>>          if (verif_log_buf != buf)
>>                  free(buf);
>> --
>> 2.46.0
>>


      reply	other threads:[~2024-08-30 20:31 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-20 15:24 [PATCH bpf-next] selftests/bpf: emit top frequent C code lines in veristat Mykyta
2024-08-22 22:29 ` Andrii Nakryiko
2024-08-30 20:31   ` Mykyta Yatsenko [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86ff6bd1-6b26-47c0-9f5b-294a8609c0d4@gmail.com \
    --to=mykyta.yatsenko5@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kafai@meta.com \
    --cc=kernel-team@meta.com \
    --cc=yatsenko@meta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox