BPF List
 help / color / mirror / Atom feed
* [PATCH bpf-next] bpf: improve duplicate source code line detection
@ 2024-02-14 17:41 Andrii Nakryiko
  2024-02-15 21:10 ` patchwork-bot+netdevbpf
  0 siblings, 1 reply; 2+ messages in thread
From: Andrii Nakryiko @ 2024-02-14 17:41 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau; +Cc: andrii, kernel-team

Verifier log avoids printing the same source code line multiple times
when a consecutive block of BPF assembly instructions are covered by the
same original (C) source code line. This greatly improves verifier log
legibility.

Unfortunately, this check is imperfect and in production applications it
quite often happens that verifier log will have multiple duplicated
source lines emitted, for no apparently good reason. E.g., this is
excerpt from a real-world BPF application (with register states omitted
for clarity):

BEFORE
======
; for (int i = 0; i < STROBE_MAX_MAP_ENTRIES; ++i) { @ strobemeta_probe.bpf.c:394
5369: (07) r8 += 2                    ;
5370: (07) r7 += 16                   ;
; for (int i = 0; i < STROBE_MAX_MAP_ENTRIES; ++i) { @ strobemeta_probe.bpf.c:394
5371: (07) r9 += 1                    ;
5372: (79) r4 = *(u64 *)(r10 -32)     ;
; for (int i = 0; i < STROBE_MAX_MAP_ENTRIES; ++i) { @ strobemeta_probe.bpf.c:394
5373: (55) if r9 != 0xf goto pc+2
; if (i >= map->cnt) @ strobemeta_probe.bpf.c:396
5376: (79) r1 = *(u64 *)(r10 -40)     ;
5377: (79) r1 = *(u64 *)(r1 +8)       ;
; if (i >= map->cnt) @ strobemeta_probe.bpf.c:396
5378: (dd) if r1 s<= r9 goto pc-5     ;
; descr->key_lens[i] = 0; @ strobemeta_probe.bpf.c:398
5379: (b4) w1 = 0                     ;
5380: (6b) *(u16 *)(r8 -30) = r1      ;
; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); @ strobemeta_probe.bpf.c:400
5381: (79) r3 = *(u64 *)(r7 -8)       ;
5382: (7b) *(u64 *)(r10 -24) = r6     ;
; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); @ strobemeta_probe.bpf.c:400
5383: (bc) w6 = w6                    ;
; barrier_var(payload_off); @ strobemeta_probe.bpf.c:280
5384: (bf) r2 = r6                    ;
5385: (bf) r1 = r4                    ;

As can be seen, line 394 is emitted thrice, 396 is emitted twice, and
line 400 is duplicated as well. Note that there are no intermingling
other lines of source code in between these duplicates, so the issue is
not compiler reordering assembly instruction such that multiple original
source code lines are in effect.

It becomes more obvious what's going on if we look at *full* original line info
information (using btfdump for this, [0]):

  #2764: line: insn #5363 --> 394:3 @ ./././strobemeta_probe.bpf.c
            for (int i = 0; i < STROBE_MAX_MAP_ENTRIES; ++i) {
  #2765: line: insn #5373 --> 394:21 @ ./././strobemeta_probe.bpf.c
            for (int i = 0; i < STROBE_MAX_MAP_ENTRIES; ++i) {
  #2766: line: insn #5375 --> 394:47 @ ./././strobemeta_probe.bpf.c
            for (int i = 0; i < STROBE_MAX_MAP_ENTRIES; ++i) {
  #2767: line: insn #5377 --> 394:3 @ ./././strobemeta_probe.bpf.c
            for (int i = 0; i < STROBE_MAX_MAP_ENTRIES; ++i) {
  #2768: line: insn #5378 --> 414:10 @ ./././strobemeta_probe.bpf.c
            return off;

We can see that there are four line info records covering
instructions #5363 through #5377 (instruction indices are shifted due to
subprog instruction being appended to main program), all of them are
pointing to the same C source code line #394. But each of them points to
a different part of that line, which is denoted by differing column
numbers (3, 21, 47, 3).

But verifier log doesn't distinguish between parts of the same source code line
and doesn't emit this column number information, so for end user it's just a
repetitive visual noise. So let's improve the detection of repeated source code
line and avoid this.

With the changes in this patch, we get this output for the same piece of BPF
program log:

AFTER
=====
; for (int i = 0; i < STROBE_MAX_MAP_ENTRIES; ++i) { @ strobemeta_probe.bpf.c:394
5369: (07) r8 += 2                    ;
5370: (07) r7 += 16                   ;
5371: (07) r9 += 1                    ;
5372: (79) r4 = *(u64 *)(r10 -32)     ;
5373: (55) if r9 != 0xf goto pc+2
; if (i >= map->cnt) @ strobemeta_probe.bpf.c:396
5376: (79) r1 = *(u64 *)(r10 -40)     ;
5377: (79) r1 = *(u64 *)(r1 +8)       ;
5378: (dd) if r1 s<= r9 goto pc-5     ;
; descr->key_lens[i] = 0; @ strobemeta_probe.bpf.c:398
5379: (b4) w1 = 0                     ;
5380: (6b) *(u16 *)(r8 -30) = r1      ;
; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); @ strobemeta_probe.bpf.c:400
5381: (79) r3 = *(u64 *)(r7 -8)       ;
5382: (7b) *(u64 *)(r10 -24) = r6     ;
5383: (bc) w6 = w6                    ;
; barrier_var(payload_off); @ strobemeta_probe.bpf.c:280
5384: (bf) r2 = r6                    ;
5385: (bf) r1 = r4                    ;

All the duplication is gone and the log is cleaner and less distracting.

  [0] https://github.com/anakryiko/btfdump

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/log.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
index cc789efc7f43..c702c3a42958 100644
--- a/kernel/bpf/log.c
+++ b/kernel/bpf/log.c
@@ -362,15 +362,28 @@ __printf(3, 4) void verbose_linfo(struct bpf_verifier_env *env,
 				  u32 insn_off,
 				  const char *prefix_fmt, ...)
 {
-	const struct bpf_line_info *linfo;
+	const struct bpf_line_info *linfo, *prev_linfo;
 	const struct btf *btf;
 	const char *s, *fname;
 
 	if (!bpf_verifier_log_needed(&env->log))
 		return;
 
+	prev_linfo = env->prev_linfo;
 	linfo = find_linfo(env, insn_off);
-	if (!linfo || linfo == env->prev_linfo)
+	if (!linfo || linfo == prev_linfo)
+		return;
+
+	/* It often happens that two separate linfo records point to the same
+	 * source code line, but have differing column numbers. Given verifier
+	 * log doesn't emit column information, from user perspective we just
+	 * end up emitting the same source code line twice unnecessarily.
+	 * So instead check that previous and current linfo record point to
+	 * the same file (file_name_offs match) and the same line number, and
+	 * avoid emitting duplicated source code line in such case.
+	 */
+	if (prev_linfo && linfo->file_name_off == prev_linfo->file_name_off &&
+	    BPF_LINE_INFO_LINE_NUM(linfo->line_col) == BPF_LINE_INFO_LINE_NUM(prev_linfo->line_col))
 		return;
 
 	if (prefix_fmt) {
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH bpf-next] bpf: improve duplicate source code line detection
  2024-02-14 17:41 [PATCH bpf-next] bpf: improve duplicate source code line detection Andrii Nakryiko
@ 2024-02-15 21:10 ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 2+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-02-15 21:10 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, daniel, martin.lau, kernel-team

Hello:

This patch was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Wed, 14 Feb 2024 09:41:00 -0800 you wrote:
> Verifier log avoids printing the same source code line multiple times
> when a consecutive block of BPF assembly instructions are covered by the
> same original (C) source code line. This greatly improves verifier log
> legibility.
> 
> Unfortunately, this check is imperfect and in production applications it
> quite often happens that verifier log will have multiple duplicated
> source lines emitted, for no apparently good reason. E.g., this is
> excerpt from a real-world BPF application (with register states omitted
> for clarity):
> 
> [...]

Here is the summary with links:
  - [bpf-next] bpf: improve duplicate source code line detection
    https://git.kernel.org/bpf/bpf-next/c/57354f5fdee8

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2024-02-15 21:10 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-14 17:41 [PATCH bpf-next] bpf: improve duplicate source code line detection Andrii Nakryiko
2024-02-15 21:10 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox