Re: BPF_PROG, BPF_KPROBE, BPF_KSYSCALL and enum conversions

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Jose E. Marchesi" <jose.marchesi@oracle.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf@vger.kernel.org, david.faust@oracle.com,
	cupertino.miranda@oracle.com
Subject: Re: BPF_PROG, BPF_KPROBE, BPF_KSYSCALL and enum conversions
Date: Thu, 25 Apr 2024 20:58:22 +0200	[thread overview]
Message-ID: <87zfth5kxd.fsf@oracle.com> (raw)
In-Reply-To: <CAEf4BzYfkb+ZCT+qjQZ5OA=Wy_q2ojk5RGLqf+otZGKC+c1nvQ@mail.gmail.com> (Andrii Nakryiko's message of "Thu, 25 Apr 2024 11:40:37 -0700")


> On Thu, Apr 25, 2024 at 9:49 AM Jose E. Marchesi
> <jose.marchesi@oracle.com> wrote:
>>
>>
>> The BPF_PROG macro defined in tools/lib/bpf/bpf_tracing.h uses a clever
>> hack in order to provide a convenient way to define entry points for BPF
>> programs, that get their argument as elements in a single "context"
>> array argument.
>>
>> It allows to write something like:
>>
>>   SEC("struct_ops/cwnd_event")
>>   void BPF_PROG(cwnd_event, struct sock *sk, enum tcp_ca_event event)
>>   {
>>         bbr_cwnd_event(sk, event);
>>         dctcp_cwnd_event(sk, event);
>>         cubictcp_cwnd_event(sk, event);
>>   }
>>
>> That expands into a pair of functions:
>>
>>   void ____cwnd_event (unsigned long long *ctx, struct sock *sk, enum tcp_ca_event event)
>>   {
>>         bbr_cwnd_event(sk, event);
>>         dctcp_cwnd_event(sk, event);
>>         cubictcp_cwnd_event(sk, event);
>>   }
>>
>>   void cwnd_event (unsigned long long *ctx)
>>   {
>>         _Pragma("GCC diagnostic push")
>>         _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")
>>         return ____cwnd_event(ctx, (void*)ctx[0], (void*)ctx[1]);
>>         _Pragma("GCC diagnostic pop")
>>   }
>>
>> Note how the 64-bit unsigned integers in the incoming CTX get casted to
>> a void pointer, and then implicitly casted to whatever type of the
>> actual argument in the wrapped function.  In this case:
>>
>>   Arg1: unsigned long long -> void * -> struct sock *
>>   Arg2: unsigned long long -> void * -> enum tcp_ca_event
>>
>> The behavior of GCC and clang when facing such conversions differ:
>>
>>   pointer -> pointer
>>
>>     Allowed by the C standard.
>>     GCC: no warning nor error.
>>     clang: no warning nor error.
>>
>>   pointer -> integer type
>>
>>     [C standard says the result of this conversion is implementation
>>      defined, and it may lead to unaligned pointer etc.]
>>
>>     GCC: error: integer from pointer without a cast [-Wint-conversion]
>>     clang: error: incompatible pointer to integer conversion [-Wint-conversion]
>>
>>   pointer -> enumerated type
>>
>>     GCC: error: incompatible types in assigment (*)
>>     clang: error: incompatible pointer to integer conversion [-Wint-conversion]
>>
>> BPF_PROG works because the pointer to integer conversion leads to the
>> same value in 64-bit mode, much like when casting a pointer to
>> uintptr_t.  It also silences compiler errors by mean of the compiler
>> pragma that installs -Wno-int-conversion temporarily.
>>
>> However, the GCC error marked with (*) above when assigning a pointer to
>> an enumerated value is not associated with the -Wint-conversion warning,
>> and it is not possible to turn it off.
>>
>> This is preventing building the BPF kernel selftests with GCC.
>>
>> The magic in the BPF_PROG macro leads down to these macros:
>>
>>   #define ___bpf_ctx_cast1(x)           ___bpf_ctx_cast0(), (void *)ctx[0]
>>   #define ___bpf_ctx_cast2(x, args...)  ___bpf_ctx_cast1(args), (void *)ctx[1]
>>   #define ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
>>   etc
>>
>> An option would be to change all the usages of BPF_PROG that use
>> enumerated arguments in order to use integers instead.  But this is not
>> very nice for obvious reasons.
>>
>> Another option would be to omit the casts to (void *) from the
>> definitions above.  This would lead to conversions from 'unsigned long
>> long' to typed pointers, integer types and enumerated types.  As far as
>> I can tell this should imply no difference in the generated code in
>> 64-bit mode (is there any particular reason for this cast?).  Since the
>> pointer->enum conversion would not happen, errors in both compilers
>> would be successfully silenced with the -Wno-int-conversion pragma.
>>
>> This option would lead to:
>>
>>   #define ___bpf_ctx_cast1(x)           ___bpf_ctx_cast0(), ctx[0]
>>   #define ___bpf_ctx_cast2(x, args...)  ___bpf_ctx_cast1(args), ctx[1]
>>   #define ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), ctx[2]
>>   #define ___bpf_ctx_cast4(x, args...)  ___bpf_ctx_cast3(args), ctx[3]
>>   etc
>>
>> Then there is BPF_KPROBE, which is very much like BPF_PROG but the
>> context is an array of pointers to ptregs instead of an array of
>> unsigned long longs.
>>
>> The BPF_KPROBE arguments and handled by:
>>
>>   #define ___bpf_kprobe_args0()           ctx
>>   #define ___bpf_kprobe_args1(x)          ___bpf_kprobe_args0(), (void *)PT_REGS_PARM1(ctx)
>>   #define ___bpf_kprobe_args2(x, args...) ___bpf_kprobe_args1(args), (void *)PT_REGS_PARM2(ctx)
>>   #define ___bpf_kprobe_args3(x, args...) ___bpf_kprobe_args2(args), (void *)PT_REGS_PARM3(ctx)
>>   etc
>>
>> There is currently only one BPF_KPROBE usage that uses an enumerated
>> value (handle__kprobe in progs/test_vmlinux.c) but a similar solution to
>> the above could be used, by casting the ptregs pointers to unsigned long
>> long:
>>
>>   #define ___bpf_kprobe_args0()           ctx
>>   #define ___bpf_kprobe_args1(x)          ___bpf_kprobe_args0(),(unsigned long long )PT_REGS_PARM1(ctx)
>>   #define ___bpf_kprobe_args2(x, args...) ___bpf_kprobe_args1(args),(unsigned long long)PT_REGS_PARM2(ctx)
>>   #define ___bpf_kprobe_args3(x, args...) ___bpf_kprobe_args2(args),(unsigned long long)PT_REGS_PARM3(ctx)
>>   etc
>>
>> Similar situation with BPF_KSYSCALL:
>>
>>   #define ___bpf_syswrap_args1(x)          ___bpf_syswrap_args0(), (void *)PT_REGS_PARM1_CORE_SYSCALL(regs)
>>   #define ___bpf_syswrap_args2(x, args...) ___bpf_syswrap_args1(args), (void *)PT_REGS_PARM2_CORE_SYSCALL(regs)
>>   etc
>>
>> There is currently no usage of BPF_KSYSCALL with enumerated types, but
>> the same change would lead to:
>>
>>   #define ___bpf_syswrap_args1(x)          ___bpf_syswrap_args0(),(unsigned long long)PT_REGS_PARM1_CORE_SYSCALL(regs)
>>   #define ___bpf_syswrap_args2(x, args...) ___bpf_syswrap_args1(args),(unsigned long long )PT_REGS_PARM2_CORE_SYSCALL(regs)
>>   etc
>>
>> Opinions?
>>
>
> I don't remember why I did (void *), but I think I was just banging my
> head against the compiler until I made it work, and once it worked, I
> didn't try to improve it further :) If casting to (unsigned long long)
> works just as well as (void *) and helps in GCC case, let's convert.

Ok, I will test a patch on that direction.

> Just please don't miss ___bpf_syscall_args* and ___bpf_kretprobe_args1
> as well.

k

     prev parent reply	other threads:[~2024-04-25 18:58 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-25 16:49 BPF_PROG, BPF_KPROBE, BPF_KSYSCALL and enum conversions Jose E. Marchesi
2024-04-25 18:40 ` Andrii Nakryiko
2024-04-25 18:58   ` Jose E. Marchesi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zfth5kxd.fsf@oracle.com \
    --to=jose.marchesi@oracle.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=cupertino.miranda@oracle.com \
    --cc=david.faust@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.