public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: Yonghong Song <yhs@fb.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
Date: Wed, 24 Aug 2022 21:10:34 -0700	[thread overview]
Message-ID: <60ac8d9e-a8d5-634a-7f19-75cf393637d2@fb.com> (raw)
In-Reply-To: <CAADnVQLVuMX4je_4qMruPiDvGf+Uzn80Q1iFcmmunzd9hxFL8g@mail.gmail.com>



On 8/24/22 3:35 PM, Alexei Starovoitov wrote:
> On Wed, Aug 24, 2022 at 12:05 PM Yonghong Song <yhs@fb.com> wrote:
>>
>>
>>
>> On 8/18/22 1:44 PM, Alexei Starovoitov wrote:
>>> On Wed, Aug 17, 2022 at 09:56:23PM -0700, Yonghong Song wrote:
>>>>
>>>>
>>>> On 8/15/22 3:44 PM, Alexei Starovoitov wrote:
>>>>> On Thu, Aug 11, 2022 at 10:24 PM Yonghong Song <yhs@fb.com> wrote:
>>>>>>
>>>>>> In C, struct value can be passed as a function argument.
>>>>>> For small structs, struct value may be passed in
>>>>>> one or more registers. For trampoline based bpf programs,
>>>>>> This would cause complication since one-to-one mapping between
>>>>>> function argument and arch argument register is not valid
>>>>>> any more.
>>>>>>
>>>>>> To support struct value argument and make bpf programs
>>>>>> easy to write, the bpf program function parameter is
>>>>>> changed from struct type to a pointer to struct type.
>>>>>> The following is a simplified example.
>>>>>>
>>>>>> In one of later selftests, we have a bpf_testmod function:
>>>>>>        struct bpf_testmod_struct_arg_2 {
>>>>>>            long a;
>>>>>>            long b;
>>>>>>        };
>>>>>>        noinline int
>>>>>>        bpf_testmod_test_struct_arg_2(int a, struct bpf_testmod_struct_arg_2 b, int c) {
>>>>>>            bpf_testmod_test_struct_arg_result = a + b.a + b.b + c;
>>>>>>            return bpf_testmod_test_struct_arg_result;
>>>>>>        }
>>>>>>
>>>>>> When a bpf program is attached to the bpf_testmod function
>>>>>> bpf_testmod_test_struct_arg_2(), the bpf program may look like
>>>>>>        SEC("fentry/bpf_testmod_test_struct_arg_2")
>>>>>>        int BPF_PROG(test_struct_arg_3, int a, struct bpf_testmod_struct_arg_2 *b, int c)
>>>>>>        {
>>>>>>            t2_a = a;
>>>>>>            t2_b_a = b->a;
>>>>>>            t2_b_b = b->b;
>>>>>>            t2_c = c;
>>>>>>            return 0;
>>>>>>        }
>>>>>>
>>>>>> Basically struct value becomes a pointer to the struct.
>>>>>> The trampoline stack will be increased to store the stack values and
>>>>>> the pointer to these values will be saved in the stack slot corresponding
>>>>>> to that argument. For x86_64, the struct size is limited up to 16 bytes
>>>>>> so the struct can fit in one or two registers. The struct size of more
>>>>>> than 16 bytes is not supported now as our current use case is
>>>>>> for sockptr_t in the argument. We could handle such large struct's
>>>>>> in the future if we have concrete use cases.
>>>>>>
>>>>>> The main changes are in save_regs() and restore_regs(). The following
>>>>>> illustrated the trampoline asm codes for save_regs() and restore_regs().
>>>>>> save_regs():
>>>>>>        /* first argument */
>>>>>>        mov    DWORD PTR [rbp-0x18],edi
>>>>>>        /* second argument: struct, save actual values and put the pointer to the slot */
>>>>>>        lea    rax,[rbp-0x40]
>>>>>>        mov    QWORD PTR [rbp-0x10],rax
>>>>>>        mov    QWORD PTR [rbp-0x40],rsi
>>>>>>        mov    QWORD PTR [rbp-0x38],rdx
>>>>>>        /* third argument */
>>>>>>        mov    DWORD PTR [rbp-0x8],esi
>>>>>> restore_regs():
>>>>>>        mov    edi,DWORD PTR [rbp-0x18]
>>>>>>        mov    rsi,QWORD PTR [rbp-0x40]
>>>>>>        mov    rdx,QWORD PTR [rbp-0x38]
>>>>>>        mov    esi,DWORD PTR [rbp-0x8]
>>>>>
>>>>> Not sure whether it was discussed before, but
>>>>> why cannot we adjust the bpf side instead?
>>>>> Technically struct passing between bpf progs was never
>>>>> officially supported. llvm generates something.
>>>>> Probably always passes by reference, but we can adjust
>>>>> that behavior without breaking any programs because
>>>>> we don't have bpf libraries. Programs are fully contained
>>>>> in one or few files. libbpf can do static linking, but
>>>>> without any actual libraries the chance of breaking
>>>>> backward compat is close to zero.
>>>>
>>>> Agree. At this point, we don't need to worry about
>>>> compatibility between bpf program and bpf program libraries.
>>>>
>>>>> Can we teach llvm to pass sizeof(struct) <= 16 in
>>>>> two bpf registers?
>>>>
>>>> Yes, we can. I just hacked llvm and was able to
>>>> do that.
>>>>
>>>>> Then we wouldn't need to have a discrepancy between
>>>>> kernel function prototype and bpf fentry prog proto.
>>>>> Both will have struct by value in the same spot.
>>>>> The trampoline generation will be simpler for x86 and
>>>>> its runtime faster too.
>>>>
>>>> I tested x86 and arm64 both supports two registers
>>>> for a 16 byte struct.
>>>>
>>>>> The other architectures that pass small structs by reference
>>>>> can do a bit more work in the trampoline: copy up to 16 byte
>>>>> and bpf prog side will see it as they were passed in 'registers'.
>>>>> wdyt?
>>>>
>>>> I know systemz and Hexagon will pass by reference for any
>>>> struct size >= 8 bytes. Didn't complete check others.
>>>>
>>>> But since x86 and arm64 supports direct value passing
>>>> with two registers, we should be okay. As you mentioned,
>>>> we could support systemz/hexagon style of struct passing
>>>> by copying the values to the stack.
>>>>
>>>>
>>>> But I have a problem how to define a user friendly
>>>> macro like BPF_PROG for user to use.
>>>>
>>>> Let us say, we have a program like below:
>>>> SEC("fentry/bpf_testmod_test_struct_arg_1")
>>>> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 *a, int b,
>>>> int c) {
>>>> ...
>>>> }
>>>>
>>>> We have BPF_PROG macro definition here:
>>>>
>>>> #define BPF_PROG(name, args...)     \
>>>> name(unsigned long long *ctx);     \
>>>> static __always_inline typeof(name(0))     \
>>>> ____##name(unsigned long long *ctx, ##args);     \
>>>> typeof(name(0)) name(unsigned long long *ctx)     \
>>>> {     \
>>>>           _Pragma("GCC diagnostic push")      \
>>>>           _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")      \
>>>>           return ____##name(___bpf_ctx_cast(args));      \
>>>>           _Pragma("GCC diagnostic pop")      \
>>>> }     \
>>>> static __always_inline typeof(name(0))     \
>>>> ____##name(unsigned long long *ctx, ##args)
>>>>
>>>> Some we have static function definition
>>>>
>>>> int ____test_struct_arg_1(unsigned long long *ctx, struct
>>>> bpf_testmod_struct_arg_2 *a, int b, int c);
>>>>
>>>> But the function call inside the function test_struct_arg_1()
>>>> is
>>>>     ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2]);
>>>>
>>>> We have two problems here:
>>>>     ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
>>>> does not match the static function declaration.
>>>> This is not problem if everything is int/ptr type.
>>>> If one of argument is structure type, we will have
>>>> type conversion problem. Let us this can be resolved
>>>> somehow through some hack.
>>>>
>>>> More importantly, because some structure may take two
>>>> registers,
>>>>      ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
>>>> may not be correct. In my above example, if the
>>>> structure size is 16 bytes,
>>>> then the actual call should be
>>>>      ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2], ctx[3])
>>>> So we need to provide how many extra registers are needed
>>>> beyond ##args in the macro. I have not tried how to
>>>> resolve this but this extra information in the macro
>>>> definite is not user friendly.
>>>>
>>>> Not sure what is the best way to handle this issue (##args is not precise
>>>> and needs addition registers for >8 struct arguments).
>>>
>>> The kernel is using this trick to cast 8 byte structs to u64:
>>> /* cast any integer, pointer, or small struct to u64 */
>>> #define UINTTYPE(size) \
>>>           __typeof__(__builtin_choose_expr(size == 1,  (u8)1, \
>>>                      __builtin_choose_expr(size == 2, (u16)2, \
>>>                      __builtin_choose_expr(size == 4, (u32)3, \
>>>                      __builtin_choose_expr(size == 8, (u64)4, \
>>>                                            (void)5)))))
>>> #define __CAST_TO_U64(x) ({ \
>>>           typeof(x) __src = (x); \
>>>           UINTTYPE(sizeof(x)) __dst; \
>>>           memcpy(&__dst, &__src, sizeof(__dst)); \
>>>           (u64)__dst; })
>>>
>>> casting 16 byte struct to two u64 can be similar.
>>> Ideally we would declare bpf prog as:
>>> SEC("fentry/bpf_testmod_test_struct_arg_1")
>>> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 a, int b, int c) {
>>> note there is no '*'. It's struct by value.
>>> The main challenge is how to do the math in the BPF_PROG macros.
>>> Currently it's doing:
>>> #define ___bpf_ctx_cast1(x)           ___bpf_ctx_cast0(), (void *)ctx[0]
>>> #define ___bpf_ctx_cast2(x, args...)  ___bpf_ctx_cast1(args), (void *)ctx[1]
>>> #define ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
>>> #define ___bpf_ctx_cast4(x, args...)  ___bpf_ctx_cast3(args), (void *)ctx[3]
>>>
>>> The ctx[index] is one-to-one with argument position.
>>> That 'index' needs to be calculated.
>>> Maybe something like UINTTYPE() that applies to previous arguments?
>>> #define REG_CNT(arg) \
>>>           __builtin_choose_expr(sizeof(arg) == 1,  1, \
>>>           __builtin_choose_expr(sizeof(arg) == 2,  1, \
>>>           __builtin_choose_expr(sizeof(arg) == 4,  1, \
>>>           __builtin_choose_expr(sizeof(arg) == 8,  1, \
>>>           __builtin_choose_expr(sizeof(arg) == 16,  2, \
>>>                                            (void)0)))))
>>>
>>> #define ___bpf_reg_cnt0()            0
>>> #define ___bpf_reg_cnt1(x)          ___bpf_reg_cnt0() + REG_CNT(x)
>>> #define ___bpf_reg_cnt2(x, args...) ___bpf_reg_cnt1(args) + REG_CNT(x)
>>> #define ___bpf_reg_cnt(args...)    ___bpf_apply(___bpf_reg_cnt, ___bpf_narg(args))(args)
>>>
>>> This way the macro will calculate the index inside ctx[] array.
>>>
>>> and then inside ___bpf_ctx_castN macro use ___bpf_reg_cnt.
>>> Instead of:
>>> ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
>>> it will be
>>> ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), \
>>>     __builtin_choose_expr(sizeof(x) <= 8, (void *)ctx[___bpf_reg_cnt(args)],
>>>                           *(typeof(x) *) &ctx[___bpf_reg_cnt(args)])
>>
>> I tried this approach. The only problem is sizeof(x) <= 8 may also be
>> a structure. Since essentially we will have a type conversion like
>>      (struct <name))(void *)ctx[...]
>> and this won't work.
> 
> Right. Just sizeof(x) <= 8 won't work.
> 
>> So ideally we want something like
>> __builtin_choose_expr(is_struct_type(x), *(typeof(x) *)
>> &ctx[___bpf_reg_cnt(args)]
>>       (void *)ctx[___bpf_reg_cnt(args)])
>> here is_struct_type(x) tells whether the type is a struct type
>> or typedef of a struct. Currently we don't have a such a macro/builtin yet.
> 
> Got it.
> Maybe we can do *(typeof(x) *) &ctx[___bpf_reg_cnt(args)]
> unconditionally for all args?
> Only endianness will be an issue.

Yes, endianness will be an issue.

> 
>> Note that in order to make sizeof(x) or is_struct_type(x) work, we
>> need to separate type and argument name like
>>
>> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2, a, int,
>> b, int, c)
> 
> agree.
> 
>> Which will make the macro incompatible with existing BPF_PROG macro.
> 
> right. we need a new macro regardless.
> 
>>>
>>> x - is one of the arguments.
>>> args - all args before 'x'. Doing __bpf_reg_cnt on them should calculate the index.
>>> *(typeof(x) *)& should type cast to struct of 16 bytes.
>>>
>>> Rough idea, of course.
>>>
>>> Another alternative is instead of:
>>> #define BPF_PROG(name, args...)
>>> name(unsigned long long *ctx);
>>> do:
>>> #define BPF_PROG(name, args...)
>>> struct XX {
>>>     macro inserts all 'args' here separated by ; so it becomes a proper struct
>>> };
>>> name(struct XX *ctx);
>>>
>>> and then instead of doing ___bpf_ctx_castN for each argument
>>> do single cast of all of 'u64 ctx[N]' passed from fentry into 'struct XX *'.
>>> The problem with this approach that small args like char, short, int needs to
>>> be declared in struct XX with __align__(8).
>>
>> This should work. But since we will change context type from
>> "unsigned long long *" to "struct XX *", the code pattern will look like
>>
>> BPF_PROG2_DECL(test_struct_arg_1);
>> SEC("fentry/bpf_testmod_test_struct_arg_1")
>> int BPF_PROG2(test_struct_arg_1, struct bpf_testmod_struct_arg_2, a,
>> int, b, int, c)
>>
>> Where BPF_PROG2_DECL will provide a forward declaration like
>> #define BPF_PROG2_DECL(name) struct _____##name;
>>
>> and BPF_PROG2 will look like (not handling zero argument yere)
>>
>> #define BPF_PROG2(name, args...)                                      \
>> name(struct _____##name *ctx);                                        \
>> struct _____##name {                                                  \
>>          ___bpf_ctx_field(args)                                         \
>> };                                                                    \
>> static __always_inline typeof(name(0))                                \
>> ____##name(struct _____##name *ctx, ___bpf_ctx_decl(args));           \
>> typeof(name(0)) name(struct _____##name *ctx)                         \
>> {                                                                     \
>>          return ____##name(ctx, ___bpf_ctx_arg(args));                  \
>> }                                                                     \
>> static __always_inline typeof(name(0))                                \
>> ____##name(struct _____##name *ctx, ___bpf_ctx_decl(args))
>>
>> where __bpf_ctx_field(args) will generate
>>      struct bpf_testmod_struct_arg_2 a;
>>      int b;
>>      int c;
>>
>> ___bpf_ctx_arg(args) will generate
>>      ctx->a, ctx->b, ctx->c
>>
>> and ___bpf_ctx_decl(args) will generate proper argument prototypes
>> the same way as in BPF_PROG macro.
> 
> Great that 2nd approach works :)
> If 1st approach can be made to work we won't need
> additional line BPF_PROG2_DECL(test_struct_arg_1);
> right?

Right. I don't like addition macro like BPF_PROG2_DECL as well.
I just checked Andrii's approach. It may work and we may not need
the addition macro any more.

> 
> Either way we can start with 2nd approach and improve on it later.

  reply	other threads:[~2022-08-25  4:10 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-12  5:24 [PATCH bpf-next v2 0/6] bpf: Support struct argument for trampoline base progs Yonghong Song
2022-08-12  5:24 ` [PATCH bpf-next v2 1/6] bpf: Add struct argument info in btf_func_model Yonghong Song
2022-08-12  5:24 ` [PATCH bpf-next v2 2/6] bpf: x86: Rename stack_size to regs_off in {save,restore}_regs() Yonghong Song
2022-08-12  5:24 ` [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments Yonghong Song
2022-08-14 20:24   ` Jiri Olsa
2022-08-15  5:29     ` Yonghong Song
2022-08-15  7:29       ` Jiri Olsa
2022-08-15 15:25         ` Yonghong Song
2022-08-15 22:44   ` Alexei Starovoitov
2022-08-18  4:56     ` Yonghong Song
2022-08-18 20:44       ` Alexei Starovoitov
2022-08-24 19:04         ` Yonghong Song
2022-08-24 22:35           ` Alexei Starovoitov
2022-08-25  4:10             ` Yonghong Song [this message]
2022-08-24 19:05         ` Andrii Nakryiko
2022-08-25  4:04           ` Yonghong Song
2022-08-12  5:24 ` [PATCH bpf-next v2 4/6] bpf: arm64: No support of struct argument Yonghong Song
2022-08-12  5:24 ` [PATCH bpf-next v2 5/6] bpf: Populate struct argument info in btf_func_model Yonghong Song
2022-08-12  5:24 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add struct argument tests with fentry/fexit programs Yonghong Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=60ac8d9e-a8d5-634a-7f19-75cf393637d2@fb.com \
    --to=yhs@fb.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox