From: Yonghong Song <yhs@fb.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
Date: Wed, 24 Aug 2022 12:04:55 -0700 [thread overview]
Message-ID: <18c8145c-f6fb-865e-ebd7-2c0c694fdb13@fb.com> (raw)
In-Reply-To: <20220818204428.whsirz2m6prikg7n@MacBook-Pro-3.local>
On 8/18/22 1:44 PM, Alexei Starovoitov wrote:
> On Wed, Aug 17, 2022 at 09:56:23PM -0700, Yonghong Song wrote:
>>
>>
>> On 8/15/22 3:44 PM, Alexei Starovoitov wrote:
>>> On Thu, Aug 11, 2022 at 10:24 PM Yonghong Song <yhs@fb.com> wrote:
>>>>
>>>> In C, struct value can be passed as a function argument.
>>>> For small structs, struct value may be passed in
>>>> one or more registers. For trampoline based bpf programs,
>>>> This would cause complication since one-to-one mapping between
>>>> function argument and arch argument register is not valid
>>>> any more.
>>>>
>>>> To support struct value argument and make bpf programs
>>>> easy to write, the bpf program function parameter is
>>>> changed from struct type to a pointer to struct type.
>>>> The following is a simplified example.
>>>>
>>>> In one of later selftests, we have a bpf_testmod function:
>>>> struct bpf_testmod_struct_arg_2 {
>>>> long a;
>>>> long b;
>>>> };
>>>> noinline int
>>>> bpf_testmod_test_struct_arg_2(int a, struct bpf_testmod_struct_arg_2 b, int c) {
>>>> bpf_testmod_test_struct_arg_result = a + b.a + b.b + c;
>>>> return bpf_testmod_test_struct_arg_result;
>>>> }
>>>>
>>>> When a bpf program is attached to the bpf_testmod function
>>>> bpf_testmod_test_struct_arg_2(), the bpf program may look like
>>>> SEC("fentry/bpf_testmod_test_struct_arg_2")
>>>> int BPF_PROG(test_struct_arg_3, int a, struct bpf_testmod_struct_arg_2 *b, int c)
>>>> {
>>>> t2_a = a;
>>>> t2_b_a = b->a;
>>>> t2_b_b = b->b;
>>>> t2_c = c;
>>>> return 0;
>>>> }
>>>>
>>>> Basically struct value becomes a pointer to the struct.
>>>> The trampoline stack will be increased to store the stack values and
>>>> the pointer to these values will be saved in the stack slot corresponding
>>>> to that argument. For x86_64, the struct size is limited up to 16 bytes
>>>> so the struct can fit in one or two registers. The struct size of more
>>>> than 16 bytes is not supported now as our current use case is
>>>> for sockptr_t in the argument. We could handle such large struct's
>>>> in the future if we have concrete use cases.
>>>>
>>>> The main changes are in save_regs() and restore_regs(). The following
>>>> illustrated the trampoline asm codes for save_regs() and restore_regs().
>>>> save_regs():
>>>> /* first argument */
>>>> mov DWORD PTR [rbp-0x18],edi
>>>> /* second argument: struct, save actual values and put the pointer to the slot */
>>>> lea rax,[rbp-0x40]
>>>> mov QWORD PTR [rbp-0x10],rax
>>>> mov QWORD PTR [rbp-0x40],rsi
>>>> mov QWORD PTR [rbp-0x38],rdx
>>>> /* third argument */
>>>> mov DWORD PTR [rbp-0x8],esi
>>>> restore_regs():
>>>> mov edi,DWORD PTR [rbp-0x18]
>>>> mov rsi,QWORD PTR [rbp-0x40]
>>>> mov rdx,QWORD PTR [rbp-0x38]
>>>> mov esi,DWORD PTR [rbp-0x8]
>>>
>>> Not sure whether it was discussed before, but
>>> why cannot we adjust the bpf side instead?
>>> Technically struct passing between bpf progs was never
>>> officially supported. llvm generates something.
>>> Probably always passes by reference, but we can adjust
>>> that behavior without breaking any programs because
>>> we don't have bpf libraries. Programs are fully contained
>>> in one or few files. libbpf can do static linking, but
>>> without any actual libraries the chance of breaking
>>> backward compat is close to zero.
>>
>> Agree. At this point, we don't need to worry about
>> compatibility between bpf program and bpf program libraries.
>>
>>> Can we teach llvm to pass sizeof(struct) <= 16 in
>>> two bpf registers?
>>
>> Yes, we can. I just hacked llvm and was able to
>> do that.
>>
>>> Then we wouldn't need to have a discrepancy between
>>> kernel function prototype and bpf fentry prog proto.
>>> Both will have struct by value in the same spot.
>>> The trampoline generation will be simpler for x86 and
>>> its runtime faster too.
>>
>> I tested x86 and arm64 both supports two registers
>> for a 16 byte struct.
>>
>>> The other architectures that pass small structs by reference
>>> can do a bit more work in the trampoline: copy up to 16 byte
>>> and bpf prog side will see it as they were passed in 'registers'.
>>> wdyt?
>>
>> I know systemz and Hexagon will pass by reference for any
>> struct size >= 8 bytes. Didn't complete check others.
>>
>> But since x86 and arm64 supports direct value passing
>> with two registers, we should be okay. As you mentioned,
>> we could support systemz/hexagon style of struct passing
>> by copying the values to the stack.
>>
>>
>> But I have a problem how to define a user friendly
>> macro like BPF_PROG for user to use.
>>
>> Let us say, we have a program like below:
>> SEC("fentry/bpf_testmod_test_struct_arg_1")
>> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 *a, int b,
>> int c) {
>> ...
>> }
>>
>> We have BPF_PROG macro definition here:
>>
>> #define BPF_PROG(name, args...) \
>> name(unsigned long long *ctx); \
>> static __always_inline typeof(name(0)) \
>> ____##name(unsigned long long *ctx, ##args); \
>> typeof(name(0)) name(unsigned long long *ctx) \
>> { \
>> _Pragma("GCC diagnostic push") \
>> _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
>> return ____##name(___bpf_ctx_cast(args)); \
>> _Pragma("GCC diagnostic pop") \
>> } \
>> static __always_inline typeof(name(0)) \
>> ____##name(unsigned long long *ctx, ##args)
>>
>> Some we have static function definition
>>
>> int ____test_struct_arg_1(unsigned long long *ctx, struct
>> bpf_testmod_struct_arg_2 *a, int b, int c);
>>
>> But the function call inside the function test_struct_arg_1()
>> is
>> ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2]);
>>
>> We have two problems here:
>> ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
>> does not match the static function declaration.
>> This is not problem if everything is int/ptr type.
>> If one of argument is structure type, we will have
>> type conversion problem. Let us this can be resolved
>> somehow through some hack.
>>
>> More importantly, because some structure may take two
>> registers,
>> ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
>> may not be correct. In my above example, if the
>> structure size is 16 bytes,
>> then the actual call should be
>> ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2], ctx[3])
>> So we need to provide how many extra registers are needed
>> beyond ##args in the macro. I have not tried how to
>> resolve this but this extra information in the macro
>> definite is not user friendly.
>>
>> Not sure what is the best way to handle this issue (##args is not precise
>> and needs addition registers for >8 struct arguments).
>
> The kernel is using this trick to cast 8 byte structs to u64:
> /* cast any integer, pointer, or small struct to u64 */
> #define UINTTYPE(size) \
> __typeof__(__builtin_choose_expr(size == 1, (u8)1, \
> __builtin_choose_expr(size == 2, (u16)2, \
> __builtin_choose_expr(size == 4, (u32)3, \
> __builtin_choose_expr(size == 8, (u64)4, \
> (void)5)))))
> #define __CAST_TO_U64(x) ({ \
> typeof(x) __src = (x); \
> UINTTYPE(sizeof(x)) __dst; \
> memcpy(&__dst, &__src, sizeof(__dst)); \
> (u64)__dst; })
>
> casting 16 byte struct to two u64 can be similar.
> Ideally we would declare bpf prog as:
> SEC("fentry/bpf_testmod_test_struct_arg_1")
> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 a, int b, int c) {
> note there is no '*'. It's struct by value.
> The main challenge is how to do the math in the BPF_PROG macros.
> Currently it's doing:
> #define ___bpf_ctx_cast1(x) ___bpf_ctx_cast0(), (void *)ctx[0]
> #define ___bpf_ctx_cast2(x, args...) ___bpf_ctx_cast1(args), (void *)ctx[1]
> #define ___bpf_ctx_cast3(x, args...) ___bpf_ctx_cast2(args), (void *)ctx[2]
> #define ___bpf_ctx_cast4(x, args...) ___bpf_ctx_cast3(args), (void *)ctx[3]
>
> The ctx[index] is one-to-one with argument position.
> That 'index' needs to be calculated.
> Maybe something like UINTTYPE() that applies to previous arguments?
> #define REG_CNT(arg) \
> __builtin_choose_expr(sizeof(arg) == 1, 1, \
> __builtin_choose_expr(sizeof(arg) == 2, 1, \
> __builtin_choose_expr(sizeof(arg) == 4, 1, \
> __builtin_choose_expr(sizeof(arg) == 8, 1, \
> __builtin_choose_expr(sizeof(arg) == 16, 2, \
> (void)0)))))
>
> #define ___bpf_reg_cnt0() 0
> #define ___bpf_reg_cnt1(x) ___bpf_reg_cnt0() + REG_CNT(x)
> #define ___bpf_reg_cnt2(x, args...) ___bpf_reg_cnt1(args) + REG_CNT(x)
> #define ___bpf_reg_cnt(args...) ___bpf_apply(___bpf_reg_cnt, ___bpf_narg(args))(args)
>
> This way the macro will calculate the index inside ctx[] array.
>
> and then inside ___bpf_ctx_castN macro use ___bpf_reg_cnt.
> Instead of:
> ___bpf_ctx_cast3(x, args...) ___bpf_ctx_cast2(args), (void *)ctx[2]
> it will be
> ___bpf_ctx_cast3(x, args...) ___bpf_ctx_cast2(args), \
> __builtin_choose_expr(sizeof(x) <= 8, (void *)ctx[___bpf_reg_cnt(args)],
> *(typeof(x) *) &ctx[___bpf_reg_cnt(args)])
I tried this approach. The only problem is sizeof(x) <= 8 may also be
a structure. Since essentially we will have a type conversion like
(struct <name))(void *)ctx[...]
and this won't work.
So ideally we want something like
__builtin_choose_expr(is_struct_type(x), *(typeof(x) *)
&ctx[___bpf_reg_cnt(args)]
(void *)ctx[___bpf_reg_cnt(args)])
here is_struct_type(x) tells whether the type is a struct type
or typedef of a struct. Currently we don't have a such a macro/builtin yet.
Note that in order to make sizeof(x) or is_struct_type(x) work, we
need to separate type and argument name like
int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2, a, int,
b, int, c)
Which will make the macro incompatible with existing BPF_PROG macro.
>
> x - is one of the arguments.
> args - all args before 'x'. Doing __bpf_reg_cnt on them should calculate the index.
> *(typeof(x) *)& should type cast to struct of 16 bytes.
>
> Rough idea, of course.
>
> Another alternative is instead of:
> #define BPF_PROG(name, args...)
> name(unsigned long long *ctx);
> do:
> #define BPF_PROG(name, args...)
> struct XX {
> macro inserts all 'args' here separated by ; so it becomes a proper struct
> };
> name(struct XX *ctx);
>
> and then instead of doing ___bpf_ctx_castN for each argument
> do single cast of all of 'u64 ctx[N]' passed from fentry into 'struct XX *'.
> The problem with this approach that small args like char, short, int needs to
> be declared in struct XX with __align__(8).
This should work. But since we will change context type from
"unsigned long long *" to "struct XX *", the code pattern will look like
BPF_PROG2_DECL(test_struct_arg_1);
SEC("fentry/bpf_testmod_test_struct_arg_1")
int BPF_PROG2(test_struct_arg_1, struct bpf_testmod_struct_arg_2, a,
int, b, int, c)
Where BPF_PROG2_DECL will provide a forward declaration like
#define BPF_PROG2_DECL(name) struct _____##name;
and BPF_PROG2 will look like (not handling zero argument yere)
#define BPF_PROG2(name, args...) \
name(struct _____##name *ctx); \
struct _____##name { \
___bpf_ctx_field(args) \
}; \
static __always_inline typeof(name(0)) \
____##name(struct _____##name *ctx, ___bpf_ctx_decl(args)); \
typeof(name(0)) name(struct _____##name *ctx) \
{ \
return ____##name(ctx, ___bpf_ctx_arg(args)); \
} \
static __always_inline typeof(name(0)) \
____##name(struct _____##name *ctx, ___bpf_ctx_decl(args))
where __bpf_ctx_field(args) will generate
struct bpf_testmod_struct_arg_2 a;
int b;
int c;
___bpf_ctx_arg(args) will generate
ctx->a, ctx->b, ctx->c
and ___bpf_ctx_decl(args) will generate proper argument prototypes
the same way as in BPF_PROG macro.
>
> Both approaches may be workable?
next prev parent reply other threads:[~2022-08-24 19:06 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-12 5:24 [PATCH bpf-next v2 0/6] bpf: Support struct argument for trampoline base progs Yonghong Song
2022-08-12 5:24 ` [PATCH bpf-next v2 1/6] bpf: Add struct argument info in btf_func_model Yonghong Song
2022-08-12 5:24 ` [PATCH bpf-next v2 2/6] bpf: x86: Rename stack_size to regs_off in {save,restore}_regs() Yonghong Song
2022-08-12 5:24 ` [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments Yonghong Song
2022-08-14 20:24 ` Jiri Olsa
2022-08-15 5:29 ` Yonghong Song
2022-08-15 7:29 ` Jiri Olsa
2022-08-15 15:25 ` Yonghong Song
2022-08-15 22:44 ` Alexei Starovoitov
2022-08-18 4:56 ` Yonghong Song
2022-08-18 20:44 ` Alexei Starovoitov
2022-08-24 19:04 ` Yonghong Song [this message]
2022-08-24 22:35 ` Alexei Starovoitov
2022-08-25 4:10 ` Yonghong Song
2022-08-24 19:05 ` Andrii Nakryiko
2022-08-25 4:04 ` Yonghong Song
2022-08-12 5:24 ` [PATCH bpf-next v2 4/6] bpf: arm64: No support of struct argument Yonghong Song
2022-08-12 5:24 ` [PATCH bpf-next v2 5/6] bpf: Populate struct argument info in btf_func_model Yonghong Song
2022-08-12 5:24 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add struct argument tests with fentry/fexit programs Yonghong Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=18c8145c-f6fb-865e-ebd7-2c0c694fdb13@fb.com \
--to=yhs@fb.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=kernel-team@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox