From: "Björn Töpel" <bjorn@kernel.org>
To: Pu Lehui <pulehui@huawei.com>, Pu Lehui <pulehui@huaweicloud.com>,
bpf@vger.kernel.org, linux-riscv@lists.infradead.org,
netdev@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>, Palmer Dabbelt <palmer@dabbelt.com>,
Conor Dooley <conor@kernel.org>,
Luke Nelson <luke.r.nels@gmail.com>
Subject: Re: [PATCH bpf-next 4/4] riscv, bpf: Mixing bpf2bpf and tailcalls
Date: Tue, 30 Jan 2024 17:03:49 +0100
Message-ID: <87le86q04a.fsf@all.your.base.are.belong.to.us>
In-Reply-To: <5a30caa3-3351-41e7-a77f-91e5959b2da6@huawei.com>
Pu Lehui <pulehui@huawei.com> writes:
> On 2024/1/30 21:28, Björn Töpel wrote:
>> Pu Lehui <pulehui@huawei.com> writes:
>>
>>> On 2024/1/30 16:29, Björn Töpel wrote:
>>>> Pu Lehui <pulehui@huaweicloud.com> writes:
>>>>
>>>>> On 2023/9/28 17:59, Björn Töpel wrote:
>>>>>> Pu Lehui <pulehui@huaweicloud.com> writes:
>>>>>>
>>>>>>> From: Pu Lehui <pulehui@huawei.com>
>>>>>>>
>>>>>>> In the current RV64 JIT, if we just don't initialize the TCC in subprog,
>>>>>>> the TCC can be propagated from the parent process to the subprocess, but
>>>>>>> the TCC of the parent process cannot be restored when the subprocess
>>>>>>> exits. Since the RV64 TCC is initialized before saving the callee saved
>>>>>>> registers into the stack, we cannot use the callee saved register to
>>>>>>> pass the TCC, otherwise the original value of the callee saved register
>>>>>>> will be destroyed. So we implemented mixing bpf2bpf and tailcalls
>>>>>>> similar to x86_64, i.e. using a non-callee saved register to transfer
>>>>>>> the TCC between functions, and saving that register to the stack to
>>>>>>> protect the TCC value. At the same time, we also consider the scenario
>>>>>>> of mixing trampoline.
>>>>>>
>>>>>> Hi!
>>>>>>
>>>>>> The RISC-V JIT tries to minimize the stack usage, e.g. it doesn't have a
>>>>>> fixed pro/epilogue like some of the other JITs. I think we can do better
>>>>>> here, so that the pass-TCC-via-register can be used, and the additional
>>>>>> stack access can be avoided.
>>>>>>
>>>>>> Today, the TCC is passed via a register (a6) and can be viewed as a
>>>>>> "state" variable/transparent argument/return value. As you point out, we
>>>>>> lose this when we do a call. On (any) calls we move the TCC to a
>>>>>> callee-saved register.
>>>>>>
>>>>>> WDYT about the following scheme:
>>>>>>
>>>>>> 1. Pick up the arm64 bpf2bpf/tailmix mechanism of just clearing the TCC
>>>>>>    for the main program.
>>>>>> 2. For BPF helper calls, move TCC to s6, perform the call, and restore
>>>>>>    a6. Ditto for kfunc calls (BPF_PSEUDO_KFUNC_CALL).
>>>>>> 3. For all other calls, a6 is passed transparently.
>>>>>>
>>>>>> For 2, bpf_jit_get_func_addr() can be used to determine if the callee is
>>>>>> a BPF helper or not.
>>>>>>
>>>>>> In summary: determine in the JIT if we're leaving BPF-land, and need to
>>>>>> move the TCC to a callee-saved reg, or not, and save us a bunch of stack
>>>>>> store/loads.
>>>>>>
>>>>>
>>>>> Valuable scheme. But we need to consider TCC back propagation. Let me
>>>>> show an example of calling subprog with TCC stored in A6:
>>>>>
>>>>> prog1(TCC==1){
>>>>> subprog1(TCC==1)
>>>>> -> tailcall1(TCC==0)
>>>>> -> subprog2(TCC==0)
>>>>> subprog3(TCC==0) <--- should be TCC==1
>>>>> -\-> tailcall2 <--- can't be called
>>>>> }
>>>
>>> Let's go back to this example. Imagine that the tailcall chain is a
>>> list limited to 33 elements. When the list has 32 elements, we call
>>> subprog1 and then tailcall1. At this point, the list holds 33
>>> elements. Then we call subprog2 and return to prog1. At this point, the
>>> list drops 1 element and holds 32 again, so 1 more tailcall can
>>> still be performed.
>>>
>>> I've attached a diagram that shows that mixing tailcalls and subprogs
>>> behaves almost like a "call": it can return to the caller function.
>>
>> Hmm. Let me put my Q in another way.
>>
>> The kernel calls into BPF_PROG_RUN() (~a BPF context). Would it ever be
>> OK to do more than 33 tail calls, regardless of subprogs or not?
>>
>> In your example, TCC is 1. You are allowed to perform one tail call. In
>> your example prog1 performs two.
>>
>> My view of TCC has always been ~a counter of the number of tailcalls~.
>>
>> With your example expanded:
>> prog1(TCC==33){
>> subprog1(TCC==33)
>> -> tailcall1(TCC==33) -> tailcall1(TCC==32) -> tailcall1(TCC==31) -> ... // 33 times
>> // Lehui says TCC should be 33 again.
>> // Björn says "it's the number of tailcalls", and subprog3 cannot perform a tail call
>> subprog3(TCC==?)
>
> Yes, my view is to treat this as something like a stack, while you treat
> it as a fixed global value.
>
> prog1(TCC==33){
> subprog1(TCC==33)
> -> tailcall1(TCC==33) -> tailcall1(TCC==32) ->
> tailcall1(TCC==31) -> ... // 33 times -> subprog2(TCC==0)
> subprog3(TCC==33)
> -> tailcall1(TCC==33) -> tailcall1(TCC==32) -> tailcall1(TCC==31) ->
> ... // 33 times
>
>>
>> My view has, again, been that TCC is a run-time count of the number of
>> tailcalls (fentry/fexit patch bpf-programs included).
>>
>> What do x86 and arm64 do?
>
> When a subprog returns to the calling bpf program, they both restore TCC
> to the value it had on entry to the subprog. ARM64 uses a callee-saved
> register to store the TCC, so when the ARM64 subprog exits, the TCC is
> restored to the value it had on entry, while x86 uses the stack to do
> the same thing.

Ok! Thanks for clarifying. I'll continue reviewing the v2 of your
series!

BTW, I wonder if we can trigger this [1] on RV64, i.e. calling the
main prog will reset the TCC count.

[1] https://lore.kernel.org/bpf/20240104142226.87869-1-hffilwlqm@gmail.com/
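
To make the two readings of TCC discussed above concrete, here is a small
user-space model. It is only a sketch and not JIT code: the names Ctx,
call_subprog, exhaust and run_prog1 are made up for illustration, the limit
of 33 is taken from the examples in this thread, and a tail call is
simplified to "decrement the counter if it is non-zero". With
restore_on_return=True the TCC is put back when a subprog returns (the
stack-like view, which is what x86 and arm64 are described as doing above);
with False it behaves as a single run-time counter.

```python
MAX_TAIL_CALL_CNT = 33  # limit used in the examples in this thread


class Ctx:
    """Hypothetical model of one BPF_PROG_RUN() context."""

    def __init__(self, restore_on_return):
        self.tcc = MAX_TAIL_CALL_CNT
        self.restore_on_return = restore_on_return

    def tail_call(self):
        """Return True if a tail call is still allowed, consuming one slot."""
        if self.tcc == 0:
            return False
        self.tcc -= 1
        return True

    def call_subprog(self, body):
        """Call a subprog; optionally restore the TCC when it returns."""
        saved = self.tcc
        body(self)
        if self.restore_on_return:
            self.tcc = saved


def exhaust(ctx):
    """subprog1: keep tail calling until the limit is hit."""
    while ctx.tail_call():
        pass


def run_prog1(restore_on_return):
    """prog1: call subprog1, then let subprog3 try a single tail call."""
    ctx = Ctx(restore_on_return)
    ctx.call_subprog(exhaust)
    result = []
    ctx.call_subprog(lambda c: result.append(c.tail_call()))
    return result[0]


if __name__ == "__main__":
    print("stack-like TCC, subprog3 may tail call:", run_prog1(True))
    print("global counter, subprog3 may tail call:", run_prog1(False))
```

In the stack-like model subprog3 gets its tail call, in the counter model
it does not, which is exactly the difference between the two expanded
examples above.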