* Re: [PATCH] bpf, arm64: inline bpf_get_current_task/_btf() helpers
2024-06-19 13:13 [PATCH] bpf, arm64: inline bpf_get_current_task/_btf() helpers Puranjay Mohan
@ 2024-06-20 2:41 ` Xu Kuohai
2024-06-20 19:07 ` Andrii Nakryiko
2024-06-21 21:30 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 4+ messages in thread
From: Xu Kuohai @ 2024-06-20 2:41 UTC (permalink / raw)
To: Puranjay Mohan, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
Hao Luo, Jiri Olsa, Catalin Marinas, Will Deacon, bpf,
linux-arm-kernel, linux-kernel
Cc: puranjay12
On 6/19/2024 9:13 PM, Puranjay Mohan wrote:
> On ARM64, the pointer to task_struct is always available in the sp_el0
> register and therefore the calls to bpf_get_current_task() and
> bpf_get_current_task_btf() can be inlined into a single MRS instruction.
>
> Here is the difference before and after this change:
>
> Before:
>
> ; struct task_struct *task = bpf_get_current_task_btf();
> 54: mov x10, #0xffffffffffff7978 // #-34440
> 58: movk x10, #0x802b, lsl #16
> 5c: movk x10, #0x8000, lsl #32
> 60: blr x10 --------------> 0xffff8000802b7978 <+0>: mrs x0, sp_el0
> 64: add x7, x0, #0x0 <-------------- 0xffff8000802b797c <+4>: ret
>
> After:
>
> ; struct task_struct *task = bpf_get_current_task_btf();
> 54: mrs x7, sp_el0
>
> This shows around 1% performance improvement in artificial microbenchmark.
>
I think it would be better if more detailed data could be provided.
> Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
> ---
> arch/arm64/net/bpf_jit_comp.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
> index 720336d28856..b838dab3bd26 100644
> --- a/arch/arm64/net/bpf_jit_comp.c
> +++ b/arch/arm64/net/bpf_jit_comp.c
> @@ -1244,6 +1244,13 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
> break;
> }
>
> + /* Implement helper call to bpf_get_current_task/_btf() inline */
> + if (insn->src_reg == 0 && (insn->imm == BPF_FUNC_get_current_task ||
> + insn->imm == BPF_FUNC_get_current_task_btf)) {
> + emit(A64_MRS_SP_EL0(r0), ctx);
> + break;
> + }
> +
> ret = bpf_jit_get_func_addr(ctx->prog, insn, extra_pass,
> &func_addr, &func_addr_fixed);
> if (ret < 0)
> @@ -2581,6 +2588,8 @@ bool bpf_jit_inlines_helper_call(s32 imm)
> {
> switch (imm) {
> case BPF_FUNC_get_smp_processor_id:
> + case BPF_FUNC_get_current_task:
> + case BPF_FUNC_get_current_task_btf:
> return true;
> default:
> return false;
Acked-by: Xu Kuohai <xukuohai@huawei.com>
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH] bpf, arm64: inline bpf_get_current_task/_btf() helpers
2024-06-19 13:13 [PATCH] bpf, arm64: inline bpf_get_current_task/_btf() helpers Puranjay Mohan
2024-06-20 2:41 ` Xu Kuohai
@ 2024-06-20 19:07 ` Andrii Nakryiko
2024-06-21 21:30 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 4+ messages in thread
From: Andrii Nakryiko @ 2024-06-20 19:07 UTC (permalink / raw)
To: Puranjay Mohan
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Xu Kuohai, Catalin Marinas, Will Deacon, bpf, linux-arm-kernel,
linux-kernel, puranjay12
On Wed, Jun 19, 2024 at 6:25 AM Puranjay Mohan <puranjay@kernel.org> wrote:
>
> On ARM64, the pointer to task_struct is always available in the sp_el0
> register and therefore the calls to bpf_get_current_task() and
> bpf_get_current_task_btf() can be inlined into a single MRS instruction.
>
> Here is the difference before and after this change:
>
> Before:
>
> ; struct task_struct *task = bpf_get_current_task_btf();
> 54: mov x10, #0xffffffffffff7978 // #-34440
> 58: movk x10, #0x802b, lsl #16
> 5c: movk x10, #0x8000, lsl #32
> 60: blr x10 --------------> 0xffff8000802b7978 <+0>: mrs x0, sp_el0
> 64: add x7, x0, #0x0 <-------------- 0xffff8000802b797c <+4>: ret
>
> After:
>
> ; struct task_struct *task = bpf_get_current_task_btf();
> 54: mrs x7, sp_el0
>
> This shows around 1% performance improvement in artificial microbenchmark.
>
> Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
> ---
> arch/arm64/net/bpf_jit_comp.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
These are frequently used helpers, similarly to
get_smp_processor_id(), so I think it makes sense to optimize them.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
> index 720336d28856..b838dab3bd26 100644
> --- a/arch/arm64/net/bpf_jit_comp.c
> +++ b/arch/arm64/net/bpf_jit_comp.c
> @@ -1244,6 +1244,13 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
> break;
> }
>
> + /* Implement helper call to bpf_get_current_task/_btf() inline */
> + if (insn->src_reg == 0 && (insn->imm == BPF_FUNC_get_current_task ||
> + insn->imm == BPF_FUNC_get_current_task_btf)) {
> + emit(A64_MRS_SP_EL0(r0), ctx);
> + break;
> + }
> +
> ret = bpf_jit_get_func_addr(ctx->prog, insn, extra_pass,
> &func_addr, &func_addr_fixed);
> if (ret < 0)
> @@ -2581,6 +2588,8 @@ bool bpf_jit_inlines_helper_call(s32 imm)
> {
> switch (imm) {
> case BPF_FUNC_get_smp_processor_id:
> + case BPF_FUNC_get_current_task:
> + case BPF_FUNC_get_current_task_btf:
> return true;
> default:
> return false;
> --
> 2.40.1
>
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH] bpf, arm64: inline bpf_get_current_task/_btf() helpers
2024-06-19 13:13 [PATCH] bpf, arm64: inline bpf_get_current_task/_btf() helpers Puranjay Mohan
2024-06-20 2:41 ` Xu Kuohai
2024-06-20 19:07 ` Andrii Nakryiko
@ 2024-06-21 21:30 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-06-21 21:30 UTC (permalink / raw)
To: Puranjay Mohan
Cc: ast, daniel, andrii, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, xukuohai,
catalin.marinas, will, bpf, linux-arm-kernel, linux-kernel,
puranjay12
Hello:
This patch was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko <andrii@kernel.org>:
On Wed, 19 Jun 2024 13:13:34 +0000 you wrote:
> On ARM64, the pointer to task_struct is always available in the sp_el0
> register and therefore the calls to bpf_get_current_task() and
> bpf_get_current_task_btf() can be inlined into a single MRS instruction.
>
> Here is the difference before and after this change:
>
> Before:
>
> [...]
Here is the summary with links:
- bpf, arm64: inline bpf_get_current_task/_btf() helpers
https://git.kernel.org/bpf/bpf-next/c/2bb138cb20a6
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 4+ messages in thread