From: Puranjay Mohan <puranjay12@gmail.com>
To: "Björn Töpel" <bjorn@kernel.org>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Andrii Nakryiko" <andrii@kernel.org>,
"Martin KaFai Lau" <martin.lau@linux.dev>,
"Eduard Zingerman" <eddyz87@gmail.com>,
"Song Liu" <song@kernel.org>,
"Yonghong Song" <yonghong.song@linux.dev>,
"John Fastabend" <john.fastabend@gmail.com>,
"KP Singh" <kpsingh@kernel.org>,
"Stanislav Fomichev" <sdf@google.com>,
"Hao Luo" <haoluo@google.com>, "Jiri Olsa" <jolsa@kernel.org>,
"Paul Walmsley" <paul.walmsley@sifive.com>,
"Palmer Dabbelt" <palmer@dabbelt.com>,
"Albert Ou" <aou@eecs.berkeley.edu>,
bpf@vger.kernel.org, linux-riscv@lists.infradead.org,
linux-kernel@vger.kernel.org, "Pu Lehui" <pulehui@huawei.com>
Subject: Re: [PATCH bpf-next] riscv, bpf: add internal-only MOV instruction to resolve per-CPU addrs
Date: Fri, 05 Apr 2024 14:33:34 +0000 [thread overview]
Message-ID: <mb61pjzlb98w1.fsf@gmail.com> (raw)
In-Reply-To: <87wmpbnbce.fsf@all.your.base.are.belong.to.us>
Björn Töpel <bjorn@kernel.org> writes:
> Puranjay Mohan <puranjay12@gmail.com> writes:
>
>> Support an instruction for resolving absolute addresses of per-CPU
>> data from their per-CPU offsets. This instruction is internal-only and
>> users are not allowed to use them directly. They will only be used for
>> internal inlining optimizations for now between BPF verifier and BPF
>> JITs.
>>
>> RISC-V uses generic per-cpu implementation where the offsets for CPUs
>> are kept in an array called __per_cpu_offset[cpu_number]. RISCV stores
>> the address of the task_struct in TP register. The first element in
>> tast_struct is struct thread_info, and we can get the cpu number by
> ^
> k ;-)
I need to start using some kind of spell-check in vim :D.
>> reading from the TP register + offsetof(struct thread_info, cpu).
>>
>> Once we have the cpu number in a register we read the offset for that
>> cpu from address: &__per_cpu_offset + cpu_number << 3. Then we add this
>> offset to the destination register.
>
> Just to clarify for readers; BPF programs are run with migrate disable,
> which means that on RT we can be preempted, which means that per-cpu
> operations are trickier (disabling interrupts/preemption).
>
> However, this BPF instruction is about calculating the per-cpu address,
> so the look up can be inlined.
>
> It's not a per-cpu *operation*.
Will add this information to the commit message.
>> To measure the improvement from this change, the benchmark in [1] was
>> used on Qemu:
>>
>> Before:
>> glob-arr-inc : 1.127 ± 0.013M/s
>> arr-inc : 1.121 ± 0.004M/s
>> hash-inc : 0.681 ± 0.052M/s
>>
>> After:
>> glob-arr-inc : 1.138 ± 0.011M/s
>> arr-inc : 1.366 ± 0.006M/s
>> hash-inc : 0.676 ± 0.001M/s
>>
>> [1] https://github.com/anakryiko/linux/commit/8dec900975ef
>>
>> Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
>> ---
>> arch/riscv/net/bpf_jit_comp64.c | 24 ++++++++++++++++++++++++
>> 1 file changed, 24 insertions(+)
>>
>> diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
>> index 15e482f2c657..e95bd1d459a4 100644
>> --- a/arch/riscv/net/bpf_jit_comp64.c
>> +++ b/arch/riscv/net/bpf_jit_comp64.c
>> @@ -12,6 +12,7 @@
>> #include <linux/stop_machine.h>
>> #include <asm/patch.h>
>> #include <asm/cfi.h>
>> +#include <asm/percpu.h>
>> #include "bpf_jit.h"
>>
>> #define RV_FENTRY_NINSNS 2
>> @@ -1089,6 +1090,24 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
>> emit_or(RV_REG_T1, rd, RV_REG_T1, ctx);
>> emit_mv(rd, RV_REG_T1, ctx);
>> break;
>> + } else if (insn_is_mov_percpu_addr(insn)) {
>> + if (rd != rs)
>> + emit_mv(rd, rs, ctx);
>> +#ifdef CONFIG_SMP
>> + /* Load current CPU number in T1 */
>> + emit_ld(RV_REG_T1, offsetof(struct thread_info, cpu), RV_REG_TP,
>> + ctx);
>> + /* << 3 because offsets are 8 bytes */
>> + emit_slli(RV_REG_T1, RV_REG_T1, 3, ctx);
>> + /* Load address of __per_cpu_offset array in T2 */
>> + emit_imm(RV_REG_T2, (u64)&__per_cpu_offset, ctx);
>
> Did you try using emit_addr() here? I'd guess that'll be fewer
> instructions, no?
Yes, I should have used that, the address would always be in the range
of auipc+addi right? I will try and see.
Thanks,
Puranjay
WARNING: multiple messages have this Message-ID (diff)
From: Puranjay Mohan <puranjay12@gmail.com>
To: "Björn Töpel" <bjorn@kernel.org>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Andrii Nakryiko" <andrii@kernel.org>,
"Martin KaFai Lau" <martin.lau@linux.dev>,
"Eduard Zingerman" <eddyz87@gmail.com>,
"Song Liu" <song@kernel.org>,
"Yonghong Song" <yonghong.song@linux.dev>,
"John Fastabend" <john.fastabend@gmail.com>,
"KP Singh" <kpsingh@kernel.org>,
"Stanislav Fomichev" <sdf@google.com>,
"Hao Luo" <haoluo@google.com>, "Jiri Olsa" <jolsa@kernel.org>,
"Paul Walmsley" <paul.walmsley@sifive.com>,
"Palmer Dabbelt" <palmer@dabbelt.com>,
"Albert Ou" <aou@eecs.berkeley.edu>,
bpf@vger.kernel.org, linux-riscv@lists.infradead.org,
linux-kernel@vger.kernel.org, "Pu Lehui" <pulehui@huawei.com>
Subject: Re: [PATCH bpf-next] riscv, bpf: add internal-only MOV instruction to resolve per-CPU addrs
Date: Fri, 05 Apr 2024 14:33:34 +0000 [thread overview]
Message-ID: <mb61pjzlb98w1.fsf@gmail.com> (raw)
In-Reply-To: <87wmpbnbce.fsf@all.your.base.are.belong.to.us>
Björn Töpel <bjorn@kernel.org> writes:
> Puranjay Mohan <puranjay12@gmail.com> writes:
>
>> Support an instruction for resolving absolute addresses of per-CPU
>> data from their per-CPU offsets. This instruction is internal-only and
>> users are not allowed to use them directly. They will only be used for
>> internal inlining optimizations for now between BPF verifier and BPF
>> JITs.
>>
>> RISC-V uses generic per-cpu implementation where the offsets for CPUs
>> are kept in an array called __per_cpu_offset[cpu_number]. RISCV stores
>> the address of the task_struct in TP register. The first element in
>> tast_struct is struct thread_info, and we can get the cpu number by
> ^
> k ;-)
I need to start using some kind of spell-check in vim :D.
>> reading from the TP register + offsetof(struct thread_info, cpu).
>>
>> Once we have the cpu number in a register we read the offset for that
>> cpu from address: &__per_cpu_offset + cpu_number << 3. Then we add this
>> offset to the destination register.
>
> Just to clarify for readers; BPF programs are run with migrate disable,
> which means that on RT we can be preempted, which means that per-cpu
> operations are trickier (disabling interrupts/preemption).
>
> However, this BPF instruction is about calculating the per-cpu address,
> so the look up can be inlined.
>
> It's not a per-cpu *operation*.
Will add this information to the commit message.
>> To measure the improvement from this change, the benchmark in [1] was
>> used on Qemu:
>>
>> Before:
>> glob-arr-inc : 1.127 ± 0.013M/s
>> arr-inc : 1.121 ± 0.004M/s
>> hash-inc : 0.681 ± 0.052M/s
>>
>> After:
>> glob-arr-inc : 1.138 ± 0.011M/s
>> arr-inc : 1.366 ± 0.006M/s
>> hash-inc : 0.676 ± 0.001M/s
>>
>> [1] https://github.com/anakryiko/linux/commit/8dec900975ef
>>
>> Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
>> ---
>> arch/riscv/net/bpf_jit_comp64.c | 24 ++++++++++++++++++++++++
>> 1 file changed, 24 insertions(+)
>>
>> diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
>> index 15e482f2c657..e95bd1d459a4 100644
>> --- a/arch/riscv/net/bpf_jit_comp64.c
>> +++ b/arch/riscv/net/bpf_jit_comp64.c
>> @@ -12,6 +12,7 @@
>> #include <linux/stop_machine.h>
>> #include <asm/patch.h>
>> #include <asm/cfi.h>
>> +#include <asm/percpu.h>
>> #include "bpf_jit.h"
>>
>> #define RV_FENTRY_NINSNS 2
>> @@ -1089,6 +1090,24 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
>> emit_or(RV_REG_T1, rd, RV_REG_T1, ctx);
>> emit_mv(rd, RV_REG_T1, ctx);
>> break;
>> + } else if (insn_is_mov_percpu_addr(insn)) {
>> + if (rd != rs)
>> + emit_mv(rd, rs, ctx);
>> +#ifdef CONFIG_SMP
>> + /* Load current CPU number in T1 */
>> + emit_ld(RV_REG_T1, offsetof(struct thread_info, cpu), RV_REG_TP,
>> + ctx);
>> + /* << 3 because offsets are 8 bytes */
>> + emit_slli(RV_REG_T1, RV_REG_T1, 3, ctx);
>> + /* Load address of __per_cpu_offset array in T2 */
>> + emit_imm(RV_REG_T2, (u64)&__per_cpu_offset, ctx);
>
> Did you try using emit_addr() here? I'd guess that'll be fewer
> instructions, no?
Yes, I should have used that, the address would always be in the range
of auipc+addi right? I will try and see.
Thanks,
Puranjay
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
next prev parent reply other threads:[~2024-04-05 14:33 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-05 12:43 [PATCH bpf-next] riscv, bpf: add internal-only MOV instruction to resolve per-CPU addrs Puranjay Mohan
2024-04-05 12:43 ` Puranjay Mohan
2024-04-05 14:16 ` Björn Töpel
2024-04-05 14:16 ` Björn Töpel
2024-04-05 14:33 ` Puranjay Mohan [this message]
2024-04-05 14:33 ` Puranjay Mohan
2024-04-05 18:04 ` Andrii Nakryiko
2024-04-05 18:04 ` Andrii Nakryiko
2024-04-08 7:40 ` Björn Töpel
2024-04-08 7:40 ` Björn Töpel
2024-04-18 22:24 ` Andrii Nakryiko
2024-04-18 22:24 ` Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=mb61pjzlb98w1.fsf@gmail.com \
--to=puranjay12@gmail.com \
--cc=andrii@kernel.org \
--cc=aou@eecs.berkeley.edu \
--cc=ast@kernel.org \
--cc=bjorn@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-riscv@lists.infradead.org \
--cc=martin.lau@linux.dev \
--cc=palmer@dabbelt.com \
--cc=paul.walmsley@sifive.com \
--cc=pulehui@huawei.com \
--cc=sdf@google.com \
--cc=song@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.