From: Vadim Fedorenko <vadim.fedorenko@linux.dev>
To: Yonghong Song <yonghong.song@linux.dev>,
Vadim Fedorenko <vadfed@meta.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>,
Thomas Gleixner <tglx@linutronix.de>,
Mykola Lysenko <mykolal@fb.com>, Jakub Kicinski <kuba@kernel.org>
Cc: x86@kernel.org, bpf@vger.kernel.org,
Martin KaFai Lau <martin.lau@linux.dev>
Subject: Re: [PATCH bpf-next v5 1/4] bpf: add bpf_get_cpu_cycles kfunc
Date: Wed, 13 Nov 2024 17:52:42 +0000
Message-ID: <27ee9031-3304-49a5-ac82-0fbe50294646@linux.dev>
In-Reply-To: <3c10fd70-ef6d-4762-b5a4-7ed912d97693@linux.dev>
On 13/11/2024 17:38, Yonghong Song wrote:
>
>
>
> On 11/8/24 4:41 PM, Vadim Fedorenko wrote:
>> New kfunc to return ARCH-specific timecounter. For x86 BPF JIT converts
>> it into rdtsc ordered call. Other architectures will get JIT
>> implementation too if supported. The fallback is to
>> __arch_get_hw_counter().
>>
>> Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
>> ---
>> v4 -> v5:
>> * use if instead of ifdef with IS_ENABLED
>> v3 -> v4:
>> * change name of the helper to bpf_get_cpu_cycles (Andrii)
>> * Hide the helper behind CONFIG_GENERIC_GETTIMEOFDAY to avoid exposing
>> it on architectures which do not have vDSO functions and data
>> * reduce the scope of check of inlined functions in verifier to only 2,
>> which are actually inlined.
>> v2 -> v3:
>> * change name of the helper to bpf_get_cpu_cycles_counter to explicitly
>> mention what counter it provides (Andrii)
>> * move kfunc definition to bpf.h to use it in JIT.
>> * introduce another kfunc to convert cycles into nanoseconds as more
>> meaningful time units for generic tracing use case (Andrii)
>> v1 -> v2:
>> * Fix incorrect function return value type to u64
>> * Introduce bpf_jit_inlines_kfunc_call() and use it in
>> mark_fastcall_pattern_for_call() to avoid clobbering in case of
>> running programs with no JIT (Eduard)
>> * Avoid rewriting instruction and check function pointer directly
>> in JIT (Alexei)
>> * Change includes to fix compile issues on non x86 architectures
>> ---
>> arch/x86/net/bpf_jit_comp.c | 28 ++++++++++++++++++++++++++++
>> arch/x86/net/bpf_jit_comp32.c | 14 ++++++++++++++
>> include/linux/bpf.h | 5 +++++
>> include/linux/filter.h | 1 +
>> kernel/bpf/core.c | 11 +++++++++++
>> kernel/bpf/helpers.c | 13 +++++++++++++
>> kernel/bpf/verifier.c | 30 +++++++++++++++++++++++++++++-
>> 7 files changed, 101 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>> index 06b080b61aa5..4f78ed93ee7f 100644
>> --- a/arch/x86/net/bpf_jit_comp.c
>> +++ b/arch/x86/net/bpf_jit_comp.c
>> @@ -2126,6 +2126,26 @@ st: if (is_imm8(insn->off))
>> case BPF_JMP | BPF_CALL: {
>> u8 *ip = image + addrs[i - 1];
>> + if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL &&
>> + imm32 == BPF_CALL_IMM(bpf_get_cpu_cycles)) {
>> + /* Save RDX because RDTSC will use EDX:EAX to return u64 */
>> + emit_mov_reg(&prog, true, AUX_REG, BPF_REG_3);
>> + if (boot_cpu_has(X86_FEATURE_LFENCE_RDTSC))
>> + EMIT_LFENCE();
>> + EMIT2(0x0F, 0x31);
>> +
>> + /* shl RDX, 32 */
>> + maybe_emit_1mod(&prog, BPF_REG_3, true);
>> + EMIT3(0xC1, add_1reg(0xE0, BPF_REG_3), 32);
>> + /* or RAX, RDX */
>> + maybe_emit_mod(&prog, BPF_REG_0, BPF_REG_3, true);
>> + EMIT2(0x09, add_2reg(0xC0, BPF_REG_0, BPF_REG_3));
>> + /* restore RDX from R11 */
>> + emit_mov_reg(&prog, true, BPF_REG_3, AUX_REG);
>> +
>> + break;
>> + }
>> +
>> func = (u8 *) __bpf_call_base + imm32;
>> if (tail_call_reachable) {
>> LOAD_TAIL_CALL_CNT_PTR(bpf_prog->aux->stack_depth);
>> @@ -3652,3 +3672,11 @@ u64 bpf_arch_uaddress_limit(void)
>> {
>> return 0;
>> }
>> +
>> +/* x86-64 JIT can inline kfunc */
>> +bool bpf_jit_inlines_kfunc_call(s32 imm)
>> +{
>> + if (imm == BPF_CALL_IMM(bpf_get_cpu_cycles))
>> + return true;
>> + return false;
>> +}
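
For readers following the x86-64 hunk above: RDTSC leaves the high half of the counter in EDX and the low half in EAX, and the emitted "shl rdx, 32; or rax, rdx" pair folds them into a single 64-bit value in RAX. A minimal user-space sketch of that arithmetic, assuming nothing beyond standard C (the function name combine_tsc_halves is made up for illustration and is not part of the patch):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: models what the JIT-emitted
 * sequence computes once RDTSC has filled EDX:EAX.
 */
static uint64_t combine_tsc_halves(uint32_t eax, uint32_t edx)
{
	uint64_t rax = eax;	/* low 32 bits, upper half zeroed */
	uint64_t rdx = edx;	/* high 32 bits, still in the low half */

	rdx <<= 32;		/* shl rdx, 32 */
	rax |= rdx;		/* or  rax, rdx */
	return rax;
}
```

The real sequence also saves RDX to R11 beforehand and restores it afterwards, since RDX holds BPF_REG_3 and the call is marked KF_FASTCALL.
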
>> diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
>> index de0f9e5f9f73..e6097a371b69 100644
>> --- a/arch/x86/net/bpf_jit_comp32.c
>> +++ b/arch/x86/net/bpf_jit_comp32.c
>> @@ -2094,6 +2094,13 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
>> if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
>> int err;
>> + if (imm32 == BPF_CALL_IMM(bpf_get_cpu_cycles)) {
>> + if (boot_cpu_has(X86_FEATURE_LFENCE_RDTSC))
>> + EMIT3(0x0F, 0xAE, 0xE8);
>> + EMIT2(0x0F, 0x31);
>> + break;
>> + }
>> +
>> err = emit_kfunc_call(bpf_prog,
>> image + addrs[i],
>> insn, &prog);
>> @@ -2621,3 +2628,10 @@ bool bpf_jit_supports_kfunc_call(void)
>> {
>> return true;
>> }
>> +
>> +bool bpf_jit_inlines_kfunc_call(s32 imm)
>> +{
>> + if (imm == BPF_CALL_IMM(bpf_get_cpu_cycles))
>> + return true;
>> + return false;
>> +}
>
> [...]
>
>> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
>> index 395221e53832..5c6c0383ebf4 100644
>> --- a/kernel/bpf/helpers.c
>> +++ b/kernel/bpf/helpers.c
>> @@ -23,6 +23,9 @@
>> #include <linux/btf_ids.h>
>> #include <linux/bpf_mem_alloc.h>
>> #include <linux/kasan.h>
>> +#if IS_ENABLED(CONFIG_GENERIC_GETTIMEOFDAY)
>> +#include <vdso/datapage.h>
>> +#endif
>> #include "../../lib/kstrtox.h"
>> @@ -3023,6 +3026,13 @@ __bpf_kfunc int bpf_copy_from_user_str(void *dst, u32 dst__sz, const void __user
>> return ret + 1;
>> }
>> +#if IS_ENABLED(CONFIG_GENERIC_GETTIMEOFDAY)
>> +__bpf_kfunc u64 bpf_get_cpu_cycles(void)
>> +{
>> + return __arch_get_hw_counter(1, NULL);
>
> Some comment to explain what '1' means in the above?
That's an arch-specific value which selects the HW-implemented counter on
all architectures which have a vDSO gettimeofday() implementation.
On x86 it corresponds to VDSO_CLOCKMODE_TSC, while on aarch64/RISC-V
it's VDSO_CLOCKMODE_ARCHTIMER. Actually, on RISC-V the value of the
first parameter doesn't matter at all, and for aarch64 it should be 0.
The only arch which is stricter about this parameter is x86, but it
has its own special name for it...
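
To make the role of the literal 1 concrete, here is a small user-space model of the dispatch. It is only a sketch under stated assumptions: the enum, the stub counter and the function names are all made up for illustration, and the "any other mode returns an all-ones value" behaviour is my reading of how a strict, x86-style implementation treats an unexpected clock mode, not something taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the clock mode values discussed above. */
enum clock_mode_sketch {
	CLOCKMODE_NONE = 0,
	CLOCKMODE_TSC  = 1,	/* what the bare '1' selects on x86 */
};

/* Stub counter so the sketch runs on any machine (no RDTSC needed). */
static uint64_t fake_counter(void)
{
	static uint64_t ticks = 1000;

	ticks += 7;
	return ticks;
}

/* Models a strict arch hook: only the expected mode reads the counter. */
static uint64_t hw_counter_sketch(int32_t clock_mode)
{
	if (clock_mode == CLOCKMODE_TSC)
		return fake_counter() & INT64_MAX;
	return UINT64_MAX;	/* assumed "invalid mode" marker */
}

/* Models the kfunc body, with the magic number replaced by a name. */
static uint64_t get_cpu_cycles_sketch(void)
{
	return hw_counter_sketch(CLOCKMODE_TSC);
}
```

Spelling the argument as a named constant (or adding a one-line comment next to the 1) would answer the question directly in the code.
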
>
>> +}
>> +#endif
>> +
>> __bpf_kfunc_end_defs();
>> BTF_KFUNCS_START(generic_btf_ids)
>> @@ -3115,6 +3125,9 @@ BTF_ID_FLAGS(func, bpf_get_kmem_cache)
>> BTF_ID_FLAGS(func, bpf_iter_kmem_cache_new, KF_ITER_NEW | KF_SLEEPABLE)
>> BTF_ID_FLAGS(func, bpf_iter_kmem_cache_next, KF_ITER_NEXT | KF_RET_NULL | KF_SLEEPABLE)
>> BTF_ID_FLAGS(func, bpf_iter_kmem_cache_destroy, KF_ITER_DESTROY | KF_SLEEPABLE)
>> +#if IS_ENABLED(CONFIG_GENERIC_GETTIMEOFDAY)
>> +BTF_ID_FLAGS(func, bpf_get_cpu_cycles, KF_FASTCALL)
>> +#endif
>> BTF_KFUNCS_END(common_btf_ids)
>> static const struct btf_kfunc_id_set common_kfunc_set = {
> [...]