From: Vadim Fedorenko <vadim.fedorenko@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Eduard Zingerman <eddyz87@gmail.com>
Cc: Vadim Fedorenko <vadfed@meta.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Mykola Lysenko <mykolal@fb.com>, Jakub Kicinski <kuba@kernel.org>,
X86 ML <x86@kernel.org>, bpf <bpf@vger.kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>
Subject: Re: [PATCH bpf-next v5 1/4] bpf: add bpf_get_cpu_cycles kfunc
Date: Tue, 12 Nov 2024 23:08:38 +0000 [thread overview]
Message-ID: <ee3362bd-316e-47e5-83d9-8e00651c122a@linux.dev> (raw)
In-Reply-To: <CAADnVQ+bYuda8bWtY9vtxh9WGUOBz+5hvS6V9X00i5gtHhLt1Q@mail.gmail.com>
On 12/11/2024 22:27, Alexei Starovoitov wrote:
> On Tue, Nov 12, 2024 at 2:20 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>>
>> On Tue, 2024-11-12 at 13:53 -0800, Eduard Zingerman wrote:
>>> On Tue, 2024-11-12 at 21:39 +0000, Vadim Fedorenko wrote:
>>>
>>> [...]
>>>
>>>>>> + if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL &&
>>>>>> + imm32 == BPF_CALL_IMM(bpf_get_cpu_cycles)) {
>>>>>> + /* Save RDX because RDTSC will use EDX:EAX to return u64 */
>>>>>> + emit_mov_reg(&prog, true, AUX_REG, BPF_REG_3);
>>>>>> + if (boot_cpu_has(X86_FEATURE_LFENCE_RDTSC))
>>>>>> + EMIT_LFENCE();
>>>>>> + EMIT2(0x0F, 0x31);
>>>>>> +
>>>>>> + /* shl RDX, 32 */
>>>>>> + maybe_emit_1mod(&prog, BPF_REG_3, true);
>>>>>> + EMIT3(0xC1, add_1reg(0xE0, BPF_REG_3), 32);
>>>>>> + /* or RAX, RDX */
>>>>>> + maybe_emit_mod(&prog, BPF_REG_0, BPF_REG_3, true);
>>>>>> + EMIT2(0x09, add_2reg(0xC0, BPF_REG_0, BPF_REG_3));
>>>>>> + /* restore RDX from R11 */
>>>>>> + emit_mov_reg(&prog, true, BPF_REG_3, AUX_REG);
>>>>>
>>>>> Note: The default implementation of this kfunc uses __arch_get_hw_counter(),
>>>>> which is implemented as `(u64)rdtsc_ordered() & S64_MAX`.
>>>>> Here we don't do `& S64_MAX`.
>>>>> The masking in __arch_get_hw_counter() was added by this commit:
>>>>> 77750f78b0b3 ("x86/vdso: Fix gettimeofday masking").
>>>>
>>>> I think we already discussed it with Alexey in v1, we don't really need
>>>> any masking here for BPF case. We can use values provided by CPU
>>>> directly. It will never happen that within one BPF program we will have
>>>> inlined and non-inlined implementation of this helper, hence the values
>>>> to compare will be of the same source.
>>>>
>>>>> Also, the default implementation does not issue `lfence`.
>>>>> Not sure if this makes any real-world difference.
>>>>
>>>> Well, it actually does. rdtsc_ordered is translated into `lfence; rdtsc`
>>>> or `rdtscp` (which is rdtsc + lfence + u32 cookie) depending on the cpu
>>>> features.
>>>
>>> I see the following disassembly:
>>>
>>> 0000000000008980 <bpf_get_cpu_cycles>:
>>> ; {
>>> 8980: f3 0f 1e fa endbr64
>>> 8984: e8 00 00 00 00 callq 0x8989 <bpf_get_cpu_cycles+0x9>
>>> 0000000000008985: R_X86_64_PLT32 __fentry__-0x4
>>> ; asm volatile(ALTERNATIVE_2("rdtsc",
>>> 8989: 0f 31 rdtsc
>>> 898b: 90 nop
>>> 898c: 90 nop
>>> 898d: 90 nop
>>> ; return EAX_EDX_VAL(val, low, high);
>>> 898e: 48 c1 e2 20 shlq $0x20, %rdx
>>> 8992: 48 09 d0 orq %rdx, %rax
>>> 8995: 48 b9 ff ff ff ff ff ff ff 7f movabsq $0x7fffffffffffffff, %rcx # imm = 0x7FFFFFFFFFFFFFFF
>>> ; return (u64)rdtsc_ordered() & S64_MAX;
>>> 899f: 48 21 c8 andq %rcx, %rax
>>> ; return __arch_get_hw_counter(1, NULL);
>>> 89a2: 2e e9 00 00 00 00 jmp 0x89a8 <bpf_get_cpu_cycles+0x28>
>>>
>>> Is it patched when kernel is loaded to replace nops with lfence?
>>> By real-world difference I meant difference between default
>>> implementation and inlined assembly.
>>
>> Talked with Vadim off-list, he explained that 'rttsc nop nop nop' is
>> indeed patched at kernel load. Regarding S64_MAX patching we just hope
>> this should never be an issue for BPF use-case.
>> So, no more questions from my side.
>
> since s64 question came up twice it should be a comment.
sure, will do it.
>
> nop nop as well.
do you mean why there are nop;nop instructions in the kernel's assembly?
>
> pw-bot: cr
next prev parent reply other threads:[~2024-11-12 23:08 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-09 0:41 [PATCH bpf-next v5 1/4] bpf: add bpf_get_cpu_cycles kfunc Vadim Fedorenko
2024-11-09 0:41 ` [PATCH bpf-next v5 2/4] bpf: add bpf_cpu_cycles_to_ns helper Vadim Fedorenko
2024-11-12 23:03 ` Eduard Zingerman
2024-11-09 0:41 ` [PATCH bpf-next v5 3/4] selftests/bpf: add selftest to check rdtsc jit Vadim Fedorenko
2024-11-12 23:17 ` Eduard Zingerman
2024-11-09 0:41 ` [PATCH bpf-next v5 4/4] selftests/bpf: add usage example for cpu cycles kfuncs Vadim Fedorenko
2024-11-12 5:50 ` [PATCH bpf-next v5 1/4] bpf: add bpf_get_cpu_cycles kfunc Andrii Nakryiko
2024-11-12 21:43 ` Vadim Fedorenko
2024-11-12 23:59 ` Andrii Nakryiko
2024-11-12 21:21 ` Eduard Zingerman
2024-11-12 21:39 ` Vadim Fedorenko
2024-11-12 21:53 ` Eduard Zingerman
2024-11-12 22:19 ` Eduard Zingerman
2024-11-12 22:27 ` Alexei Starovoitov
2024-11-12 23:08 ` Vadim Fedorenko [this message]
2024-11-13 0:09 ` Alexei Starovoitov
2024-11-13 0:20 ` Vadim Fedorenko
2024-11-13 17:38 ` Yonghong Song
2024-11-13 17:52 ` Vadim Fedorenko
2024-11-13 18:42 ` Yonghong Song
2024-11-13 22:28 ` Vadim Fedorenko
2024-11-13 23:02 ` Yonghong Song
2024-11-14 1:05 ` Vadim Fedorenko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ee3362bd-316e-47e5-83d9-8e00651c122a@linux.dev \
--to=vadim.fedorenko@linux.dev \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=kuba@kernel.org \
--cc=martin.lau@linux.dev \
--cc=mykolal@fb.com \
--cc=tglx@linutronix.de \
--cc=vadfed@meta.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.