From: Yonghong Song <yhs@fb.com>
To: John Fastabend <john.fastabend@gmail.com>,
Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Kernel Team <kernel-team@fb.com>,
Lorenz Bauer <lmb@cloudflare.com>
Subject: Re: [PATCH bpf-next] docs/bpf: add llvm_reloc.rst to explain llvm bpf relocations
Date: Mon, 24 May 2021 20:39:49 -0700
Message-ID: <c77ba8ee-d56c-0db6-4741-5527dc7053a7@fb.com>
In-Reply-To: <60abfd5d94d7_135f6208cd@john-XPS-13-9370.notmuch>
On 5/24/21 12:24 PM, John Fastabend wrote:
> Yonghong Song wrote:
>>
>>
>> On 5/24/21 10:23 AM, Andrii Nakryiko wrote:
>>> On Sat, May 22, 2021 at 9:39 AM Yonghong Song <yhs@fb.com> wrote:
>>>>
>>>> LLVM upstream commit https://reviews.llvm.org/D102712
>>>> made some changes to bpf relocations to make them
>>>> llvm linker lld friendly. The scope of
>>>> existing relocations R_BPF_64_{64,32} is narrowed
>>>> and new relocations R_BPF_64_{ABS32,ABS64,NODYLD32}
>>>> are introduced.
>>>>
>>>> Let us add some documentation about llvm bpf
>>>> relocations so people can understand how to resolve
>>>> them properly in their respective tools.
>>>>
>>>> Cc: John Fastabend <john.fastabend@gmail.com>
>>>> Cc: Lorenz Bauer <lmb@cloudflare.com>
>>>> Signed-off-by: Yonghong Song <yhs@fb.com>
>>>> ---
>>>> Documentation/bpf/index.rst | 1 +
>>>> Documentation/bpf/llvm_reloc.rst | 168 +++++++++++++++++++++++++++++++
>>>> 2 files changed, 169 insertions(+)
>>>> create mode 100644 Documentation/bpf/llvm_reloc.rst
>>>>
>>>> diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst
>>>> index a702f67dd45f..93e8cf12a6d4 100644
>>>> --- a/Documentation/bpf/index.rst
>>>> +++ b/Documentation/bpf/index.rst
>>>> @@ -84,6 +84,7 @@ Other
>>>> :maxdepth: 1
>>>>
>>>> ringbuf
>>>> + llvm_reloc
>>>>
>
> Thanks Yonghong, I found this helpful. I still had to crack
> open llvm code though to follow along. A couple small suggestions
> below, may or may not be useful. Overall looks good.
>
>>>> .. Links:
>>>> .. _networking-filter: ../networking/filter.rst
>>>> diff --git a/Documentation/bpf/llvm_reloc.rst b/Documentation/bpf/llvm_reloc.rst
>>>> new file mode 100644
>>>> index 000000000000..bc62bce591b1
>>>> --- /dev/null
>>>> +++ b/Documentation/bpf/llvm_reloc.rst
>>>> @@ -0,0 +1,168 @@
>>>> +.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
>>>> +
>>>> +====================
>>>> +BPF LLVM Relocations
>>>> +====================
>>>> +
>>>> +This document describes LLVM BPF backend relocation types.
>>>> +
>>>> +Relocation Record
>>>> +=================
>>>> +
>>>> +LLVM BPF backend records each relocation with the following 16-byte
>>>> +ELF structure::
>>>> +
>>>> + typedef struct
>>>> + {
>>>> + Elf64_Addr r_offset; // Offset from the beginning of section.
>>>> + Elf64_Xword r_info; // Relocation type and symbol index.
>>>> + } Elf64_Rel;
>>>> +
>>>> +For static function/variable references, the symbol often refers to
>>>> +the section itself which has a value of 0. To identify actual static
>>>> +function/variable, its section offset or some computation result
>>>> +based on section offset is written to the original insn/data buffer,
>>>> +which is called ``IA`` (implicit addend) below. For global
>>>> +function/variables, the symbol refers to actual global and the implicit
>>>> +addend is 0.
>
> Above was too terse for me to follow without looking into some clang
> examples. Maybe an example right here would help not sure? Maybe expand
> the text a bit? I don't have a really good suggestion.
I just sent a new revision with an example. I hope it makes the
``IA`` concept above easier to understand.
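
In the meantime, here is a rough sketch (not taken from the patch) of
how a tool might decode one relocation record and fetch the implicit
addend. The names bpf_rel, rel_sym, rel_type and read_ia32 are made up
for illustration, and the sketch assumes the host byte order matches
the object file:

```c
#include <stdint.h>
#include <string.h>

/* Mirrors the 16-byte Elf64_Rel layout quoted above (hypothetical name). */
struct bpf_rel {
	uint64_t r_offset;	/* offset from the beginning of the section */
	uint64_t r_info;	/* symbol index (high 32 bits), type (low 32 bits) */
};

/* Same split as the standard ELF64_R_SYM / ELF64_R_TYPE macros. */
static uint32_t rel_sym(const struct bpf_rel *r)
{
	return (uint32_t)(r->r_info >> 32);
}

static uint32_t rel_type(const struct bpf_rel *r)
{
	return (uint32_t)(r->r_info & 0xffffffffu);
}

/* Read a 32-bit implicit addend stored in the section data at off. */
static uint32_t read_ia32(const uint8_t *sec, uint64_t off)
{
	uint32_t ia;

	memcpy(&ia, sec + off, sizeof(ia));	/* avoids unaligned access */
	return ia;
}
```

For an ld_imm64 insn the addend would be read with
read_ia32(sec, r_offset + 4); for plain 32-bit data, with
read_ia32(sec, r_offset).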
>
>>>> +
>>>> +Different Relocation Types
>>>> +==========================
>>>> +
>>>> +Six relocation types are supported. The following is an overview and
>>>> +``S`` represents the value of the symbol in the symbol table::
>>>> +
>>>> + Enum ELF Reloc Type Description BitSize Offset Calculation
>>>> + 0 R_BPF_NONE None
>>>> + 1 R_BPF_64_64 ld_imm64 insn 32 r_offset + 4 S + IA
>>>
>>> There are cases where we set all 64-bits of ld_imm64 (e.g., extern
>>> ksym, global variables). Or those will be a different relocation now
>>> (R_BPF_64_ABS64?). If not, I think BitSize 64 is more correct here.
>>
>> It is still R_BPF_64_64. In llvm, we have a restriction that the section
>> offset must be <= UINT32_MAX, and that is why only 32 bits are used
>> to find the actual symbol in the symbol table. 32 bits permit a 4GB
>> section, which should be enough in practice for a bpf program.
>
> ^^^ maybe add this note in the doc somewhere? I had similar questions.
Added in the new revision.
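
To make the 32-bit vs. 64-bit point concrete, here is a hedged sketch
of resolving one R_BPF_64_64 relocation. Only the 32-bit addend at
``r_offset + 4`` is read, but the result may be written to the imm
fields of both 8-byte slots of the ld_imm64 insn. The helper name and
its parameters are hypothetical, host byte order is assumed to match
the object file, and real tools may write something other than
``S + IA`` (a map fd, for instance):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: resolve one R_BPF_64_64 relocation in place. */
static void resolve_r_bpf_64_64(uint8_t *sec, uint64_t r_offset,
				uint64_t sym_value)
{
	uint32_t ia, lo, hi;
	uint64_t val;

	/* The implicit addend is the 32-bit imm of the first slot. */
	memcpy(&ia, sec + r_offset + 4, sizeof(ia));

	val = sym_value + ia;			/* S + IA */
	lo = (uint32_t)val;
	hi = (uint32_t)(val >> 32);

	/* ld_imm64 is 16 bytes: imm fields sit at +4 and +12. */
	memcpy(sec + r_offset + 4, &lo, sizeof(lo));
	memcpy(sec + r_offset + 12, &hi, sizeof(hi));
}
```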
>
>>
>> libbpf or tools can write to full 64bits of imm values of ld_imm64 insn.
>>
>> The name is a little bit misleading, but it has become part of ABI
>> and lives in /usr/include/elf.h and we are not able to change it
>> any more.
>>
>>>
>>> Looking at LLVM diff I haven't found a test for global variables (at
>>> least I didn't realize it was there), so double-checking here (and it
>>> might be a good idea to have an explicit test for global variables?)
>>
>> We have llvm/test/CodeGen/BPF/reloc.ll and
>> llvm/test/CodeGen/BPF/reloc-btf.ll covering R_BPF_64_ABS64. But I think
>> I can enhance
>> llvm/test/CodeGen/BPF/reloc-2.ll to cover an explicit global variable case.
>
> ^^^ maybe cross-reference llvm tests from kernel docs side? I often look at
> these when I get something unexpected/unknown maybe others would find
> it helpful, but not know where to look?
The LLVM patch has not been merged yet. We need to merge the libbpf
patch first; otherwise, the nightly libbpf CI will fail. But this doc
includes a link to the LLVM patch, so you can go to that patch to find
examples.
>
>>
>>>
>>>> + 2 R_BPF_64_ABS64 normal data 64 r_offset S + IA
>>>> + 3 R_BPF_64_ABS32 normal data 32 r_offset S + IA
>>>> + 4 R_BPF_64_NODYLD32 .BTF[.ext] data 32 r_offset S + IA
>>>> + 10 R_BPF_64_32 call insn 32 r_offset + 4 (S + IA) / 8 - 1
>>>> +
>>>> +For example, ``R_BPF_64_64`` relocation type is used for ``ld_imm64`` instruction.
>>>> +The actual to-be-relocated data is stored at ``r_offset + 4`` and the read/write
>>>> +data bitsize is 32 (4 bytes). The relocation can be resolved with
>>>> +the symbol value plus implicit addend.
>>>> +
>>>> +In another case, ``R_BPF_64_ABS64`` relocation type is used for normal 64-bit data.
>>>> +The actual to-be-relocated data is stored at ``r_offset`` and the read/write data
>>>> +bitsize is 64 (8 bytes). The relocation can be resolved with
>>>> +the symbol value plus implicit addend.
>>>> +
>>>> +Both ``R_BPF_64_ABS32`` and ``R_BPF_64_NODYLD32`` types are for 32-bit data.
>>>> +But ``R_BPF_64_NODYLD32`` specifically refers to relocations in ``.BTF`` and
>>>> +``.BTF.ext`` sections. For cases like bcc where llvm ``ExecutionEngine RuntimeDyld``
>>>> +is involved, ``R_BPF_64_NODYLD32`` types of relocations should not be resolved
>>>> +to actual function/variable address. Otherwise, ``.BTF`` and ``.BTF.ext``
>>>> +become unusable by bcc and kernel.
>>>> +
>>>> +Type ``R_BPF_64_32`` is used for call instruction. The call target section
>>>> +offset is stored at ``r_offset + 4`` (32bit) and calculated as
>>>> +``(S + IA) / 8 - 1``.
>>>> +
>>>> +Examples
>>>> +========
>>>> +
>
> I liked the examples.
Great. I just added one more in the new revision!
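
As a quick illustration of the ``R_BPF_64_32`` formula quoted above,
the call insn imm can be computed as below. This is only a sketch; the
function name call_imm is made up, and the 0x18 offset in the note
after it is a hypothetical value:

```c
#include <stdint.h>

/*
 * Hypothetical helper: imm value for a call insn, given the symbol
 * value S and the byte offset IA of the callee within its section.
 * Insns are 8 bytes, and the call target is relative to the next insn,
 * hence the division by 8 and the "- 1".
 */
static int32_t call_imm(uint64_t sym_value, uint64_t ia)
{
	int64_t v = (int64_t)((sym_value + ia) / 8) - 1;

	return (int32_t)v;
}
```

A global function with S = 0 and IA = 0 gives -1, and a static
function at section offset 0x18 gives 2.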