From: Yonghong Song <yhs@fb.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
<dwarves@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
Andrii Nakryiko <andriin@fb.com>, Mark Wielaard <mark@klomp.org>,
Nick Desaulniers <ndesaulniers@google.com>,
Sedat Dilek <sedat.dilek@gmail.com>
Subject: Re: [PATCH dwarves] btf_encoder: sanitize non-regular int base type
Date: Sat, 6 Feb 2021 23:10:50 -0800
Message-ID: <d92ba9c6-5493-065e-e66e-ce8324f20f15@fb.com>
In-Reply-To: <CAEf4Bzb-Rqz=+pJYaVNzr8jEEAHQ-ZForsfRpNo4e=t84BRWKg@mail.gmail.com>

On 2/6/21 10:36 PM, Andrii Nakryiko wrote:
> On Sat, Feb 6, 2021 at 11:21 AM Yonghong Song <yhs@fb.com> wrote:
>>
>> clang with dwarf5 may generate non-regular int base types,
>> i.e., types other than signed/unsigned char/short/int/longlong/__int128.
>> Such base types are often used in location expressions to describe
>> how the value of an actual parameter or variable is computed. For example,
>>
>> 0x000015cf: DW_TAG_base_type
>> DW_AT_name ("DW_ATE_unsigned_1")
>> DW_AT_encoding (DW_ATE_unsigned)
>> DW_AT_byte_size (0x00)
>>
>> 0x00010ed9: DW_TAG_formal_parameter
>> DW_AT_location (DW_OP_lit0,
>> DW_OP_not,
>> DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1",
>> DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8",
>> DW_OP_stack_value)
>> DW_AT_abstract_origin (0x00013984 "branch")
>>
>> This takes a literal "0", applies a "not" operation, and then converts the
>> result to a one-bit unsigned int and then to an 8-bit unsigned int.
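
To make the expression above concrete, here is a minimal C sketch that walks through the same four operations. It assumes a 64-bit generic stack value and models DW_OP_convert to an unsigned N-bit type as keeping the low N bits. Both are simplifications of mine, and convert_unsigned() and eval_branch_location() are hypothetical names, not pahole or libbpf functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: DW_OP_convert to an unsigned N-bit base type is modeled as
 * keeping the low N bits of the value. Hypothetical helper, not pahole code.
 */
static uint64_t convert_unsigned(uint64_t value, unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	if (bits == 64)
		return value;
	return value & ((UINT64_C(1) << bits) - 1);
}

/* Walks the location expression of "branch" one operation at a time. */
static uint64_t eval_branch_location(void)
{
	uint64_t v = 0;              /* DW_OP_lit0: push the literal 0 */

	v = ~v;                      /* DW_OP_not: bitwise complement, all ones */
	v = convert_unsigned(v, 1);  /* DW_OP_convert "DW_ATE_unsigned_1" */
	v = convert_unsigned(v, 8);  /* DW_OP_convert "DW_ATE_unsigned_8" */
	return v;                    /* DW_OP_stack_value: v is the value itself */
}
```

Under these assumptions eval_branch_location() returns 1, so the one-bit type is only an intermediate step in describing a constant and never a type that BTF has to represent.
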
>>
>> Another example,
>>
>> 0x000e97e4: DW_TAG_base_type
>> DW_AT_name ("DW_ATE_unsigned_24")
>> DW_AT_encoding (DW_ATE_unsigned)
>> DW_AT_byte_size (0x03)
>>
>> 0x000f88f8: DW_TAG_variable
>> DW_AT_location (indexed (0x3c) loclist = 0x00008fb0:
>> [0xffffffff82808812, 0xffffffff82808817):
>> DW_OP_breg0 RAX+0,
>> DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
>> DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8",
>> DW_OP_stack_value,
>> DW_OP_piece 0x1,
>> DW_OP_breg0 RAX+0,
>> DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
>> DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
>> DW_OP_lit8,
>> DW_OP_shr,
>> DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
>> DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24",
>> DW_OP_stack_value,
>> DW_OP_piece 0x3
>> ......
>>
>> At one point, a right shift by 8 happens and the result is converted to
>> 32-bit unsigned int and then to 24-bit unsigned int.
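
The two pieces shown above can be sketched in C as follows. This is only an illustration: it assumes that converting to an unsigned N-bit type keeps the low N bits, and that the 1-byte and 3-byte pieces are laid out in little-endian order as on x86-64. The helper names are hypothetical, and the part of the location list elided by "......" is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: converting to an unsigned N-bit type keeps the low N bits. */
static uint64_t convert_unsigned(uint64_t value, unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	if (bits == 64)
		return value;
	return value & ((UINT64_C(1) << bits) - 1);
}

/* First piece (DW_OP_piece 0x1): the low byte of RAX. */
static uint8_t first_piece(uint64_t rax)
{
	uint64_t v = convert_unsigned(rax, 64);  /* "DW_ATE_unsigned_64" */

	return (uint8_t)convert_unsigned(v, 8);  /* "DW_ATE_unsigned_8" */
}

/* Second piece (DW_OP_piece 0x3): bits 8..31 of RAX. */
static uint32_t second_piece(uint64_t rax)
{
	uint64_t v = convert_unsigned(rax, 64);  /* "DW_ATE_unsigned_64" */

	v = convert_unsigned(v, 32);             /* "DW_ATE_unsigned_32" */
	v >>= 8;                                 /* DW_OP_lit8, DW_OP_shr */
	v = convert_unsigned(v, 32);             /* "DW_ATE_unsigned_32" */
	v = convert_unsigned(v, 24);             /* "DW_ATE_unsigned_24" */
	return (uint32_t)v;
}

/* Concatenates both pieces, lowest byte first (little-endian assumption). */
static uint32_t assemble_pieces(uint64_t rax)
{
	return (uint32_t)first_piece(rax) | (second_piece(rax) << 8);
}
```

With these assumptions the two pieces together reproduce the low 32 bits of RAX, so the 24-bit type only describes how the upper three bytes of the variable are carved out of the register.
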
>>
>> BTF does not need any of this DW_OP_* information, and such non-regular int
>> types cause libbpf to emit errors.
>> Let us sanitize them to generate BTF acceptable to libbpf and the kernel.
>>
>> Cc: Sedat Dilek <sedat.dilek@gmail.com>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>> libbtf.c | 39 ++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 38 insertions(+), 1 deletion(-)
>>
>> diff --git a/libbtf.c b/libbtf.c
>> index 9f76283..93fe185 100644
>> --- a/libbtf.c
>> +++ b/libbtf.c
>> @@ -373,6 +373,7 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt,
>> struct btf *btf = btfe->btf;
>> const struct btf_type *t;
>> uint8_t encoding = 0;
>> + uint16_t byte_sz;
>> int32_t id;
>>
>> if (bt->is_signed) {
>> @@ -384,7 +385,43 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt,
>> return -1;
>> }
>>
>> - id = btf__add_int(btf, name, BITS_ROUNDUP_BYTES(bt->bit_size), encoding);
>> + /* dwarf5 may emit DW_ATE_[un]signed_{num} base types where
>> + * {num} is not a power of 2 and may exceed 128. Such attributes
>> + * are mostly used to record operations for an actual parameter
>> + * or variable.
>> + * For example,
>> + * DW_AT_location (indexed (0x3c) loclist = 0x00008fb0:
>> + * [0xffffffff82808812, 0xffffffff82808817):
>> + * DW_OP_breg0 RAX+0,
>> + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
>> + * DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8",
>> + * DW_OP_stack_value,
>> + * DW_OP_piece 0x1,
>> + * DW_OP_breg0 RAX+0,
>> + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
>> + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
>> + * DW_OP_lit8,
>> + * DW_OP_shr,
>> + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
>> + * DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24",
>> + * DW_OP_stack_value, DW_OP_piece 0x3
>> + * DW_AT_name ("ebx")
>> + * DW_AT_decl_file ("/linux/arch/x86/events/intel/core.c")
>> + *
>> + * In the above example, at some point, one unsigned_32 value
>> + * is right shifted by 8 and the result is converted to unsigned_32
>> + * and then unsigned_24.
>> + *
>> + * BTF does not need such DW_OP_* information so let us sanitize
>> + * these non-regular int types to avoid libbpf/kernel complaints.
>> + */
>> + byte_sz = BITS_ROUNDUP_BYTES(bt->bit_size);
>> + if (!byte_sz || (byte_sz & (byte_sz - 1))) {
>> + name = "sanitized_int";
>
> DWARF never stops causing issues :( How about making this name stand
> out a bit more: __SANITIZED_FAKE_INT__ ? Similar in style to
> __ARRAY_INDEX_TYPE__?

Good idea. __SANITIZED_FAKE_INT__ makes it easy to see that
this is some kind of workaround.
I will send v2 soon.

>
> Otherwise looks good to me, even though it's a bit sketchy to just
> "fix up" any integer that doesn't conform to our idea of "normal
> integer". But as I said, DWARF is DWARF...
>
> Acked-by: Andrii Nakryiko <andrii@kernel.org>
>
>> + byte_sz = 4;
>> + }
>> +
>> + id = btf__add_int(btf, name, byte_sz, encoding);
>> if (id < 0) {
>> btf_elf__log_err(btfe, BTF_KIND_INT, name, true, "Error emitting BTF type");
>> } else {
>> --
>> 2.24.1
>>
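
For reference, the size check from the patch can be pulled out into a small standalone C sketch. bits_roundup_bytes() is a local stand-in that I assume matches what BITS_ROUNDUP_BYTES does (round a bit count up to whole bytes), and sanitize_int_size() is a hypothetical name. Neither is part of pahole or libbpf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in, assumed to behave like pahole's BITS_ROUNDUP_BYTES. */
static uint16_t bits_roundup_bytes(uint32_t bit_size)
{
	return (uint16_t)((bit_size + 7) / 8);
}

/*
 * Stores the byte size to emit in *byte_sz and returns true when the
 * original size was replaced, i.e. when it was zero or not a power of 2.
 */
static bool sanitize_int_size(uint32_t bit_size, uint16_t *byte_sz)
{
	uint16_t sz = bits_roundup_bytes(bit_size);

	assert(byte_sz != NULL);
	/* sz & (sz - 1) is zero only when sz has at most one bit set */
	if (sz == 0 || (sz & (sz - 1)) != 0) {
		*byte_sz = 4;	/* same fallback size as in the patch */
		return true;
	}
	*byte_sz = sz;
	return false;
}
```

In this sketch a 24-bit type (3 bytes) and a type with byte size 0 are both replaced by a 4-byte int, while 8, 16, 32, 64 and 128 bits pass through unchanged. Like the check in the patch, it only tests for zero and non-power-of-2 sizes, so a power-of-2 size above 16 bytes would also pass through.
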