From: Yonghong Song <yhs@fb.com>
To: "Jose E. Marchesi" <jose.marchesi@oracle.com>
Cc: Lorenz Bauer <oss@lmb.io>,
andrii@kernel.org, bpf@vger.kernel.org, david.faust@oracle.com
Subject: Re: Signedness of char in BTF
Date: Thu, 21 Jul 2022 15:52:40 -0700
Message-ID: <d56865b1-30dd-8761-2c12-ae5f66778de1@fb.com>
In-Reply-To: <875yjqayyz.fsf@oracle.com>
On 7/21/22 3:21 PM, Jose E. Marchesi wrote:
>
> Hi Yonghong.
>
>> On 7/21/22 7:54 AM, Jose E. Marchesi wrote:
>>>
>>>> Hi Yonghong and Andrii,
>>>>
>>>> I have some questions re: signedness of chars in BTF. According to [1]
>>>> BTF_INT_ENCODING() may be one of SIGNED, CHAR or BOOL.
>>> I have always assumed that the bits in `encoding' are non-exclusive
>>> i.e. it is a bitmap, not an enumerated.
>>
>> Based on current BTF design, it is enumerated. So signed char
>> is 'signed 1-byte int', unsigned char is 'unsigned 1-byte int'
>> and 'char' could be BTF_INT_CHAR but since in debuginfo
>> any 'char' has a signedness bit, so it is folded into
>> 'signed 1-byte int' or 'unsigned 1-byte int'.
>
> Ok, we will change GCC so it does the same thing.
>
> What about BOOL? I don't think we ever use that bit. Does LLVM
> generate it for any case?

Both llvm and pahole generate BTF_INT_BOOL when the DWARF base type
has encoding DW_ATE_boolean.

BTF_INT_BOOL is also actually consumed: libbpf uses it to differentiate
configuration values (CONFIG_* = 'y' vs. CONFIG_* = <value>).

In llvm,

    uint8_t BTFEncoding;
    switch (Encoding) {
    case dwarf::DW_ATE_boolean:
      BTFEncoding = BTF::INT_BOOL;
      break;
    case dwarf::DW_ATE_signed:
    case dwarf::DW_ATE_signed_char:
      BTFEncoding = BTF::INT_SIGNED;
      break;
    case dwarf::DW_ATE_unsigned:
    case dwarf::DW_ATE_unsigned_char:
      BTFEncoding = 0;
      break;
    default:
      llvm_unreachable("Unknown BTFTypeInt Encoding");
    }
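
For anyone who wants to mirror this folding in another producer, here is
a rough C sketch of the same mapping. It is not the llvm code: the helper
names dwarf_to_btf_int_encoding() and btf_int_info() are hypothetical, the
DW_ATE_* values are copied from the DWARF standard, and the BTF_INT_*
values and the bit layout are copied from include/uapi/linux/btf.h.
Unlike llvm, which treats other encodings as unreachable, the sketch
returns -1 for them.

```c
#include <stdint.h>

/* DWARF base-type encodings (values from the DWARF standard). */
enum {
	DW_ATE_boolean       = 0x02,
	DW_ATE_signed        = 0x05,
	DW_ATE_signed_char   = 0x06,
	DW_ATE_unsigned      = 0x07,
	DW_ATE_unsigned_char = 0x08,
};

/* BTF int encodings, mirroring include/uapi/linux/btf.h. */
#define BTF_INT_SIGNED	(1 << 0)
#define BTF_INT_CHAR	(1 << 1)
#define BTF_INT_BOOL	(1 << 2)

/*
 * Hypothetical helper: fold a DWARF encoding into a single BTF int
 * encoding. Both char encodings collapse into plain signed/unsigned,
 * so BTF_INT_CHAR is never produced here.
 * Returns 0 on success, -1 for an encoding this sketch does not handle.
 */
static int dwarf_to_btf_int_encoding(unsigned int dw_ate, uint8_t *btf_enc)
{
	switch (dw_ate) {
	case DW_ATE_boolean:
		*btf_enc = BTF_INT_BOOL;
		return 0;
	case DW_ATE_signed:
	case DW_ATE_signed_char:
		*btf_enc = BTF_INT_SIGNED;
		return 0;
	case DW_ATE_unsigned:
	case DW_ATE_unsigned_char:
		*btf_enc = 0;
		return 0;
	default:
		return -1;
	}
}

/*
 * Hypothetical helper: pack the 32-bit word that follows struct btf_type
 * for BTF_KIND_INT (encoding in bits 24-27, offset in bits 16-23,
 * nr_bits in bits 0-7).
 */
static uint32_t btf_int_info(uint8_t enc, uint8_t offset, uint8_t bits)
{
	return ((uint32_t)(enc & 0x0f) << 24) | ((uint32_t)offset << 16) | bits;
}
```

With these definitions, a 1-byte DW_ATE_signed_char type ends up with
encoding BTF_INT_SIGNED and an 8-bit _Bool packs to 0x04000008.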

For a concrete example,

[$ ~/tmp1] cat t.c
int test(_Bool g) {
  return g;
}
[$ ~/tmp1] clang -target bpf -O2 -g -c t.c
[$ ~/tmp1] bpftool btf dump file t.o
[1] INT '_Bool' size=1 bits_offset=0 nr_bits=8 encoding=BOOL
[2] FUNC_PROTO '(anon)' ret_type_id=3 vlen=1
        'g' type_id=1
[3] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[4] FUNC 'test' type_id=2 linkage=global
[$ ~/tmp1]
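
To make the "enumerated, not bitmap" reading concrete, here is a small C
sketch of how a consumer could decode the 32-bit INT info word into a
line like the ones in the dump above. The BTF_INT_* macros are copied
from include/uapi/linux/btf.h. The helpers int_encoding_str() and
format_btf_int() are hypothetical names, and the "UNKN" fallback for any
other value is my assumption about what a tool would print, not a
guarantee about bpftool.

```c
#include <stdint.h>
#include <stdio.h>

/* Mirrors include/uapi/linux/btf.h. */
#define BTF_INT_ENCODING(VAL)	(((VAL) & 0x0f000000) >> 24)
#define BTF_INT_OFFSET(VAL)	(((VAL) & 0x00ff0000) >> 16)
#define BTF_INT_BITS(VAL)	((VAL) & 0x000000ff)
#define BTF_INT_SIGNED		(1 << 0)
#define BTF_INT_CHAR		(1 << 1)
#define BTF_INT_BOOL		(1 << 2)

/*
 * Hypothetical helper: treat the encoding as an enumerated value.
 * A combination such as SIGNED|CHAR matches no case and is reported
 * as unknown (assumed fallback string).
 */
static const char *int_encoding_str(uint32_t enc)
{
	switch (enc) {
	case 0:
		return "(none)";
	case BTF_INT_SIGNED:
		return "SIGNED";
	case BTF_INT_CHAR:
		return "CHAR";
	case BTF_INT_BOOL:
		return "BOOL";
	default:
		return "UNKN";
	}
}

/* Hypothetical helper: format one INT type in a dump-like style. */
static int format_btf_int(char *buf, size_t len, const char *name,
			  uint32_t size, uint32_t info)
{
	return snprintf(buf, len,
			"INT '%s' size=%u bits_offset=%u nr_bits=%u encoding=%s",
			name, (unsigned int)size,
			(unsigned int)BTF_INT_OFFSET(info),
			(unsigned int)BTF_INT_BITS(info),
			int_encoding_str(BTF_INT_ENCODING(info)));
}
```

Passing name "_Bool", size 1 and info word 0x04000008 yields the same
text as entry [1] above (minus the type id), while an encoding with both
SIGNED and CHAR set falls through to the unknown case under this reading.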
>
>>>> If I read [2] correctly the signedness of char is implementation
>>>> defined. Does this mean that I need to know which implementation
>>>> generated the BTF to interpret CHAR correctly?
>>>>
>>>> Somewhat related, how to I make clang emit BTF_INT_CHAR in the first
>>>> place? I've tried with clang-14, but only ever get
>>>>
>>>> [6] INT 'unsigned char' size=1 bits_offset=0 nr_bits=8 encoding=(none)
>>>> [6] INT 'char' size=1 bits_offset=0 nr_bits=8 encoding=SIGNED
>>> Hm, in GCC we currently generate:
>>> [1] int 'unsigned char'(0x00000001U#B) size=0x00000001U#B
>>> offset=0x00UB#b bits=0x08UB#b CHAR
>>> [2] int 'char'(0x00000001U#B) size=0x00000001U#B offset=0x00UB#b bits=0x08UB#b SIGNED CHAR
>>> Which turns out is not correct?
>>> We used a signed type for `char' because that was what the LLVM BPF
>>> toolchain uses, but then we assumed we had to emit the CHAR bit as
>>> well... wrong assumption apparently (I just tried with clang 15 and it
>>> doesn't set the CHAR bits for neither `char' nor `unsigned char').
>>> But then what is the CHAR bit for?
>>
>> This is not generated by llvm or pahole but apparently it may still
>> have some meaning when printing the value, a 'char c' may have
>> a dump like 'c' instead of '0x63'. In kernel/bpf/btf.c, we have
>>
>> 	/*
>> 	 * BTF_INT_CHAR encoding never seems to be set for
>> 	 * char arrays, so if size is 1 and element is
>> 	 * printable as a char, we'll do that.
>> 	 */
>> 	if (elem_size == 1)
>> 		encoding = BTF_INT_CHAR;
>>
>>>
>>>> The kernel seems to agree that CHAR isn't a thing [3].
>>>>
>>>> Thanks!
>>>> Lorenz
>>>>
>>>> 1: https://www.kernel.org/doc/html/latest/bpf/btf.html#btf-kind-int
>>>> 2: https://stackoverflow.com/a/2054941/19544965
>>>> 3:
>>>> https://sourcegraph.com/github.com/torvalds/linux@353f7988dd8413c47718f7ca79c030b6fb62cfe5/-/blob/kernel/bpf/btf.c?L2928-2934

Thread overview: 9+ messages
2022-07-21 14:31 Signedness of char in BTF Lorenz Bauer
2022-07-21 14:54 ` Jose E. Marchesi
2022-07-21 18:44 ` Yonghong Song
2022-07-21 22:21 ` Jose E. Marchesi
2022-07-21 22:52 ` Yonghong Song [this message]
2022-07-22 11:25 ` Jose E. Marchesi
2022-07-22 15:59 ` Yonghong Song
2022-08-02 17:28 ` Jose E. Marchesi
2022-07-21 18:35 ` Yonghong Song