BPF List
From: Yonghong Song <yhs@fb.com>
To: "Jose E. Marchesi" <jose.marchesi@oracle.com>
Cc: Lorenz Bauer <oss@lmb.io>,
	andrii@kernel.org, bpf@vger.kernel.org, david.faust@oracle.com
Subject: Re: Signedness of char in BTF
Date: Thu, 21 Jul 2022 15:52:40 -0700
Message-ID: <d56865b1-30dd-8761-2c12-ae5f66778de1@fb.com>
In-Reply-To: <875yjqayyz.fsf@oracle.com>



On 7/21/22 3:21 PM, Jose E. Marchesi wrote:
> 
> Hi Yonghong.
> 
>> On 7/21/22 7:54 AM, Jose E. Marchesi wrote:
>>>
>>>> Hi Yonghong and Andrii,
>>>>
>>>> I have some questions re: signedness of chars in BTF. According to [1]
>>>> BTF_INT_ENCODING() may be one of SIGNED, CHAR or BOOL.
>>> I have always assumed that the bits in `encoding' are non-exclusive
>>> i.e. it is a bitmap, not an enumerated.
>>
>> Based on current BTF design, it is enumerated. So signed char
>> is 'signed 1-byte int', unsigned char is 'unsigned 1-byte int'
>> and 'char' could be BTF_INT_CHAR, but since any 'char' in debuginfo
>> carries a signedness bit, it is folded into
>> 'signed 1-byte int' or 'unsigned 1-byte int'.
> 
> Ok, we will change GCC so it does the same thing.
> 
> What about BOOL?  I don't think we ever use that bit.  Does LLVM
> generate it for any case?

Both llvm and pahole generate BTF_INT_BOOL when the DWARF base type has
encoding DW_ATE_boolean.
BTF_INT_BOOL is also used in libbpf to differentiate
configuration values (CONFIG_* = 'y' vs. CONFIG_* = <value>).
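
To illustrate why the BOOL encoding matters to a loader, here is a minimal
sketch of how a consumer might classify an integer-typed config variable from
its BTF size and encoding. It is not libbpf's actual code: the enum cfg_kind
and the function classify_int() are hypothetical names made up for this
example, and the BTF_INT_* values are copied locally so the snippet builds on
its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Encoding values as defined in include/uapi/linux/btf.h, copied here
 * so the sketch is standalone. */
#define BTF_INT_SIGNED  (1 << 0)
#define BTF_INT_CHAR    (1 << 1)
#define BTF_INT_BOOL    (1 << 2)

/* Hypothetical classification, for illustration only. */
enum cfg_kind { CFG_UNKNOWN, CFG_BOOL, CFG_INT };

/*
 * Decide how a config value should be interpreted:
 *  - BOOL encoding with size 1: a yes/no option (CONFIG_* = 'y')
 *  - otherwise a power-of-two sized integer up to 8 bytes
 *    (CONFIG_* = <value>), with signedness taken from the encoding
 */
static enum cfg_kind classify_int(uint32_t size, uint32_t encoding,
				  bool *is_signed)
{
	if (encoding & BTF_INT_BOOL)
		return size == 1 ? CFG_BOOL : CFG_UNKNOWN;

	/* Reject zero, oversized, and non-power-of-two sizes. */
	if (size == 0 || size > 8 || (size & (size - 1)))
		return CFG_UNKNOWN;

	if (is_signed)
		*is_signed = (encoding & BTF_INT_SIGNED) != 0;
	return CFG_INT;
}
```

With this sketch, a '_Bool' variable (size 1, encoding BOOL) comes out as
CFG_BOOL, while an 'int' (size 4, encoding SIGNED) comes out as CFG_INT with
*is_signed set to true.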

In llvm,
   uint8_t BTFEncoding;
   switch (Encoding) {
   case dwarf::DW_ATE_boolean:
     BTFEncoding = BTF::INT_BOOL;
     break;
   case dwarf::DW_ATE_signed:
   case dwarf::DW_ATE_signed_char:
     BTFEncoding = BTF::INT_SIGNED;
     break;
   case dwarf::DW_ATE_unsigned:
   case dwarf::DW_ATE_unsigned_char:
     BTFEncoding = 0;
     break;
   default:
     llvm_unreachable("Unknown BTFTypeInt Encoding");
   }
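
On the consumer side, the enumerated reading can be sketched as follows. This
is only an illustration: the macros mirror the layout in
include/uapi/linux/btf.h but are redefined locally, and int_encoding_name()
and make_int_data() are hypothetical helpers, not existing API. Note that a
combined SIGNED|CHAR value (what GCC emitted earlier in this thread) falls
into the default case when the field is treated as an enumeration.

```c
#include <stdint.h>

/* Layout of the 32-bit word that follows struct btf_type for
 * BTF_KIND_INT, as in include/uapi/linux/btf.h. */
#define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
#define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
#define BTF_INT_BITS(VAL)       ((VAL) & 0x000000ff)

#define BTF_INT_SIGNED  (1 << 0)
#define BTF_INT_CHAR    (1 << 1)
#define BTF_INT_BOOL    (1 << 2)

/* Hypothetical helper: pack encoding, bit offset and bit width. */
static uint32_t make_int_data(uint32_t encoding, uint32_t offset,
			      uint32_t bits)
{
	return (encoding << 24) | (offset << 16) | bits;
}

/* Hypothetical helper: name the encoding, treating it as an
 * enumerated value rather than a bitmap. */
static const char *int_encoding_name(uint32_t int_data)
{
	switch (BTF_INT_ENCODING(int_data)) {
	case 0:
		return "(none)";
	case BTF_INT_SIGNED:
		return "SIGNED";
	case BTF_INT_CHAR:
		return "CHAR";
	case BTF_INT_BOOL:
		return "BOOL";
	default:
		/* Any combination of bits, e.g. SIGNED|CHAR. */
		return "UNKN";
	}
}
```

For instance, make_int_data(BTF_INT_BOOL, 0, 8) yields 0x04000008, which this
helper names "BOOL"; make_int_data(BTF_INT_SIGNED | BTF_INT_CHAR, 0, 8) is
named "UNKN".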

For a concrete example,

[$ ~/tmp1] cat t.c
int test(_Bool g) {
    return g;
}
[$ ~/tmp1] clang -target bpf -O2 -g -c t.c
[$ ~/tmp1] bpftool btf dump file t.o
[1] INT '_Bool' size=1 bits_offset=0 nr_bits=8 encoding=BOOL
[2] FUNC_PROTO '(anon)' ret_type_id=3 vlen=1
         'g' type_id=1
[3] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[4] FUNC 'test' type_id=2 linkage=global
[$ ~/tmp1]

> 
>>>> If I read [2] correctly the signedness of char is implementation
>>>> defined. Does this mean that I need to know which implementation
>>>> generated the BTF to interpret CHAR correctly?
>>>>
>>>> Somewhat related, how do I make clang emit BTF_INT_CHAR in the first
>>>> place? I've tried with clang-14, but only ever get
>>>>
>>>>       [6] INT 'unsigned char' size=1 bits_offset=0 nr_bits=8 encoding=(none)
>>>>       [6] INT 'char' size=1 bits_offset=0 nr_bits=8 encoding=SIGNED
>>> Hm, in GCC we currently generate:
>>> [1] int 'unsigned char'(0x00000001U#B) size=0x00000001U#B offset=0x00UB#b bits=0x08UB#b CHAR
>>> [2] int 'char'(0x00000001U#B) size=0x00000001U#B offset=0x00UB#b bits=0x08UB#b SIGNED CHAR
>>> Which turns out is not correct?
>>> We used a signed type for `char' because that was what the LLVM BPF
>>> toolchain uses, but then we assumed we had to emit the CHAR bit as
>>> well... wrong assumption apparently (I just tried with clang 15 and it
>>> sets the CHAR bit for neither `char' nor `unsigned char').
>>> But then what is the CHAR bit for?
>>
>> This is not generated by llvm or pahole, but apparently it may still
>> have some meaning when printing the value: a 'char c' may be
>> dumped as 'c' instead of '0x63'. In kernel/bpf/btf.c, we have
>>
>>                  /*
>>                   * BTF_INT_CHAR encoding never seems to be set for
>>                   * char arrays, so if size is 1 and element is
>>                   * printable as a char, we'll do that.
>>                   */
>>                  if (elem_size == 1)
>>                          encoding = BTF_INT_CHAR;
>>
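
The effect described above (showing 'c' rather than '0x63' for a 1-byte
element) can be sketched in userspace as below. This is an assumption-laden
illustration, not the kernel's implementation: format_byte() is a hypothetical
helper, and the exact output format is made up for the example.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical formatter for a single 1-byte element: render it as a
 * quoted character when it is printable, and as a hex value otherwise.
 * Returns what snprintf() returns.
 */
static int format_byte(unsigned char v, char *buf, size_t len)
{
	if (isprint(v))
		return snprintf(buf, len, "'%c'", v);
	return snprintf(buf, len, "0x%02x", v);
}
```

Called with 0x63 this writes 'c' (with the quotes) into buf; called with 0x01
it writes 0x01.
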
>>>
>>>> The kernel seems to agree that CHAR isn't a thing [3].
>>>>
>>>> Thanks!
>>>> Lorenz
>>>>
>>>> 1: https://www.kernel.org/doc/html/latest/bpf/btf.html#btf-kind-int
>>>> 2: https://stackoverflow.com/a/2054941/19544965
>>>> 3:
>>>> https://sourcegraph.com/github.com/torvalds/linux@353f7988dd8413c47718f7ca79c030b6fb62cfe5/-/blob/kernel/bpf/btf.c?L2928-2934

Thread overview: 9+ messages
2022-07-21 14:31 Signedness of char in BTF Lorenz Bauer
2022-07-21 14:54 ` Jose E. Marchesi
2022-07-21 18:44   ` Yonghong Song
2022-07-21 22:21     ` Jose E. Marchesi
2022-07-21 22:52       ` Yonghong Song [this message]
2022-07-22 11:25         ` Jose E. Marchesi
2022-07-22 15:59           ` Yonghong Song
2022-08-02 17:28           ` Jose E. Marchesi
2022-07-21 18:35 ` Yonghong Song
