From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Brian Foster <bfoster@redhat.com>,
Kees Cook <keescook@chromium.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-bcachefs@vger.kernel.org,
Alexander Potapenko <glider@google.com>,
Marco Elver <elver@google.com>,
Dmitry Vyukov <dvyukov@google.com>,
kasan-dev@googlegroups.com, Nathan Chancellor <nathan@kernel.org>,
Nick Desaulniers <ndesaulniers@google.com>,
Bill Wendling <morbo@google.com>,
Justin Stitt <justinstitt@google.com>,
llvm@lists.linux.dev
Subject: Re: Use of zero-length arrays in bcachefs structures inner fields
Date: Fri, 24 May 2024 12:04:11 -0400
Message-ID: <7236a148-c513-4053-9778-0bce6657e358@efficios.com>
In-Reply-To: <944d79b5-177d-43ea-a130-25bd62fc787f@efficios.com>
On 2024-05-24 11:35, Mathieu Desnoyers wrote:
> [ Adding clang/llvm and KMSAN maintainers/reviewers in CC. ]
>
> On 2024-05-24 11:28, Kent Overstreet wrote:
>> On Thu, May 23, 2024 at 01:53:42PM -0400, Mathieu Desnoyers wrote:
>>> Hi Kent,
>>>
>>> Looking around in the bcachefs code for possible causes of this KMSAN
>>> bug report:
>>>
>>> https://lore.kernel.org/lkml/000000000000fd5e7006191f78dc@google.com/
>>>
>>> I notice the following pattern in the bcachefs structures: zero-length
>>> array members are inserted in structures (not always at the end),
>>> seemingly to achieve a result similar to what could be done with a
>>> union:
>>>
>>> fs/bcachefs/bcachefs_format.h:
>>>
>>> struct bkey_packed {
>>> 	__u64 _data[0];
>>>
>>> 	/* Size of combined key and value, in u64s */
>>> 	__u8 u64s;
>>> 	[...]
>>> };
>>>
>>> likewise:
>>>
>>> struct bkey_i {
>>> 	__u64 _data[0];
>>>
>>> 	struct bkey k;
>>> 	struct bch_val v;
>>> };
>>>
>>> (and there are many more examples of this pattern in bcachefs)
>>>
>>> AFAIK, the C11 standard states that array declarator constant expression
>>> delimited by [ ] shall have a value greater than zero.
>>>
>>> Effectively, we can verify that this code triggers an undefined behavior
>>> with:
>>>
>>> #include <stdio.h>
>>>
>>> struct z {
>>> 	int x[0];
>>> 	int y;
>>> 	int z;
>>> } __attribute__((packed));
>>>
>>> int main(void)
>>> {
>>> 	struct z a;
>>>
>>> 	a.y = 1;
>>> 	printf("%d\n", a.x[0]);
>>> }
>>
>> Yet another example of the C people going absolutely nutty with
>> everything being undefined. Look, this isn't ok, we need to get work
>> done, and I've already wasted entirely too much time on ZLA vs. flex
>> array member nonsense.
>>
>> There's a bunch of legit uses for zero length arrays, and your example,
>> where we're not even _assigning_ to x, is just batshit. Someone needs to
>> get his head examined.
Notice how a.y is first set to 1, then a.x[0] is loaded, expecting to
alias with a.y.

The same aliasing pattern is found in bcachefs, for instance here:

bcachefs_format.h:

struct jset {
	[...]
	__u8 encrypted_start[0];

	__le16 _read_clock; /* no longer used */
	__le16 _write_clock;

	/* Sequence number of oldest dirty journal entry */
	__le64 last_seq;

	struct jset_entry start[0];
	__u64 _data[];
} __packed __aligned(8);

where the last_seq field of struct jset is set by jset_validate():

	jset->last_seq = jset->seq;

and where journal_read_bucket() uses the encrypted_start member as input:

	ret = bch2_encrypt(c, JSET_CSUM_TYPE(j), journal_nonce(j),
			   j->encrypted_start,
			   vstruct_end(j) - (void *) j->encrypted_start);
Regards,
Mathieu
>>
>>> So I wonder if the issue reported by KMSAN could be caused by this
>>> pattern ?
>>
>> Possibly; the KMSAN errors I've been looking at do look suspicious. But
>> it sounds like we need a real fix that involves defining proper
>> semantics, not compiler folks giving up and saying 'aiee!'.
>>
>> IOW, clang/KMSAN are broken if they simply choke on a zero length array
>> being present.
>
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com