public inbox for linux-bcachefs@vger.kernel.org
 help / color / mirror / Atom feed
* Use of zero-length arrays in bcachefs structures inner fields
@ 2024-05-23 17:53 Mathieu Desnoyers
  2024-05-24 15:28 ` Kent Overstreet
  0 siblings, 1 reply; 8+ messages in thread
From: Mathieu Desnoyers @ 2024-05-23 17:53 UTC (permalink / raw)
  To: Kent Overstreet, Brian Foster; +Cc: Kees Cook, linux-kernel, linux-bcachefs

Hi Kent,

Looking around in the bcachefs code for possible causes of this KMSAN
bug report:

https://lore.kernel.org/lkml/000000000000fd5e7006191f78dc@google.com/

I notice the following pattern in the bcachefs structures: zero-length
arrays members are inserted in structures (not always at the end),
seemingly to achieve a result similar to what could be done with a
union:

fs/bcachefs/bcachefs_format.h:

struct bkey_packed {
         __u64           _data[0];

         /* Size of combined key and value, in u64s */
         __u8            u64s;
[...]
};

likewise:

struct bkey_i {
         __u64                   _data[0];

         struct bkey     k;
         struct bch_val  v;
};

(and there are many more examples of this pattern in bcachefs)

AFAIK, the C11 standard states that array declarator constant expression
delimited by [ ] shall have a value greater than zero.

Effectively, we can verify that this code triggers an undefined behavior
with:

#include <stdio.h>

struct z {
         int x[0];
         int y;
         int z;
} __attribute__((packed));

int main(void)
{
         struct z a;

         a.y = 1;
         printf("%d\n", a.x[0]);
}

clang-15 -fsanitize=undefined -o a a.c
./a
a.c:14:17: runtime error: index 0 out of bounds for type 'int[0]'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior a.c:14:17 in
1

Also, gcc warns that ISO C forbids zero-size arrays when compiling
with -pedantic:

gcc -std=c11 -pedantic -o a a.c
a.c:4:13: warning: ISO C forbids zero-size array ‘x’ [-Wpedantic]
     4 |         int x[0];

And clang states that this is only supported as an extension, even though
accessing it seems to be classified as an undefined behavior by UBSAN.

clang-15 -std=c11 -pedantic -o a a.c
a.c:4:8: warning: zero size arrays are an extension [-Wzero-length-array]
         int x[0];

So I wonder if the issue reported by KMSAN could be caused by this
pattern ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Use of zero-length arrays in bcachefs structures inner fields
  2024-05-23 17:53 Use of zero-length arrays in bcachefs structures inner fields Mathieu Desnoyers
@ 2024-05-24 15:28 ` Kent Overstreet
  2024-05-24 15:35   ` Mathieu Desnoyers
  0 siblings, 1 reply; 8+ messages in thread
From: Kent Overstreet @ 2024-05-24 15:28 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Brian Foster, Kees Cook, linux-kernel, linux-bcachefs

On Thu, May 23, 2024 at 01:53:42PM -0400, Mathieu Desnoyers wrote:
> Hi Kent,
> 
> Looking around in the bcachefs code for possible causes of this KMSAN
> bug report:
> 
> https://lore.kernel.org/lkml/000000000000fd5e7006191f78dc@google.com/
> 
> I notice the following pattern in the bcachefs structures: zero-length
> arrays members are inserted in structures (not always at the end),
> seemingly to achieve a result similar to what could be done with a
> union:
> 
> fs/bcachefs/bcachefs_format.h:
> 
> struct bkey_packed {
>         __u64           _data[0];
> 
>         /* Size of combined key and value, in u64s */
>         __u8            u64s;
> [...]
> };
> 
> likewise:
> 
> struct bkey_i {
>         __u64                   _data[0];
> 
>         struct bkey     k;
>         struct bch_val  v;
> };
> 
> (and there are many more examples of this pattern in bcachefs)
> 
> AFAIK, the C11 standard states that array declarator constant expression
> 
> Effectively, we can verify that this code triggers an undefined behavior
> with:
> 
> #include <stdio.h>
> 
> struct z {
>         int x[0];
>         int y;
>         int z;
> } __attribute__((packed));
> 
> int main(void)
> {
>         struct z a;
> 
>         a.y = 1;
>         printf("%d\n", a.x[0]);
> }
> delimited by [ ] shall have a value greater than zero.

Yet another example of the C people going absolutely nutty with
everything being undefined. Look, this isn't ok, we need to get work
done, and I've already wasted entirely too much time on ZLA vs. flex
array member nonsense.

There's a bunch of legit uses for zero length arrays, and your example,
where we're not even _assigning_ to x, is just batshit. Someone needs to
get his head examined.

> So I wonder if the issue reported by KMSAN could be caused by this
> pattern ?

Possibly; the KMSAN errors I've been looking at do look suspicious. But
it sounds like we need a real fix that involves defining proper
semantics, not compiler folks giving up and saying 'aiee!'.

IOW, clang/KMSAN are broken if they simply choke on a zero length array
being present.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Use of zero-length arrays in bcachefs structures inner fields
  2024-05-24 15:28 ` Kent Overstreet
@ 2024-05-24 15:35   ` Mathieu Desnoyers
  2024-05-24 16:04     ` Mathieu Desnoyers
  0 siblings, 1 reply; 8+ messages in thread
From: Mathieu Desnoyers @ 2024-05-24 15:35 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Brian Foster, Kees Cook, linux-kernel, linux-bcachefs,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, kasan-dev,
	Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt,
	llvm

[ Adding clang/llvm and KMSAN maintainers/reviewers in CC. ]

On 2024-05-24 11:28, Kent Overstreet wrote:
> On Thu, May 23, 2024 at 01:53:42PM -0400, Mathieu Desnoyers wrote:
>> Hi Kent,
>>
>> Looking around in the bcachefs code for possible causes of this KMSAN
>> bug report:
>>
>> https://lore.kernel.org/lkml/000000000000fd5e7006191f78dc@google.com/
>>
>> I notice the following pattern in the bcachefs structures: zero-length
>> arrays members are inserted in structures (not always at the end),
>> seemingly to achieve a result similar to what could be done with a
>> union:
>>
>> fs/bcachefs/bcachefs_format.h:
>>
>> struct bkey_packed {
>>          __u64           _data[0];
>>
>>          /* Size of combined key and value, in u64s */
>>          __u8            u64s;
>> [...]
>> };
>>
>> likewise:
>>
>> struct bkey_i {
>>          __u64                   _data[0];
>>
>>          struct bkey     k;
>>          struct bch_val  v;
>> };
>>
>> (and there are many more examples of this pattern in bcachefs)
>>
>> AFAIK, the C11 standard states that array declarator constant expression
>>
>> Effectively, we can verify that this code triggers an undefined behavior
>> with:
>>
>> #include <stdio.h>
>>
>> struct z {
>>          int x[0];
>>          int y;
>>          int z;
>> } __attribute__((packed));
>>
>> int main(void)
>> {
>>          struct z a;
>>
>>          a.y = 1;
>>          printf("%d\n", a.x[0]);
>> }
>> delimited by [ ] shall have a value greater than zero.
> 
> Yet another example of the C people going absolutely nutty with
> everything being undefined. Look, this isn't ok, we need to get work
> done, and I've already wasted entirely too much time on ZLA vs. flex
> array member nonsense.
> 
> There's a bunch of legit uses for zero length arrays, and your example,
> where we're not even _assigning_ to x, is just batshit. Someone needs to
> get his head examined.
> 
>> So I wonder if the issue reported by KMSAN could be caused by this
>> pattern ?
> 
> Possibly; the KMSAN errors I've been looking at do look suspicious. But
> it sounds like we need a real fix that involves defining proper
> semantics, not compiler folks giving up and saying 'aiee!'.
> 
> IOW, clang/KMSAN are broken if they simply choke on a zero length array
> being present.

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Use of zero-length arrays in bcachefs structures inner fields
  2024-05-24 15:35   ` Mathieu Desnoyers
@ 2024-05-24 16:04     ` Mathieu Desnoyers
  2024-05-24 17:30       ` Kent Overstreet
  0 siblings, 1 reply; 8+ messages in thread
From: Mathieu Desnoyers @ 2024-05-24 16:04 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Brian Foster, Kees Cook, linux-kernel, linux-bcachefs,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, kasan-dev,
	Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt,
	llvm

On 2024-05-24 11:35, Mathieu Desnoyers wrote:
> [ Adding clang/llvm and KMSAN maintainers/reviewers in CC. ]
> 
> On 2024-05-24 11:28, Kent Overstreet wrote:
>> On Thu, May 23, 2024 at 01:53:42PM -0400, Mathieu Desnoyers wrote:
>>> Hi Kent,
>>>
>>> Looking around in the bcachefs code for possible causes of this KMSAN
>>> bug report:
>>>
>>> https://lore.kernel.org/lkml/000000000000fd5e7006191f78dc@google.com/
>>>
>>> I notice the following pattern in the bcachefs structures: zero-length
>>> arrays members are inserted in structures (not always at the end),
>>> seemingly to achieve a result similar to what could be done with a
>>> union:
>>>
>>> fs/bcachefs/bcachefs_format.h:
>>>
>>> struct bkey_packed {
>>>          __u64           _data[0];
>>>
>>>          /* Size of combined key and value, in u64s */
>>>          __u8            u64s;
>>> [...]
>>> };
>>>
>>> likewise:
>>>
>>> struct bkey_i {
>>>          __u64                   _data[0];
>>>
>>>          struct bkey     k;
>>>          struct bch_val  v;
>>> };
>>>
>>> (and there are many more examples of this pattern in bcachefs)
>>>
>>> AFAIK, the C11 standard states that array declarator constant expression
>>>
>>> Effectively, we can verify that this code triggers an undefined behavior
>>> with:
>>>
>>> #include <stdio.h>
>>>
>>> struct z {
>>>          int x[0];
>>>          int y;
>>>          int z;
>>> } __attribute__((packed));
>>>
>>> int main(void)
>>> {
>>>          struct z a;
>>>
>>>          a.y = 1;
>>>          printf("%d\n", a.x[0]);
>>> }
>>> delimited by [ ] shall have a value greater than zero.
>>
>> Yet another example of the C people going absolutely nutty with
>> everything being undefined. Look, this isn't ok, we need to get work
>> done, and I've already wasted entirely too much time on ZLA vs. flex
>> array member nonsense.
>>
>> There's a bunch of legit uses for zero length arrays, and your example,
>> where we're not even _assigning_ to x, is just batshit. Someone needs to
>> get his head examined.

Notice how a.y is first set to 1, then a.x[0] is loaded, expecting to
alias with a.y.

This is the same aliasing pattern found in bcachefs, for instance here:

bcachefs_format.h:

struct jset {
[...]
         __u8                    encrypted_start[0];

         __le16                  _read_clock; /* no longer used */
         __le16                  _write_clock;

         /* Sequence number of oldest dirty journal entry */
         __le64                  last_seq;


         struct jset_entry       start[0];
         __u64                   _data[];
} __packed __aligned(8);

where struct jset last_seq field is set by jset_validate():

		jset->last_seq = jset->seq;

and where journal_read_bucket() uses the encrypted_start member as input:

                 ret = bch2_encrypt(c, JSET_CSUM_TYPE(j), journal_nonce(j),
                              j->encrypted_start,
                              vstruct_end(j) - (void *) j->encrypted_start);

Regards,

Mathieu


>>
>>> So I wonder if the issue reported by KMSAN could be caused by this
>>> pattern ?
>>
>> Possibly; the KMSAN errors I've been looking at do look suspicious. But
>> it sounds like we need a real fix that involves defining proper
>> semantics, not compiler folks giving up and saying 'aiee!'.
>>
>> IOW, clang/KMSAN are broken if they simply choke on a zero length array
>> being present.
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Use of zero-length arrays in bcachefs structures inner fields
  2024-05-24 16:04     ` Mathieu Desnoyers
@ 2024-05-24 17:30       ` Kent Overstreet
  2024-05-28 11:36         ` Alexander Potapenko
  0 siblings, 1 reply; 8+ messages in thread
From: Kent Overstreet @ 2024-05-24 17:30 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Brian Foster, Kees Cook, linux-kernel, linux-bcachefs,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, kasan-dev,
	Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt,
	llvm

On Fri, May 24, 2024 at 12:04:11PM -0400, Mathieu Desnoyers wrote:
> On 2024-05-24 11:35, Mathieu Desnoyers wrote:
> > [ Adding clang/llvm and KMSAN maintainers/reviewers in CC. ]
> > 
> > On 2024-05-24 11:28, Kent Overstreet wrote:
> > > On Thu, May 23, 2024 at 01:53:42PM -0400, Mathieu Desnoyers wrote:
> > > > Hi Kent,
> > > > 
> > > > Looking around in the bcachefs code for possible causes of this KMSAN
> > > > bug report:
> > > > 
> > > > https://lore.kernel.org/lkml/000000000000fd5e7006191f78dc@google.com/
> > > > 
> > > > I notice the following pattern in the bcachefs structures: zero-length
> > > > arrays members are inserted in structures (not always at the end),
> > > > seemingly to achieve a result similar to what could be done with a
> > > > union:
> > > > 
> > > > fs/bcachefs/bcachefs_format.h:
> > > > 
> > > > struct bkey_packed {
> > > >          __u64           _data[0];
> > > > 
> > > >          /* Size of combined key and value, in u64s */
> > > >          __u8            u64s;
> > > > [...]
> > > > };
> > > > 
> > > > likewise:
> > > > 
> > > > struct bkey_i {
> > > >          __u64                   _data[0];
> > > > 
> > > >          struct bkey     k;
> > > >          struct bch_val  v;
> > > > };
> > > > 
> > > > (and there are many more examples of this pattern in bcachefs)
> > > > 
> > > > AFAIK, the C11 standard states that array declarator constant expression
> > > > 
> > > > Effectively, we can verify that this code triggers an undefined behavior
> > > > with:
> > > > 
> > > > #include <stdio.h>
> > > > 
> > > > struct z {
> > > >          int x[0];
> > > >          int y;
> > > >          int z;
> > > > } __attribute__((packed));
> > > > 
> > > > int main(void)
> > > > {
> > > >          struct z a;
> > > > 
> > > >          a.y = 1;
> > > >          printf("%d\n", a.x[0]);
> > > > }
> > > > delimited by [ ] shall have a value greater than zero.
> > > 
> > > Yet another example of the C people going absolutely nutty with
> > > everything being undefined. Look, this isn't ok, we need to get work
> > > done, and I've already wasted entirely too much time on ZLA vs. flex
> > > array member nonsense.
> > > 
> > > There's a bunch of legit uses for zero length arrays, and your example,
> > > where we're not even _assigning_ to x, is just batshit. Someone needs to
> > > get his head examined.
> 
> Notice how a.y is first set to 1, then a.x[0] is loaded, expecting to
> alias with a.y.
> 
> This is the same aliasing pattern found in bcachefs, for instance here:
> 
> bcachefs_format.h:
> 
> struct jset {
> [...]
>         __u8                    encrypted_start[0];
> 
>         __le16                  _read_clock; /* no longer used */
>         __le16                  _write_clock;
> 
>         /* Sequence number of oldest dirty journal entry */
>         __le64                  last_seq;
> 
> 
>         struct jset_entry       start[0];
>         __u64                   _data[];
> } __packed __aligned(8);
> 
> where struct jset last_seq field is set by jset_validate():
> 
> 		jset->last_seq = jset->seq;
> 
> and where journal_read_bucket() uses the encrypted_start member as input:
> 
>                 ret = bch2_encrypt(c, JSET_CSUM_TYPE(j), journal_nonce(j),
>                              j->encrypted_start,
>                              vstruct_end(j) - (void *) j->encrypted_start);

Except we're just using it as a marker for an offset into the struct,
the same "aliasing" issue would apply if we were just using offsetof()
to calculate the offsets directly.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Use of zero-length arrays in bcachefs structures inner fields
  2024-05-24 17:30       ` Kent Overstreet
@ 2024-05-28 11:36         ` Alexander Potapenko
  2024-05-28 15:02           ` Kent Overstreet
  0 siblings, 1 reply; 8+ messages in thread
From: Alexander Potapenko @ 2024-05-28 11:36 UTC (permalink / raw)
  To: Kent Overstreet, Mathieu Desnoyers
  Cc: Brian Foster, Kees Cook, linux-kernel, linux-bcachefs,
	Marco Elver, Dmitry Vyukov, kasan-dev, Nathan Chancellor,
	Nick Desaulniers, Bill Wendling, Justin Stitt, llvm

On Fri, May 24, 2024 at 7:30 PM Kent Overstreet
<kent.overstreet@linux.dev> wrote:
>
> On Fri, May 24, 2024 at 12:04:11PM -0400, Mathieu Desnoyers wrote:
> > On 2024-05-24 11:35, Mathieu Desnoyers wrote:
> > > [ Adding clang/llvm and KMSAN maintainers/reviewers in CC. ]
> > >
> > > On 2024-05-24 11:28, Kent Overstreet wrote:
> > > > On Thu, May 23, 2024 at 01:53:42PM -0400, Mathieu Desnoyers wrote:
> > > > > Hi Kent,
> > > > >
> > > > > Looking around in the bcachefs code for possible causes of this KMSAN
> > > > > bug report:
> > > > >
> > > > > https://lore.kernel.org/lkml/000000000000fd5e7006191f78dc@google.com/
> > > > >
> > > > > I notice the following pattern in the bcachefs structures: zero-length
> > > > > arrays members are inserted in structures (not always at the end),
> > > > > seemingly to achieve a result similar to what could be done with a
> > > > > union:
> > > > >
> > > > > fs/bcachefs/bcachefs_format.h:
> > > > >
> > > > > struct bkey_packed {
> > > > >          __u64           _data[0];
> > > > >
> > > > >          /* Size of combined key and value, in u64s */
> > > > >          __u8            u64s;
> > > > > [...]
> > > > > };
> > > > >
> > > > > likewise:
> > > > >
> > > > > struct bkey_i {
> > > > >          __u64                   _data[0];
> > > > >
> > > > >          struct bkey     k;
> > > > >          struct bch_val  v;
> > > > > };

I took a glance at the LLVM IR for fs/bcachefs/bset.c, and it defines
struct bkey_packed and bkey_i as:

    %struct.bkey_packed = type { [0 x i64], i8, i8, i8, [0 x i8], [37 x i8] }
    %struct.bkey_i = type { [0 x i64], %struct.bkey, %struct.bch_val }

, which more or less looks as expected, so I don't think it could be
causing problems with KMSAN right now.
Moreover, there are cases in e.g. include/linux/skbuff.h where
zero-length arrays are used for the same purpose, and KMSAN handles
them just fine.

Yet I want to point out that even GCC discourages the use of
zero-length arrays in the middle of a struct:
https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html, so Clang is not
unique here.

Regarding the original KMSAN bug, as noted in
https://lore.kernel.org/all/0000000000009f9447061833d477@google.com/T/,
we might be missing the event of copying data from the disk to
bcachefs structs.
I'd appreciate help from someone knowledgeable about how disk I/O is
implemented in the kernel.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Use of zero-length arrays in bcachefs structures inner fields
  2024-05-28 11:36         ` Alexander Potapenko
@ 2024-05-28 15:02           ` Kent Overstreet
  2024-06-03  9:12             ` Alexander Potapenko
  0 siblings, 1 reply; 8+ messages in thread
From: Kent Overstreet @ 2024-05-28 15:02 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Mathieu Desnoyers, Brian Foster, Kees Cook, linux-kernel,
	linux-bcachefs, Marco Elver, Dmitry Vyukov, kasan-dev,
	Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt,
	llvm

On Tue, May 28, 2024 at 01:36:11PM +0200, Alexander Potapenko wrote:
> On Fri, May 24, 2024 at 7:30 PM Kent Overstreet
> <kent.overstreet@linux.dev> wrote:
> >
> > On Fri, May 24, 2024 at 12:04:11PM -0400, Mathieu Desnoyers wrote:
> > > On 2024-05-24 11:35, Mathieu Desnoyers wrote:
> > > > [ Adding clang/llvm and KMSAN maintainers/reviewers in CC. ]
> > > >
> > > > On 2024-05-24 11:28, Kent Overstreet wrote:
> > > > > On Thu, May 23, 2024 at 01:53:42PM -0400, Mathieu Desnoyers wrote:
> > > > > > Hi Kent,
> > > > > >
> > > > > > Looking around in the bcachefs code for possible causes of this KMSAN
> > > > > > bug report:
> > > > > >
> > > > > > https://lore.kernel.org/lkml/000000000000fd5e7006191f78dc@google.com/
> > > > > >
> > > > > > I notice the following pattern in the bcachefs structures: zero-length
> > > > > > arrays members are inserted in structures (not always at the end),
> > > > > > seemingly to achieve a result similar to what could be done with a
> > > > > > union:
> > > > > >
> > > > > > fs/bcachefs/bcachefs_format.h:
> > > > > >
> > > > > > struct bkey_packed {
> > > > > >          __u64           _data[0];
> > > > > >
> > > > > >          /* Size of combined key and value, in u64s */
> > > > > >          __u8            u64s;
> > > > > > [...]
> > > > > > };
> > > > > >
> > > > > > likewise:
> > > > > >
> > > > > > struct bkey_i {
> > > > > >          __u64                   _data[0];
> > > > > >
> > > > > >          struct bkey     k;
> > > > > >          struct bch_val  v;
> > > > > > };
> 
> I took a glance at the LLVM IR for fs/bcachefs/bset.c, and it defines
> struct bkey_packed and bkey_i as:
> 
>     %struct.bkey_packed = type { [0 x i64], i8, i8, i8, [0 x i8], [37 x i8] }
>     %struct.bkey_i = type { [0 x i64], %struct.bkey, %struct.bch_val }
> 
> , which more or less looks as expected, so I don't think it could be
> causing problems with KMSAN right now.
> Moreover, there are cases in e.g. include/linux/skbuff.h where
> zero-length arrays are used for the same purpose, and KMSAN handles
> them just fine.
> 
> Yet I want to point out that even GCC discourages the use of
> zero-length arrays in the middle of a struct:
> https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html, so Clang is not
> unique here.
> 
> Regarding the original KMSAN bug, as noted in
> https://lore.kernel.org/all/0000000000009f9447061833d477@google.com/T/,
> we might be missing the event of copying data from the disk to
> bcachefs structs.
> I'd appreciate help from someone knowledgeable about how disk I/O is
> implemented in the kernel.

If that was missing I'd expect everything to be breaking. What's the
helper that marks memory as initialized?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Use of zero-length arrays in bcachefs structures inner fields
  2024-05-28 15:02           ` Kent Overstreet
@ 2024-06-03  9:12             ` Alexander Potapenko
  0 siblings, 0 replies; 8+ messages in thread
From: Alexander Potapenko @ 2024-06-03  9:12 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Mathieu Desnoyers, Brian Foster, Kees Cook, linux-kernel,
	linux-bcachefs, Marco Elver, Dmitry Vyukov, kasan-dev,
	Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt,
	llvm

On Tue, May 28, 2024 at 5:02 PM Kent Overstreet
<kent.overstreet@linux.dev> wrote:
>
> On Tue, May 28, 2024 at 01:36:11PM +0200, Alexander Potapenko wrote:
> > On Fri, May 24, 2024 at 7:30 PM Kent Overstreet
> > <kent.overstreet@linux.dev> wrote:
> > >
> > > On Fri, May 24, 2024 at 12:04:11PM -0400, Mathieu Desnoyers wrote:
> > > > On 2024-05-24 11:35, Mathieu Desnoyers wrote:
> > > > > [ Adding clang/llvm and KMSAN maintainers/reviewers in CC. ]
> > > > >
> > > > > On 2024-05-24 11:28, Kent Overstreet wrote:
> > > > > > On Thu, May 23, 2024 at 01:53:42PM -0400, Mathieu Desnoyers wrote:
> > > > > > > Hi Kent,
> > > > > > >
> > > > > > > Looking around in the bcachefs code for possible causes of this KMSAN
> > > > > > > bug report:
> > > > > > >
> > > > > > > https://lore.kernel.org/lkml/000000000000fd5e7006191f78dc@google.com/
> > > > > > >
> > > > > > > I notice the following pattern in the bcachefs structures: zero-length
> > > > > > > arrays members are inserted in structures (not always at the end),
> > > > > > > seemingly to achieve a result similar to what could be done with a
> > > > > > > union:
> > > > > > >
> > > > > > > fs/bcachefs/bcachefs_format.h:
> > > > > > >
> > > > > > > struct bkey_packed {
> > > > > > >          __u64           _data[0];
> > > > > > >
> > > > > > >          /* Size of combined key and value, in u64s */
> > > > > > >          __u8            u64s;
> > > > > > > [...]
> > > > > > > };
> > > > > > >
> > > > > > > likewise:
> > > > > > >
> > > > > > > struct bkey_i {
> > > > > > >          __u64                   _data[0];
> > > > > > >
> > > > > > >          struct bkey     k;
> > > > > > >          struct bch_val  v;
> > > > > > > };
> >
> > I took a glance at the LLVM IR for fs/bcachefs/bset.c, and it defines
> > struct bkey_packed and bkey_i as:
> >
> >     %struct.bkey_packed = type { [0 x i64], i8, i8, i8, [0 x i8], [37 x i8] }
> >     %struct.bkey_i = type { [0 x i64], %struct.bkey, %struct.bch_val }
> >
> > , which more or less looks as expected, so I don't think it could be
> > causing problems with KMSAN right now.
> > Moreover, there are cases in e.g. include/linux/skbuff.h where
> > zero-length arrays are used for the same purpose, and KMSAN handles
> > them just fine.
> >
> > Yet I want to point out that even GCC discourages the use of
> > zero-length arrays in the middle of a struct:
> > https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html, so Clang is not
> > unique here.
> >
> > Regarding the original KMSAN bug, as noted in
> > https://lore.kernel.org/all/0000000000009f9447061833d477@google.com/T/,
> > we might be missing the event of copying data from the disk to
> > bcachefs structs.
> > I'd appreciate help from someone knowledgeable about how disk I/O is
> > implemented in the kernel.
>
> If that was missing I'd expect everything to be breaking. What's the
> helper that marks memory as initialized?

There's kmsan_unpoison_memory()
(https://elixir.bootlin.com/linux/latest/source/include/linux/kmsan-checks.h#L37).
include/linux/kmsan.h also has several more specific helpers for
various subsystems - we probably need something like that.
I was expecting kmsan_handle_dma() to cover disk IO as well, but
apparently I was wrong.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-06-03  9:13 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-23 17:53 Use of zero-length arrays in bcachefs structures inner fields Mathieu Desnoyers
2024-05-24 15:28 ` Kent Overstreet
2024-05-24 15:35   ` Mathieu Desnoyers
2024-05-24 16:04     ` Mathieu Desnoyers
2024-05-24 17:30       ` Kent Overstreet
2024-05-28 11:36         ` Alexander Potapenko
2024-05-28 15:02           ` Kent Overstreet
2024-06-03  9:12             ` Alexander Potapenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox