public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>,
	linux-bcachefs@vger.kernel.org, kernel test robot <lkp@intel.com>,
	linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org
Subject: Re: [PATCH] bcachefs: Refactor bkey_i to use a flexible array
Date: Tue, 17 Oct 2023 10:12:05 -0400	[thread overview]
Message-ID: <ZS6WNXABUDTh8NZA@bfoster> (raw)
In-Reply-To: <202310161249.43FB42A6@keescook>

On Mon, Oct 16, 2023 at 02:18:19PM -0700, Kees Cook wrote:
> On Mon, Oct 16, 2023 at 08:41:58AM -0400, Brian Foster wrote:
> > On Fri, Oct 13, 2023 at 04:44:21PM -0700, Kees Cook wrote:
> > > On Fri, Oct 13, 2023 at 07:26:11AM -0400, Brian Foster wrote:
> > > > On Tue, Oct 10, 2023 at 04:56:12PM -0700, Kees Cook wrote:
...
> > > > 
> > > > Hi Kees,
> > > > 
> > > > I'm curious if this is something that could be buried in bch_val given
> > > > it's already kind of a fake structure..? If not, my only nitty comment
> > > 
> > > I was thinking it would be best to keep the flexible array has "high" in
> > > the struct as possible, as in the future more refactoring will be needed
> > > to avoid having flex arrays overlap with other members in composite
> > > structures. So instead of pushing into bch_val, I left it at the highest
> > > level possible, bch_i, as that's the struct being used by the memcpy().
> > > 
> > > Eventually proper unions will be needed instead of overlapping bch_i
> > > with other types, as in:
> > > 
> > > struct btree_root {
> > >         struct btree            *b;
> > > 
> > >         /* On disk root - see async splits: */
> > >         __BKEY_PADDED(key, BKEY_BTREE_PTR_VAL_U64s_MAX);
> > >         u8                      level;
> > >         u8                      alive;
> > >         s8                      error;
> > > };
> > > 
> > > But that's all for the future. Right now I wanted to deal with the more
> > > pressing matter of a 0-sized array not being zero sized. :)
> > > 
> > 
> > Ok, but I'm not really following how one approach vs. the other relates
> > to this particular example of an embedded bkey_i. I'm probably just not
> > familiar enough with the current issues with 0-sized arrays and the
> > anticipated path forward. Can you elaborate for somebody who is more
> > focused on trying to manage the design/complexity of these various key
> > data structures? For example, what's the practical difference here (for
> > future work) if the flex array lives in bch_val vs. bkey_i?
> 
> I was looking strictly at the layer it was happening: the function that
> calls memcpy() is working on a bkey_i, so I figured that was the place
> for it currently.
> 
> > Note that I don't necessarily have a strong opinion on this atm, but if
> > there's a "for future reasons" aspect to this approach I'd like to at
> > least understand it a little better. ;)
> 
> The future work here is about making sure flexible arrays don't overlap
> with non-flexible array members[1], and that will require giving some
> thought to how the structures are arranged.
> 
...
> The use of "struct header" effectively says "we have some number of bytes,
> but we don't know *what* it is yet". Is it cmd_one, or cmd_two? Instead
> of combining the flex array with the header, we can either split it off:
> 
> struct header {
> 	u32 long flags;
> 	struct other things;
> 	size_t byte_count;
> };
> 
> struct cmd_unknown {
> 	struct header;
> 	u8 bytes[];
> };
> 
> Or we can merge all the structs together:
> 
> struct everything {
> 	u32 long flags;
> 	struct other things;
> 	size_t byte_count;
> 	union {
> 		struct cmd_one {
> 			u64 detail;
> 			u8 bits;
> 		};
> 		struct cmd_two {
> 			u32 foo, bar;
> 			u64 baz;
> 		};
> 		struct unknown {
> 			u8 bytes[];
> 		};
> 	};
> };
> 
> In the first style, the flexible array is distinctly separate. In the
> second style the overlap is explicitly shown via the union.
> 
> I expect it will take a long time to make the kernel "flex array overlap
> clean", so while I don't feel any rush, I've been generally trying to
> avoid seeing new instances of ambiguous overlap _added_ to the kernel. :)
> 

Got it. This helps explain potential wonkiness of a variant with the
flex array buried in bch_val.

> bcachefs is in a unique places where because it's been out of tree
> its code patterns aren't "new", but it's just been "added" upstream.
> *shrug* So we'll deal with the existing warnings we've already got,
> and prepare for the future warnings as we can.
> 

Ack.

> Hopefully that helps!
> 

Indeed it does, thanks!

Brian

> -Kees
> 
> [1] See "-Wflex-array-member-not-at-end":
>     https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
> [2] Going all the way back to 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
> 
> > 
> > > > is that memcpy(k->bytes[], ...) makes it kind of read like we're copying
> > > > in opaque key data rather than value data, so perhaps a slightly more
> > > > descriptive field name would be helpful. But regardless I'd wait until
> > > > Kent has a chance to comment before changing anything..
> > > 
> > > How about "v_bytes" instead of "bytes"? Or if it really is preferred,
> > > I can move the flex array into bch_val -- it just seems like the wrong
> > > layer...
> > > 
> > 
> > Yeah.. v_bytes, value_bytes, etc. etc. Anything that avoids misleading
> > code when using the field is good with me. Thanks.
> > 
> > Brian
> > 
> > > -Kees
> > > 
> > > > 
> > > > Brian
> > > > 
> > > > >  
> > > > >  #define KEY(_inode, _offset, _size)					\
> > > > > diff --git a/fs/bcachefs/extents.h b/fs/bcachefs/extents.h
> > > > > index 7ee8d031bb6c..6248e17bbac5 100644
> > > > > --- a/fs/bcachefs/extents.h
> > > > > +++ b/fs/bcachefs/extents.h
> > > > > @@ -642,7 +642,7 @@ static inline void bch2_bkey_append_ptr(struct bkey_i *k, struct bch_extent_ptr
> > > > >  
> > > > >  		ptr.type = 1 << BCH_EXTENT_ENTRY_ptr;
> > > > >  
> > > > > -		memcpy((void *) &k->v + bkey_val_bytes(&k->k),
> > > > > +		memcpy(&k->bytes[bkey_val_bytes(&k->k)],
> > > > >  		       &ptr,
> > > > >  		       sizeof(ptr));
> > > > >  		k->k.u64s++;
> > > > > -- 
> > > > > 2.34.1
> > > > > 
> > > > 
> > > 
> > > -- 
> > > Kees Cook
> > > 
> > 
> 
> -- 
> Kees Cook
> 


  reply	other threads:[~2023-10-17 14:12 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-10 23:56 [PATCH] bcachefs: Refactor bkey_i to use a flexible array Kees Cook
2023-10-13 11:26 ` Brian Foster
2023-10-13 23:44   ` Kees Cook
2023-10-16 12:41     ` Brian Foster
2023-10-16 21:18       ` Kees Cook
2023-10-17 14:12         ` Brian Foster [this message]
2023-10-18 22:04     ` Kent Overstreet
2023-10-18 22:36       ` Kees Cook
2023-10-18 23:08         ` Kees Cook

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZS6WNXABUDTh8NZA@bfoster \
    --to=bfoster@redhat.com \
    --cc=keescook@chromium.org \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-bcachefs@vger.kernel.org \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox