All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>,
	linux-bcachefs@vger.kernel.org, kernel test robot <lkp@intel.com>,
	linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org
Subject: Re: [PATCH] bcachefs: Refactor bkey_i to use a flexible array
Date: Tue, 17 Oct 2023 10:12:05 -0400	[thread overview]
Message-ID: <ZS6WNXABUDTh8NZA@bfoster> (raw)
In-Reply-To: <202310161249.43FB42A6@keescook>

On Mon, Oct 16, 2023 at 02:18:19PM -0700, Kees Cook wrote:
> On Mon, Oct 16, 2023 at 08:41:58AM -0400, Brian Foster wrote:
> > On Fri, Oct 13, 2023 at 04:44:21PM -0700, Kees Cook wrote:
> > > On Fri, Oct 13, 2023 at 07:26:11AM -0400, Brian Foster wrote:
> > > > On Tue, Oct 10, 2023 at 04:56:12PM -0700, Kees Cook wrote:
...
> > > > 
> > > > Hi Kees,
> > > > 
> > > > I'm curious if this is something that could be buried in bch_val given
> > > > it's already kind of a fake structure..? If not, my only nitty comment
> > > 
> > > I was thinking it would be best to keep the flexible array has "high" in
> > > the struct as possible, as in the future more refactoring will be needed
> > > to avoid having flex arrays overlap with other members in composite
> > > structures. So instead of pushing into bch_val, I left it at the highest
> > > level possible, bch_i, as that's the struct being used by the memcpy().
> > > 
> > > Eventually proper unions will be needed instead of overlapping bch_i
> > > with other types, as in:
> > > 
> > > struct btree_root {
> > >         struct btree            *b;
> > > 
> > >         /* On disk root - see async splits: */
> > >         __BKEY_PADDED(key, BKEY_BTREE_PTR_VAL_U64s_MAX);
> > >         u8                      level;
> > >         u8                      alive;
> > >         s8                      error;
> > > };
> > > 
> > > But that's all for the future. Right now I wanted to deal with the more
> > > pressing matter of a 0-sized array not being zero sized. :)
> > > 
> > 
> > Ok, but I'm not really following how one approach vs. the other relates
> > to this particular example of an embedded bkey_i. I'm probably just not
> > familiar enough with the current issues with 0-sized arrays and the
> > anticipated path forward. Can you elaborate for somebody who is more
> > focused on trying to manage the design/complexity of these various key
> > data structures? For example, what's the practical difference here (for
> > future work) if the flex array lives in bch_val vs. bkey_i?
> 
> I was looking strictly at the layer it was happening: the function that
> calls memcpy() is working on a bkey_i, so I figured that was the place
> for it currently.
> 
> > Note that I don't necessarily have a strong opinion on this atm, but if
> > there's a "for future reasons" aspect to this approach I'd like to at
> > least understand it a little better. ;)
> 
> The future work here is about making sure flexible arrays don't overlap
> with non-flexible array members[1], and that will require giving some
> thought to how the structures are arranged.
> 
...
> The use of "struct header" effectively says "we have some number of bytes,
> but we don't know *what* it is yet". Is it cmd_one, or cmd_two? Instead
> of combining the flex array with the header, we can either split it off:
> 
> struct header {
> 	u32 long flags;
> 	struct other things;
> 	size_t byte_count;
> };
> 
> struct cmd_unknown {
> 	struct header;
> 	u8 bytes[];
> };
> 
> Or we can merge all the structs together:
> 
> struct everything {
> 	u32 long flags;
> 	struct other things;
> 	size_t byte_count;
> 	union {
> 		struct cmd_one {
> 			u64 detail;
> 			u8 bits;
> 		};
> 		struct cmd_two {
> 			u32 foo, bar;
> 			u64 baz;
> 		};
> 		struct unknown {
> 			u8 bytes[];
> 		};
> 	};
> };
> 
> In the first style, the flexible array is distinctly separate. In the
> second style the overlap is explicitly shown via the union.
> 
> I expect it will take a long time to make the kernel "flex array overlap
> clean", so while I don't feel any rush, I've been generally trying to
> avoid seeing new instances of ambiguous overlap _added_ to the kernel. :)
> 

Got it. This helps explain potential wonkiness of a variant with the
flex array buried in bch_val.

> bcachefs is in a unique places where because it's been out of tree
> its code patterns aren't "new", but it's just been "added" upstream.
> *shrug* So we'll deal with the existing warnings we've already got,
> and prepare for the future warnings as we can.
> 

Ack.

> Hopefully that helps!
> 

Indeed it does, thanks!

Brian

> -Kees
> 
> [1] See "-Wflex-array-member-not-at-end":
>     https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
> [2] Going all the way back to 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
> 
> > 
> > > > is that memcpy(k->bytes[], ...) makes it kind of read like we're copying
> > > > in opaque key data rather than value data, so perhaps a slightly more
> > > > descriptive field name would be helpful. But regardless I'd wait until
> > > > Kent has a chance to comment before changing anything..
> > > 
> > > How about "v_bytes" instead of "bytes"? Or if it really is preferred,
> > > I can move the flex array into bch_val -- it just seems like the wrong
> > > layer...
> > > 
> > 
> > Yeah.. v_bytes, value_bytes, etc. etc. Anything that avoids misleading
> > code when using the field is good with me. Thanks.
> > 
> > Brian
> > 
> > > -Kees
> > > 
> > > > 
> > > > Brian
> > > > 
> > > > >  
> > > > >  #define KEY(_inode, _offset, _size)					\
> > > > > diff --git a/fs/bcachefs/extents.h b/fs/bcachefs/extents.h
> > > > > index 7ee8d031bb6c..6248e17bbac5 100644
> > > > > --- a/fs/bcachefs/extents.h
> > > > > +++ b/fs/bcachefs/extents.h
> > > > > @@ -642,7 +642,7 @@ static inline void bch2_bkey_append_ptr(struct bkey_i *k, struct bch_extent_ptr
> > > > >  
> > > > >  		ptr.type = 1 << BCH_EXTENT_ENTRY_ptr;
> > > > >  
> > > > > -		memcpy((void *) &k->v + bkey_val_bytes(&k->k),
> > > > > +		memcpy(&k->bytes[bkey_val_bytes(&k->k)],
> > > > >  		       &ptr,
> > > > >  		       sizeof(ptr));
> > > > >  		k->k.u64s++;
> > > > > -- 
> > > > > 2.34.1
> > > > > 
> > > > 
> > > 
> > > -- 
> > > Kees Cook
> > > 
> > 
> 
> -- 
> Kees Cook
> 


  reply	other threads:[~2023-10-17 14:12 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-10 23:56 [PATCH] bcachefs: Refactor bkey_i to use a flexible array Kees Cook
2023-10-13 11:26 ` Brian Foster
2023-10-13 23:44   ` Kees Cook
2023-10-16 12:41     ` Brian Foster
2023-10-16 21:18       ` Kees Cook
2023-10-17 14:12         ` Brian Foster [this message]
2023-10-18 22:04     ` Kent Overstreet
2023-10-18 22:36       ` Kees Cook
2023-10-18 23:08         ` Kees Cook

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZS6WNXABUDTh8NZA@bfoster \
    --to=bfoster@redhat.com \
    --cc=keescook@chromium.org \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-bcachefs@vger.kernel.org \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.