public inbox for linux-bcachefs@vger.kernel.org
 help / color / mirror / Atom feed
From: Jan Hendrik Farr <kernel@jfarr.cc>
To: Kees Cook <kees@kernel.org>
Cc: Thorsten Blum <thorsten.blum@toblux.com>,
	kent.overstreet@linux.dev, regressions@lists.linux.dev,
	linux-bcachefs@vger.kernel.org, linux-hardening@vger.kernel.org,
	linux-kernel@vger.kernel.org, ardb@kernel.org, morbo@google.com
Subject: Re: [REGRESSION][BISECTED] erroneous buffer overflow detected in bch2_xattr_validate
Date: Mon, 7 Oct 2024 17:10:53 +0200	[thread overview]
Message-ID: <ZwP5_bY-AmN91_YB@archlinux> (raw)
In-Reply-To: <ZwNb-_UPL9BPSg9N@archlinux>

On 07 05:56:46, Jan Hendrik Farr wrote:
> > I want to separate several easily confused issues. Instead of just
> > saying __bdos, let's clearly refer to what calculation within bdos is
> > being used. There are 3 choices currently:
> > - alloc_size attribute
> > - counted_by attribute
> > - fallback to __bos (which is similar to sizeof(), except that FAMs are 0 sized)
> > 
> > Additionally there are (for all intents and purposes) 2 size
> > determinations to be made by __bos and __bdos, via argument 2:
> > - containing object size (type 0) ("maximum size")
> > - specific object size (type 1) ("minimum size")
> 
> "maximum" vs "minimum" size would by type 0 vs type 2, but I think you
> do mean type 0 and type 1 as those are the types currently used by
> __struct_size and __member_size. Those are both "maximum" sizes.
> 
> > 
> > For example, consider:
> > 
> > struct posix_acl *acl = malloc(1024);
> > acl->a_count = 1;
> > 
> > what should these return:
> > 
> > 	__bos(acl, 0)
> > 	__bos(acl, 1)
> > 	__bdos(acl, 0)
> > 	__bdos(acl, 1)
> > 	__bos(acl->a_entries, 0)
> > 	__bos(acl->a_entries, 1)
> > 	__bdos(acl->a_entries, 0)
> > 	__bdos(acl->a_entries, 1)
> > 
> 
> I gathered some data from clang and gcc on all for all these cases and
> additionally varied whether the allocation size is a compile time known
> constant, runtime known, or not known. I also varied whether
> __counted_by was used.
> 
> Source code: [1]
> 
> 
> Abbreviations:
> 
> FAM      = flexible array member
> -1       = SIZE_MAX
> p->a_ent = p->a_entries
> comp.    = allocation size is compile time known
> run.     = allocation size is compile time known
> none     = allocation size is unknown
> count    = __counted_by attribute in use
> correct  = What I think the correct answers should be. In some places I
> have two answers. In that case the second number is what the kernel
> currently expects.
> 
> 
> And here's the data:
> 
> function        |comp.|run.|none|count| gcc  |clang |correct
> ----------------|-----|----|----|-----|------|------|-----
> bos(p, 0)       |  x  |    |    |     | 1024 | 1024 | 1024
> bos(p, 0)       |     | x  |    |     |  -1  |  -1  | -1
> bos(p, 0)       |     |    | x  |     |  -1  |  -1  | -1
> bos(p, 0)       |  x  |    |    |  x  | 1024 | 1024 | 1024
> bos(p, 0)       |     | x  |    |  x  |  -1  |  -1  | -1
> bos(p, 0)       |     |    | x  |  x  |  -1  |  -1  | -1
> ----------------|-----|----|----|-----|------|------|-----
> bos(p, 1)       |  x  |    |    |     | 1024 | 1024 | 1024
> bos(p, 1)       |     | x  |    |     |  -1  |  -1  | -1
> bos(p, 1)       |     |    | x  |     |  -1  |  -1  | -1
> bos(p, 1)       |  x  |    |    |  x  | 1024 | 1024 | 1024
> bos(p, 1)       |     | x  |    |  x  |  -1  |  -1  | -1
> bos(p, 1)       |     |    | x  |  x  |  -1  |  -1  | -1
> ----------------|-----|----|----|-----|------|------|-----
> bdos(p, 0)      |  x  |    |    |     | 1024 | 1024 | 1024
> bdos(p, 0)      |     | x  |    |     | 1024 | 1024 | 1024
> bdos(p, 0)      |     |    | x  |     |  -1  |  -1  | -1
> bdos(p, 0)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 0)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 0)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> ----------------|-----|----|----|-----|------|------|-----
> bdos(p, 1)      |  x  |    |    |     | 1024 | 1024 | 1024
> bdos(p, 1)      |     | x  |    |     | 1024 | 1024 | 1024
> bdos(p, 1)      |     |    | x  |     |  -1  |  -1  | -1
> bdos(p, 1)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 1)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 1)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> ----------------|-----|----|----|-----|------|------|-----
> bos(p->a_ent, 0)|  x  |    |    |     |  996 | 996  | 996
> bos(p->a_ent, 0)|     | x  |    |     |  -1  |  -1  | -1
> bos(p->a_ent, 0)|     |    | x  |     |  -1  |  -1  | -1
> bos(p->a_ent, 0)|  x  |    |    |  x  |  996 | 996  | 996
> bos(p->a_ent, 0)|     | x  |    |  x  |  -1  |  -1  | -1
> bos(p->a_ent, 0)|     |    | x  |  x  |  -1  |  -1  | -1
> ----------------|-----|----|----|-----|------|------|-----
> bos(p->a_ent, 1)|  x  |    |    |     |  996 | 996  | 992
> bos(p->a_ent, 1)|     | x  |    |     |  -1  |  -1  | -1
> bos(p->a_ent, 1)|     |    | x  |     |  -1  |  -1  | -1
> bos(p->a_ent, 1)|  x  |    |    |  x  |  996 | 996  | 992
> bos(p->a_ent, 1)|     | x  |    |  x  |  -1  |  -1  | -1
> bos(p->a_ent, 1)|     |    | x  |  x  |  -1  |  -1  | -1
> ----------------|-----|----|----|-----|------|------|-----
> bdos(p->a_ent,0)|  x  |    |    |     |  996 | 996  | 996
> bdos(p->a_ent,0)|     | x  |    |     |  996 | 996  | 996
> bdos(p->a_ent,0)|     |    | x  |     |  -1  |  -1  | -1
> bdos(p->a_ent,0)|  x  |    |    |  x  |   8  |  8   |  8
> bdos(p->a_ent,0)|     | x  |    |  x  |   8  |  8   |  8
> bdos(p->a_ent,0)|     |    | x  |  x  |   8  |  8   |  8


These previous three should probably actually be like this:
bdos(p->a_ent,0)|  x  |    |    |  x  |   8  |  8   |  15
bdos(p->a_ent,0)|     | x  |    |  x  |   8  |  8   |  15
bdos(p->a_ent,0)|     |    | x  |  x  |   8  |  8   |  15

They should include the allowed padding after the FAM, as this is a type
0 bdos. Not really an issue here, as the kernel expects 8 here.


> ----------------|-----|----|----|-----|------|------|-----
> bdos(p->a_ent,1)|  x  |    |    |     |  996 | 996  | 992
> bdos(p->a_ent,1)|     | x  |    |     |  996 | 996  | 992
> bdos(p->a_ent,1)|     |    | x  |     |  -1  |  -1  | -1
> bdos(p->a_ent,1)|  x  |    |    |  x  |   8  |  8   |  8
> bdos(p->a_ent,1)|     | x  |    |  x  |   8  |  8   |  8
> bdos(p->a_ent,1)|     |    | x  |  x  |   8  |  8   |  8
> ----------------|-----|----|----|-----|------|------|-----
> 
> bos only uses the allocation size to give it's answers. It only works if
> it is a compile time known constant. bos also does not utilize the
> __counted_by attribute.
> 
> bdos on the other hand allows the allocation size to be runtime known.
> It also makes use of the __counted_by attribute if present, which always
> takes precedence over the allocation size when the compiler supports it
> for the particular case. So in those cases you can "lie" to the compiler
> about the size of an object.
> 
> clang supports the __counted_by attribute for both cases (p and
> p->a_entries). gcc only supports it for p->a_entries cases.
> 
> 
> 
> Issue A (clang)
> =======
> 
> function        |comp.|run.|none|count| gcc  |clang |correct
> ----------------|-----|----|----|-----|------|------|-----
> bdos(p, 0)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 0)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 0)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> bdos(p, 1)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 1)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 1)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> 
> These cases also represent the "bdos off by 4" issue in clang. clang
> will compute these results using:
> 
> max(sizeof(struct posix_acl), offsetof(struct posix_acl, a_entries) +
> count * sizeof(struct posix_acl_entries)) = 36
> 
> The kernel on the other hand expects this behavior:
> 
> sizeof(struct posix_acl) + count * sizeof(struct posix_acl_entries) = 40
> 
> 
> I think the correct calculation would actually be this:
> 
> offsetof(struct posix_acl, a_entries)
> + (acl->a_count + 1) * sizeof(struct posix_acl_entry) - 1 = 43
> 
> The C11 standard says that when the . or -> operator is used on a struct
> with an FAM it behaves like the FAM was replaced with the largest array
> (with the same element type) that would not make the object any larger
> (see page 113 and 114 of [2]).
> So there are actually multiple sizes of the object that are consistent
> with a count of 1.
> 
> malloc-max = maximum size of the object
> malloc-min = minimum size of the object
> FAME = flexible array member element
> (FAME) = hypothetical 2nd FAME
> 
> <-----------------malloc-max-------------->
> <-----------------malloc-min------->
> <------sizeof(posix_acl)------->
>                             <-FAME-><(FAME)>
> 
> The clang documentation of type 0 (vs type 2) bdos says this:
> 
> If ``type & 2 == 0``, the least ``n`` is returned such that accesses to 
>    ``(const char*)ptr + n`` and beyond are known to be out of bounds.
> 
> We only _know_ that that access to the last byte of a 2nd hypothetical FAME
> would be out of bounds. All the bytes before that are padding that is
> allowed by the standard.
> 
> 
> However, also this calculation doesn't get the kernel out
> of trouble here. While this would fix the issue for this particular
> struct it does not solve it for all structs:
> 
> What if the elements of the FAM were chars instead of
> struct posix_acl_entries here? In that case the kernel is back to
> overestimating the size of the struct / underreporting the count to the
> compiler. So while I think this answer is more correct it doesn't
> actually solve the issue.
> 
> Example:
> Let's say the kernel allocates one of these posix_acl_char structs for a
> single char in the array:
> 
> malloc(sizeof(posix_acl_char) + 1 * sizeof(char)) = 33
> 
> The C standard actually says that this object will behave like this when
> the FAM is accessed:
> 
> struct posix_acl {
>     refcount_t a_refcount;
>     struct rcu_head a_rcu;
>     unsigned int a_count;
>     char a_entries[5];
> };
> 
> a_count should be set to 5, not 1!
> 
> 
> So we would really need an option to tell the compiler to use the same
> size calculation as the kernel expects here, or maybe be able to specify
> an offset in the __counted_by attribute. Alternatively clang could use
> an option to disable the use of __counted_by for cases where the whole
> struct is passed. This would make it behave like gcc.
> 
> 
> 
> Issue B (clang + gcc)
> =======
> 
> A less serious issue happens with these cases:
> 
> function        |comp.|run.|none|count| gcc  |clang |correct
> ----------------|-----|----|----|-----|------|------|-----
> bos(p->a_ent, 1)|  x  |    |    |     |  996 | 996  | 992
> bos(p->a_ent, 1)|  x  |    |    |  x  |  996 | 996  | 992
> bdos(p->a_ent,1)|  x  |    |    |     |  996 | 996  | 992
> bdos(p->a_ent,1)|     | x  |    |     |  996 | 996  | 992
> 
> In this case the size returned by bos/bdos is too large, so this won't
> lead to false positives. Both clang and gcc simply compute the difference
> between the pointer from the start of the FAM to the end of the whole
> struct. I believe this is wrong. According to the C standard the object
> should behave like the FAM was replaced with the largest array that does
> not make the object any larger. The size of that array is 124 elements.
> So the posix_acl becomes:
> 
> struct posix_acl {
>     refcount_t a_refcount;
>     struct rcu_head a_rcu;
>     unsigned int a_count;
>     struct posix_acl_entry a_entries[124];
> };
> 
> Since this is a type 1 bos/bdos it should return the size of just the
> array, which is 124 * 8 = 992, and not 124.5 * 8 = 996.
> 
> [1] https://godbolt.org/z/a5eM3z8PY
> [2] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
> 
> Best Regards
> Jan
> 

  reply	other threads:[~2024-10-07 15:10 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-26 15:14 [REGRESSION][BISECTED] erroneous buffer overflow detected in bch2_xattr_validate Jan Hendrik Farr
2024-09-26 15:28 ` Thorsten Blum
2024-09-26 16:09   ` Thorsten Blum
2024-09-26 16:37     ` Jan Hendrik Farr
2024-09-26 17:01     ` Jan Hendrik Farr
2024-09-26 17:45       ` Jan Hendrik Farr
2024-09-26 19:58         ` Ard Biesheuvel
2024-09-26 22:18           ` Bill Wendling
2024-09-27  1:30             ` Bill Wendling
2024-09-27  3:41               ` Jan Hendrik Farr
2024-09-28 20:50               ` Kees Cook
2024-09-28 23:33                 ` Jan Hendrik Farr
2024-09-29 19:59                   ` Jan Hendrik Farr
2024-09-28 17:36     ` Jan Hendrik Farr
2024-09-28 17:49       ` Jan Hendrik Farr
2024-09-28 20:34       ` Kees Cook
2024-10-02  9:18         ` Thorsten Blum
2024-10-03 11:33           ` Jan Hendrik Farr
2024-10-03 13:07             ` Thorsten Blum
2024-10-03 13:12               ` Jan Hendrik Farr
2024-10-03 15:02                 ` Thorsten Blum
2024-10-03 15:22                   ` Jan Hendrik Farr
2024-10-03 15:30                     ` Thorsten Blum
2024-10-03 15:35                       ` Jan Hendrik Farr
2024-10-03 15:43                         ` Thorsten Blum
2024-10-03 16:32                           ` Jan Hendrik Farr
2024-10-03 15:17               ` Jan Hendrik Farr
2024-10-03 21:28                 ` Kees Cook
2024-10-03 21:48                   ` Jan Hendrik Farr
2024-10-04 17:13                     ` Kees Cook
2024-10-07  3:56                       ` Jan Hendrik Farr
2024-10-07 15:10                         ` Jan Hendrik Farr [this message]
2024-10-16 21:13                           ` Kees Cook
2024-10-16 23:41                         ` Bill Wendling
2024-10-17  0:09                           ` Bill Wendling
2024-10-17  3:04                             ` Jan Hendrik Farr
2024-10-17 16:55                               ` Nathan Chancellor
2024-10-17 17:39                                 ` Miguel Ojeda
2024-10-17 18:55                                   ` Nathan Chancellor
2024-10-18 11:52                                     ` Miguel Ojeda
2024-10-21  1:33                                 ` Jan Hendrik Farr
2024-10-21  6:04                                   ` Miguel Ojeda
2024-10-21 17:01                                     ` Jan Hendrik Farr
2024-10-21 19:25                                   ` Nathan Chancellor
2024-10-24 13:16                                     ` Jan Hendrik Farr
2024-10-25  1:15                                       ` Nathan Chancellor
2024-10-25  8:10                                         ` Miguel Ojeda
2024-10-25 15:27                                           ` Jan Hendrik Farr
2025-05-01 14:30                                             ` Alan Huang
2025-05-01 16:45                                               ` Jan Hendrik Farr
2025-05-01 17:22                                               ` Jan Hendrik Farr
2025-05-01 17:28                                                 ` Alan Huang
2025-05-01 17:58                                                   ` Jan Hendrik Farr
2025-05-01 18:10                                                     ` Kees Cook
2025-05-01 18:18                                                     ` Alan Huang
2024-10-17  0:41                           ` Jan Hendrik Farr
2024-10-14 21:39                       ` Bill Wendling
2024-10-16  1:22             ` Bill Wendling
2024-10-16  2:18               ` Jan Hendrik Farr
2024-10-16 20:43                 ` Kees Cook
2024-10-03 21:23           ` Kees Cook
2024-10-03 22:05             ` Jan Hendrik Farr

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZwP5_bY-AmN91_YB@archlinux \
    --to=kernel@jfarr.cc \
    --cc=ardb@kernel.org \
    --cc=kees@kernel.org \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-bcachefs@vger.kernel.org \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=morbo@google.com \
    --cc=regressions@lists.linux.dev \
    --cc=thorsten.blum@toblux.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox