git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, correctmost <cmlists@sent.com>,
	Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH 4/9] cache-tree: avoid strtol() on non-string buffer
Date: Wed, 12 Nov 2025 12:26:06 +0100	[thread overview]
Message-ID: <aRRuzrmbJBW8q4Dd@pks.im> (raw)
In-Reply-To: <20251112080537.GD979063@coredump.intra.peff.net>

On Wed, Nov 12, 2025 at 03:05:37AM -0500, Jeff King wrote:
> A cache-tree extension entry in the index looks like this:
> 
>   <name> NUL <entry_nr> SPACE <subtree_nr> NEWLINE <binary_oid>
> 
> where the "_nr" items are human-readable base-10 ASCII. We parse them
> with strtol(), even though we do not have a NUL-terminated string (we'd
> generally have an mmap() of the on-disk index file). For a well-formed
> entry, this is not a problem; strtol() will stop when it sees the
> newline. But there are two problems:
> 
>   1. A corrupted entry could omit the newline, causing us to read
>      further. You'd mostly get stopped by seeing non-digits in the oid
>      field (and if it is likewise truncated, there will still be 20 or
>      more bytes of the index checksum). So it's possible, though
>      unlikely, to see read off the end of the mmap'd buffer. Of course a

s/see read/read/

>      malicious index file can fake the oid and the index checksum to all
>      (ASCII) 0's.
> 
>      This is further complicated by the fact that mmap'd buffers tend to
>      be zero-padded up to the page boundary. So to run off the end, the
>      index size also has to be a multiple of the page size. This is also
>      unlikely, though you can construct a malicious index file that
>      matches this.
> 
>      The security implications aren't too interesting. The index file is
>      a local file anyway (so you can't attack somebody by cloning, but
>      only if you convince them to operate in a .git directory you made,
>      at which point attacking .git/config is much easier). And it's just
>      a read overflow via strtol(), which is unlikely to buy you much
>      beyond a crash.

Agreed. Good to fix it regardless.

> diff --git a/cache-tree.c b/cache-tree.c
> index 2aba47060e..ab20ffe863 100644
> --- a/cache-tree.c
> +++ b/cache-tree.c
> @@ -548,12 +548,36 @@ void cache_tree_write(struct strbuf *sb, struct cache_tree *root)
>  	trace2_region_leave("cache_tree", "write", the_repository);
>  }
>  
> +static long parse_long(const char **ptr, unsigned long *len_p)
> +{
> +	const char *s = *ptr;
> +	unsigned long len = *len_p;
> +	long ret = 0;
> +	int sign = 1;
> +
> +	while (len && *s == '-') {
> +		sign *= -1;
> +		s++;
> +		len--;
> +	}
> +
> +	while (len) {
> +		if (!isdigit(*s))
> +			break;
> +		ret *= 10;
> +		ret += *s - '0';
> +		s++;
> +		len--;
> +	}
> +	*ptr = s;
> +	*len_p = len;
> +	return sign * ret;
> +}

Hm. I'm not a huge fan of not having any error handling at all. It just
feels way too fragile for my taste:

  - As you mention we don't detect overflows, as we would detect them at
    a later point in time when trying to access index entries at invalid
    offsets. But if the input is crafted in a way that the overflow ends
    up with a reasonable index entry we might just as well _not_ detect
    that an overflow has happened and end up using the wrong index
    entry.

  - We don't verify that we even have a number in the first place. We'd
    simply return "0" in that case and not advance the pointer. This is
    fine though as we verify that the returned size is non-zero, so we'd
    detect this case.

I'd much rather prefer to have an interface similar to `git_parse_int()`
and related functions, which are way easier to use compared to the likes
of `stroi()`.

Otherwise, the next time we want to have something similar like you
introduce here we might not be aware of this existing helper and may not
know to generalize it, so we'd introduce another ad-hoc helper.

Anyway. I won't object if you want to go with your version anyway, as it
at least fixes one class of errors.

Thanks!

Patrick

  reply	other threads:[~2025-11-12 11:26 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-12  7:55 [PATCH 0/9] asan bonanza Jeff King
2025-11-12  7:56 ` [PATCH 1/9] compat/mmap: mark unused argument in git_munmap() Jeff King
2025-11-12  8:01 ` [PATCH 2/9] pack-bitmap: handle name-hash lookups in incremental bitmaps Jeff King
2025-11-12 11:25   ` Patrick Steinhardt
2025-11-13  2:55   ` Taylor Blau
2025-11-18  8:59     ` Jeff King
2025-11-12  8:02 ` [PATCH 3/9] Makefile: turn on NO_MMAP when building with ASan Jeff King
2025-11-12  8:17   ` Collin Funk
2025-11-12 10:31     ` Jeff King
2025-11-12 20:06       ` Collin Funk
2025-11-12 11:26   ` Patrick Steinhardt
2025-11-13  3:12     ` Taylor Blau
2025-11-13  6:34       ` Patrick Steinhardt
2025-11-18  8:49       ` Jeff King
2025-11-13 16:30     ` Junio C Hamano
2025-11-14  7:00       ` Patrick Steinhardt
2025-11-15  2:13         ` Jeff King
2025-11-12  8:05 ` [PATCH 4/9] cache-tree: avoid strtol() on non-string buffer Jeff King
2025-11-12 11:26   ` Patrick Steinhardt [this message]
2025-11-13  3:09     ` Taylor Blau
2025-11-18  8:40       ` Jeff King
2025-11-18  8:38     ` Jeff King
2025-11-12  8:06 ` [PATCH 5/9] fsck: assert newline presence in fsck_ident() Jeff King
2025-11-12  8:06 ` [PATCH 6/9] fsck: avoid strcspn() " Jeff King
2025-11-12  8:06 ` [PATCH 7/9] fsck: remove redundant date timestamp check Jeff King
2025-11-12  8:10 ` [PATCH 8/9] fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated Jeff King
2025-11-12 11:25   ` Patrick Steinhardt
2025-11-12 19:36     ` Junio C Hamano
2025-11-15  2:12     ` Jeff King
2025-11-12  8:10 ` [PATCH 9/9] t: enable ASan's strict_string_checks option Jeff King
2025-11-13  3:17 ` [PATCH 0/9] asan bonanza Taylor Blau
2025-11-18  9:11 ` [PATCH v2 " Jeff King
2025-11-18  9:11   ` [PATCH v2 1/9] compat/mmap: mark unused argument in git_munmap() Jeff King
2025-11-18  9:12   ` [PATCH v2 2/9] pack-bitmap: handle name-hash lookups in incremental bitmaps Jeff King
2025-11-18  9:12   ` [PATCH v2 3/9] Makefile: turn on NO_MMAP when building with ASan Jeff King
2025-11-18  9:12   ` [PATCH v2 4/9] cache-tree: avoid strtol() on non-string buffer Jeff King
2025-11-18 14:30     ` Phillip Wood
2025-11-23  6:19       ` Junio C Hamano
2025-11-23 15:51         ` Phillip Wood
2025-11-23 18:06           ` Junio C Hamano
2025-11-24 22:30         ` Jeff King
2025-11-24 23:09           ` Junio C Hamano
2025-11-26 15:09             ` Jeff King
2025-11-26 17:22               ` Junio C Hamano
2025-11-30 13:13                 ` [PATCH 0/4] more robust functions for parsing int from buf Jeff King
2025-11-30 13:14                   ` [PATCH 1/4] parse: prefer bool to int for boolean returns Jeff King
2025-12-04 11:23                     ` Patrick Steinhardt
2025-11-30 13:15                   ` [PATCH 2/4] parse: add functions for parsing from non-string buffers Jeff King
2025-11-30 13:46                     ` my complaints with clar Jeff King
2025-12-01 14:16                       ` Phillip Wood
2025-12-04 11:09                         ` Patrick Steinhardt
2025-12-05 18:30                           ` Jeff King
2025-12-04 11:23                     ` [PATCH 2/4] parse: add functions for parsing from non-string buffers Patrick Steinhardt
2025-12-05 16:11                     ` Phillip Wood
2025-11-30 13:15                   ` [PATCH 3/4] cache-tree: use parse_int_from_buf() Jeff King
2025-11-30 13:16                   ` [PATCH 4/4] fsck: use parse_unsigned_from_buf() for parsing timestamp Jeff King
2025-11-18  9:12   ` [PATCH v2 5/9] fsck: assert newline presence in fsck_ident() Jeff King
2025-11-18  9:12   ` [PATCH v2 6/9] fsck: avoid strcspn() " Jeff King
2025-11-18  9:12   ` [PATCH v2 7/9] fsck: remove redundant date timestamp check Jeff King
2025-11-18  9:12   ` [PATCH v2 8/9] fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated Jeff King
2025-11-18  9:12   ` [PATCH v2 9/9] t: enable ASan's strict_string_checks option Jeff King
2025-11-23  5:49   ` [PATCH v2 0/9] asan bonanza Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aRRuzrmbJBW8q4Dd@pks.im \
    --to=ps@pks.im \
    --cc=cmlists@sent.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).