From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] dir.c: skip .gitignore, etc larger than INT_MAX
Date: Wed, 05 Jun 2024 09:23:21 -0700 [thread overview]
Message-ID: <xmqqplsvuznq.fsf@gitster.g> (raw)
In-Reply-To: <20240605080308.GA2345232@coredump.intra.peff.net> (Jeff King's message of "Wed, 5 Jun 2024 04:03:08 -0400")
Jeff King <peff@peff.net> writes:
> I do not mind if consolidation waits for later, but I guess the
> immediate question is whether we'd prefer to reduce the limit set by my
> patch to a more security-conservative value. It would be easy to swap
> out INT_MAX for a 100MB #define on top.
>
> Maybe this?
Sounds sensible. Thanks.
> -- >8 --
> Subject: [PATCH] dir.c: reduce max pattern file size to 100MB
>
> In a2bc523e1e (dir.c: skip .gitignore, etc larger than INT_MAX,
> 2024-05-31) we put capped the size of some files whose parsing code and
> data structures used ints. Setting the limit to INT_MAX was a natural
> spot, since we know the parsing code would misbehave above that.
>
> But it also leaves the possibility of overflow errors when we multiply
> that limit to allocate memory. For instance, a file consisting only of
> "a\na\n..." could have INT_MAX/2 entries. Allocating an array of
> pointers for each would need INT_MAX*4 bytes on a 64-bit system, enough
> to overflow a 32-bit int.
>
> So let's give ourselves a bit more safety margin by giving a much
> smaller limit. The size 100MB is somewhat arbitrary, but is based on the
> similar value for attribute files added by 3c50032ff5 (attr: ignore
> overly large gitattributes files, 2022-12-01).
>
> There's no particular reason these have to be the same, but the idea is
> that they are in the ballpark of "so huge that nobody would care, but
> small enough to avoid malicious overflow". So lacking a better guess, it
> makes sense to use the same value. The implementation here doesn't share
> the same constant, but we could change that later (or even give it a
> runtime config knob, though nobody has complained yet about the
> attribute limit).
>
> And likewise, let's add a few tests that exercise the limits, based on
> the attr ones. In this case, though, we never read .gitignore from the
> index; the blob code is exercised only for sparse filters. So we'll
> trigger it that way.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> dir.c | 10 ++++++++--
> t/t0008-ignores.sh | 8 ++++++++
> t/t6112-rev-list-filters-objects.sh | 12 ++++++++++++
> 3 files changed, 28 insertions(+), 2 deletions(-)
>
> diff --git a/dir.c b/dir.c
> index 914060edfd..ad2b7ebe2d 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -32,6 +32,12 @@
> #include "tree.h"
> #include "hex.h"
>
> + /*
> + * The maximum size of a pattern/exclude file. If the file exceeds this size
> + * we will ignore it.
> + */
> +#define PATTERN_MAX_FILE_SIZE (100 * 1024 * 1024)
> +
> /*
> * Tells read_directory_recursive how a file or directory should be treated.
> * Values are ordered by significance, e.g. if a directory contains both
> @@ -1149,7 +1155,7 @@ static int add_patterns(const char *fname, const char *base, int baselen,
> }
> }
>
> - if (size > INT_MAX) {
> + if (size > PATTERN_MAX_FILE_SIZE) {
> warning("ignoring excessively large pattern file: %s", fname);
> free(buf);
> return -1;
> @@ -1211,7 +1217,7 @@ int add_patterns_from_blob_to_list(
> if (r != 1)
> return r;
>
> - if (size > INT_MAX) {
> + if (size > PATTERN_MAX_FILE_SIZE) {
> warning("ignoring excessively large pattern blob: %s",
> oid_to_hex(oid));
> free(buf);
> diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
> index 361446b2f4..02a18d4fdb 100755
> --- a/t/t0008-ignores.sh
> +++ b/t/t0008-ignores.sh
> @@ -945,4 +945,12 @@ test_expect_success SYMLINKS 'symlinks not respected in-tree' '
> test_grep "unable to access.*gitignore" err
> '
>
> +test_expect_success EXPENSIVE 'large exclude file ignored in tree' '
> + test_when_finished "rm .gitignore" &&
> + dd if=/dev/zero of=.gitignore bs=101M count=1 &&
> + git ls-files -o --exclude-standard 2>err &&
> + echo "warning: ignoring excessively large pattern file: .gitignore" >expect &&
> + test_cmp expect err
> +'
> +
> test_done
> diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh
> index 43e1afd44c..0387f35a32 100755
> --- a/t/t6112-rev-list-filters-objects.sh
> +++ b/t/t6112-rev-list-filters-objects.sh
> @@ -701,4 +701,16 @@ test_expect_success 'expand blob limit in protocol' '
> grep "blob:limit=1024" trace
> '
>
> +test_expect_success EXPENSIVE 'large sparse filter file ignored' '
> + blob=$(dd if=/dev/zero bs=101M count=1 |
> + git hash-object -w --stdin) &&
> + test_must_fail \
> + git rev-list --all --objects --filter=sparse:oid=$blob 2>err &&
> + cat >expect <<-EOF &&
> + warning: ignoring excessively large pattern blob: $blob
> + fatal: unable to parse sparse filter data in $blob
> + EOF
> + test_cmp expect err
> +'
> +
> test_done
next prev parent reply other threads:[~2024-06-05 16:23 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-31 12:00 [PATCH] dir.c: skip .gitignore, etc larger than INT_MAX Jeff King
2024-05-31 15:05 ` Junio C Hamano
2024-06-04 10:39 ` Jeff King
2024-06-04 17:45 ` Junio C Hamano
2024-06-05 8:03 ` Jeff King
2024-06-05 16:23 ` Junio C Hamano [this message]
2024-05-31 15:10 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqplsvuznq.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.