[PATCH] apply: keep buffer/size pair in sync when parsing binary hunks

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] apply: keep buffer/size pair in sync when parsing binary hunks
@ 2021-08-10  1:01 Jeff King
  2021-08-10  5:26 ` Carlo Arenas
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff King @ 2021-08-10  1:01 UTC (permalink / raw)
  To: git; +Cc: Xingman Chen

We parse through binary hunks by looping through the buffer with code
like:

    llen = linelen(buffer, size);

    ...do something with the line...

    buffer += llen;
    size -= llen;

However, before we enter the loop, there is one call that increments
"buffer" but forgets to decrement "size". As a result, our "size" is off
by the length of that line, and subsequent calls to linelen() may look
past the end of the buffer for a newline.

The fix is easy: we just need to decrement size as we do elsewhere.

This bug goes all the way back to 0660626caf (binary diff: further
updates., 2006-05-05). Presumably nobody noticed because it only
triggers if the patch is corrupted, and even then we are often "saved"
by luck. We use a strbuf to store the incoming patch, so we overallocate
there, plus we add a 16-byte run of NULs as slop for memory comparisons.
So if this happened accidentally, the common case is that we'd just read
a few uninitialized bytes from the end of the strbuf before producing
the expected "this patch is corrupted" error complaint.

However, it is possible to carefully construct a case which reads off
the end of the buffer. The included test does so. It will pass both
before and after this patch when run normally, but using a tool like
ASan shows that we get an out-of-bounds read before this patch, but not
after.

Reported-by: Xingman Chen <xichixingman@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
This was reported to the security list, but I think the impact is small
enough to just take it as a normal fix.

I hate the test, as it is brittle and subject to bitrot if we do things
like change the default strbuf hint size. If that does happen, it will
still pass; it just won't do anything interesting. But I would like some
record of how to reproduce, and I prefer this to putting something
hard-to-run in the commit message.

 apply.c                 |  1 +
 t/t4103-apply-binary.sh | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/apply.c b/apply.c
index 44bc31d6eb..4ed4b27169 100644
--- a/apply.c
+++ b/apply.c
@@ -1917,6 +1917,7 @@ static struct fragment *parse_binary_hunk(struct apply_state *state,

 	state->linenr++;
 	buffer += llen;
+	size -= llen;
 	while (1) {
 		int byte_length, max_byte_length, newsize;
 		llen = linelen(buffer, size);
diff --git a/t/t4103-apply-binary.sh b/t/t4103-apply-binary.sh
index fad6d3f542..d370ecfe0d 100755
--- a/t/t4103-apply-binary.sh
+++ b/t/t4103-apply-binary.sh
@@ -158,4 +158,27 @@ test_expect_success 'apply binary -p0 diff' '
 	test -z "$(git diff --name-status binary -- file3)"
 '

+test_expect_success 'reject truncated binary diff' '
+	do_reset &&
+
+	# this length is calculated to get us very close to
+	# the 8192-byte strbuf we will use to read in the patch.
+	test-tool genrandom foo 6205 >file1 &&
+	git diff --binary >patch &&
+
+	# truncate the patch at the second "literal" line,
+	# but exclude the trailing newline. We must use perl
+	# for this, since tools like "sed" cannot reliably
+	# produce output without the trailing newline.
+	perl -pe "
+		if (/^literal/ && \$count++ >= 1) {
+			chomp;
+			print;
+			exit 0;
+		}
+	" <patch >patch.trunc &&
+
+	do_reset &&
+	test_must_fail git apply patch.trunc
+'
 test_done
-- 
2.33.0.rc1.475.g023efe0ae4

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] apply: keep buffer/size pair in sync when parsing binary hunks
  2021-08-10  1:01 [PATCH] apply: keep buffer/size pair in sync when parsing binary hunks Jeff King
@ 2021-08-10  5:26 ` Carlo Arenas
  2021-08-10 18:53   ` Jeff King
  2021-08-12 20:56   ` Felipe Contreras
  0 siblings, 2 replies; 4+ messages in thread
From: Carlo Arenas @ 2021-08-10  5:26 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Xingman Chen

On Mon, Aug 9, 2021 at 9:24 PM Jeff King <peff@peff.net> wrote:
> diff --git a/apply.c b/apply.c
> index 44bc31d6eb..4ed4b27169 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -1917,6 +1917,7 @@ static struct fragment *parse_binary_hunk(struct apply_state *state,
>
>         state->linenr++;
>         buffer += llen;
> +       size -= llen;
>         while (1) {

Ironically, I was looking at this code because of your previous
patch[1] that you suggested was ugly
and because I was going to suggest moving from a for to a while loop
to avoid the overly long line.

It is interesting to note though, that having a for (and obviously
removing the last 2 lines from the loop) with a comma separated
increment instead would
have made this issue IMHO more obvious, and also why I decided against
that; would it be a good idea to fix that as well?

  for (; ; buffer += llen, size -= llen) {

Carlo

[1] https://lore.kernel.org/git/YRGwjgAIyLPb7g50@coredump.intra.peff.net/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] apply: keep buffer/size pair in sync when parsing binary hunks
  2021-08-10  5:26 ` Carlo Arenas
@ 2021-08-10 18:53   ` Jeff King
  2021-08-12 20:56   ` Felipe Contreras
  1 sibling, 0 replies; 4+ messages in thread
From: Jeff King @ 2021-08-10 18:53 UTC (permalink / raw)
  To: Carlo Arenas; +Cc: git, Xingman Chen

On Mon, Aug 09, 2021 at 10:26:51PM -0700, Carlo Arenas wrote:

> On Mon, Aug 9, 2021 at 9:24 PM Jeff King <peff@peff.net> wrote:
> > diff --git a/apply.c b/apply.c
> > index 44bc31d6eb..4ed4b27169 100644
> > --- a/apply.c
> > +++ b/apply.c
> > @@ -1917,6 +1917,7 @@ static struct fragment *parse_binary_hunk(struct apply_state *state,
> >
> >         state->linenr++;
> >         buffer += llen;
> > +       size -= llen;
> >         while (1) {
> 
> Ironically, I was looking at this code because of your previous
> patch[1] that you suggested was ugly
> and because I was going to suggest moving from a for to a while loop
> to avoid the overly long line.
> 
> It is interesting to note though, that having a for (and obviously
> removing the last 2 lines from the loop) with a comma separated
> increment instead would
> have made this issue IMHO more obvious, and also why I decided against
> that; would it be a good idea to fix that as well?
> 
>   for (; ; buffer += llen, size -= llen) {

That wouldn't have fixed this issue, since it was actually before the
start of the loop that where the problem was. It might have made it more
obvious if the two parts were next to each other, but I'm not so sure.
There are lots of reasons why the stuff before a loop may not be
symmetric with the loop increment.

We could also initialize it like this:

  for (buffer += llen, size -= llen; ; buffer += llen, size -= llen)

but IMHO that is harder to read. The increment at the start is not part
of the loop initialization. It is really a "finishing" of the earlier
parsing that occurred (in fact, if I were writing this from scratch, I'd
probably have put extra whitespace before the start of the loop).

I did consider moving that other case to a while-loop, but then you have
to manually catch all of the loop continuations (which is verbose and
error-prone). So I dunno. There is no free lunch. ;)

-Peff

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] apply: keep buffer/size pair in sync when parsing binary hunks
  2021-08-10  5:26 ` Carlo Arenas
  2021-08-10 18:53   ` Jeff King
@ 2021-08-12 20:56   ` Felipe Contreras
  1 sibling, 0 replies; 4+ messages in thread
From: Felipe Contreras @ 2021-08-12 20:56 UTC (permalink / raw)
  To: Carlo Arenas; +Cc: git, Xingman Chen

Carlo Arenas wrote:
> On Mon, Aug 9, 2021 at 9:24 PM Jeff King <peff@peff.net> wrote:
> > diff --git a/apply.c b/apply.c
> > index 44bc31d6eb..4ed4b27169 100644
> > --- a/apply.c
> > +++ b/apply.c
> > @@ -1917,6 +1917,7 @@ static struct fragment *parse_binary_hunk(struct apply_state *state,
> >
> >         state->linenr++;
> >         buffer += llen;
> > +       size -= llen;
> >         while (1) {
> 
> Ironically, I was looking at this code because of your previous
> patch[1] that you suggested was ugly
> and because I was going to suggest moving from a for to a while loop
> to avoid the overly long line.
> 
> It is interesting to note though, that having a for (and obviously
> removing the last 2 lines from the loop) with a comma separated
> increment instead would
> have made this issue IMHO more obvious, and also why I decided against
> that; would it be a good idea to fix that as well?

What's the point in updating size when it can be calculated? Just add a
pointer to the end of the buffer, and then size is the difference
between the buffer which is moving, and the end pointer which is fixed:

-- 8< --

diff --git a/apply.c b/apply.c
index 44bc31d6eb..cfb5a00356 100644
--- a/apply.c
+++ b/apply.c
@@ -1891,15 +1891,15 @@ static struct fragment *parse_binary_hunk(struct apply_state *state,
 	 * to 1-26 bytes, and 'a'-'z' corresponds to 27-52 bytes.
 	 */
 	int llen, used;
-	unsigned long size = *sz_p;
 	char *buffer = *buf_p;
+	char *end = *buf_p + *sz_p;
 	int patch_method;
 	unsigned long origlen;
 	char *data = NULL;
 	int hunk_size = 0;
 	struct fragment *frag;
 
-	llen = linelen(buffer, size);
+	llen = linelen(buffer, end - buffer);
 	used = llen;
 
 	*status_p = 0;
@@ -1919,13 +1919,12 @@ static struct fragment *parse_binary_hunk(struct apply_state *state,
 	buffer += llen;
 	while (1) {
 		int byte_length, max_byte_length, newsize;
-		llen = linelen(buffer, size);
+		llen = linelen(buffer, end - buffer);
 		used += llen;
 		state->linenr++;
 		if (llen == 1) {
 			/* consume the blank line */
 			buffer++;
-			size--;
 			break;
 		}
 		/*
@@ -1955,7 +1954,6 @@ static struct fragment *parse_binary_hunk(struct apply_state *state,
 			goto corrupt;
 		hunk_size = newsize;
 		buffer += llen;
-		size -= llen;
 	}
 
 	CALLOC_ARRAY(frag, 1);
@@ -1966,7 +1964,7 @@ static struct fragment *parse_binary_hunk(struct apply_state *state,
 	free(data);
 	frag->size = origlen;
 	*buf_p = buffer;
-	*sz_p = size;
+	*sz_p = end - buffer;
 	*used_p = used;
 	frag->binary_patch_method = patch_method;
 	return frag;

-- 
Felipe Contreras

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-08-12 20:56 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-08-10  1:01 [PATCH] apply: keep buffer/size pair in sync when parsing binary hunks Jeff King
2021-08-10  5:26 ` Carlo Arenas
2021-08-10 18:53   ` Jeff King
2021-08-12 20:56   ` Felipe Contreras

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).