* New index-pack "keep" violates "never overwrite"
@ 2007-03-20 5:38 Shawn O. Pearce
2007-03-20 19:07 ` Nicolas Pitre
2007-03-20 19:32 ` [PATCH] don't ever allow SHA1 collisions to exist by fetching a pack Nicolas Pitre
0 siblings, 2 replies; 3+ messages in thread
From: Shawn O. Pearce @ 2007-03-20 5:38 UTC (permalink / raw)
To: git
This is something that has been bothering me for several weeks now.
Waaaaaaay back Git was considered to be secure as it never overwrote
an object it already had. This was ensured by always unpacking the
packfile received over the network (both in fetch and receive-pack)
and our already existing logic to not create a loose object for an
object we already have.
Lately however we keep "large-ish" packfiles on both fetch and push
by running them through index-pack instead of unpack-objects. This
would let an attacker perform a birthday attack.
How? Assume the attacker knows a SHA-1 that has two different
data streams. He knows the client is likely to have the "good"
one. So he sends the "evil" variant to the other end as part of
a "large-ish" packfile. The recipient keeps that packfile, and
indexes it. Now since this is a birthday attack there is a SHA-1
collision; two objects exist in the repository with the same SHA-1.
They have *very* different data streams. One of them is "evil".
Currently the poor recipient cannot tell the two objects apart,
short of by examining the timestamp of the packfiles. But lets
say the recipient repacks before he realizes he's been attacked.
We may wind up packing the "evil" version of the object, and deleting
the "good" one. This is made *even more likely* by Junio's recent
rearrange_packed_git patch (b867092f).
SHA-1 is generally considered to be broken, as there have been some
attacks implemented where a massive amount of garbage is injected
into a comment, producing a source file that a compiler/interpreter
can still process just fine, but that contains "evil bits of code"
and has the same hash as a "non-evil" version of that same file.
Yes, of course, if you look at the comment you would immediately
realize its crap. You probably would even realize the file is crap
just by looking at the file size, as typically several megabytes
of garbage is required.
But how likely are you to look at a file content, or even size,
during say git-bisect? Especially on a large project? Would you
really notice that "usb.c" took 3 seconds longer than normal to
compile because the preprecessor had to wade through a gigantic
garbage comment?
We broke a fundemental assumption in the Git security model, and I
don't think anyone blinked. Oops. Either the SHA-1 birthday attack
I just described is still thought to be a non-issue for at least
the next few years (due to current computing power limitations),
or we all missed that one, big time.
The fix does appear to be simple. Just don't write the existing
object to the output packfile. But really that is a lot more like
what unpack-objects does: buffer deltas we cannot resolve yet, and
only write out what we cannot find through our ODB. The logic in
index-pack ain't built for that...
For those that are really paranoid about this, you can disable the
pack keeping by setting transfer.unpackLimit to a *huge* value,
one that is far larger than any packfile you might receive.
--
Shawn.
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: New index-pack "keep" violates "never overwrite"
2007-03-20 5:38 New index-pack "keep" violates "never overwrite" Shawn O. Pearce
@ 2007-03-20 19:07 ` Nicolas Pitre
2007-03-20 19:32 ` [PATCH] don't ever allow SHA1 collisions to exist by fetching a pack Nicolas Pitre
1 sibling, 0 replies; 3+ messages in thread
From: Nicolas Pitre @ 2007-03-20 19:07 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git
On Tue, 20 Mar 2007, Shawn O. Pearce wrote:
> This is something that has been bothering me for several weeks now.
>
> Waaaaaaay back Git was considered to be secure as it never overwrote
> an object it already had. This was ensured by always unpacking the
> packfile received over the network (both in fetch and receive-pack)
> and our already existing logic to not create a loose object for an
> object we already have.
>
> Lately however we keep "large-ish" packfiles on both fetch and push
> by running them through index-pack instead of unpack-objects. This
> would let an attacker perform a birthday attack.
This has been nagging me as well since I made the index-pack changes.
But I dismissed that possibility as pure paranoia.
[...]
> The fix does appear to be simple. Just don't write the existing
> object to the output packfile. But really that is a lot more like
> what unpack-objects does: buffer deltas we cannot resolve yet, and
> only write out what we cannot find through our ODB. The logic in
> index-pack ain't built for that...
Nah. You would have to fix every delta base offsets up etc. and make
the whole thing into a complex mess.
Instead, we only have to check for the local presence of every objects
in the pack, and when one packed object does exist locally and
doesn't match the pack content we simply need to bail out with great
distress. If we really get a SHA1 collision, evil or not, we simply
just cannot continue as if nothing happened.
Patch to follow...
Nicolas
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH] don't ever allow SHA1 collisions to exist by fetching a pack
2007-03-20 5:38 New index-pack "keep" violates "never overwrite" Shawn O. Pearce
2007-03-20 19:07 ` Nicolas Pitre
@ 2007-03-20 19:32 ` Nicolas Pitre
1 sibling, 0 replies; 3+ messages in thread
From: Nicolas Pitre @ 2007-03-20 19:32 UTC (permalink / raw)
To: Junio C Hamano, Shawn O. Pearce; +Cc: git
Waaaaaaay back Git was considered to be secure as it never overwrote
an object it already had. This was ensured by always unpacking the
packfile received over the network (both in fetch and receive-pack)
and our already existing logic to not create a loose object for an
object we already have.
Lately however we keep "large-ish" packfiles on both fetch and push
by running them through index-pack instead of unpack-objects. This
would let an attacker perform a birthday attack.
How? Assume the attacker knows a SHA-1 that has two different
data streams. He knows the client is likely to have the "good"
one. So he sends the "evil" variant to the other end as part of
a "large-ish" packfile. The recipient keeps that packfile, and
indexes it. Now since this is a birthday attack there is a SHA-1
collision; two objects exist in the repository with the same SHA-1.
They have *very* different data streams. One of them is "evil".
Currently the poor recipient cannot tell the two objects apart,
short of by examining the timestamp of the packfiles. But lets
say the recipient repacks before he realizes he's been attacked.
We may wind up packing the "evil" version of the object, and deleting
the "good" one. This is made *even more likely* by Junio's recent
rearrange_packed_git patch (b867092f).
It is extremely unlikely for a SHA1 collisions to occur, but if it
ever happens with a remote (hence untrusted) object we simply must
not let the fetch succeed.
Normally received packs should not contain objects we already have.
But when they do we must ensure duplicated objects with the same SHA1
actually contain the same data.
Signed-off-by: Nicolas Pitre <nico@cam.org>
---
Explanation text for the log message shamelessly stolen from Shawn's
email on this issue.
I also provided a test for this otherwise it might never get exercised.
diff --git a/index-pack.c b/index-pack.c
index b405864..4effb2d 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -345,7 +345,8 @@ static int find_delta_children(const union delta_base *base,
}
static void sha1_object(const void *data, unsigned long size,
- enum object_type type, unsigned char *sha1)
+ enum object_type type, unsigned char *sha1,
+ int test_for_collision)
{
SHA_CTX ctx;
char header[50];
@@ -367,6 +368,18 @@ static void sha1_object(const void *data, unsigned long size,
SHA1_Update(&ctx, header, header_size);
SHA1_Update(&ctx, data, size);
SHA1_Final(sha1, &ctx);
+
+ if (test_for_collision && has_sha1_file(sha1)) {
+ void *has_data;
+ enum object_type has_type;
+ unsigned long has_size;
+ has_data = read_sha1_file(sha1, &has_type, &has_size);
+ if (!has_data)
+ die("cannot read existing object %s", sha1_to_hex(sha1));
+ if (size != has_size || type != has_type ||
+ memcmp(data, has_data, size) != 0)
+ die("SHA1 COLLISION FOUND WITH %s !", sha1_to_hex(sha1));
+ }
}
static void resolve_delta(struct object_entry *delta_obj, void *base_data,
@@ -387,7 +400,7 @@ static void resolve_delta(struct object_entry *delta_obj, void *base_data,
free(delta_data);
if (!result)
bad_object(delta_obj->offset, "failed to apply delta");
- sha1_object(result, result_size, type, delta_obj->sha1);
+ sha1_object(result, result_size, type, delta_obj->sha1, 1);
nr_resolved_deltas++;
hashcpy(delta_base.sha1, delta_obj->sha1);
@@ -444,7 +457,7 @@ static void parse_pack_objects(unsigned char *sha1)
delta->obj_no = i;
delta++;
} else
- sha1_object(data, obj->size, obj->type, obj->sha1);
+ sha1_object(data, obj->size, obj->type, obj->sha1, 1);
free(data);
if (verbose)
percent = display_progress(i+1, nr_objects, percent);
@@ -565,7 +578,7 @@ static void append_obj_to_pack(void *buf,
write_or_die(output_fd, header, n);
obj[1].offset = obj[0].offset + n;
obj[1].offset += write_compressed(output_fd, buf, size);
- sha1_object(buf, size, type, obj->sha1);
+ sha1_object(buf, size, type, obj->sha1, 0);
}
static int delta_pos_compare(const void *_a, const void *_b)
diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index eacb1e9..75e7276 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -255,4 +255,14 @@ test_expect_success \
:'
+test_expect_success \
+ 'fake a SHA1 hash collision' \
+ 'test -f .git/objects/c8/2de19312b6c3695c0c18f70709a6c535682a67 &&
+ cp -f .git/objects/9d/235ed07cd19811a6ceb342de82f190e49c9f68 \
+ .git/objects/c8/2de19312b6c3695c0c18f70709a6c535682a67'
+
+test_expect_failure \
+ 'make sure index-pack detects the SHA1 collision' \
+ 'git-index-pack -o bad.idx test-3.pack'
+
test_done
^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2007-03-20 19:32 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-03-20 5:38 New index-pack "keep" violates "never overwrite" Shawn O. Pearce
2007-03-20 19:07 ` Nicolas Pitre
2007-03-20 19:32 ` [PATCH] don't ever allow SHA1 collisions to exist by fetching a pack Nicolas Pitre
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).