* [zooko@zooko.com: [Revctrl] colliding md5 hashes of human-meaningful documents]
@ 2005-06-12 8:25 Petr Baudis
2005-06-12 13:14 ` Morten Welinder
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Petr Baudis @ 2005-06-12 8:25 UTC (permalink / raw)
To: git; +Cc: torvalds
----- Forwarded message from zooko@zooko.com -----
There is nothing theoretically surprising about this, but hopefully its
concreteness and the accompanying scenario will make an impression on
people. The same technique should work to generate two documents with
identical SHA1 hashes.
http://www.cits.rub.de/MD5Collisions/
----- End forwarded message -----
I expected the two PostScript files to differ in some huge binary blob,
but it turns out the binary part is very small (about 256 bytes) and
only a few (about nine) bytes are different, contrary to how people have
predicted the collisions. This is much closer to finding a collision
between similar pure C files, I think. Rather unsettling.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
<Espy> be careful, some twit might quote you out of context..
* Re: [zooko@zooko.com: [Revctrl] colliding md5 hashes of human-meaningful documents]
2005-06-12 8:25 [zooko@zooko.com: [Revctrl] colliding md5 hashes of human-meaningful documents] Petr Baudis
@ 2005-06-12 13:14 ` Morten Welinder
2005-06-12 14:53 ` Martin Uecker
2005-06-12 17:03 ` Linus Torvalds
2 siblings, 0 replies; 5+ messages in thread
From: Morten Welinder @ 2005-06-12 13:14 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
I looked at this.
If you can find just *one* colliding pair that does not contain a
few select bytes like 0x22 (quote), 0x5c (backslash) and
perhaps 0x0a then it is trivial to make a colliding pair of
C programs.
Call the initial pair (A,B) and make the programs
    /* Common prefix that happens to have integer block size
       ending just before the A below. */
    const char junk[] = "AB";
    /* "AA" in program 1, "AB" in program 2. */

    ...

    int
    main ()
    {
      int hsize = sizeof (junk) / 2;
      if (memcmp (junk, junk + hsize, hsize))
        return do_program_2 ();
      else
        return do_program_1 ();
    }
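
The selection step in the sketch above can be tried out with ordinary
strings. The following is only a minimal illustration, assuming made-up
stand-in blocks: BLOCK_A and BLOCK_B are hypothetical placeholders that do
NOT collide under MD5, and select_program is a name invented here. Alignment
of the blocks to the hash block size is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the colliding pair (A,B).  Real blocks would be
   binary output of a collision search; these two strings do NOT collide. */
#define BLOCK_A "aaaaaaaa"
#define BLOCK_B "aaaabaaa"

/* The only bytes that would differ between the two source files. */
static const char junk_program_1[] = BLOCK_A BLOCK_A;   /* "AA" */
static const char junk_program_2[] = BLOCK_A BLOCK_B;   /* "AB" */

/* Returns 1 or 2, i.e. which program body the memcmp test would pick.
   junk_size is sizeof the array, so it counts the trailing NUL. */
static int
select_program (const char *junk, size_t junk_size)
{
  /* Two blocks of n bytes plus the NUL give 2n + 1, and (2n + 1) / 2 == n. */
  size_t hsize = junk_size / 2;

  assert (junk_size >= 3 && junk_size % 2 == 1);
  return memcmp (junk, junk + hsize, hsize) ? 2 : 1;
}
```

Calling select_program (junk_program_1, sizeof junk_program_1) picks program
1 because both halves are equal, while the "AB" variant picks program 2. The
rest of both source files stays byte-for-byte identical.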
* Re: [zooko@zooko.com: [Revctrl] colliding md5 hashes of human-meaningful documents]
2005-06-12 8:25 [zooko@zooko.com: [Revctrl] colliding md5 hashes of human-meaningful documents] Petr Baudis
2005-06-12 13:14 ` Morten Welinder
@ 2005-06-12 14:53 ` Martin Uecker
2005-06-12 17:03 ` Linus Torvalds
2 siblings, 0 replies; 5+ messages in thread
From: Martin Uecker @ 2005-06-12 14:53 UTC (permalink / raw)
To: git; +Cc: zooko
On Sun, Jun 12, 2005 at 10:25:55AM +0200, Petr Baudis wrote:
> ----- Forwarded message from zooko@zooko.com -----
>
> There is nothing theoretically surprising about this, but hopefully its
> concreteness and the accompanying scenario will make an impression on
> people. The same technique should work to generate two documents with
> identical SHA1 hashes.
>
> http://www.cits.rub.de/MD5Collisions/
>
> ----- End forwarded message -----
>
> I expected the two PostScript files to differ in some huge binary blob,
> but it turns out the binary part is very small (about 256 bytes) and
> only a few (about nine) bytes are different, contrary to how people have
> predicted the collisions. This is much closer to finding a collision
> between similar pure C files, I think. Rather unsettling.
>
This attack scenario doesn't demonstrate the danger of hash
collisions but the danger of signing documents you do not
understand. The same technique works in exactly the same way
with PostScript files which are actually identical but produce
different output under different conditions (time, fonts
installed on the printer, whatever).
Never sign anything but plain text or documents which are
created in a controlled way and avoid signing documents
you did not create yourself.
Martin
--
One night, when little Giana from Milano was fast asleep,
she had a strange dream.
* Re: [zooko@zooko.com: [Revctrl] colliding md5 hashes of human-meaningful documents]
2005-06-12 8:25 [zooko@zooko.com: [Revctrl] colliding md5 hashes of human-meaningful documents] Petr Baudis
2005-06-12 13:14 ` Morten Welinder
2005-06-12 14:53 ` Martin Uecker
@ 2005-06-12 17:03 ` Linus Torvalds
2005-06-14 2:06 ` Daniel Barkalow
2 siblings, 1 reply; 5+ messages in thread
From: Linus Torvalds @ 2005-06-12 17:03 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
On Sun, 12 Jun 2005, Petr Baudis wrote:
>
> I expected the two PostScript files to differ in some huge binary blob,
> but it turns out the binary part is very small (about 256 bytes) and
> only a few (about nine) bytes are different, contrary to how people have
> predicted the collisions. This is much closer to finding a collision
> between similar pure C files, I think. Rather unsettling.
This is not close at all. The "small" binary blob (256 bytes) only encodes
one single bit of information.
In other words, they've really changed _one_ bit of information by doing a
256-byte random binary blob. Anybody who calls that "small" didn't really
look closely.
Is it clever? Yes. But it isn't about making one C file look like another,
it's using the property of controlling _both_ of the files, and making
them contain all the information, and then making the single-bit
change collapse the output into two different modes by using a postscript
interpreter to make it print out the same.
Is it a real problem? Yes, because a _lot_ of document formats are
structured and are amenable to things like this. But the problem here is
the fact that you can fool somebody into signing something without
realizing that it has a lot of hidden information thanks to having formats
that can hide the blobs.
So the problem is totally different from the way git uses a hash. In the
git model, an attacker by definition cannot control both versions of a
file, since if he controls just _one_ version, he doesn't need to do the
attack in the first place!
Put another way: you could use this exact example for a version of git
that uses md5-sums instead of sha1's, but it wouldn't show anything at all
about a git vulnerability even so.
The one thing it does show is that you should probably never sign anything
but a nice human-readable ASCII file that you actually opened in your own
editor.
Linus
* Re: [zooko@zooko.com: [Revctrl] colliding md5 hashes of human-meaningful documents]
2005-06-12 17:03 ` Linus Torvalds
@ 2005-06-14 2:06 ` Daniel Barkalow
0 siblings, 0 replies; 5+ messages in thread
From: Daniel Barkalow @ 2005-06-14 2:06 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Petr Baudis, git
On Sun, 12 Jun 2005, Linus Torvalds wrote:
> Put another way: you could use this exact example for a version of git
> that uses md5-sums instead of sha1's, but it wouldn't show anything at all
> about a git vulnerability even so.
You couldn't use this exact example for an md5 git; git compresses the
files before hashing, which means that you don't have an md5 block of
arbitrary data you can replace with a different arbitrary block because it
wouldn't decompress.
Of course, if zlib has a way of saying, "if bytes 256-511 match 512-767,
decompress the first of the two records starting at 768, otherwise
decompress the second" then the attack would work, and we should all be
worried (and disturbed by zlib in general). Chances are that it would be
impractical to find a pair of blocks such that they are both valid in the
same part of a zlib record and both leave the compression context such
that the same remaining content decompresses successfully and both have
the same md5 hash, let alone getting the results in both cases to be valid
C that depends on the difference between the blocks. It's possible that
you could get it to work with only a moderately large number of weak
collisions between very similar blocks, but it's not nearly so easy a
task.
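
Purely to illustrate the hypothetical construct described above: the
conditional would amount to something like the C below. This is not a zlib
feature or API, and nothing in this thread says the compressed format can
express it. The name pick_record and the constants are invented for the
sketch; the offsets are the ones from the example sentence.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offsets taken from the example: bytes 256-511 are compared with bytes
   512-767, and the two candidate records start at byte 768. */
enum
{
  BLOCK_LEN = 256,
  FIRST_BLOCK = 256,
  SECOND_BLOCK = 512,
  RECORDS_START = 768
};

/* Hypothetical selector, NOT part of zlib.  Returns 0 to pick the first
   record, 1 to pick the second, and -1 if the stream is too short to hold
   both compared blocks. */
static int
pick_record (const unsigned char *stream, size_t len)
{
  if (len < RECORDS_START)
    return -1;
  return memcmp (stream + FIRST_BLOCK, stream + SECOND_BLOCK, BLOCK_LEN) == 0
         ? 0 : 1;
}
```

With such a selector, swapping one 256-byte block for its colliding partner
would flip the choice of record while leaving every other byte unchanged.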
-Daniel
*This .sig left intentionally blank*