From: Andy Isaacson <adi@hexapodia.org>
To: omb@bluewin.ch
Cc: git@vger.kernel.org
Subject: Re: SHA1 hash safety
Date: Mon, 18 Apr 2005 00:43:23 -0700 [thread overview]
Message-ID: <20050418074323.GA29765@hexapodia.org> (raw)
In-Reply-To: <4261852B.6090507@khandalf.com>
[trimmed cc list, nobody wants to read this noise]
On Sat, Apr 16, 2005 at 11:35:39PM +0200, Brian O'Mahoney wrote:
> >> (1) I _have_ seen real-life collisions with MD5, in the context of
> >> Document management systems containing ~10^6 ms-WORD documents.
> >
> > Dude! You could have been *famous*! Why the
> > aitch-ee-double-hockey-sticks didn't you publish this when you found it?
> > Seriously, man.
>
> The MD5 has was fine, or at least the code (a) produced the correct
> results on the published test cases, and, (b) was properly applied to
> all bytes of the file(s). I was surprised when it happened, which is why
> I bothered to post to this list at this time, so I make two more points
OK, I guess it's time for some remedial math.
There are 2^128 = 340282366920938463463374607431768211456 different MD5
hashes.
You are suggesting that you found a collision using ~1e6 = ~1,000,000
plaintexts.
Let's suppose there were actually 100,000,000 = 1e8 plaintexts, just in
case you underestimated the number.
Applying the birthday paradox, we have a 50% probability that you'd find
one collision if there were ~7,213,475,309,916,173 possible hash values.
If you extend the birthday argument ("what is the probability of a
collision if you take N samples from a set of size M?") you get the
following results, with N = 1e8:
50% (1 in 2) probability of collision in 7,213,475,309,916,173.
1% (1 in 100) probability of collision in 497,496,027,172,833,194.
.05% (1 in 1845) probability of collision in 9,223,372,036,854,775,806.
That's where my quick-and-dirty solver craps out, but we're still a
really long ways from
340,282,366,920,938,463,463,374,607,431,768,211,456.
A simple linear extrapolation (certainly wrong, but not by more than a
few dozen orders of magnitude) says that the odds would be
1 in 68,056,473,384,187,692,692 for the full MD5 hash (I'm not even
going to dignify that with a percentage).
I'm not going to do the sums, but I would hazard a guess that it's more
likely your PC suffered a cosmic-ray-induced memory fault - EACH OF THE
FOUR TIMES YOU TESTED IT - causing it to report the same MD5, than that
you actually discovered a collision with a measly million (or even
hundred million) plaintexts.
(Of course, I don't know how many tests of the hash you actually did.
But the point stands.)
Hell, if you're *that* lucky, what are you doing in IT? You could be
making a killing at the roulette table.
Or, even more likely, there was some other factor in the system (most
likely that it was only using a few bits, probably 32, of the hash
when looking for collisions) that resulted in a false alarm.
If you had actual evidence of a collision, I'd love to see it - even if
it's just the equivalent of
% md5 foo
d3b07384d113edec49eaa6238ad5ff00 foo
% md5 bar
d3b07384d113edec49eaa6238ad5ff00 bar
% cmp foo bar
foo bar differ: byte 25, line 1
%
But in the absence of actual evidence, we have to assume (just based on
the probabilities) that there was some error in your testing.
-andy
next prev parent reply other threads:[~2005-04-18 7:39 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-04-16 12:24 SHA1 hash safety David Lang
2005-04-16 12:31 ` Ingo Molnar
2005-04-16 12:48 ` David Lang
2005-04-16 13:29 ` Brian O'Mahoney
2005-04-16 14:58 ` C. Scott Ananian
2005-04-16 15:11 ` Petr Baudis
2005-04-16 15:36 ` C. Scott Ananian
2005-04-16 22:56 ` David Lang
2005-04-16 23:11 ` Paul Jackson
2005-04-16 23:18 ` Martin Mares
2005-04-17 4:38 ` David A. Wheeler
2005-04-18 0:09 ` Theodore Ts'o
2005-04-16 15:49 ` ross
2005-04-17 6:35 ` Horst von Brand
2005-04-18 2:07 ` Brian O'Mahoney
2005-04-18 16:50 ` C. Scott Ananian
2005-04-16 19:16 ` Paul Jackson
2005-04-16 21:35 ` Brian O'Mahoney
2005-04-18 7:43 ` Andy Isaacson [this message]
2005-04-18 17:04 ` C. Scott Ananian
2005-04-19 22:30 ` David Meybohm
2005-04-19 22:48 ` C. Scott Ananian
2005-04-20 18:56 ` David Meybohm
2005-04-16 22:46 ` David Lang
2005-04-16 23:14 ` Paul Jackson
2005-04-16 22:33 ` David Lang
2005-04-17 3:23 ` Tkil
2005-04-17 4:09 ` Paul Jackson
2005-04-17 4:43 ` Tkil
2005-04-17 5:09 ` Paul Jackson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20050418074323.GA29765@hexapodia.org \
--to=adi@hexapodia.org \
--cc=git@vger.kernel.org \
--cc=omb@bluewin.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).