From mboxrd@z Thu Jan 1 00:00:00 1970 From: linux@horizon.com Subject: Re: [zooko@zooko.com: [Revctrl] colliding md5 hashes of human-meaningful Date: 13 Jun 2005 23:03:24 -0000 Message-ID: <20050613230324.32002.qmail@science.horizon.com> References: Cc: git@vger.kernel.org, linux@horizon.com X-From: git-owner@vger.kernel.org Tue Jun 14 01:03:15 2005 Return-path: Received: from vger.kernel.org ([12.107.209.244]) by ciao.gmane.org with esmtp (Exim 4.43) id 1DhxxH-0005Gu-P9 for gcvg-git@gmane.org; Tue, 14 Jun 2005 01:02:56 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S261647AbVFMXHT (ORCPT ); Mon, 13 Jun 2005 19:07:19 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S261646AbVFMXFv (ORCPT ); Mon, 13 Jun 2005 19:05:51 -0400 Received: from science.horizon.com ([192.35.100.1]:25406 "HELO science.horizon.com") by vger.kernel.org with SMTP id S261183AbVFMXDZ (ORCPT ); Mon, 13 Jun 2005 19:03:25 -0400 Received: (qmail 32003 invoked by uid 1000); 13 Jun 2005 23:03:24 -0000 To: torvalds@osdl.org In-Reply-To: Sender: git-owner@vger.kernel.org Precedence: bulk X-Mailing-List: git@vger.kernel.org Oh, for heaven's sake. This is getting to be entirely too big an issue. For the record, I believe: - There are far easier ways to back-door the public Linux kernel. (See the recent "subtly malicious code" contest.) - There are far easier ways to back-door a particular target's kernel. - There are easier ways than back-dooring their kernel to attack a particular target's computer. - A 2^80 work factor is sufficient protection for quite a few years, especially as the processor speed growth curve finally seems to be flattening. All I was trying to point out is that a sqrt(n) hash collision attack *exists*. Not that it's practical or anything. For the quantity of computers required, it would be *cheaper* to start a factory making my own line of PCI bus master network cards that included remote memory sniffing features on-board! > Yeah, yeah. You avoided a few laws of phsyics of your own. > > For example, when you say > > "(With an n-bit hash and an automated way to make harmless changes > to source files, I can generate 2^(n/2) variants of each and expect to > get a match, even in the absence of a better attack.)" > > you kind of ignore the fact that "n" here is 160, and so you're going to > be searching for quite a few versions of each. Also, you have to compare > the sha's of all of those 2**80 versions against each other which is a lot > of work in itself. I am most definitely NOT ignoring it. That's what makes the attack infeasible. (But I insist that 2^80 is a hell of a long way from violating any laws of physics.) As for the comparison, if the two texts have the same form, there are well-known ways to reduce the storage requirements by large factors like 2^40. For texts of different form, regular Pollard's rho doesn't work, but maybe something ingenious is possible. All I was trying to say is that a sqrt(2^160) attack exists. Remember the original statement? >>> So the problem is totally different from the way git uses a hash. In the >>> git model, an attacker by definition cannot control both versions of a >>> file, since if he controls just _one_ version, he doesn't need to do the >>> attack in the first place! >>> >>> Put another way: you could use this exact example for a version of git >>> that uses md5-sums instead of sha1's, but it wouldn't show anything at all >>> about a git vulnerability even so. *That's* what I was responding to. It basically says "there is no collision attack on the way git uses the hash", a statement I disagree with. I didn't say it was a realistic attack, because the whole reason that SHA is 160 bits is so that it's resistant to a hash collision attack! > Finally, you have to make sure that al the versions make sense, and that > people will take them 100% unmodified. Making them all make sense isn't hard; I can come up with 80 binary variants on a piece of code by changing order of helper functions, order of variable declarations, order of statements, comments, spacing, and whatnot without even having to get into trailing or redundant whitespace. Making sure that nobody messes with it is trickier. That would be easier with a small patch, but that would in turn make the problem of generating 2^80 plausible variants harder. I'm reaching a bit, but hey, it was a quick description, not a fully-fledged Cunning Plan. > My plan was more interesting, I feel. Oh for you, sure. But as the one you had nominated to carry it out, I'd like to object! For plausibility, I prefer mine for manufacturing network cards with built-in back doors. There's a chance I could even get the Chinese government to help fund it.