From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeff King
Subject: Re: Suggestion on hashing
Date: Fri, 2 Dec 2011 12:54:44 -0500
Message-ID: <20111202175444.GB24093@sigill.intra.peff.net>
References: <1322813319.4340.109.camel@yos>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Cc: git@vger.kernel.org, pclouds@gmail.com
To: Bill Zaumen
In-Reply-To: <1322813319.4340.109.camel@yos>
X-Mailing-List: git@vger.kernel.org

On Fri, Dec 02, 2011 at 12:08:39AM -0800, Bill Zaumen wrote:

> At one point Nguyen said that "What I'm thinking is whether it's
> possible to decouple two sha-1 roles in git, as object identifier
> and digest, separately. Each sha-1 identifies an object and an extra
> set of digests on the "same" object."
>
> My code pretty much does that (it just uses a CRC instead of a real
> digest, but I can easily change that).
> So the question is whether
> using SHA-1 as an ID and SHA-256(?) as a digest is a better long term
> solution than simply replacing SHA-1.

I think your code is solving the wrong problem (or solving the right
problem in a half-way manner). The only things that make sense to me
are:

  1. Do nothing. SHA-1 is probably not broken yet, even by the NSA, and
     even if it is, an attack is extremely expensive to mount. This may
     change in the future, of course, but it will probably stay
     expensive for a while.

  2. Decouple the object identifier and digest roles, but insert the
     digest into newly created objects, so it can be part of the
     signature chain. I described such a scheme in one of my replies to
     you. It has some complexities, but has the bonus that we can build
     directly on older history, preserving its sha1s.

  3. Replace SHA-1 with a more secure algorithm.

I'm probably in favor of (1) at this point. Whether to do (2) or (3)
will depend on where we are when SHA-1 gets feasibly broken. It may be
many years away, at which point we may be considering a git 2.0 that
breaks repository compatibility, anyway. That would be a natural time to
consider changing the algorithm.

> Replacing SHA-1 with something like SHA-256 sounds easier to implement,
> but the problem is all the existing repositories.

Right. I don't think anyone is denying that it would be a giant pain.

-Peff
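
For illustration only, option (2) above could be sketched roughly as
follows. This is not code from git or from the scheme referred to in the
message; it is a minimal sketch under stated assumptions. The SHA-1 name
is computed the way git computes it today (over "<type> <size>\0" plus
the body). Everything else is hypothetical: the choice of SHA-256 as the
digest, the "x-digest" header name, the helper names, and the omission
of author/committer lines.

```python
import hashlib


def _framed(obj_type: str, body: bytes) -> bytes:
    # git hashes "<type> <size>\0" followed by the object body.
    return f"{obj_type} {len(body)}".encode() + b"\0" + body


def object_id(obj_type: str, body: bytes) -> str:
    """Identifier role: the SHA-1 object name, unchanged from today."""
    return hashlib.sha1(_framed(obj_type, body)).hexdigest()


def object_digest(obj_type: str, body: bytes) -> str:
    """Digest role: SHA-256 over the same bytes (assumed algorithm)."""
    return hashlib.sha256(_framed(obj_type, body)).hexdigest()


def commit_body(tree, parents, message: str) -> bytes:
    """Build a simplified commit body.

    `tree` and each entry of `parents` are (sha1_id, digest_or_None)
    pairs. The "x-digest" header is hypothetical; real git has no such
    header. Author and committer lines are left out for brevity.
    """
    lines = [f"tree {tree[0]}"]
    if tree[1]:
        lines.append(f"x-digest {tree[0]} {tree[1]}")
    for parent_id, parent_digest in parents:
        lines.append(f"parent {parent_id}")
        if parent_digest:
            lines.append(f"x-digest {parent_id} {parent_digest}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()
```

Because the digest lines sit inside the new commit's body, they are
covered by that commit's own SHA-1 name and by any signature made over
it. A commit built without digests keeps exactly the bytes, and so the
sha1, it would have had without this header, which is how newer commits
could sit directly on top of older history.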