Re: About git and the use of SHA-1

From: Andreas Ericsson <ae@op5.se>
To: Henrik Austad <henrikau@orakel.ntnu.no>
Cc: Daniel Barkalow <barkalow@iabervon.org>, git@vger.kernel.org
Subject: Re: About git and the use of SHA-1
Date: Tue, 29 Apr 2008 08:38:37 +0200
Message-ID: <4816C26D.9010304@op5.se>
In-Reply-To: <200804282329.21336.henrikau@orakel.ntnu.no>

Henrik Austad wrote:
> On Monday 28 April 2008 21:34:50 Daniel Barkalow wrote:
>> On Mon, 28 Apr 2008, Henrik Austad wrote:
>>> Hi list!
>>>
>>> As far as I have gathered, the SHA-1-sum is used as an identifier for
>>> commits, and that is the primary reason for using sha1.  However, several
>>> places (including the google tech-talk featuring Linus himself) state
>>> that the IDs are cryptographically secure.
>>>
>>> As discussed in [1], SHA-1 is not as secure as it once was (and this was
>>> in 2005), and I'm wondering - are there any plans for migrating to
>>> another hash algorithm? E.g. SHA-2, Whirlpool...
>> No. The cryptographic security we care about is that it's impractical to
>> come up with another set of content that hashes to the same value as a
>> given set of content. The known attacks on SHA-1 (and more broken earlier
>> hashes in the same general class) only allow the attacker to produce two
>> files that will collide. Now, it's true that this would allow somebody to
>> produce a commit where some people see the "good" blob and some people see
>> the "evil" blob, but (a) the "good" blob contains some large chunk of
>> random data, which is a major red flag by itself, and (b) all of these
>> people have to be taking data from the attacker.
> 
> yes, I can see that point, but I was thinking more along the line of:
> 
> 1) clone repo
> 2) add malicious code
> 3) add a huge block of comment, ifdef-block etc somewhere obscure in the code
> and keep adding random data until the hash matches a well-known release.
> 4) publish repo, or even worse, change central repo
> 

This depends greatly on git accepting objects with a colliding object name,
which it doesn't. Once you have an object with a particular SHA1, it will
never get overwritten, ever, because git assumes it already has that content
and treats writing it again as unnecessary work. As such, you'd still have to
create a new object, hashing to a new SHA1, and get that new object added to
the kernel.
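
To make the "never overwritten" behaviour concrete, here is a minimal
sketch in Python. It is a toy in-memory store, not git's actual code. The
blob naming (SHA-1 over "blob <size>\0" plus the content) is how git names
blobs; ObjectStore, weak_id and find_collision are hypothetical names made
up for this example, and weak_id is a deliberately crippled one-hex-digit
hash so that a collision can be found at all.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Object name git assigns to a blob: SHA-1 over a header plus the content."""
    header = b"blob " + str(len(data)).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def weak_id(data: bytes) -> str:
    """Hypothetical 1-hex-digit hash, used only to force a collision for the demo."""
    return git_blob_id(data)[:1]


class ObjectStore:
    """Toy in-memory object store (an assumption for illustration, not git's code)."""

    def __init__(self, hash_fn=git_blob_id):
        self.hash_fn = hash_fn
        self.objects = {}

    def write(self, data: bytes) -> str:
        name = self.hash_fn(data)
        # An object with this name already exists: skip the write and keep
        # whatever content arrived first.
        if name not in self.objects:
            self.objects[name] = data
        return name


def find_collision(hash_fn, original: bytes) -> bytes:
    """Search for different content that gets the same name under hash_fn."""
    target = hash_fn(original)
    counter = 0
    while True:
        candidate = b"evil %d" % counter
        if candidate != original and hash_fn(candidate) == target:
            return candidate
        counter += 1


store = ObjectStore(hash_fn=weak_id)
good = b"good content\n"
good_name = store.write(good)

evil = find_collision(weak_id, good)
evil_name = store.write(evil)

# Same name, but the store still holds the content that arrived first.
print(evil_name == good_name)
print(store.objects[good_name] == good)
```

With the weak hash the search ends after a handful of candidates; with full
SHA-1 the same loop is the "keep adding random data until the hash matches"
step, and even if it did succeed, the write would be skipped in any
repository that already holds the original object.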

I think perhaps Andrew Morton and a few other "high brass" among the kernel
hackers can get away with pushing crud like that to Linus' public tree
(which is the de facto master copy of published kernel sources), but random
John Does such as you and me wouldn't stand a chance, as our patches would
get reviewed by someone who, at the end of the day, makes a living coding
Linux.

> Most users, and probably a lot of developers never browse through the *entire* 
> archive looking for this, and as long as the hash checks out - why would you? 
> Yes, it would probably be discovered soon enough, but take the linux kernel 
> as an example - if you get, say 100 infected machines due to this, what would 
> this do to the reputation of the kernel?
> 

That depends. If the source of it was Linus' public tree, that would not be
very good at all. If the source was a random tarball off a random webpage
or ftp site (which would be the same as fetching from an unchecked git
repository and using it unverified), I doubt it would matter much.

> 
>> If somebody gives you some source, and it's got some large random chunk in
>> it, and the behavior of the object depends on the content of this chunk,
>> and it's unspecified where this chunk comes from, you should be aware
>> that they might be able to swap this chunk for a different chunk. But such
>> a file is pretty blatantly malicious anyway.
> 
> True, but this actually means you have to verify *everything*, even though the 
> hash checks out.
> 

Not really. What you need to verify is that
a) You cloned from somewhere you trust (kernel.org, for example)
b) The SHA1 of the commit you want to build from matches the SHA1 of the same
commit in the repository you originally cloned from.

Colliding objects can never enter a repository. Git is lazy and will reuse the
already existing object with the same name instead.
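
Why comparing a single commit SHA1 is enough can be sketched in Python: a
commit names its tree, and the tree names its blobs, so any change to a
file changes the commit's name. This is a simplified illustration under
stated assumptions. The object headers and the tree/commit layouts follow
git's object format, but the helper names, the file name main.c, the
author identity and the timestamp are all invented for the example, and
the tree handles only a flat list of files.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 over "<kind> <size>\\0" plus the body, as git names its objects."""
    header = kind.encode() + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """entries: list of (mode, name, hex object id) for a flat directory."""
    out = b""
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1]):
        # Each entry is "<mode> <name>\0" followed by the raw 20-byte id.
        out += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(hex_id)
    return out


def commit_body(tree_id: str, message: str) -> bytes:
    # Placeholder identity and timestamp, invented for this example.
    who = "Example Dev <dev@example.invalid> 1209451117 +0200"
    return f"tree {tree_id}\nauthor {who}\ncommitter {who}\n\n{message}\n".encode()


def commit_id_for(file_content: bytes) -> str:
    """Name of a root commit whose tree holds one file, main.c (hypothetical)."""
    blob_id = git_object_id("blob", file_content)
    tree_id = git_object_id("tree", tree_body([("100644", "main.c", blob_id)]))
    return git_object_id("commit", commit_body(tree_id, "Release"))


trusted_id = commit_id_for(b"int main(void) { return 0; }\n")
same_id = commit_id_for(b"int main(void) { return 0; }\n")
tampered_id = commit_id_for(b"int main(void) { return 1; }\n")

print(same_id == trusted_id)      # True
print(tampered_id == trusted_id)  # False
```

In a real checkout the comparison in step b) would be between the commit
name reported locally and the one published by the trusted repository; the
sketch only shows why that one comparison covers every file underneath it.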

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
