* About git and the use of SHA-1
@ 2008-04-28 16:29 Henrik Austad
  2008-04-28 19:34 ` Daniel Barkalow
                   ` (3 more replies)
  0 siblings, 4 replies; 38+ messages in thread
From: Henrik Austad @ 2008-04-28 16:29 UTC
  To: git

Hi list!

As far as I have gathered, the SHA-1 sum is used as an identifier for commits,
and that is the primary reason for using SHA-1. However, several places
(including the Google tech talk featuring Linus himself) state that the IDs
are cryptographically secure.

As discussed in [1], SHA-1 is not as secure as it once was (and this was in
2005), and I'm wondering: are there any plans for migrating to another
hash algorithm, e.g. SHA-2 or Whirlpool?

[1] http://www.schneier.com/blog/archives/2005/02/cryptanalysis_o.html

-- 
mvh
Henrik Austad
* Re: About git and the use of SHA-1 2008-04-28 16:29 About git and the use of SHA-1 Henrik Austad @ 2008-04-28 19:34 ` Daniel Barkalow 2008-04-28 21:29 ` Henrik Austad 2008-04-29 15:34 ` Geoffrey Irving 2008-04-29 12:41 ` Dmitry Potapov ` (2 subsequent siblings) 3 siblings, 2 replies; 38+ messages in thread From: Daniel Barkalow @ 2008-04-28 19:34 UTC (permalink / raw) To: Henrik Austad; +Cc: git On Mon, 28 Apr 2008, Henrik Austad wrote: > Hi list! > > As far as I have gathered, the SHA-1-sum is used as a identifier for commits, > and that is the primary reason for using sha1. However, several places > (including the google tech-talk featuring Linus himself) states that the id's > are cryptographically secure. > > As discussed in [1], SHA-1 is not as secure as it once was (and this was in > 2005), and I'm wondering - are there any plans for migrating to another > hash-algorithm? I.e. SHA-2, whirlpool.. No. The cryptographic security we care about is that it's impractical to come up with another set of content that hashes to the same value as a given set of content. The known attacks on SHA-1 (and more broken earlier hashes in the same general class) only allow the attacker to produce two files that will collide. Now, it's true that this would allow somebody to produce a commit where some people see the "good" blob and some people see the "evil" blob, but (a) the "good" blob contains some large chunk of random data, which is a major red flag by itself, and (b) all of these people have to be taking data from the attacker. If somebody gives you some source, and it's got some large random chunk in it, and the behavior of the object depends on the content of this chunk, and it's unspecified where this chunk comes from, you should be aware that they might be able to swap this chunk for a different chunk. But such a file is pretty blatantly malicious anyway. -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1
  2008-04-28 19:34 ` Daniel Barkalow
@ 2008-04-28 21:29   ` Henrik Austad
  2008-04-28 22:15     ` Daniel Barkalow
  2008-04-29  6:38     ` Andreas Ericsson
  2008-04-29 15:34   ` Geoffrey Irving
  1 sibling, 2 replies; 38+ messages in thread
From: Henrik Austad @ 2008-04-28 21:29 UTC
  To: Daniel Barkalow; +Cc: git

On Monday 28 April 2008 21:34:50 Daniel Barkalow wrote:
> On Mon, 28 Apr 2008, Henrik Austad wrote:
> > Hi list!
> >
> > As far as I have gathered, the SHA-1-sum is used as an identifier for
> > commits, and that is the primary reason for using sha1. However, several
> > places (including the google tech-talk featuring Linus himself) state
> > that the IDs are cryptographically secure.
> >
> > As discussed in [1], SHA-1 is not as secure as it once was (and this was
> > in 2005), and I'm wondering - are there any plans for migrating to
> > another hash-algorithm? I.e. SHA-2, whirlpool..
>
> No. The cryptographic security we care about is that it's impractical to
> come up with another set of content that hashes to the same value as a
> given set of content. The known attacks on SHA-1 (and more broken earlier
> hashes in the same general class) only allow the attacker to produce two
> files that will collide. Now, it's true that this would allow somebody to
> produce a commit where some people see the "good" blob and some people see
> the "evil" blob, but (a) the "good" blob contains some large chunk of
> random data, which is a major red flag by itself, and (b) all of these
> people have to be taking data from the attacker.

Yes, I can see that point, but I was thinking more along the lines of:

1) clone the repo
2) add malicious code
3) add a huge comment block, ifdef block etc. somewhere obscure in the code
   and keep adding random data until the hash matches a well-known release
4) publish the repo, or even worse, change the central repo

Most users, and probably a lot of developers, never browse through the *entire*
archive looking for this, and as long as the hash checks out, why would you?
Yes, it would probably be discovered soon enough, but take the Linux kernel
as an example: if you get, say, 100 infected machines due to this, what would
this do to the reputation of the kernel?

> If somebody gives you some source, and it's got some large random chunk in
> it, and the behavior of the object depends on the content of this chunk,
> and it's unspecified where this chunk comes from, you should be aware
> that they might be able to swap this chunk for a different chunk. But such
> a file is pretty blatantly malicious anyway.

True, but this actually means you have to verify *everything*, even though the
hash checks out.

But yes, I can see your point: it would most likely be infeasible to generate
a collision using this approach, and changing to another hash function would
probably not add much.

Basically I was just curious and played around with the idea. Thanks for the
answer though :)

-- 
mvh
Henrik Austad
* Re: About git and the use of SHA-1 2008-04-28 21:29 ` Henrik Austad @ 2008-04-28 22:15 ` Daniel Barkalow 2008-04-29 6:38 ` Andreas Ericsson 1 sibling, 0 replies; 38+ messages in thread From: Daniel Barkalow @ 2008-04-28 22:15 UTC (permalink / raw) To: Henrik Austad; +Cc: git On Mon, 28 Apr 2008, Henrik Austad wrote: > On Monday 28 April 2008 21:34:50 Daniel Barkalow wrote: > > On Mon, 28 Apr 2008, Henrik Austad wrote: > > > Hi list! > > > > > > As far as I have gathered, the SHA-1-sum is used as a identifier for > > > commits, and that is the primary reason for using sha1. However, several > > > places (including the google tech-talk featuring Linus himself) states > > > that the id's are cryptographically secure. > > > > > > As discussed in [1], SHA-1 is not as secure as it once was (and this was > > > in 2005), and I'm wondering - are there any plans for migrating to > > > another hash-algorithm? I.e. SHA-2, whirlpool.. > > > > No. The cryptographic security we care about is that it's impractical to > > come up with another set of content that hashes to the same value as a > > given set of content. The known attacks on SHA-1 (and more broken earlier > > hashes in the same general class) only allow the attacker to produce two > > files that will collide. Now, it's true that this would allow somebody to > > produce a commit where some people see the "good" blob and some people see > > the "evil" blob, but (a) the "good" blob contains some large chunk of > > random data, which is a major red flag by itself, and (b) all of these > > people have to be taking data from the attacker. > > yes, I can see that point, but I was thinking more along the line of: > > 1) clone repo > 2) add malicious code > 3) add a huge block of comment, ifdef-block etc somewhere obscure in the code > and keep adding random data untill hash matches a well-known release. 
> 4) publish repo, or even worse, change central repo All known methods for step 3, even on hashes considered long broken, will take until the heat death of the universe. The latest I can find is that, if you use MD4 (which is weak enough that you can find collisions as quickly as you can do two hashes), there's a 1 in a quadrillion chance that your message is weak and somebody could find a replacement with the same hash using known techniques. (With a plausible amount of work, an attacker could take a file and modify it only slightly, and find a replacement for that, but this again requires the attacker to have some non-trivial input to what gets put in the official tree, which leaves the attacker as the responsible party for that object). SHA-1 is enough stronger that the latest attacks are still unable to do with the current available computing power in years what can be done to MD4 in milliseconds. So it's highly unlikely that somebody will break SHA-1 more thoroughly than MD4 is broken any time soon. > > If somebody gives you some source, and it's got some large random chunk in > > it, and the behavior of the object depends on the content of this chunk, > > and it's unspecified where this chunk comes from, you should be aware > > that they might be able to swap this chunk for a different chunk. But such > > a file is pretty blatantly malicious anyway. > > True, but this actually means you have to verify *everything*, even though the > hash checks out. If you don't verify *everything* when the hash checks out, the attacker will just send you a properly-constructed commit with a back door in the code. While you're looking for directly-inserted security holes in the code, you can probably notice if there's some big hunk of line noise in a comment that might make the file vulnerable to replacement. -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 38+ messages in thread
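To put rough numbers on the "heat death of the universe" remark, here is a back-of-the-envelope sketch (an illustration, not part of the original message). It only models generic brute force, not any cryptanalytic shortcut, and the attacker's hash rate is an assumed round figure rather than a measurement.

```python
# Rough cost of brute-forcing SHA-1 under an ASSUMED hash rate.
HASH_BITS = 160                      # SHA-1 output size in bits
ASSUMED_HASHES_PER_SECOND = 10**12   # hypothetical attacker throughput
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_try(attempts: int, rate: int = ASSUMED_HASHES_PER_SECOND) -> float:
    """Years needed to compute `attempts` hashes at `rate` hashes per second."""
    return attempts / rate / SECONDS_PER_YEAR


# Step 3 of the scenario: match one *given* hash (second preimage).
# Generic brute force needs on the order of 2**160 attempts.
second_preimage_years = years_to_try(2**HASH_BITS)

# Finding *any* two inputs that collide is far cheaper (birthday bound),
# on the order of 2**80 attempts, but the attacker does not get to pick
# the target hash.
collision_years = years_to_try(2 ** (HASH_BITS // 2))

print(f"second preimage:   ~{second_preimage_years:.1e} years")
print(f"generic collision: ~{collision_years:.1e} years")
```

Even if the assumed rate is off by many orders of magnitude, the second-preimage figure stays far beyond any practical horizon, which is the distinction the reply draws between "two files that collide" and "a file that matches a given release".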
* Re: About git and the use of SHA-1 2008-04-28 21:29 ` Henrik Austad 2008-04-28 22:15 ` Daniel Barkalow @ 2008-04-29 6:38 ` Andreas Ericsson 2008-04-29 7:09 ` Russ Dill 1 sibling, 1 reply; 38+ messages in thread From: Andreas Ericsson @ 2008-04-29 6:38 UTC (permalink / raw) To: Henrik Austad; +Cc: Daniel Barkalow, git Henrik Austad wrote: > On Monday 28 April 2008 21:34:50 Daniel Barkalow wrote: >> On Mon, 28 Apr 2008, Henrik Austad wrote: >>> Hi list! >>> >>> As far as I have gathered, the SHA-1-sum is used as a identifier for >>> commits, and that is the primary reason for using sha1. However, several >>> places (including the google tech-talk featuring Linus himself) states >>> that the id's are cryptographically secure. >>> >>> As discussed in [1], SHA-1 is not as secure as it once was (and this was >>> in 2005), and I'm wondering - are there any plans for migrating to >>> another hash-algorithm? I.e. SHA-2, whirlpool.. >> No. The cryptographic security we care about is that it's impractical to >> come up with another set of content that hashes to the same value as a >> given set of content. The known attacks on SHA-1 (and more broken earlier >> hashes in the same general class) only allow the attacker to produce two >> files that will collide. Now, it's true that this would allow somebody to >> produce a commit where some people see the "good" blob and some people see >> the "evil" blob, but (a) the "good" blob contains some large chunk of >> random data, which is a major red flag by itself, and (b) all of these >> people have to be taking data from the attacker. > > yes, I can see that point, but I was thinking more along the line of: > > 1) clone repo > 2) add malicious code > 3) add a huge block of comment, ifdef-block etc somewhere obscure in the code > and keep adding random data untill hash matches a well-known release. 
> 4) publish repo, or even worse, change central repo > This depends greatly on git accepting objects with a colliding object-name, which it doesn't. Once you have an object with a particular SHA1, it will never get overwritten, ever, as git will believe it's about to do unnecessary work. As such, you'd still have to create a new object, hashing to a new SHA1 and get that new object added to the kernel. I think perhaps Andrew Morton and a few other "high brass" among the kernel hackers can get away with pushing crud like that to Linus' public tree (which is the de facto master copy of published kernel sources), but random John Doe's such as you and me wouldn't stand a chance, as our patches would get reviewed by someone who, at the end of the day, makes a living coding Linux. > Most users, and probably a lot of developers never browse through the *entire* > archive looking for this, and as long as the hash checks out - why would you? > Yes, it would probably be discovered soon enough, but take the linux kernel > as an example - if you get, say 100 infected machines due to this, what would > this do to the reputation of the kernel? > That depends. If the source of it was Linus' public tree, that would not be very good at all. If the source was a random tarball off a random webpage or ftp site (which would be the same as fetching and, unverified, using an unchecked git repository), I doubt it would matter much. > >> If somebody gives you some source, and it's got some large random chunk in >> it, and the behavior of the object depends on the content of this chunk, >> and it's unspecified where this chunk comes from, you should be aware >> that they might be able to swap this chunk for a different chunk. But such >> a file is pretty blatantly malicious anyway. > > True, but this actually means you have to verify *everything*, even though the > hash checks out. > Not really. 
What you need to verify is that
a) You cloned from somewhere you trust (kernel.org, for example)
b) The SHA1 of the commit you want to build from matches the SHA1 of
   the same commit in the repository you originally cloned from.

A colliding object can never replace one that is already in a repository.
Git is lazy and will keep using the existing object with that name instead.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
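The "never overwrite an existing object" behaviour can be sketched with a toy content-addressed store (an illustration, not from the thread). `ToyObjectStore` is a hypothetical in-memory model, not git's actual implementation, and because a real SHA-1 collision cannot be produced here, the colliding object is simulated by passing the same name twice.

```python
import hashlib


class ToyObjectStore:
    """Hypothetical in-memory content-addressed store (not git's real code)."""

    def __init__(self):
        self._objects = {}  # object name (hex string) -> stored bytes

    def add(self, name, data):
        """Store `data` under `name` unless that name is already taken.

        Returns True if the object was stored, False if an object with
        that name already existed and the incoming data was ignored.
        """
        if name in self._objects:
            return False
        self._objects[name] = data
        return True

    def get(self, name):
        return self._objects[name]


store = ToyObjectStore()
good = b"int main(void) { return 0; }\n"
evil = b"int main(void) { return 1; }\n"

name = hashlib.sha1(good).hexdigest()

stored_good = store.add(name, good)
# Pretend `evil` hashes to the same name (a simulated collision).
stored_evil = store.add(name, evil)

print(stored_good, stored_evil, store.get(name) == good)
```

In this model the second `add` is a no-op, so a repository that already holds the good object keeps it; only a fresh clone taken from a tampered server would receive the other content.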
* Re: About git and the use of SHA-1 2008-04-29 6:38 ` Andreas Ericsson @ 2008-04-29 7:09 ` Russ Dill 2008-04-29 7:21 ` Andreas Ericsson 2008-04-29 12:46 ` Jurko Gospodnetić 0 siblings, 2 replies; 38+ messages in thread From: Russ Dill @ 2008-04-29 7:09 UTC (permalink / raw) To: Andreas Ericsson; +Cc: Henrik Austad, Daniel Barkalow, git > Colliding objects can never enter a repository. Git is lazy and will reuse the > already existing colliding object with the same name instead. > I think you are missing the point. One of the pluses behind originally using SHA-1 and the signed tags is that the system as a whole is cryptographically secure. You can verify from the public key of whoever made the tag that yes, this really is the source and history they tagged. Not only can DNS attacks be made, fooling users into thinking that they are really connecting to kernel.org, or whatever else server they expect to be connecting to, but also, the server itself may be hacked and objects replaced. I'm just not sure how much time it would take to find a collision. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1
  2008-04-29  7:09 ` Russ Dill
@ 2008-04-29  7:21   ` Andreas Ericsson
  2008-04-29 11:05     ` Sverre Rabbelier
  2008-04-29 12:46   ` Jurko Gospodnetić
  1 sibling, 1 reply; 38+ messages in thread
From: Andreas Ericsson @ 2008-04-29 7:21 UTC
  To: Russ Dill; +Cc: Henrik Austad, Daniel Barkalow, git

Russ Dill wrote:
>> Colliding objects can never enter a repository. Git is lazy and will reuse the
>> already existing colliding object with the same name instead.
>>
>
> I think you are missing the point. One of the pluses behind originally
> using SHA-1 and the signed tags is that the system as a whole is
> cryptographically secure. You can verify from the public key of
> whoever made the tag that yes, this really is the source and history
> they tagged. Not only can DNS attacks be made, fooling users into
> thinking that they are really connecting to kernel.org, or whatever
> else server they expect to be connecting to, but also, the server
> itself may be hacked and objects replaced.
>

If the server is hacked and objects are replaced, there are two cases.
Either the replacements no longer match their hash, meaning they'll be
new objects or git will determine that they are corrupt. Or they *will*
match an existing object, but then that object won't be propagated to
other repositories, since git refuses to overwrite already existing
objects. Either way, git's refusal to overwrite objects it already has
plays a part in making malicious actions futile, since malicious code is
only worth something if it's propagated and actually used.

> I'm just not sure how much time it would take to find a collision.

Even crypto experts are arguing about that, so I'm not surprised.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
* Re: About git and the use of SHA-1 2008-04-29 7:21 ` Andreas Ericsson @ 2008-04-29 11:05 ` Sverre Rabbelier 2008-04-29 12:27 ` Andreas Ericsson 0 siblings, 1 reply; 38+ messages in thread From: Sverre Rabbelier @ 2008-04-29 11:05 UTC (permalink / raw) To: Andreas Ericsson; +Cc: Russ Dill, Henrik Austad, Daniel Barkalow, git On Tue, Apr 29, 2008 at 9:21 AM, Andreas Ericsson <ae@op5.se> wrote: > Russ Dill wrote: > If the server is hacked and objects are replaced, they will either > no longer match their cryptographic signature, meaning they'll be > new objects or git will determine that they are corrupt, or they We were assuming here that once SHA-1 is broken really determined hackers will be able to come up with objects that -do- match the SHA-1, so the above is not relevant. > *will* match an existing object, but then that object won't be > propagated to other repositories since git refuses to overwrite > already existing objects. [...] What about new users cloning the repo? They're just out of luck? I don't think this argument holds, if we want to 'advertise' that git is cryptographically secure we can do so only as long as our hashing algorithm is. (As such, should SHA-1 ever be fully broken we'd need to either switch to another algorithm or stop advertising being cryptographically secure.) > [...] Either way, gits refusal to overwrite > objects it already has plays a part in making malicious actions > futile, since malicious code is only worth something if it's > propagated and actually used. Of course this is true, it makes it a lot harder to do damage, but it doesn't eliminate the problem, it's just a free 'extra protection'. Yes, malicious code is only worth something if it's propagated and actually used, no, it is not impossible to do so in git if/when SHA-1 turns out to have collisions every other file. -- Cheers, Sverre Rabbelier ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1
  2008-04-29 11:05 ` Sverre Rabbelier
@ 2008-04-29 12:27   ` Andreas Ericsson
  2008-04-29 13:05     ` Paolo Bonzini
  0 siblings, 1 reply; 38+ messages in thread
From: Andreas Ericsson @ 2008-04-29 12:27 UTC
  To: sverre; +Cc: Russ Dill, Henrik Austad, Daniel Barkalow, git

Sverre Rabbelier wrote:
> On Tue, Apr 29, 2008 at 9:21 AM, Andreas Ericsson <ae@op5.se> wrote:
>> Russ Dill wrote:
>>  If the server is hacked and objects are replaced, they will either
>>  no longer match their cryptographic signature, meaning they'll be
>>  new objects or git will determine that they are corrupt, or they
>
> We were assuming here that once SHA-1 is broken really determined
> hackers will be able to come up with objects that -do- match the
> SHA-1, so the above is not relevant.
>
>>  *will* match an existing object, but then that object won't be
>>  propagated to other repositories since git refuses to overwrite
>>  already existing objects. [...]
>
> What about new users cloning the repo? They're just out of luck?

Only until someone who's already cloned the repository fetches from it,
at which point the collision will be detected.

> I
> don't think this argument holds, if we want to 'advertise' that git is
> cryptographically secure we can do so only as long as our hashing
> algorithm is. (As such, should SHA-1 ever be fully broken we'd need to
> either switch to another algorithm or stop advertising being
> cryptographically secure.)
>

True. So far though, the only attacks that have been successful require
that the attacker is allowed to create both of the colliding data sets,
and so far none has been found that would allow the attacker to follow
any kind of syntactical rules whatsoever, so from a practical point of
view, SHA1 is 100% secure *for source code*.

From a theoretical point of view, no hash is 100% secure, so changing
algorithm buys us nothing.
Besides, "cryptographically secure" is not the same as "will never ever be
broken", because all hashes are obviously susceptible to brute-force attacks.
"Cryptographically secure" means, insofar as I've understood it, that given a
source file and its hash key, it would take such an extremely long time to
find a different data set that hashes to the same key that the result is
unusable because the original source is obsolete. That is why legal documents
are always signed with the "most secure" (or rather, "least insecure") of all
available hashes. For our purposes, SHA1 suffices until someone comes up with
a relatively trivial way of creating a collision within the parameters above.

>>  [...] Either way, gits refusal to overwrite
>>  objects it already has plays a part in making malicious actions
>>  futile, since malicious code is only worth something if it's
>>  propagated and actually used.
>
> Of course this is true, it makes it a lot harder to do damage, but it
> doesn't eliminate the problem, it's just a free 'extra protection'.
> Yes, malicious code is only worth something if it's propagated and
> actually used, no, it is not impossible to do so in git if/when SHA-1
> turns out to have collisions every other file.
>

Points of fact so far:
* It is possible to create objects with colliding names (SHA1 hash keys).
  This holds true whichever algorithm we use, although it will be more
  difficult with a stronger algorithm.
* It is impossible to distribute the colliding content to already cloned
  repositories. This also holds true for all hash algorithms.

I've been arguing that the value of the first point is so greatly
diminished by the second that, even if SHA1 turns out to be horribly
broken, projects using git will still have decent protection against
malicious code entering the repository without the knowledge of one of
the authors.

You've been arguing that SHA1 is not theoretically secure, which is
obviously true since no hash is theoretically secure.
I can think of one way to make git a lot more resilient to hash collisions, regardless of which hash is used, namely: Add the length of the hashed object to the hash. In order for an evil-minded hacker to succeed in doing any real harm, he/she now has to create a conflicting file which is valid for its type (be it C, PHP, JPEG, AVI, PDF or whatever) and is also the same length as the original source, without being allowed to create the original object. -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ^ permalink raw reply [flat|nested] 38+ messages in thread
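For context on the "add the length" idea (an illustration, not part of the original message): git does not hash the bare file content. An object's name is the SHA-1 of a short header, consisting of the object type, the content length in decimal and a NUL byte, followed by the content. The sketch below reproduces that for blobs; the helper name `git_object_id` is made up for the example.

```python
import hashlib


def git_object_id(obj_type: str, content: bytes) -> str:
    """Object name as git computes it: SHA-1 over '<type> <size>\\0' + content."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


empty_blob = git_object_id("blob", b"")
plain_sha1 = hashlib.sha1(b"").hexdigest()

print(empty_blob)   # name git gives the empty blob
print(plain_sha1)   # SHA-1 of the bare (empty) content, for comparison
```

So the length is already part of what gets hashed, although it is mixed into the hash input rather than stored alongside the name. Running `git hash-object` on a file should give the same value as `git_object_id("blob", ...)` applied to its bytes.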
* Re: About git and the use of SHA-1
  2008-04-29 12:27 ` Andreas Ericsson
@ 2008-04-29 13:05   ` Paolo Bonzini
  2008-04-29 14:37     ` Andreas Ericsson
  0 siblings, 1 reply; 38+ messages in thread
From: Paolo Bonzini @ 2008-04-29 13:05 UTC
  To: git; +Cc: sverre, Russ Dill, Henrik Austad, Daniel Barkalow, git

> I can think of one way to make git a lot more resilient to hash
> collisions, regardless of which hash is used, namely: Add the length
> of the hashed object to the hash.

Not really, because most attacks are about collisions, not second
preimages. They produce two 64-byte blocks (hence, same length) with
the same hash value.

As such, they make it possible to change a blob that *the attacker*
injected into the repository. The more "spectacular" attacks require a
"language" with conditional expressions; for documents, for example,
PostScript is used. If you prepare a PostScript file whose code is

    if (AAAA == BBBB)
       typeset document 1
    else
       typeset document 2

where AAAA and BBBB are collisions, and you change it to
"if (BBBB == BBBB)", the hash will be the same, but the outcome will be
document 1 instead of document 2.

The fact that this requires having the two "behaviors" in the blob is
not a big deal for source code: going down the wrong branch of an "if"
can be an attack. On the other hand, it makes adding the length useless
against collision attacks. True, it wouldn't be useless against second
preimage attacks, but SHA-1 is still secure with respect to those.

Paolo
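The mechanism described above can be tried out on a deliberately weak toy hash (an illustration, not part of the original message). The sketch below is NOT SHA-1: `toy_hash` is a made-up 16-bit hash that merely uses the same chained, block-by-block construction, so that a collision can be found by brute force in a moment. All names in it are hypothetical. It shows that once two equal-length blocks collide, appending the same tail (including the length) to both keeps the hashes equal.

```python
import hashlib
from itertools import count

BLOCK = 8        # toy block size in bytes (real SHA-1 uses 64)
STATE_BYTES = 2  # 16-bit internal state, deliberately weak
IV = b"\x00" * STATE_BYTES


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: SHA-1 of state+block, cut to 16 bits."""
    return hashlib.sha1(state + block).digest()[:STATE_BYTES]


def toy_hash(message: bytes) -> str:
    """Chained toy hash; the final step mixes in the message length."""
    if len(message) % BLOCK:
        raise ValueError("message length must be a multiple of BLOCK")
    state = IV
    for i in range(0, len(message), BLOCK):
        state = compress(state, message[i:i + BLOCK])
    return compress(state, len(message).to_bytes(BLOCK, "big")).hex()


def find_colliding_blocks():
    """Brute-force two different blocks that lead to the same state."""
    seen = {}
    for n in count():
        block = n.to_bytes(BLOCK, "big")
        state = compress(IV, block)
        if state in seen:
            return seen[state], block
        seen[state] = block


def pad(data: bytes) -> bytes:
    """Pad with spaces up to a whole number of blocks."""
    return data + b" " * (-len(data) % BLOCK)


block_a, block_b = find_colliding_blocks()
tail = pad(b"if first_block == A: document 1 else: document 2")
file_1 = block_a + tail
file_2 = block_b + tail


def rendered(file: bytes) -> str:
    """Stand-in for the interpreter that branches on the first block."""
    return "document 1" if file[:BLOCK] == block_a else "document 2"


print(len(file_1) == len(file_2), toy_hash(file_1) == toy_hash(file_2))
print(rendered(file_1), "/", rendered(file_2))
```

Both files have the same length and the same toy hash, yet they "render" differently, which is why including the length does nothing against this kind of collision.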
* Re: About git and the use of SHA-1 2008-04-29 13:05 ` Paolo Bonzini @ 2008-04-29 14:37 ` Andreas Ericsson 2008-04-29 14:52 ` Paolo Bonzini 2008-04-29 16:24 ` Russ Dill 0 siblings, 2 replies; 38+ messages in thread From: Andreas Ericsson @ 2008-04-29 14:37 UTC (permalink / raw) To: Paolo Bonzini; +Cc: sverre, Russ Dill, Henrik Austad, Daniel Barkalow, git Paolo Bonzini wrote: > >> I can think of one way to make git a lot more resilient to hash >> collisions, regardless of which hash is used, namely: Add the length >> of the hashed object to the hash. > > Not really, because most attacks are about collisions, not second > preimages. They produce two 64-byte blocks (hence, same length) with > the same hash value. > > As such, they allow to change a blob that *the attacker* injected in the > repository. The way the more "spectacular" attacks are devised requires > a "language" with conditional expressions -- for documents, for example, > Postscript is used. If you prepare a postscript file whose code is > > if (AAAA == BBBB) > typeset document 1 > else > typeset document 2 > > where AAAA and BBBB are collisions, and you change it to "if (BBBB == > BBBB) the hash will be the same, but the outcome will be document 1 > instead of document 2. > > The fact that this requires having the two "behaviors" in the blob is > not a big deal for source code, going in the wrong branch of an "if" can > be an attack. On the other hand, it makes adding the length useless for > collision attacks. True, it wouldn't be useless for second preimage > attacks, but SHA-1 is still secure with respect to those. > So what you're saying is that if someone owns a repository and adds a file to it, he can then replace his entire repository with an identical one where the good file is replaced with a bad one, and this will affect people who clone *after* the file gets replaced. 
Gee, that's one fiendishly large attack vector, quite apart from the fact
that said author first has to come up with a program that gets widespread
enough that a lot of people all of a sudden want to use it, but not so
widespread that anyone would want to review it before using it.

I remain unconvinced that SHA1 is anything other than, for all practical
purposes, cryptographically secure for git's uses. Sure, evil programmers
can screw you over if you use their software without reviewing it, but
that's hardly due to git using a particular cryptographic algorithm.

Otoh, I'm not familiar enough with the nomenclature to say with 100%
certainty what's cryptographically secure and what isn't. I just know that
there are no collision-less hashes, so whatever "cryptographically secure"
really means wrt hashes, "100% collision-free" isn't it.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
* Re: About git and the use of SHA-1 2008-04-29 14:37 ` Andreas Ericsson @ 2008-04-29 14:52 ` Paolo Bonzini 2008-04-29 16:24 ` Russ Dill 1 sibling, 0 replies; 38+ messages in thread From: Paolo Bonzini @ 2008-04-29 14:52 UTC (permalink / raw) To: Andreas Ericsson; +Cc: sverre, Russ Dill, Henrik Austad, Daniel Barkalow, git > So what you're saying is that if someone owns a repository and adds a > file to it, he can then replace his entire repository with an identical > one where the good file is replaced with a bad one, and this will affect > people who clone *after* the file gets replaced. > > Gee, that's one fiendishly large attack vector I agree (with the irony). Paolo ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1 2008-04-29 14:37 ` Andreas Ericsson 2008-04-29 14:52 ` Paolo Bonzini @ 2008-04-29 16:24 ` Russ Dill 1 sibling, 0 replies; 38+ messages in thread From: Russ Dill @ 2008-04-29 16:24 UTC (permalink / raw) To: Andreas Ericsson Cc: Paolo Bonzini, sverre, Henrik Austad, Daniel Barkalow, git On Tue, Apr 29, 2008 at 7:37 AM, Andreas Ericsson <ae@op5.se> wrote: > > Paolo Bonzini wrote: > > > > > > > > I can think of one way to make git a lot more resilient to hash > > > collisions, regardless of which hash is used, namely: Add the length > > > of the hashed object to the hash. > > > > > > > Not really, because most attacks are about collisions, not second > preimages. They produce two 64-byte blocks (hence, same length) with the > same hash value. > > > > As such, they allow to change a blob that *the attacker* injected in the > repository. The way the more "spectacular" attacks are devised requires a > "language" with conditional expressions -- for documents, for example, > Postscript is used. If you prepare a postscript file whose code is > > > > if (AAAA == BBBB) > > typeset document 1 > > else > > typeset document 2 > > > > where AAAA and BBBB are collisions, and you change it to "if (BBBB == > BBBB) the hash will be the same, but the outcome will be document 1 instead > of document 2. > > > > The fact that this requires having the two "behaviors" in the blob is not > a big deal for source code, going in the wrong branch of an "if" can be an > attack. On the other hand, it makes adding the length useless for collision > attacks. True, it wouldn't be useless for second preimage attacks, but > SHA-1 is still secure with respect to those. > > > > > > So what you're saying is that if someone owns a repository and adds a > file to it, he can then replace his entire repository with an identical > one where the good file is replaced with a bad one, and this will affect > people who clone *after* the file gets replaced. 
>

No: if someone 0wnz a repository (i.e. has broken into it), not owns it.
(Or really, malicious mirror owners could be in on it.) Either that or some
form of redirection attack. When you download a tarball, you can check the
signed checksum that is downloadable along with it. When you clone a repo,
you depend on signed tags.
* Re: About git and the use of SHA-1 2008-04-29 7:09 ` Russ Dill 2008-04-29 7:21 ` Andreas Ericsson @ 2008-04-29 12:46 ` Jurko Gospodnetić 2008-04-29 16:21 ` Russ Dill 1 sibling, 1 reply; 38+ messages in thread From: Jurko Gospodnetić @ 2008-04-29 12:46 UTC (permalink / raw) To: git; +Cc: Andreas Ericsson, Henrik Austad, Daniel Barkalow, git > I think you are missing the point. One of the pluses behind originally > using SHA-1 and the signed tags is that the system as a whole is > cryptographically secure. You can verify from the public key of > whoever made the tag that yes, this really is the source and history > they tagged. I am not really sure I follow this.... how can you 'verify from the public key of whoever made the tag' that the SHA-1 hash is correct!? SHA-1 does not have anything do with any externally provided keys or have I managed to get something confused here? Best regards, Jurko Gospodnetić ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1 2008-04-29 12:46 ` Jurko Gospodnetić @ 2008-04-29 16:21 ` Russ Dill 0 siblings, 0 replies; 38+ messages in thread From: Russ Dill @ 2008-04-29 16:21 UTC (permalink / raw) To: Jurko Gospodnetić Cc: Andreas Ericsson, Henrik Austad, Daniel Barkalow, git On Tue, Apr 29, 2008 at 5:46 AM, Jurko Gospodnetić <jurko.gospodnetic@docte.hr> wrote: > > > I think you are missing the point. One of the pluses behind originally > > using SHA-1 and the signed tags is that the system as a whole is > > cryptographically secure. You can verify from the public key of > > whoever made the tag that yes, this really is the source and history > > they tagged. > > > > I am not really sure I follow this.... how can you 'verify from the public > key of whoever made the tag' that the SHA-1 hash is correct!? SHA-1 does not > have anything do with any externally provided keys or have I managed to get > something confused here? > Sorry for the confusion; it's about using the signed tag and the SHA-1s of the parent commits, along with their associated trees and blobs, to verify the source and history. If you can't trust the signed tag, or all of the SHA-1s, you can't trust the source and history. However, as many have said, I don't think there is any reason not to trust SHA-1 in the context of source control. ^ permalink raw reply [flat|nested] 38+ messages in thread
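The chain Russ describes (a signed tag names a commit, the commit names its tree and parents, the tree names blobs) can be sketched in Python. This is a simplified model, not Git's real encoding: the tree entry is plain text here (real Git trees are binary), author and committer lines are omitted, and `object_id` and `toy_commit_id` are hypothetical helper names.

```python
import hashlib
from typing import Optional


def object_id(kind: str, payload: bytes) -> str:
    """Hash a payload behind a Git-style '<type> <length>\\0' header."""
    header = f"{kind} {len(payload)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


def toy_commit_id(file_content: bytes, parent: Optional[str] = None) -> str:
    """ID of a toy commit holding one file, optionally chained to a parent."""
    blob_id = object_id("blob", file_content)
    # Simplified text tree entry; real Git trees use a binary format.
    tree_id = object_id("tree", f"100644 README {blob_id}\n".encode("ascii"))
    lines = [f"tree {tree_id}"]
    if parent is not None:
        lines.append(f"parent {parent}")
    lines += ["", "toy commit"]
    return object_id("commit", "\n".join(lines).encode("ascii"))


if __name__ == "__main__":
    first = toy_commit_id(b"v1\n")
    second = toy_commit_id(b"v2\n", parent=first)
    # A signature over `second` indirectly covers `first` and both files.
    print(first, second)
```

Because each ID feeds into the next, a signature on the tip only vouches for the history as long as the hash itself resists the relevant attack.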
* Re: About git and the use of SHA-1 2008-04-28 19:34 ` Daniel Barkalow 2008-04-28 21:29 ` Henrik Austad @ 2008-04-29 15:34 ` Geoffrey Irving 2008-04-29 16:27 ` Daniel Barkalow 1 sibling, 1 reply; 38+ messages in thread From: Geoffrey Irving @ 2008-04-29 15:34 UTC (permalink / raw) To: Daniel Barkalow; +Cc: Henrik Austad, git On Mon, Apr 28, 2008 at 12:34 PM, Daniel Barkalow <barkalow@iabervon.org> wrote: > On Mon, 28 Apr 2008, Henrik Austad wrote: > > > Hi list! > > > > As far as I have gathered, the SHA-1-sum is used as a identifier for commits, > > and that is the primary reason for using sha1. However, several places > > (including the google tech-talk featuring Linus himself) states that the id's > > are cryptographically secure. > > > > As discussed in [1], SHA-1 is not as secure as it once was (and this was in > > 2005), and I'm wondering - are there any plans for migrating to another > > hash-algorithm? I.e. SHA-2, whirlpool.. > > No. The cryptographic security we care about is that it's impractical to > come up with another set of content that hashes to the same value as a > given set of content. The known attacks on SHA-1 (and more broken earlier > hashes in the same general class) only allow the attacker to produce two > files that will collide. Now, it's true that this would allow somebody to > produce a commit where some people see the "good" blob and some people see > the "evil" blob, but (a) the "good" blob contains some large chunk of > random data, which is a major red flag by itself, and (b) all of these > people have to be taking data from the attacker. > > If somebody gives you some source, and it's got some large random chunk in > it, and the behavior of the object depends on the content of this chunk, > and it's unspecified where this chunk comes from, you should be aware > that they might be able to swap this chunk for a different chunk. But such > a file is pretty blatantly malicious anyway. 
This argument is invalid, since the use of git is not limited to source code. People can and do store unreadable binary data in git, and unless you are completely sure that no one would ever care about the security of that data in a way that can be attacked with a single collision, git should be secure about those as well. For example, I just converted a 20 GB repository to git which, among other things, contains pdf files of my tax returns. I have looked them over, but I have not opened them in a hex editor and looked them over at the binary level, and I don't think git should expect me to. Incidentally, git was the only version control system I tried except for subversion that didn't choke on that repository. Mercurial looked at my file renames and expanded the size past 45 GB before I killed it, I had to fix several bugs in the bazaar conversion scripts before I realized it was just too slow, and svk turns out to be even more like the Antichrist than subversion itself is (mirroring N repository copies requires an N-fold increase in size). Geoffrey ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1 2008-04-29 15:34 ` Geoffrey Irving @ 2008-04-29 16:27 ` Daniel Barkalow 0 siblings, 0 replies; 38+ messages in thread From: Daniel Barkalow @ 2008-04-29 16:27 UTC (permalink / raw) To: Geoffrey Irving; +Cc: Henrik Austad, git On Tue, 29 Apr 2008, Geoffrey Irving wrote: > On Mon, Apr 28, 2008 at 12:34 PM, Daniel Barkalow <barkalow@iabervon.org> wrote: > > On Mon, 28 Apr 2008, Henrik Austad wrote: > > > > > Hi list! > > > > > > As far as I have gathered, the SHA-1-sum is used as a identifier for commits, > > > and that is the primary reason for using sha1. However, several places > > > (including the google tech-talk featuring Linus himself) states that the id's > > > are cryptographically secure. > > > > > > As discussed in [1], SHA-1 is not as secure as it once was (and this was in > > > 2005), and I'm wondering - are there any plans for migrating to another > > > hash-algorithm? I.e. SHA-2, whirlpool.. > > > > No. The cryptographic security we care about is that it's impractical to > > come up with another set of content that hashes to the same value as a > > given set of content. The known attacks on SHA-1 (and more broken earlier > > hashes in the same general class) only allow the attacker to produce two > > files that will collide. Now, it's true that this would allow somebody to > > produce a commit where some people see the "good" blob and some people see > > the "evil" blob, but (a) the "good" blob contains some large chunk of > > random data, which is a major red flag by itself, and (b) all of these > > people have to be taking data from the attacker. > > > > If somebody gives you some source, and it's got some large random chunk in > > it, and the behavior of the object depends on the content of this chunk, > > and it's unspecified where this chunk comes from, you should be aware > > that they might be able to swap this chunk for a different chunk. But such > > a file is pretty blatantly malicious anyway. 
> > This argument is invalid, since the use of git is not limited to > source code. People > can and do store unreadable binary data in git, and unless you are completely > sure that no one would ever care about the security of that data in a > way that can > be attacked with a single collision, git should be secure about those as well. > > For example, I just converted a 20 GB repository to git which, among > other things, > contains pdf files of my tax returns. I have looked them over, but I > have not opened > them in a hex editor and looked them over at the binary level, and I > don't think git > should expect me to. If you haven't looked over your PDFs with a hex editor, you're depending on the security of the software generating the PDFs and on what you did in generating them. (Looking at the resulting image alone may be unwise if, for example, you redacted anything.) In any case, on the basis of your actions, you make this commit. Now, anyone receiving the repository can, due to the lack of second preimage attacks, be sure that (a) the document is as you committed it; or (b) the document is different from what you committed, but you made the substitution; or (c) the document is different from what you committed, and you were tricked into committing a document carefully designed by somebody else to be weak. Additionally, it's infeasible to create a document such that forensics after the fact can't turn up both the content as originally shown and the content as swapped from either document. I'm also not confident that PDFs are, in general, not vulnerable to an attack where they rasterize entirely differently depending on environmental factors (e.g., the document you're signing says something entirely different when printed on A4 paper than what it says printed on Letter); if so, it doesn't matter much that the document could be replaced, since an attacker could just control the environment and get the same effect.
In any case, an attacker can't come along later and make a replacement of a file that originated in your commit. Also, you know that any sets of interchangeable documents had already been created when you get a commit that contains one of them. -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1 2008-04-28 16:29 About git and the use of SHA-1 Henrik Austad 2008-04-28 19:34 ` Daniel Barkalow @ 2008-04-29 12:41 ` Dmitry Potapov 2008-04-29 14:41 ` Andreas Ericsson 2008-04-29 15:02 ` Tom Widmer 2008-04-29 17:08 ` Tom Widmer 3 siblings, 1 reply; 38+ messages in thread From: Dmitry Potapov @ 2008-04-29 12:41 UTC (permalink / raw) To: Henrik Austad; +Cc: git On Mon, Apr 28, 2008 at 06:29:07PM +0200, Henrik Austad wrote: > > As discussed in [1], SHA-1 is not as secure as it once was (and this was in > 2005), and I'm wondering - are there any plans for migrating to another > hash-algorithm? I.e. SHA-2, whirlpool.. SHA-1 is broken in the sense that finding a collision requires less computation than brute force (2^80). It is still very costly, and AFAIK no one has found a single collision for SHA-1 yet; but even if such a collision is found, the question is how it can be exploited. This collision cannot be used to replace any existing code in Git. The only way to exploit it is to submit a patch based on one sequence to the maintainer, which must look legitimate enough to be accepted, and then create another blob with malicious code based on the other sequence. The second blob has the same SHA-1, so anyone who pulls from you will get the malicious code. However, it is tricky to create these two blobs: one should pass inspection and look like a real improvement, while the other should do what you want. All you have is two sequences of 20 bytes with the same SHA-1, and you have no control over them. For some binary files, it is possible by including both good and bad contents in the submitted blob and using one sequence in the right place to hide the bad part and make only the good one active/visible. Then the other blob will be almost the same but contain the other sequence, which is used to activate the bad part. This can work if the maintainer cannot see everything but only the "visible" part.
However, I don't think you can do anything like that with _source_ code, which is inspected. And if submitted code is not reviewed, there is nothing that can protect you from malicious code getting into the repository (and even worse it will get directly into the official repository!). So, I don't think we have to worry much about the possibility of a collision attack, but only about preimage attacks; and a preimage attack on SHA-1 is far away from reality. Dmitry ^ permalink raw reply [flat|nested] 38+ messages in thread
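The 2^80 figure above is the birthday bound: for an n-bit hash, a brute-force collision search is expected to need on the order of 2^(n/2) evaluations, so 160 bits gives roughly 2^80. The effect can be observed on a deliberately truncated hash. A minimal Python sketch, assuming only the first `bits` bits of SHA-1 are kept; `truncated_sha1` and `find_collision` are illustrative names:

```python
import hashlib
from itertools import count


def truncated_sha1(data: bytes, bits: int) -> int:
    """Return the top `bits` bits of the SHA-1 digest as an integer."""
    digest = int.from_bytes(hashlib.sha1(data).digest(), "big")
    return digest >> (160 - bits)


def find_collision(bits: int):
    """Birthday search: hash distinct messages until two truncated values match."""
    seen = {}
    for i in count():
        msg = b"msg-%d" % i
        h = truncated_sha1(msg, bits)
        if h in seen:
            # Return both messages and the number of hashes computed.
            return seen[h], msg, i + 1
        seen[h] = msg


if __name__ == "__main__":
    a, b, tries = find_collision(24)
    # For 24 bits the count is typically a few thousand (near 2**12),
    # far below the 2**24 needed to match one specific target value.
    print(a, b, tries)
```

The same gap separates a collision attack from the preimage attacks mentioned above: matching a given hash by brute force costs about 2^n, not 2^(n/2).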
* Re: About git and the use of SHA-1 2008-04-29 12:41 ` Dmitry Potapov @ 2008-04-29 14:41 ` Andreas Ericsson 2008-04-29 15:42 ` Nicolas Pitre 0 siblings, 1 reply; 38+ messages in thread From: Andreas Ericsson @ 2008-04-29 14:41 UTC (permalink / raw) To: Dmitry Potapov; +Cc: Henrik Austad, git Dmitry Potapov wrote: > On Mon, Apr 28, 2008 at 06:29:07PM +0200, Henrik Austad wrote: >> As discussed in [1], SHA-1 is not as secure as it once was (and this was in >> 2005), and I'm wondering - are there any plans for migrating to another >> hash-algorithm? I.e. SHA-2, whirlpool.. > > SHA-1 is broken in the sense that it requires computation less than > finding a collision by brute force (2^80). It is still very costly and > AFAIK no one yet has found a single collision for SHA-1 yet, but even if > such a collision is found, the question is how it can be exploit? > > This collision cannot be used to replace any existing code in Git. The > only way to exploit this collision is to submit a patch based on one > sequence to the maintainer and it should look legitimate to be accepted > and then create another blob with malicious code based on the other > sequence, so the second blob has the same SHA-1 then anyone who pulls > from you will get malicious code. > But they won't, because it's impossible to add two objects with the same SHA1 hash key to a git repository, since it will lazily re-use the existing one. In practice, this means that in the case of an "innocent" hash-collision, git will actually break by refusing to store the new content. > However, it is tricky to create these two blobs -- one which should pass > inspection and look like as a real improvement but the other one that > should do what you want. All what you have is two sequences of 20 bytes > with the same SHA-1 and you have no control over them. 
For some binary > files, it is possible by including both good and bad contents in the > submitted blob and using one sequence in the right place to hide the bad > part and make only the good one active/visible. Then the other blob will > be almost the same but contains the other sequence, which is used to > activate the bad part. This can work if the maintainer cannot see > everything but only the "visible" part. However, I don't think you can > do anything like that with _source_ code, which is inspect. And if > submitted code is not reviewed, there is nothing that can protect you > from malicious code getting into the repository (and even worse it will > get directly into the official repository!). > > So, I don't think we have to worry much about possibility a collision > attack, but only about preimage attacks; and a preimage attack on SHA-1 > is far away from reality. > Right. -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1 2008-04-29 14:41 ` Andreas Ericsson @ 2008-04-29 15:42 ` Nicolas Pitre 2008-04-29 15:59 ` Geoffrey Irving 0 siblings, 1 reply; 38+ messages in thread From: Nicolas Pitre @ 2008-04-29 15:42 UTC (permalink / raw) To: Andreas Ericsson; +Cc: Dmitry Potapov, Henrik Austad, git On Tue, 29 Apr 2008, Andreas Ericsson wrote: > But they won't, because it's impossible to add two objects with the same > SHA1 hash key to a git repository, since it will lazily re-use the > existing one. In practice, this means that in the case of an "innocent" > hash-collision, git will actually break by refusing to store the new > content. I'd also like to point out that Git usually receives "untrusted" new objects via the Git protocol through 'git index-pack'. If you look at sha1_object() in index-pack.c, you'll see that active verification against hash collisions is performed, and the fetch will abruptly be aborted if ever that happens. Yes, writing a test case for this was tricky. :-) Nicolas ^ permalink raw reply [flat|nested] 38+ messages in thread
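The kind of check described above can be sketched as follows. This is not the code from index-pack.c; it is a toy Python store that compares content whenever an incoming object's ID is already present. `ToyObjectStore`, `CollisionError`, and the pluggable `hash_fn` parameter are assumptions made for illustration (the weak hash exists only so the collision path can be exercised).

```python
import hashlib
from typing import Callable, Dict


def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


class CollisionError(Exception):
    """Raised when two different contents map to the same object ID."""


class ToyObjectStore:
    def __init__(self, hash_fn: Callable[[bytes], str] = sha1_hex) -> None:
        self._hash_fn = hash_fn
        self._objects: Dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        oid = self._hash_fn(data)
        existing = self._objects.get(oid)
        if existing is None:
            self._objects[oid] = data
        elif existing != data:
            # Same ID, different bytes: refuse rather than silently reuse.
            raise CollisionError(f"different content already stored as {oid}")
        return oid


if __name__ == "__main__":
    # A one-byte "hash" makes collisions trivial to trigger for the demo.
    weak = ToyObjectStore(hash_fn=lambda d: d[:1].hex())
    weak.add(b"aa")
    try:
        weak.add(b"ab")
    except CollisionError as err:
        print("rejected:", err)
```

Adding identical content twice is harmless and returns the same ID; only a mismatch between stored and incoming bytes aborts.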
* Re: About git and the use of SHA-1 2008-04-29 15:42 ` Nicolas Pitre @ 2008-04-29 15:59 ` Geoffrey Irving 2008-04-29 16:39 ` Nicolas Pitre 2008-04-29 18:17 ` Matthieu Moy 0 siblings, 2 replies; 38+ messages in thread From: Geoffrey Irving @ 2008-04-29 15:59 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Andreas Ericsson, Dmitry Potapov, Henrik Austad, git On Tue, Apr 29, 2008 at 8:42 AM, Nicolas Pitre <nico@cam.org> wrote: > On Tue, 29 Apr 2008, Andreas Ericsson wrote: > > > But they won't, because it's impossible to add two objects with the same > > SHA1 hash key to a git repository, since it will lazily re-use the > > existing one. In practice, this means that in the case of an "innocent" > > hash-collision, git will actually break by refusing to store the new > > content. > > I'd also like to point out that Git usually receive "untrusted" new > objects via the Git protocol through 'git index-pack'. If you look at > sha1_object() in index-pack.c, you'll see that active verification > against hash collision is performed, and the fetch will abruptly be > aborted if ever that happens. > > Yes, writing a test case for this was tricky. :-) Here's the standard scenario for a hash collision attack, with parties, A, B, and C: 1. C, the malicious one, computes the standard two pdfs with matching sha1 hashes. 2. C sends the valid pdf to B through a git commit, and B signs it with a tag. 3. C grabs the signature, and then forwards the "signed" commit to A, but substitutes the invalid pdf with the same hash. The fact that git will check for hash collisions within one repository is nice, but it doesn't significantly increase the security of git against hash collision attacks. Geoffrey ^ permalink raw reply [flat|nested] 38+ messages in thread
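The three-party scenario above can be simulated with a deliberately weak hash. In this Python sketch the object ID is only the first two bytes of SHA-1, so C's collision search is cheap, and an HMAC stands in for B's tag signature purely for brevity (a real tag signature is asymmetric). All names and document strings are hypothetical.

```python
import hashlib
import hmac
from itertools import count


def weak_id(data: bytes) -> bytes:
    """First 2 bytes of SHA-1: deliberately weak so collisions are cheap."""
    return hashlib.sha1(data).digest()[:2]


def sign(key: bytes, object_id: bytes) -> bytes:
    # Stand-in for B's signature on a tag; it covers only the ID.
    return hmac.new(key, object_id, hashlib.sha256).digest()


def verify(key: bytes, object_id: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(sign(key, object_id), signature)


def colliding_pair():
    """Step 1: C searches for a 'good' and an 'evil' document with equal IDs."""
    good_by_id = {}
    for i in count():
        good = b"pay C 10 dollars #%d" % i
        good_by_id.setdefault(weak_id(good), good)
        evil = b"pay C 10000 dollars #%d" % i
        match = good_by_id.get(weak_id(evil))
        if match is not None:
            return match, evil


if __name__ == "__main__":
    b_key = b"B's signing key"
    good, evil = colliding_pair()
    signature = sign(b_key, weak_id(good))           # step 2: B signs the good one
    print(verify(b_key, weak_id(evil), signature))   # step 3: A accepts the evil one
```

A's repository never held the good document, so a check for collisions inside A's own object store has nothing to compare against.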
* Re: About git and the use of SHA-1 2008-04-29 15:59 ` Geoffrey Irving @ 2008-04-29 16:39 ` Nicolas Pitre 2008-04-29 17:48 ` Geoffrey Irving 2008-04-29 18:17 ` Matthieu Moy 1 sibling, 1 reply; 38+ messages in thread From: Nicolas Pitre @ 2008-04-29 16:39 UTC (permalink / raw) To: Geoffrey Irving; +Cc: Andreas Ericsson, Dmitry Potapov, Henrik Austad, git On Tue, 29 Apr 2008, Geoffrey Irving wrote: > On Tue, Apr 29, 2008 at 8:42 AM, Nicolas Pitre <nico@cam.org> wrote: > > On Tue, 29 Apr 2008, Andreas Ericsson wrote: > > > > > But they won't, because it's impossible to add two objects with the same > > > SHA1 hash key to a git repository, since it will lazily re-use the > > > existing one. In practice, this means that in the case of an "innocent" > > > hash-collision, git will actually break by refusing to store the new > > > content. > > > > I'd also like to point out that Git usually receive "untrusted" new > > objects via the Git protocol through 'git index-pack'. If you look at > > sha1_object() in index-pack.c, you'll see that active verification > > against hash collision is performed, and the fetch will abruptly be > > aborted if ever that happens. > > > > Yes, writing a test case for this was tricky. :-) > > Here's the standard scenario for a hash collision attack, with > parties, A, B, and C: > > 1. C, the malicious one, computes the standard two pdfs with matching > sha1 hashes. > 2. C sends the valid pdf to B through a git commit, and B signs it with a tag. > 3. C grabs the signature, and then forwards the "signed" commit to A, > but substitutes the invalid pdf with the same hash. > > The fact that git will check for hash collisions within one repository > is nice, but it doesn't significantly increase the security of git > against hash collision attacks. Sure. But this is all complete handwaving until a practical collision can be demonstrated. So far the demonstration hasn't happened, practical or not. 
Nicolas ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1 2008-04-29 16:39 ` Nicolas Pitre @ 2008-04-29 17:48 ` Geoffrey Irving 2008-04-29 17:55 ` Nicolas Pitre 0 siblings, 1 reply; 38+ messages in thread From: Geoffrey Irving @ 2008-04-29 17:48 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Andreas Ericsson, Dmitry Potapov, Henrik Austad, git On Tue, Apr 29, 2008 at 9:39 AM, Nicolas Pitre <nico@cam.org> wrote: > > On Tue, 29 Apr 2008, Geoffrey Irving wrote: > > > On Tue, Apr 29, 2008 at 8:42 AM, Nicolas Pitre <nico@cam.org> wrote: > > > On Tue, 29 Apr 2008, Andreas Ericsson wrote: > > > > > > > But they won't, because it's impossible to add two objects with the same > > > > SHA1 hash key to a git repository, since it will lazily re-use the > > > > existing one. In practice, this means that in the case of an "innocent" > > > > hash-collision, git will actually break by refusing to store the new > > > > content. > > > > > > I'd also like to point out that Git usually receive "untrusted" new > > > objects via the Git protocol through 'git index-pack'. If you look at > > > sha1_object() in index-pack.c, you'll see that active verification > > > against hash collision is performed, and the fetch will abruptly be > > > aborted if ever that happens. > > > > > > Yes, writing a test case for this was tricky. :-) > > > > Here's the standard scenario for a hash collision attack, with > > parties, A, B, and C: > > > > 1. C, the malicious one, computes the standard two pdfs with matching > > sha1 hashes. > > 2. C sends the valid pdf to B through a git commit, and B signs it with a tag. > > 3. C grabs the signature, and then forwards the "signed" commit to A, > > but substitutes the invalid pdf with the same hash. > > > > The fact that git will check for hash collisions within one repository > > is nice, but it doesn't significantly increase the security of git > > against hash collision attacks. > > Sure. But this is all complete handwaving until a practical collision > can be demonstrated. 
So far the demonstration hasn't happened, > practical or not. Sorry for the confusion: it would be handwaving if I were saying git was insecure, but I'm not. I'm saying that if or when SHA1 becomes vulnerable to collision attacks, git will be insecure. Geoffrey ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1 2008-04-29 17:48 ` Geoffrey Irving @ 2008-04-29 17:55 ` Nicolas Pitre 2008-04-29 18:02 ` Geoffrey Irving 0 siblings, 1 reply; 38+ messages in thread From: Nicolas Pitre @ 2008-04-29 17:55 UTC (permalink / raw) To: Geoffrey Irving; +Cc: Andreas Ericsson, Dmitry Potapov, Henrik Austad, git On Tue, 29 Apr 2008, Geoffrey Irving wrote: > Sorry for the confusion: it would handwaving if I was saying git was insecure, > but I'm not. I'm saying that if or when SHA1 becomes vulnerable to collision > attacks, git will be insecure. Right. And if or when that happens then we'll make Git secure again with a different hash. In the mean time there is low return for the effort involved. Nicolas ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1 2008-04-29 17:55 ` Nicolas Pitre @ 2008-04-29 18:02 ` Geoffrey Irving 2008-04-29 18:41 ` Daniel Barkalow 0 siblings, 1 reply; 38+ messages in thread From: Geoffrey Irving @ 2008-04-29 18:02 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Andreas Ericsson, Dmitry Potapov, Henrik Austad, git On Tue, Apr 29, 2008 at 10:55 AM, Nicolas Pitre <nico@cam.org> wrote: > On Tue, 29 Apr 2008, Geoffrey Irving wrote: > > > > Sorry for the confusion: it would handwaving if I was saying git was insecure, > > but I'm not. I'm saying that if or when SHA1 becomes vulnerable to collision > > attacks, git will be insecure. > > Right. And if or when that happens then we'll make Git secure again > with a different hash. In the mean time there is low return for the > effort involved. Yes. I wasn't trying to advocate switching, just making sure people know that the "collisions don't matter" argument is bogus. One important thing: when SHA1 becomes vulnerable to collision attacks, it will still be secure to trust the repositories and tags that exist *at that moment.* I.e., the transition period from SHA1 to the next hash will also be secure, assuming that preimage attacks don't become possible simultaneously. So everything is good. Geoffrey ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1 2008-04-29 18:02 ` Geoffrey Irving @ 2008-04-29 18:41 ` Daniel Barkalow 2008-04-29 20:31 ` Geoffrey Irving 0 siblings, 1 reply; 38+ messages in thread From: Daniel Barkalow @ 2008-04-29 18:41 UTC (permalink / raw) To: Geoffrey Irving Cc: Nicolas Pitre, Andreas Ericsson, Dmitry Potapov, Henrik Austad, git On Tue, 29 Apr 2008, Geoffrey Irving wrote: > On Tue, Apr 29, 2008 at 10:55 AM, Nicolas Pitre <nico@cam.org> wrote: > > On Tue, 29 Apr 2008, Geoffrey Irving wrote: > > > > > > > Sorry for the confusion: it would handwaving if I was saying git was insecure, > > > but I'm not. I'm saying that if or when SHA1 becomes vulnerable to collision > > > attacks, git will be insecure. > > > > Right. And if or when that happens then we'll make Git secure again > > with a different hash. In the mean time there is low return for the > > effort involved. > > Yes. I wasn't trying to advocate switching, just making sure people > know that the "collisions don't matter" argument is bogus. It's bogus to say they completely don't matter, but I still claim that they don't matter for the things people actually care about. If people can generate collisions, they can commit a "weak" blob with a conditional that can be switched by replacing the blob. But it's almost always true that people could commit a blob with a conditional that can be switched by something else under the attacker's more direct control. Using a better hash function won't save you from a document like: if (getdate() < 2009) render_good_text else render_evil_text even if it does help with: if (AA == AA) render_good_text else render_evil_text If you're not checking your files for the former, you shouldn't worry about the latter, because the former is much easier and more subtle. 
(Now, an arbitrary preimage attack would actually be significant, still, because the attacker could replace an honestly-created "restrictive security policy" file with garbage that will be ignored, leaving stuff unprotected) -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1 2008-04-29 18:41 ` Daniel Barkalow @ 2008-04-29 20:31 ` Geoffrey Irving 2008-04-29 20:50 ` Fredrik Skolmli 2008-04-30 2:58 ` Martin Langhoff 0 siblings, 2 replies; 38+ messages in thread From: Geoffrey Irving @ 2008-04-29 20:31 UTC (permalink / raw) To: Daniel Barkalow Cc: Nicolas Pitre, Andreas Ericsson, Dmitry Potapov, Henrik Austad, git On Tue, Apr 29, 2008 at 11:41 AM, Daniel Barkalow <barkalow@iabervon.org> wrote: > On Tue, 29 Apr 2008, Geoffrey Irving wrote: > > > On Tue, Apr 29, 2008 at 10:55 AM, Nicolas Pitre <nico@cam.org> wrote: > > > On Tue, 29 Apr 2008, Geoffrey Irving wrote: > > > > > > > > > > Sorry for the confusion: it would handwaving if I was saying git was insecure, > > > > but I'm not. I'm saying that if or when SHA1 becomes vulnerable to collision > > > > attacks, git will be insecure. > > > > > > Right. And if or when that happens then we'll make Git secure again > > > with a different hash. In the mean time there is low return for the > > > effort involved. > > > > Yes. I wasn't trying to advocate switching, just making sure people > > know that the "collisions don't matter" argument is bogus. > > It's bogus to say they completely don't matter, but I still claim that > they don't matter for the things people actually care about. If people can > generate collisions, they can commit a "weak" blob with a conditional that > can be switched by replacing the blob. But it's almost always true that > people could commit a blob with a conditional that can be switched by > something else under the attacker's more direct control. Using a better > hash function won't save you from a document like: > > if (getdate() < 2009) > render_good_text > else > render_evil_text > > even if it does help with: > > if (AA == AA) > render_good_text > else > render_evil_text > > If you're not checking your files for the former, you shouldn't worry > about the latter, because the former is much easier and more subtle. 
I sincerely hope that pdf/postscript don't allow the internal rendering code to branch based on the current date. That would be an absurd security hole, and would indeed make you entirely correct. If you actually know that it is possible to write that in postscript, I would very much want to see an example. In any case, in a binary document format that isn't insane (examples of these at least include black and white .png images of documents), a visual check of the content is sufficient to ensure that the next person who looks at it will see roughly the same visual content. Git should be (and currently is) a secure method of transferring sane binary documents. Geoffrey ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1 2008-04-29 20:31 ` Geoffrey Irving @ 2008-04-29 20:50 ` Fredrik Skolmli 2008-04-29 21:39 ` Geoffrey Irving 2008-04-30 2:58 ` Martin Langhoff 1 sibling, 1 reply; 38+ messages in thread From: Fredrik Skolmli @ 2008-04-29 20:50 UTC (permalink / raw) To: Geoffrey Irving Cc: Daniel Barkalow, Nicolas Pitre, Andreas Ericsson, Dmitry Potapov, Henrik Austad, git On Tue, Apr 29, 2008 at 01:31:51PM -0700, Geoffrey Irving wrote: > I sincerely hope that pdf/postscript don't allow the internal > rendering code to branch based on the current date. That would be an > absurd security hole, and would indeed make you entirely correct. If > you actually know that it is possible to write that in postscript, I > would very much want to see an example. Have a look at * http://th.informatik.uni-mannheim.de/People/Lucks/HashCollisions/letter_of_rec.ps vs * http://th.informatik.uni-mannheim.de/People/Lucks/HashCollisions/order.ps both found on a website[1] already mentioned[2] in this thread. :-) [1]: http://th.informatik.uni-mannheim.de/People/Lucks/HashCollisions/ [2]: http://marc.info/?l=git&m=120949349923584&w=2 - F -- Regards, Fredrik Skolmli ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1 2008-04-29 20:50 ` Fredrik Skolmli @ 2008-04-29 21:39 ` Geoffrey Irving 2008-04-29 21:52 ` Fredrik Skolmli 0 siblings, 1 reply; 38+ messages in thread From: Geoffrey Irving @ 2008-04-29 21:39 UTC (permalink / raw) To: Fredrik Skolmli Cc: Daniel Barkalow, Nicolas Pitre, Andreas Ericsson, Dmitry Potapov, Henrik Austad, git On Tue, Apr 29, 2008 at 1:50 PM, Fredrik Skolmli <fredrik@frsk.net> wrote: > On Tue, Apr 29, 2008 at 01:31:51PM -0700, Geoffrey Irving wrote: > > > I sincerely hope that pdf/postscript don't allow the internal > > rendering code to branch based on the current date. That would be an > > absurd security hole, and would indeed make you entirely correct. If > > you actually know that it is possible to write that in postscript, I > > would very much want to see an example. > > Have a look at > > * http://th.informatik.uni-mannheim.de/People/Lucks/HashCollisions/letter_of_rec.ps > vs > * http://th.informatik.uni-mannheim.de/People/Lucks/HashCollisions/order.ps > > both found on a website[1] already mentioned[2] in this thread. :-) > > [1]: http://th.informatik.uni-mannheim.de/People/Lucks/HashCollisions/ > [2]: http://marc.info/?l=git&m=120949349923584&w=2 This is an example of a hash collision, not conditional rendering based on the current date. I.e., you didn't actually read my email or the email I was replying to. :) Geoffrey ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1 2008-04-29 21:39 ` Geoffrey Irving @ 2008-04-29 21:52 ` Fredrik Skolmli 0 siblings, 0 replies; 38+ messages in thread From: Fredrik Skolmli @ 2008-04-29 21:52 UTC (permalink / raw) To: Geoffrey Irving; +Cc: git On Tue, Apr 29, 2008 at 02:39:46PM -0700, Geoffrey Irving wrote: > This is an example of a hash collision, not conditional rendering > based on the current date. I.e., you didn't actually read my email or > the email I was replying to. :) Ah, you're right. Didn't notice the part about dates. Sorry ;-) -- Regards, Fredrik Skolmli ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1 2008-04-29 20:31 ` Geoffrey Irving 2008-04-29 20:50 ` Fredrik Skolmli @ 2008-04-30 2:58 ` Martin Langhoff 2008-04-30 5:18 ` Geoffrey Irving 1 sibling, 1 reply; 38+ messages in thread From: Martin Langhoff @ 2008-04-30 2:58 UTC (permalink / raw) To: Geoffrey Irving Cc: Daniel Barkalow, Nicolas Pitre, Andreas Ericsson, Dmitry Potapov, Henrik Austad, git On Wed, Apr 30, 2008 at 8:31 AM, Geoffrey Irving <irving@naml.us> wrote: > I sincerely hope that pdf/postscript don't allow the internal > rendering code to branch based on the current date. That would be an > absurd security hole, and would indeed make you entirely correct. If PS is Turing complete, and does know about dates. So yes, you can make such conditionals. That original md5 paper with the 2 PDF files is mainly a good example that you shouldn't trust binary blobs, that's all. The md5 trick is a nice demo, but misses the point entirely. I can't find it now, but someone had written a PDF file that printed Pi, computing it inside the PS VM. The tiny file would keep the printer churning out paper until it ran out of memory. :-) cheers, m -- martin.langhoff@gmail.com martin@laptop.org -- School Server Architect - ask interesting questions - don't get distracted with shiny stuff - working code first - http://wiki.laptop.org/go/User:Martinlanghoff ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: About git and the use of SHA-1
  2008-04-30 2:58 ` Martin Langhoff
@ 2008-04-30 5:18 ` Geoffrey Irving
  2008-04-30 5:47 ` David Brown
  0 siblings, 1 reply; 38+ messages in thread
From: Geoffrey Irving @ 2008-04-30 5:18 UTC
To: Martin Langhoff
Cc: Daniel Barkalow, Nicolas Pitre, Andreas Ericsson, Dmitry Potapov, Henrik Austad, git

On Tue, Apr 29, 2008 at 7:58 PM, Martin Langhoff <martin.langhoff@gmail.com> wrote:
> On Wed, Apr 30, 2008 at 8:31 AM, Geoffrey Irving <irving@naml.us> wrote:
> > I sincerely hope that pdf/postscript don't allow the internal
> > rendering code to branch based on the current date. That would be an
> > absurd security hole, and would indeed make you entirely correct. If
>
> PS is Turing complete, and does know about dates. So yes, you can make
> such conditionals.

I knew postscript was Turing complete, but had (naively) assumed it
executed sandboxed and deterministically and would therefore display
uniformly barring interpreter bugs. Looking over the spec, I can't
find where it's possible to read the current date, but the
usertime/realtime variables are sufficient as long as the attacker
knows how fast the relevant machines are.

> That original md5 paper with the 2 PDF files is mainly a good example
> that you shouldn't trust binary blobs, that's all. The md5 trick is a
> nice demo, but misses the point entirely.
>
> I can't find it now, but someone had written a PDF file that printed
> Pi, computing it inside the PS VM. The tiny file would keep the printer
> churning out paper until it ran out of memory. :-)

According to wikipedia, PDF doesn't have conditionals or loops of any
kind, so you probably mean a postscript file.

Geoffrey
* Re: About git and the use of SHA-1
  2008-04-30 5:18 ` Geoffrey Irving
@ 2008-04-30 5:47 ` David Brown
  2008-04-30 5:56 ` Martin Langhoff
  0 siblings, 1 reply; 38+ messages in thread
From: David Brown @ 2008-04-30 5:47 UTC
To: Geoffrey Irving
Cc: Martin Langhoff, Daniel Barkalow, Nicolas Pitre, Andreas Ericsson, Dmitry Potapov, Henrik Austad, git

On Tue, Apr 29, 2008 at 10:18:55PM -0700, Geoffrey Irving wrote:

> > PS is Turing complete, and does know about dates. So yes, you can make
> > such conditionals.
>
> I knew postscript was Turing complete, but had (naively) assumed it
> executed sandboxed and deterministically and would therefore display
> uniformly barring interpreter bugs. Looking over the spec, I can't
> find where it's possible to read the current date, but the
> usertime/realtime variables are sufficient as long as the attacker
> knows how fast the relevant machines are.

usertime and realtime are from the start of the invocation of the
postscript interpreter, not based on the outside world. So, the
interpreter could wait arbitrarily long, but has no way of knowing any
external reference to time.

I could imagine trickery with PDF signatures and their expiration times,
but you shouldn't be able to do anything with the information, so it would
be an exploit, and would probably be fixed.

David
* Re: About git and the use of SHA-1
  2008-04-30 5:47 ` David Brown
@ 2008-04-30 5:56 ` Martin Langhoff
  0 siblings, 0 replies; 38+ messages in thread
From: Martin Langhoff @ 2008-04-30 5:56 UTC
To: Geoffrey Irving, Martin Langhoff, Daniel Barkalow, Nicolas Pitre, Andreas Ericsson

On Wed, Apr 30, 2008 at 5:47 PM, David Brown <git@davidb.org> wrote:
> usertime and realtime are from the start of the invocation of the
> postscript interpreter, not based on the outside world. So, the

You guys are right - I misremembered the spec wrt dates. I had the
distinct impression that there was a way to get the epoch.

Sorry about the noise.

martin
--
martin.langhoff@gmail.com
martin@laptop.org -- School Server Architect
- ask interesting questions
- don't get distracted with shiny stuff - working code first
- http://wiki.laptop.org/go/User:Martinlanghoff
* Re: About git and the use of SHA-1
  2008-04-29 15:59 ` Geoffrey Irving
  2008-04-29 16:39 ` Nicolas Pitre
@ 2008-04-29 18:17 ` Matthieu Moy
  2008-04-29 18:23 ` Fredrik Skolmli
  1 sibling, 1 reply; 38+ messages in thread
From: Matthieu Moy @ 2008-04-29 18:17 UTC
To: Geoffrey Irving
Cc: Nicolas Pitre, Andreas Ericsson, Dmitry Potapov, Henrik Austad, git

"Geoffrey Irving" <irving@naml.us> writes:

> Here's the standard scenario for a hash collision attack, with
> parties, A, B, and C:
>
> 1. C, the malicious one, computes the standard two pdfs with matching
>    sha1 hashes.
> 2. C sends the valid pdf to B through a git commit, and B signs it with a tag.
> 3. C grabs the signature, and then forwards the "signed" commit to A,
>    but substitutes the invalid pdf with the same hash.

Just to add my 2 cents, examples of this are available on the web,
like:

http://th.informatik.uni-mannheim.de/People/Lucks/HashCollisions/

Same size, same hash. But that's with md5, not sha1.

--
Matthieu
* Re: About git and the use of SHA-1
  2008-04-29 18:17 ` Matthieu Moy
@ 2008-04-29 18:23 ` Fredrik Skolmli
  0 siblings, 0 replies; 38+ messages in thread
From: Fredrik Skolmli @ 2008-04-29 18:23 UTC
To: Matthieu Moy
Cc: Geoffrey Irving, Nicolas Pitre, Andreas Ericsson, Dmitry Potapov, Henrik Austad, git

On Tue, Apr 29, 2008 at 08:17:51PM +0200, Matthieu Moy wrote:
> > Here's the standard scenario for a hash collision attack, with
> > parties, A, B, and C:
> >
> > 1. C, the malicious one, computes the standard two pdfs with matching
> >    sha1 hashes.
> > 2. C sends the valid pdf to B through a git commit, and B signs it with a tag.
> > 3. C grabs the signature, and then forwards the "signed" commit to A,
> >    but substitutes the invalid pdf with the same hash.
>
> Just to add my 2 cents, examples of this are available on the web,
> like:
>
> http://th.informatik.uni-mannheim.de/People/Lucks/HashCollisions/
>
> Same size, same hash. But that's with md5, not sha1.

Well yes, but that's still using the methods already mentioned in this
thread. So you do have to get your "good" code approved before replacing it
with something nasty.

- Fredrik

--
Regards,
Fredrik Skolmli
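The three-step scenario quoted above works because a signed tag vouches for object IDs rather than for file contents. As a rough illustration (not taken from the thread), the sketch below computes a blob ID the way git does, as SHA-1 over a `blob <size>\0` header followed by the content, and then checks content against a signed ID. The names `git_blob_id` and `matches_signed_id` and the sample byte strings are hypothetical; real verification goes through GPG and the tag/commit/tree chain, which is left out here.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Blob object ID as git computes it: SHA-1 over a short header plus the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical stand-ins for the "valid" and "invalid" pdfs in the scenario.
good = b"harmless document\n"
evil = b"malicious document\n"

# B's signature covers this ID, not the bytes themselves.
signed_id = git_blob_id(good)


def matches_signed_id(content: bytes) -> bool:
    """Check content the only way a recipient can: by recomputing the ID."""
    return git_blob_id(content) == signed_id


print(matches_signed_id(good))
print(matches_signed_id(evil))
```

Any second file whose header plus content hashes to the same value would pass the same check, which is why the scenario needs a collision that C prepares in advance, with the git header taken into account.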
* Re: About git and the use of SHA-1
  2008-04-28 16:29 About git and the use of SHA-1 Henrik Austad
  2008-04-28 19:34 ` Daniel Barkalow
  2008-04-29 12:41 ` Dmitry Potapov
@ 2008-04-29 15:02 ` Tom Widmer
  2008-04-29 17:08 ` Tom Widmer
  3 siblings, 0 replies; 38+ messages in thread
From: Tom Widmer @ 2008-04-29 15:02 UTC
To: git

Henrik Austad wrote:
> Hi list!
>
> As far as I have gathered, the SHA-1-sum is used as a identifier for commits,
> and that is the primary reason for using sha1. However, several places
> (including the google tech-talk featuring Linus himself) states that the id's
> are cryptographically secure.
>
> As discussed in [1], SHA-1 is not as secure as it once was (and this was in
> 2005), and I'm wondering - are there any plans for migrating to another
> hash-algorithm? I.e. SHA-2, whirlpool..
>
> [1] http://www.schneier.com/blog/archives/2005/02/cryptanalysis_o.html

Why not wait until the results of:

are available. That will surely be soon enough.

Tom
* Re: About git and the use of SHA-1
  2008-04-28 16:29 About git and the use of SHA-1 Henrik Austad
  ` (2 preceding siblings ...)
  2008-04-29 15:02 ` Tom Widmer
@ 2008-04-29 17:08 ` Tom Widmer
  3 siblings, 0 replies; 38+ messages in thread
From: Tom Widmer @ 2008-04-29 17:08 UTC
To: git

Henrik Austad wrote:
> Hi list!
>
> As far as I have gathered, the SHA-1-sum is used as a identifier for commits,
> and that is the primary reason for using sha1. However, several places
> (including the google tech-talk featuring Linus himself) states that the id's
> are cryptographically secure.
>
> As discussed in [1], SHA-1 is not as secure as it once was (and this was in
> 2005), and I'm wondering - are there any plans for migrating to another
> hash-algorithm? I.e. SHA-2, whirlpool..
>
> [1] http://www.schneier.com/blog/archives/2005/02/cryptanalysis_o.html

Why not wait until the results of:

http://www.csrc.nist.gov/groups/ST/hash/index.html

are available. That will surely be soon enough (I think 2012 is the
expected finish date), and should prevent having to switch again in the
future. The necessity or otherwise of improving the hashing will be
clearer by then too.

Tom
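For a sense of what switching hash algorithms would change at the object-ID level, here is a small sketch. It assumes a hypothetical migration that keeps git's `blob <size>\0` header and only swaps the hash function; nothing in the thread says a migration would work this way. The helper `object_id` and the choice of SHA-256 as the SHA-2 variant are assumptions for illustration.

```python
import hashlib


def object_id(content: bytes, algo: str = "sha1") -> str:
    """Hash a git-style blob header plus content with the named algorithm.

    With algo="sha1" this matches git's blob ID; any other algorithm is
    a hypothetical variant, not something git is stated to do here.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.new(algo, header + content).hexdigest()


sample = b"hello world\n"
for algo in ("sha1", "sha256"):
    oid = object_id(sample, algo)
    # Each hex digit carries 4 bits, so the ID length shows the digest size.
    print(algo, len(oid) * 4, "bits:", oid)
```

The IDs grow from 40 to 64 hex digits under this assumption, so every stored reference to an object (trees, commits, tags, packs) would need to change along with the hash, which is part of why waiting for a settled successor is attractive.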
end of thread, other threads:[~2008-04-30 5:57 UTC | newest]

Thread overview: 38+ messages
2008-04-28 16:29 About git and the use of SHA-1 Henrik Austad
2008-04-28 19:34 ` Daniel Barkalow
2008-04-28 21:29 ` Henrik Austad
2008-04-28 22:15 ` Daniel Barkalow
2008-04-29 6:38 ` Andreas Ericsson
2008-04-29 7:09 ` Russ Dill
2008-04-29 7:21 ` Andreas Ericsson
2008-04-29 11:05 ` Sverre Rabbelier
2008-04-29 12:27 ` Andreas Ericsson
2008-04-29 13:05 ` Paolo Bonzini
2008-04-29 14:37 ` Andreas Ericsson
2008-04-29 14:52 ` Paolo Bonzini
2008-04-29 16:24 ` Russ Dill
2008-04-29 12:46 ` Jurko Gospodnetić
2008-04-29 16:21 ` Russ Dill
2008-04-29 15:34 ` Geoffrey Irving
2008-04-29 16:27 ` Daniel Barkalow
2008-04-29 12:41 ` Dmitry Potapov
2008-04-29 14:41 ` Andreas Ericsson
2008-04-29 15:42 ` Nicolas Pitre
2008-04-29 15:59 ` Geoffrey Irving
2008-04-29 16:39 ` Nicolas Pitre
2008-04-29 17:48 ` Geoffrey Irving
2008-04-29 17:55 ` Nicolas Pitre
2008-04-29 18:02 ` Geoffrey Irving
2008-04-29 18:41 ` Daniel Barkalow
2008-04-29 20:31 ` Geoffrey Irving
2008-04-29 20:50 ` Fredrik Skolmli
2008-04-29 21:39 ` Geoffrey Irving
2008-04-29 21:52 ` Fredrik Skolmli
2008-04-30 2:58 ` Martin Langhoff
2008-04-30 5:18 ` Geoffrey Irving
2008-04-30 5:47 ` David Brown
2008-04-30 5:56 ` Martin Langhoff
2008-04-29 18:17 ` Matthieu Moy
2008-04-29 18:23 ` Fredrik Skolmli
2008-04-29 15:02 ` Tom Widmer
2008-04-29 17:08 ` Tom Widmer