* easy way to demonstrate length of colliding SHA-1 prefixes?
From: Robert P. J. Day @ 2018-12-02 11:50 UTC
To: Git Mailing list

  as part of an upcoming git class i'm delivering, i thought it would
be amusing to demonstrate the maximum length of colliding SHA-1
prefixes in a repository (in my case, i use the linux kernel git repo
for most of my examples).

  is there a way to display the objects in the object database that
clash in the longest object name SHA-1 prefix; i mean, short of
manually listing all object names, running that through cut and sort
and uniq and ... you get the idea.

  is there a cute way to do that? thanks.

rday
* Re: easy way to demonstrate length of colliding SHA-1 prefixes?
From: Ævar Arnfjörð Bjarmason @ 2018-12-02 13:23 UTC
To: Robert P. J. Day; +Cc: Git Mailing list

On Sun, Dec 02 2018, Robert P. J. Day wrote:

>   as part of an upcoming git class i'm delivering, i thought it would
> be amusing to demonstrate the maximum length of colliding SHA-1
> prefixes in a repository (in my case, i use the linux kernel git repo
> for most of my examples).
>
>   is there a way to display the objects in the object database that
> clash in the longest object name SHA-1 prefix; i mean, short of
> manually listing all object names, running that through cut and sort
> and uniq and ... you get the idea.
>
>   is there a cute way to do that? thanks.

You'll always need to list them all. It's inherently an operation where
for each SHA-1 you need to search for other ones with that prefix up to
a given length.

Perhaps you've missed that you can use --abbrev=N for this, and just
grep for things that are longer than that N, e.g. for linux.git:

    git log --oneline --abbrev=10 --pretty=format:%h |
    grep -E -v '^.{10}$' |
    perl -pe 's/^(.{10}).*/$1/'

This will list the 4 objects that need more than 10 characters to be
shown unambiguously. If you then "git cat-file -t" them you'll get the
disambiguation help.
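A small variation on the --abbrev idea, offered as a sketch rather than anything from the thread: git prints the shortest unambiguous abbreviation that is at least N characters long, so asking for a low N (4 is, to my knowledge, the minimum git accepts) and keeping the longest %h seen gives the longest abbreviation any commit needs. The longest colliding prefix among commits is then one character shorter. The helper name max_abbrev_length is made up for illustration, and a POSIX awk is assumed.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): read abbreviated object names on
# stdin, one per line, and print the length of the longest one.
max_abbrev_length() {
    awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }'
}

# Intended use inside a repository (commits only, like the pipeline above):
#   git log --abbrev=4 --pretty=format:%h | max_abbrev_length
```

Like the original pipeline, this only looks at commit names, although git checks each abbreviation against every object in the database.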
* Re: easy way to demonstrate length of colliding SHA-1 prefixes?
From: Robert P. J. Day @ 2018-12-02 16:24 UTC
To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing list

On Sun, 2 Dec 2018, Ævar Arnfjörð Bjarmason wrote:

> On Sun, Dec 02 2018, Robert P. J. Day wrote:
>
> >   as part of an upcoming git class i'm delivering, i thought it
> > would be amusing to demonstrate the maximum length of colliding
> > SHA-1 prefixes in a repository (in my case, i use the linux kernel
> > git repo for most of my examples).
> >
> >   is there a way to display the objects in the object database
> > that clash in the longest object name SHA-1 prefix; i mean, short
> > of manually listing all object names, running that through cut and
> > sort and uniq and ... you get the idea.
> >
> >   is there a cute way to do that? thanks.
>
> You'll always need to list them all. It's inherently an operation
> where for each SHA-1 you need to search for other ones with that
> prefix up to a given length.

  i assumed as much, just wasn't sure about the esoteric dark corners
of git i've never gotten to yet.

> Perhaps you've missed that you can use --abbrev=N for this, and just
> grep for things that are longer than that N, e.g. for linux.git:
>
>     git log --oneline --abbrev=10 --pretty=format:%h |
>     grep -E -v '^.{10}$' |
>     perl -pe 's/^(.{10}).*/$1/'
>
> This will list the 4 objects that need more than 10 characters to be
> shown unambiguously. If you then "git cat-file -t" them you'll get
> the disambiguation help.

  that's pretty close to what i came up with, thanks.

rday

-- 
========================================================================
Robert P. J. Day
Ottawa, Ontario, CANADA
http://crashcourse.ca/dokuwiki

Twitter: http://twitter.com/rpjday
LinkedIn: http://ca.linkedin.com/in/rpjday
========================================================================
* Re: easy way to demonstrate length of colliding SHA-1 prefixes?
From: Matthew DeVore @ 2018-12-03 22:30 UTC
To: Ævar Arnfjörð Bjarmason, Robert P. J. Day; +Cc: Git Mailing list

On 12/02/2018 05:23 AM, Ævar Arnfjörð Bjarmason wrote:
>
> On Sun, Dec 02 2018, Robert P. J. Day wrote:
>
>>    as part of an upcoming git class i'm delivering, i thought it would
>> be amusing to demonstrate the maximum length of colliding SHA-1
>> prefixes in a repository (in my case, i use the linux kernel git repo
>> for most of my examples).
>>
>>    is there a way to display the objects in the object database that
>> clash in the longest object name SHA-1 prefix; i mean, short of
>> manually listing all object names, running that through cut and sort
>> and uniq and ... you get the idea.
>>
>>    is there a cute way to do that? thanks.
>

Here is a one-liner to do it. It is Perl line noise, so it's not very
cute, though that is subjective. The output shown below is for the Git
project (not Linux) repository as I've currently synced it:

    $ git rev-list --objects HEAD | sort | perl -anE 'BEGIN { $prev = ""; $long = "" }
        $n = $F[0]; for my $i (reverse 1..40) { last if $i < length($long);
        if (substr($prev, 0, $i) eq substr($n, 0, $i)) { $long = substr($prev, 0, $i); last } }
        $prev = $n; END { say $long }'
    c68038ef

    $ git cat-file -t c68038ef
    error: short SHA1 c68038ef is ambiguous
    hint: The candidates are:
    hint:   c68038effe commit 2012-06-01 - vcs-svn: suppress a signed/unsigned comparison warning
    hint:   c68038ef00 blob
    fatal: Not a valid object name c68038ef

> You'll always need to list them all. It's inherently an operation where
> for each SHA-1 you need to search for other ones with that prefix up to
> a given length.
>
> Perhaps you've missed that you can use --abbrev=N for this, and just
> grep for things that are longer than that N, e.g. for linux.git:
>
>      git log --oneline --abbrev=10 --pretty=format:%h |
>      grep -E -v '^.{10}$' |
>      perl -pe 's/^(.{10}).*/$1/'

I think the goal was to search all object hashes, not just commits. And
git rev-list --objects will do that.
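For readers who find the Perl hard to follow, the same adjacent-pair comparison can be written out in awk. This is only a sketch under stated assumptions: the input is one full object name per line and already sorted, and the function name longest_shared_prefix is invented for this example. Once the names are sorted, the two names sharing the longest prefix are always neighbours, so comparing each line with the previous one is enough.

```shell
#!/bin/sh
# Hypothetical helper: print the longest prefix shared by any two adjacent
# lines of sorted input (one object name per line, first column only).
longest_shared_prefix() {
    awk '
        NR > 1 && $1 != prev {
            # Count the leading characters this name shares with the previous one.
            n = 0
            while (n < length($1) && n < length(prev) &&
                   substr($1, n + 1, 1) == substr(prev, n + 1, 1))
                n++
            # Keep the first prefix that beats the best length seen so far.
            if (n > best) { best = n; prefix = substr($1, 1, n) }
        }
        { prev = $1 }
        END { print prefix }
    '
}

# Intended use inside a repository:
#   git rev-list --objects HEAD | sort | longest_shared_prefix
```

Only the first whitespace-separated column is compared, so the path names that rev-list --objects prints after each object name are ignored.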
* Re: easy way to demonstrate length of colliding SHA-1 prefixes?
From: Jeff King @ 2018-12-03 22:57 UTC
To: Matthew DeVore
Cc: Ævar Arnfjörð Bjarmason, Robert P. J. Day, Git Mailing list

On Mon, Dec 03, 2018 at 02:30:44PM -0800, Matthew DeVore wrote:

> Here is a one-liner to do it. It is Perl line noise, so it's not very cute,
> though that is subjective. The output shown below is for the Git project
> (not Linux) repository as I've currently synced it:
>
> $ git rev-list --objects HEAD | sort | perl -anE 'BEGIN { $prev = ""; $long
> = "" } $n = $F[0]; for my $i (reverse 1..40) {last if $i < length($long); if
> (substr($prev, 0, $i) eq substr($n, 0, $i)) {$long = substr($prev, 0, $i);
> last} } $prev = $n; END {say $long}'

Ooh, object-collision golf.

Try:

    git cat-file --batch-all-objects --batch-check='%(objectname)'

instead of "rev-list | sort". It's _much_ faster, because it doesn't
have to actually open the objects and walk the graph.

Some versions of uniq have "-w" (including GNU, but it's definitely not
in POSIX), which lets you do:

    git cat-file --batch-all-objects --batch-check='%(objectname)' |
    uniq -cdw 7

to list all collisions of length 7 (it will show just the first item
from each group, but you can use -D to see them all).

> > You'll always need to list them all. It's inherently an operation where
> > for each SHA-1 you need to search for other ones with that prefix up to
> > a given length.
> >
> > Perhaps you've missed that you can use --abbrev=N for this, and just
> > grep for things that are longer than that N, e.g. for linux.git:
> >
> >      git log --oneline --abbrev=10 --pretty=format:%h |
> >      grep -E -v '^.{10}$' |
> >      perl -pe 's/^(.{10}).*/$1/'
>
> I think the goal was to search all object hashes, not just commits. And git
> rev-list --objects will do that.

You can add "-t --raw" to see the abbreviated tree and blob names,
though it gets tricky around handling merges.

-Peff
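The uniq -w trick can also be looped to find the longest colliding length directly. What follows is a rough sketch, not something posted in the thread: it assumes GNU uniq (for -w), a file of sorted object names with one name per line, and a made-up function name max_collision_length. The optional second argument caps the search at the hash length, defaulting to 40 for SHA-1.

```shell
#!/bin/sh
# Hypothetical helper: print the largest N for which at least two names in
# the sorted file "$1" share their first N characters. Requires GNU uniq.
# Runs in a subshell, so its variables do not leak into the caller.
max_collision_length() (
    file=$1
    limit=${2:-40}   # assumed hash length in hex digits (40 for SHA-1)
    len=0
    # uniq -d prints one line per group of neighbours that compare equal in
    # their first N characters; empty output means no collision at length N.
    while [ "$len" -lt "$limit" ] &&
          [ -n "$(uniq -dw "$((len + 1))" "$file")" ]; do
        len=$((len + 1))
    done
    echo "$len"
)

# Intended use inside a repository:
#   git cat-file --batch-all-objects --batch-check='%(objectname)' >names.txt
#   max_collision_length names.txt
```

This rereads the list once per length, which is wasteful but keeps the logic obvious. To then see every member of the colliding groups, something like "uniq -Dw N names.txt" should work with GNU uniq; as far as I know -D cannot be combined with -c, so the count has to be dropped.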