* how to determine version of binary
@ 2012-05-05 7:12 Neal Kreitzinger
2012-05-05 9:24 ` Jeff King
2012-05-05 11:43 ` Sitaram Chamarty
0 siblings, 2 replies; 5+ messages in thread
From: Neal Kreitzinger @ 2012-05-05 7:12 UTC (permalink / raw)
To: git
Scenario: I detect a binary file that is 'dirty'. I don't know how it
got there. However, I know it came from a git repo. So I calculate the
sha1 of the binary. What is the git command to determine which commit
that binary version first appeared in? And the last commit that binary
appeared in?
Why: we have people ftp'ing binaries around. I want to see the commit
message and source change of that commit to see what the binary version is.
v/r,
neal
* Re: how to determine version of binary
From: Jeff King @ 2012-05-05 9:24 UTC (permalink / raw)
To: Neal Kreitzinger; +Cc: git
On Sat, May 05, 2012 at 02:12:44AM -0500, Neal Kreitzinger wrote:
> Scenario: I detect a binary file that is 'dirty'. I don't know how
> it got there. However, I know it came from a git repo. So I
> calculate the sha1 of the binary. What is the git command to
> determine which commit that binary version first appeared in? And
> the last commit that binary appeared in?
There is no pre-made git command. I would look at the output of "git log --raw
--no-abbrev" in a pager and search for the sha1 in question. That will show you
the commits that made it come and go. Note that there may be multiple instances
in which the sha1 comes and goes (e.g., two parallel lines of development which
both introduce or modify a sha1, or even linear development with reverting).
You can script it like this:
  git log --format=%H --no-abbrev --raw |
  perl -lne '
    BEGIN { $sha1 = shift }
    if (/^[0-9a-f]{40}$/) {
      $commit = $_;
    }
    elsif (/^:\d+ \d+ ([0-9a-f]{40}) ([0-9a-f]{40}) \S+\t(.*)/) {
      if ($2 eq $sha1) {
        # sha1 on "after" side; content probably came into existence
        if ($1 eq $sha1) {
          # unless it was that way before, in which case it was a mode change
          # or rename. Ignore.
        }
        else {
          print "$commit: $sha1 appears (as $3)";
        }
      }
      elsif ($1 eq $sha1) {
        # sha1 on "before" side; content went away
        print "$commit: $sha1 went away (from $3)";
      }
    }
  ' $sha1_of_interest
though I wouldn't bother to do so unless I was going to do some analysis over
many files.
> Why: we have people ftp'ing binaries around. I want to see the
> commit message and source change of that commit to see what the
> binary version is.
This won't necessarily show you the version they have; it will only show you
the version that introduced that particular version of a file. A more general
question is "given a set of files, which revision did they come from?". For
that, you would want to find the set of commits that contain sha1 A, then
intersect them with the set of commits that contain sha1 B, and so forth. You
can do that by scripting around "rev-list" and "ls-tree", but it's a little
more complicated.
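A minimal sketch of that "rev-list" plus "ls-tree" approach, under some assumptions: the function name commits_containing_all is made up for illustration, only commits reachable from HEAD are examined, and a file counts as present if its blob id occurs at any path in the commit. Instead of building one set per file and intersecting them afterwards, it tests each commit against all of the ids at once, which should give the same result as the intersection.

```shell
# Hypothetical helper: print every commit reachable from HEAD whose tree
# contains ALL of the blob ids given as arguments (at any path).
commits_containing_all() {
    git rev-list HEAD |
    while read -r commit; do
        # Full recursive listing: "<mode> blob <sha1><TAB><path>" per file.
        tree=$(git ls-tree -r "$commit")
        found=yes
        for sha1 in "$@"; do
            case "$tree" in
                *" blob $sha1"*) ;;          # this id is present, keep checking
                *) found=no; break ;;        # one id missing, reject the commit
            esac
        done
        if [ "$found" = yes ]; then
            echo "$commit"
        fi
    done
}
```

It would be called with the output of "git hash-object" for each file, e.g. commits_containing_all "$sha1_a" "$sha1_b". It runs one ls-tree per commit, so it is slow on a long history.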
-Peff
* Re: how to determine version of binary
From: Neal Kreitzinger @ 2012-05-05 16:18 UTC (permalink / raw)
To: Jeff King; +Cc: git
On 5/5/2012 4:24 AM, Jeff King wrote:
> This won't necessarily show you the version they have; it will only show you
> the version that introduced that particular version of a file. A more general
> question is "given a set of files, which revision did they come from?". For
> that, you would want to find the set of commits that contain sha1 A, then
> intersect them with the set of commits that contain sha1 B, and so forth. You
> can do that by scripting around "rev-list" and "ls-tree", but it's a little
> more complicated.
>
What about this recipe:
calculate sha1 of dirty deliverable (binary, html, etc)
grep git tree objects for that sha1
somehow determine which of the tree sha1's is newest. Not sure how to
do that.
grep commit objects for that tree sha1
now you have the last commit containing that file so now you know the
version of that file.
-neal
* Re: how to determine version of binary
From: Jeff King @ 2012-05-06 12:43 UTC (permalink / raw)
To: Neal Kreitzinger; +Cc: git
On Sat, May 05, 2012 at 11:18:13AM -0500, Neal Kreitzinger wrote:
> What about this recipe:
>
> calculate sha1 of dirty deliverable (binary, html, etc)
>
> grep git tree objects for that sha1
>
> somehow determine which of the tree sha1's is newest. Not sure how
> to do that.
>
> grep commit objects for that tree sha1
>
> now you have the last commit containing that file so now you know the
> version of that file.
Your "not sure" step would be to walk the revision graph and look for
the tree in question. So really you would just walk and grep the trees.
If you know the filename (which you do in your instance), then it's not
even that expensive:
  git rev-list HEAD |
  while read commit; do
    # silence rev-parse errors for commits where the path does not exist
    if test "$(git rev-parse "$commit:path/to/file" 2>/dev/null)" = "$sha1"; then
      echo "found it in $commit"
      break
    fi
  done
But note that that does not tell you the revision of the whole project.
It tells you one _possible_ version, because it is one that contains
that file. If you remove the "break" there you can get the full set of
commits. And then you cross-reference that with the set of commits in
another file. And then another, and so on, until you eventually have
narrowed it down to a single commit.
It's kind of slow, mostly because we have to invoke rev-parse over and
over. But I don't think there is a way to print the sha1 of some path
for each revision via the regular revision walker.
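One way to avoid starting a rev-parse process per commit is to feed all of the "<commit>:<path>" names to a single "git cat-file --batch-check". A sketch under stated assumptions: the function name commits_with_blob_at is hypothetical, the path must not contain whitespace, and the format string with %(rest) needs a git newer than the ones current when this thread was written (1.8.4 or later, as far as I can tell).

```shell
# Hypothetical helper: print each commit reachable from HEAD in which the
# file at path $1 has blob id $2, using one cat-file process for all lookups.
commits_with_blob_at() {
    path=$1
    sha1=$2
    git rev-list HEAD |
    # Turn "<commit>" into "<commit>:<path> <commit>"; cat-file resolves the
    # first word and hands the remainder back as %(rest).
    awk -v p="$path" '{ print $0 ":" p " " $0 }' |
    git cat-file --batch-check='%(objectname) %(rest)' |
    # Keep the commits whose resolved object id equals the wanted one.
    # Lines for commits lacking the path end in "missing" and never match.
    awk -v s="$sha1" '$1 == s { print $2 }'
}
```

For example, commits_with_blob_at path/to/file "$sha1" lists the full set of commits, newest first, which is what the loop above prints once the "break" is removed.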
You could probably do better by finding trees that contain a particular
sha1, then finding trees that contain that tree, and so forth, until you
have a set of commits that contain those trees. And then you could do
that backwards walk for all of the files in parallel (i.e., only accept
a tree if it matches all of the deliverables you have).
-Peff
* Re: how to determine version of binary
From: Sitaram Chamarty @ 2012-05-05 11:43 UTC (permalink / raw)
To: Neal Kreitzinger; +Cc: git
On Sat, May 5, 2012 at 12:42 PM, Neal Kreitzinger <nkreitzinger@gmail.com> wrote:
> Scenario: I detect a binary file that is 'dirty'. I don't know how it got
> there. However, I know it came from a git repo. So I calculate the sha1 of
> the binary. What is the git command to determine which commit that binary
Make sure you are using 'git hash-object' to compute the SHA,
not the system-supplied 'sha1sum' or equivalent.
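To illustrate why the two differ: git does not hash the bare file content but the content prefixed with a header of the form "blob <size>" plus a NUL byte. A small sketch that reproduces the id without git; the function name git_blob_sha1 is made up for illustration, and it assumes a 'sha1sum' binary (GNU coreutils) is installed.

```shell
# Hypothetical helper: compute the id that 'git hash-object FILE' would
# report, by hashing "blob <size>\0" followed by the file content.
git_blob_sha1() {
    size=$(wc -c <"$1" | tr -d ' ')      # byte count; strip padding some wc versions add
    { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}
```

Running git_blob_sha1 on a file should print the same value as 'git hash-object' on that file, while plain 'sha1sum' on it prints a different one. That different value never shows up in "git log --raw" output.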
> version first appeared in? And the last commit that binary appeared in?
Unless it is a frequent need, I would just use git log's --raw
option to search for the first 7 digits of the SHA you found
above.
For example, a very quick one (which does not count odd
situations like the same file appearing multiple times or on
other branches, for instance) is:
    git log --raw | less +/$SHA
You'll want a line where the SHA appears as the second SHA, not
the first one (if a later commit changed the file, the SHA would
also appear as the first SHA on that commit's line). For example,
I'm looking for "14136eb", so I type in:
    git log --raw '--format=%n%ncommit: %h subject: %s' | egrep 'commit|14136eb' | grep -B 1 14136eb
The output I get back is:
    commit: 1cf062f subject: ACCESS_CHECK split into ACCESS_1 and ACCESS_2; docs updated
    :100755 100755 14136eb... 2a57e2d... M src/gitolite-shell
    --
    commit: b391000 subject: POST_GIT triggers get 4 more arguments
    :100755 100755 20f4e5d... 14136eb... M src/gitolite-shell
So the commit that introduced this version of this file is
b391000 (1cf062f is a later one where this file got changed to
something else).
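The "second SHA, not the first" check can also be left to a small filter instead of being done by eye. A rough sketch, with the same caveats as the quick version above (merges are not shown by "git log --raw"); the function name introduced_in is hypothetical, and it accepts an abbreviated SHA such as the 7 digits used in the example.

```shell
# Hypothetical helper: print the commits where a blob whose id starts with
# $1 shows up as the second ("after") SHA of a --raw line but not as the first.
introduced_in() {
    git log --raw --no-abbrev --format='commit %H' |
    awk -v s="$1" '
        $1 == "commit" { commit = $2; next }     # remember the current commit
        # raw line fields: :oldmode newmode oldsha newsha status path
        /^:/ && index($4, s) == 1 && index($3, s) != 1 { print commit }
    '
}
```

With the example above, introduced_in 14136eb would be expected to report the full id of b391000 and to skip 1cf062f, where the SHA appears only on the "before" side.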