git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] get_sha1() shorthands for blob/tree objects
@ 2006-04-18 23:45 Linus Torvalds
  2006-04-19  0:14 ` Martin Langhoff
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Linus Torvalds @ 2006-04-18 23:45 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List



[ NOTE! The reason I put "RFC" in the subject rather than "PATCH" is that 
  I'm not 100% sure this isn't just a "shiny object" of mine rather than a 
  really useful thing to do. What do people think? Have you ever wanted to 
  access individual files in some random revision? Do you think this is 
  useful? I think it's cool and _may_ be useful, but I'm not going to 
  really push this patch. Consider it a throw-away patch unless somebody 
  else finds it intriguing enough.. ]

This is a fairly straightforward patch to allow "get_sha1()" to also have 
shorthands for tree and blob objects.

The syntax is very simple and intuitive: you can specify a tree or a blob 
by simply specifying <revision>:<path>, and get_sha1() will do the SHA1 
lookup from the tree for you.

You can currently do it with "git ls-tree <rev> <path>" and parsing the 
output, but that's actually pretty awkward.

With this, you can do something like

	git cat-file blob v1.2.4:Makefile

to get the contents of "Makefile" at revision v1.2.4.

Now, this isn't necessarily something you really need all that often, but 
the concept itself is actually pretty powerful. We could, for example, 
allow things like

	git diff v0.99.6:git-commit-script..v1.3.0:git-commit.sh

to see the difference between two arbitrary files in two arbitrary 
revisions. To do that, the only thing we'd have to do is to make 
git-diff-tree accept two blobs to diff, in addition to the two trees it 
now expects.

[ IOW, don't get me wrong: the get_sha1() parsing is just the first step, 
  and does _not_ allow that "git diff" syntax to work yet. It parses the 
  object names fine, but git-diff-tree will currently exit with a "fatal: 
  unable to read source tree" error message because the objects aren't 
  tree objects ]

			Linus
---
diff --git a/sha1_name.c b/sha1_name.c
index 4f92e12..0cd1139 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -3,6 +3,7 @@ #include "tag.h"
 #include "commit.h"
 #include "tree.h"
 #include "blob.h"
+#include "diff.h"
 
 static int find_short_object_filename(int len, const char *name, unsigned char *sha1)
 {
@@ -449,12 +450,76 @@ static int get_sha1_1(const char *name, 
 	return get_short_sha1(name, len, sha1, 0);
 }
 
+static int get_tree_entry(const unsigned char *, const char *, unsigned char *);
+
+static int find_tree_entry(struct tree_desc *t, const char *name, unsigned char *result)
+{
+	int namelen = strlen(name);
+	while (t->size) {
+		const char *entry;
+		const unsigned char *sha1;
+		int entrylen, cmp;
+		unsigned mode;
+
+		sha1 = tree_entry_extract(t, &entry, &mode);
+		update_tree_entry(t);
+		entrylen = strlen(entry);
+		if (entrylen > namelen)
+			continue;
+		cmp = memcmp(name, entry, entrylen);
+		if (cmp > 0)
+			continue;
+		if (cmp < 0)
+			break;
+		if (entrylen == namelen) {
+			memcpy(result, sha1, 20);
+			return 0;
+		}
+		if (name[entrylen] != '/')
+			continue;
+		if (!S_ISDIR(mode))
+			break;
+		if (++entrylen == namelen) {
+			memcpy(result, sha1, 20);
+			return 0;
+		}
+		return get_tree_entry(sha1, name + entrylen, result);
+	}
+	return -1;
+}
+
+static int get_tree_entry(const unsigned char *tree_sha1, const char *name, unsigned char *sha1)
+{
+	int retval;
+	void *tree;
+	struct tree_desc t;
+
+	tree = read_object_with_reference(tree_sha1, tree_type, &t.size, NULL);
+	if (!tree)
+		return -1;
+	t.buf = tree;
+	retval = find_tree_entry(&t, name, sha1);
+	free(tree);
+	return retval;
+}
+
 /*
  * This is like "get_sha1_basic()", except it allows "sha1 expressions",
  * notably "xyz^" for "parent of xyz"
  */
 int get_sha1(const char *name, unsigned char *sha1)
 {
+	int ret;
+
 	prepare_alt_odb();
-	return get_sha1_1(name, strlen(name), sha1);
+	ret = get_sha1_1(name, strlen(name), sha1);
+	if (ret < 0) {
+		const char *cp = strchr(name, ':');
+		if (cp) {
+			unsigned char tree_sha1[20];
+			if (!get_sha1_1(name, cp-name, tree_sha1))
+				return get_tree_entry(tree_sha1, cp+1, sha1);
+		}
+	}
+	return ret;
 }

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-18 23:45 [RFC] get_sha1() shorthands for blob/tree objects Linus Torvalds
@ 2006-04-19  0:14 ` Martin Langhoff
  2006-04-19  0:21   ` Shawn Pearce
  2006-04-19  0:27 ` Junio C Hamano
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 24+ messages in thread
From: Martin Langhoff @ 2006-04-19  0:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

On 4/19/06, Linus Torvalds <torvalds@osdl.org> wrote:

>   What do people think? Have you ever wanted to
>   access individual files in some random revision? Do you think this is
>   useful?

Definitely, I've several times had to go through contortions to do
this easily, and I've ended up turning to gitweb often to quickly see
the state of a file at a given revision.

> With this, you can do something like
>
>         git cat-file blob v1.2.4:Makefile
>
> to get the contents of "Makefile" at revision v1.2.4.
>
> Now, this isn't necessarily something you really need all that often, but
> the concept itself is actually pretty powerful. We could, for example,
> allow things like
>
>         git diff v0.99.6:git-commit-script..v1.3.0:git-commit.sh

These two examples are more than enough -- I buy ;-)


martin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  0:14 ` Martin Langhoff
@ 2006-04-19  0:21   ` Shawn Pearce
  2006-04-19  1:20     ` Ray Lehtiniemi
  0 siblings, 1 reply; 24+ messages in thread
From: Shawn Pearce @ 2006-04-19  0:21 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Linus Torvalds, Junio C Hamano, Git Mailing List

Not that my voice carries much weight, but a coworker has been
asking for this feature to be added to pg for months.  I've just
been too lazy to get around to writing the shell code to do it.
Making it part of git cat-file seems like a good idea, making it
usable by other tools like git diff just rocks.  :-)

I think its a very worthwhile addition.

Martin Langhoff <martin.langhoff@gmail.com> wrote:
> On 4/19/06, Linus Torvalds <torvalds@osdl.org> wrote:
> 
> >   What do people think? Have you ever wanted to
> >   access individual files in some random revision? Do you think this is
> >   useful?
> 
> Definitely, I've several times had to go through contortions to do
> this easily, and I've ended up turning to gitweb often to quickly see
> the state of a file at a given revision.
> 
> > With this, you can do something like
> >
> >         git cat-file blob v1.2.4:Makefile
> >
> > to get the contents of "Makefile" at revision v1.2.4.
> >
> > Now, this isn't necessarily something you really need all that often, but
> > the concept itself is actually pretty powerful. We could, for example,
> > allow things like
> >
> >         git diff v0.99.6:git-commit-script..v1.3.0:git-commit.sh
> 
> These two examples are more than enough -- I buy ;-)

-- 
Shawn.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-18 23:45 [RFC] get_sha1() shorthands for blob/tree objects Linus Torvalds
  2006-04-19  0:14 ` Martin Langhoff
@ 2006-04-19  0:27 ` Junio C Hamano
  2006-04-19  0:44   ` Linus Torvalds
  2006-04-19  0:56 ` Junio C Hamano
  2006-04-22  0:49 ` [RFC] get_sha1(): :path and :[0-3]:path to extract from index Junio C Hamano
  3 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2006-04-19  0:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> [ NOTE! The reason I put "RFC" in the subject rather than "PATCH" is that 
>   I'm not 100% sure this isn't just a "shiny object" of mine rather than a 
>   really useful thing to do. What do people think? Have you ever wanted to 
>   access individual files in some random revision? Do you think this is 
>   useful? I think it's cool and _may_ be useful, but I'm not going to 
>   really push this patch. Consider it a throw-away patch unless somebody 
>   else finds it intriguing enough.. ]

Yes, I wanted to do this myself for a while.  The only issue I
might have is what the separator character between rev and path
should be.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  0:27 ` Junio C Hamano
@ 2006-04-19  0:44   ` Linus Torvalds
  2006-04-19  0:47     ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2006-04-19  0:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Tue, 18 Apr 2006, Junio C Hamano wrote:
> 
> Yes, I wanted to do this myself for a while.  The only issue I
> might have is what the separator character between rev and path
> should be.

I didn't put a lot of thought into it, but using ':' not only is pretty 
visually pleasing, it also means that shell completion automatically works 
at least on bash, even without any git-specific completion rules.

That's not true for a lot of other special characters: colon really is 
special, exactly because it has been used for things like PATH separators, 
etc.

I'm a total filename completion junkie - I simply cannot type in a 
filename any more. I'll literally press <tab> to complete even short 
filenames, just because it gives me that nice confirmation of the 
existence of the filename (ie I've typed it all in, and the <tab> adds the 
space after the filename).

So to me, ':' is just clearly superior. I can't think of any other 
separator that works with filename completion and has no shell issues, 
_and_ looks logical.

And I thought we already disallowed ':' in branch names because cogito 
uses them for the strange <rev>:<rev> syntax.. 

		Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  0:44   ` Linus Torvalds
@ 2006-04-19  0:47     ` Linus Torvalds
  2006-04-19  8:15       ` Andreas Ericsson
  0 siblings, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2006-04-19  0:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Tue, 18 Apr 2006, Linus Torvalds wrote:
> 
> And I thought we already disallowed ':' in branch names because cogito 
> uses them for the strange <rev>:<rev> syntax.. 

Btw, pathnames with ':' in them aren't a problem. It's only revision
names that can't have ':' in them and still be used together with this 
syntax.

If you have a pathname with ':', it's fine to do

	git cat-file v1.2.4:pathname:with:colon

because the magic revision parsing only cares about the _first_ colon, and 
will split that into "v1.2.4" and "pathname:with:colon" without ever even 
looking at the other ones.

		Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-18 23:45 [RFC] get_sha1() shorthands for blob/tree objects Linus Torvalds
  2006-04-19  0:14 ` Martin Langhoff
  2006-04-19  0:27 ` Junio C Hamano
@ 2006-04-19  0:56 ` Junio C Hamano
  2006-04-19  1:16   ` Linus Torvalds
  2006-04-22  0:49 ` [RFC] get_sha1(): :path and :[0-3]:path to extract from index Junio C Hamano
  3 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2006-04-19  0:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> 	git diff v0.99.6:git-commit-script..v1.3.0:git-commit.sh

This is interesting.

Yet to be born "internal diff".  Should I start one, or are you
already hacking on it?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  0:56 ` Junio C Hamano
@ 2006-04-19  1:16   ` Linus Torvalds
  2006-04-19  1:19     ` Linus Torvalds
  2006-04-19  1:30     ` Junio C Hamano
  0 siblings, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2006-04-19  1:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Tue, 18 Apr 2006, Junio C Hamano wrote:
> Linus Torvalds <torvalds@osdl.org> writes:
> 
> > 	git diff v0.99.6:git-commit-script..v1.3.0:git-commit.sh
> 
> This is interesting.
> 
> Yet to be born "internal diff".  Should I start one, or are you
> already hacking on it?

I'm not working on it, but I might get back to it if nobody else beats me 
to it.

Probably not in the next few days, though, so if you feel the urge, please 
do look into it.

Anyway, the above syntax would actually work even with the current 
external diff, because with my suggested patch we get:

  git-rev-parse v0.99.6:git-commit-script..v1.3.0:git-commit.sh

resulting in

	01c73bdd08e075d650e58664650bcd7fe1cd1551
	^a2455b0f8ff1582248b0678b9c85b2f064d972c4

(which are just the SHA1's for the object), and then the external diff 
will just see that it's two commits, so it decides to just feed them 
directly into git-diff-tree.

So if we added the test for "are the objects two commits" to git-diff-tree 
and diffed them directly, it would all "just work".

But you're certainly correct in stating that it would be _cleaner_ to do 
it with a internal "git diff" command. Everything should be perfectly set 
up by just calling the regular "setup_revisions()", and looking at the 
"rev.pending_objects" list.

Now, the thing that an internal "git diff" could do better is to notice 
when it gets _one_ blob revision, and one filename, ie we could do

	git diff v0.99.6:git-commit-script git-commit.sh

which parses as one SHA1 of a blob (put onto the rev.pending_objects 
list), and one filename (in the rev.prune_data array). We could decide to 
automatically do the "right thing" for that case too.

		Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  1:16   ` Linus Torvalds
@ 2006-04-19  1:19     ` Linus Torvalds
  2006-04-19  1:30     ` Junio C Hamano
  1 sibling, 0 replies; 24+ messages in thread
From: Linus Torvalds @ 2006-04-19  1:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Tue, 18 Apr 2006, Linus Torvalds wrote:
> 
> So if we added the test for "are the objects two commits" to git-diff-tree 
> and diffed them directly, it would all "just work".

That "two commits" should obviously be "two blobs".

"Dave, my mind is going.. I can feel it.."

		Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  0:21   ` Shawn Pearce
@ 2006-04-19  1:20     ` Ray Lehtiniemi
  0 siblings, 0 replies; 24+ messages in thread
From: Ray Lehtiniemi @ 2006-04-19  1:20 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Martin Langhoff, Linus Torvalds, Junio C Hamano, Git Mailing List


one more lightweight vote in favor....  such a feature would eliminate a few 
shell scripts i cobbled together which essentially allow me to say

  git diff 8a6352:path/to/module 5f3461:moved/and/hacked/copy


On Tuesday 18 April 2006 18:21, Shawn Pearce wrote:
> Not that my voice carries much weight, but a coworker has been
> asking for this feature to be added to pg for months.  I've just
> been too lazy to get around to writing the shell code to do it.
> Making it part of git cat-file seems like a good idea, making it
> usable by other tools like git diff just rocks.  :-)
>
> I think its a very worthwhile addition.
>
> Martin Langhoff <martin.langhoff@gmail.com> wrote:
> > On 4/19/06, Linus Torvalds <torvalds@osdl.org> wrote:
> > >   What do people think? Have you ever wanted to
> > >   access individual files in some random revision? Do you think this is
> > >   useful?
> >
> > Definitely, I've several times had to go through contortions to do
> > this easily, and I've ended up turning to gitweb often to quickly see
> > the state of a file at a given revision.
> >
> > > With this, you can do something like
> > >
> > >         git cat-file blob v1.2.4:Makefile
> > >
> > > to get the contents of "Makefile" at revision v1.2.4.
> > >
> > > Now, this isn't necessarily something you really need all that often,
> > > but the concept itself is actually pretty powerful. We could, for
> > > example, allow things like
> > >
> > >         git diff v0.99.6:git-commit-script..v1.3.0:git-commit.sh
> >
> > These two examples are more than enough -- I buy ;-)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  1:16   ` Linus Torvalds
  2006-04-19  1:19     ` Linus Torvalds
@ 2006-04-19  1:30     ` Junio C Hamano
  2006-04-19  1:43       ` Linus Torvalds
  2006-04-19  3:51       ` Martin Langhoff
  1 sibling, 2 replies; 24+ messages in thread
From: Junio C Hamano @ 2006-04-19  1:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> Now, the thing that an internal "git diff" could do better is to notice 
> when it gets _one_ blob revision, and one filename, ie we could do
>
> 	git diff v0.99.6:git-commit-script git-commit.sh
>
> which parses as one SHA1 of a blob (put onto the rev.pending_objects 
> list), and one filename (in the rev.prune_data array). We could decide to 
> automatically do the "right thing" for that case too.

The "right thing" is ambiguous, I am afraid.

I think it would be natural to interpret the request as a diff
between the blob from v0.99.6 and a random working tree file, 
which may not even exist in the index.

However I suspect what you are getting at is to act as if the
user said:

	git diff v0.99.6:git-commit-script HEAD:git-commit.sh

Oh, another possibility is to act as if the user said

	git diff v0.99.6:this :git-commit.sh

where "(empty):" would stand for "look up in the index, not in a
tree".

I think these are all valid interpretations and there are useful
use cases (admittably the last one is "diff-cache --cached").

Unfortunatly, I do not think this parses well:

	git diff git-commit.sh v0.99.6:git-commit-script

but you could always say:

	git diff -R v0.99.6:git-commit-script git-commit.sh

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  1:30     ` Junio C Hamano
@ 2006-04-19  1:43       ` Linus Torvalds
  2006-04-19  4:02         ` Junio C Hamano
  2006-04-19  3:51       ` Martin Langhoff
  1 sibling, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2006-04-19  1:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Tue, 18 Apr 2006, Junio C Hamano wrote:

> > 	git diff v0.99.6:git-commit-script git-commit.sh
> >
> > which parses as one SHA1 of a blob (put onto the rev.pending_objects 
> > list), and one filename (in the rev.prune_data array). We could decide to 
> > automatically do the "right thing" for that case too.
> 
> The "right thing" is ambiguous, I am afraid.
> 
> I think it would be natural to interpret the request as a diff
> between the blob from v0.99.6 and a random working tree file, 
> which may not even exist in the index.

Actually, it's not ambiguous. Just take the named file. If you want to 
diff it against HEAD:<named-file>, you should just say so.

Now, if you _really_ want to be difficult, there's a third option: to diff 
it against the named file as named in the index. "git diff" does have the 
"--cached" option that would make that case unambiguous too, of course.

Ie:

 (a) diff against the current HEAD:

	git diff v0.99.6:git-commit-script HEAD:git-commit.sh

 (b) diff against the current index contents for "git-commit.sh":

	git diff --cached v0.99.6:git-commit-script git-commit.sh

 (c) diff against a random file (which may not even be in the index):

	git diff v0.99.6:git-commit-script git-commit.sh

are all sensible operations, and unambiguous.

The only real downside is that the nice "a..b" syntax doesn't work for 
them, except fot the very first case. That syntax requires both sides to 
be SHA1 names (and while the "index" case could be a SHA1 name, we have no 
way to say so).

> Oh, another possibility is to act as if the user said
> 
> 	git diff v0.99.6:this :git-commit.sh
> 
> where "(empty):" would stand for "look up in the index, not in a
> tree".

That would possibly be a nice extension, yes. But I really did mean 
"random file, possibly not even in the index".

> Unfortunatly, I do not think this parses well:
> 
> 	git diff git-commit.sh v0.99.6:git-commit-script

Correct. As you point out, that would have to be done with -R.

			Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  1:30     ` Junio C Hamano
  2006-04-19  1:43       ` Linus Torvalds
@ 2006-04-19  3:51       ` Martin Langhoff
  2006-04-19  3:58         ` Linus Torvalds
  1 sibling, 1 reply; 24+ messages in thread
From: Martin Langhoff @ 2006-04-19  3:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

<wishlist>
  What about support for diffing a subtree?

      git diff v2.2:net/appletalk v2.9:net/appletalk-ng

</whishlist>

cheers,


martin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  3:51       ` Martin Langhoff
@ 2006-04-19  3:58         ` Linus Torvalds
  2006-04-19  4:04           ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2006-04-19  3:58 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Junio C Hamano, git



On Wed, 19 Apr 2006, Martin Langhoff wrote:
>
> <wishlist>
>   What about support for diffing a subtree?
> 
>       git diff v2.2:net/appletalk v2.9:net/appletalk-ng
> 
> </whishlist>

Did you try it? It works as-is with that patch. Exactly because it will 
call git-diff-tree with the two SHA1's, and since they are now trees (and 
not blobs) git-diff-tree will do the right thing.

And no, I didn't actually test it. But it really _should_ work with that 
trivial diff, including with the "a..b" format.

		Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  1:43       ` Linus Torvalds
@ 2006-04-19  4:02         ` Junio C Hamano
  2006-04-19  4:14           ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2006-04-19  4:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> Actually, it's not ambiguous. Just take the named file. If you want to 
> diff it against HEAD:<named-file>, you should just say so.
>
> Now, if you _really_ want to be difficult, there's a third option: to diff 
> it against the named file as named in the index. "git diff" does have the 
> "--cached" option that would make that case unambiguous too, of course.
>
> Ie:
>
>  (a) diff against the current HEAD:
>
> 	git diff v0.99.6:git-commit-script HEAD:git-commit.sh
>
>  (b) diff against the current index contents for "git-commit.sh":
>
> 	git diff --cached v0.99.6:git-commit-script git-commit.sh
>
>  (c) diff against a random file (which may not even be in the index):
>
> 	git diff v0.99.6:git-commit-script git-commit.sh
>
> are all sensible operations, and unambiguous.

A small fry in the ointment.  What should the parts that are
output with --name-only say for such a diff?

Blob references like v0.99.6:git-commit-script are resolved by
the extended SHA1 interpreter, and all what the caller of
setup_revisions() can see and feed the diff machinery with has
are their object names.  Something like this is a possibility,
but is ugly.

        diff --git a/a2455b0... b/01c73bd...
        index a2455b0..01c73bd 100644
        --- a/a2455b0...
        +++ b/01c73bd...
        @@ -1,118 +1,509 @@
         #!/bin/sh
         #
         # Copyright (c) 2005 Linus Torvalds
        +# Copyright (c) 2006 Junio C Hamano
        +
        +USAGE='[-a] [-s] [-v] [--no-verify] [-m <message>...
        +SUBDIRECTORY_OK=Yes
        ...

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  3:58         ` Linus Torvalds
@ 2006-04-19  4:04           ` Linus Torvalds
  0 siblings, 0 replies; 24+ messages in thread
From: Linus Torvalds @ 2006-04-19  4:04 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Junio C Hamano, git



On Tue, 18 Apr 2006, Linus Torvalds wrote:
> 
> And no, I didn't actually test it. But it really _should_ work with that 
> trivial diff, including with the "a..b" format.

Ok, tested and verified.

On the kernel:

	git diff v2.6.13:drivers/block/..HEAD:block/ -- ll_rw_blk.c

actually does the right thing - it shows the diff for ll_rw_blk.c which 
got moved from drivers/block/ into block/ between 2.6.13 and current.

Note that the "--" is required, since ll_rw_blk.c doesn't exist in the 
top-level directory where we run the command (but does exist in the git 
trees that we have specified manually with the new shorthand).

		Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  4:02         ` Junio C Hamano
@ 2006-04-19  4:14           ` Linus Torvalds
  2006-04-19 21:49             ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2006-04-19  4:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Tue, 18 Apr 2006, Junio C Hamano wrote:
> 
> A small fry in the ointment.  What should the parts that are
> output with --name-only say for such a diff?

I have no idea, I have to say ;)

> Blob references like v0.99.6:git-commit-script are resolved by
> the extended SHA1 interpreter, and all what the caller of
> setup_revisions() can see and feed the diff machinery with has
> are their object names.

Actually, the names are there. The object list has the "->name" field 
(used to do the name-based sorting), and we actually even fill it in for 
the stuff we pass in as arguments. See the "add_pending_object()" calls 
in setup_revisions().

We just don't use them right now. We _could_.

> Something like this is a possibility, but is ugly.
> 
>         diff --git a/a2455b0... b/01c73bd...
>         index a2455b0..01c73bd 100644
>         --- a/a2455b0...
>         +++ b/01c73bd...

Yes. But if you look at the object list name (in the "pending_object" 
thing), you _could_ actually get this to be something like

	diff --git v0.99.6:git-commit-script git-commit.sh
	index a2455b0..01c73bd 100644
	--- v0.99.6:git-commit-script
	+++ git-commit.sh

which would be much prettier, although I'm not saying it's necessarily 
worth it. I'm just saying that it's _possible_ with the cmd line parsing 
infrastructure we have now.

		Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  0:47     ` Linus Torvalds
@ 2006-04-19  8:15       ` Andreas Ericsson
  2006-04-19 14:44         ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: Andreas Ericsson @ 2006-04-19  8:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Linus Torvalds wrote:
> 
> On Tue, 18 Apr 2006, Linus Torvalds wrote:
> 
>>And I thought we already disallowed ':' in branch names because cogito 
>>uses them for the strange <rev>:<rev> syntax.. 
> 
> 
> Btw, pathnames with ':' in them aren't a problem. It's only revision
> names that can't have ':' in them and still be used together with this 
> syntax.
> 
> If you have a pathname with ':', it's fine to do
> 
> 	git cat-file v1.2.4:pathname:with:colon
> 
> because the magic revision parsing only cares about the _first_ colon, and 
> will split that into "v1.2.4" and "pathname:with:colon" without ever even 
> looking at the other ones.
> 

Except that you'll have to explicitly state HEAD:pathname:with:colon, or 
does it try finding a file with the argument verbatim first?

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  8:15       ` Andreas Ericsson
@ 2006-04-19 14:44         ` Linus Torvalds
  0 siblings, 0 replies; 24+ messages in thread
From: Linus Torvalds @ 2006-04-19 14:44 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Junio C Hamano, git



On Wed, 19 Apr 2006, Andreas Ericsson wrote:
> 
> Except that you'll have to explicitly state HEAD:pathname:with:colon, or does
> it try finding a file with the argument verbatim first?

If you have a pathname:with:colon and you _just_ want to use it as a 
pathname, you'd do one of either:

 - use "--" to separate the pathnames from the revision info. That always 
   works.
 - even without the "--", if the first part of the pathname:with:colon (ie 
   the "pathname" part) cannot be looked up as a SHA1 tag, then the
   pathname:with:colon is left alone and seen as a normal file.

So "--" is always the dis-ambiguator. But it almost certainly will never 
be needed.

		Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19  4:14           ` Linus Torvalds
@ 2006-04-19 21:49             ` Junio C Hamano
  2006-04-19 21:57               ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2006-04-19 21:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Tue, 18 Apr 2006, Junio C Hamano wrote:
>> 
>> A small fry in the ointment.  What should the parts that are
>> output with --name-only say for such a diff?
>
> I have no idea, I have to say ;)

Another small one ;-).  Bare blobs do not have modes, so diffcore
needs to be told "do not bother comparing mode for this pair".

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1() shorthands for blob/tree objects
  2006-04-19 21:49             ` Junio C Hamano
@ 2006-04-19 21:57               ` Linus Torvalds
  0 siblings, 0 replies; 24+ messages in thread
From: Linus Torvalds @ 2006-04-19 21:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Wed, 19 Apr 2006, Junio C Hamano wrote:
>
> Linus Torvalds <torvalds@osdl.org> writes:
> 
> > On Tue, 18 Apr 2006, Junio C Hamano wrote:
> >> 
> >> A small fry in the ointment.  What should the parts that are
> >> output with --name-only say for such a diff?
> >
> > I have no idea, I have to say ;)
> 
> Another small one ;-).  Bare blobs do not have modes, so diffcore
> needs to be told "do not bother comparing mode for this pair".

True, but sad.

We clearly _do_ have the mode when we look it up using the tree:name 
shorthand, so it's a bit sad that we drop it on the floor like that.

We could do something like with the "name" part - squirrel the mode away 
for users that could use it, and use the special mode of 0 as "unknown" 
for when somebody really does give a pure SHA1 number (short or full).

But I suspect most people simply don't care, so dropping the mode might 
just be the right thing to do. Especially if we do know that we _could_ 
get the mode if people really really care and complain some day in the 
future..

(The logic being: 640kB really _is_ enough for everybody, if you just know 
that you can extend on it in the future without undue pain when people 
complain)

			Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC] get_sha1(): :path and :[0-3]:path to extract from index.
  2006-04-18 23:45 [RFC] get_sha1() shorthands for blob/tree objects Linus Torvalds
                   ` (2 preceding siblings ...)
  2006-04-19  0:56 ` Junio C Hamano
@ 2006-04-22  0:49 ` Junio C Hamano
  2006-04-25  8:37   ` Uwe Zeisberger
  3 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2006-04-22  0:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

[ NOTE! The reason I put "RFC" in the subject rather than "PATCH" is that 
  I'm not 100% sure this isn't just a "shiny object" of mine rather than a 
  really useful thing to do. What do people think? Have you ever wanted to 
  access individual files in some random revision? Do you think this is 
  useful? I think it's cool and _may_ be useful, but I'm not going to 
  really push this patch. Consider it a throw-away patch unless somebody 
  else finds it intriguing enough.. ]

This is a fairly straightforward patch to allow "get_sha1()" to
also have shorthands for blob objects in the current index.

The syntax is very simple and intuitive: you can specify a blob
from the current index by simply specifying :<stage>:<path> or
:<path>, and get_sha1() will do the SHA1 lookup from the index
for you.

You can currently do it with "git ls-files -s <path>" and parsing the 
output, but that's actually pretty awkward.

With this, you can do something like

	git cat-file blob :3:Makefile

to get the contents of "Makefile" from their version during a
conflicted merge.

This does not do trees like <ent>:<path> version does.  We could
do that by writing out a temporary tree object if we wanted to,
but that would smudge the object database, so I refrained from
doing that as part of this patch.  If we did that,

	git diff-tree :2: :3:

would be an inefficient equivalent to "git diff-stages 2 3".

---
 sha1_name.c |   54 ++++++++++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 46 insertions(+), 8 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 345935b..ec5cd2c 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -458,17 +458,55 @@ int get_sha1(const char *name, unsigned 
 {
 	int ret;
 	unsigned unused;
+	int namelen = strlen(name);
+	const char *cp;
 
 	prepare_alt_odb();
-	ret = get_sha1_1(name, strlen(name), sha1);
-	if (ret < 0) {
-		const char *cp = strchr(name, ':');
-		if (cp) {
-			unsigned char tree_sha1[20];
-			if (!get_sha1_1(name, cp-name, tree_sha1))
-				return get_tree_entry(tree_sha1, cp+1, sha1,
-						      &unused);
+	ret = get_sha1_1(name, namelen, sha1);
+	if (!ret)
+		return ret;
+	/* sha1:path --> object name of path in ent sha1
+	 * :path -> object name of path in index
+	 * :[0-3]:path -> object name of path in index at stage
+	 */
+	if (name[0] == ':') {
+		int stage = 0;
+		struct cache_entry *ce;
+		int pos;
+		if (namelen < 3 ||
+		    name[2] != ':' ||
+		    name[1] < '0' || '3' < name[1])
+			cp = name + 1;
+		else {
+			stage = name[1] - '0';
+			cp = name + 3;
 		}
+		namelen = namelen - (cp - name);
+		if (!active_cache)
+			read_cache();
+		if (active_nr < 0)
+			return -1;
+		pos = cache_name_pos(cp, namelen);
+		if (pos < 0)
+			pos = -pos - 1;
+		while (pos < active_nr) {
+			ce = active_cache[pos];
+			if (ce_namelen(ce) != namelen ||
+			    memcmp(ce->name, cp, namelen))
+				break;
+			if (ce_stage(ce) == stage) {
+				memcpy(sha1, ce->sha1, 20);
+				return 0;
+			}
+		}
+		return -1;
+	}
+	cp = strchr(name, ':');
+	if (cp) {
+		unsigned char tree_sha1[20];
+		if (!get_sha1_1(name, cp-name, tree_sha1))
+			return get_tree_entry(tree_sha1, cp+1, sha1,
+					      &unused);
 	}
 	return ret;
 }
-- 
1.3.0.g6082

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1(): :path and :[0-3]:path to extract from index.
  2006-04-22  0:49 ` [RFC] get_sha1(): :path and :[0-3]:path to extract from index Junio C Hamano
@ 2006-04-25  8:37   ` Uwe Zeisberger
  2006-04-25  8:46     ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Uwe Zeisberger @ 2006-04-25  8:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

Junio C Hamano wrote:
> [ NOTE! The reason I put "RFC" in the subject rather than "PATCH" is that 
>   I'm not 100% sure this isn't just a "shiny object" of mine rather than a 
>   really useful thing to do. What do people think? Have you ever wanted to 
>   access individual files in some random revision? Do you think this is 
>   useful? I think it's cool and _may_ be useful, but I'm not going to 
>   really push this patch. Consider it a throw-away patch unless somebody 
>   else finds it intriguing enough.. ]
> 
> This is a fairly straightforward patch to allow "get_sha1()" to
> also have shorthands for blob objects in the current index.
I sometimes want to have something like that:

	uzeisberger@io:~/gsrc/linux-2.6$ git cat-file blob v2.6.16:Makefile

That is not a shortcut for objects in the current index, but for blobs
in written trees.

It's easy to hack a script that does that.  Something like that[1]:


	#! /bin/sh

	eval `echo ${1} | sed 's/\\(.*\\):\\(.*\\)/commit=\"\\1^{}\"; file=\"\\2\"/'`

	tree=`git cat-file commit ${commit} | sed -n 's/tree //p'`

	blob=`git ls-tree -r ${tree} | awk "\\\$4 == \\"${file}\\" { print \\\$3 }"`

	git cat-file blob ${blob}


But if the rev-parser could handle that, that would be much finer.  Or
is there already a way to do this that I don't know?

Best regards
Uwe

[1] It's not tested and probably fails if there are some "bad"
characters in ${1} and could be implemented in a much cleverer way.

-- 
Uwe Zeisberger

http://www.google.com/search?q=0+degree+Celsius+in+kelvin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] get_sha1(): :path and :[0-3]:path to extract from index.
  2006-04-25  8:37   ` Uwe Zeisberger
@ 2006-04-25  8:46     ` Junio C Hamano
  0 siblings, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2006-04-25  8:46 UTC (permalink / raw)
  To: Uwe Zeisberger; +Cc: Linus Torvalds, git

Uwe Zeisberger <zeisberg@informatik.uni-freiburg.de> writes:

>> This is a fairly straightforward patch to allow "get_sha1()" to
>> also have shorthands for blob objects in the current index.
>
> I sometimes want to have something like that:
>
> 	uzeisberger@io:~/gsrc/linux-2.6$ git cat-file blob v2.6.16:Makefile
>
> That is not a shortcut for objects in the current index, but for blobs
> in written trees.

That's already present in the "master".  You are responding to a
wrong message ;-).

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2006-04-25  8:46 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-04-18 23:45 [RFC] get_sha1() shorthands for blob/tree objects Linus Torvalds
2006-04-19  0:14 ` Martin Langhoff
2006-04-19  0:21   ` Shawn Pearce
2006-04-19  1:20     ` Ray Lehtiniemi
2006-04-19  0:27 ` Junio C Hamano
2006-04-19  0:44   ` Linus Torvalds
2006-04-19  0:47     ` Linus Torvalds
2006-04-19  8:15       ` Andreas Ericsson
2006-04-19 14:44         ` Linus Torvalds
2006-04-19  0:56 ` Junio C Hamano
2006-04-19  1:16   ` Linus Torvalds
2006-04-19  1:19     ` Linus Torvalds
2006-04-19  1:30     ` Junio C Hamano
2006-04-19  1:43       ` Linus Torvalds
2006-04-19  4:02         ` Junio C Hamano
2006-04-19  4:14           ` Linus Torvalds
2006-04-19 21:49             ` Junio C Hamano
2006-04-19 21:57               ` Linus Torvalds
2006-04-19  3:51       ` Martin Langhoff
2006-04-19  3:58         ` Linus Torvalds
2006-04-19  4:04           ` Linus Torvalds
2006-04-22  0:49 ` [RFC] get_sha1(): :path and :[0-3]:path to extract from index Junio C Hamano
2006-04-25  8:37   ` Uwe Zeisberger
2006-04-25  8:46     ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).