Bug in "git rev-parse --verify"

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Bug in "git rev-parse --verify"
@ 2013-03-28 13:04 Michael Haggerty
  2013-03-28 14:05 ` Jeff King
       [not found] ` <CAPc5daUqzz=9TBmj2Q0MHqEc6gMHxXoGr9+JV3hq76zDKJAyCw@mail.gmail.com>
  0 siblings, 2 replies; 22+ messages in thread
From: Michael Haggerty @ 2013-03-28 13:04 UTC (permalink / raw)
  To: git discussion list

On Junio's master, "git rev-parse --verify" accepts *any* 40-digit
hexadecimal number.  For example, pass it 40 "1" characters, and it
accepts the argument:

$ git rev-parse --verify 1111111111111111111111111111111111111111
1111111111111111111111111111111111111111
$ echo $?
0

Obviously, my repo doesn't have an object with this hash :-) so I think
this argument should be rejected.

If you add or remove a digit (to make the length different than 40), it
is correctly rejected:

$ git rev-parse --verify 111111111111111111111111111111111111111
fatal: Needed a single revision
$ echo $?
128

I believe that "git rev-parse --verify" is meant to verify that the
argument is an actual object, and that it should reject fictional SHA1s.
 (If not then the documentation should be clarified.)  The same problem
also exists in 1.8.2 but I haven't checked how much older it is.

The behavior presumably comes from the following clause in get_sha1_basic():

	if (len == 40 && !get_sha1_hex(str, sha1))
		return 0;

I won't have time to pursue this.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
  2013-03-28 13:04 Bug in "git rev-parse --verify" Michael Haggerty
@ 2013-03-28 14:05 ` Jeff King
  2013-03-28 14:55   ` Junio C Hamano
       [not found] ` <CAPc5daUqzz=9TBmj2Q0MHqEc6gMHxXoGr9+JV3hq76zDKJAyCw@mail.gmail.com>
  1 sibling, 1 reply; 22+ messages in thread
From: Jeff King @ 2013-03-28 14:05 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git discussion list

On Thu, Mar 28, 2013 at 02:04:27PM +0100, Michael Haggerty wrote:

> $ git rev-parse --verify 1111111111111111111111111111111111111111
> 1111111111111111111111111111111111111111
> $ echo $?
> 0
>
> [...]
>
> I believe that "git rev-parse --verify" is meant to verify that the
> argument is an actual object, and that it should reject fictional SHA1s.
>  (If not then the documentation should be clarified.)  The same problem
> also exists in 1.8.2 but I haven't checked how much older it is.

I think it is only about verifying the name of the object. I.e., that
you can resolve the argument to an object. It has always behaved this
way; I even just tested with git v0.99.

I can't think of any reason it would not be helpful to have it _also_
verify that we have the object in question. Most of the time that would
be a no-op, as any ref we resolve should point to a valid object, but it
would mean we could catch repository corruption (i.e., missing objects)
early.

Doing a has_sha1_file() check would be relatively quick. Doing a full
parse_object and checking the sha1 would be more robust, but would mean
that:

  blob=`git rev-parse --verify HEAD:some-large-file`

would become much more expensive (and presumably you are about to access
the object _anyway_, at which point you would notice problems, so it is
redundantly expensive).

-Peff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
  2013-03-28 14:05 ` Jeff King
@ 2013-03-28 14:55   ` Junio C Hamano
  0 siblings, 0 replies; 22+ messages in thread
From: Junio C Hamano @ 2013-03-28 14:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git discussion list

Jeff King <peff@peff.net> writes:

> On Thu, Mar 28, 2013 at 02:04:27PM +0100, Michael Haggerty wrote:
>
>> $ git rev-parse --verify 1111111111111111111111111111111111111111
>> 1111111111111111111111111111111111111111
>> $ echo $?
>> 0
>>
>> [...]
>>
>> I believe that "git rev-parse --verify" is meant to verify that the
>> argument is an actual object, and that it should reject fictional SHA1s.
>>  (If not then the documentation should be clarified.)  The same problem
>> also exists in 1.8.2 but I haven't checked how much older it is.
>
> I think it is only about verifying the name of the object. I.e., that
> you can resolve the argument to an object. It has always behaved this
> way; I even just tested with git v0.99.

Correct.  It is about "is it well formed and something we can turn
into 20-byte object name?" and nothing more.  It certainly does not
mean "do we have it?", as the function needs to be able to validate
something we do not yet have.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
       [not found] ` <CAPc5daUqzz=9TBmj2Q0MHqEc6gMHxXoGr9+JV3hq76zDKJAyCw@mail.gmail.com>
@ 2013-03-28 15:34   ` Michael Haggerty
  2013-03-28 15:38     ` Jeff King
  0 siblings, 1 reply; 22+ messages in thread
From: Michael Haggerty @ 2013-03-28 15:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Jeff King

On 03/28/2013 02:48 PM, Junio C Hamano wrote:
> I think it has always been about "is this well formed and we can turn it
> into a raw 20-byte object name?" and never about"does it exist?"

That's surprising.  The man page says

    --verify
        The parameter given must be usable as a single, valid object name.
        Otherwise barf and abort.

"Valid", to me, implies that the parameter should be the name of an
actual object, and this also seems a more useful concept to me and more
consistent with the command's behavior when passed other arguments.


Is there a simple way to verify an object name more strictly and convert
it to an SHA1?  I can only think of solutions that require two commands,
like

    git cat-file -e $ARG && git rev-parse --verify $ARG

I suppose in most contexts where one wants to know whether an object
name is valid, one should also verify that the object has the type that
you expect:

    test X$(git cat-file -t $ARG) = Xcommit &&
        git rev-parse --verify $ARG

or (allowing tag dereferencing)

    git cat-file -e $ARG^{commit} &&
        git rev-parse --verify $ARG^{commit}

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
  2013-03-28 15:34   ` Michael Haggerty
@ 2013-03-28 15:38     ` Jeff King
  2013-03-28 15:52       ` Michael Haggerty
  2013-03-28 16:52       ` Junio C Hamano
  0 siblings, 2 replies; 22+ messages in thread
From: Jeff King @ 2013-03-28 15:38 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, Git Mailing List

On Thu, Mar 28, 2013 at 04:34:19PM +0100, Michael Haggerty wrote:

> Is there a simple way to verify an object name more strictly and convert
> it to an SHA1?  I can only think of solutions that require two commands,
> like
> 
>     git cat-file -e $ARG && git rev-parse --verify $ARG

Is the rev-parse line doing anything there? If $ARG does not resolve to
a sha1, then wouldn't cat-file have failed?

-Peff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
  2013-03-28 15:38     ` Jeff King
@ 2013-03-28 15:52       ` Michael Haggerty
  2013-03-28 16:38         ` Jeff King
  2013-03-28 16:52       ` Junio C Hamano
  1 sibling, 1 reply; 22+ messages in thread
From: Michael Haggerty @ 2013-03-28 15:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Git Mailing List

On 03/28/2013 04:38 PM, Jeff King wrote:
> On Thu, Mar 28, 2013 at 04:34:19PM +0100, Michael Haggerty wrote:
> 
>> Is there a simple way to verify an object name more strictly and convert
>> it to an SHA1?  I can only think of solutions that require two commands,
>> like
>>
>>     git cat-file -e $ARG && git rev-parse --verify $ARG
> 
> Is the rev-parse line doing anything there? If $ARG does not resolve to
> a sha1, then wouldn't cat-file have failed?

It's outputting the SHA1, which cat-file seems incapable of providing in
a useful way.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
  2013-03-28 15:52       ` Michael Haggerty
@ 2013-03-28 16:38         ` Jeff King
  0 siblings, 0 replies; 22+ messages in thread
From: Jeff King @ 2013-03-28 16:38 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, Git Mailing List

On Thu, Mar 28, 2013 at 04:52:15PM +0100, Michael Haggerty wrote:

> On 03/28/2013 04:38 PM, Jeff King wrote:
> > On Thu, Mar 28, 2013 at 04:34:19PM +0100, Michael Haggerty wrote:
> > 
> >> Is there a simple way to verify an object name more strictly and convert
> >> it to an SHA1?  I can only think of solutions that require two commands,
> >> like
> >>
> >>     git cat-file -e $ARG && git rev-parse --verify $ARG
> > 
> > Is the rev-parse line doing anything there? If $ARG does not resolve to
> > a sha1, then wouldn't cat-file have failed?
> 
> It's outputting the SHA1, which cat-file seems incapable of providing in
> a useful way.

Ah, I see; I was looking too much at your example, and not thinking
about how you would want to use it in a script.

-Peff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
  2013-03-28 15:38     ` Jeff King
  2013-03-28 15:52       ` Michael Haggerty
@ 2013-03-28 16:52       ` Junio C Hamano
  2013-03-30  3:44         ` Michael Haggerty
  1 sibling, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2013-03-28 16:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, Junio C Hamano, Git Mailing List

Jeff King <peff@peff.net> writes:

> On Thu, Mar 28, 2013 at 04:34:19PM +0100, Michael Haggerty wrote:
>
>> Is there a simple way to verify an object name more strictly and convert
>> it to an SHA1?  I can only think of solutions that require two commands,
>> like
>> 
>>     git cat-file -e $ARG && git rev-parse --verify $ARG
>
> Is the rev-parse line doing anything there? If $ARG does not resolve to
> a sha1, then wouldn't cat-file have failed?
>
> -Peff

You could force rev-parse to resolve the input to an existing
object, with something like this:

	git rev-parse --verify "$ARG^{}"

It will unwrap a tag, so the output may end up pointing at a object
that is different from $ARG in such a case.

But what is the purpose of turning a random string to a random
40-hex in the first place?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
  2013-03-28 16:52       ` Junio C Hamano
@ 2013-03-30  3:44         ` Michael Haggerty
  2013-03-30  4:09           ` Junio C Hamano
  0 siblings, 1 reply; 22+ messages in thread
From: Michael Haggerty @ 2013-03-30  3:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Junio C Hamano, Git Mailing List

On 03/28/2013 05:52 PM, Junio C Hamano wrote:
> You could force rev-parse to resolve the input to an existing
> object, with something like this:
> 
> 	git rev-parse --verify "$ARG^{}"
> 
> It will unwrap a tag, so the output may end up pointing at a object
> that is different from $ARG in such a case.

Yes, if unwrapping tags is OK then this would work.

> But what is the purpose of turning a random string to a random
> 40-hex in the first place?

In non-trivial scripts, it makes sense to convert user input into a
known and verified quantity (SHA1) once, while processing external
inputs, and not have to think about it afterwards.  Verifying and
converting to pure-SHA1s as soon as possible has several advantages:

1. An SHA1 is a canonical representation of the argument, useful for
example as the key in a hash map for for looking for the presence of a
commit in a rev-list output.

2. An SHA1 is persistent.  For example, I use them when caching
benchmark results across versions.  Moreover, they are safe for use in
filenames.  The persistence also makes scripts more robust against
simultaneous changes to the repo by other processes, whereas if I use a
string like "branch^" multiple times, there is no guarantee that it
always refers to the same commit.

3. Verifying arguments at one spot centralizes error-checking at the
start of a script and eliminates one reason for random git commands to
fail later (when error recovery is perhaps more difficult).

4. Converting once avoids the overhead of repeated conversion from a
free-form committish into an object name if the argument needs to be
passed to multiple git commands (though presumably the overhead is
insignificant in most cases).

Undoubtedly in many cases this practice of
early-verification-and-conversion is unnecessary, or the same benefit
could be had from either verifying or converting without doing both.
But verifying-and-converting is easy, cheap, and means not having to
decide on a case-by-case basis whether it could have been avoided.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
  2013-03-30  3:44         ` Michael Haggerty
@ 2013-03-30  4:09           ` Junio C Hamano
  2013-03-30  4:15             ` Junio C Hamano
  2013-03-30  5:29             ` Michael Haggerty
  0 siblings, 2 replies; 22+ messages in thread
From: Junio C Hamano @ 2013-03-30  4:09 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Junio C Hamano, Git Mailing List

Michael Haggerty <mhagger@alum.mit.edu> writes:

> 1. An SHA1 is a canonical representation of the argument, useful for
> example as the key in a hash map for for looking for the presence of a
> commit in a rev-list output.
>
> 2. An SHA1 is persistent.  For example, I use them when caching
> benchmark results across versions.  Moreover, they are safe for use in
> filenames.  The persistence also makes scripts more robust against
> simultaneous changes to the repo by other processes, whereas if I use a
> string like "branch^" multiple times, there is no guarantee that it
> always refers to the same commit.

These two are half-irrelevant; they only call for use of the current
"rev-parse --verify" that always gives you 40-hex.  The more
important one is the next one.

> 3. Verifying arguments at one spot centralizes error-checking at the
> start of a script and eliminates one reason for random git commands to
> fail later (when error recovery is perhaps more difficult).

Not necessarily, unless your script is a very narrow corner case
that is satisfied with "any kind of object goes".  When a parameter
has to be commit for some purpose and another parameter can be any
tree-ish, you would want to validate them _with_ type requirement.

If you are using the "object name persistency" to create a cache of
"how many times the word 'gogopuff' appears in the entire tree?",
you want to cache the result keyed with the object name of the tree,
whether your user gives you v1.8.0 (tag), v1.8.0^0 (commit), or
v1.8.0^{tree} (tree), and you want to unwrap the user input down to
tree object name to look up the pre-computed result for the cache to
be any useful.

> 4. Converting once avoids the overhead of repeated conversion from a
> free-form committish into an object name if the argument needs to be
> passed to multiple git commands (though presumably the overhead is
> insignificant in most cases).

True.

So why not verify arguments while making sure of their type early
with 'rev-parse --verify "$userinput^{desiredtype}"?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
  2013-03-30  4:09           ` Junio C Hamano
@ 2013-03-30  4:15             ` Junio C Hamano
  2013-03-30  5:29             ` Michael Haggerty
  1 sibling, 0 replies; 22+ messages in thread
From: Junio C Hamano @ 2013-03-30  4:15 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Junio C Hamano, Git Mailing List

Junio C Hamano <gitster@pobox.com> writes:

>> 3. Verifying arguments at one spot centralizes error-checking at the
>> start of a script and eliminates one reason for random git commands to
>> fail later (when error recovery is perhaps more difficult).
>
> Not necessarily, unless your script is a very narrow corner case
> that is satisfied with "any kind of object goes".

Clarification.  I meant "A single central point to check is a useful
concept, but centrally checking 'does this exist? I do not care what
it is' is not necessarily useful and I suspect it is not useful more
often than it is".  When you care if something exists, you often
want to know what it is and other aspects of that thing at the same
time.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
  2013-03-30  4:09           ` Junio C Hamano
  2013-03-30  4:15             ` Junio C Hamano
@ 2013-03-30  5:29             ` Michael Haggerty
  2013-03-30  6:38               ` Junio C Hamano
  1 sibling, 1 reply; 22+ messages in thread
From: Michael Haggerty @ 2013-03-30  5:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Junio C Hamano, Git Mailing List

On 03/30/2013 05:09 AM, Junio C Hamano wrote:
> So why not verify arguments while making sure of their type early
> with 'rev-parse --verify "$userinput^{desiredtype}"?

Yes, that's a better solution in almost all cases.  Thanks for the tip.

(It doesn't change my opinion that the documentation for "rev-parse
--verify" is misleading, but given that you don't appear to want to
change its behavior I will submit a documentation patch.)

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
  2013-03-30  5:29             ` Michael Haggerty
@ 2013-03-30  6:38               ` Junio C Hamano
  2013-03-30  7:05                 ` Junio C Hamano
  0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2013-03-30  6:38 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Git Mailing List

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 03/30/2013 05:09 AM, Junio C Hamano wrote:
>> So why not verify arguments while making sure of their type early
>> with 'rev-parse --verify "$userinput^{desiredtype}"?
>
> Yes, that's a better solution in almost all cases.  Thanks for the tip.
>
> (It doesn't change my opinion that the documentation for "rev-parse
> --verify" is misleading, but given that you don't appear to want to
> change its behavior I will submit a documentation patch.)

It does not matter what I want.  What matters is that changing the
definition is a _wrong_ thing to do, as --verify is designed to be
usable for objects you may not yet have.

What we may want is another type peeling operator, ^{object}.
makes sure it is an object, that lets you say:

    rev-parse --verify 572a535454612a046e7dd7404dcca94d6243c788^{object}

It asks "I have this 40-hex; I want an object out of it", just like
frotz^{tree} is "I have 'frotz'; I want a tree-ish" for any value of
'frotz'.

With that, a use case that it wants to see _any_ object can safely
use 'rev-parse --verify "$userinput^{object}' without an annotated
tag getting in the way.

How does that sound?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
  2013-03-30  6:38               ` Junio C Hamano
@ 2013-03-30  7:05                 ` Junio C Hamano
  2013-03-30  8:09                   ` Michael Haggerty
                                     ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Junio C Hamano @ 2013-03-30  7:05 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Git Mailing List

Junio C Hamano <gitster@pobox.com> writes:

> What we may want is another type peeling operator, ^{object}.
> that makes sure it is an object, like this:
>
>     rev-parse --verify 572a535454612a046e7dd7404dcca94d6243c788^{object}
>
> It asks "I have this 40-hex; I want an object out of it", just like
> frotz^{tree} is "I have 'frotz'; I want a tree-ish" for any value of
> 'frotz'.
>
> With that, a use case that it wants to see _any_ object can safely
> use 'rev-parse --verify "$userinput^{object}' without an annotated
> tag getting in the way.
>
> How does that sound?

Perhaps something like this.  Note that the last hunk is unrelated
thinko-fix I noticed while browsing the code.

-- >8 --
Subject: sha1_name.c: ^{object} peeler

A string that names an object can be suffixed with ^{type} peeler to
say "I have this object name; peel it until you get this type. If
you cannot do so, it is an error".  v1.8.2^{commit} asks for a commit
that is pointed at an annotated tag v1.8.2; v1.8.2^{tree} unwraps it
further to the top-level tree object.  A special suffix ^{} (i.e. no
type specified) means "I do not care what it unwraps to; just peel
annotated tag until you get something that is not a tag".

When you have a random user-supplied string, you can turn it to a
bare 40-hex object name, and cause it to error out if such an object
does not exist, with:

	git rev-parse --verify "$userstring^{}"

for most objects, but this does not yield the tag object name when
$userstring refers to an annotated tag.

Introduce a new suffix, ^{object}, that only makes sure the given
name refers to an existing object.  Then

	git rev-parse --verify "$userstring^{object}"

becomes a way to make sure $userstring refers to an existing object.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 sha1_name.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/sha1_name.c b/sha1_name.c
index c50630a..85b6e75 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -594,7 +594,7 @@ struct object *peel_to_type(const char *name, int namelen,
 	while (1) {
 		if (!o || (!o->parsed && !parse_object(o->sha1)))
 			return NULL;
-		if (o->type == expected_type)
+		if (expected_type == OBJ_ANY || o->type == expected_type)
 			return o;
 		if (o->type == OBJ_TAG)
 			o = ((struct tag*) o)->tagged;
@@ -645,6 +645,8 @@ static int peel_onion(const char *name, int len, unsigned char *sha1)
 		expected_type = OBJ_TREE;
 	else if (!strncmp(blob_type, sp, 4) && sp[4] == '}')
 		expected_type = OBJ_BLOB;
+	else if (!prefixcmp(sp, "object}"))
+		expected_type = OBJ_ANY;
 	else if (sp[0] == '}')
 		expected_type = OBJ_NONE;
 	else if (sp[0] == '/')
@@ -654,6 +656,8 @@ static int peel_onion(const char *name, int len, unsigned char *sha1)
 
 	if (expected_type == OBJ_COMMIT)
 		lookup_flags = GET_SHA1_COMMITTISH;
+	else if (expected_type == OBJ_TREE)
+		lookup_flags = GET_SHA1_TREEISH;
 
 	if (get_sha1_1(name, sp - name - 2, outer, lookup_flags))
 		return -1;

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
  2013-03-30  7:05                 ` Junio C Hamano
@ 2013-03-30  8:09                   ` Michael Haggerty
  2013-03-30  8:14                   ` Elia Pinto
                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 22+ messages in thread
From: Michael Haggerty @ 2013-03-30  8:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Git Mailing List

On 03/30/2013 08:05 AM, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> What we may want is another type peeling operator, ^{object}.
>> that makes sure it is an object, like this:
>>
>>     rev-parse --verify 572a535454612a046e7dd7404dcca94d6243c788^{object}
>>
>> It asks "I have this 40-hex; I want an object out of it", just like
>> frotz^{tree} is "I have 'frotz'; I want a tree-ish" for any value of
>> 'frotz'.
>>
>> With that, a use case that it wants to see _any_ object can safely
>> use 'rev-parse --verify "$userinput^{object}' without an annotated
>> tag getting in the way.
>>
>> How does that sound?
> 
> Perhaps something like this.  Note that the last hunk is unrelated
> thinko-fix I noticed while browsing the code.

Sounds reasonable to me.  I'm not familiar with this code, but your
change looks simple enough.  Plus documentation change in
Documentation/revisions.txt, of course.

Thanks,
Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bug in "git rev-parse --verify"
  2013-03-30  7:05                 ` Junio C Hamano
  2013-03-30  8:09                   ` Michael Haggerty
@ 2013-03-30  8:14                   ` Elia Pinto
  2013-03-31 22:38                   ` [PATCH 1/2] peel_onion: disambiguate to favor tree-ish when we want a tree-ish Junio C Hamano
  2013-03-31 22:38                   ` [PATCH 2/2] peel_onion(): teach $foo^{object} peeler Junio C Hamano
  3 siblings, 0 replies; 22+ messages in thread
From: Elia Pinto @ 2013-03-30  8:14 UTC (permalink / raw)
  To: Junio C Hamano, Michael Haggerty, Jeff King, Git Mailing List

Fwiw, look very a sound idea for me.

Best

2013/3/30, Junio C Hamano <gitster@pobox.com>:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> What we may want is another type peeling operator, ^{object}.
>> that makes sure it is an object, like this:
>>
>>     rev-parse --verify 572a535454612a046e7dd7404dcca94d6243c788^{object}
>>
>> It asks "I have this 40-hex; I want an object out of it", just like
>> frotz^{tree} is "I have 'frotz'; I want a tree-ish" for any value of
>> 'frotz'.
>>
>> With that, a use case that it wants to see _any_ object can safely
>> use 'rev-parse --verify "$userinput^{object}' without an annotated
>> tag getting in the way.
>>
>> How does that sound?
>
> Perhaps something like this.  Note that the last hunk is unrelated
> thinko-fix I noticed while browsing the code.
>
> -- >8 --
> Subject: sha1_name.c: ^{object} peeler
>
> A string that names an object can be suffixed with ^{type} peeler to
> say "I have this object name; peel it until you get this type. If
> you cannot do so, it is an error".  v1.8.2^{commit} asks for a commit
> that is pointed at an annotated tag v1.8.2; v1.8.2^{tree} unwraps it
> further to the top-level tree object.  A special suffix ^{} (i.e. no
> type specified) means "I do not care what it unwraps to; just peel
> annotated tag until you get something that is not a tag".
>
> When you have a random user-supplied string, you can turn it to a
> bare 40-hex object name, and cause it to error out if such an object
> does not exist, with:
>
> 	git rev-parse --verify "$userstring^{}"
>
> for most objects, but this does not yield the tag object name when
> $userstring refers to an annotated tag.
>
> Introduce a new suffix, ^{object}, that only makes sure the given
> name refers to an existing object.  Then
>
> 	git rev-parse --verify "$userstring^{object}"
>
> becomes a way to make sure $userstring refers to an existing object.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  sha1_name.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/sha1_name.c b/sha1_name.c
> index c50630a..85b6e75 100644
> --- a/sha1_name.c
> +++ b/sha1_name.c
> @@ -594,7 +594,7 @@ struct object *peel_to_type(const char *name, int
> namelen,
>  	while (1) {
>  		if (!o || (!o->parsed && !parse_object(o->sha1)))
>  			return NULL;
> -		if (o->type == expected_type)
> +		if (expected_type == OBJ_ANY || o->type == expected_type)
>  			return o;
>  		if (o->type == OBJ_TAG)
>  			o = ((struct tag*) o)->tagged;
> @@ -645,6 +645,8 @@ static int peel_onion(const char *name, int len,
> unsigned char *sha1)
>  		expected_type = OBJ_TREE;
>  	else if (!strncmp(blob_type, sp, 4) && sp[4] == '}')
>  		expected_type = OBJ_BLOB;
> +	else if (!prefixcmp(sp, "object}"))
> +		expected_type = OBJ_ANY;
>  	else if (sp[0] == '}')
>  		expected_type = OBJ_NONE;
>  	else if (sp[0] == '/')
> @@ -654,6 +656,8 @@ static int peel_onion(const char *name, int len,
> unsigned char *sha1)
>
>  	if (expected_type == OBJ_COMMIT)
>  		lookup_flags = GET_SHA1_COMMITTISH;
> +	else if (expected_type == OBJ_TREE)
> +		lookup_flags = GET_SHA1_TREEISH;
>
>  	if (get_sha1_1(name, sp - name - 2, outer, lookup_flags))
>  		return -1;
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

-- 
Inviato dal mio dispositivo mobile

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/2] peel_onion: disambiguate to favor tree-ish when we want a tree-ish
  2013-03-30  7:05                 ` Junio C Hamano
  2013-03-30  8:09                   ` Michael Haggerty
  2013-03-30  8:14                   ` Elia Pinto
@ 2013-03-31 22:38                   ` Junio C Hamano
  2013-03-31 22:38                   ` [PATCH 2/2] peel_onion(): teach $foo^{object} peeler Junio C Hamano
  3 siblings, 0 replies; 22+ messages in thread
From: Junio C Hamano @ 2013-03-31 22:38 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Git Mailing List

The function already knows when interpreting $foo^{commit} to tell
the underlying get_sha1_1() to expect a commit-ish while evaluating
$foo.  Teach it to do the same when asked for $foo^{tree}; we are
expecting a tree-ish and $foo should be disambiguated in favor of a
tree-ish, discarding a possible ambiguous match with a blob object.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

Junio C Hamano <gitster@pobox.com> writes:

> Perhaps something like this.  Note that the last hunk is unrelated
> thinko-fix I noticed while browsing the code.

 sha1_name.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sha1_name.c b/sha1_name.c
index c50630a..45788df 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -654,6 +654,8 @@ static int peel_onion(const char *name, int len, unsigned char *sha1)
 
 	if (expected_type == OBJ_COMMIT)
 		lookup_flags = GET_SHA1_COMMITTISH;
+	else if (expected_type == OBJ_TREE)
+		lookup_flags = GET_SHA1_TREEISH;
 
 	if (get_sha1_1(name, sp - name - 2, outer, lookup_flags))
 		return -1;
-- 
1.8.2-441-g6e6f07b

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/2] peel_onion(): teach $foo^{object} peeler
  2013-03-30  7:05                 ` Junio C Hamano
                                     ` (2 preceding siblings ...)
  2013-03-31 22:38                   ` [PATCH 1/2] peel_onion: disambiguate to favor tree-ish when we want a tree-ish Junio C Hamano
@ 2013-03-31 22:38                   ` Junio C Hamano
  2013-04-02  8:20                     ` Michael Haggerty
  3 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2013-03-31 22:38 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Git Mailing List

A string that names an object can be suffixed with ^{type} peeler to
say "I have this object name; peel it until you get this type. If
you cannot do so, it is an error".  v1.8.2^{commit} asks for a commit
that is pointed at an annotated tag v1.8.2; v1.8.2^{tree} unwraps it
further to the top-level tree object.  A special suffix ^{} (i.e. no
type specified) means "I do not care what it unwraps to; just peel
annotated tag until you get something that is not a tag".

When you have a random user-supplied string, you can turn it to a
bare 40-hex object name, and cause it to error out if such an object
does not exist, with:

	git rev-parse --verify "$userstring^{}"

for most objects, but this does not yield the tag object name when
$userstring refers to an annotated tag.

Introduce a new suffix, ^{object}, that only makes sure the given
name refers to an existing object.  Then

	git rev-parse --verify "$userstring^{object}"

becomes a way to make sure $userstring refers to an existing object.

This is necessary because the plumbing "rev-parse --verify" is only
about "make sure the argument is something we can feed to get_sha1()
and turn it into a raw 20-byte object name SHA-1" and is not about
"make sure that 20-byte object name SHA-1 refers to an object that
exists in our object store".  When the given $userstring is already
a 40-hex, by definition "rev-parse --verify $userstring" can turn it
into a raw 20-byte object name.  With "$userstring^{object}", we can
make sure that the 40-hex string names an object that exists in our
object store before "--verify" kicks in.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 sha1_name.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/sha1_name.c b/sha1_name.c
index 45788df..85b6e75 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -594,7 +594,7 @@ struct object *peel_to_type(const char *name, int namelen,
 	while (1) {
 		if (!o || (!o->parsed && !parse_object(o->sha1)))
 			return NULL;
-		if (o->type == expected_type)
+		if (expected_type == OBJ_ANY || o->type == expected_type)
 			return o;
 		if (o->type == OBJ_TAG)
 			o = ((struct tag*) o)->tagged;
@@ -645,6 +645,8 @@ static int peel_onion(const char *name, int len, unsigned char *sha1)
 		expected_type = OBJ_TREE;
 	else if (!strncmp(blob_type, sp, 4) && sp[4] == '}')
 		expected_type = OBJ_BLOB;
+	else if (!prefixcmp(sp, "object}"))
+		expected_type = OBJ_ANY;
 	else if (sp[0] == '}')
 		expected_type = OBJ_NONE;
 	else if (sp[0] == '/')
-- 
1.8.2-441-g6e6f07b

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] peel_onion(): teach $foo^{object} peeler
  2013-03-31 22:38                   ` [PATCH 2/2] peel_onion(): teach $foo^{object} peeler Junio C Hamano
@ 2013-04-02  8:20                     ` Michael Haggerty
  2013-04-02 15:45                       ` Junio C Hamano
  0 siblings, 1 reply; 22+ messages in thread
From: Michael Haggerty @ 2013-04-02  8:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Git Mailing List

While I was in the middle of suggesting documentation for this new
syntax, I discovered that you already added documentation to your repo
but didn't mention the new version on the mailing list (or maybe I
overlooked it).  It would be helpful if you would submit your own
changes to the mailing list to make it harder for the rest of us to
overlook them--and easier to look over them :-)

The documentation looks fine to me.

Off topic: Your patch reminds me of something else that surprised me:
there is no "$userstring^{tag}".  I suppose it would be a bit ambiguous,
given that tags can point at tags, and it would also be less useful than
the other suffixes.  But its absence irked the completionist in me :-)

Michael

On 04/01/2013 12:38 AM, Junio C Hamano wrote:
> A string that names an object can be suffixed with ^{type} peeler to
> say "I have this object name; peel it until you get this type. If
> you cannot do so, it is an error".  v1.8.2^{commit} asks for a commit
> that is pointed at an annotated tag v1.8.2; v1.8.2^{tree} unwraps it
> further to the top-level tree object.  A special suffix ^{} (i.e. no
> type specified) means "I do not care what it unwraps to; just peel
> annotated tag until you get something that is not a tag".
> 
> When you have a random user-supplied string, you can turn it to a
> bare 40-hex object name, and cause it to error out if such an object
> does not exist, with:
> 
> 	git rev-parse --verify "$userstring^{}"
> 
> for most objects, but this does not yield the tag object name when
> $userstring refers to an annotated tag.
> 
> Introduce a new suffix, ^{object}, that only makes sure the given
> name refers to an existing object.  Then
> 
> 	git rev-parse --verify "$userstring^{object}"
> 
> becomes a way to make sure $userstring refers to an existing object.
> 
> This is necessary because the plumbing "rev-parse --verify" is only
> about "make sure the argument is something we can feed to get_sha1()
> and turn it into a raw 20-byte object name SHA-1" and is not about
> "make sure that 20-byte object name SHA-1 refers to an object that
> exists in our object store".  When the given $userstring is already
> a 40-hex, by definition "rev-parse --verify $userstring" can turn it
> into a raw 20-byte object name.  With "$userstring^{object}", we can
> make sure that the 40-hex string names an object that exists in our
> object store before "--verify" kicks in.
> 
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  sha1_name.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/sha1_name.c b/sha1_name.c
> index 45788df..85b6e75 100644
> --- a/sha1_name.c
> +++ b/sha1_name.c
> @@ -594,7 +594,7 @@ struct object *peel_to_type(const char *name, int namelen,
>  	while (1) {
>  		if (!o || (!o->parsed && !parse_object(o->sha1)))
>  			return NULL;
> -		if (o->type == expected_type)
> +		if (expected_type == OBJ_ANY || o->type == expected_type)
>  			return o;
>  		if (o->type == OBJ_TAG)
>  			o = ((struct tag*) o)->tagged;
> @@ -645,6 +645,8 @@ static int peel_onion(const char *name, int len, unsigned char *sha1)
>  		expected_type = OBJ_TREE;
>  	else if (!strncmp(blob_type, sp, 4) && sp[4] == '}')
>  		expected_type = OBJ_BLOB;
> +	else if (!prefixcmp(sp, "object}"))
> +		expected_type = OBJ_ANY;
>  	else if (sp[0] == '}')
>  		expected_type = OBJ_NONE;
>  	else if (sp[0] == '/')
> 


-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] peel_onion(): teach $foo^{object} peeler
  2013-04-02  8:20                     ` Michael Haggerty
@ 2013-04-02 15:45                       ` Junio C Hamano
  2013-04-02 16:32                         ` Michael Haggerty
  0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2013-04-02 15:45 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Git Mailing List

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Off topic: Your patch reminds me of something else that surprised me:
> there is no "$userstring^{tag}".  I suppose it would be a bit ambiguous,
> given that tags can point at tags, and it would also be less useful than
> the other suffixes.  But its absence irked the completionist in me :-)

Yes, unfortunately, foo^{type} means "start from foo, and until what
you are looking at it is of type, repeatedly peel to see if you can
get to an object of that type, or stop and report an error".  If a
tag A points at another tag B, which in turn points at an object C,
you will never see B by applying usual peeling operator.

Also, v1.8.2^{tag} would be give the tag itself, while master^{tag}
would not report the commit "master" but would error out, which
would be useless.  You are better off doing `git cat-file -t foo`
and seeing if it is a tag object at that point.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] peel_onion(): teach $foo^{object} peeler
  2013-04-02 15:45                       ` Junio C Hamano
@ 2013-04-02 16:32                         ` Michael Haggerty
  2013-04-02 16:56                           ` Junio C Hamano
  0 siblings, 1 reply; 22+ messages in thread
From: Michael Haggerty @ 2013-04-02 16:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Git Mailing List

On 04/02/2013 05:45 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> Off topic: Your patch reminds me of something else that surprised me:
>> there is no "$userstring^{tag}".  I suppose it would be a bit ambiguous,
>> given that tags can point at tags, and it would also be less useful than
>> the other suffixes.  But its absence irked the completionist in me :-)
> 
> Yes, unfortunately, foo^{type} means "start from foo, and until what
> you are looking at it is of type, repeatedly peel to see if you can
> get to an object of that type, or stop and report an error".  If a
> tag A points at another tag B, which in turn points at an object C,
> you will never see B by applying usual peeling operator.
> 
> Also, v1.8.2^{tag} would be give the tag itself, while master^{tag}
> would not report the commit "master" but would error out, which
> would be useless.  You are better off doing `git cat-file -t foo`
> and seeing if it is a tag object at that point.

All correct, of course.  But the user would never use "master^{tag}"
unless he wants a tag and nothing else, so erroring out would be exactly
the thing he wants in that case.  This is no different than the
"^{commit}" part of "master^{tree}^{commit}", which correctly errors out
because a commit cannot be inferred from a tree.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] peel_onion(): teach $foo^{object} peeler
  2013-04-02 16:32                         ` Michael Haggerty
@ 2013-04-02 16:56                           ` Junio C Hamano
  0 siblings, 0 replies; 22+ messages in thread
From: Junio C Hamano @ 2013-04-02 16:56 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Git Mailing List

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 04/02/2013 05:45 PM, Junio C Hamano wrote:
>
>> Also, v1.8.2^{tag} would be give the tag itself, while master^{tag}
>> would not report the commit "master" but would error out, which
>> would be useless.  You are better off doing `git cat-file -t foo`
>> and seeing if it is a tag object at that point.
>
> All correct, of course.  But the user would never use "master^{tag}"
> unless he wants a tag and nothing else, so erroring out would be exactly
> the thing he wants in that case.  This is no different than the
> "^{commit}" part of "master^{tree}^{commit}", which correctly errors out
> because a commit cannot be inferred from a tree.

Correct; I was only saying adding it is not something that solves a
problem that cannot be solved with the current system (i.e. no added
value from feature point-of-view).  I would not object to a patch to
allow "git rev-parse v1.8.2^{tag}" for completeness.

We may want to rethink if we can lose the hardcoded lengths like 6,
3, 4, 4 from here, but I didn't bother ;-).


 sha1_name.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 85b6e75..47f39a8 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -639,16 +639,18 @@ static int peel_onion(const char *name, int len, unsigned char *sha1)
 		return -1;
 
 	sp++; /* beginning of type name, or closing brace for empty */
-	if (!strncmp(commit_type, sp, 6) && sp[6] == '}')
+	if (!strncmp(tag_type, sp, 3) && sp[3] == '}')
+		expected_type = OBJ_TAG;
+	else if (!strncmp(commit_type, sp, 6) && sp[6] == '}')
 		expected_type = OBJ_COMMIT;
 	else if (!strncmp(tree_type, sp, 4) && sp[4] == '}')
 		expected_type = OBJ_TREE;
 	else if (!strncmp(blob_type, sp, 4) && sp[4] == '}')
 		expected_type = OBJ_BLOB;
 	else if (!prefixcmp(sp, "object}"))
-		expected_type = OBJ_ANY;
+		expected_type = OBJ_ANY; /* ok as long as it exists */
 	else if (sp[0] == '}')
-		expected_type = OBJ_NONE;
+		expected_type = OBJ_NONE; /* unwrap until we get a non-tag */
 	else if (sp[0] == '/')
 		expected_type = OBJ_COMMIT;
 	else

^ permalink raw reply related	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2013-04-02 16:57 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-28 13:04 Bug in "git rev-parse --verify" Michael Haggerty
2013-03-28 14:05 ` Jeff King
2013-03-28 14:55   ` Junio C Hamano
     [not found] ` <CAPc5daUqzz=9TBmj2Q0MHqEc6gMHxXoGr9+JV3hq76zDKJAyCw@mail.gmail.com>
2013-03-28 15:34   ` Michael Haggerty
2013-03-28 15:38     ` Jeff King
2013-03-28 15:52       ` Michael Haggerty
2013-03-28 16:38         ` Jeff King
2013-03-28 16:52       ` Junio C Hamano
2013-03-30  3:44         ` Michael Haggerty
2013-03-30  4:09           ` Junio C Hamano
2013-03-30  4:15             ` Junio C Hamano
2013-03-30  5:29             ` Michael Haggerty
2013-03-30  6:38               ` Junio C Hamano
2013-03-30  7:05                 ` Junio C Hamano
2013-03-30  8:09                   ` Michael Haggerty
2013-03-30  8:14                   ` Elia Pinto
2013-03-31 22:38                   ` [PATCH 1/2] peel_onion: disambiguate to favor tree-ish when we want a tree-ish Junio C Hamano
2013-03-31 22:38                   ` [PATCH 2/2] peel_onion(): teach $foo^{object} peeler Junio C Hamano
2013-04-02  8:20                     ` Michael Haggerty
2013-04-02 15:45                       ` Junio C Hamano
2013-04-02 16:32                         ` Michael Haggerty
2013-04-02 16:56                           ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).