Git development
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Junio C Hamano <junkio@cox.net>
Cc: Linus Torvalds <torvalds@osdl.org>, git@vger.kernel.org
Subject: Re: [RFC][PATCH] Allow transfer of any valid sha1
Date: Fri, 26 May 2006 11:32:40 -0600	[thread overview]
Message-ID: <m1ac94bsjr.fsf@ebiederm.dsl.xmission.com> (raw)
In-Reply-To: <7vac95m799.fsf@assigned-by-dhcp.cox.net> (Junio C. Hamano's message of "Fri, 26 May 2006 03:04:50 -0700")

Junio C Hamano <junkio@cox.net> writes:

> ebiederm@xmission.com (Eric W. Biederman) writes:
>
>> diff --git a/fetch-pack.c b/fetch-pack.c
>> index a3bcad0..c767d84 100644
>> --- a/fetch-pack.c
>> +++ b/fetch-pack.c
>> @@ -260,6 +260,27 @@ static void mark_recent_complete_commits
>>  	}
>>  }
>>  
>> +static struct ref **get_sha1_heads(struct ref **refs, int nr_heads, char
> **head)
>> +{
>> +	int i;
>> +	for (i  = 0; i < nr_heads; i++) {
>> +		struct ref *ref;
>> +		unsigned char sha1[20];
>> +		char *s = head[i];
>> +		int len = strlen(s);
>> +
>> +		if (len != 40 || get_sha1_hex(s, sha1))
>> +			continue;
>
> So the new convention is fetch-pack can take ref name (as
> before), or a bare 40-byte hexadecimal.  I think sane people
> would not use ambiguous refname that says "deadbeef" five times,
> and even if the do so they could disambiguate by explicitly
> saying "refs/heads/" followed by "deadbeef" five times, so it
> should be OK.

Yes.

>> +
>> +		ref = xcalloc(1, sizeof(*ref) + len + 1);
>> +		memcpy(ref->old_sha1, sha1, 20);
>> +		memcpy(ref->name, s, len + 1);
>> +		*refs = ref;
>> +		refs = &ref->next;
>> +	}
>> +	return refs;
>> +}
>> +
>
> This function takes the pointer to a location that holds a
> pointer to a "struct ref" -- it is the location to store the
> newly allocated ref structure, i.e. the next pointer of the last
> element in the list.  When it returns, the location pointed at
> by the pointer given to you points at the first element you
> allocated, and it returns the next pointer of the last element
> allocated by it.  That is the same calling convention as
> connect.c::get_remote_heads().  So when calling this function to
> append to a list you already have, you would give the next
> pointer to the last element of the existing list.  But you do
> not seem to do that.

Ack. That does look like a bug.  I knew there as something
fishy about that code.  But it worked for my basic testing so I didn't
worry about it.

> I think the body of fetch_pack() should become something like:
>
> 	struct ref *ref, **tail;
>
>         tail = get_remote_heads(fd[0], &ref, 0, NULL, 0);
> 	if (server_supports("multi_ack")) {
> 		...
> 	}
> 	tail = get_sha1_heads(tail, nr_match, match);
> 	if (everything_local(&ref, nr_match, match)) {
> 		...

Actually because we want the filter to resolve sha1s by
default in terms of what was passed on the command line.  I'm pretty
certain that should be:

	tail = get_sha1_heads(&ref, nr_match, match);
	tail = get_remote_heads(fd[0], tail, 0, NULL, 0);
        ...


>> @@ -311,6 +332,8 @@ static int everything_local(struct ref *
>>  	if (cutoff)
>>  		mark_recent_complete_commits(cutoff);
>>  
>> +	filter_refs(refs, nr_match, match);
>> +
>
> I am not sure about this change.

Agreed.  It was a hold over from an earlier way of injecting
the sha1 into the logic.  

As for what happens I think I need to audit everything that
takes a ref from fetch_pack.  To make certain I have not
messed up the logic.

> In the original code we do not let get_remote_heads() to filter
> the refs but call filter_refs() after the "mark all complete
> remote refs as common" step for a reason.  Even though we may
> not be fetching from some remote refs, we would want to take
> advantage of the knowledge of what objects they have so that we
> can mark as many objects as common as possible in the early
> stage.  I suspect this change defeats that optimization.

It feels like it.

> So instead I would teach "mark all complete remote refs" loop
> that not everything in refs list is a valid remote ref, and skip
> what get_sha1_heads() injected, because these arbitrary ones we
> got from the command line are not something we know exist on the
> remote side.  Maybe something like this.

Sounds sane.  We also introduce a new possibility of having a
ref that is complete but not remote.

> 	/*
> 	 * Mark all complete remote refs as common refs.
> 	 * Don't mark them common yet; the server has to be told so first.
> 	 */
> 	for (ref = *refs; ref; ref = ref->next) {
> 		struct object *o;
>                 if (ref is SHA1 from the command line)
>                 	continue;
> 		o = deref_tag(lookup_object(ref->old_sha1), NULL, 0);
> 		if (!o || o->type != commit_type || !(o->flags & COMPLETE))
> 			continue;
> 		...
>
> To implement "ref is SHA1 from the command line", I would add
> another 1-bit field to "struct ref" and mark the new ones you
> create in get_sha1_heads() as such (existing "force" field
> could also become an 1-bit field -- we do not neeed a char).

Sounds sane.
So that gives me:
	unsigned int force : 1;
	unsigned int injected : 1;

Which aligns them to an int boundary but since we are followed
immediately by a pointer should result in no additional storage being
consumed.

>> @@ -373,6 +394,7 @@ static int fetch_pack(int fd[2], int nr_
>>  		packet_flush(fd[1]);
>>  		die("no matching remote head");
>>  	}
>> +	get_sha1_heads(&ref, nr_match, match);
>
> I talked about this one already...
>
>> diff --git a/git-parse-remote.sh b/git-parse-remote.sh
>> index 187f088..2372df8 100755
>> --- a/git-parse-remote.sh
>> +++ b/git-parse-remote.sh
>> @@ -105,6 +105,7 @@ canon_refs_list_for_fetch () {
>>  		'') remote=HEAD ;;
>>  		refs/heads/* | refs/tags/* | refs/remotes/*) ;;
>>  		heads/* | tags/* | remotes/* ) remote="refs/$remote" ;;
>> +
> [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F])
> ;;
>
> Yuck.  Don't we have $_x40 somewhere?

I couldn't find one in shell.  

> We never use uppercase so at least we could save 24 columns from
> here ;-).

I'm not certain why we always add make $remote="refs/heads/$remote" by
default in that switch statement. git-fetch-pack at least doesn't need
it.

If that is true of the other consumers we could easily make the test:
[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]*) ;;
Or even simply make the default case *) ;;

But for the moment I will stick to the long form because it is
obviously correct.

Eric

  reply	other threads:[~2006-05-26 17:33 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-05-24  7:51 [RFC][PATCH] Allow transfer of any valid sha1 Eric W. Biederman
2006-05-24  9:07 ` Junio C Hamano
2006-05-25  5:09   ` Eric W. Biederman
2006-05-25  6:36     ` Junio C Hamano
2006-05-25 17:00       ` Eric W. Biederman
2006-05-25 17:28         ` Linus Torvalds
2006-05-25 17:59           ` Eric W. Biederman
2006-05-25 18:28           ` Junio C Hamano
2006-05-25 18:36             ` Linus Torvalds
2006-05-25 20:30               ` Eric W. Biederman
2006-05-25 20:53                 ` Junio C Hamano
2006-05-26  8:27                   ` Eric W. Biederman
2006-05-26 10:04                 ` Junio C Hamano
2006-05-26 17:32                   ` Eric W. Biederman [this message]
2006-05-25 20:50             ` Eric W. Biederman
2006-05-25 21:04               ` Junio C Hamano
2006-05-26  8:32                 ` Eric W. Biederman
2006-06-08  9:33                 ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m1ac94bsjr.fsf@ebiederm.dsl.xmission.com \
    --to=ebiederm@xmission.com \
    --cc=git@vger.kernel.org \
    --cc=junkio@cox.net \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox