git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Adding glob support to remotes
@ 2006-11-22  9:04 Andy Parkins
  2006-11-22 12:56 ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Andy Parkins @ 2006-11-22  9:04 UTC (permalink / raw)
  To: git

Hello,

I'm working on adding (basic) glob support to remote fetch definitions.  This 
is to allow you to write things like

[remote "origin"]
  fetch = refs/heads/*:refs/remotes/upstream/*

I started to add code to git-parse-remote.sh:canon_refs_list_for_fetch() to 
preprocess the reflist to catch lines with a "*" in them then use the remote 
pattern to filter the output of from "git-ls-remote -h", blah, blah, you get 
the idea...

However, git-ls-remote needs the name of the remote repository (of course), 
but that isn't directly available in git-parse-remote.sh.  Should I
 a) pass it as a parameter from git-fetch.sh right through each intermediate 
function
 b) create a global?
 c) change git-check-ref-format to allow "*" in the name, then put the 
git-ls-remote call in git-fetch instead.

I don't like to do (b) as it's nasty programming behaviour; however passing a 
parameter is fairly intrusive to the existing code.  Similarly (c) means I'm 
messing in places I suspect I shouldn't be (git-check-ref-format).

git-gods - what do I do?



Andy
-- 
Dr Andy Parkins, M Eng (hons), MIEE

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Adding glob support to remotes
  2006-11-22  9:04 Adding glob support to remotes Andy Parkins
@ 2006-11-22 12:56 ` Junio C Hamano
  2006-11-22 14:41   ` Andy Parkins
  0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2006-11-22 12:56 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

Andy Parkins <andyparkins@gmail.com> writes:

> I started to add code to git-parse-remote.sh:canon_refs_list_for_fetch() to 
> preprocess the reflist to catch lines with a "*" in them then use the remote 
> pattern to filter the output of from "git-ls-remote -h", blah, blah, you get 
> the idea...
>
> However, git-ls-remote needs the name of the remote repository (of course), 
> but that isn't directly available in git-parse-remote.sh.  

Is it really the case?  I do not remember the details offhand,
but I do not think canon_refs_list_for_fetch is the function you
should be messing with to implement the remote."origin".fetch
stuff.  It should be get_remote_default_refs_for_fetch().  The
function returns the list based on which remote, so it surely
knows which remote the caller is talking about.

However, I would recommend against actually running ls-remote to
help "git-fetch" inside git-parse-remote.sh.  I think you should
run ls-remote upfront early in git-fetch because there are at
least two other parts in git-fetch that wants the same ls-remote
output:

 (1) dumb protocols currently cannot deal with a remote that has
     run "packed-ref --prune" because git-fetch.sh first uses
     curl executable to download the loose ref, read it and then
     use the object name read from that to drive git-http-fetch.
     We can and should get rid of the /max_depth=5/,/done/ loop
     there and replace it with a grep of ls-remote output to
     make them work against such a remote.  When tracking many
     branches from the remote, this would reduce the number of
     http requests (one per branch vs a ls-remote which is just
     a single download of info/refs).

 (2) when doing a fetch with tracking branches (which is what
     your change is about), we would need to run ls-remote to
     find out the remote tags for tag following purposes anyway.
     Running "ls-remote -h" once for your purpose and then
     "ls-remote -t" for tag following later is obviously very
     wasteful.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Adding glob support to remotes
  2006-11-22 12:56 ` Junio C Hamano
@ 2006-11-22 14:41   ` Andy Parkins
  2006-11-22 20:50     ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Andy Parkins @ 2006-11-22 14:41 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

On Wednesday 2006 November 22 12:56, Junio C Hamano wrote:

> > However, git-ls-remote needs the name of the remote repository (of
> > course), but that isn't directly available in git-parse-remote.sh.
>
> Is it really the case?  I do not remember the details offhand,
> but I do not think canon_refs_list_for_fetch is the function you
> should be messing with to implement the remote."origin".fetch
> stuff.  It should be get_remote_default_refs_for_fetch().  The
> function returns the list based on which remote, so it surely
> knows which remote the caller is talking about.

The problem is that canon_refs_list_for_fetch bombs out too early because "*" 
is not an acceptable name for a ref.

> However, I would recommend against actually running ls-remote to
> help "git-fetch" inside git-parse-remote.sh.  I think you should
> run ls-remote upfront early in git-fetch because there are at
> least two other parts in git-fetch that wants the same ls-remote
> output:

Okay.  That's what I'll do.  It means altering git-check-ref-format to prevent 
the early bomb out.  Perhaps I should move this check to somewhere after I've 
done the reflist expansion?

>  (1) dumb protocols currently cannot deal with a remote that has

I'm not sure I've understood this point.  I shall look at git-fetch.sh more 
closely to try and address this though.

>  (2) when doing a fetch with tracking branches (which is what

Accepted.



Andy

-- 
Dr Andy Parkins, M Eng (hons), MIEE

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Adding glob support to remotes
  2006-11-22 14:41   ` Andy Parkins
@ 2006-11-22 20:50     ` Junio C Hamano
  2006-11-23  8:44       ` Andy Parkins
  0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2006-11-22 20:50 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

Andy Parkins <andyparkins@gmail.com> writes:

> On Wednesday 2006 November 22 12:56, Junio C Hamano wrote:
>
>> > However, git-ls-remote needs the name of the remote repository (of
>> > course), but that isn't directly available in git-parse-remote.sh.
>>
>> Is it really the case?  I do not remember the details offhand,
>> but I do not think canon_refs_list_for_fetch is the function you
>> should be messing with to implement the remote."origin".fetch
>> stuff.  It should be get_remote_default_refs_for_fetch().  The
>> function returns the list based on which remote, so it surely
>> knows which remote the caller is talking about.
>
> The problem is that canon_refs_list_for_fetch bombs out too early because "*" 
> is not an acceptable name for a ref.

I do not understand and that is why I do not understand why you
would need to touch check-ref-format.

"remote.$1.fetch" is read by get_remote_default_refs_for_fetch
and currently the value is given canon_refs_list_for_fetch
without any preprocessing.  My suggestion is to catch the
"wildcard" in get_remote_default_refs_for_fetch, do your
replacement based on an earlier ls-remote output, and give
replaced values to canon_refs_list_for_fetch.  That way I do not
think check-ref-format that is used by canon_refs_list_for_fetch
needs to see any wildcard.

Both config and remotes but not branches _could_ have wildcard
so you would need to do the same wildcard replacement in two
case arms of get_remote_default_refs_for_fetch but I think that
can be done in a common helper function.

If on the other hand if you want to do this in canon_*, then you
could do that before the "for ref" loop to set "$@" to the
wildcard-expanded form.  But there are other codepaths that use
this function and I do not know if you want to apply the
wildcarding to all of them (I haven't checked all the callers).
If you are sure all the callers want the wildcarding the same
way, it would be more appropriate to do so there than doing it
in get_remote_default_refs_for_fetch as I suggested in the
above.  I.e. something like this perhaps.

 canon_refs_list_for_fetch () {
 ...
 			    --get-all "branch.${curr_branch}.merge")
 		fi
 	fi
+
+	expanded_ref_list=
+	for ref
+	do
+		if ref is wildcard
+		then
+			compute wildcard replacement and set it to nref
+			expanded_ref_list="$expanded_ref_list $nref"
+		else
+			expanded_ref_list="$expanded_ref_list $ref"
+		fi
+	done
+	set x $expanded_ref_list
+	shift
+
 	for ref
 	do
 		force=


>>  (1) dumb protocols currently cannot deal with a remote that has
>
> I'm not sure I've understood this point.  I shall look at git-fetch.sh more 
> closely to try and address this though.

I was talking about this part of the code:

      case "$remote" in
      http://* | https://* | ftp://*)
	  if [ -n "$GIT_SSL_NO_VERIFY" ]; then
	      curl_extra_args="-k"
	  fi
	  if [ -n "$GIT_CURL_FTP_NO_EPSV" -o \
		"`git-repo-config --bool http.noEPSV`" = true ]; then
	      noepsv_opt="--disable-epsv"
	  fi
	  max_depth=5
	  depth=0
	  head="ref: $remote_name"
	  while (expr "z$head" : "zref:" && expr $depth \< $max_depth) >/dev/null
	  do
	    remote_name_quoted=$(@@PERL@@ -e '
	      my $u = $ARGV[0];
              $u =~ s/^ref:\s*//;
	      $u =~ s{([^-a-zA-Z0-9/.])}{sprintf"%%%02x",ord($1)}eg;
	      print "$u";
	  ' "$head")
	    head=$(curl -nsfL $curl_extra_args $noepsv_opt "$remote/$remote_name_quoted")
	    depth=$( expr \( $depth + 1 \) )
	  done
	  expr "z$head" : "z$_x40\$" >/dev/null ||
	      die "Failed to fetch $remote_name from $remote"
	  echo >&2 Fetching "$remote_name from $remote" using http
	  git-http-fetch -v -a "$head" "$remote/" || exit
	  ;;

The while loop is initialized with "ref: refs/heads/$branch", 
extracts the "refs/heads/$branch" from it by hand, and runs
"curl -nsfL" to get http://host/repo.git/refs/heads/$branch,
until it is not a symref anymore (refs/heads/$branch _could_
be a symref that points at refs/heads/$another, and that is the
loop is about).  At the end of the loop, $head would contain
what is read from the ref -- 40-byte object name.  That is given
to git-http-fetch.

However, the above assumes $GIT_DIR/refs/heads/$branch file
exists at the remote end for "curl -nsfL" to fetch.  When refs
are packed-and-pruned, refs/heads/$branch may not exist as a
standalone file; the right way to learn "what object does a ref
point at" is to ask ls-remote, and if you do one ls-remote
upfront then all you need to do here is a single grep.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Adding glob support to remotes
  2006-11-22 20:50     ` Junio C Hamano
@ 2006-11-23  8:44       ` Andy Parkins
  0 siblings, 0 replies; 5+ messages in thread
From: Andy Parkins @ 2006-11-23  8:44 UTC (permalink / raw)
  To: git

On Wednesday 2006 November 22 20:50, Junio C Hamano wrote:

> I do not understand and that is why I do not understand why you
> would need to touch check-ref-format.

The problem is that I didn't want to call git-ls-remote in git-parse-remote, 
as git-parse-remote doesn't know about $uploadpack or what the actual remote 
in use is.  It seemed fairly intrusive to completely alter the functions in 
git-parse-remote; which is obviously a set of library functions.

Instead I did the expansion in git-fetch.sh.  Unfortunately that meant 
disabling the check-ref-format call in canon_refs_list_for_fetch.

> point at" is to ask ls-remote, and if you do one ls-remote
> upfront then all you need to do here is a single grep.

I understand now.  I'll look again at this tonight and send a further patch.


Andy
-- 
Dr Andy Parkins, M Eng (hons), MIEE

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2006-11-23  8:44 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-11-22  9:04 Adding glob support to remotes Andy Parkins
2006-11-22 12:56 ` Junio C Hamano
2006-11-22 14:41   ` Andy Parkins
2006-11-22 20:50     ` Junio C Hamano
2006-11-23  8:44       ` Andy Parkins

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).