git.vger.kernel.org archive mirror
* How to not download objects more than needed?
@ 2006-02-21 19:38 Radoslaw Szkodzinski
       [not found] ` <20060221161340.73a19228.seanlkml@sympatico.ca>
  2006-02-21 21:32 ` How to not download objects more than needed? Junio C Hamano
  0 siblings, 2 replies; 10+ messages in thread
From: Radoslaw Szkodzinski @ 2006-02-21 19:38 UTC (permalink / raw)
  To: Git Mailing List


I have a peculiar, but common use case for git.

I have linux-2.6 repository pulled and I'd like to download some branch
(say, netdev-2.6), which uses many of the same objects,
but not to get all the objects from the git server.

I've already tried certain commands, but still can't do it,
and my bandwidth isn't too happy about it.

It seems to require some kind of HEAD rewinding,
or maybe fetching to another branch, I don't know.

Would anyone care to help?

-- 
GPG Key id:  0xD1F10BA2
Fingerprint: 96E2 304A B9C4 949A 10A0  9105 9543 0453 D1F1 0BA2

AstralStorm




* Re: How to not download objects more than needed?
       [not found] ` <20060221161340.73a19228.seanlkml@sympatico.ca>
@ 2006-02-21 21:13   ` sean
  2006-02-22  0:42     ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: sean @ 2006-02-21 21:13 UTC (permalink / raw)
  To: Radoslaw Szkodzinski; +Cc: git

On Tue, 21 Feb 2006 20:38:42 +0100
Radoslaw Szkodzinski <astralstorm@gorzow.mm.pl> wrote:

> I have a peculiar, but common use case for git.

It's not really that peculiar.

> I have linux-2.6 repository pulled and I'd like to download some branch
> (say, netdev-2.6), which uses many of the same objects,
> but not to get all the objects from the git server.

Just make sure you're not using the rsync protocol.   Using the
native git protocol would be best.

> I've already tried certain commands, but still can't do it,
> and my bandwidth isn't too happy about it.

For instance, make sure your current linus repository is up to date 
with a "git pull" and then:

git fetch \
   git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git \
   upstream:netdev

will take the "upstream" branch from the netdev repository and name it 
netdev in your local repository.
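
A minimal sketch of how the "upstream:netdev" argument above splits into
a remote side and a local side.  The helper names refspec_src and
refspec_dst are made up for illustration and are not git commands; a
leading "+" and other refspec details are ignored here.

```shell
#!/bin/sh
# Sketch: split a "<src>:<dst>" refspec the way the fetch above uses it.
# refspec_src / refspec_dst are hypothetical helper names, not git commands.

refspec_src () {
	# Everything before the first colon: the branch name on the remote side.
	printf '%s\n' "${1%%:*}"
}

refspec_dst () {
	# Everything after the first colon: the local branch to store into.
	# A refspec without a colon has no local destination.
	case "$1" in
	*:*) printf '%s\n' "${1#*:}" ;;
	*)   printf '\n' ;;
	esac
}

spec=upstream:netdev
echo "remote branch: $(refspec_src "$spec")"
echo "local branch:  $(refspec_dst "$spec")"
```

With a destination given, the fetched head is stored under that local
name, so later fetches have a ref to start from.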

Sean


* Re: How to not download objects more than needed?
  2006-02-21 19:38 How to not download objects more than needed? Radoslaw Szkodzinski
       [not found] ` <20060221161340.73a19228.seanlkml@sympatico.ca>
@ 2006-02-21 21:32 ` Junio C Hamano
  1 sibling, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2006-02-21 21:32 UTC (permalink / raw)
  To: Radoslaw Szkodzinski; +Cc: git

Radoslaw Szkodzinski <astralstorm@gorzow.mm.pl> writes:

> I have a peculiar, but common use case for git.
>
> I have linux-2.6 repository pulled and I'd like to download some branch
> (say, netdev-2.6), which uses many of the same objects,
> but not to get all the objects from the git server.
>
> I've already tried certain commands, but still can't do it,
> and my bandwidth isn't too happy about it.
>
> It seems to require some kind of HEAD rewinding,
> or maybe fetching to another branch, I don't know.
>
> Would anyone care to help?

It is not peculiar at all.  The tools already should do what you
want:

           o---o---o---...---o (netdev-2.6)
          /
         / < netdev forked some time ago.
        /
    ---o---o---o---o---...---o---o---o (linus tip)
               ^v2.6.16-rc3      ^v2.6.16-rc4 

Suppose the "global" ancestry graph looks like the above, and
netdev-2.6 has not been merged into Linus's tree.

What you have, already pulled from Linus, is:

    ---o---o---o---o---...---o---o---o (linus tip)
               ^v2.6.16-rc3      ^v2.6.16-rc4 

And suppose what the netdev tree has is something like this:

           o---o---o---...---o (netdev-2.6)
          /
         / < netdev forked some time ago.
        /
    ---o---o---o
               ^v2.6.16-rc3

The point is that the netdev tree does not know about the Linus
tip you have.

When you "git fetch git://.../netdev-2.6.git/", a program that
runs on your end (git-fetch-pack) and another program that runs
on the other end (git-upload-pack) talk to each other to find out
what both of you have in common.  Your side starts from the Linus
tip and goes backwards, telling the other end "I have this, I have
that, ...".  At first, the netdev side will not see anything it
knows about, but after a while, it will see a commit both of you
have (i.e. where the branch forked from).  After they find that
out, your side tells the other side "I want your netdev-2.6 head".

The other side sends the objects needed to complete the chain up
to the requested head, assuming that your side has the objects
reachable from the common ancestor point (again, the fork point,
but it could be some revs after that if the graph looked like the
above picture).  Objects behind the fork point do not need to
be sent.
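
The "I have this, I have that" exchange can be sketched as a toy model.
This is only an illustration under simplifying assumptions: commits are
plain labels invented for the example, history is a flat list, and
nothing here is the real git-fetch-pack / git-upload-pack protocol.

```shell
#!/bin/sh
# Toy model of the "I have this, I have that" exchange described above.
# Commits are plain labels; nothing here talks to a real git server.

# remote_knows <commit> <remote-commits...>: succeed if the other end has it.
remote_knows () {
	wanted=$1
	shift
	for c in "$@"
	do
		test "$c" = "$wanted" && return 0
	done
	return 1
}

# first_common "<local commits, tip first>" "<remote commits>"
# Walk our history backwards and print the first commit both sides have.
first_common () {
	for have in $1
	do
		if remote_knows "$have" $2
		then
			printf '%s\n' "$have"
			return 0
		fi
	done
	return 1
}

# Our side: Linus tip back to the fork point and beyond.
ours="tip rc4 c3 rc3 fork base"
# Their side: the netdev commits plus the history they know, up to rc3.
theirs="n3 n2 n1 rc3 fork base"

first_common "$ours" "$theirs"
```

In this model the walk stops at "rc3", which is a few revs after the
fork point, so only the netdev-side commits would need to be sent.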


* Re: How to not download objects more than needed?
  2006-02-21 21:13   ` sean
@ 2006-02-22  0:42     ` Linus Torvalds
  2006-02-22  1:13       ` Jan Harkes
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2006-02-22  0:42 UTC (permalink / raw)
  To: sean; +Cc: Radoslaw Szkodzinski, git



On Tue, 21 Feb 2006, sean wrote:
> 
> > I have linux-2.6 repository pulled and I'd like to download some branch
> > (say, netdev-2.6), which uses many of the same objects,
> > but not to get all the objects from the git server.
> 
> Just make sure you're not using the rsync protocol.   Using the
> native git protocol would be best.

Side note: the "automatic tag following" is broken wrt pulling unnecessary 
objects, even with the git protocol.

What happens is that if you don't explicitly have a branch for what you 
are pulling, and you do something like

	git pull git://git.kernel.org/....

and the automatic tag following kicks in, it will first have fetched the 
objects once, and then when it tries to fetch the tag objects, it will 
fetch the objects it already fetched _again_ (plus the tags), because it 
will do the same object pull, but the temporary branch (to be merged) will 
never have been written as a branch head.

So you'll see something like

	Generating pack...
	Done counting <x> objects.
	Packing <x> objects.......................
	Unpacking <x> objects
	 100% (<x>/<x>) done
	Auto-following refs/tags/v1.2.2
	Generating pack...
	Done counting <x+1> objects.
	Packing <x+1> objects.......................
	Unpacking <x+1> objects
	 100% (<x+1>/<x+1>) done
	* refs/tags/v1.2.2: storing tag 'v1.2.2' of master.kernel.org:/pub/scm/git/git

just because we hadn't updated any refs before we started re-fetching more 
objects.

So we do have cases where we fetch unnecessarily even with the native 
protocol.

		Linus


* Re: How to not download objects more than needed?
  2006-02-22  0:42     ` Linus Torvalds
@ 2006-02-22  1:13       ` Jan Harkes
  2006-02-22  1:55         ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Harkes @ 2006-02-22  1:13 UTC (permalink / raw)
  To: git

On Tue, Feb 21, 2006 at 04:42:34PM -0800, Linus Torvalds wrote:
> 
> 	git pull git://git.kernel.org/....
> 
> and the automatic tag following kicks in, it will first have fetched the 
> objects once, and then when it tries to fetch the tag objects, it will 
> fetch the objects it already fetched _again_ (plus the tags), because it 
> will do the same object pull, but the temporary branch (to be merged) will 
> never have been written as a branch head.

Isn't this easily avoided by fetching the tags first?

Jan


diff --git a/git-fetch.sh b/git-fetch.sh
index b4325d9..9c6748f 100755
--- a/git-fetch.sh
+++ b/git-fetch.sh
@@ -363,8 +363,6 @@ fetch_main () {
 
 }
 
-fetch_main "$reflist"
-
 # automated tag following
 case "$no_tags$tags" in
 '')
@@ -389,6 +387,8 @@ case "$no_tags$tags" in
 	esac
 esac
 
+fetch_main "$reflist"
+
 # If the original head was empty (i.e. no "master" yet), or
 # if we were told not to worry, we do not have to check.
 case ",$update_head_ok,$orig_head," in


* Re: How to not download objects more than needed?
  2006-02-22  1:13       ` Jan Harkes
@ 2006-02-22  1:55         ` Junio C Hamano
  2006-02-22  3:11           ` Jan Harkes
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2006-02-22  1:55 UTC (permalink / raw)
  To: Jan Harkes; +Cc: git

Jan Harkes <jaharkes@cs.cmu.edu> writes:

> On Tue, Feb 21, 2006 at 04:42:34PM -0800, Linus Torvalds wrote:
>> 
>> 	git pull git://git.kernel.org/....
>> 
>> and the automatic tag following kicks in, it will first have fetched the 
>> objects once, and then when it tries to fetch the tag objects, it will 
>> fetch the objects it already fetched _again_ (plus the tags), because it 
>> will do the same object pull, but the temporary branch (to be merged) will 
>> never have been written as a branch head.
>
> Isn't this easily avoided by fetching the tags first?

I do not think so.

Notice how the tag following code uses cat-file to determine if
the main fetch likely has slurped the object they point at.


* Re: How to not download objects more than needed?
  2006-02-22  1:55         ` Junio C Hamano
@ 2006-02-22  3:11           ` Jan Harkes
  2006-02-22  3:22             ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Harkes @ 2006-02-22  3:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Feb 21, 2006 at 05:55:48PM -0800, Junio C Hamano wrote:
> Jan Harkes <jaharkes@cs.cmu.edu> writes:
> > On Tue, Feb 21, 2006 at 04:42:34PM -0800, Linus Torvalds wrote:
> >> 
> >> 	git pull git://git.kernel.org/....
> >> 
> >> and the automatic tag following kicks in, it will first have fetched the 
> >> objects once, and then when it tries to fetch the tag objects, it will 
> >> fetch the objects it already fetched _again_ (plus the tags), because it 
> >> will do the same object pull, but the temporary branch (to be merged) will 
> >> never have been written as a branch head.
> >
> > Isn't this easily avoided by fetching the tags first?
> 
> I do not think so.
> 
> Notice how the tag following code uses cat-file to determine if
> the main fetch likely has slurped the object they point at.

Neat, it only fetches tags that refer to things we already have. Hadn't
checked what the automatic tag fetcher was doing.

So either introduce temporary local refs that can be removed once the
tags have been fetched, or else fix it in fetch-pack with the following
change that might do the trick for this case as well. However that one
already got shot down because of possible consistency problems.

    http://marc.theaimsgroup.com/?l=git&m=113030081014456&w=2

Jan


* Re: How to not download objects more than needed?
  2006-02-22  3:11           ` Jan Harkes
@ 2006-02-22  3:22             ` Junio C Hamano
  2006-02-22 21:12               ` [PATCH] git-fetch: follow tag only when tracking remote branch Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2006-02-22  3:22 UTC (permalink / raw)
  To: Jan Harkes; +Cc: git

Jan Harkes <jaharkes@cs.cmu.edu> writes:

> Neat, it only fetches tags that refer to things we already have. Hadn't
> checked what the automatic tag fetcher was doing.
>
> So either introduce temporary local refs that can be removed once the
> tags have been fetched,...

I think it is enough just to disable tag following when you are
promiscuously fetching.  That is, do the tag following only if
the main fetch is going to store a ref because it has tracking
branch for the remote side.  Otherwise the remote tags do not
matter and if you really care about them you can ask with --tags.
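
A minimal sketch of that rule, assuming the list of things to fetch is
written as "<remote ref>:<local ref>" with full ref names.  The helper
name should_follow_tags and the sample reflist values are hypothetical.

```shell
#!/bin/sh
# Sketch of the rule proposed above: follow tags only when the fetch
# stores its result in a local ref.  should_follow_tags is a hypothetical
# helper name; the "*:refs/*" test assumes the refspec list has already
# been expanded to full ref names.

should_follow_tags () {
	case "$1" in
	*:refs/*) return 0 ;;   # some refspec has a local destination
	*)        return 1 ;;   # one-shot fetch, nothing stored locally
	esac
}

for reflist in "refs/heads/upstream:refs/heads/netdev" "refs/heads/master:"
do
	if should_follow_tags "$reflist"
	then
		echo "$reflist -> follow tags"
	else
		echo "$reflist -> skip tags"
	fi
done
```

A one-shot "please pull" has an empty right-hand side, so it would skip
tag following under this rule.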


* [PATCH] git-fetch: follow tag only when tracking remote branch.
  2006-02-22  3:22             ` Junio C Hamano
@ 2006-02-22 21:12               ` Junio C Hamano
  2006-02-22 21:19                 ` Andreas Ericsson
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2006-02-22 21:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Jan Harkes

Unless the --no-tags flag was given, git-fetch always tried to
follow remote tags that point at the commits we picked up.

It is not very useful to pick up tags from the remote unless we
are storing the fetched branch head in a local tracking branch.
This is especially true if the fetch is done to merge the remote
branch into our current branch on a one-shot basis (i.e. "please
pull"), and is even harmful if the remote repository has many
irrelevant tags.

This proposed update disables the automated tag following unless
we are storing a fetched branch head in a local tracking
branch.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 * Likes, dislikes?

 git-fetch.sh |   33 +++++++++++++++++++--------------
 1 files changed, 19 insertions(+), 14 deletions(-)

f41b73b4c2e0313d1638261ed87f0921e47904b2
diff --git a/git-fetch.sh b/git-fetch.sh
index b4325d9..fcc24f8 100755
--- a/git-fetch.sh
+++ b/git-fetch.sh
@@ -368,20 +368,25 @@ fetch_main "$reflist"
 # automated tag following
 case "$no_tags$tags" in
 '')
-	taglist=$(IFS=" " &&
-	git-ls-remote $upload_pack --tags "$remote" |
-	sed -ne 's|^\([0-9a-f]*\)[ 	]\(refs/tags/.*\)^{}$|\1 \2|p' |
-	while read sha1 name
-	do
-		test -f "$GIT_DIR/$name" && continue
-	  	git-check-ref-format "$name" || {
-			echo >&2 "warning: tag ${name} ignored"
-			continue
-		}
-		git-cat-file -t "$sha1" >/dev/null 2>&1 || continue
-		echo >&2 "Auto-following $name"
-		echo ".${name}:${name}"
-	done)
+	case "$reflist" in
+	*:refs/*)
+		# effective only when we are following remote branch
+		# using local tracking branch.
+		taglist=$(IFS=" " &&
+		git-ls-remote $upload_pack --tags "$remote" |
+		sed -ne 's|^\([0-9a-f]*\)[ 	]\(refs/tags/.*\)^{}$|\1 \2|p' |
+		while read sha1 name
+		do
+			test -f "$GIT_DIR/$name" && continue
+			git-check-ref-format "$name" || {
+				echo >&2 "warning: tag ${name} ignored"
+				continue
+			}
+			git-cat-file -t "$sha1" >/dev/null 2>&1 || continue
+			echo >&2 "Auto-following $name"
+			echo ".${name}:${name}"
+		done)
+	esac
 	case "$taglist" in
 	'') ;;
 	?*)
-- 
1.2.2.ga35e


* Re: [PATCH] git-fetch: follow tag only when tracking remote branch.
  2006-02-22 21:12               ` [PATCH] git-fetch: follow tag only when tracking remote branch Junio C Hamano
@ 2006-02-22 21:19                 ` Andreas Ericsson
  0 siblings, 0 replies; 10+ messages in thread
From: Andreas Ericsson @ 2006-02-22 21:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git, Jan Harkes

Junio C Hamano wrote:
> Unless --no-tags flag was given, git-fetch tried to always
> follow remote tags that point at the commits we picked up.
> 
> It is not very useful to pick up tags from remote unless storing
> the fetched branch head in a local tracking branch.  This is
> especially true if the fetch is done to merge the remote branch
> into our current branch as one-shot basis (i.e. "please pull"),
> and is even harmful if the remote repository has many irrelevant
> tags.
> 
> This proposed update disables the automated tag following unless
> we are storing a fetched branch head in a local tracking
> branch.
> 
> Signed-off-by: Junio C Hamano <junkio@cox.net>
> 
> ---
> 
>  * Likes, dislikes?
> 

Likes a lot. This is a Good Thing.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231


end of thread, other threads:[~2006-02-22 21:19 UTC | newest]

Thread overview: 10+ messages
2006-02-21 19:38 How to not download objects more than needed? Radoslaw Szkodzinski
     [not found] ` <20060221161340.73a19228.seanlkml@sympatico.ca>
2006-02-21 21:13   ` sean
2006-02-22  0:42     ` Linus Torvalds
2006-02-22  1:13       ` Jan Harkes
2006-02-22  1:55         ` Junio C Hamano
2006-02-22  3:11           ` Jan Harkes
2006-02-22  3:22             ` Junio C Hamano
2006-02-22 21:12               ` [PATCH] git-fetch: follow tag only when tracking remote branch Junio C Hamano
2006-02-22 21:19                 ` Andreas Ericsson
2006-02-21 21:32 ` How to not download objects more than needed? Junio C Hamano
