git.vger.kernel.org archive mirror
From: tony.luck@intel.com
To: "Adam Kropelin" <akropel1@rochester.rr.com>
Cc: "Daniel Barkalow" <barkalow@iabervon.org>
Cc: <git@vger.kernel.org>
Subject: Re: [PATCH] Get commits from remote repositories by HTTP
Date: Sat, 16 Apr 2005 20:16:36 -0700
Message-ID: <200504170316.j3H3GaZ03333@unix-os.sc.intel.com>
In-Reply-To: <011201c542d5$940bb670$03c8a8c0@kroptech.com>

>How about building a file list and doing a batch download via 'wget -i 
>/tmp/foo'? A quick test (on my ancient wget-1.7) indicates that it reuses 
>connections when successive URLs point to the same server.

Here's a script that does just that.  There is a burst of individual
wget commands to get HEAD, the top commit object, and all the tree
objects, then just one more to get all the missing blobs.

Subsequent runs will do far less work as many of the tree objects will
not have changed, so we don't descend into any tree that we already have.
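
To make the pruning rule concrete, here is a rough, network-free sketch of
the decision made for each tree entry. The names object_path and
classify_entries are made up for illustration and are not part of git or
of the script below; the input is assumed to be "mode type sha1 name"
lines, which is the layout the script reads from ls-tree.

```sh
# Sketch only: object_path and classify_entries are hypothetical helpers.

# Map an object name to its path below objects/: the first two hex digits
# name the directory, the remainder names the file.
object_path()
{
	rest=${1#??}
	printf '%s/%s\n' "${1%"$rest"}" "$rest"
}

# Read "mode type sha1 name" lines on stdin; $1 is the local objects
# directory.  The body runs in a subshell so its variables do not leak.
classify_entries()
(
	objdir=$1
	while read -r mode type sha1 name
	do
		path=$(object_path "$sha1")
		# An object already on disk needs no download and, if it
		# is a tree, no descent either.
		if [ -f "$objdir/$path" ]
		then
			continue
		fi
		if [ "$mode" = 40000 ]
		then
			echo "descend $sha1"
		else
			echo "fetch objects/$path"
		fi
	done
)
```

Feeding it the entries of one tree prints a "descend" line for each
missing subtree and a "fetch" line for each missing blob; entries that are
already present produce nothing.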

-Tony

Not a patch ... it is a whole file.  I called it "git-wget", but it might
also want to be called "git-pulltop".

Signed-off-by: Tony Luck <tony.luck@intel.com>

------ script starts here -----
#!/bin/bash
# bash, not plain sh: the ${var:offset:length} expansions below are bashisms

# Copyright (C) 2005 Tony Luck

REMOTE=http://www.kernel.org/pub/linux/kernel/people/torvalds/linux-2.6.git/

rm -rf .gittmp
# set up a temp git repository so that we can use cat-file and ls-tree on the
# objects we pull without installing them into our tree. This allows us to
# restart if the download is interrupted
mkdir .gittmp
cd .gittmp
init-db

wget -q $REMOTE/HEAD

if cmp -s ../.git/HEAD HEAD
then
	echo Already have HEAD = `cat ../.git/HEAD`
	cd ..
	rm -rf .gittmp
	exit 0
fi

sha1=`cat HEAD`
sha1file=${sha1:0:2}/${sha1:2}

if [ -f ../.git/objects/$sha1file ]
then
	echo Already have most recent commit. Update HEAD to $sha1
	cd ..
	rm -rf .gittmp
	exit 0
fi

wget -q $REMOTE/objects/$sha1file -O .git/objects/$sha1file

treesha1=`cat-file commit $sha1 | (read tag tree ; echo $tree)`

get_tree()
{
	treesha1file=${1:0:2}/${1:2}
	if [ -f ../.git/objects/$treesha1file ]
	then
		return
	fi
	wget -q $REMOTE/objects/$treesha1file -O .git/objects/$treesha1file
	ls-tree $1 | while read mode tag sha1 name
	do
		subsha1file=${sha1:0:2}/${sha1:2}
		if [ -f ../.git/objects/$subsha1file ]
		then
			continue
		fi
		if [ $mode = 40000 ]
		then
			get_tree $sha1
		else
			echo objects/$subsha1file >> needbloblist
		fi
	done
}

# get all the tree objects to our .gittmp area, and create list of needed blobs
get_tree $treesha1

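# now get the blobs in a single wget run; --cut-dirs=6 strips the six
# directory components of $REMOTE so the files land under objects/
cd ../.git
if [ -s ../.gittmp/needbloblist ]
then
	wget -q -r -nH --cut-dirs=6 --base=$REMOTE -i ../.gittmp/needbloblist
fi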

# Now we have the blobs, move the trees and commit from .gittmp
cd ../.gittmp/.git/objects
find ?? -type f -print | while read f
do
	mv $f ../../../.git/objects/$f
done

# update HEAD
cd ../..
mv HEAD ../.git

cd ..
rm -rf .gittmp
------ script ends here -----
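
The --cut-dirs=6 in the blob download matches the six directory components
in REMOTE (pub/linux/kernel/people/torvalds/linux-2.6.git), so pointing the
script at another repository means changing that number too. A rough sketch
of deriving the count from the URL instead; count_dirs is a made-up helper
name, and the sketch assumes path components contain no whitespace or glob
characters:

```sh
# Sketch only: count_dirs is a hypothetical helper, not part of wget or git.
# Prints the number of directory components in the path part of a URL.
# The body runs in a subshell so the IFS change does not leak out.
count_dirs()
(
	# Drop the scheme, then the host name up to the first slash.
	path=${1#*://}
	case $path in
	*/*) path=${path#*/} ;;
	*) path= ;;
	esac

	# Split the remainder on slashes and count the non-empty pieces.
	n=0
	IFS=/
	for part in $path
	do
		if [ -n "$part" ]
		then
			n=$((n + 1))
		fi
	done
	echo "$n"
)
```

With something like this in place the wget call could be given
--cut-dirs=$(count_dirs $REMOTE) rather than a hard-coded 6.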
