* Clone a repository with only the objects needed for a single tag
@ 2005-11-02 2:02 Ben Lau
2005-11-02 7:01 ` Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Ben Lau @ 2005-11-02 2:02 UTC (permalink / raw)
To: git
Hi,
Is there any method to clone/copy a repository with only the objects
needed for a single tag, in order to save disk space?
For example, suppose I want to start a new project based on a specific
version of the kernel, like v2.6.14. I would run `git-clone` and then
check out a new branch based on the tag.
However, one of the development hosts is a notebook computer which has
only 2GB of space left. Therefore I hope the space occupied by the
repository can be minimized while it remains able to fetch/update from
my master repository.
Thanks for any advice.
* Re: Clone a repository with only the objects needed for a single tag
2005-11-02 2:02 Clone a repository with only the objects needed for a single tag Ben Lau
@ 2005-11-02 7:01 ` Junio C Hamano
2005-11-02 8:27 ` Ben Lau
0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2005-11-02 7:01 UTC (permalink / raw)
To: Ben Lau; +Cc: git
Ben Lau <benlau@ust.hk> writes:
> Is there any method to clone/copy a repository with only the
> objects needed for a single tag in order to save disk space?
> For example, if I want to start a new project based on a
> specific version of kernel like v2.6.14. I would run
> `git-clone` and then checkout a new branch based on the tag.
Depends on what you want to do in that shallow copy.
If the only thing you would want to do is to build it, then you
could 'git-tar-tree v2.6.14' and extract that on your notebook.
The output is just a tar, so there will be no history, though.
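In later Git releases git-tar-tree was superseded by `git archive`. A minimal sketch of the same export, assuming a Git that ships `git archive`; the toy repository, the tag name v0.1, the prefix, and the identity values are placeholders chosen only so the example runs anywhere:

```shell
#!/bin/sh
# Sketch: export the tree of one tag, without history, via `git archive`.
# Repository, tag, prefix and identity are hypothetical placeholders.
set -e

work=$(mktemp -d)
git init -q "$work/mothership"
cd "$work/mothership"
echo 'hello' > README
git add README
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'first'
git tag v0.1

# Stream the tagged tree as a tar archive and unpack it elsewhere.
mkdir "$work/export"
git archive --format=tar --prefix=project-v0.1/ v0.1 |
tar -xf - -C "$work/export"

# The export holds the files only; there is no .git directory in it.
ls "$work/export/project-v0.1"
```

The result is a plain source tree, so it suits building but not committing.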
If you want to develop while on the road, but do not
particularly need to be able to inspect the history beyond the
point you started, you could create a deliberately broken
repository, using the git-shallow-pack script (attached), like
this:
    $ git clone -n $mothership satellite
    $ cd satellite
    $ git-shallow-pack --all
    $ rm -f .git/objects/pack/pack-*
    $ mv pack-* .git/objects/pack/.
If the original repository you are cloning from is local, you
could instead do:
    $ git clone -l -s -n $mothership satellite
    $ cd satellite
    $ git-shallow-pack --all
    $ rm .git/objects/info/alternates
    $ mv pack-* .git/objects/pack/.
You could develop in this repository, even build up your own
commit chains, and when you come back you could push from this
repository back to your 'mothership' repository. In essence,
any operation that does not require you to have full history
should work.
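Git later gained built-in shallow clones, which cover much of this workflow without a hand-built pack. A minimal sketch, assuming a Git recent enough to accept `--depth` together with `--branch <tag>`; the toy repository, the tag name v0.1, and the identity values are placeholders:

```shell
#!/bin/sh
# Sketch: clone only the objects needed for one tag with a built-in
# shallow clone. All names below are hypothetical placeholders.
set -e

work=$(mktemp -d)
git init -q "$work/mothership"
cd "$work/mothership"
for n in 1 2 3
do
    echo "$n" > file
    git add file
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "commit $n"
done
git tag v0.1

# A file:// URL is used because --depth is ignored for plain local paths.
git clone -q --depth 1 --branch v0.1 \
    "file://$work/mothership" "$work/satellite"

# Number of commits visible from HEAD in the shallow clone.
git -C "$work/satellite" rev-list --count HEAD
```

Unlike the deliberately broken repository above, a clone made this way records its boundary in .git/shallow, so Git itself knows where the history stops.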
One important thing that would not always work would be to pull
into this repository over git-aware protocols, although pulling
from your 'mothership' repository would probably work most of
the time.
One case that would probably break is if the mothership side
reverted a commit beyond this shallow-pack boundary and then you
try to pull from there. After the revert, the trees and blobs
in that new commit you will be pulling from the mothership are
likely to be the same as the ones contained in commits before
the shallow clone is made. Because your satellite repo would
claim to have everything that is reachable from the tip (as of
the time the clone was made) of the branch, you cannot complain
if the mothership side assumes you must have those blobs and
trees and did not send them to you when you pull.
---
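#!/bin/sh
# git-shallow-pack
#
# Pack only the objects named on the command line (default: HEAD)
# plus the trees and blobs they reference, with no history behind them.

git-rev-parse --revs-only --no-flags --default HEAD "$@" |
while read sha1
do
    echo "$sha1"
    # Peel tag objects until we reach the object they point at.
    while type=`git-cat-file -t "$sha1"` &&
          case "$type" in tag) ;; *) break ;; esac
    do
        next=`git-cat-file tag "$sha1" |
              sed -ne 's/^object //p' -e q`
        echo "$next"
        sha1="$next"
    done
    # Emit the top-level tree, then every entry below it.
    git-rev-parse --verify "$sha1^{tree}" 2>/dev/null &&
    git-ls-tree -r "$sha1" | sed -e 's/^[0-7]* [^ ]* //'
done |
sort -k 1,1 -u |
git-pack-objects pack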
* Re: Clone a repository with only the objects needed for a single tag
2005-11-02 7:01 ` Junio C Hamano
@ 2005-11-02 8:27 ` Ben Lau
2005-11-02 8:49 ` Andreas Ericsson
2005-11-02 9:20 ` Junio C Hamano
0 siblings, 2 replies; 6+ messages in thread
From: Ben Lau @ 2005-11-02 8:27 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Hi Junio,
It works! Thanks a lot.
However, there is a problem when gitk/git-log is involved. Because the
parent commit is missing from the shallow repository, gitk complains
that the object is missing and exits immediately. To work around the
problem, I added a pair of IDs of the new root object to
.git/info/grafts.
Example:
$ cat .git/info/grafts
741b2252a5e14d6c60a913c77a6099abe73a854a
741b2252a5e14d6c60a913c77a6099abe73a854a
git-log/gitk do not complain afterwards, but gitk now shows nothing
when run. Any solution?
By the way, although I am not sure whether other people also need
this feature, I wish the process could be smoother. Since the
git-shallow-pack script does not destroy or modify anything in the
cloned repository besides the newly created pack file, I would suggest
running the script in the 'mothership' repository, taking the pack
file out to another directory or a remote machine, and then building a
new git repository from the pack file.
To do that, another script is needed that can create a git repository
from a pack file. Does any similar script exist?
* Re: Clone a repository with only the objects needed for a single tag
2005-11-02 8:27 ` Ben Lau
@ 2005-11-02 8:49 ` Andreas Ericsson
2005-11-02 9:10 ` Ben Lau
2005-11-02 9:20 ` Junio C Hamano
1 sibling, 1 reply; 6+ messages in thread
From: Andreas Ericsson @ 2005-11-02 8:49 UTC (permalink / raw)
To: git
Ben Lau wrote:
> Hi Junio,
>
> It works! Thanks a lot.
>
> However, it has a problem when involves the gitk/git-log.
Both those programs use the history, which you don't have. With a
shallow repository like this some commands just won't work. git-*log and
gitk are among those.
>
> git-log/gitk do not complains afterward, but it also make gitk shows
> nothing during run. Any solution?
>
* Get a second machine with more disk-space and run the history tools
there. If you use it as mothership and push your commits to it you'll be
able to track your changes from the laptop as well.
* Get a larger disk. They're not terribly expensive nowadays.
* Check out one tag (i.e. release) more than you need. Then you'll get
history back to that tag.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
* Re: Clone a repository with only the objects needed for a single tag
2005-11-02 8:49 ` Andreas Ericsson
@ 2005-11-02 9:10 ` Ben Lau
0 siblings, 0 replies; 6+ messages in thread
From: Ben Lau @ 2005-11-02 9:10 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: git
Andreas Ericsson wrote:
> Ben Lau wrote:
>
>> Hi Junio,
>>
>> It works! Thanks a lot.
>>
>> However, it has a problem when involves the gitk/git-log.
>
>
>
> Both those programs use the history, which you don't have. With a
> shallow repository like this some commands just won't work. git-*log
> and gitk are among those.
Yes, that is expected. However, I think a fake parent, such as
.git/info/grafts can provide, may solve the issue. At least it lets
git-log show the only log message in the shallow repository without
any error.
The remaining problem is that gitk does not show any item the way
git-log does.
I am not sure whether it is right to put a pair of identical IDs into
.git/info/grafts, or whether this should be solved by a small patch
to gitk.
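On the format question: each line of .git/info/grafts names a commit followed by the IDs it should be treated as having for parents, so a line holding a single commit ID presents that commit as a root. A minimal sketch under that reading, using a throwaway repository and placeholder identity values (all hypothetical). It assumes a Git that still honors the grafts file, which later releases deprecated in favor of `git replace`:

```shell
#!/bin/sh
# Sketch: cauterize history at the current tip by writing a graft
# entry with no parents. Repository and identity are placeholders.
set -e

work=$(mktemp -d)
git init -q "$work/satellite"
cd "$work/satellite"
for n in 1 2
do
    echo "$n" > file
    git add file
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "commit $n"
done

# One line, one ID: "this commit has no parents".
tip=$(git rev-parse HEAD)
mkdir -p .git/info
echo "$tip" > .git/info/grafts

cat .git/info/grafts
```

Whether history viewers then stop cleanly at that commit depends on the Git version in use, so this is worth checking with git-log before relying on it.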
* Re: Clone a repository with only the objects needed for a single tag
2005-11-02 8:27 ` Ben Lau
2005-11-02 8:49 ` Andreas Ericsson
@ 2005-11-02 9:20 ` Junio C Hamano
1 sibling, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2005-11-02 9:20 UTC (permalink / raw)
To: Ben Lau; +Cc: git
Ben Lau <benlau@ust.hk> writes:
> However, it has a problem when involves the gitk/git-log.
That's why I said anything that requires you to have a complete
history would not work.
The shallow repository is by definition a *broken* repository;
the refs are supposed to mean the repository has everything
reachable from them, and many tools rely on that assumption, but
the shallow setup deliberately breaks that assumption, so you
need to be aware of what operations you can and cannot do
without having the full history. Although using grafts to
cauterize somewhat helps as you discovered, you are operating in
"do it at your own risk" territory.
Having said that, here are the things that *should* work without
having full history (not many):
 . git diff between your index, working tree, and commits
   your shallow setup happens to have.
 . git commit on top of any of the branch tips your shallow
   copy started out with, including git am and git
   applymbox.
 . git fetch/pull over commit walker from a remote
   repository.
 . git push to send the work done in the shallow repository
   back into your mothership repository (running git pull on
   the mothership to fetch from the shallow copy probably
   would not work, unless you use commit walkers).
 . git whatchanged, git log, and gitk to view the work
   you did in your shallow repository.
Creating packs in the mothership repository is certainly
possible. You could instead do this:
    $ git-shallow-pack --all ;# in mothership
    $ mkdir -p /var/tmp/shallow
    $ tar cf - .git/HEAD .git/refs/ |
      (cd /var/tmp/shallow; git-init-db; tar xf -)
    $ mv pack-* /var/tmp/shallow/.git/objects/pack
If you want a bit deeper history, instead of giving '--all' to
git-shallow-pack, you could probably say something silly like
this:
    $ (git-rev-parse --all; git-rev-list --max-count=20 HEAD) |
      xargs git-shallow-pack
The shallow-pack script I sent earlier is probably not very
useful in practice. To polish it to be somewhat more useful, it
would probably need the following enhancements, at least:
 . Instead of taking 'git-rev-parse' arguments, take the
   names of refs;
 . Instead of packing the objects contained in the
   commits and trees the named refs reference, include
   all blobs/trees/commits between the given refs and
   their common ancestor to create a pack;
 . Create a tarball that contains:
   (1) the pack file created by the above procedure,
       stored in .git/objects/pack/.;
   (2) .git/refs/ to be used in the shallow copy; this
       should include only the refs given to the command
       to create the above pack;
   (3) .git/info/grafts to cauterize the common ancestor
       commit and side branches merged into the lines you
       are taking (computing the latter may be somewhat
       expensive).
Then the command can be run in the mothership repository like
this:
    $ cd linux-2.6
    $ git-shallow-pack v2.6.14 v2.6.14-rc2 master
to produce a tarball, which can be taken to another location.
When extracted, it would contain all commits between the three
named refs, and you could view the history across them.
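The core of such a tool can be sketched with present-day commands. This is only an illustration of the idea, not the polished script described above: it assumes a Git providing `git merge-base --octopus` and `git rev-list --objects --no-walk`, it grafts only the common ancestor (side branches merged along the way are not handled), and the toy repository, the tag names t0..t2, and the identity values are hypothetical:

```shell
#!/bin/sh
# Sketch: pack everything between some refs and their common ancestor,
# then seed a fresh repository from that pack. Names are placeholders.
set -e

work=$(mktemp -d)
git init -q "$work/mothership"
cd "$work/mothership"
for n in 0 1 2
do
    echo "$n" > "file$n"
    git add "file$n"
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "commit $n"
    git tag "t$n"
done
old=$(git rev-parse t0)
tip=$(git rev-parse t2)

# Common ancestor of the refs we want to carry along.
base=$(git merge-base --octopus t1 t2)

# Objects introduced after the base, plus the base commit with its
# complete tree so that a checkout of any carried ref is possible.
{
    git rev-list --objects t1 t2 "^$base"
    git rev-list --objects --no-walk "$base"
} | sort -u | git pack-objects -q "$work/pack" >/dev/null

# Build the shallow copy: pack, refs, and a graft at the base.
git init -q "$work/shallow"
mv "$work"/pack-* "$work/shallow/.git/objects/pack/"
git -C "$work/shallow" update-ref refs/tags/t1 "$base"
git -C "$work/shallow" update-ref refs/tags/t2 "$tip"
mkdir -p "$work/shallow/.git/info"
echo "$base" > "$work/shallow/.git/info/grafts"
```

The commit tagged t0 lies behind the common ancestor, so it never reaches the new repository, while t1 and t2 and everything between them do.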