* updating only changed files source directory?
@ 2006-10-24 1:33 Han-Wen Nienhuys
2006-10-24 5:55 ` Shawn Pearce
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Han-Wen Nienhuys @ 2006-10-24 1:33 UTC (permalink / raw)
To: git
Hello there,
I'm just starting out with GIT. Initially, I want to experiment
with integrating it into our binary builder structure for LilyPond.
The binary builder roughly does this:
1. get source code updates from a server to a single, local
repository. This is currently a git repository that
tracks our CVS server.
2. copy latest commit from a branch to separate source directory.
This copy should only update files that changed.
3. Incrementally compile from that source directory
The binary builder does this for several branches and several
platforms of the project. Due to parallel compilation, it might even
be possible that different branches are being checked out
concurrently from a single repository.
For a VCS, this is slightly nonstandard use, as we don't do any work
in the working dir, we just compile from it, but have many working
directories.
I have some questions and remarks
* Is there a command analogous to git-clone for updating a repository?
Right now, I'm using a combination of
git-http-fetch -a <branch> <url>
wget <url>/refs/head/<branch> ## dump to <myrepo>/refs/head/<branch>
for all branches I want to know about. I was looking for a command
that would update the heads of all branches.
* Why is the order of args in git-http-fetch inconsistent with the
order in git-fetch? In fetch, the repository comes first; in
http-fetch, it comes last.
* How do I update a source directory?
I can do the following
git --git-dir <myrepo> read-tree <committish>
cd <srcdir>
git --git-dir <myrepo> checkout-index -a -f
Unfortunately, this touches all files, which messes up the timestamps
triggering needless recompilation. How can I make checkout-index only
touch files that have changed? Or alternatively, make checkout-index
remember timestamps on files that didn't change?
Of course, I can store the committish of the last version of the
srcdir, and apply the diff between both to the source directory, but
that seems somewhat convoluted. Is there a better way?
* As far as I can see, there is no reason to have only one index in a
git repository. Why isn't it possible to specify an alternate
index-file with an option similar to --git-dir ?
--
Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen
* Re: updating only changed files source directory?
2006-10-24 1:33 updating only changed files source directory? Han-Wen Nienhuys
@ 2006-10-24 5:55 ` Shawn Pearce
2006-10-24 7:48 ` Jakub Narebski
2006-10-24 19:12 ` Daniel Barkalow
2 siblings, 0 replies; 8+ messages in thread
From: Shawn Pearce @ 2006-10-24 5:55 UTC (permalink / raw)
To: Han-Wen Nienhuys; +Cc: git
Han-Wen Nienhuys <hanwen@xs4all.nl> wrote:
> For a VCS, this is slightly nonstandard use, as we don't do any work
> in the working dir, we just compile from it, but have many working
> directories.
It's not a nonstandard use. A lot of projects perform rolling builds
that trigger any time there are changes; very active projects
would always be building and thus would always want the
VCS to update only those files that actually changed, to minimize
compile time.
> I have some questions and remarks
>
> * Is there a command analogous to git-clone for updating a repository?
> Right now, I'm using a combination of
Yes, they are called git-fetch and git-pull. Which leads us to...
> git-http-fetch -a <branch> <url>
> wget <url>/refs/head/<branch> ## dump to <myrepo>/refs/head/<branch>
>
> for all branches I want to know about. I was looking for a command
> that would update the heads of all branches.
Why not use git-fetch?
Create a .git/remotes file named 'origin' and put in there the URL
you want to fetch from and the list of branches you want to download
and keep current.
Then downloading the changes to the build repository is as simple
as running `git-fetch` with no parameters (as it defaults to reading
the origin file).
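A minimal sketch of what such a remotes file could look like, written from the shell. The URL and the branch names are hypothetical placeholders, and $repo is a scratch directory standing in for the build repository. Each "Pull:" line maps a branch on the server to the local branch that tracks it. Newer Git releases normally keep the same information in .git/config (see git-remote), so treat this only as an illustration of the layout described above.

```shell
set -eu

# Scratch directory standing in for the build repository (assumption).
repo=$(mktemp -d)
mkdir -p "$repo/.git/remotes"

# Legacy remotes file: one URL line, then one Pull line per branch.
# URL and branch names are made up for illustration.
cat > "$repo/.git/remotes/origin" <<'EOF'
URL: http://example.org/project.git
Pull: refs/heads/master:refs/heads/origin
Pull: refs/heads/stable:refs/heads/stable-upstream
EOF

# Count the branches that a plain fetch would keep current.
pull_lines=$(grep -c '^Pull:' "$repo/.git/remotes/origin")
echo "tracking $pull_lines branches"
```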
> * How do I update a source directory?
Always keep the source directory on a branch that is not listed
in the .git/remotes/origin file. This way the fetch never has to
update the branch that is checked out, so it will always succeed.
Then you can do after the fetch:
git-reset --hard <committish>
and the source directory will be updated to <committish> (which
could just be a branch name of one of those branches you fetch,
or could be a full SHA1, or a tag, etc.).
The reset --hard process will only change the files that really have
to change. This means its run time is proportional to the
number of files needing to be updated, and only those files that
differ between the working directory and <committish> will get
new modification dates. Therefore incremental rebuilds will work.
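The timestamp behaviour described above can be checked with a small experiment. This is a sketch in a throwaway repository: the file names, the identity settings and the marker file are all made up for the test and are not part of anyone's build setup.

```shell
set -eu

# Throwaway repository with two source files (all names are hypothetical).
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.email "builder@example.invalid"
git config user.name "Build Bot"

printf 'int a;\n' > a.c
printf 'int b;\n' > b.c
git add a.c b.c
git commit -q -m "first"
first=$(git rev-parse HEAD)

# Second commit changes only a.c.
printf 'int a = 1;\n' > a.c
git commit -q -a -m "second"

# Drop a timestamp marker, then move the working directory back to "first".
sleep 1
touch "$work/marker"
sleep 1
git reset -q --hard "$first"

# Only files rewritten by the reset are newer than the marker.
touched=$(find . -name '*.c' -newer marker | sort)
echo "rewritten by reset: $touched"
```

If the claim holds, b.c keeps its old modification time, so make would only rebuild what depends on a.c.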
> * As far as I can see, there is no reason to have only one index in a
> git repository. Why isn't it possible to specify an alternate
> index-file with an option similar to --git-dir ?
The index is key to getting the fast update of the working directory.
You can change the index with the (rather undocumented) GIT_INDEX_FILE
environment variable. I do this in a few tools I have written
around Git, but I don't do it very often.
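As a rough illustration of GIT_INDEX_FILE, the sketch below gives a separate source directory its own index file and updates it with read-tree. The paths, file names and the use of "git read-tree -m -u" and GIT_WORK_TREE are assumptions chosen for this example, not something prescribed in this thread. The shared repository's HEAD is never moved.

```shell
set -eu

# Throwaway repository with two commits (names are hypothetical).
top=$(mktemp -d)
git init -q "$top/repo"
cd "$top/repo"
git config user.email "builder@example.invalid"
git config user.name "Build Bot"
printf 'int main(void) { return 0; }\n' > main.c
printf 'helper v1\n' > util.c
git add main.c util.c
git commit -q -m "first"
printf 'helper v2, longer\n' > util.c
git commit -q -a -m "second"
head_before=$(git rev-parse HEAD)

# Separate source directory with its own private index file.
mkdir "$top/srcdir"
cd "$top/srcdir"
export GIT_DIR="$top/repo/.git"
export GIT_WORK_TREE="$top/srcdir"
export GIT_INDEX_FILE="$top/srcdir.index"

# Populate srcdir from the older commit, then update it to the newer one.
git read-tree -m -u HEAD~1
sleep 1
touch "$top/marker"
sleep 1
git read-tree -m -u HEAD

# Files rewritten by the second read-tree are newer than the marker.
rewritten=$(find . -type f -newer "$top/marker" | sort)
head_after=$(git rev-parse HEAD)
unset GIT_DIR GIT_WORK_TREE GIT_INDEX_FILE
echo "rewritten: $rewritten"
```

Because each source directory has its own index file, several of them can be updated from one repository without stepping on each other's index.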
--
Shawn.
* Re: updating only changed files source directory?
2006-10-24 1:33 updating only changed files source directory? Han-Wen Nienhuys
2006-10-24 5:55 ` Shawn Pearce
@ 2006-10-24 7:48 ` Jakub Narebski
2006-10-24 9:50 ` Han-Wen Nienhuys
2006-10-24 19:12 ` Daniel Barkalow
2 siblings, 1 reply; 8+ messages in thread
From: Jakub Narebski @ 2006-10-24 7:48 UTC (permalink / raw)
To: git
Han-Wen Nienhuys wrote:
> I have some questions and remarks
I see that you are using fairly low level commands (plumbing commands)
> git-http-fetch -a <branch> <url>
> wget <url>/refs/head/<branch> ## dump to <myrepo>/refs/head/<branch>
instead of setting up the $GIT_DIR/remotes/origin file and using "git fetch".
By the way, "git fetch" will not update the branch you are on unless the
--update-head-ok option is used.
> git --git-dir <myrepo> read-tree <committish>
>
> cd <srcdir>
> git --git-dir <myrepo> checkout-index -a -f
instead of
git --git-dir=<myrepo> checkout <branch>
(-f forces a re-read of everything)
> * As far as I can see, there is no reason to have only one index in a
> git repository. Why isn't it possible to specify an alternate
> index-file with an option similar to --git-dir ?
--git-dir is an alternative to setting GIT_DIR. You can use GIT_INDEX_FILE
to specify an alternate index file. It is documented in git(7), section
"ENVIRONMENT VARIABLES".
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: updating only changed files source directory?
2006-10-24 7:48 ` Jakub Narebski
@ 2006-10-24 9:50 ` Han-Wen Nienhuys
2006-10-24 10:13 ` Jakub Narebski
0 siblings, 1 reply; 8+ messages in thread
From: Han-Wen Nienhuys @ 2006-10-24 9:50 UTC (permalink / raw)
To: git
Jakub Narebski wrote:
> Han-Wen Nienhuys wrote:
>
>> I have some questions and remarks
>
> I see that you are using fairly low level commands (plumbing commands)
>
>> git-http-fetch -a <branch> <url>
>> wget <url>/refs/head/<branch> ## dump to <myrepo>/refs/head/<branch>
>
> instead of setting $GIT_DIR/remotes/origin file and using "git fetch".
> BTW. "git fetch" will not update branch you are on, unless --update-head-ok
> option is used.
I tried fetch, but was put off by the warnings because I didn't have
--update-head-ok. Using low-level commands is my way of making sure that
Git doesn't assume it needs to do anything intelligent.
>> git --git-dir <myrepo> read-tree <committish>
>>
>> cd <srcdir>
>> git --git-dir <myrepo> checkout-index -a -f
>
> instead of
> git --git-dir=<myrepo> checkout <branch>
> (-f is Force a re-read of everything)
Yes, however,
git checkout
changes the state of the repository, which is something I want to prevent.
>> * As far as I can see, there is no reason to have only one index in a
>> git repository. Why isn't it possible to specify an alternate
>> index-file with an option similar to --git-dir ?
>
> --git-dir is alternative to setting GIT_DIR. You can use GIT_INDEX_FILE
> to specify alternate index file. Documented in git(7), section
> "ENVIRONMENT VARIABLES".
Silly me, I overlooked it in the manpage. Note that it is standard to put
the environment section at the end of the manpage. Right now it's
somewhere in the middle.
--
Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen
* Re: updating only changed files source directory?
2006-10-24 9:50 ` Han-Wen Nienhuys
@ 2006-10-24 10:13 ` Jakub Narebski
0 siblings, 0 replies; 8+ messages in thread
From: Jakub Narebski @ 2006-10-24 10:13 UTC (permalink / raw)
To: Han-Wen Nienhuys; +Cc: git
Han-Wen Nienhuys wrote:
> Jakub Narebski wrote:
>> Han-Wen Nienhuys wrote:
>>
>> I see that you are using fairly low level commands (plumbing commands)
>>
>>> git-http-fetch -a <branch> <url>
>>> wget <url>/refs/head/<branch> ## dump to <myrepo>/refs/head/<branch>
>>
>> instead of setting $GIT_DIR/remotes/origin file and using "git fetch".
>> BTW. "git fetch" will not update branch you are on, unless --update-head-ok
>> option is used.
>
> I tried fetch, but was put off by the warnings because I didn't have
> --update-head-ok. Using lowlevel commands is my way of making sure that
> Git doesn't assume it needs to do anything intelligent.
You have two options. Either keep an additional branch that is not a
tracking branch (you don't fetch into it) and always stay on it; call
it 'check-out', for example. It can also serve the git-reset approach
to checking out files into an external directory, and git-fetch then
works without --update-head-ok. Or, if the repository is a bare
repository (no working area), use --update-head-ok.
>>> git --git-dir <myrepo> read-tree <committish>
>>>
>>> cd <srcdir>
>>> git --git-dir <myrepo> checkout-index -a -f
>>
>> instead of
>> git --git-dir=<myrepo> checkout <branch>
>> (-f is Force a re-read of everything)
git-checkout-index(1):
-f|--force
forces overwrite of existing files
So probably you would get what you want if you lose '-f'.
> Yes, however,
>
> git checkout
>
> changes the state of the repository, which is something I want to prevent.
Well, git-reset also changes the state of the repository, but it changes
only the branch we have created exactly for this purpose.
--
Jakub Narebski
Poland
* Re: updating only changed files source directory?
2006-10-24 1:33 updating only changed files source directory? Han-Wen Nienhuys
2006-10-24 5:55 ` Shawn Pearce
2006-10-24 7:48 ` Jakub Narebski
@ 2006-10-24 19:12 ` Daniel Barkalow
2006-10-25 11:58 ` Han-Wen Nienhuys
2 siblings, 1 reply; 8+ messages in thread
From: Daniel Barkalow @ 2006-10-24 19:12 UTC (permalink / raw)
To: Han-Wen Nienhuys; +Cc: git
On Tue, 24 Oct 2006, Han-Wen Nienhuys wrote:
>
> Hello there,
>
> I'm just starting out with GIT. Initially, I want to experiment with
> integrating it into our binary builder structure for LilyPond.
>
> The binary builder roughly does this:
>
> 1. get source code updates from a server to a single, local
> repository. This is currently a git repository that
> tracks our CVS server.
>
> 2. copy latest commit from a branch to separate source directory.
> This copy should only update files that changed.
>
> 3. Incrementally compile from that source directory
The terminology in the git world is, I think, a little different from what
you expect. We call the thing that contains all of the tracked information
(what you're calling the repository) the "object database"; what we call
the "repository" is a bit different: it primarily keeps track of the heads
of branches, in addition to either containing an object database or
referencing an external one. So you need a repository for each source
directory (because it keeps track of what commit is currently in the
source directory), but it doesn't need to have its own complete object
database, which is what you're trying to share between all of them.
You have a single repository with no source directory that contains the
database and the heads according to the upstream source, and then each
source directory has a repository that contains the head as far as you've
built it in that directory. You fetch into the single bare repository
from upstream, and then pull into each source directory from the bare
repository; this will do the minimal update to the contents of the source
directory automatically.
I think that you want to request a few git features:
- support having a bare repository not on a branch, so that it can fetch
all heads from its upstream. You're not doing anything branch-specific
in the bare repository anyway, but git currently wants a valid HEAD to
accept a path as containing a git repository
- support getting an origin remote configuration with a bare repository
- support cloning a branch of a repository, such that the clone's
"origin" is the upstream's chosen branch, not its "master".
- support cloning without generating a "master" branch in the clone, and
instead starting on "origin"
Then you do:
git clone --bare --no-head --with-origin <upstream> REPOSITORY.git
for each branch:
git clone --shared --branch=<branch> --no-master REPOSITORY.git <branch>
When you want to update:
GIT_DIR=REPOSITORY.git git fetch
for each branch:
(cd <branch>; git pull; make)
Note that all of the features you need are in "clone" for setting things
up nicely automatically; if you arrange everything by hand just right, you
can already do the updating procedure I give.
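For readers on a newer Git, the layout above can be approximated with options that exist today. The sketch below is an assumption-laden stand-in, not the hypothetical --no-head / --with-origin / --no-master options requested above: it uses "clone --mirror" for the bare repository, "clone --shared --branch" for the per-branch source directory, and "pull --ff-only" for the update. The "upstream" repository, the "stable" branch and file.txt are made up for the test.

```shell
set -eu

top=$(mktemp -d)
cd "$top"

# Stand-in for the real upstream server (hypothetical content).
git init -q upstream
git -C upstream config user.email "builder@example.invalid"
git -C upstream config user.name "Build Bot"
echo v1 > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "v1"
git -C upstream branch stable

# One bare repository that follows all upstream heads.
git clone -q --mirror upstream REPOSITORY.git

# One source directory per branch, borrowing objects from the bare repository.
git clone -q --shared --branch stable REPOSITORY.git stable

# Upstream moves on.
git -C upstream checkout -q stable
echo v2 > upstream/file.txt
git -C upstream commit -q -a -m "v2"

# Update: fetch into the bare repository, then pull into the source directory.
git --git-dir=REPOSITORY.git fetch -q
git -C stable pull -q --ff-only
built=$(cat stable/file.txt)
echo "source directory now has: $built"
```

In the real builder, the pull would be followed by make in each branch directory, as in the procedure above.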
-Daniel
* Re: updating only changed files source directory?
2006-10-24 19:12 ` Daniel Barkalow
@ 2006-10-25 11:58 ` Han-Wen Nienhuys
2006-10-25 19:35 ` Daniel Barkalow
0 siblings, 1 reply; 8+ messages in thread
From: Han-Wen Nienhuys @ 2006-10-25 11:58 UTC (permalink / raw)
To: Daniel Barkalow; +Cc: git
Daniel Barkalow wrote:
>> I'm just starting out with GIT. Initially, I want to experiment with
>> integrating it into our binary builder structure for LilyPond.
>>
>> The binary builder roughly does this:
>>
>> 1. get source code updates from a server to a single, local
>> repository. This is currently a git repository that
>> tracks our CVS server.
>>
>> 2. copy latest commit from a branch to separate source directory.
>> This copy should only update files that changed.
>>
>> 3. Incrementally compile from that source directory
>
> The terminology in the git world is, I think, a little different from what
> you expect. We call the thing that contains all of the tracked information
> (what you're calling the repository) the "object database"; what we call
yes, you hit the nail on the head.
> referencing an external one. So you need a repository for each source
> directory (because it keeps track of what commit is currently in the
> source directory), but it doesn't need to have its own complete object
> database, which is what you're trying to share between all of them.
OK. This makes sense; thanks for this pointer.
How can I set the object database? I found GIT_OBJECT_DIRECTORY, but
can I write a config file entry for that?
> built it in that directory. You fetch into the single bare repository
> from upstream, and then pull into each source directory from the bare
> repository; this will do the minimal update to the contents of the source
> directory automatically.
yes, this works. Thanks!
--
* Re: updating only changed files source directory?
2006-10-25 11:58 ` Han-Wen Nienhuys
@ 2006-10-25 19:35 ` Daniel Barkalow
0 siblings, 0 replies; 8+ messages in thread
From: Daniel Barkalow @ 2006-10-25 19:35 UTC (permalink / raw)
To: Han-Wen Nienhuys; +Cc: git
On Wed, 25 Oct 2006, Han-Wen Nienhuys wrote:
> How can I set the object database? I found GIT_OBJECT_DIRECTORY, but can I
> write a config file entry for that?
If you clone with --shared, it'll do the right thing automatically, which
is to have the clone's .git/objects/info/alternates be the objects
directory of the bare repository.
(Note that any new objects you create in the clone go into the clone's own
objects database. This shouldn't matter for you, unless your build system
is tagging things or something, but if you end up doing development in a
similarly structured system, it's worth knowing that this doesn't affect
the bare repository at all.)
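A small experiment along these lines, assuming a throwaway "seed" repository and made-up file names: it commits inside a --shared clone and then asks both object databases whether they know the new commit.

```shell
set -eu

top=$(mktemp -d)
cd "$top"

# Hypothetical seed repository, cloned bare to play the shared repository.
git init -q seed
git -C seed config user.email "builder@example.invalid"
git -C seed config user.name "Build Bot"
echo v1 > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "v1"
git clone -q --bare seed REPOSITORY.git

# Shared clone: borrows existing objects through objects/info/alternates.
git clone -q --shared REPOSITORY.git work
git -C work config user.email "builder@example.invalid"
git -C work config user.name "Build Bot"

# Create a new commit inside the clone only.
echo local > work/local.txt
git -C work add local.txt
git -C work commit -q -m "local only"
new=$(git -C work rev-parse HEAD)

# The new commit should exist in the clone but not in the bare repository.
if git --git-dir=REPOSITORY.git cat-file -e "$new" 2>/dev/null; then
    in_bare=yes
else
    in_bare=no
fi
if git -C work cat-file -e "$new"; then
    in_clone=yes
else
    in_clone=no
fi
echo "in bare: $in_bare, in clone: $in_clone"
```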
> yes, this works. Thanks!
No problem. :)
-Daniel