* Looking for a way to set up Git correctly
From: Dennis @ 2010-11-11  3:25 UTC
To: git

I have a situation.

I have started a web project (call it branch1), and have maintained it
without a version control system for quite some time.
Then, I copied it to another folder (branch2) and while the project
remained essentially the same, I have changed a few internal paths and
some variable names inside the files.
Then, a few months later on, I copied branch2 to a folder called branch3
and also modified some of the variable names and some of the internal
structure of the files.

Thus I ended up with 3 folders on my local HDD with pretty much the same
file names and folder structure and everything, and most of the file
content, except those small deltas that made those files different for
each branch.

I guess it's never too late, and now I want to put these 3 projects into
a version control system, and I chose git.

Now, this can be either really simple or really complicated. My first
question is: how do I set the repository up in the proper way where I
could work on all 3 projects separately, with the additional possibility
of working on branch1 only and later committing my changes to branch2
and branch3? (Since the projects are virtually identical, a fix in one
branch usually needs to be propagated to the other branches.)
First, I assume I will use a single repository for this. Then, do I
simply set up 3 branches and start using them, or is there a way to set
git up to capitalize on the projects being nearly identical?

My second question is that each branch has a huge folder with image
data. By huge I mean 1 to 4 GB, depending on the branch. Since images
are not directly relevant to the development work, is there a way to not
include those folders in git? To be honest though, I probably should
include them, but I wanted to ask about this separately as the git
repository may get large, since all 3 branches may grow to 9 GB or so.

Thus I am looking for a git way to handle my situation. Is this simple
or is it hard? Are there any recommendations before I jump in?

Dennis

* Re: Looking for a way to set up Git correctly
From: Alex Riesen @ 2010-11-11  9:38 UTC
To: Dennis; +Cc: git

On Thu, Nov 11, 2010 at 04:25, Dennis <denny@dennymagicsite.com> wrote:
> I have started a web project (call it branch1), and have maintained it
> without a version control system for quite some time.
> Then, I copied it to another folder (branch2) and while the project remained
> essentially the same, I have changed a few internal paths and some
> variable names inside the files.
> Then, a few months later on, I copied branch2 to a folder called branch3 and
> also modified some of the variable names and some of the internal structure
> of the files.
>
> Thus I ended up with 3 folders on my local HDD with pretty much the same
> file names and folder structure and everything, and most of the file
> content, except those small deltas that made those files different for each
> branch.
>
> I guess it's never too late, and now I want to put these 3 projects into a
> version control system, and I chose git.
>
> Now, this can be either really simple or really complicated. My first
> question is: how do I set the repository up in the proper way where I could
> work on all 3 projects separately, with additional possibility of working on
> branch1 only and later committing my changes to branch2 and branch3. (Since
> projects are virtually identical, a fix in one branch usually needs to be
> propagated to other branches)
> First, I assume I will use a single repository for this. Then, do I simply
> set up 3 branches and start using them, or is there a way to set git up to
> capitalize on the projects being nearly identical?

Assuming I've got the relationships of your "branches" right:

    $ cp -a branch1 branch && cd branch
    $ git init
    $ echo /huge-images/ >.gitignore
    $ git add .gitignore; git add .; git commit; git branch branch1
    $ git checkout -b branch2
    $ cp -a ../branch2/. .    # copy the contents, not the folder itself
    $ git add .; git commit
    $ git checkout -b branch3
    $ cp -a ../branch3/. .
    $ git add .; git commit

> My second question is that each branch has a huge folder with image data. By
> huge I mean 1 to 4Gb, depending on the branch. Since images are not
> directly relevant to the development work, is there a way to not include
> those folders in git? To be honest though, I probably should include them,
> but I wanted to ask about this separately as git repository may get
> large, since all 3 branches may grow to 9Gb or so.
>
> Thus I am looking for a git way to handle my situation. Is this simple or
> is it hard?

If you add the images you will eventually run into problems (heavy
swapping, for one). Git is not really set up to work with big binary
files (a file must fit into memory completely).
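
For illustration, the import sequence above can be tried end to end in a
scratch directory. This is only a sketch under stated assumptions: the
three folders, the file config.txt, the huge-images folder, and the
"Demo" commit identity are all made up here, and each snapshot is
overlaid with "cp -a ../branchN/. ." so the contents (not the folder)
land in the work tree. Files that were deleted between snapshots would
need to be removed by hand before committing.

```shell
#!/bin/sh
set -e

# Hypothetical sandbox standing in for the three hand-copied folders.
work=$(mktemp -d)
cd "$work"
mkdir branch1 branch2 branch3
echo 'path=/srv/site' > branch1/config.txt
echo 'path=/srv/site-two' > branch2/config.txt
echo 'path=/srv/site-number-three' > branch3/config.txt
for b in branch1 branch2 branch3; do
    mkdir "$b/huge-images"
    echo "pretend image data" > "$b/huge-images/photo.dat"
done

# Assumed identity so the throwaway commits work without global config.
GIT_AUTHOR_NAME=Demo; GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo; GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

# First snapshot becomes the root commit; the image folder is ignored.
mkdir repo && cd repo
git init -q
echo /huge-images/ > .gitignore
cp -a ../branch1/. .
git add -A
git commit -q -m "Import branch1"
git branch -M branch1

# Each later snapshot is committed on a new branch on top of the previous one.
for b in branch2 branch3; do
    git checkout -q -b "$b"
    cp -a "../$b/." .
    git add -A
    git commit -q -m "Import $b"
done

git log --oneline --decorate branch3
```

The result is a linear history branch1 -> branch2 -> branch3, so
"git diff branch1 branch2" shows exactly the small deltas between the
folders.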

* Re: Looking for a way to set up Git correctly
From: Enrico Weigelt @ 2010-11-11 13:25 UTC
To: git

* Dennis <denny@dennymagicsite.com> wrote:

Hi,

> Now, this can be either really simple or really complicated. My first
> question is: how do I set the repository up in the proper way where I
> could work on all 3 projects separately, with additional possibility of
> working on branch1 only and later committing my changes to branch2 and
> branch3.

As a first step you could create a separate git repo in each of the 3
directories and add everything to it (git init, git add -A, git commit).
Then rename the branches properly (so instead of "master", they'll be
called "branch1", "branch2", "branch3" or something like that). Create
another (maybe bare) repo elsewhere, add it as a remote to the three
other ones and push their branches to it.

Now you have 4 repos: 3 for working on the individual branches and
another for collecting them all (hub model). You could also choose to
throw the first three away and only work in the last one.

> (Since projects are virtually identical, a fix in one branch
> usually needs to be propagated to other branches)

In your case, cherry-pick might be the right tool for you.

You could also do a little refactoring, making a 4th branch which the
other 3 are then rebased onto. Then you could do your fixes in that
branch and either merge it into the other 3 or rebase them onto it.

> My second question is that each branch has a huge folder with image data.
> By huge I mean 1 to 4Gb, depending on the branch. Since images are not
> directly relevant to the development work, is there a way to not include
> those folders in git?

See the .gitignore file. Nevertheless, it might be useful to also have
all the images in the repo for backup reasons.

BTW: if you're concerned about disk space, you could add the object dir
of the 4th (hub) repository to the 3 working repos as an alternate, via
.git/objects/info/alternates (run git-gc in the hub repo before that!).
Subsequent gc runs in the working repos will remove the objects that are
already present in the hub. But beware! If you remove something in the
hub repo and run git-gc there, you could lose objects in the other
repos! (Maybe it would be wise to add the 3 working repos as remotes in
the hub and always run a git remote update before git-gc in the hub.)

cu
--
----------------------------------------------------------------------
 Enrico Weigelt, metux IT service -- http://www.metux.de/

 phone:  +49 36207 519931  email: weigelt@metux.de
 mobile: +49 151 27565287  icq:   210169427         skype: nekrad666
----------------------------------------------------------------------
 Embedded Linux / Porting / Open-source QM / Distributed Systems
----------------------------------------------------------------------
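
A rough sketch of the hub model described in this message, for anyone
who wants to try it in a scratch directory first. The folder contents,
the remote name "hub", the hub path hub.git, and the "Demo" identity are
assumptions made for the example; only the steps (init, add, commit,
rename the branch, add a remote, push) come from the description above.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# Assumed identity so the throwaway commits work without global config.
GIT_AUTHOR_NAME=Demo; GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo; GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

# The fourth, bare repository that collects all branches.
git init -q --bare hub.git

for b in branch1 branch2 branch3; do
    # Hypothetical stand-in for an existing project folder.
    mkdir "$b"
    echo "code for $b" > "$b/index.txt"
    (
        cd "$b"
        git init -q
        git add -A
        git commit -q -m "Import $b"
        git branch -M "$b"                  # rename the default branch
        git remote add hub "$work/hub.git"
        git push -q hub "$b"                # publish it to the hub
    )
done

# The hub now holds one branch per working repository.
git --git-dir="$work/hub.git" for-each-ref --format='%(refname:short)' refs/heads
```

Note that the three branches start out as unrelated root commits in this
layout; git can still diff and cherry-pick between them, but a plain
merge between them has no common ancestor to work from.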

* Re: Looking for a way to set up Git correctly
From: Jonathan Nieder @ 2010-11-11 16:46 UTC
To: Dennis; +Cc: git, Alex Riesen, Enrico Weigelt

(+cc: Dennis again, Alex)

Hi,

Enrico Weigelt wrote:
> * Dennis <denny@dennymagicsite.com> wrote:

>> Now, this can be either really simple or really complicated. My first
>> question is: how do I set the repository up in the proper way where I
>> could work on all 3 projects separately, with additional possibility of
>> working on branch1 only and later committing my changes to branch2 and
>> branch3.
>
> As first step you could create 3 separate git repos in each directory
> and add everything to it (git init, git add -A, git commit). Then
> rename the branches properly (so instead of "master", they'll be called
> "branch1", "branch2", "branch3" or something like that). Create another
> (maybe bare) repo elsewhere, add it as remote to the three other ones
> and push their branches upwards.

So this looks like so:

    for i in project1 project2 project3
    do
        (
            cd "$i"
            git init
            git add .
            git commit
        )
    done
    git init main
    cd main
    for i in project1 project2 project3
    do
        git fetch "../$i" "master:$i"
    done
    cd ..
    mkdir -p away
    mv project1 project2 project3 away/

If you would like multiple worktrees (one for each branch, maybe) for
the main repo, you might want to look into the new-workdir script in
contrib/workdir (but do consider the caveats[1]).

>> (Since projects are virtually identical, a fix in one branch
>> usually needs to be propagated to other branches)
>
> In your case, cherry-pick might be the right for you.

e.g., when project3 gets a new fix:

    git checkout project1
    git cherry-pick project3

> You could also do a little bit refactoring, making a 4th branch which
> the other 3 are then rebased onto.

Right, what is the actual relationship between these projects?
Do they actually represent branches in the history of a single project?

Suppose project1 is historically an ancestor to project2, project3, and
project4, which are independent. (Maybe project1 is the initial version
and projects 2,3,4 are ports to other platforms.) You could take this
into account when initially setting up the branches, like this:

    git init main
    cd main
    GIT_DIR=$(pwd)/.git; export GIT_DIR
    GIT_WORK_TREE=../project1 git add .
    GIT_WORK_TREE=../project1 git commit
    git branch -m project1
    for i in project2 project3 project4
    do
        git checkout -b $i project1
        GIT_WORK_TREE=../$i git add -A
        GIT_WORK_TREE=../$i git commit
    done

(and use gitk --all when done to make sure everything looks right)

Alternatively, you can rearrange the history afterwards:

    $ git cat-file commit project2 | tee project2
    tree 76db51024713f6ef191928a8445d48d39ab55434
    author Junio C Hamano <gitster@pobox.com> 1289324716 -0800
    committer Junio C Hamano <gitster@pobox.com> 1289324716 -0800

    project2: an excellent project
    $ git rev-parse project1
    $ vi project2
    ... add a "parent <object id>" line after the tree line, where
        <object id> is the full object name rev-parse printed ...
    $ git hash-object -t commit -w project2
    $ git branch -f project2 <the object name hash-object prints>
    ... repeat for project3 and project4 ...
    $ gitk --all; # to make sure everything looks right

This is less convenient than it ought to be. It would be nice to add a
"git graft" command to automate this procedure, which

 - interacts well with "git replace"
 - doesn't interact poorly with "git fetch" like .git/info/grafts does
 - could be more convenient to use than .git/info/grafts.

As the gitworkflows man page mentions, if you make your fixes on the
oldest branch they apply to (project1) and then merge to all later
branches, then the fixes will propagate forward correctly. See the
"Graduation" and "Merging upwards" sections of gitworkflows for details.
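
To make "merging upwards" concrete, here is a small sketch. It assumes
project1 really is the common ancestor of the other branches; the file
names app.txt and settings.txt, their contents, and the "Demo" identity
are invented for the example. The fix is committed once on the oldest
branch and then merged into each later branch.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# Assumed identity so the throwaway commits work without global config.
GIT_AUTHOR_NAME=Demo; GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo; GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

git init -q main && cd main

# project1 is the shared starting point.
printf 'line one\nline two\n' > app.txt
git add -A
git commit -q -m "project1: initial import"
git branch -M project1

# project2 and project3 each fork from project1 with their own delta.
for p in project2 project3; do
    git checkout -q -b "$p" project1
    echo "settings for $p" > settings.txt
    git add -A
    git commit -q -m "$p: local changes"
done

# The fix goes onto the oldest branch it applies to...
git checkout -q project1
printf 'line one (fixed)\nline two\n' > app.txt
git commit -q -a -m "Fix line one"

# ...and is merged upwards into every later branch.
for p in project2 project3; do
    git checkout -q "$p"
    git merge -q --no-edit project1
done

git log --oneline --graph --all
```

Because the merge is recorded in history, a later "git merge project1"
only brings in fixes made since the previous merge.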

>> My second question is that each branch has a huge folder with image data.
>> By huge I mean 1 to 4Gb, depending on the branch. Since images are not
>> directly relevant to the development work, is there a way to not include
>> those folders in git?

I would suggest tracking a symlink to another repository (or to a
directory tracked through other means, like unison).

Hope that helps,
Jonathan

[1] If you have two worktrees for the same project with the same branch
checked out at a given moment, the results can be confusing (changes
made in one worktree will look like they have been committed and undone
in the other).

The "detached HEAD" feature (which git-checkout.1 explains) and multiple
worktrees do not interact so well: the need to preserve commits made
while no branch was checked out in one worktree will not be taken into
account when "git gc" runs (explicitly or implicitly!) on the other.
This can be very disconcerting.
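
The symlink suggestion for the image folder could look roughly like the
sketch below. It assumes a filesystem with symlink support; the names
image-store, images, and index.txt, and the "Demo" identity, are made up
for illustration. Git records only the link itself (mode 120000), not
the files behind it.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# Assumed identity so the throwaway commit works without global config.
GIT_AUTHOR_NAME=Demo; GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo; GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

# The bulky data lives outside the repository (hypothetical location).
mkdir image-store
echo "pretend this is 4 GB of photos" > image-store/photo.dat

git init -q project && cd project
echo "site code" > index.txt

# Track a relative symlink instead of the image files themselves.
ln -s ../image-store images
git add -A
git commit -q -m "Track code plus a symlink to the image store"

# The index lists the link as a single entry with mode 120000.
git ls-files -s
```

The code still finds its images under images/, while the repository
stays small; the image store can then be backed up or synchronized by
whatever other means suits it.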
[parent not found: <20101111190724.00vcimqm8w0cw8s0@dennymagicsite.com>]

* Re: Looking for a way to set up Git correctly
From: Jonathan Nieder @ 2010-11-11 19:38 UTC
To: denny; +Cc: git, Alex Riesen, Enrico Weigelt

denny@dennymagicsite.com wrote:

> I am still looking through your replies and getting familiar with
> git commands.

By the way, please ignore that GIT_WORK_TREE stuff I did. It probably
works, but it's ugly. :) That example could have been written better as

    git init everything
    GIT_DIR=$(pwd)/everything/.git; export GIT_DIR
    (
        cd common-ancestor
        git add -A
        git commit
        git branch -m ancestor
    )
    (
        cd project1
        git checkout -b project1 ancestor
        git add -A
        git commit
    )
    ... etc ..
    unset GIT_DIR
    cd everything
    git checkout project1

[...]
> From a developer's point of view, working on projectX means making
> some changes and committing them to the repo for that project. The
> developer may not be aware of other projects existing.

For concreteness, I am imagining these directories represent various
versions of the Almquist shell. The common ancestor is the
BSD4.3/Net-2 version and various projects may have built from there in
different directions: NetBSD sh, FreeBSD sh, dash. (Yes, I am
oversimplifying. :))

Now suppose they have diverged so wildly that it is never possible to
synchronize code with each other. Instead, they can copy fixes, and
this is especially convenient when the fixes are phrased as diffs to
the common ancestor.

To facilitate this, Alice revives the BSD4.3/Net-2 sh project with a
"fixes only" policy. Her daily work might look like this:

    $ git fetch netbsd
    $ git log netbsd/for-alice@{1}..netbsd/for-alice; # any good patches today?
    $ git cherry-pick -s 67fd89980; # a good patch.
    ... quick test ...
    $ git cherry-pick -s 897ac8; # another good patch.
    ... quick test ...
    ...
    $ git fetch freebsd
    ... and similarly for the rest of the patch submitters ...
    $ git am emailed-patch

Then to more thoroughly test the result:

    $ git checkout -b throwaway; # new throw-away branch.[1]
    $ git merge netbsd/master; # will the changes work for netbsd?
    ... thorough test ...
    $ git reset --keep master
    $ git merge freebsd/master; # how about freebsd?
    ... etc ...

And finally she pushes the changes out.

> Without knowing anything about git for a moment, one ideal workflow
> is where a developer makes changes to projectX that touch the base
> and projectX specific features. Then the developer commits them and
> pushes them to the main repo. The main repo contains all projects.
> During the commit, changes to the base automagically get pushed to
> all projects that share that base

If it is a matter of what files are touched, then maybe the base is
actually something like a library, which should be managed as a
separate project. See the "git submodule" manual if you would like to
try something like this but still keep the projects coupled.

On the other hand, remaining in the situation from before:

Suppose Sam is the NetBSD sh maintainer. The first step in working on
a new release might be

    $ git fetch ancestor
    $ git log -p HEAD..FETCH_HEAD; # fixes look okay?
    $ git pull ancestor

since Alice tends to include only safe, well tested fixes. Many
changes Sam makes are specific to his project, but today he comes up
with a fix that might be useful for other ash descendants. So instead
of committing directly, he can try:

    $ git checkout for-alice; # carry the fix to the for-alice branch
    ... test ...
    $ git commit -a; # commit it.

If it is not an urgent fix, at this point he might do

    $ git checkout master; # back to the main NetBSD branch, without the fix

and give the other projects some time to work on the patch and come up
with a better fix. Or he might cherry-pick the commit from for-alice,
and even publish it and encourage others to cherry-pick directly from
him to get the fix out ASAP.
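
Sam's steps can be replayed in a scratch repository as a sketch. The
branch names master and for-alice follow the example above; the files
shared.txt and netbsd.txt, their contents, and the "Demo" identity are
invented for illustration.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# Assumed identity so the throwaway commits work without global config.
GIT_AUTHOR_NAME=Demo; GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo; GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

git init -q sh && cd sh

# Code inherited from the common ancestor.
echo "buggy shared code" > shared.txt
git add -A
git commit -q -m "ancestor: initial import"
git branch -M master
git branch for-alice            # holds fixes phrased against the ancestor

# A NetBSD-specific change that should not leak to other projects.
echo "netbsd-only feature" > netbsd.txt
git add -A
git commit -q -m "netbsd: local feature"

# Sam writes a generally useful fix while on master (not committed yet)...
echo "shared code, now fixed" > shared.txt

# ...carries it over to for-alice and commits it there.
git checkout -q for-alice
git commit -q -a -m "Fix shared bug"

# Back on master the fix is gone again until he cherry-picks it.
git checkout -q master
git cherry-pick for-alice > /dev/null

git log --oneline --all --graph
```

The for-alice branch ends up containing only the fix on top of the
ancestor, which is what makes it easy for the other projects to pick up.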

Notice that not all changes to the base files are necessarily useful
for other descendants of the ancestral program. So in this example,
propagation of changes between projects is fairly explicit.

[1] "git checkout HEAD^0" would be more convenient. See DETACHED HEAD
in the git checkout manual if interested.
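
As a sketch of the footnote's idea, a trial merge can be done on a
detached HEAD so that no throw-away branch needs to be created or
cleaned up. The branch names master and netbsd, the file names, and the
"Demo" identity are assumptions for this example only.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# Assumed identity so the throwaway commits work without global config.
GIT_AUTHOR_NAME=Demo; GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo; GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

git init -q trial && cd trial
echo base > base.txt
git add -A
git commit -q -m "ancestor"
git branch -M master

# A descendant project with its own work.
git checkout -q -b netbsd
echo port > port.txt
git add -A
git commit -q -m "netbsd port"

# A new fix on the fixes-only branch.
git checkout -q master
echo fix > fix.txt
git add -A
git commit -q -m "a fix"

# Detach HEAD at master and merge the descendant for testing only.
git checkout -q 'HEAD^0'
git merge -q --no-edit netbsd
test -f port.txt && test -f fix.txt     # stand-in for the thorough test

# Leave the trial merge behind; master itself never moved.
git checkout -q master
git log --oneline master
```

The merge commit made while detached is not referenced by any branch, so
it is eventually pruned by "git gc".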