* how to delete the entire history before a certain commit
From: Gelonida @ 2010-05-03 22:23 UTC
To: git

Hi,

I noticed that this post never arrived :-( . So here it is again.

We have a git repository whose size we want to reduce drastically due to frequent clone operations and a slow network connection.

The idea is the following:

* archive the git repository, just in case we really have to go back in history.
* create a new git repository, which shall only contain last month's activity.
* all changes before that should be squashed together.

It would be no problem if the very first commit remains unmodified.

I made some attempts with git rebase -i, but I always encounter errors. An example error is a cherry-pick which can't be applied.

Is git rebase the correct way to delete an entire section of history, or are there smarter ways to do this? (e.g. create a new repository with last month's state as the starting point and some 'magic' to replay from this point on with all branches and merges)

Thanks for any suggestion.
* Re: how to delete the entire history before a certain commit
From: Jakub Narebski @ 2010-05-03 22:45 UTC
To: Gelonida; +Cc: git

Gelonida <gelonida@gmail.com> writes:

> We have a git repository, whose size we want to reduce drastically due
> to frequent clone operations and a slow network connection.

Why frequent *clone* operations, instead of using "git fetch" or an equivalent ("git pull", which is fetch+merge, or "git remote update")?

If the network is slow, you can do what others did in similar situations: use a hook directly on the server to allow only fetches that are not too large (to prevent cloning), and provide a bundle (see git-bundle(1)) to "seed" the clone; it can be on a dumb server (served resumably), and can also be served by BitTorrent or an equivalent.

> The idea is following:
>
> * archive the git repository just in case we really have to go back in
> history.
>
> create a new git repository, which shall only contain last month's activity.
>
> all changes before should be squashed together.
> It would be no problem if the very first commit remains unmodified.

If you want to simply _remove_ history before a specified commit, instead of squashing it, the best solution would be to use grafts to cauterize (cut) history, check using a [graphical] history viewer that you cut it correctly, and then use git-filter-branch to make this cut permanent.

You can later use grafts or the refs/replace/ mechanism to join the "current" history with the historical repository.

HTH.
--
Jakub Narebski
Poland
ShadeHawk on #git
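The bundle "seeding" idea in the message above can be sketched roughly as follows. This is only an illustration under assumptions: the throwaway repository, the file names, and the temporary directory are hypothetical stand-ins for the real server-side repository, and in practice the bundle file would be copied to clients over whatever resumable channel is available rather than read from a local path.

```shell
#!/bin/sh
# Sketch: seed a clone from a bundle file, then fetch only the delta.
# All paths and names below are hypothetical.
set -eu

tmp=$(mktemp -d)

# Throwaway source repository standing in for the real one.
git init -q "$tmp/src"
cd "$tmp/src"
git config user.name "Example User"
git config user.email "user@example.invalid"
echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two >> file.txt
git commit -q -am "second"

# Server side: pack every ref and its history into a single file.
git bundle create "$tmp/seed.bundle" --all

# Client side: clone from the bundle file instead of over the network,
# then point origin at the real repository and fetch whatever is newer.
git clone -q "$tmp/seed.bundle" "$tmp/clone"
git -C "$tmp/clone" remote set-url origin "$tmp/src"
git -C "$tmp/clone" fetch -q origin

bundle_commits=$(git -C "$tmp/clone" rev-list --count HEAD)
echo "commits in seeded clone: $bundle_commits"
```

After the initial seed, later updates only transfer the commits created since the bundle was made, which is the point of the suggestion for slow client connections.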
* Re: how to delete the entire history before a certain commit
From: Gelonida @ 2010-05-03 23:11 UTC
To: git

Hi Jakub,

Jakub Narebski wrote:
> Gelonida <gelonida@gmail.com> writes:
>
>> We have a git repository, whose size we want to reduce drastically due
>> to frequent clone operations and a slow network connection.
>
> Why frequent *clone* operations, instead of using "git fetch" or
> equivalent ("git pull" which is fetch+merge, or "git remote update")?

The clone is part of the deployment process and would IIRC be equivalent to an 'svn export'. Almost certainly one can also improve this, but this should probably be discussed in another thread.

The sequence on some remote hosts is:

- git clone tag dirname
- rm -rf dirname/.git
- tar cvfz dirname.tgz dirname

> If network is slow, you can do what others did in similar situations:
> use hook to allow only not to large fetches (to prevent cloning)
> directly on server, and provide bundle (see git-bundle(1)) to "seed"
> the clone; it can be on dumb server (served resumably), and can be
> also served by BitTorrent or equivalent.

The server network is fast, but the clients' network connection is not; therefore there is no need to offload the server.

>> The idea is following:
>>
>> * archive the git repository just in case we really have to go back in
>> history.
>>
>> create a new git repository, which shall only contain last month's activity.
>>
>> all changes before should be squashed together.
>> It would be no problem if the very first commit remains unmodified.
>
> If you want to simply _remove_ history before specified commit,
> instead of squashing it, the best solution would be to use grafts to
> cauterize (cut) history, check using [graphical] history viewer that
> you cut it correctly, and then then use git-filter-branch to make this
> cut permanent.

This sounds exactly like what I'd like to do. I used "git gui" => "Visualize All Branch History" to choose a nice single cutoff point. I just didn't know how to apply the cut.

So the command to look for is git-filter-branch, right? I'll read the doc.

> You can later use grafts or refs/replaces/ mechanism to join "current"
> history with historical repository.

Probably we won't need this, but it sounds rather interesting and is good to know.

Thanks a lot
* Re: how to delete the entire history before a certain commit
From: Jakub Narebski @ 2010-05-03 23:42 UTC
To: Gelonida; +Cc: git

Gelonida <gelonida@gmail.com> writes:
> Jakub Narebski wrote:
>> Gelonida <gelonida@gmail.com> writes:
>>
>>> We have a git repository, whose size we want to reduce drastically due
>>> to frequent clone operations and a slow network connection.
>>
>> Why frequent *clone* operations, instead of using "git fetch" or
>> equivalent ("git pull" which is fetch+merge, or "git remote update")?
>
> The clone is part of the deployment process and would IIRC be equivalent
> to a 'svn export'
> Almost certainly one can also improve this, but this should probably
> discussed in another thread.
>
> The sequence on some remote hosts is.
> - git clone tag dirname
> - rm -rf dirname/.git
> - tar cvfz dirname.tgz dirname

Why not simply (after enabling the 'upload-archive' service in git-daemon if you serve via a git:// URL, and probably something similar in the case of SSH access management by gitosis or gitolite)

  $ git archive --remote=<repo> <tag>

(where <repo> is <dirname> in your example)?

>>> The idea is following:
>>>
>>> * archive the git repository just in case we really have to go back in
>>> history.
>>>
>>> create a new git repository, which shall only contain last month's activity.
>>>
>>> all changes before should be squashed together.
>>> It would be no problem if the very first commit remains unmodified.
>>
>> If you want to simply _remove_ history before specified commit,
>> instead of squashing it, the best solution would be to use grafts to
>> cauterize (cut) history, check using [graphical] history viewer that
>> you cut it correctly, and then then use git-filter-branch to make this
>> cut permanent.
>
> This sounds exactly as what I'd like to do.
> I used "git gui" => "Visualize All Branch History" y to choose a nice
> single cutoff point.
> I just didn't know how to apply the cut.

You can read about grafts in the git-filter-branch(1) manpage, in gitrepository-layout(5), the git repository layout description, and in gitglossary(7), the git glossary.

In short, each line in .git/info/grafts consists of the SHA-1 id of an object, followed by a space-separated list of its effective (grafted) parents. So to cut history at, for example, commit a3eb250f996bf5e (making it a parentless root commit), you need to put a line containing only this SHA-1 in the .git/info/grafts file, e.g.:

  $ git rev-parse --verify a3eb250f996bf5e >> .git/info/grafts

> So the command to look for is git-filter-branch, right ?
> I'll read the doc.

As you would see in the git-filter-branch(1) documentation, a simple

  $ git filter-branch --all

(no filter) would make the history described by grafts permanent. Note that this rewrites history, and you would make it (much) harder on any contributor who based his/her work on commits from before the rewrite.

>> You can later use grafts or refs/replaces/ mechanism to join "current"
>> history with historical repository.
>
> Probably we wont need this, but this sounds rather interesting and is
> good to know.

Grafts were, for example, used to fuse (join) the current and historical Linux kernel repositories after the Linux kernel moved from BitKeeper to Git. The 'git-replace' mechanism is meant as a modern, transferable and safe replacement for the grafts file.

--
Jakub Narebski
Poland
ShadeHawk on #git
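The `git archive --remote` suggestion in this message, as a replacement for the clone / rm / tar deployment sequence, might look like the sketch below. It is an illustration under assumptions: the repository, the tag name `v1.0`, and the `dirname` prefix are hypothetical, and a local path is used as the remote so the example is self-contained; a git:// server would additionally need the upload-archive service enabled, as noted above.

```shell
#!/bin/sh
# Sketch: export a tagged tree as a tarball without cloning any history.
# Repository path, tag name and prefix are hypothetical.
set -eu

tmp=$(mktemp -d)

# Throwaway repository standing in for the deployment source.
git init -q "$tmp/repo"
cd "$tmp/repo"
git config user.name "Example User"
git config user.email "user@example.invalid"
echo "hello" > app.txt
git add app.txt
git commit -q -m "release"
git tag v1.0

# One step instead of clone + rm -rf .git + tar:
# only the files of the tagged tree are transferred.
cd "$tmp"
git archive --remote="$tmp/repo" --prefix=dirname/ v1.0 | gzip > dirname.tgz

archive_listing=$(tar -tzf dirname.tgz)
echo "$archive_listing"
```

Because no history is transferred at all, this sidesteps the repository-size problem for the deployment case regardless of whether the history is later cut.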
* Re: how to delete the entire history before a certain commit
From: Gelonida @ 2010-05-03 23:58 UTC
To: git; +Cc: git

Hi Jakub,

Jakub Narebski wrote:
> Gelonida <gelonida@gmail.com> writes:
>> Jakub Narebski wrote:
>>> Gelonida <gelonida@gmail.com> writes:
>>>
>>>> We have a git repository, whose size we want to reduce drastically
>>> If you want to simply _remove_ history before specified commit,
>>> instead of squashing it, the best solution would be to use grafts to
>>> cauterize (cut) history, check using [graphical] history viewer that
>>> you cut it correctly, and then then use git-filter-branch to make this
>>> cut permanent.
> You can read about grafts in git-filter-branch(1) manpage, in
> gitrepository-layout(5) git repository layout description, and in
> gitglossary(7) a git glossary.
>
> In short, each line in .git/info/grafts consist of sha1 id of object,
> followed by space-separated list of its effective (grafted) parents.
> So to cut history e.g. after commit a3eb250f996bf5e, you need to put
> line containing only this SHA-1 in .git/info/grafts file, e.g.:
>
> $ git rev-parse --verify a3eb250f996bf5e >> .git/info/grafts
>
>> So the command to look for is git-filter-branch, right ?
>> I'll read the doc.
>
> As you would see in git-filter-branch(1) documentation, simple
>
> $ git filter-branch --all

The command

  git filter-branch --all

did not work for me; it just displays the help text.

However, without '--all', git filter-branch seems to have worked.

Thanks a lot :-)
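A likely explanation for the help text: the git-filter-branch(1) synopsis expects rev-list options such as `--all` after a `--` separator, so the form to try is `git filter-branch -- --all`. Running it without `--all` rewrites only the current branch, which may leave other branches and tags pointing at the old history. The sketch below shows the whole cut on a throwaway repository. It is an illustration under assumptions: the repository and commits are hypothetical, and it uses `git replace --graft` (the refs/replace mechanism mentioned earlier in the thread) rather than the .git/info/grafts file, since recent Git versions print a deprecation hint for the grafts file.

```shell
#!/bin/sh
# Sketch: cut history at a chosen commit and make the cut permanent.
# The repository and its four commits are hypothetical.
set -eu

# Newer Git versions print a warning and pause before filter-branch runs;
# this variable skips that and is ignored by older versions.
export FILTER_BRANCH_SQUELCH_WARNING=1

tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git config user.name "Example User"
git config user.email "user@example.invalid"

for n in 1 2 3 4; do
    echo "$n" > file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

# Choose the cutoff: this commit becomes the new root of the history.
cutoff=$(git rev-parse --verify HEAD~1)

# Record a replacement of the cutoff commit that has no parents
# (the equivalent of a grafts line containing only the commit id).
git replace --graft "$cutoff"

# Make the cut permanent on all refs; note the "--" before "--all".
git filter-branch -- --all >/dev/null

# Count commits while ignoring replace refs, to confirm the rewritten
# history itself is short and does not merely look short.
remaining=$(git --no-replace-objects rev-list --count HEAD)
echo "commits reachable after the cut: $remaining"
```

The old commits stay in the object store (and under refs/original/) until those references are removed and the repository is repacked, so the size reduction only shows up after that cleanup or in a fresh clone.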