* how to delete the entire history before a certain commit
From: Gelonida @ 2010-05-03 22:23 UTC (permalink / raw)
To: git
Hi,
I noticed that this post never arrived :-( so here it is again.
We have a git repository, whose size we want to reduce drastically due
to frequent clone operations and a slow network connection.
The idea is the following:
* archive the git repository, just in case we really have to go back in
history.
* create a new git repository, which shall only contain last month's activity.
* all changes before that should be squashed together.
It would be no problem if the very first commit remains unmodified.
I made some attempts with
git rebase -i
but I always encounter errors.
An example error is a cherry-pick which can't be applied.
Is git rebase the correct way to delete an entire history section, or are
there smarter ways to do this? (e.g. create a new repository with last
month's state as the starting point and some 'magic' to replay from this
point on with all branches and merges)
thanks for any suggestion.
* Re: how to delete the entire history before a certain commit
From: Jakub Narebski @ 2010-05-03 22:45 UTC (permalink / raw)
To: Gelonida; +Cc: git
Gelonida <gelonida@gmail.com> writes:
> We have a git repository, whose size we want to reduce drastically due
> to frequent clone operations and a slow network connection.
Why frequent *clone* operations, instead of using "git fetch" or
equivalent ("git pull" which is fetch+merge, or "git remote update")?
If the network is slow, you can do what others did in similar situations:
use a hook directly on the server to allow only fetches that are not too
large (to prevent cloning), and provide a bundle (see git-bundle(1)) to
"seed" the clone; it can be on a dumb server (served resumably), and can
also be served by BitTorrent or equivalent.
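A minimal sketch of the bundle "seed" idea, assuming throwaway repositories under a temporary directory (the names server, client and seed.bundle are made up for illustration; in practice the bundle would be downloaded over HTTP or BitTorrent rather than copied locally):

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# Helper for the demo only: create an empty commit with a fixed identity.
commit() {
    git -C "$1" -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$2"
}

# Stand-in for the repository on the server.
git init -q "$work/server"
commit "$work/server" "first"
commit "$work/server" "second"

# Server side: pack all refs into a single file that can be served statically.
git -C "$work/server" bundle create "$work/seed.bundle" --all

# Client side: clone from the bundle file, then point origin at the real repository.
git clone -q "$work/seed.bundle" "$work/client"
git -C "$work/client" remote set-url origin "$work/server"
seeded=$(git -C "$work/client" rev-list --count HEAD)

# Later fetches only transfer what the bundle did not already contain.
commit "$work/server" "third"
git -C "$work/client" fetch -q origin
fetched=$(git -C "$work/client" rev-list --count --remotes)

echo "commits from bundle: $seeded, after fetch: $fetched"
```

The bulk of the history travels in the bundle file, so only the small incremental fetch has to go through the slow connection.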
> The idea is following:
>
> * archive the git repository just in case we really have to go back in
> history.
>
>
> create a new git repository, which shall only contain last month's activity.
>
> all changes before should be squashed together.
> It would be no problem if the very first commit remains unmodified.
If you want to simply _remove_ history before a specified commit,
instead of squashing it, the best solution would be to use grafts to
cauterize (cut) history, check using a [graphical] history viewer that
you cut it correctly, and then use git-filter-branch to make this
cut permanent.
You can later use grafts or the refs/replace/ mechanism to join "current"
history with the historical repository.
HTH.
--
Jakub Narebski
Poland
ShadeHawk on #git
* Re: how to delete the entire history before a certain commit
From: Gelonida @ 2010-05-03 23:11 UTC (permalink / raw)
To: git
Hi Jakub,
Jakub Narebski wrote:
> Gelonida <gelonida@gmail.com> writes:
>
>> We have a git repository, whose size we want to reduce drastically due
>> to frequent clone operations and a slow network connection.
>
> Why frequent *clone* operations, instead of using "git fetch" or
> equivalent ("git pull" which is fetch+merge, or "git remote update")?
The clone is part of the deployment process and would IIRC be equivalent
to a 'svn export'
Almost certainly one can also improve this, but this should probably be
discussed in another thread.
The sequence on some remote hosts is:
- git clone tag dirname
- rm -rf dirname/.git
- tar cvfz dirname.tgz dirname
>
> If the network is slow, you can do what others did in similar situations:
> use a hook directly on the server to allow only fetches that are not too
> large (to prevent cloning), and provide a bundle (see git-bundle(1)) to
> "seed" the clone; it can be on a dumb server (served resumably), and can
> also be served by BitTorrent or equivalent.
The server's network is fast, but the clients' network connections are not,
so there is no need to offload the server.
>
>> The idea is following:
>>
>> * archive the git repository just in case we really have to go back in
>> history.
>>
>>
>> create a new git repository, which shall only contain last month's activity.
>>
>> all changes before should be squashed together.
>> It would be no problem if the very first commit remains unmodified.
>
> If you want to simply _remove_ history before specified commit,
> instead of squashing it, the best solution would be to use grafts to
> cauterize (cut) history, check using [graphical] history viewer that
> you cut it correctly, and then use git-filter-branch to make this
> cut permanent.
This sounds exactly like what I'd like to do.
I used "git gui" => "Visualize All Branch History" to choose a nice
single cutoff point.
I just didn't know how to apply the cut.
So the command to look for is git-filter-branch, right ?
I'll read the doc.
>
> You can later use grafts or the refs/replace/ mechanism to join "current"
> history with historical repository.
Probably we won't need this, but this sounds rather interesting and is
good to know.
Thanks a lot
* Re: how to delete the entire history before a certain commit
From: Jakub Narebski @ 2010-05-03 23:42 UTC (permalink / raw)
To: Gelonida; +Cc: git
Gelonida <gelonida@gmail.com> writes:
> Jakub Narebski wrote:
>> Gelonida <gelonida@gmail.com> writes:
>>
>>> We have a git repository, whose size we want to reduce drastically due
>>> to frequent clone operations and a slow network connection.
>>
>> Why frequent *clone* operations, instead of using "git fetch" or
>> equivalent ("git pull" which is fetch+merge, or "git remote update")?
>
> The clone is part of the deployment process and would IIRC be equivalent
> to a 'svn export'
> Almost certainly one can also improve this, but this should probably be
> discussed in another thread.
>
> The sequence on some remote hosts is.
> - git clone tag dirname
> - rm -rf dirname/.git
> - tar cvfz dirname.tgz dirname
Why not simply (after enabling 'upload-archive' service in git-daemon
if you serve via git:// URL, and probably similar in the case of
SSH access management by gitosis or gitolite)
$ git archive --remote=<repo> <tag>
(where <repo> is <dirname> in your example)?
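As a rough sketch of how this single command could replace the clone / rm -rf .git / tar sequence, assuming a throwaway local repository stands in for <repo> and a made-up tag v1.0 stands in for <tag> (--prefix is used here so the files land under dirname/ inside the tarball):

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# Stand-in for the repository that would normally live on the server.
git init -q "$work/repo"
echo "hello" > "$work/repo/app.txt"
git -C "$work/repo" add app.txt
git -C "$work/repo" -c user.name=Example -c user.email=example@example.com \
    commit -q -m "release"
git -C "$work/repo" tag v1.0

# Only the tagged tree is transferred; no history and no .git directory.
git archive --remote="$work/repo" --prefix=dirname/ v1.0 | gzip > "$work/dirname.tgz"

listing=$(tar tzf "$work/dirname.tgz")
echo "$listing"
```

Over git:// or SSH the --remote value would be the repository URL instead of a local path, subject to the server allowing the upload-archive service as described above.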
>>> The idea is following:
>>>
>>> * archive the git repository just in case we really have to go back in
>>> history.
>>>
>>>
>>> create a new git repository, which shall only contain last month's activity.
>>>
>>> all changes before should be squashed together.
>>> It would be no problem if the very first commit remains unmodified.
>>
>> If you want to simply _remove_ history before specified commit,
>> instead of squashing it, the best solution would be to use grafts to
>> cauterize (cut) history, check using [graphical] history viewer that
>> you cut it correctly, and then use git-filter-branch to make this
>> cut permanent.
>
> This sounds exactly like what I'd like to do.
> I used "git gui" => "Visualize All Branch History" to choose a nice
> single cutoff point.
> I just didn't know how to apply the cut.
You can read about grafts in the git-filter-branch(1) manpage, in the
gitrepository-layout(5) description of the git repository layout, and in
gitglossary(7), the git glossary.
In short, each line in .git/info/grafts consists of the SHA-1 id of a commit,
followed by a space-separated list of its effective (grafted) parents.
So to cut history at e.g. commit a3eb250f996bf5e (making it the new root
commit, with no parents), you need to put a line containing only this
SHA-1 in the .git/info/grafts file, e.g.:
$ git rev-parse --verify a3eb250f996bf5e >> .git/info/grafts
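The cut can be tried out and inspected before anything is rewritten. The sketch below swaps in `git replace --graft` for the grafts file (newer Git versions deprecate .git/info/grafts in favour of replacement refs); the five-commit repository and the choice of cutoff are made up for illustration:

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

# Helper for the demo only: create an empty commit with a fixed identity.
commit() {
    git -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$1"
}
for n in 1 2 3 4 5; do commit "commit $n"; done

# Pick "commit 3" as the cutoff; it should become the new root.
cutoff=$(git rev-parse --verify HEAD~2)
before=$(git rev-list --count HEAD)

# Record a replacement for the cutoff commit that has no parents.
git replace --graft "$cutoff"

# History as Git now presents it, versus the untouched underlying objects.
after=$(git rev-list --count HEAD)
real=$(git --no-replace-objects rev-list --count HEAD)
git log --oneline
echo "visible before: $before, visible after graft: $after, still stored: $real"
```

At this point nothing has been rewritten: the replacement can be removed again with `git replace -d`, which makes it a convenient stage for checking the cut in a history viewer.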
> So the command to look for is git-filter-branch, right ?
> I'll read the doc.
As you would see in the git-filter-branch(1) documentation, a simple
$ git filter-branch --all
(no filter) would make the history described by grafts permanent.
Note that this rewrites history, and it would make things (much)
harder for any contributor who based his/her work on commits from
before the "rebase".
>>
>> You can later use grafts or the refs/replace/ mechanism to join "current"
>> history with historical repository.
>
> Probably we won't need this, but this sounds rather interesting and is
> good to know.
Grafts were, for example, used to fuse (join) the current and historical
Linux kernel repositories, after the Linux kernel moved from BitKeeper to
Git.
The 'git-replace' mechanism is meant as a modern, transferable and safe
replacement for the grafts file.
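A small sketch of rejoining a truncated history with its archive via `git replace --graft`. It assumes both histories are already present in one repository (here simulated with an orphan branch named current; in practice the old history would be fetched from the archived repository), and all names are hypothetical:

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

# Helper for the demo only: create an empty commit with a fixed identity.
commit() {
    git -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$1"
}

# The "historical" line of development.
commit "old 1"
commit "old 2"
old_tip=$(git rev-parse --verify HEAD)

# An unrelated "current" history, as a truncated repository would have.
git checkout -q --orphan current
commit "new root"
commit "new 1"
new_root=$(git rev-list --max-parents=0 HEAD)
separate=$(git rev-list --count HEAD)

# Pretend the new root commit has the old tip as its parent.
git replace --graft "$new_root" "$old_tip"
joined=$(git rev-list --count HEAD)

echo "current history alone: $separate commits, joined with archive: $joined"
```

Because the join lives in refs/replace/, it can be shared with other clones by fetching that ref namespace, whereas a grafts file stays local to one repository.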
--
Jakub Narebski
Poland
ShadeHawk on #git
* Re: how to delete the entire history before a certain commit
From: Gelonida @ 2010-05-03 23:58 UTC (permalink / raw)
To: git; +Cc: git
Hi Jakub,
Jakub Narebski wrote:
> Gelonida <gelonida@gmail.com> writes:
>> Jakub Narebski wrote:
>>> Gelonida <gelonida@gmail.com> writes:
>>>
>>>> We have a git repository, whose size we want to reduce drastically
>>> If you want to simply _remove_ history before specified commit,
>>> instead of squashing it, the best solution would be to use grafts to
>>> cauterize (cut) history, check using [graphical] history viewer that
>>> you cut it correctly, and then use git-filter-branch to make this
>>> cut permanent.
> You can read about grafts in git-filter-branch(1) manpage, in
> gitrepository-layout(5) git repository layout description, and in
> gitglossary(7) a git glossary.
>
> In short, each line in .git/info/grafts consists of the SHA-1 id of a commit,
> followed by a space-separated list of its effective (grafted) parents.
> So to cut history at e.g. commit a3eb250f996bf5e (making it the new root
> commit, with no parents), you need to put a line containing only this
> SHA-1 in the .git/info/grafts file, e.g.:
>
> $ git rev-parse --verify a3eb250f996bf5e >> .git/info/grafts
>
>> So the command to look for is git-filter-branch, right ?
>> I'll read the doc.
>
>
> As you would see in git-filter-branch(1) documentation, simple
>
> $ git filter-branch --all
The command
git filter-branch --all
did not work for me. It just displays the help text.
However, without '--all',
git filter-branch
seems to have worked.
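A likely explanation is that filter-branch expects a "--" separating its own options from the revision arguments it hands to git rev-list, so a bare --all is treated as an unknown filter-branch option. Without arguments only the current branch (HEAD) is rewritten. The sketch below, on a made-up two-branch repository and using `git replace --graft` in place of the grafts file, shows the form that covers every ref:

```shell
#!/bin/sh
set -e
# Newer Git versions print a warning and pause before filter-branch runs; skip that.
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

# Helper for the demo only: create an empty commit with a fixed identity.
commit() {
    git -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$1"
}
for n in 1 2 3; do commit "commit $n"; done
git branch side              # second branch, left at "commit 3"
commit "commit 4"

# Cut at "commit 2" so that it becomes the new root.
cutoff=$(git rev-parse --verify HEAD~2)
git replace --graft "$cutoff"

# "--" ends filter-branch's own options; "--all" is then passed to rev-list.
git filter-branch -- --all

# Count commits while ignoring replacement refs: the cut is now in the real objects.
head_count=$(git --no-replace-objects rev-list --count HEAD)
side_count=$(git --no-replace-objects rev-list --count side)
echo "current branch: $head_count commits, side: $side_count commits"
```

The pre-rewrite commits remain reachable through the backup refs under refs/original/ (and the replacement ref is still present), so the repository only shrinks after those are removed and the unreachable objects are pruned, or after a fresh clone is made from the rewritten repository.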
Thanks a lot :-)