git.vger.kernel.org archive mirror
From: Jakub Narebski <jnareb@gmail.com>
To: Gelonida <gelonida@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: how to delete the entire history before a certain commit
Date: Mon, 03 May 2010 16:42:58 -0700 (PDT)
Message-ID: <m3k4rkft7k.fsf@localhost.localdomain>
In-Reply-To: <hrnl7o$nnf$1@dough.gmane.org>

Gelonida <gelonida@gmail.com> writes:
> Jakub Narebski wrote:
>> Gelonida <gelonida@gmail.com> writes:
>> 
>>> We have a git repository, whose size we want to reduce drastically due
>>> to frequent clone operations and a slow network connection.
>>  
>> Why frequent *clone* operations, instead of using "git fetch" or
>> equivalent ("git pull" which is fetch+merge, or "git remote update")?
> 
> The clone is part of the deployment process and would IIRC be equivalent
> to a 'svn export'
> Almost certainly one can also improve this, but this should probably
> be discussed in another thread.
> 
> The sequence on some remote hosts is:
> - git clone tag dirname
> - rm -rf dirname/.git
> - tar cvfz dirname.tgz dirname
 
Why not simply (after enabling 'upload-archive' service in git-daemon
if you serve via git:// URL, and probably similar in the case of
SSH access management by gitosis or gitolite)

  $ git archive --remote=<repo> <tag>

(where <repo> is <dirname> in your example)?
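
To make that concrete, here is a minimal sketch of how the archive
command could replace the clone / rm / tar sequence. The throwaway
repository, the tag name "v1.0", and the "dirname/" prefix are made up
for illustration; a local path stands in for the remote URL, so nothing
needs to be enabled on a server for this demo.

```shell
#!/bin/sh
# Sketch only: repository, tag and file names below are hypothetical.
set -e

work=$(mktemp -d)

# Build a tiny repository with one tagged commit to archive from.
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example User"
git config user.email "example@example.invalid"
echo "hello" > file.txt
git add file.txt
git commit -q -m "initial commit"
git tag v1.0

# git archive writes a tar stream to stdout; --prefix puts every file
# under dirname/, as the clone would have, and no .git directory is
# ever created on the receiving side.
cd "$work"
git archive --remote="$work/repo" --prefix=dirname/ v1.0 | gzip > dirname.tgz

# List the tarball contents to check the result.
tar tzf dirname.tgz
```

With a real server, only the value of --remote should need to change.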

>>> The idea is following:
>>>
>>> * archive the git repository just in case we really have to go back in
>>> history.
>>>
>>>
>>> create a new git repository, which shall only contain last month's activity.
>>>
>>> all changes before should be squashed together.
>>> It would be no problem if the very first commit remains unmodified.
>> 
>> If you want to simply _remove_ history before a specified commit,
>> instead of squashing it, the best solution would be to use grafts to
>> cauterize (cut) history, check using a [graphical] history viewer that
>> you cut it correctly, and then use git-filter-branch to make this
>> cut permanent.
> 
> This sounds exactly like what I'd like to do.
> I used "git gui" => "Visualize All Branch History" to choose a nice
> single cutoff point.
> I just didn't know how to apply the cut.

You can read about grafts in the git-filter-branch(1) manpage, in
gitrepository-layout(5), which describes the git repository layout,
and in gitglossary(7), the git glossary.

In short, each line in .git/info/grafts consists of the SHA-1 id of a
commit, followed by a space-separated list of its effective (grafted)
parents.  So to cut history at e.g. commit a3eb250f996bf5e, making it a
parentless root commit, you need to put a line containing only its full
SHA-1 in the .git/info/grafts file, e.g.:

  $ git rev-parse --verify a3eb250f996bf5e >> .git/info/grafts
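
As a way to check the cut before making it permanent, here is a small
sketch on a throwaway repository. The three commits and the choice of
HEAD~1 as the cutoff are made up for illustration, and it assumes a Git
version that still honors the grafts file (newer releases may print a
deprecation hint).

```shell
#!/bin/sh
# Sketch only: a scratch repository with three hypothetical commits.
set -e

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example User"
git config user.email "example@example.invalid"

for n in 1 2 3; do
    echo "$n" > file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

# Pick the cutoff: the second commit becomes the new root.
cutoff=$(git rev-parse --verify HEAD~1)

# A graft line with only the commit id means "this commit has no parents".
mkdir -p .git/info
echo "$cutoff" >> .git/info/grafts

# History viewers and plumbing now see the cut history.
git log --oneline
count=$(git rev-list --count HEAD)
echo "commits visible after graft: $count"
```

Removing the line from .git/info/grafts undoes the cut, since nothing
has been rewritten at this point.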
 
> So the command to look for is git-filter-branch, right ?
> I'll read the doc.


As you will see in the git-filter-branch(1) documentation, a simple

  $ git filter-branch -- --all

(no filters; the "--" separates filter-branch options from the revision
options) makes the history described by grafts permanent.

Note that this rewrites history, which makes things (much) harder for
any contributor who based his/her work on commits from before the
rewrite.
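
Putting the graft and the rewrite together, a sketch of the whole
sequence on a throwaway repository might look as follows. The commits
and the cutoff are made up; the "--" before "--all" separates
filter-branch's own options from the revision options; the cleanup
steps at the end follow the checklist for shrinking a repository in
git-filter-branch(1); and FILTER_BRANCH_SQUELCH_WARNING only silences
the notice that recent Git versions print.

```shell
#!/bin/sh
# Sketch only: a scratch repository with three hypothetical commits.
set -e

# Recent Git versions warn (and pause) before running filter-branch.
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example User"
git config user.email "example@example.invalid"

for n in 1 2 3; do
    echo "$n" > file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

# Cut history so that the second commit becomes the root.
mkdir -p .git/info
git rev-parse --verify HEAD~1 >> .git/info/grafts

# Rewrite all refs with no filters: the grafted parents are baked in.
git filter-branch -- --all

# The grafts file is no longer needed once the rewrite is done.
rm .git/info/grafts
count=$(git rev-list --count HEAD)
echo "commits after rewrite: $count"

# The old commits stay reachable via refs/original/ and the reflogs,
# so the repository only shrinks after those are dropped.
git for-each-ref --format='%(refname)' refs/original/ |
    xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc -q --prune=now
```

Try this on a copy first; after the cleanup there is no easy way back.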
 
>> 
>> You can later use grafts or the refs/replace/ mechanism to join
>> "current" history with the historical repository.
> 
> Probably we won't need this, but this sounds rather interesting and is
> good to know.

Grafts were used, for example, to fuse (join) the current and
historical Linux kernel repositories after the Linux kernel moved from
BitKeeper to Git.

The 'git-replace' mechanism is meant as a modern, transferable, and
safe replacement for the grafts file.
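
For illustration, here is a sketch of joining a truncated "current"
history back onto a "historical" one with git-replace. The branch names
and commits are made up, both histories live in one scratch repository
for simplicity, and it assumes a Git recent enough to have
"git replace --graft".

```shell
#!/bin/sh
# Sketch only: branch names and commits below are hypothetical.
set -e

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example User"
git config user.email "example@example.invalid"

# Two commits of "historical" development.
for n in 1 2; do
    echo "$n" > file.txt
    git add file.txt
    git commit -q -m "old commit $n"
done
git branch historical

# A "current" history that starts over from a parentless root commit.
git checkout -q --orphan current
git commit -q -m "current root"
before=$(git rev-list --count current)

# Record a replacement for the root commit that has the tip of the
# historical branch as its parent; it is stored under refs/replace/.
git replace --graft current historical
after=$(git rev-list --count current)

echo "commits before joining: $before"
echo "commits after joining:  $after"

# The original, unjoined view is still available on request.
git --no-replace-objects rev-list --count current
```

Since the replacement is an ordinary ref under refs/replace/, it can be
pushed and fetched like other refs, which a grafts file cannot.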

-- 
Jakub Narebski
Poland
ShadeHawk on #git
