From: Dmitry Potapov <dpotapov@gmail.com>
To: Andreas Ericsson <ae@op5.se>
Cc: Elijah Newren <newren@gmail.com>,
	Alex Riesen <raa.lkml@gmail.com>,
	Frank Lichtenheld <frank@lichtenheld.de>,
	git@vger.kernel.org
Subject: Re: Trying to use git-filter-branch to compress history by removing large, obsolete binary files
Date: Mon, 8 Oct 2007 16:40:17 +0400
Message-ID: <20071008124017.GA22129@potapov>
In-Reply-To: <4709F805.8050704@op5.se>

On Mon, Oct 08, 2007 at 11:27:33AM +0200, Andreas Ericsson wrote:
> Dmitry Potapov wrote:
> >OTOH, if you want to have a clean repository immediately, I believe
> >'git clone' is a better option. After you made a local clone using
> >it, 'git gc' should remove old garbage.
> >
> 
> A clone only fetches revs reachable from a ref, so pruning immediately
> after a clone is completely pointless.

Not true. git-clone copies the whole pack, so the clone can contain
unreachable objects. Here is a simple script demonstrating that, without
garbage collection, the cloned repository is the same size as the
original one.

===========================================
# Make a small repo
mkdir test
cd test
git init
echo hi > there
git add there
git commit -m 'Small repo'

# Add a random 10M binary file
dd if=/dev/urandom of=testme.txt count=10 bs=1M
git add testme.txt
git commit -m 'Add big binary file'

# Remove the 10M binary file
git rm testme.txt
git commit -m 'Remove big binary file'

# Compress the repo, see how big the repo is
git gc --aggressive --prune
du -ks .                       # 10348
du -ks .git                    # 10344

git-whatchanged

# Try to rewrite history to remove the binary file
git-filter-branch --tree-filter 'rm -f testme.txt' HEAD
git reset --hard

# Remove original refs
rm .git/refs/original/refs/heads/master

# Move back to the parent directory
cd ..

# Clone repository
git-clone -l test/.git test2

cd test2
du -ks .git # 10360

# Now run garbage collection
git gc
du -ks .git # 96

===========================================

Dmitry
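
The du figures in the script above can also be cross-checked from inside
Git. The following is only a rough sketch, assuming a Git recent enough to
support `git -C`; the helper names pack_kib and unreachable_count are made
up for illustration and do not come from the thread.

```shell
# Hypothetical helpers (names are illustrative, not from the thread).

# Print the on-disk size of all packs, in KiB, for the repository at $1.
# Reads the "size-pack:" line of `git count-objects -v`.
pack_kib() {
    git -C "$1" count-objects -v | awk '/^size-pack:/ {print $2}'
}

# Print how many objects in the repository at $1 are not reachable from
# any ref. --no-reflogs keeps reflog entries from counting as references.
unreachable_count() {
    git -C "$1" fsck --unreachable --no-reflogs 2>/dev/null |
        awk '/^unreachable/ {n++} END {print n + 0}'
}
```

Running `pack_kib test2` and `unreachable_count test2` before and after the
final 'git gc' in the script should show whether the clone still carries
the objects that the rewritten history no longer references.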
