git.vger.kernel.org archive mirror
From: "Elijah Newren" <newren@gmail.com>
To: git@vger.kernel.org
Subject: Trying to use git-filter-branch to compress history by removing large, obsolete binary files
Date: Sun, 7 Oct 2007 15:23:59 -0600
Message-ID: <51419b2c0710071423y1b194f22gb6ccaa57303029d1@mail.gmail.com>

Hi,

I'm using git-cvsimport to import some CVS repos, which unfortunately
include dozens of large regression test output files in their ancient
history, some of them hundreds of megabytes in size.  I'd like to
prune them out of the git history (I don't have the access needed to
prune them out of the CVS history), but I'm running into problems.

The following set of instructions will duplicate my problem with a
smaller repo; why is the local git repository bigger after running
git-filter-branch rather than smaller as I'd expect?  I'm probably
missing something obvious, but I have no idea what it is.

The steps:

# Make a small repo
mkdir test
cd test
git init
echo hi > there
git add there
git commit -m 'Small repo'

# Add a random 10M binary file
dd if=/dev/urandom of=testme.txt count=10 bs=1M
git add testme.txt
git commit -m 'Add big binary file'

# Remove the 10M binary file
git rm testme.txt
git commit -m 'Remove big binary file'

# Compress the repo, see how big the repo is
git gc --aggressive --prune
du -ks .                       # 10548K
du -ks .git                    # 10532K

# Try to rewrite history to remove the binary file
git-filter-branch --tree-filter 'rm -f testme.txt' HEAD
git reset --hard

# Try to recompress and clean up, then check the new size
git gc --aggressive --prune
du -ks .                       # 10580K !?!?!?
du -ks .git                    # 10564K
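
One way to narrow this down might be to check whether the big blob is
still reachable from anything after the rewrite.  The following is only
a diagnostic sketch, not a fix: it builds a throwaway repo like the one
above (with a 1M file instead of 10M, and a placeholder identity that
is purely an assumption for the example), then lists the largest blob
reachable from any ref or reflog entry.

```shell
#!/bin/sh
# Diagnostic sketch: which large blobs are still reachable?
set -e

# Throwaway repo mirroring the steps above (smaller file to keep it quick)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity (assumption)
git config user.name "Example"
echo hi > there
git add there
git commit -q -m 'Small repo'
head -c 1048576 /dev/urandom > testme.txt
git add testme.txt
git commit -q -m 'Add big binary file'
git rm -q testme.txt
git commit -q -m 'Remove big binary file'

# Walk every object reachable from any ref or reflog entry, print
# "type size path" for each, keep only blobs, and show the largest one.
largest=$(git rev-list --objects --all --reflog |
  git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
  grep '^blob' | sort -k2 -n | tail -1)
echo "$largest"

# Pack and loose-object sizes as git itself reports them
git count-objects -v
```

Running just the rev-list pipeline in the real repo after
git-filter-branch should show whether testme.txt is still listed.  If it
is, some ref or reflog entry is presumably still pointing at the old
history, which would keep git gc from pruning the blob.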


Thanks,
Elijah

