git.vger.kernel.org archive mirror
From: Robert Dailey <rcdailey.lists@gmail.com>
To: Git <git@vger.kernel.org>
Subject: Unable to shrink repository size
Date: Wed, 5 Mar 2014 20:55:30 -0600
Message-ID: <CAHd499AW6nev81iVVhuoYfT0us28SSBDwbHCBa3teYB=cJR99g@mail.gmail.com>

I have a git-svn clone that I've been working on which is a full and
complete conversion of our SVN repository at work.

It started out as 1.4GB (git count-objects -v, looking at
'size-pack'). I have run the following script to strip from the repo
history a directory that I suspect is huge (we had a third-party
library checked in that, uncompressed, was about 1.2GB in size):

-------------------------------
# paths to strip from history, passed as arguments (paths containing
# spaces are not handled)
files="$*"
echo "Removing: $files..."
git filter-branch --index-filter \
    "git rm -rf --cached --ignore-unmatch $files" -- --all

# remove the temporary history git-filter-branch otherwise leaves
# behind for a long time
rm -rf .git/refs/original/ && git reflog expire --expire=now --all &&
git gc --aggressive --prune=now
-------------------------------

Even though I seem to have removed it, the repository size (looking at
'size-pack' again) only went down about 200MB, so it's at 1.2GB now.
There are about 3-5 years of commit history in this repository.
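
For comparing the figure before and after a cleanup run, a minimal
sketch that extracts only the 'size-pack' value. The function name
pack_size_kib is made up for illustration; git count-objects -v
reports this field in KiB.

```shell
# Print the packed size, in KiB, of the repository at path $1
# (default: current directory), taken from the 'size-pack' field.
# pack_size_kib is a hypothetical helper name.
pack_size_kib() {
    git -C "${1:-.}" count-objects -v | awk '/^size-pack:/ { print $2 }'
}
```

Running pack_size_kib before the script and again after git gc gives
two numbers that can be subtracted directly.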

What I'd like to do is somehow hunt down the largest commit (*not*
blob) in the entire history of the repository to hopefully find out
where huge directories have been checked in.
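
One way this ranking could be approximated is sketched below. It is
an illustration under stated assumptions, not a tested recipe for
this repository: the function name commit_sizes is hypothetical,
"size" is taken to mean the total uncompressed bytes of the blobs a
commit adds or modifies, and merge commits come out as 0 because git
diff-tree prints no diff for them by default. It starts one pipeline
per commit, so it will be slow on a long history.

```shell
# Rank commits by the uncompressed bytes of the blobs they add or modify.
# commit_sizes is a hypothetical helper name; run it inside the repository.
commit_sizes() {
    git rev-list --all |
    while read -r commit; do
        # Raw diff lines look like ":oldmode newmode oldsha newsha status<TAB>path".
        # Field 4 is the new blob ID; submodule entries (mode 160000) are
        # skipped because their IDs are commits, not blobs.
        bytes=$(git diff-tree -r --root --no-commit-id --diff-filter=AM "$commit" |
            awk '$2 != "160000" { print $4 }' |
            git cat-file --batch-check='%(objectsize)' |
            awk '{ sum += $1 } END { printf "%.0f", sum }')
        printf '%s\t%s\n' "$bytes" "$commit"
    done | sort -rn
}
```

Piping the result through head -n 20 shows the heaviest commits;
git show --stat on any of them then lists the paths involved.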

I can't do a search for the largest file (which most Google results
seem to show how to do) since the culprit is really thousands of unnecessary
files checked into a single subdirectory somewhere in history.
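
A per-directory total is one way around that. The sketch below lists
every blob reachable from any ref and sums the sizes by leading
directory. The function name dir_sizes and its depth argument are
made up for illustration. The sizes are uncompressed object sizes,
not space used in the pack, and each blob is counted once, under the
first path at which git rev-list reports it.

```shell
# Total the uncompressed size of all blobs in history, grouped by the
# first $1 directory components of their path (default: 1).
# dir_sizes is a hypothetical helper name; run it inside the repository.
dir_sizes() {
    depth="${1:-1}"
    git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk -v depth="$depth" '
        $1 == "blob" {
            # The path is everything after the type and size fields.
            path = $0
            sub(/^[^ ]+ [^ ]+ /, "", path)
            n = split(path, parts, "/")
            # Use at most "depth" directory components, never the file name.
            limit = (n - 1 < depth) ? n - 1 : depth
            key = "."
            for (i = 1; i <= limit; i++)
                key = (i == 1) ? parts[1] : key "/" parts[i]
            total[key] += $2
        }
        END { for (k in total) printf "%.0f\t%s\n", total[k], k }
    ' | sort -rn
}
```

Calling dir_sizes 2 | head -n 20 would show the twenty heaviest
two-level directories. Any directory that still appears after a
filter-branch run is still referenced from some ref.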

Can anyone offer me some advice to help me reduce the size of my repo
further? Thanks.


Thread overview: 5+ messages
2014-03-06  2:55 Robert Dailey [this message]
2014-03-06  5:21 ` Unable to shrink repository size Elijah Newren
2014-03-06  7:46 ` Fredrik Gustafsson
2014-03-06 12:56 ` Duy Nguyen
2014-03-06 15:25 ` Jeff King
