From: Jon Nelson <jnelson@jamponi.net>
To: unlisted-recipients:; (no To-header on input)
Cc: git@vger.kernel.org
Subject: Re: git gc / git repack not removing unused objects?
Date: Fri, 5 Feb 2010 15:04:13 -0600
Message-ID: <cccedfc61002051304t6030d3f7if4bb14709ee6c918@mail.gmail.com>
In-Reply-To: <alpine.LFD.2.00.1002051539080.1681@xanadu.home>
On Fri, Feb 5, 2010 at 2:51 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Fri, 5 Feb 2010, Jon Nelson wrote:
>
>> [Using git 1.6.4.2]
>>
>> In one repo I have (136G objects directory, fully packed) I'm having
>> some trouble.
>> I've run git-gc --prune=now, git repack -Adf, and so on a half-dozen
>> times and each time I do so it gets bigger, not smaller.
>
> Please tell us more.
I'll tell you whatever I can -- as soon as I know what it is you want.
>> Setting that aside for the moment, however, I've run into a stranger problem.
>>
>> So I use "git verify-pack -v > gvp.out" and "sort -k3nr < gvp.out |
>> head -n 20" to find the top 20 largest blobs.
>> So I have a blob, b32c3d8e8e24d8d3035cf52f606c2873315fe2b8, and now I
>> want to know what tree (or trees) it is in, so I try this:
>>
>>
>> for i in $( git branch -a | sed -e 's/\*//g' | grep -v branch ); do
>>   if git ls-tree -l -r -t $i | grep b32c3d8e8e24d8d3035cf52f606c2873315fe2b8 > /dev/null; then
>>     echo $i
>>   fi
>> done
>>
>> The results: no branch or tree appears to contain that blob.
>
> What you did above is simply to list trees that are reachable from the
> _heads_ of your branches. If the blob belongs to a commit which isn't
> the latest revision of any of your branches then you won't see it like
> that.
>
>> So I tried a different approach:
>>
>> for i in $( grep tree gvp.out | awk '{ print $1 }' ); do
>>   if git ls-tree $i | grep b32c3d8e8e24d8d3035cf52f606c2873315fe2b8 > /dev/null; then
>>     echo $i
>>   fi
>> done
>>
>> This time, I find (at least) one tree
>> (d813af1537358496ca34958bbff08b87590607bf) with the blob.
>> But which branches might that tree appear in? None.
>>
>> For each branch, I ran "git ls-tree -l -r -t" and saved the output in
>> a file (one per branch).
>> Then I grepped each file for the tree
>> (d813af1537358496ca34958bbff08b87590607bf) - no luck.
>> I grepped each file for the blob (b32...) - no luck.
>>
>> The results seem to suggest that I have packed trees which reference
>> blobs, but that the trees themselves are not referenced in any branch
>> and therefore I would expect that they would be pruned.
>
> NO. If those trees and blobs are still there then they do get
> referenced. But not from the latest commit on any of your branches.
> You need to dig further down in history to find a commit that actually
> references that blob/tree. One easy method is to do:
>
> git log --raw --all
>
> and within the pager ('less' by default) simply search for "b32c3d8".
OK. I'm piping "git log --raw --all" to a file this very moment. It'll
take a while. However, one thing I did not mention is that there
*should* be a 1:1 correlation between branches and commits. As in,
every time I did a commit, the commit was on a new branch. I'll look
into this, as I've fiddled with the repo a bunch of different ways
lately. I suspect the answer will be found in the logs.
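
For what it's worth, a non-interactive version of that search might look
roughly like the sketch below. It is only a sketch: the function name
commits_with_object is made up for illustration (it is not a git command),
and it assumes the full 40-character object name is passed in. It walks
every commit reachable from any ref, not just the branch heads, and lists
each commit's full tree, so it will be slow on a repository this size.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): print every commit, reachable from
# any ref, whose tree contains the given object (a blob or a sub-tree).
# Usage: commits_with_object <full-40-hex-object-name>
commits_with_object() {
    obj=$1
    # "git rev-list --all" walks the whole history of all refs,
    # not only the commits at the heads of the branches.
    git rev-list --all | while read -r commit; do
        # "git ls-tree -r -t" lists blobs and sub-trees recursively.
        # The commit's own top-level tree is not listed as an entry,
        # so a top-level tree id will not match here.
        if git ls-tree -r -t "$commit" | grep -F -q "$obj"; then
            echo "$commit"
        fi
    done
}
```

Called as "commits_with_object b32c3d8e8e24d8d3035cf52f606c2873315fe2b8",
it should print one commit id per line, or nothing at all if no commit
reachable from a ref contains the blob.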
Thanks for the response!
--
Jon