git.vger.kernel.org archive mirror
* Blobs not referenced by file (anymore) are not removed by GC
@ 2014-12-08 16:22 Martin Scherer
       [not found] ` <CAFY1edaEq1zYV0vgSfiPAXU6bqVBzaA-apVnSn8DBMbzcAa2tQ@mail.gmail.com>
  2014-12-09 14:14 ` Jeff King
  0 siblings, 2 replies; 9+ messages in thread
From: Martin Scherer @ 2014-12-08 16:22 UTC
  To: git

Hi,

After using BFG on a repo with certain directory globs, all of those
file names are gone from history, but the underlying blobs are not
removed by garbage collection. So the blobs are not deleted; only the
file names are no longer associated with them. I wonder whether I have
discovered a bug (at least in BFG). I would expect git to notice that
these blobs are not used in any way (so they must still be associated
with something, right?).

# invoke bfg --delete-folders something multiple times with different
# patterns

# try to cleanup

git gc --aggressive --prune=now # big blobs still in history
git fsck # no results
git fsck --full  --unreachable --dangling # no results
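
One possibility to rule out at this point (an assumption, not something confirmed in this report): git fsck treats objects referenced only from a reflog as reachable unless --no-reflogs is given, and git gc keeps such objects too. A minimal sketch of checking for that and then pruning follows. The function names are hypothetical, and the second helper should only be run in a scratch clone because it discards all reflog history.

```shell
# List objects that become unreachable once reflog entries are ignored.
# (Plain "git fsck --unreachable" counts reflog entries as references.)
list_unreachable_ignoring_reflogs() {
    git fsck --unreachable --no-reflogs
}

# Hypothetical helper: expire every reflog entry, then prune immediately.
# Assumption: nothing in the reflogs is still needed.
prune_unreferenced() {
    git reflog expire --expire=now --all &&
    git gc --quiet --prune=now
}
```

If the first function lists the large blob, the reflog was what kept it alive; if it lists nothing, something else still references the blob and pruning will not help.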

To verify that the blobs are still there, see the output of:

git gc && git verify-pack -v .git/objects/pack/pack-*.idx \
  | egrep "^\w+ blob\W+[0-9]+ [0-9]+ [0-9]+$" \
  | sort -k 3 -n -r > bigobjects.txt

head bigobjects.txt # outputs:
# 9451427d7335395779b91864418630d2f0af780a blob   7895212 1869047 7657491
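
Since verify-pack only shows that an object sits in a pack, a complementary check is whether any ref still reaches it. A minimal sketch, assuming it is run inside the repository; find_blob_refs is a hypothetical helper name, not part of git or BFG:

```shell
# Hypothetical helper: print "<sha> <path>" for each place the given object
# id is still reachable from any ref (branches, tags, remotes, and so on).
find_blob_refs() {
    git rev-list --all --objects | grep "^$1"
}

# Example with the blob id from the report above:
# find_blob_refs 9451427d7335395779b91864418630d2f0af780a
```

Empty output would suggest that no ref reaches the blob, in which case reflogs or other non-ref roots are the next thing to look at.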


Also, if BFG is told to remove the biggest blob (bfg -B 1) with
--no-blob-protection, it does not succeed in removing it.

--- output of bfg -B 1

Found 1 blob ids for large blobs - biggest=7895212 smallest=7895212
....

BFG aborting: No refs to update - no dirty commits found??
---

The repo can be found here.

https://github.com/marscher/stallone_stale_objects

I will restart all over to cleanup the history, but I guess this might
be interesting for git developers.


Best,
Martin


Thread overview: 9+ messages
2014-12-08 16:22 Blobs not referenced by file (anymore) are not removed by GC Martin Scherer
     [not found] ` <CAFY1edaEq1zYV0vgSfiPAXU6bqVBzaA-apVnSn8DBMbzcAa2tQ@mail.gmail.com>
2014-12-08 16:47   ` Roberto Tyley
2014-12-09 14:14 ` Jeff King
2014-12-09 16:01   ` Roberto Tyley
2014-12-09 16:11     ` Jeff King
2014-12-09 22:15       ` Roberto Tyley
2014-12-10  7:11         ` Jeff King
2014-12-10 16:07           ` Junio C Hamano
2014-12-10 23:41             ` Roberto Tyley
