From: Nix <nix@esperi.org.uk>
To: git@vger.kernel.org
Subject: 2.36.0: enormous numbers of loose objects after enabling partial clone filter: why? how to clean up?
Date: Wed, 04 May 2022 21:26:24 +0100
Message-ID: <877d71cadr.fsf@esperi.org.uk>

So I turned on promisor remotes for my local Chromium tree last week:
the relevant remote now has partialclonefilter = blob:limit=10m. I
wanted to save a bit of space (well, OK, I was hoping for a lot).
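
For reference, a minimal sketch of the kind of configuration described
above. The remote name "origin" and the helper name
enable_blob_limit_filter are assumptions made for illustration; the
message names neither. Setting remote.<name>.promisor alongside the
filter is the usual way to mark a promisor remote, but treat that as an
assumption about this particular setup too.

```shell
# Sketch only: mark an existing remote as a promisor remote and give it
# a blob-size filter. The remote name defaults to "origin" (assumed).
enable_blob_limit_filter() {
    repo=$1
    remote=${2:-origin}
    # Treat the remote as a promisor remote.
    git -C "$repo" config "remote.$remote.promisor" true
    # Ask future fetches from this remote to omit blobs over 10 MiB.
    git -C "$repo" config "remote.$remote.partialclonefilter" "blob:limit=10m"
}

# Usage (hypothetical path):
#   enable_blob_limit_filter ~/src/chromium origin
```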

This turned out to be... a bad idea. The first fetch after that exploded
into some 45GiB of loose objects (nearly six million of them). I spotted
an ancient historical config option gc.pruneExpire=never and removed it:
that would explain the worst of the accumulation, except that the files
are all dated at around the time of my first pull after turning on
partial clone.

A git repack -ad, a git prune --expire=now, and a similar expiry of all
the reflogs have got them down to a mere 112957 objects, 597304
kilobytes (!), but I can't get the count lower than that, despite
deleting every remote (to get rid of the remote-tracking branches) and
doing a full (non-aggressive) repack, a prune, and a git prune-packed
(which had no effect at all). I've not done a great deal of work in this
tree, and no rebases: there should be hardly any legitimate loose
objects, I'd have thought.
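
The cleanup sequence above can be sketched roughly as follows. The
helper name cleanup_loose_objects is hypothetical, and the exact reflog
expiry flags and the order of the steps are assumptions: the message
only says the reflogs were expired "similarly". The final count-objects
call is there just to report what is left.

```shell
# Sketch only: the cleanup steps described above, run against one repo.
cleanup_loose_objects() {
    repo=$1
    # Expire every reflog entry so old history stops pinning objects
    # (flags assumed; the message does not give them).
    git -C "$repo" reflog expire --expire=now --all
    # Repack everything reachable into one pack and drop redundant packs.
    git -C "$repo" repack -a -d
    # Remove unreachable loose objects regardless of age.
    git -C "$repo" prune --expire=now
    # Remove loose objects that are already present in a pack.
    git -C "$repo" prune-packed
    # Report the remaining loose object count and size.
    git -C "$repo" count-objects -v
}

# Usage (hypothetical path):
#   cleanup_loose_objects ~/src/chromium
```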

Indeed, git fsck --verbose --name-objects shows that all these objects
are referenced, mostly in a few fairly recent branches. They also
clearly reside upstream, in the remote-tracking branches. So... why is
git repack refusing to pack them? Is there any way to find out? It's
fairly opaque in its decisions, which usually I don't worry about, but
this space bloat is kind of ridiculous.
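
One way to at least see what the leftover loose objects are is sketched
below. It does not explain repack's decision; it only lists the loose
object IDs found in the object directory and tallies them by type. The
helper names list_loose_objects and summarise_by_type are made up for
this example, and the object directory path in the usage line assumes a
non-bare repository.

```shell
# Sketch only: print the ID of every loose object under an object
# directory (for example .git/objects), one per line.
list_loose_objects() {
    objdir=$1
    for path in "$objdir"/[0-9a-f][0-9a-f]/*; do
        [ -f "$path" ] || continue
        name=${path#"$objdir"/}          # e.g. "ab/cdef..."
        # Skip temporary files and anything else that is not pure hex.
        case ${name#*/} in
            *[!0-9a-f]*) continue ;;
        esac
        printf '%s\n' "${name%%/*}${name#*/}"
    done
}

# Tally "<id> <type> <size>" lines (the format that
# git cat-file --batch-check prints) as "<type> <count> <total bytes>".
summarise_by_type() {
    awk '{ n[$2]++; s[$2] += $3 }
         END { for (t in n) print t, n[t], s[t] }' | sort
}

# Usage, from the top of the work tree:
#   list_loose_objects .git/objects |
#       git cat-file --batch-check | summarise_by_type
```

If nearly everything turns out to be blobs under the 10m limit, that at
least narrows down which objects are being left out of the pack.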

(There are no alternates or anything like that pointing at this repo, so
it's not that.)

-- 
NULL && (void)
