From: Mike Hommey <mh@glandium.org>
To: Jeff King <peff@peff.net>
Cc: "Duy Nguyen" <pclouds@gmail.com>,
"John Keeping" <john@keeping.me.uk>,
"Дилян Палаузов" <dilyan.palauzov@aegee.org>,
"Git Mailing List" <git@vger.kernel.org>
Subject: Re: git pull & git gc
Date: Thu, 19 Mar 2015 13:26:17 +0900
Message-ID: <20150319042617.GA1072@glandium.org>
In-Reply-To: <20150319041453.GB29437@peff.net>

On Thu, Mar 19, 2015 at 12:14:53AM -0400, Jeff King wrote:
> On Thu, Mar 19, 2015 at 11:01:17AM +0900, Mike Hommey wrote:
>
> > > I don't think packing the unreachables is a good plan. They just end up
> > > accumulating then, and they never expire, because we keep refreshing
> > > their mtime at each pack (unless you pack them once and then leave them
> > > to expire, but then you end up with a large number of packs).
> >
> > Note, sometimes I wish unreachables were packed. Recently, I ended up in
> > a situation where running gc created something like 3GB of data as per
> > du, because I suddenly had something like 600K unreachable objects, each
> > of them, as a loose object, taking at least 4K on disk. This made my
> > .git take 5GB instead of 2GB. That surely didn't feel like garbage
> > collection.
>
> That's definitely a thing that happens, but it is a bit of a corner
> case. It's unusual to have such a large number of unreferenced objects
> all at once.
>
> I don't suppose you happen to remember the details, but would a lower
> expiration time (e.g., 1 day or 1 hour) have made all of those objects
> go away? Or were they really from some extremely recent event (of
> course, "event" here might just have been "I did a full repack right
> before rewriting history" which would freshen the mtimes on everything
> in the pack).

Unfortunately, I don't know the exact details. But yes, I guess a lower
expiration time might have helped.

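For readers following along: the grace period in question is the one set
by gc.pruneExpire (default "2 weeks ago") or by git gc --prune=<date>.
Expiry is decided from each loose object's mtime, which is why freshening
matters so much. The following is only a rough model of that rule, not
git's code; split_by_expiry and its parameters are hypothetical names
made up for this illustration.

```python
import os
import time


def split_by_expiry(paths, expire_seconds, now=None):
    """Partition files into (expired, kept) by comparing mtime to a cutoff.

    Hypothetical helper: it only models "older than the grace period
    means prunable" and ignores reachability entirely.
    """
    now = time.time() if now is None else now
    cutoff = now - expire_seconds
    expired, kept = [], []
    for path in paths:
        # A file whose mtime was refreshed recently survives the cutoff.
        if os.path.getmtime(path) < cutoff:
            expired.append(path)
        else:
            kept.append(path)
    return expired, kept
```

Under this model, with expire_seconds=3600 an object last touched two
hours ago is expired, while one freshened a minute ago is kept no matter
how old its contents are.
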
> Certainly the "loosening" behavior for unreachable objects has corner
> cases like this, and they suck when you hit one. Leaving the objects
> packed would be better, but IMHO is not a viable alternative unless
> somebody comes up with a plan for segregating the "old" objects in a way
> that they actually expire eventually, and don't just keep getting
> repacked and freshened over and over.

It sure is a corner case. On the other hand, when it happens, every
single git operation calls git gc --auto, which happily spends 5 minutes
burning CPU only to end up doing nothing in practice. That is extra salt
in the wound if you are on battery.

6700 loose objects seems easy to reach on a repo with 6M objects...
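
To make the threshold concrete: 6700 is the default value of gc.auto.
As far as I understand it, git gc --auto does not count every loose
object; it samples the objects/17 directory and extrapolates across the
256 fan-out directories. The sketch below models that check under this
assumption. It is not git's implementation, and too_many_loose_objects
is a hypothetical name.

```python
import os

HEX_DIGITS = set("0123456789abcdef")


def too_many_loose_objects(git_dir, gc_auto=6700):
    """Estimate whether the loose object count exceeds gc_auto.

    Assumption: only objects/17 is sampled, and its entry count is
    compared against gc_auto spread over 256 directories (rounded up).
    """
    if gc_auto <= 0:
        return False  # a non-positive threshold disables the check
    per_dir_limit = (gc_auto + 255) // 256
    sample_dir = os.path.join(git_dir, "objects", "17")
    try:
        names = os.listdir(sample_dir)
    except FileNotFoundError:
        return False
    # Loose object files are named by the remaining 38 hex digits.
    loose = sum(1 for n in names if len(n) == 38 and set(n) <= HEX_DIGITS)
    return loose > per_dir_limit
```

With the default of 6700, the per-directory limit works out to 27, so
28 loose objects in objects/17 are enough to trigger a gc. If gc then
leaves the same unreachable objects loose, the check fires again on the
next command.
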
Mike