From: Nicolas Pitre <nico@fluxnic.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>,
git@vger.kernel.org, Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>,
Jay Soffian <jaysoffian@gmail.com>,
Shawn Pearce <spearce@spearce.org>
Subject: Re: gc --aggressive
Date: Tue, 01 May 2012 15:22:27 -0400 (EDT)
Message-ID: <alpine.LFD.2.02.1205011504160.21030@xanadu.home>
In-Reply-To: <7vr4v391s1.fsf@alter.siamese.dyndns.org>
On Tue, 1 May 2012, Junio C Hamano wrote:
> Nicolas Pitre <nico@fluxnic.net> writes:
>
> > One final quick test if you feel like it: I've never been sure that
> > the last comparison in type_size_sort() is correct. Maybe it should be
> > the other way around. Currently it reads:
> >
> > return a < b ? -1 : (a > b);
> >
> > While keeping the size comparison commented out, you could try to
> > replace this line with:
> >
> > return b < a ? -1 : (b > a);
> >
> > If this doesn't improve things then it would be clear that this avenue
> > should be abandoned.
>
> Very interesting. The difference between the two should only matter if
> there are many blobs with exactly the same size, and most of them delta
> horribly with each other. Does the problematic repository exhibit such
> a characteristic?

Not precisely. This is just to test a hypothesis that could explain
the difference in behavior with the phpmyadmin repo.

My hypothesis was that recency order could be skewed by the object size
when many small changes are made to the same files without varying their
size much. So I suggested that a repack run be performed with the
object size removed from the sort criteria. However, it is important
that the last comparison be done in the right direction. Hence my
suggestion above.

> The original tie-breaks based on the address (the earlier object we read
> in the original input comes earlier in the output) and yours make the
> objects later we read (which in turn are from older parts of the history)
> come early, but adjacency between two objects of the same type and the
> same size would not change (if A and B were next to each other in this
> order, your updated sorter will give B and then A still next to each
> other), so I suspect not much would change in the candidate selection.

Note that the size comparison is commented out in those tests. The idea
was to get pure recency order.

Even for objects of the same size, the delta orientation would change,
which might or might not provide a clue.
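
To make the two tie-breaks concrete, here is a minimal sketch. It is not
the actual pack-objects code: struct entry, the pool array and
sorted_ids() are hypothetical stand-ins, and every entry is assumed to
have the same type and size, so only the final address comparison
decides the order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object entry (not git's real struct).
 * All entries are assumed equal in type and size, so only the final
 * address tie-break matters. */
struct entry {
	int id;	/* position in the input; lower means read earlier */
};

enum { N = 5 };
static struct entry pool[N];	/* addresses grow with input order */

/* Tie-break as currently written: lower address sorts first. */
static int cmp_addr_forward(const void *pa, const void *pb)
{
	const struct entry *a = *(const struct entry *const *)pa;
	const struct entry *b = *(const struct entry *const *)pb;
	return a < b ? -1 : (a > b);
}

/* Suggested variant: higher address sorts first. */
static int cmp_addr_reverse(const void *pa, const void *pb)
{
	const struct entry *a = *(const struct entry *const *)pa;
	const struct entry *b = *(const struct entry *const *)pb;
	return b < a ? -1 : (b > a);
}

/* Sort a scrambled pointer list with one of the two tie-breaks and
 * report the resulting input positions in out[]. */
static void sorted_ids(int reverse, int out[N])
{
	static const int scramble[N] = { 2, 0, 4, 1, 3 };
	struct entry *ptrs[N];
	int i;

	assert(out != NULL);
	for (i = 0; i < N; i++)
		pool[i].id = i;
	for (i = 0; i < N; i++)
		ptrs[i] = &pool[scramble[i]];
	qsort(ptrs, N, sizeof(ptrs[0]),
	      reverse ? cmp_addr_reverse : cmp_addr_forward);
	for (i = 0; i < N; i++)
		out[i] = ptrs[i]->id;
}
```

With the forward comparison the list comes out as 0 1 2 3 4, and with
the reversed one as 4 3 2 1 0. The same pairs stay adjacent, as Junio
says, but which member of each pair is visited first is swapped, and
that is the part that could affect delta orientation.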

But this is really just a wild guess without much thinking at this
point, before giving up on this approach.


Nicolas