From: Linus Torvalds <torvalds@osdl.org>
To: Nicolas Pitre <nico@cam.org>
Cc: Junio C Hamano <junkio@cox.net>, Git Mailing List <git@vger.kernel.org>
Subject: Re: Horrible re-packing?
Date: Mon, 5 Jun 2006 14:40:02 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.0606051432270.5498@g5.osdl.org>
In-Reply-To: <Pine.LNX.4.64.0606051637490.24152@localhost.localdomain>
On Mon, 5 Jun 2006, Nicolas Pitre wrote:
>
> In other words, the pack shrank to less than half the size of the
> previous one!
Ok, that's a bit more extreme than expected.
It's obviously great news, and says that the approach of sorting by
"reversed name" is a great heuristic, but at the same time it makes me
worry a bit that this thing that is supposed to be a heuristic ends up
being _so_ important from a pack size standpoint. I was happier when it
was more about saving a couple of percent.
Now, your repo may be a strange case, and it just happens to fit the
suggested hash, but on the other hand it's nice to see three totally
different repositories that all improve, albeit with wildly different
numbers.
I'm wondering if we could have some "incremental optimizer" thing that
would take a potentially badly packed archive, and just start looking for
better delta chain possibilities? That way we would still try to get a
good initial pack with some heuristic, but we could have people run the
incremental improver every once in a while looking for good deltas that it
missed due to the project not fitting the heuristics.
The fact that we normally do incremental repacking (and "-f" is unusual)
is obviously one thing that makes us less susceptible to bad patterns (and
is also what allows us to run the incremental optimizer - any good delta
choice will automatically percolate into subsequent versions, including
packs that have been cloned).
So the packing strategy itself seems to be very stable (and partly _due_
to the "optimization" to re-use earlier pack choices), but we currently
lack the thing that fixes up any initial bad assumptions in case they
happen.
Linus
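
For readers following the thread, the "reversed name" ordering discussed
above can be sketched roughly as below. This is an illustration only, not
the code in pack-objects: the function names are hypothetical, and it
assumes the heuristic amounts to comparing path names from their last
character backwards, so that files with the same ending (same basename,
same extension) land next to each other in the list that the delta search
then walks with a sliding window.

```python
def reversed_name_key(path):
    # Hypothetical helper: read the path back to front, so the trailing
    # characters (extension, then basename) are compared first.
    return path[::-1]


def order_for_delta_search(paths):
    # Sort so that names sharing a long common ending become neighbours.
    # Neighbours are the objects a windowed delta search would try first.
    return sorted(paths, key=reversed_name_key)


if __name__ == "__main__":
    # Made-up example paths, not taken from any of the repositories
    # mentioned in the thread.
    example = [
        "drivers/net/Makefile",
        "fs/ext3/inode.c",
        "Makefile",
        "fs/ext2/inode.c",
        "README",
    ]
    for path in order_for_delta_search(example):
        print(path)
```

With this ordering the two inode.c files end up adjacent, as do the two
Makefiles, even though they live in different directories. A plain
forward sort by path would have scattered them.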