* Some more pack-objects tweaks
@ 2006-02-24 10:38 Junio C Hamano
From: Junio C Hamano @ 2006-02-24 10:38 UTC
  To: git

I've been working on more pack-objects improvements.  There will
be two tweaks in the "next" branch I'll be pushing out tonight.

 * rev-list reports full pathnames not just basenames for
   contained trees and blobs.  pack-objects hashes the incoming
   names (and names obtained from "negative" trees when
   --objects-edge aka "thin pack" is used) taking into account
   the dirname and basename part.

   Earlier, I had a patch that hashed the whole pathname, and
   found it performed worse than the original "hash just the
   basename" approach, so I never published it.  The idea in
   this round is to give "Makefile" and "t/Makefile" different
   but close hash values.  Type-size sort groups "Makefile"s
   from different revs together, and another group of
   "t/Makefile"s is found close by.

 * when creating a "thin" pack, disable the code that keeps
   reused deltas from making a delta chain too long (see the
   15b4d57 and ab7cd7b commit logs for details).

   This is because limiting the delta chain is more costly than
   letting it grow by reusing preexisting deltas, and a "thin"
   pack is used by first exploding it, so at that point delta
   depth does not matter.
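
To make the first item concrete, here is a minimal Python sketch
of a name hash that gives "Makefile" and "t/Makefile" different
but close values.  It is only an illustration, not the code in
pack-objects: the function name, the 16/16 bit split and the
multiplier are all assumptions.  The basename fills the high bits
and the dirname the low bits, so sorting by the hash keeps paths
that share a basename next to each other.

```python
def name_hash(path: str) -> int:
    """Hypothetical 32-bit key: basename in the high half, dirname in the low half."""
    dirname, _, basename = path.rpartition("/")

    def h16(s: str) -> int:
        # Simple multiplicative hash truncated to 16 bits (assumed, for illustration).
        v = 0
        for byte in s.encode():
            v = (v * 11 + byte) & 0xFFFF
        return v

    return (h16(basename) << 16) | h16(dirname)


paths = ["Makefile", "t/README", "t/Makefile", "README", "Documentation/Makefile"]

# Sorting by the hash alone already clusters identical basenames;
# the dirname only perturbs the low bits within each cluster.
ordered = sorted(paths, key=name_hash)
for p in ordered:
    print(f"{name_hash(p):08x}  {p}")
```

With this layout the three Makefiles land next to each other and
the two READMEs form their own run, while no two paths get the
same value.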

In the Linux 2.6 repository, I created a thin pack for
v2.6.14..v2.6.15-rc1 (36k objects).  Here are the results:

    [without either patch]
    15463034 bytes
    Total 36248, written 36248 (delta 29046), reused 28306 (delta 22512)
    real    1m38.157s       user    1m32.520s       sys     0m5.440s

    [with full names]
    11138621 bytes
    Total 36248, written 36248 (delta 30368), reused 27918 (delta 22512)
    real    1m36.254s       user    1m28.650s       sys     0m5.470s

    [with full names, and allowing deeper delta]
    9971223 bytes
    Total 36248, written 36248 (delta 30868), reused 27429 (delta 22512)
    real    1m36.923s       user    1m29.770s       sys     0m5.470s
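
For a quick sense of scale, the pack sizes above can be turned
into relative savings against the unpatched run.  The snippet
below is just arithmetic on the three byte counts reported here;
the variable and function names are made up for the example.

```python
baseline = 15463034  # bytes, without either patch

sizes = {
    "full names": 11138621,
    "full names, deeper delta": 9971223,
}


def reduction_pct(before: int, after: int) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


for label, size in sizes.items():
    print(f"{label}: {reduction_pct(baseline, size):.1f}% smaller")
```

This works out to roughly 28% and 35.5% smaller packs, with wall
clock time staying within about two seconds of the unpatched run.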

All of these tests were done with the last patch in Nico's delta
enhancement series reverted, because the dataset used in this
test triggers a corner case performance disaster in it (I've
sent a message separately).
