git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Preferring shallower deltas on repack
@ 2007-07-09  4:43 Brian Downing
  2007-07-09  4:45 ` [PATCH] pack-objects: Prefer shallower deltas if the size is equal Brian Downing
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Brian Downing @ 2007-07-09  4:43 UTC (permalink / raw)
  To: git

SBCL, a Lisp implementation, has a file in their (CVS) repository called
"version.lisp-expr" which is updated every commit.  This file looks like:

------------------------------------------------------------------------
;;; This is the master value for LISP-IMPLEMENTATION-VERSION. It's
;;; separated into its own file here so that it's easy for
;;; text-munging make-ish or cvs-ish scripts to find and tweak it. For
;;; the convenience of such scripts, only a simple subset of Lisp
;;; reader syntax should be used here: semicolon-delimited comments,
;;; possible blank lines or other whitespace, and a single
;;; double-quoted string value alone on its own line.
;;;
;;; ANSI says LISP-IMPLEMENTATION-VERSION can be NIL "if no
;;; appropriate and relevant result can be produced", but as long as
;;; we control the build, we can always assign an appropriate and
;;; relevant result, so this must be a string, not NIL.
;;;
;;; Conventionally a string like "0.6.6", with three numeric fields,
;;; is used for released versions, and a string like "0.6.5.xyzzy",
;;; with something arbitrary in the fourth field, is used for CVS
;;; checkins which aren't released. (And occasionally for internal
;;; versions, especially for internal versions off the main CVS
;;; branch, it gets hairier, e.g. "0.pre7.14.flaky4.13".)
"1.0.7.10"
------------------------------------------------------------------------

Only very rarely does anything but the last line change.

The current repack implementation, when given values like "--window=100
--depth=1000", happily generates an incredibly deep tree for this file,
maxing out the depth.  (I'm only using such a high depth to demonstrate
the pessimal behavior.)  I noticed that it just takes the first delta
it finds that is the smallest; it then only looks for deltas with a
max_size of the old size - 1, so it can never find better matches with
regard to depth.

I modified this to prefer shallower deltas of the same size.  This made
the deltas for this file a very wide tree with a maximum depth of about
65.  Other (much smaller) improvements were seen elsewhere in the pack.
Runtime does not seem to have been affected, as most of the work had
already been done when it was tossing deltas before.

Some simple statistics:

SBCL, standard pack-objects, window 100, depth 1000:
  Max depth: 980
  Mean depth: 100.223622114502
  Median depth: 12
  Standard deviation: 188.214331919176

SBCL, patched pack-objects, window 100, depth 1000:
  Max depth: 787
  Mean depth: 61.5669990817656
  Median depth: 11
  Standard deviation: 127.644652607399

git, standard pack-objects, window 100, depth 1000:
  Max depth: 925
  Mean depth: 77.184264479754
  Median depth: 8
  Standard deviation: 150.112998198182

git, patched pack-objects, window 100, depth 1000:
  Max depth: 913
  Mean depth: 74.9981877425496
  Median depth: 7
  Standard deviation: 147.900721785959

The only negative effect I could see from this patch might be pack
locality.  Unfortunately I don't know enough (read: anything) about pack
access patterns to determine if this is the case.

Patch to follow.

-bcd

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2007-07-12  3:10 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-07-09  4:43 Preferring shallower deltas on repack Brian Downing
2007-07-09  4:45 ` [PATCH] pack-objects: Prefer shallower deltas if the size is equal Brian Downing
2007-07-09  5:31 ` Preferring shallower deltas on repack Junio C Hamano
2007-07-09  5:43   ` Junio C Hamano
2007-07-09  6:52   ` Brian Downing
2007-07-09  7:27     ` Junio C Hamano
2007-07-09  7:36       ` Brian Downing
2007-07-09 15:58   ` Nicolas Pitre
2007-07-09 16:39     ` Junio C Hamano
2007-07-09 18:53       ` Brian Downing
2007-07-09 19:13         ` Nicolas Pitre
2007-07-09 19:24           ` Brian Downing
2007-07-09 19:49             ` Brian Downing
2007-07-09 20:22               ` Nicolas Pitre
2007-07-09 20:23               ` Brian Downing
2007-07-09 19:30         ` [PATCH] Shoddy pack information tool Brian Downing
2007-07-11 21:55           ` Junio C Hamano
2007-07-12  3:02             ` [PATCH] Pack " Brian Downing
2007-07-09  5:41 ` Preferring shallower deltas on repack Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).