From: bdowning@lavos.net (Brian Downing)
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Jakub Narebski <jnareb@gmail.com>,
	Brandon Casey <casey@nrlssc.navy.mil>,
	Nicolas Pitre <nico@cam.org>, Jan Holesovsky <kendy@suse.cz>,
	git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH] RFC: git lazy clone proof-of-concept
Date: Thu, 14 Feb 2008 17:57:47 -0600
Message-ID: <20080214235747.GV27535@lavos.net>
In-Reply-To: <20080214235129.GU27535@lavos.net>

On Thu, Feb 14, 2008 at 05:51:29PM -0600, Brian Downing wrote:
> Do you by chance have repack.usedeltabaseoffset turned on?  That has the
> unfortunate side effect of changing the output of verify-pack -v to be
> almost useless for my packinfo script (specifically, it no longer
> reports the parent SHA1 hash for deltas, and the script is basically all
> about delta tree statistics.)  I suppose that should probably be fixed,
> but I never looked into it.

That being said, in my experience the most useful output for figuring
out where all the space in the pack is going comes from:

git-verify-pack -v | packinfo.pl -tree -filenames

That will produce a huge amount of output, which is basically the tree
structure of the delta chains in the file.  If things aren't being
deltified together properly, it's usually pretty obvious.

A delta chain in this output looks approximately like this:

#   0   blob 03156f21...     1767     1767 Documentation/git-lost-found.txt @ tags/v1.2.0~142
#   1    blob f52a9d7f...       10     1777 Documentation/git-lost-found.txt @ tags/v1.5.0-rc1~74
#   2     blob a8cc5739...       51     1828 Documentation/git-lost+found.txt @ tags/v0.99.9h^0
#   3      blob 660e90b1...       15     1843 Documentation/git-lost+found.txt @ master~3222^2~2
#   4       blob 0cb8e3bb...       33     1876 Documentation/git-lost+found.txt @ master~3222^2~3
#   2     blob e48607f0...      311     2088 Documentation/git-lost-found.txt @ tags/v1.5.2-rc3~4
#      size: count 6 total 2187 min 10 max 1767 mean 364.50 median 51 std_dev 635.85
# path size: count 6 total 11179 min 1767 max 2088 mean 1863.17 median 1843 std_dev 107.26

The first number after the sha1 is the object size; the second
number is the path size.  The statistics are computed across all
objects in the preceding delta tree.  Obviously they are omitted for
trees of one object.
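
As a rough sketch (Python, not taken from packinfo.pl), the "size:" summary
line for the sample tree above can be reproduced as follows.  The median
convention (upper-middle element for an even count) and the population
standard deviation are assumptions inferred from the sample numbers, and
tree_stats is a hypothetical name:

```python
import math

def tree_stats(values):
    """Summary statistics over the sizes in one delta tree."""
    ordered = sorted(values)
    count = len(ordered)
    total = sum(ordered)
    mean = total / count
    # Assumption: for an even count, take the upper-middle element,
    # which matches "median 51" in the sample output.
    median = ordered[count // 2]
    # Assumption: population (divide by N) standard deviation,
    # which matches "std_dev 635.85" in the sample output.
    std_dev = math.sqrt(sum((v - mean) ** 2 for v in ordered) / count)
    return {
        "count": count, "total": total,
        "min": ordered[0], "max": ordered[-1],
        "mean": mean, "median": median, "std_dev": std_dev,
    }

# Object sizes copied from the sample delta tree.
sizes = [1767, 10, 51, 15, 33, 311]
stats = tree_stats(sizes)
print("size: count %(count)d total %(total)d min %(min)d max %(max)d "
      "mean %(mean).2f median %(median)d std_dev %(std_dev).2f" % stats)
# prints: size: count 6 total 2187 min 10 max 1767 mean 364.50 median 51 std_dev 635.85
```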

A path size is the sum of the sizes of the objects in the delta chain,
including the base object.  In other words, it's how many bytes need
to be read to reassemble the file from deltas.
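
To make the arithmetic concrete, here is a minimal sketch (Python, not the
script's own code) that derives the path sizes from the depth and object
size columns of the sample tree.  It assumes rows arrive in the order the
tree is printed, so each object's base is the most recent row one level
shallower; path_sizes is a hypothetical helper name:

```python
def path_sizes(rows):
    """rows: (depth, size) pairs in the order the tree listing prints them."""
    stack = []    # stack[d] holds the path size of the latest object at depth d
    result = []
    for depth, size in rows:
        del stack[depth:]                  # forget siblings and deeper objects
        base = stack[-1] if stack else 0   # path size of the delta base, if any
        stack.append(base + size)
        result.append(base + size)
    return result

# (depth, object size) copied from the sample delta tree.
rows = [(0, 1767), (1, 10), (2, 51), (3, 15), (4, 33), (2, 311)]
paths = path_sizes(rows)
print(paths)       # [1767, 1777, 1828, 1843, 1876, 2088]
print(sum(paths))  # 11179, the "path size: ... total" figure
```

Note how the last object (depth 2, size 311) builds on the depth-1 object,
giving 1777 + 311 = 2088 rather than continuing the deeper chain.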

This is also quite slow, as it runs git-ls-tree -t -r on every commit in
the repository to assign file names to blobs.  If you don't care about
seeing filenames, leave out the -filenames option to skip that step.
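
For illustration only, the name assignment could look something like the
sketch below (Python; the real script may choose names differently).  It
parses text in the documented ls-tree line format, "<mode> <type> <object>
TAB <path>", and keeps the first name seen for each object.  The function
name, the commit labels and the object names in the sample are all made up:

```python
def record_names(ls_tree_text, commit, names):
    """Remember the first "path @ commit" label seen for each object."""
    for line in ls_tree_text.splitlines():
        if not line:
            continue
        meta, _, path = line.partition("\t")
        _mode, _objtype, sha = meta.split()
        # Assumption: first name wins; later commits do not overwrite it.
        names.setdefault(sha, "%s @ %s" % (path, commit))
    return names

# Hypothetical object names, for illustration only.
TREE = "1" * 40
BLOB = "2" * 40
sample = (
    "040000 tree %s\tDocumentation\n" % TREE
    + "100644 blob %s\tDocumentation/git-lost-found.txt\n" % BLOB
)

names = {}
record_names(sample, "commit-A", names)
record_names(sample, "commit-B", names)   # same blob again: label unchanged
print(names[BLOB])
```

Doing this for every commit means one ls-tree listing per commit, which is
where the time goes.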

-bcd
