From: Shawn Pearce <spearce@spearce.org>
To: Linus Torvalds <torvalds@osdl.org>
Cc: git@vger.kernel.org
Subject: Re: 1.3.0 creating bigger packs than 1.2.3
Date: Thu, 20 Apr 2006 12:43:51 -0400
Message-ID: <20060420164351.GB31738@spearce.org>
In-Reply-To: <Pine.LNX.4.64.0604200857460.3701@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> wrote:
> 
> 
> On Thu, 20 Apr 2006, Shawn Pearce wrote:
> > 
> > So with 1.3.0.g56c1 "git repack -a -d -f" did worse:
> > 
> >   Total 46391, written 46391 (delta 6649), reused 39742 (delta 0)
> >   129M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
> > 
> > I just tried -f on v1.2.3 and it did slightly better than before:
> > 
> >   Total 46391, written 46391 (delta 6847), reused 38012 (delta 0)
> >    59M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack

Oddly enough repacking the v1.2.3 pack using 1.3.0.g56c1 created an
even smaller pack ("git-repack -a -d"):

  Total 46391, written 46391 (delta 8253), reused 44985 (delta 6847)
   49M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack

and repacking again with "git-repack -a -d" chopped another 1M:

  Total 46391, written 46391 (delta 8258), reused 46386 (delta 8253)
   48M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pac
  
but then adding -f definitely gives us the 2x explosion again:

  Total 46391, written 46391 (delta 6649), reused 37894 (delta 0)
  129M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
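
To put the runs above side by side, here is a minimal Python sketch that
parses the summary lines and pairs each delta count with the resulting
pack size. The helper names (parse_summary, report) are hypothetical and
not part of git; the numbers are copied from the output quoted in this
message. It shows that the two -f runs differ by about 3% in delta count
(6649 vs 6847) while the pack size more than doubles (129M vs 59M).

```python
import re

# Summary lines copied from the repack runs quoted in this message,
# each paired with the resulting pack size in megabytes.
RUNS = [
    ("1.3.0.g56c1 repack -a -d -f",
     "Total 46391, written 46391 (delta 6649), reused 39742 (delta 0)", 129),
    ("v1.2.3 repack -a -d -f",
     "Total 46391, written 46391 (delta 6847), reused 38012 (delta 0)", 59),
    ("1.3.0.g56c1 repack -a -d on the v1.2.3 pack",
     "Total 46391, written 46391 (delta 8253), reused 44985 (delta 6847)", 49),
    ("1.3.0.g56c1 repack -a -d again",
     "Total 46391, written 46391 (delta 8258), reused 46386 (delta 8253)", 48),
]

SUMMARY = re.compile(
    r"Total (\d+), written (\d+) \(delta (\d+)\), "
    r"reused (\d+) \(delta (\d+)\)"
)


def parse_summary(line):
    """Hypothetical helper: turn one summary line into a dict of ints."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not a repack summary line: {line!r}")
    keys = ("total", "written", "delta", "reused", "reused_delta")
    return dict(zip(keys, map(int, match.groups())))


def report(runs):
    """Return (label, delta count, pack size in MB, delta fraction) rows."""
    rows = []
    for label, line, size_mb in runs:
        stats = parse_summary(line)
        rows.append((label, stats["delta"], size_mb,
                     stats["delta"] / stats["written"]))
    return rows


if __name__ == "__main__":
    for label, delta, size_mb, fraction in report(RUNS):
        print(f"{label}: {delta} deltas ({fraction:.1%}), {size_mb}M")
```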

> Interesting. The bigger packs do generate fewer deltas, but they don't 
> seem to be _that_ much fewer. And the deltas themselves certainly 
> shouldn't be bigger.
> 
> It almost sounds like there's a problem with choosing what to delta 
> against, not necessarily a delta algorithm problem. Although that sounds a 
> bit strange, because I wouldn't have thought we actually changed the 
> packing algorithm noticeably since 1.2.3.
> 
> Hmm. Doing "gitk v1.2.3.. -- pack-objects.c" shows that I was wrong. Junio 
> did the "hash basename and direname a bit differently" thing, which would 
> appear to change the "find objects to delta against" a lot. That could be 
> it. 
> 
> You could try to revert that change:
> 
> 	git revert eeef7135fed9b8784627c4c96e125241c06c65e1
> 
> which needs a trivial manual fixup (remove the conflict entirely: 
> everything between the "<<<<" and ">>>>>" lines should go), and see if 
> that's it.

Whoa.  I did that revert and fixup on top of 'next'.  The pack
from "git-repack -a -d -f" is now even larger, with even fewer
deltas:

  Total 46391, written 46391 (delta 5148), reused 39565 (delta 0)
  171M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack

> You can also try to see if
> 
> 	git repack -a -d -f --window=50
> 
> makes for a better pack (at the cost of a much slower repack). It makes 
> git try more objects to delta against, and can thus hide a bad sort order.

With --window=50 on 'next' (without the revert):

  Total 46391, written 46391 (delta 6666), reused 39723 (delta 0)
  129M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack

For added measure I tried --window=100 and 500 with pretty much
the same result (slightly higher delta but still a 129M pack).
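
For anyone who wants to repeat the window sweep, a minimal Python sketch
follows. It only assembles the same "git repack -a -d -f --window=N"
invocation used above; repack_command and run_sweep are hypothetical
helper names, and repo_dir is an assumed parameter pointing at the
repository to repack. Run as a script it prints the commands without
executing them, since each real run is a slow full repack.

```python
import subprocess


def repack_command(window):
    """Build the full-repack command from this thread for one window size."""
    return ["git", "repack", "-a", "-d", "-f", f"--window={window}"]


def run_sweep(windows, repo_dir="."):
    """Run one repack per window size inside repo_dir.

    Returns (window, returncode) pairs. Every run recomputes all deltas
    because of -f, so expect this to take a long time on a large repository.
    """
    results = []
    for window in windows:
        proc = subprocess.run(repack_command(window), cwd=repo_dir)
        results.append((window, proc.returncode))
    return results


if __name__ == "__main__":
    # Dry run: print the commands for the window sizes tried above.
    for window in (50, 100, 500):
        print(" ".join(repack_command(window)))
```

Call run_sweep((50, 100, 500), repo_dir="/path/to/repo") to actually
execute the repacks, then check the pack size after each one.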

-- 
Shawn.
