From: "Shawn O. Pearce" <spearce@spearce.org>
To: Brian Downing <bdowning@lavos.net>
Cc: git@vger.kernel.org
Subject: Re: [BUG] fast-import producing very deep tree deltas
Date: Tue, 13 Nov 2007 03:53:07 -0500
Message-ID: <20071113085307.GC14735@spearce.org>
In-Reply-To: <20071112110354.GP6212@lavos.net>

Brian Downing <bdowning@lavos.net> wrote:
> I've happened upon a case where fast-import produces deep tree deltas.
> How deep?  Really deep.  6035 entries deep to be precise for this case:
> 
>     depths: count 135970 total 120567366 min 0 max 6035 mean 886.72 median 3 std_dev 1653.48
> 
>     27b8a20bdf39fecd917e8401d3499013e49449d0 tree   32 99609547 6035 0000000000000000000000000000000000000000
> 
> This was with git-fast-import from 'next' as of a couple days ago,
> run with the default options (no --depth passed in).
> 
> Needless to say the pack that resulted was just about useless.  Trying to
> repack it resulted in the "counting objects" phase running at about five
> objects per second.

Heh.

I think what's happening here is your active branch cache isn't
big enough.  We're swapping out the branch and thus recycling the
tree information (struct tree_content) back into the free pool.
When we later reload the tree we set the delta_depth to 0, but we
keep the tree we just reloaded as a delta base.

So if the tree we reloaded was already at the maximum depth, we don't
know it and still make the new tree a delta against it.  Multiply the
number of times the branch cache has to swap out the tree by max_depth
(10) and you get the maximum delta depth of a tree created by
fast-import.  Given your data above of 6035, I'm guessing your active
branch cache had to swap the branch out about 603 or 604 times during
this import.
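
To make that arithmetic concrete, here is a rough toy model in Python.
It is not fast-import's code: the function name, the swap_every knob and
the worst-case timing (a swap exactly when the limit is hit) are
assumptions made for illustration.  Only max_depth = 10 and the
"depth reset on reload" behaviour come from the description above.

```python
MAX_DEPTH = 10  # default delta depth limit quoted in the message


def simulate(versions, swap_every):
    """Return the deepest real delta chain after writing `versions` trees.

    swap_every is a hypothetical knob: the branch is swapped out of the
    cache (and later reloaded) after every `swap_every` tree writes.
    """
    real_depth = 0     # actual chain length in the pack
    tracked_depth = 0  # what the in-memory tree believes (delta_depth)
    deepest = 0
    for i in range(1, versions + 1):
        if tracked_depth < MAX_DEPTH:
            # Stored as a delta against the previous version of the tree.
            real_depth += 1
            tracked_depth += 1
        else:
            # Limit reached: store a full tree and start a new chain.
            real_depth = 0
            tracked_depth = 0
        deepest = max(deepest, real_depth)
        if i % swap_every == 0:
            # Swap-out then reload: the depth is forgotten, the base kept.
            tracked_depth = 0
    return deepest


print(simulate(6035, 10))      # swapped every 10 writes: 6035
print(simulate(6035, 10**9))   # never swapped: 10
```

With a swap every 10 writes the model performs 603 swaps over 6035
writes and never stores a full tree, which is in line with the estimate
above.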

I think the fix is going to involve caching the depth within struct
object_entry so we can restore it when the tree is reloaded.
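
A minimal sketch of that idea, again as a Python model rather than the
actual patch.  ObjectEntry and TreeContent are stand-ins for struct
object_entry and struct tree_content; the depth field, the helper names
and the swap schedule are hypothetical.

```python
from dataclasses import dataclass

MAX_DEPTH = 10  # default delta depth limit quoted in the message


@dataclass
class ObjectEntry:
    """Stand-in for struct object_entry; `depth` is the proposed field."""
    depth: int = 0


@dataclass
class TreeContent:
    """Stand-in for struct tree_content held in the branch cache."""
    delta_depth: int = 0


def store_tree(tree, base_entry):
    """Write a new version of the tree and record its depth in the entry."""
    if base_entry is not None and tree.delta_depth < MAX_DEPTH:
        tree.delta_depth += 1   # stored as a delta against base_entry
    else:
        tree.delta_depth = 0    # stored in full, new chain starts here
    return ObjectEntry(depth=tree.delta_depth)


def reload_tree(entry):
    """Reload after a swap-out, restoring the depth instead of using 0."""
    return TreeContent(delta_depth=entry.depth)


def deepest_chain(versions, swap_every):
    tree, entry, deepest = TreeContent(), None, 0
    for i in range(1, versions + 1):
        entry = store_tree(tree, entry)
        deepest = max(deepest, entry.depth)
        if i % swap_every == 0:
            tree = reload_tree(entry)
    return deepest


print(deepest_chain(6035, 10))  # 10
```

Because the depth travels with the stored object, a reload no longer
resets the counter, and in this model the chain stays at or below
max_depth no matter how often the branch is swapped out.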

-- 
Shawn.
