From: Duy Nguyen <pclouds@gmail.com>
To: Nicolas Pitre <nico@fluxnic.net>
Cc: git@vger.kernel.org
Subject: Re: [BAD PATCH 0/9] v4-aware tree walker API
Date: Fri, 11 Oct 2013 19:22:59 +0700
Message-ID: <20131011122259.GA7776@lanh>
In-Reply-To: <alpine.LFD.2.03.1310091137310.3023@syhkavp.arg>

On Wed, Oct 09, 2013 at 12:51:26PM -0400, Nicolas Pitre wrote:
> Now let's mitigate the deep delta chaining effect in the tree encoding:
> 
> $ rm .git/objects/pack/pack-foo.*
> $ ../../git/test-packv4 --min-tree-copy=50 orig/pack-*.pack .git/objects/pack/pack-foo.pack
> Scanning objects: 100% (162785/162785), done.
> Writing objects: 100% (162785/162785), done.
> $ time git rev-list --objects --all > /dev/null
> 
> real    0m9.451s
> user    0m9.393s
> sys     0m0.036s
> 
> Using --min-tree-copy=50 produces a pack which is still smaller than 
> pack v2 but any tree copy sequence must refer to a minimum of 50 
> entries.  This significantly reduces the CPU usage in decode_entries() 
> by reducing the needless chaining effect I mentioned here:
> 
> http://article.gmane.org/gmane.comp.version-control.git/234975

Yeah, I was frustrated and did not think of trying --min-tree-copy.

> So, there are 2 conclusions here:
> 
> 1: The current tree delta heuristic is inefficient for pack v4.
> 
> 2: Something must be very wrong in your latest patches as they make it
>    close to 3 times more expensive than without them.

For one, I know that get_tree_offset_cache() is called a lot more often
with the new API. I ran rev-list on v1.8.4: with the compatibility
layer on it makes 211m calls (*), with the new API 655m calls, roughly
three times as many. The new API basically runs decode_entries() on
every tree_entry() call. Perhaps I screwed something up in
decode_entries() itself.

(*) For 15m tree entries, 211m calls is a lot (about 14 per entry),
which might translate to a lot of copy sequences.

> > Maybe we could make an exception and allow the tree walker to pass
> > pv4_tree_cache* directly to decode_entries so it does not need to do
> > the first lookup every time..
> > 
> > Suggestions?
> 
> I'll try to have a look at your patches in more details soon.
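
To make the suggestion quoted above concrete, here is a toy sketch of
the difference between looking the cache slot up on every entry and
looking it up once, then handing the pointer to the decoder. Everything
named toy_* is made up for illustration; it is not the real
pv4_tree_cache, get_tree_offset_cache() or decode_entries(), only a
model of the calling pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-tree cache slot; not the real pv4_tree_cache. */
struct toy_tree_cache {
	unsigned long offset;		/* pack offset of the tree this slot describes */
	unsigned int nb_entries;	/* highest entry count seen so far */
};

#define TOY_CACHE_SIZE 64
static struct toy_tree_cache toy_cache[TOY_CACHE_SIZE];
static unsigned long toy_lookups;	/* counts how often the cache is searched */

/* Stand-in for the offset -> cache slot lookup (direct-mapped here). */
static struct toy_tree_cache *toy_lookup(unsigned long offset)
{
	struct toy_tree_cache *c = &toy_cache[offset % TOY_CACHE_SIZE];

	toy_lookups++;
	if (c->offset != offset) {	/* miss: claim the slot for this tree */
		c->offset = offset;
		c->nb_entries = 0;
	}
	return c;
}

/* Decode one entry when the caller already holds the cache slot. */
static unsigned int toy_decode_cached(struct toy_tree_cache *c, unsigned int idx)
{
	assert(c != NULL);
	if (idx >= c->nb_entries)
		c->nb_entries = idx + 1;
	return idx;
}

/* Decode one entry by offset: every call repeats the lookup. */
static unsigned int toy_decode_by_offset(unsigned long offset, unsigned int idx)
{
	return toy_decode_cached(toy_lookup(offset), idx);
}

/* Walk n entries, looking the slot up on each step. */
static void toy_walk_by_offset(unsigned long offset, unsigned int n)
{
	for (unsigned int i = 0; i < n; i++)
		toy_decode_by_offset(offset, i);
}

/* Walk n entries, looking the slot up once and reusing the pointer. */
static void toy_walk_cached(unsigned long offset, unsigned int n)
{
	struct toy_tree_cache *c = toy_lookup(offset);

	for (unsigned int i = 0; i < n; i++)
		toy_decode_cached(c, i);
}
```

In this model a walk over n entries costs n lookups in the first form
and one in the second. Whether the real lookup is a significant part of
the slowdown is a separate question that only profiling can answer.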

Shameful fixup (though it does not seem to impact the timing):

diff --git a/packv4-parse.c b/packv4-parse.c
index 6f6152c..9d7589e 100644
--- a/packv4-parse.c
+++ b/packv4-parse.c
@@ -825,8 +825,8 @@ static struct object **get_packed_objs(struct pv4_tree_desc *desc)
 	if (!desc->p || !desc->sha1_index)
 		return NULL;
 	if (desc->p->version >= 4 && !desc->p->objs)
-		desc->p->objs =
-			xmalloc(sizeof(struct object *) * desc->p->num_objects);
+		desc->p->objs = xcalloc(desc->p->num_objects,
+					sizeof(struct object *));
 	return desc->p->objs;
 }
 
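Why the fixup above matters, as a minimal standalone sketch: a table of
pointers that is filled in lazily can only be tested with "is this slot
still NULL?" if it starts out zeroed, which calloc() gives and malloc()
does not. The types and helper names below are hypothetical stand-ins,
not git's, and plain calloc() is used in place of xcalloc(). That the
objs table is used as such a lazily filled cache is my assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not git's real types. */
struct object {
	int id;
};

struct pack {
	size_t num_objects;
	struct object **objs;	/* lazily allocated, one slot per object */
};

/* Allocate the table on first use; calloc() zeroes every slot. */
static struct object **get_objs(struct pack *p)
{
	if (!p->objs) {
		p->objs = calloc(p->num_objects, sizeof(*p->objs));
		assert(p->objs);
	}
	return p->objs;
}

/* Return the cached object for slot i, creating it on the first request. */
static struct object *lookup_cached(struct pack *p, size_t i)
{
	struct object **objs = get_objs(p);

	assert(i < p->num_objects);
	if (!objs[i]) {	/* only meaningful because slots start out NULL */
		objs[i] = malloc(sizeof(**objs));
		assert(objs[i]);
		objs[i]->id = (int)i;
	}
	return objs[i];
}
```

With malloc() in get_objs() the "!objs[i]" test would read
uninitialized memory, so a slot could look populated when it is not.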
