All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: pclouds@gmail.com
Cc: Ben.Peart@microsoft.com, git@vger.kernel.org, gitster@pobox.com,
	peartben@gmail.com, peff@peff.net
Subject: [PATCH v3 0/4] Speed up unpack_trees()
Date: Sat,  4 Aug 2018 07:37:19 +0200	[thread overview]
Message-ID: <20180804053723.4695-1-pclouds@gmail.com> (raw)
In-Reply-To: <20180729103306.16403-1-pclouds@gmail.com>

This is a minor update to address Ben's comments and add his
measurements in the commit message of 2/4 for the record.

I've also checked about the lookahead thing in unpack_trees() to see
if we accidentally break something there, which is my biggest worry.
See [1] and [2] for context, but I believe since we can't have D/F
conflicts, the situation where lookahead is needed will not occur. So
we should be safe.

[1] da165f470e (unpack-trees.c: prepare for looking ahead in the index - 2010-01-07)
[2] 730f72840c (unpack-trees.c: look ahead in the index - 2009-09-20)

range-diff:

1:  789f7e2872 ! 1:  05eb762d2d unpack-trees.c: add performance tracing
    @@ -1,6 +1,6 @@
     Author: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
     
    -    unpack-trees.c: add performance tracing
    +    unpack-trees: add performance tracing
     
         We're going to optimize unpack_trees() a bit in the following
         patches. Let's add some tracing to measure how long it takes before
2:  589bed1366 ! 2:  02286ad123 unpack-trees: optimize walking same trees with cache-tree
    @@ -32,6 +32,24 @@
             0.111793866   0.032933140 s: diff-index
             0.587933288   0.398924370 s: git command: /home/pclouds/w/git/git
     
    +    Another measurement from Ben's running "git checkout" with over 500k
    +    trees (on the whole series):
    +
    +        baseline        new
    +      ----------------------------------------------------------------------
    +        0.535510167     0.556558733     s: read cache .git/index
    +        0.3057373       0.3147105       s: initialize name hash
    +        0.0184082       0.023558433     s: preload index
    +        0.086910967     0.089085967     s: refresh index
    +        7.889590767     2.191554433     s: unpack trees
    +        0.120760833     0.131941267     s: update worktree after a merge
    +        2.2583504       2.572663167     s: repair cache-tree
    +        0.8916137       0.959495233     s: write index, changed mask = 28
    +        3.405199233     0.2710663       s: unpack trees
    +        0.000999667     0.0021554       s: update worktree after a merge
    +        3.4063306       0.273318333     s: diff-index
    +        16.9524923      9.462943133     s: git command: git.exe checkout
    +
         This command calls unpack_trees() twice, the first time on 2way merge
         and the second 1way merge. In both times, "unpack trees" time is
         reduced to one third. Overall time reduction is not that impressive of
    @@ -39,7 +57,6 @@
         repair cache-tree line.
     
         Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
     diff --git a/unpack-trees.c b/unpack-trees.c
     --- a/unpack-trees.c
    @@ -170,7 +187,7 @@
      }
      
     +/*
    -+ * Note that traverse_by_cache_tree() duplicates some logic in this funciton
    ++ * Note that traverse_by_cache_tree() duplicates some logic in this function
     + * without actually calling it. If you change the logic here you may need to
     + * check and change there as well.
     + */
    @@ -189,12 +206,3 @@
      static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, struct name_entry *names, struct traverse_info *info)
      {
      	struct cache_entry *src[MAX_UNPACK_TREES + 1] = { NULL, };
    -@@
    - 	uint64_t start = getnanotime();
    - 
    - 	if (len > MAX_UNPACK_TREES)
    --		die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
    -+		die(_("unpack_trees takes at most %d trees"), MAX_UNPACK_TREES);
    - 
    - 	memset(&el, 0, sizeof(el));
    - 	if (!core_apply_sparse_checkout || !o->update)
3:  7c6f863fc0 = 3:  c87b82ffee unpack-trees: reduce malloc in cache-tree walk
4:  6ca17b1138 ! 4:  e791cdfc82 unpack-trees: cheaper index update when walking by cache-tree
    @@ -40,7 +40,6 @@
         attempt should be on that "repair cache-tree" line.
     
         Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
     diff --git a/cache.h b/cache.h
     --- a/cache.h
    @@ -119,20 +118,6 @@
      	free(tree_ce);
      	if (o->debug_unpack)
      		printf("Unpacked %d entries from %s to %s using cache-tree\n",
    -@@
    - 		if (!ret) {
    - 			if (!o->result.cache_tree)
    - 				o->result.cache_tree = cache_tree();
    -+			/*
    -+			 * TODO: Walk o.src_index->cache_tree, quickly check
    -+			 * if o->result.cache has the exact same content for
    -+			 * any valid cache-tree in o.src_index, then we can
    -+			 * just copy the cache-tree over instead of hashing a
    -+			 * new tree object.
    -+			 */
    - 			if (!cache_tree_fully_valid(o->result.cache_tree))
    - 				cache_tree_update(&o->result,
    - 						  WRITE_TREE_SILENT |
     
     diff --git a/unpack-trees.h b/unpack-trees.h
     --- a/unpack-trees.h

Nguyễn Thái Ngọc Duy (4):
  unpack-trees: add performance tracing
  unpack-trees: optimize walking same trees with cache-tree
  unpack-trees: reduce malloc in cache-tree walk
  unpack-trees: cheaper index update when walking by cache-tree

 cache-tree.c   |   2 +
 cache.h        |   1 +
 read-cache.c   |   3 +-
 unpack-trees.c | 152 +++++++++++++++++++++++++++++++++++++++++++++++++
 unpack-trees.h |   1 +
 5 files changed, 158 insertions(+), 1 deletion(-)

-- 
2.18.0.656.gda699b98b3

  parent reply	other threads:[~2018-08-04  5:37 UTC|newest]

Thread overview: 121+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-18 20:45 [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc) Ben Peart
2018-07-18 20:45 ` [PATCH v1 1/3] add unbounded Multi-Producer-Multi-Consumer queue Ben Peart
2018-07-18 20:57   ` Stefan Beller
2018-07-19 19:11   ` Junio C Hamano
2018-07-18 20:45 ` [PATCH v1 2/3] add performance tracing around traverse_trees() in unpack_trees() Ben Peart
2018-07-18 20:45 ` [PATCH v1 3/3] Add initial parallel version of unpack_trees() Ben Peart
2018-07-18 22:56   ` Junio C Hamano
2018-07-18 21:02 ` [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc) Stefan Beller
2018-07-18 21:34 ` Jeff King
2018-07-23 15:48   ` Ben Peart
2018-07-23 17:03     ` Duy Nguyen
2018-07-23 20:51       ` Ben Peart
2018-07-24  4:20         ` Jeff King
2018-07-24 15:33           ` Duy Nguyen
2018-07-25 20:56             ` Ben Peart
2018-07-26  5:30               ` Duy Nguyen
2018-07-26 16:30                 ` Duy Nguyen
2018-07-26 19:40                   ` Junio C Hamano
2018-07-27 15:42                     ` Duy Nguyen
2018-07-27 16:22                       ` Ben Peart
2018-07-27 18:00                         ` Duy Nguyen
2018-07-27 17:14                       ` Junio C Hamano
2018-07-27 17:52                         ` Duy Nguyen
2018-07-29  6:24                           ` Duy Nguyen
2018-07-29 10:33                       ` [PATCH v2 0/4] Speed up unpack_trees() Nguyễn Thái Ngọc Duy
2018-07-29 10:33                         ` [PATCH v2 1/4] unpack-trees.c: add performance tracing Nguyễn Thái Ngọc Duy
2018-07-30 20:16                           ` Ben Peart
2018-07-29 10:33                         ` [PATCH v2 2/4] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-07-30 20:52                           ` Ben Peart
2018-07-29 10:33                         ` [PATCH v2 3/4] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-07-30 20:58                           ` Ben Peart
2018-07-29 10:33                         ` [PATCH v2 4/4] unpack-trees: cheaper index update when walking by cache-tree Nguyễn Thái Ngọc Duy
2018-08-08 18:46                           ` Elijah Newren
2018-08-10 16:39                             ` Duy Nguyen
2018-08-10 18:39                               ` Elijah Newren
2018-08-10 19:30                                 ` Duy Nguyen
2018-08-10 19:40                                   ` Elijah Newren
2018-08-10 19:48                                     ` Duy Nguyen
2018-07-30 18:10                         ` [PATCH v2 0/4] Speed up unpack_trees() Ben Peart
2018-07-31 15:31                           ` Duy Nguyen
2018-07-31 16:50                             ` Ben Peart
2018-07-31 17:31                               ` Ben Peart
2018-08-01 16:38                                 ` Duy Nguyen
2018-08-08 20:53                                   ` Ben Peart
2018-08-09  8:16                                     ` Ben Peart
2018-08-10 16:08                                       ` Duy Nguyen
2018-08-10 15:51                                     ` Duy Nguyen
2018-07-30 21:04                         ` Ben Peart
2018-08-04  5:37                         ` Nguyễn Thái Ngọc Duy [this message]
2018-08-04  5:37                           ` [PATCH v3 1/4] unpack-trees: add performance tracing Nguyễn Thái Ngọc Duy
2018-08-04  5:37                           ` [PATCH v3 2/4] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-08-08 18:23                             ` Elijah Newren
2018-08-10 16:29                               ` Duy Nguyen
2018-08-10 18:48                                 ` Elijah Newren
2018-08-04  5:37                           ` [PATCH v3 3/4] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-08-08 18:30                             ` Elijah Newren
2018-08-04  5:37                           ` [PATCH v3 4/4] unpack-trees: cheaper index update when walking by cache-tree Nguyễn Thái Ngọc Duy
2018-08-06 15:48                           ` [PATCH v3 0/4] Speed up unpack_trees() Junio C Hamano
2018-08-06 15:59                             ` Duy Nguyen
2018-08-06 18:59                               ` Junio C Hamano
2018-08-08 17:00                                 ` Ben Peart
2018-08-08 17:46                               ` Junio C Hamano
2018-08-08 18:12                                 ` Junio C Hamano
2018-08-08 18:39                                   ` Junio C Hamano
2018-08-10 16:53                                     ` Duy Nguyen
2018-08-12  8:15                           ` [PATCH v4 0/5] " Nguyễn Thái Ngọc Duy
2018-08-12  8:15                             ` [PATCH v4 1/5] trace.h: support nested performance tracing Nguyễn Thái Ngọc Duy
2018-08-13 18:39                               ` Ben Peart
2018-08-12  8:15                             ` [PATCH v4 2/5] unpack-trees: add " Nguyễn Thái Ngọc Duy
2018-08-12 10:05                               ` Thomas Adam
2018-08-13 18:50                                 ` Junio C Hamano
2018-08-13 18:44                               ` Ben Peart
2018-08-13 19:25                               ` Jeff King
2018-08-13 19:36                                 ` Stefan Beller
2018-08-13 20:11                                   ` Ben Peart
2018-08-13 19:52                                 ` Duy Nguyen
2018-08-13 21:47                                   ` Jeff King
2018-08-13 22:41                                 ` Junio C Hamano
2018-08-14 18:19                                   ` Jeff Hostetler
2018-08-14 18:32                                     ` Duy Nguyen
2018-08-14 18:44                                       ` Stefan Beller
2018-08-14 18:51                                         ` Duy Nguyen
2018-08-14 19:54                                           ` Jeff King
2018-08-14 20:52                                           ` Junio C Hamano
2018-08-15 16:32                                             ` Duy Nguyen
2018-08-15 18:28                                               ` Junio C Hamano
2018-08-14 20:14                                         ` Jeff Hostetler
2018-08-12  8:15                             ` [PATCH v4 3/5] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-08-13 18:58                               ` Ben Peart
2018-08-15 16:38                                 ` Duy Nguyen
2018-08-12  8:15                             ` [PATCH v4 4/5] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-08-12  8:15                             ` [PATCH v4 5/5] unpack-trees: reuse (still valid) cache-tree from src_index Nguyễn Thái Ngọc Duy
2018-08-13 15:48                               ` Elijah Newren
2018-08-13 15:57                                 ` Duy Nguyen
2018-08-13 16:05                                 ` Ben Peart
2018-08-13 16:25                                   ` Duy Nguyen
2018-08-13 17:15                                     ` Ben Peart
2018-08-13 19:01                             ` [PATCH v4 0/5] Speed up unpack_trees() Junio C Hamano
2018-08-14 19:19                             ` Ben Peart
2018-08-18 14:41                             ` [PATCH v5 0/7] " Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 1/7] trace.h: support nested performance tracing Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 2/7] unpack-trees: add " Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 3/7] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-08-20 12:43                                 ` Ben Peart
2018-08-18 14:41                               ` [PATCH v5 4/7] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 5/7] unpack-trees: reuse (still valid) cache-tree from src_index Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 6/7] unpack-trees: add missing cache invalidation Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 7/7] cache-tree: verify valid cache-tree in the test suite Nguyễn Thái Ngọc Duy
2018-08-18 21:45                                 ` Elijah Newren
2018-08-18 22:01                               ` [PATCH v5 0/7] Speed up unpack_trees() Elijah Newren
2018-08-19  5:09                                 ` Duy Nguyen
2018-08-25 12:18                               ` [PATCH] Document update for nd/unpack-trees-with-cache-tree Nguyễn Thái Ngọc Duy
2018-08-25 12:31                                 ` Martin Ågren
2018-08-25 13:02                                 ` [PATCH v2] " Nguyễn Thái Ngọc Duy
2018-07-27 15:50                     ` [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc) Ben Peart
2018-07-26 16:35               ` Duy Nguyen
2018-07-24  5:54         ` Junio C Hamano
2018-07-24 15:13         ` Duy Nguyen
2018-07-24 21:21           ` Jeff King
2018-07-25 16:09           ` Ben Peart
2018-07-24  4:27       ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180804053723.4695-1-pclouds@gmail.com \
    --to=pclouds@gmail.com \
    --cc=Ben.Peart@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peartben@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.