From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, newren@gmail.com,
"Derrick Stolee" <stolee@gmail.com>,
"René Scharfe" <l.s.r@web.de>,
"Derrick Stolee" <derrickstolee@github.com>
Subject: [PATCH v2 0/9] Cleanups around index operations
Date: Mon, 04 Jan 2021 03:09:09 +0000
Message-ID: <pull.829.v2.git.1609729758.gitgitgadget@gmail.com>
In-Reply-To: <pull.829.git.1609356413.gitgitgadget@gmail.com>
I've taken a professional interest in the index lately, and I've been trying
mostly to learn about it and measure different operations. Along the way,
I've seen some possible improvements in documentation, behavior, and
tracing.
This series collects most of my thoughts so far, including:
1. Add extra trace2 regions and statistics (similar to [1]) (Patches
1-5).
2. Update documentation about the cached tree extension (Patches 6-7).
3. Improve the performance of verify_cache() (Patches 8-9).
Thanks, -Stolee
[1]
https://lore.kernel.org/git/pull.828.git.1609302714183.gitgitgadget@gmail.com/
UPDATES IN V2
=============
* Instead of completely dropping the second loop in verify_cache(), improve
the performance. I include René's patch (unaltered except for my
sign-off) in this series for clarity.
* Fixed the unnecessary whitespace change in patch 1. Updated the commit
message to refer to a similar effort in changed-path Bloom filters.
* The range enter/leave block in patch 5 now spans the entire method.
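
To illustrate the kind of check the second loop in verify_cache() performs,
here is a minimal, self-contained sketch. It is not the code from patches 8-9:
"struct entry" and find_df_conflict() are hypothetical stand-ins for git's
cache entries, and the sketch assumes the entries are already sorted the way
the index sorts paths. It uses a stored name length rather than calling
strlen(), and it runs the cheap tests (length, then a single byte) before the
full prefix comparison.

```c
#include <string.h>

/* Hypothetical stand-in for a cache entry: a path and its stored length. */
struct entry {
	const char *name;
	size_t namelen;
};

/*
 * Return the index of the first entry that is immediately followed by
 * "<name>/..." (a file/directory conflict), or -1 if there is none.
 * Assumes the entries are sorted, so a conflicting path directly
 * follows the path it conflicts with.
 */
static int find_df_conflict(const struct entry *e, size_t nr)
{
	size_t i;

	for (i = 0; i + 1 < nr; i++) {
		const struct entry *cur = &e[i];
		const struct entry *next = &e[i + 1];

		/* Cheap checks first; strncmp() runs only when both pass. */
		if (cur->namelen < next->namelen &&
		    next->name[cur->namelen] == '/' &&
		    !strncmp(cur->name, next->name, cur->namelen))
			return (int)i;
	}
	return -1;
}
```

With the entries "a", "dir", "dir/file" the function reports index 1, because
"dir" is followed by "dir/file". With "a", "a.c", "b/c" it reports -1.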
Derrick Stolee (8):
tree-walk: report recursion counts
unpack-trees: add trace2 regions
cache-tree: use trace2 in cache_tree_update()
cache-tree: trace regions for I/O
cache-tree: trace regions for prime_cache_tree
index-format: update preamble to cached tree extension
index-format: discuss recursion of cached-tree better
cache-tree: speed up consecutive path comparisons
René Scharfe (1):
cache-tree: use ce_namelen() instead of strlen()
Documentation/technical/index-format.txt | 39 +++++++++++++++++++-----
cache-tree.c | 30 +++++++++++++-----
tree-walk.c | 33 ++++++++++++++++++++
unpack-trees.c | 5 +++
4 files changed, 93 insertions(+), 14 deletions(-)
base-commit: 71ca53e8125e36efbda17293c50027d31681a41f
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-829%2Fderrickstolee%2Fcache-tree%2Fbasics-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-829/derrickstolee/cache-tree/basics-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/829
Range-diff vs v1:
1: f727880add6 ! 1: 0e500c86f39 tree-walk: report recursion counts
@@ Metadata
## Commit message ##
tree-walk: report recursion counts
- The traverse_trees() method recusively walks through trees, but also
+ The traverse_trees() method recursively walks through trees, but also
prunes the tree-walk based on a callback. Some callers, such as
unpack_trees(), are quite complicated and can have wildly different
performance between two different commands.
@@ Commit message
instances of traverse_trees(), but they provide reproducible values for
demonstrating improvements to the pruning algorithm when possible.
+ This change is modeled after a similar statistics reporting in 42e50e78
+ (revision.c: add trace2 stats around Bloom filter usage, 2020-04-06).
+
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
## tree-walk.c ##
@@ tree-walk.c: int traverse_trees(struct index_state *istate,
return error;
}
-
- ## unpack-trees.c ##
-@@ unpack-trees.c: static void populate_from_existing_patterns(struct unpack_trees_options *o,
- free(sparse);
- }
-
--
- static int verify_absent(const struct cache_entry *,
- enum unpack_trees_error_types,
- struct unpack_trees_options *);
2: 6923e6211aa = 2: 4157b91acf8 unpack-trees: add trace2 regions
3: 802718084a7 = 3: 8959d57abdd cache-tree: use trace2 in cache_tree_update()
4: 65feaa497b2 = 4: 1d8a797ee26 cache-tree: trace regions for I/O
5: 5d1c9c8a356 ! 5: 2b2e70bb77c cache-tree: trace regions for prime_cache_tree
@@ Commit message
## cache-tree.c ##
@@ cache-tree.c: void prime_cache_tree(struct repository *r,
+ struct index_state *istate,
+ struct tree *tree)
{
++ trace2_region_enter("cache-tree", "prime_cache_tree", the_repository);
cache_tree_free(&istate->cache_tree);
istate->cache_tree = cache_tree();
+
-+ trace2_region_enter("cache-tree", "prime_cache_tree", the_repository);
prime_cache_tree_rec(r, istate->cache_tree, tree);
-+ trace2_region_leave("cache-tree", "prime_cache_tree", the_repository);
istate->cache_changed |= CACHE_TREE_CHANGED;
++ trace2_region_leave("cache-tree", "prime_cache_tree", the_repository);
}
+ /*
6: fb9d5468184 = 6: 75b51483d3c index-format: update preamble to cached tree extension
7: 65fb9f72251 = 7: b2bb141a254 index-format: discuss recursion of cached-tree better
8: 20ea7050324 < -: ----------- cache-tree: avoid path comparison loop when silent
-: ----------- > 8: 5298694786e cache-tree: use ce_namelen() instead of strlen()
-: ----------- > 9: 72edd7bb427 cache-tree: speed up consecutive path comparisons
--
gitgitgadget
Thread overview: 53+ messages
2020-12-30 19:26 [PATCH 0/8] Cleanups around index operations Derrick Stolee via GitGitGadget
2020-12-30 19:26 ` [PATCH 1/8] tree-walk: report recursion counts Derrick Stolee via GitGitGadget
2020-12-30 19:42 ` Elijah Newren
2020-12-30 19:51 ` Derrick Stolee
2020-12-30 19:26 ` [PATCH 2/8] unpack-trees: add trace2 regions Derrick Stolee via GitGitGadget
2020-12-30 19:45 ` Elijah Newren
2020-12-30 19:26 ` [PATCH 3/8] cache-tree: use trace2 in cache_tree_update() Derrick Stolee via GitGitGadget
2020-12-30 19:26 ` [PATCH 4/8] cache-tree: trace regions for I/O Derrick Stolee via GitGitGadget
2020-12-30 19:26 ` [PATCH 5/8] cache-tree: trace regions for prime_cache_tree Derrick Stolee via GitGitGadget
2020-12-30 19:48 ` Elijah Newren
2020-12-30 19:53 ` Derrick Stolee
2020-12-30 19:26 ` [PATCH 6/8] index-format: update preamble to cached tree extension Derrick Stolee via GitGitGadget
2020-12-30 20:00 ` Elijah Newren
2020-12-30 19:26 ` [PATCH 7/8] index-format: discuss recursion of cached-tree better Derrick Stolee via GitGitGadget
2020-12-30 19:26 ` [PATCH 8/8] cache-tree: avoid path comparison loop when silent Derrick Stolee via GitGitGadget
2020-12-30 20:14 ` Elijah Newren
2021-01-06 8:55 ` Junio C Hamano
2021-01-06 12:08 ` Derrick Stolee
2020-12-31 12:34 ` René Scharfe
2020-12-31 16:46 ` Derrick Stolee
2021-01-01 13:30 ` René Scharfe
2021-01-02 15:19 ` [PATCH] cache-tree: use ce_namelen() instead of strlen() René Scharfe
2021-01-04 1:26 ` Derrick Stolee
2021-01-05 12:05 ` Junio C Hamano
2021-01-02 15:31 ` [PATCH 8/8] cache-tree: avoid path comparison loop when silent René Scharfe
2020-12-30 20:19 ` [PATCH 0/8] Cleanups around index operations Elijah Newren
2020-12-30 20:24 ` Derrick Stolee
2021-01-04 3:09 ` Derrick Stolee via GitGitGadget [this message]
2021-01-04 3:09 ` [PATCH v2 1/9] tree-walk: report recursion counts Derrick Stolee via GitGitGadget
2021-01-04 3:09 ` [PATCH v2 2/9] unpack-trees: add trace2 regions Derrick Stolee via GitGitGadget
2021-01-04 3:09 ` [PATCH v2 3/9] cache-tree: use trace2 in cache_tree_update() Derrick Stolee via GitGitGadget
2021-01-04 3:09 ` [PATCH v2 4/9] cache-tree: trace regions for I/O Derrick Stolee via GitGitGadget
2021-01-04 3:09 ` [PATCH v2 5/9] cache-tree: trace regions for prime_cache_tree Derrick Stolee via GitGitGadget
2021-01-04 3:09 ` [PATCH v2 6/9] index-format: update preamble to cached tree extension Derrick Stolee via GitGitGadget
2021-01-07 2:10 ` Junio C Hamano
2021-01-07 11:51 ` Derrick Stolee
2021-01-07 20:12 ` Junio C Hamano
2021-01-07 21:26 ` Junio C Hamano
2021-01-04 3:09 ` [PATCH v2 7/9] index-format: discuss recursion of cached-tree better Derrick Stolee via GitGitGadget
2021-01-04 3:09 ` [PATCH v2 8/9] cache-tree: use ce_namelen() instead of strlen() René Scharfe via GitGitGadget
2021-01-04 3:09 ` [PATCH v2 9/9] cache-tree: speed up consecutive path comparisons Derrick Stolee via GitGitGadget
2021-01-07 16:32 ` [PATCH v3 00/10] Cleanups around index operations Derrick Stolee via GitGitGadget
2021-01-07 16:32 ` [PATCH v3 01/10] tree-walk: report recursion counts Derrick Stolee via GitGitGadget
2021-01-07 16:32 ` [PATCH v3 02/10] unpack-trees: add trace2 regions Derrick Stolee via GitGitGadget
2021-01-07 16:32 ` [PATCH v3 03/10] cache-tree: use trace2 in cache_tree_update() Derrick Stolee via GitGitGadget
2021-01-07 16:32 ` [PATCH v3 04/10] cache-tree: trace regions for I/O Derrick Stolee via GitGitGadget
2021-01-07 16:32 ` [PATCH v3 05/10] cache-tree: trace regions for prime_cache_tree Derrick Stolee via GitGitGadget
2021-01-07 16:32 ` [PATCH v3 06/10] index-format: use 'cache tree' over 'cached tree' Derrick Stolee via GitGitGadget
2021-01-07 16:32 ` [PATCH v3 07/10] index-format: update preamble to cache tree extension Derrick Stolee via GitGitGadget
2021-01-07 16:32 ` [PATCH v3 08/10] index-format: discuss recursion of cached-tree better Derrick Stolee via GitGitGadget
2021-01-07 16:32 ` [PATCH v3 09/10] cache-tree: use ce_namelen() instead of strlen() René Scharfe via GitGitGadget
2021-01-07 16:32 ` [PATCH v3 10/10] cache-tree: speed up consecutive path comparisons Derrick Stolee via GitGitGadget
2021-01-16 6:58 ` [PATCH v3 00/10] Cleanups around index operations Junio C Hamano