From: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>,
Eric Sunshine <sunshine@sunshineco.com>,
Elijah Newren <newren@gmail.com>,
Derrick Stolee <stolee@gmail.com>,
Elijah Newren <newren@gmail.com>
Subject: [PATCH v3 0/9] Final optimization batch (#15): use memory pools
Date: Fri, 30 Jul 2021 11:47:35 +0000 [thread overview]
Message-ID: <pull.990.v3.git.1627645664.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.990.v2.git.1627531121.gitgitgadget@gmail.com>
This series textually depends on en/ort-perf-batch-14, but the ideas are
orthogonal to it and orthogonal to previous series. It can be reviewed
independently.
Changes since v1, addressing Eric's feedback:
* Fixed a comment that became out-of-date in patch 1
* Swapped commits 2 and 3 so that one can better motivate the other.
Changes since v2, addressing Peff's feedback
* Rebased on en/ort-perf-batch-14 (resolving a trivial conflict with the
new string_list_init_nodup() usage)
* Added a new preliminary patch renaming str*_func() to str*_clear_func()
* Added a new final patch that hardcodes that we'll just use memory pools
=== Basic Optimization idea ===
In this series, I make use of memory pools to get faster allocations and
deallocations for many data structures that tend to all be deallocated at
the same time anyway.
=== Results ===
For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28), the
changes in just this series improves the performance as follows:
Before Series After Series
no-renames: 204.2 ms ± 3.0 ms 198.3 ms ± 2.9 ms
mega-renames: 1.076 s ± 0.015 s 661.8 ms ± 5.9 ms
just-one-mega: 364.1 ms ± 7.0 ms 264.6 ms ± 2.5 ms
As a reminder, before any merge-ort/diffcore-rename performance work, the
performance results we started with were:
no-renames-am: 6.940 s ± 0.485 s
no-renames: 18.912 s ± 0.174 s
mega-renames: 5964.031 s ± 10.459 s
just-one-mega: 149.583 s ± 0.751 s
=== Overall Results across all optimization work ===
This is my final prepared optimization series. It might be worth reviewing
how my optimizations fared overall, comparing the original merge-recursive
timings with three things: how much merge-recursive improved (as a
side-effect of optimizing merge-ort), how much improvement we would have
gotten from a hypothetical infinite parallelization of rename detection, and
what I achieved at the end with merge-ort:
Timings
Infinite
merge- merge- Parallelism
recursive recursive of rename merge-ort
v2.30.0 current detection current
---------- --------- ----------- ---------
no-renames: 18.912 s 18.030 s 11.699 s 198.3 ms
mega-renames: 5964.031 s 361.281 s 203.886 s 661.8 ms
just-one-mega: 149.583 s 11.009 s 7.553 s 264.6 ms
Speedup factors
Infinite
merge- merge- Parallelism
recursive recursive of rename
v2.30.0 current detection merge-ort
---------- --------- ----------- ---------
no-renames: 1 1.05 1.6 95
mega-renames: 1 16.5 29 9012
just-one-mega: 1 13.6 20 565
And, for partial clone users:
Factor reduction in number of objects needed
Infinite
merge- merge- Parallelism
recursive recursive of rename
v2.30.0 current detection merge-ort
---------- --------- ----------- ---------
mega-renames: 1 1 1 181.3
=== Caveat ===
It may be worth noting, though, that my optimization numbers above for
merge-ort use test-tool fast-rebase. git rebase -s ort on the three
testcases above is 5-20 times slower (taking 3.835s, 6.798s, and 1.235s,
respectively). At this point, any further optimization work should go into
making a faster full-featured rebase by copying the ideas from fast-rebase:
avoid unnecessary process forking, avoid updating the index and working copy
until either the rebase is finished or you hit a conflict (and don't write
rebase metadata to disk until that point either), get rid of the glacially
slow revision walking of the upstream side of history (nuke
can_fast_forward(), make --reapply-cherry-picks the default) or at least
don't revision walk so many times (multiple calls to get_merge_bases in
can_fast_forward() plus a is_linear_history() walk, checking for upstream
cherry-picks, probably more), turn off per-commit hooks that probably should
have never been on anyway, etc.
Elijah Newren (9):
merge-ort: rename str{map,intmap,set}_func()
diffcore-rename: use a mem_pool for exact rename detection's hashmap
merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers
merge-ort: set up a memory pool
merge-ort: switch our strmaps over to using memory pools
diffcore-rename, merge-ort: add wrapper functions for filepair
alloc/dealloc
merge-ort: store filepairs and filespecs in our mem_pool
merge-ort: reuse path strings in pool_alloc_filespec
merge-ort: remove compile-time ability to turn off usage of memory
pools
diffcore-rename.c | 68 +++++++++++--
diffcore.h | 3 +
merge-ort.c | 238 ++++++++++++++++++++++++++++++++--------------
3 files changed, 231 insertions(+), 78 deletions(-)
base-commit: 8b09a900a1f1f00d4deb04f567994ae8f1804b5e
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-990%2Fnewren%2Fort-perf-batch-15-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-990/newren/ort-perf-batch-15-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/990
Range-diff vs v2:
-: ----------- > 1: e075d985f26 merge-ort: rename str{map,intmap,set}_func()
1: ea08b34d29b = 2: 8416afa89fb diffcore-rename: use a mem_pool for exact rename detection's hashmap
2: fdfc2b93ba4 ! 3: 2c0b90eaba5 merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers
@@ Metadata
## Commit message ##
merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers
- We need functions which will either call
+ Make the code more flexible so that it can handle both being run with or
+ without a memory pool by adding utility functions which will either call
xmalloc, xcalloc, xstrndup
or
mem_pool_alloc, mem_pool_calloc, mem_pool_strndup
- depending on whether we have a non-NULL memory pool. Add these
- functions; a subsequent commit will make use of these.
+ depending on whether we have a non-NULL memory pool. A subsequent
+ commit will make use of these.
+
+ (We will actually be dropping these functions soon and just assuming we
+ always have a memory pool, but the flexibility was very useful during
+ development of merge-ort so I want to be able to restore it if needed.)
Signed-off-by: Elijah Newren <newren@gmail.com>
3: c7150869107 = 4: 6646f6fd1ca merge-ort: set up a memory pool
4: dd8839b2843 ! 5: 7c49aa601d0 merge-ort: switch our strmaps over to using memory pools
@@ Commit message
## merge-ort.c ##
@@ merge-ort.c: static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
- void (*strset_func)(struct strset *) =
+ void (*strset_clear_func)(struct strset *) =
reinitialize ? strset_partial_clear : strset_clear;
- /*
@@ merge-ort.c: static void clear_or_reinit_internal_opts(struct merge_options_inte
- * to deallocate them.
- */
- free_strmap_strings(&opti->paths);
-- strmap_func(&opti->paths, 1);
+- strmap_clear_func(&opti->paths, 1);
+ if (opti->pool)
-+ strmap_func(&opti->paths, 0);
++ strmap_clear_func(&opti->paths, 0);
+ else {
+ /*
+ * We marked opti->paths with strdup_strings = 0, so that
@@ merge-ort.c: static void clear_or_reinit_internal_opts(struct merge_options_inte
+ * to these strings, it is time to deallocate them.
+ */
+ free_strmap_strings(&opti->paths);
-+ strmap_func(&opti->paths, 1);
++ strmap_clear_func(&opti->paths, 1);
+ }
/*
* All keys and values in opti->conflicted are a subset of those in
@@ merge-ort.c: static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
*/
- strmap_func(&opti->conflicted, 0);
+ strmap_clear_func(&opti->conflicted, 0);
- /*
- * opti->paths_to_free is similar to opti->paths; we created it with
@@ merge-ort.c: static void merge_start(struct merge_options *opt, struct merge_res
*/
- strmap_init_with_options(&opt->priv->paths, NULL, 0);
- strmap_init_with_options(&opt->priv->conflicted, NULL, 0);
-- string_list_init(&opt->priv->paths_to_free, 0);
+- string_list_init_nodup(&opt->priv->paths_to_free);
+ strmap_init_with_options(&opt->priv->paths, pool, 0);
+ strmap_init_with_options(&opt->priv->conflicted, pool, 0);
+ if (!opt->priv->pool)
-+ string_list_init(&opt->priv->paths_to_free, 0);
++ string_list_init_nodup(&opt->priv->paths_to_free);
/*
* keys & strbufs in output will sometimes need to outlive "paths",
5: 560800a80ef = 6: 08cf2498f96 diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc
6: 94d60c8a476 = 7: 4ffa5af8b57 merge-ort: store filepairs and filespecs in our mem_pool
7: fda885dabe6 = 8: 1556f0443c3 merge-ort: reuse path strings in pool_alloc_filespec
-: ----------- > 9: de30dbac25e merge-ort: remove compile-time ability to turn off usage of memory pools
--
gitgitgadget
next prev parent reply other threads:[~2021-07-30 11:47 UTC|newest]
Thread overview: 65+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-23 12:54 [PATCH 0/7] Final optimization batch (#15): use memory pools Elijah Newren via GitGitGadget
2021-07-23 12:54 ` [PATCH 1/7] diffcore-rename: use a mem_pool for exact rename detection's hashmap Elijah Newren via GitGitGadget
2021-07-23 21:59 ` Eric Sunshine
2021-07-23 22:03 ` Elijah Newren
2021-07-23 12:54 ` [PATCH 2/7] merge-ort: set up a memory pool Elijah Newren via GitGitGadget
2021-07-23 12:54 ` [PATCH 3/7] merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers Elijah Newren via GitGitGadget
2021-07-23 22:07 ` Eric Sunshine
2021-07-26 14:36 ` Derrick Stolee
2021-07-28 22:49 ` Elijah Newren
2021-07-29 15:26 ` Jeff King
2021-07-30 2:27 ` Elijah Newren
2021-07-30 16:12 ` Jeff King
2021-07-23 12:54 ` [PATCH 4/7] merge-ort: switch our strmaps over to using memory pools Elijah Newren via GitGitGadget
2021-07-23 12:54 ` [PATCH 5/7] diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc Elijah Newren via GitGitGadget
2021-07-23 12:54 ` [PATCH 6/7] merge-ort: store filepairs and filespecs in our mem_pool Elijah Newren via GitGitGadget
2021-07-23 12:54 ` [PATCH 7/7] merge-ort: reuse path strings in pool_alloc_filespec Elijah Newren via GitGitGadget
2021-07-26 14:44 ` [PATCH 0/7] Final optimization batch (#15): use memory pools Derrick Stolee
2021-07-28 22:52 ` Elijah Newren
2021-07-29 3:58 ` [PATCH v2 " Elijah Newren via GitGitGadget
2021-07-29 3:58 ` [PATCH v2 1/7] diffcore-rename: use a mem_pool for exact rename detection's hashmap Elijah Newren via GitGitGadget
2021-07-29 3:58 ` [PATCH v2 2/7] merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers Elijah Newren via GitGitGadget
2021-07-29 3:58 ` [PATCH v2 3/7] merge-ort: set up a memory pool Elijah Newren via GitGitGadget
2021-07-29 3:58 ` [PATCH v2 4/7] merge-ort: switch our strmaps over to using memory pools Elijah Newren via GitGitGadget
2021-07-29 15:28 ` Jeff King
2021-07-29 18:37 ` Elijah Newren
2021-07-29 20:09 ` Jeff King
2021-07-30 2:30 ` Elijah Newren
2021-07-30 16:12 ` Jeff King
2021-07-30 13:30 ` Ævar Arnfjörð Bjarmason
2021-07-30 14:36 ` Elijah Newren
2021-07-30 16:23 ` Ævar Arnfjörð Bjarmason
2021-07-29 3:58 ` [PATCH v2 5/7] diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc Elijah Newren via GitGitGadget
2021-07-29 3:58 ` [PATCH v2 6/7] merge-ort: store filepairs and filespecs in our mem_pool Elijah Newren via GitGitGadget
2021-07-29 3:58 ` [PATCH v2 7/7] merge-ort: reuse path strings in pool_alloc_filespec Elijah Newren via GitGitGadget
2021-07-29 14:58 ` [PATCH v2 0/7] Final optimization batch (#15): use memory pools Derrick Stolee
2021-07-29 16:20 ` Jeff King
2021-07-29 16:23 ` Jeff King
2021-07-29 19:46 ` Junio C Hamano
2021-07-29 20:48 ` Junio C Hamano
2021-07-29 21:05 ` Elijah Newren
2021-07-29 20:46 ` Elijah Newren
2021-07-29 21:14 ` Jeff King
2021-07-30 11:47 ` Elijah Newren via GitGitGadget [this message]
2021-07-30 11:47 ` [PATCH v3 1/9] merge-ort: rename str{map,intmap,set}_func() Elijah Newren via GitGitGadget
2021-07-30 11:47 ` [PATCH v3 2/9] diffcore-rename: use a mem_pool for exact rename detection's hashmap Elijah Newren via GitGitGadget
2021-07-30 11:47 ` [PATCH v3 3/9] merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers Elijah Newren via GitGitGadget
2021-07-30 11:47 ` [PATCH v3 4/9] merge-ort: set up a memory pool Elijah Newren via GitGitGadget
2021-07-30 11:47 ` [PATCH v3 5/9] merge-ort: switch our strmaps over to using memory pools Elijah Newren via GitGitGadget
2021-07-30 11:47 ` [PATCH v3 6/9] diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc Elijah Newren via GitGitGadget
2021-07-30 11:47 ` [PATCH v3 7/9] merge-ort: store filepairs and filespecs in our mem_pool Elijah Newren via GitGitGadget
2021-07-30 11:47 ` [PATCH v3 8/9] merge-ort: reuse path strings in pool_alloc_filespec Elijah Newren via GitGitGadget
2021-07-30 11:47 ` [PATCH v3 9/9] merge-ort: remove compile-time ability to turn off usage of memory pools Elijah Newren via GitGitGadget
2021-07-30 16:24 ` Jeff King
2021-07-31 17:27 ` [PATCH v4 0/9] Final optimization batch (#15): use " Elijah Newren via GitGitGadget
2021-07-31 17:27 ` [PATCH v4 1/9] merge-ort: rename str{map,intmap,set}_func() Elijah Newren via GitGitGadget
2021-07-31 17:27 ` [PATCH v4 2/9] diffcore-rename: use a mem_pool for exact rename detection's hashmap Elijah Newren via GitGitGadget
2021-07-31 17:27 ` [PATCH v4 3/9] merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers Elijah Newren via GitGitGadget
2021-07-31 17:27 ` [PATCH v4 4/9] merge-ort: set up a memory pool Elijah Newren via GitGitGadget
2021-07-31 17:27 ` [PATCH v4 5/9] merge-ort: switch our strmaps over to using memory pools Elijah Newren via GitGitGadget
2021-07-31 17:27 ` [PATCH v4 6/9] diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc Elijah Newren via GitGitGadget
2021-07-31 17:27 ` [PATCH v4 7/9] merge-ort: store filepairs and filespecs in our mem_pool Elijah Newren via GitGitGadget
2021-07-31 17:27 ` [PATCH v4 8/9] merge-ort: reuse path strings in pool_alloc_filespec Elijah Newren via GitGitGadget
2021-07-31 17:27 ` [PATCH v4 9/9] merge-ort: remove compile-time ability to turn off usage of memory pools Elijah Newren via GitGitGadget
2021-08-02 15:27 ` [PATCH v4 0/9] Final optimization batch (#15): use " Derrick Stolee
2021-08-03 15:45 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.990.v3.git.1627645664.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=git@vger.kernel.org \
--cc=newren@gmail.com \
--cc=peff@peff.net \
--cc=stolee@gmail.com \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).