From: "Justin Tobler via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>, Justin Tobler <jltobler@gmail.com>
Subject: [PATCH v2 0/3] reftable/stack: use geometric table compaction
Date: Thu, 21 Mar 2024 22:40:16 +0000 [thread overview]
Message-ID: <pull.1683.v2.git.1711060819.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1683.git.1709669025722.gitgitgadget@gmail.com>
Hello again,
This is the second version my patch series that refactors the reftable
compaction strategy to instead follow a geometric sequence. Changes compared
to v1:
* Added GIT_TEST_REFTABLE_NO_AUTOCOMPACTION environment variable to disable
reftable compaction when testing.
* Refactored worktree tests in t0610-reftable-basics.sh to properly assert
git-pack-refs(1) works as expected.
* Added test to validate that alternating table sizes are compacted.
* Added benchmark to compare compaction strategies.
* Moved change that made compaction segment end inclusive to its own
commit.
* Added additional explanation in commits and comments and fixed typos.
Thanks for taking a look!
Justin
Justin Tobler (3):
reftable/stack: add env to disable autocompaction
reftable/stack: use geometric table compaction
reftable/segment: make segment end inclusive
reftable/stack.c | 113 ++++++++++++++++---------------------
reftable/stack.h | 3 -
reftable/stack_test.c | 66 +++++-----------------
reftable/system.h | 1 +
t/t0610-reftable-basics.sh | 43 +++++++++-----
5 files changed, 94 insertions(+), 132 deletions(-)
base-commit: 3bd955d26919e149552f34aacf8a4e6368c26cec
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1683%2Fjltobler%2Fjt%2Freftable-geometric-compaction-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1683/jltobler/jt/reftable-geometric-compaction-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1683
Range-diff vs v1:
-: ----------- > 1: cb6b152e5c8 reftable/stack: add env to disable autocompaction
1: 7a518853a10 ! 2: def70084523 reftable/stack: use geometric table compaction
@@ Commit message
occurring until a separate operation produces a table matching the
previous table log value.
- To avoid unbounded growth of the table list, walk through each table and
- evaluate if it needs to be included in the compaction segment to restore
- a geometric sequence.
+ Instead, to avoid unbounded growth of the table list, the compaction
+ strategy is updated to ensure tables follow a geometric sequence after
+ each operation. This is done by walking the table list in reverse index
+ order to identify the compaction segment start and end. The compaction
+ segment end is found by identifying the first table which has a
+ preceding table size less than twice the current table. Next, the
+ compaction segment start is found iterating through the remaining tables
+ in the list checking if the previous table size is less than twice the
+ cumulative of tables from the segment end. This ensures the correct
+ segment start is found and that the newly compacted table does not
+ violate the geometric sequence.
+
+ When creating 10 thousand references, the new strategy has no
+ performance impact:
+
+ Benchmark 1: update-ref: create refs sequentially (revision = HEAD~)
+ Time (mean ± σ): 26.516 s ± 0.047 s [User: 17.864 s, System: 8.491 s]
+ Range (min … max): 26.447 s … 26.569 s 10 runs
+
+ Benchmark 2: update-ref: create refs sequentially (revision = HEAD)
+ Time (mean ± σ): 26.417 s ± 0.028 s [User: 17.738 s, System: 8.500 s]
+ Range (min … max): 26.366 s … 26.444 s 10 runs
+
+ Summary
+ update-ref: create refs sequentially (revision = HEAD) ran
+ 1.00 ± 0.00 times faster than update-ref: create refs sequentially (revision = HEAD~)
Some tests in `t0610-reftable-basics.sh` assert the on-disk state of
tables and are therefore updated to specify the correct new table count.
@@ reftable/stack.c: static int segment_size(struct segment *s)
+ * until a valid segment end is found. If the preceding table is smaller
+ * than the current table multiplied by the geometric factor (2), the
+ * current table is set as the compaction segment end.
++ *
++ * Tables after the ending point are not added to the byte count because
++ * they are already valid members of the geometric sequence. Due to the
++ * properties of a geometric sequence, it is not possible for the sum of
++ * these tables to exceed the value of the ending point table.
+ */
+ for (i = n - 1; i > 0; i--) {
+ if (sizes[i - 1] < sizes[i] * 2) {
-+ seg.end = i;
++ seg.end = i + 1;
+ bytes = sizes[i];
break;
+ }
@@ reftable/stack.c: static int segment_size(struct segment *s)
+
+ /*
+ * Find the starting table of the compaction segment by iterating
-+ * through the remaing tables and keeping track of the accumulated size
-+ * of all tables seen from the segment end table.
++ * through the remaining tables and keeping track of the accumulated
++ * size of all tables seen from the segment end table.
+ *
+ * Note that we keep iterating even after we have found the first
-+ * first starting point. This is because there may be tables in the
-+ * stack preceding that first starting point which violate the geometric
++ * starting point. This is because there may be tables in the stack
++ * preceding that first starting point which violate the geometric
+ * sequence.
+ */
+ for (; i > 0; i--) {
@@ reftable/stack.c: static int segment_size(struct segment *s)
}
static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
-@@ reftable/stack.c: int reftable_stack_auto_compact(struct reftable_stack *st)
- suggest_compaction_segment(sizes, st->merged->stack_len);
- reftable_free(sizes);
- if (segment_size(&seg) > 0)
-- return stack_compact_range_stats(st, seg.start, seg.end - 1,
-+ return stack_compact_range_stats(st, seg.start, seg.end,
- NULL);
-
- return 0;
## reftable/stack.h ##
@@ reftable/stack.h: int read_lines(const char *filename, char ***lines);
@@ reftable/stack_test.c: static void test_reftable_stack_hash_id(void)
- EXPECT(min.start == 2);
- EXPECT(min.end == 7);
+ EXPECT(min.start == 1);
-+ EXPECT(min.end == 9);
++ EXPECT(min.end == 10);
}
static void test_suggest_compaction_segment_nothing(void)
@@ t/t0610-reftable-basics.sh: test_expect_success 'ref transaction: writes cause a
test_commit -C repo --no-tag B &&
test_line_count = 1 repo/.git/reftable/tables.list
+ '
+
++test_expect_success 'ref transaction: alternating table sizes are compacted' '
++ test_when_finished "rm -rf repo" &&
++ git init repo &&
++ test_commit -C repo A &&
++ for i in $(test_seq 20)
++ do
++ git -C repo branch -f foo &&
++ git -C repo branch -d foo || return 1
++ done &&
++ test_line_count = 2 repo/.git/reftable/tables.list
++'
++
+ check_fsync_events () {
+ local trace="$1" &&
+ shift &&
@@ t/t0610-reftable-basics.sh: test_expect_success 'ref transaction: writes are synced' '
git -C repo -c core.fsync=reference \
-c core.fsyncMethod=fsync update-ref refs/heads/branch HEAD &&
@@ t/t0610-reftable-basics.sh: do
git -C repo pack-refs &&
test_expect_perms "-rw-rw-r--" repo/.git/reftable/tables.list &&
@@ t/t0610-reftable-basics.sh: test_expect_success 'worktree: pack-refs in main repo packs main refs' '
+ test_when_finished "rm -rf repo worktree" &&
+ git init repo &&
test_commit -C repo A &&
- git -C repo worktree add ../worktree &&
+- git -C repo worktree add ../worktree &&
++ GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C repo worktree add ../worktree &&
++ GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/worktree/per-worktree HEAD &&
- test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
- test_line_count = 4 repo/.git/reftable/tables.list &&
-+ test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
-+ test_line_count = 1 repo/.git/reftable/tables.list &&
++ test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
++ test_line_count = 3 repo/.git/reftable/tables.list &&
git -C repo pack-refs &&
- test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
-+ test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
++ test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
test_line_count = 1 repo/.git/reftable/tables.list
'
@@ t/t0610-reftable-basics.sh: test_expect_success 'worktree: pack-refs in worktree packs worktree refs' '
+ test_when_finished "rm -rf repo worktree" &&
+ git init repo &&
test_commit -C repo A &&
- git -C repo worktree add ../worktree &&
+- git -C repo worktree add ../worktree &&
++ GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C repo worktree add ../worktree &&
++ GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/worktree/per-worktree HEAD &&
- test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
- test_line_count = 4 repo/.git/reftable/tables.list &&
-+ test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
-+ test_line_count = 1 repo/.git/reftable/tables.list &&
++ test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
++ test_line_count = 3 repo/.git/reftable/tables.list &&
git -C worktree pack-refs &&
test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
- test_line_count = 4 repo/.git/reftable/tables.list
-+ test_line_count = 1 repo/.git/reftable/tables.list
++ test_line_count = 3 repo/.git/reftable/tables.list
'
test_expect_success 'worktree: creating shared ref updates main stack' '
+ test_when_finished "rm -rf repo worktree" &&
+ git init repo &&
+ test_commit -C repo A &&
++ test_commit -C repo B &&
+
+ git -C repo worktree add ../worktree &&
+ git -C repo pack-refs &&
@@ t/t0610-reftable-basics.sh: test_expect_success 'worktree: creating shared ref updates main stack' '
+ test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
+ test_line_count = 1 repo/.git/reftable/tables.list &&
- git -C worktree update-ref refs/heads/shared HEAD &&
+- git -C worktree update-ref refs/heads/shared HEAD &&
++ GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/heads/shared HEAD &&
test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
-- test_line_count = 2 repo/.git/reftable/tables.list
-+ test_line_count = 1 repo/.git/reftable/tables.list
+ test_line_count = 2 repo/.git/reftable/tables.list
'
-
- test_expect_success 'worktree: creating per-worktree ref updates worktree stack' '
-: ----------- > 3: a23e3fc6972 reftable/segment: make segment end inclusive
--
gitgitgadget
next prev parent reply other threads:[~2024-03-21 22:40 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-05 20:03 [PATCH] reftable/stack: use geometric table compaction Justin Tobler via GitGitGadget
2024-03-06 12:30 ` Patrick Steinhardt
2024-03-06 12:37 ` Patrick Steinhardt
2024-03-21 22:48 ` Justin Tobler
2024-03-21 22:40 ` Justin Tobler via GitGitGadget [this message]
2024-03-21 22:40 ` [PATCH v2 1/3] reftable/stack: add env to disable autocompaction Justin Tobler via GitGitGadget
2024-03-22 1:25 ` Patrick Steinhardt
2024-03-21 22:40 ` [PATCH v2 2/3] reftable/stack: use geometric table compaction Justin Tobler via GitGitGadget
2024-03-22 1:25 ` Patrick Steinhardt
2024-03-27 13:24 ` Karthik Nayak
2024-03-21 22:40 ` [PATCH v2 3/3] reftable/segment: make segment end inclusive Justin Tobler via GitGitGadget
2024-03-22 1:25 ` [PATCH v2 0/3] reftable/stack: use geometric table compaction Patrick Steinhardt
2024-04-03 10:13 ` Han-Wen Nienhuys
2024-04-03 10:18 ` Patrick Steinhardt
2024-04-03 15:14 ` Justin Tobler
2024-04-03 16:40 ` Junio C Hamano
2024-03-29 4:16 ` [PATCH v3 " Justin Tobler via GitGitGadget
2024-03-29 4:16 ` [PATCH v3 1/3] reftable/stack: add env to disable autocompaction Justin Tobler via GitGitGadget
2024-03-29 18:25 ` Junio C Hamano
2024-03-29 21:56 ` Junio C Hamano
2024-04-02 7:23 ` Patrick Steinhardt
2024-04-02 17:23 ` Junio C Hamano
2024-03-29 4:16 ` [PATCH v3 2/3] reftable/stack: use geometric table compaction Justin Tobler via GitGitGadget
2024-04-02 7:23 ` Patrick Steinhardt
2024-03-29 4:16 ` [PATCH v3 3/3] reftable/stack: make segment end inclusive Justin Tobler via GitGitGadget
2024-03-29 18:36 ` Junio C Hamano
2024-04-02 7:23 ` Patrick Steinhardt
2024-04-03 0:20 ` [PATCH v4 0/2] reftable/stack: use geometric table compaction Justin Tobler via GitGitGadget
2024-04-03 0:20 ` [PATCH v4 1/2] reftable/stack: add env to disable autocompaction Justin Tobler via GitGitGadget
2024-04-03 0:20 ` [PATCH v4 2/2] reftable/stack: use geometric table compaction Justin Tobler via GitGitGadget
2024-04-03 4:47 ` [PATCH v4 0/2] " Patrick Steinhardt
2024-04-03 11:12 ` Karthik Nayak
2024-04-03 16:56 ` Junio C Hamano
2024-04-04 18:29 ` [PATCH v5 0/3] " Justin Tobler via GitGitGadget
2024-04-04 18:29 ` [PATCH v5 1/3] reftable/stack: allow disabling of auto-compaction Justin Tobler via GitGitGadget
2024-04-08 6:12 ` Patrick Steinhardt
2024-04-04 18:29 ` [PATCH v5 2/3] reftable/stack: add env to disable autocompaction Justin Tobler via GitGitGadget
2024-04-08 6:12 ` Patrick Steinhardt
2024-04-08 16:18 ` Junio C Hamano
2024-04-04 18:29 ` [PATCH v5 3/3] reftable/stack: use geometric table compaction Justin Tobler via GitGitGadget
2024-04-08 6:12 ` [PATCH v5 0/3] " Patrick Steinhardt
2024-04-08 16:17 ` Justin Tobler
2024-04-08 16:16 ` [PATCH v6 " Justin Tobler via GitGitGadget
2024-04-08 16:16 ` [PATCH v6 1/3] reftable/stack: expose option to disable auto-compaction Justin Tobler via GitGitGadget
2024-04-08 16:16 ` [PATCH v6 2/3] reftable/stack: add env to disable autocompaction Justin Tobler via GitGitGadget
2024-04-08 16:16 ` [PATCH v6 3/3] reftable/stack: use geometric table compaction Justin Tobler via GitGitGadget
2024-04-08 16:20 ` [PATCH v6 0/3] " Patrick Steinhardt
2024-04-08 19:12 ` Junio C Hamano
2024-04-03 19:12 ` [PATCH v2 " Junio C Hamano
2024-04-03 19:30 ` Patrick Steinhardt
2024-04-04 5:34 ` Patrick Steinhardt
2024-04-04 18:28 ` Justin Tobler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.1683.v2.git.1711060819.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=git@vger.kernel.org \
--cc=jltobler@gmail.com \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.