* [PATCH 0/4] pack-bitmap: enable lookup tables by default, misc. cleanups
@ 2025-04-17 21:12 Taylor Blau
2025-04-17 21:12 ` [PATCH 1/4] pack-bitmap: write lookup table extension by default Taylor Blau
` (4 more replies)
0 siblings, 5 replies; 18+ messages in thread
From: Taylor Blau @ 2025-04-17 21:12 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Jeff King, Junio C Hamano
This is a short series I extracted from a larger topic on reusing
"external"[^1] deltas during verbatim pack reuse.
As part of performance-testing that series, I realized that bitmap
lookup tables are not written by default. Since it has been a
significant period of time since their introduction, the first patch of
this series makes writing the lookup table extension the default
behavior. This is:
* pack-bitmap: write lookup table extension by default
The next three patches clean up some t/perf scripts that were redundant
now that lookup tables are the default behavior. Those are:
* p5312: removed duplicate performance test script
* t/perf: avoid testing bitmaps without lookup table
* t/perf/lib-bitmap.sh: avoid test_perf during setup
Thanks in advance for your review :-).
[^1]: The term I'm using to describe delta/base pairs which either (a)
are represented from different packs in a MIDX bitmap, or (b) the client
is known to already have the base.
Taylor Blau (4):
pack-bitmap: write lookup table extension by default
p5312: removed duplicate performance test script
t/perf: avoid testing bitmaps without lookup table
t/perf/lib-bitmap.sh: avoid test_perf during setup
Documentation/config/pack.adoc | 2 +-
builtin/multi-pack-index.c | 1 +
builtin/pack-objects.c | 2 +-
t/perf/lib-bitmap.sh | 2 +-
t/perf/p5310-pack-bitmaps.sh | 47 +++++-------
t/perf/p5311-pack-bitmaps-fetch.sh | 76 +++++++++----------
t/perf/p5312-pack-bitmaps-revs.sh | 34 ---------
t/perf/p5326-multi-pack-bitmaps.sh | 107 ++++++++++++---------------
t/perf/p5333-pseudo-merge-bitmaps.sh | 1 -
9 files changed, 106 insertions(+), 166 deletions(-)
delete mode 100755 t/perf/p5312-pack-bitmaps-revs.sh
base-commit: c152ae3ef50dc7bbbf5089571df5bba404a96e0d
--
2.49.0.226.g0e6cae136d
^ permalink raw reply [flat|nested] 18+ messages in thread* [PATCH 1/4] pack-bitmap: write lookup table extension by default 2025-04-17 21:12 [PATCH 0/4] pack-bitmap: enable lookup tables by default, misc. cleanups Taylor Blau @ 2025-04-17 21:12 ` Taylor Blau 2025-04-17 22:04 ` Junio C Hamano 2025-04-17 21:12 ` [PATCH 2/4] p5312: removed duplicate performance test script Taylor Blau ` (3 subsequent siblings) 4 siblings, 1 reply; 18+ messages in thread From: Taylor Blau @ 2025-04-17 21:12 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Jeff King, Junio C Hamano The lookup table extension for reachability bitmaps was first introduced via 3fe0121479 (Merge branch 'ac/bitmap-lookup-table', 2022-09-05). For each bitmapped commit, the lookup table encodes three unsigned integers: - The pack position of the commit. - An offset within the *.bitmap itself where that commit's bitmap lives. - An index into the same table where information on that commit's XOR pair can be found. Lookup tables make bitmap operations faster because they no longer have to read the entire *.bitmap file to discover what commits have corresponding reachability bitmaps. When bitmap lookup tables were first introduced, we established a baseline level of performance in p5310 with and without lookup tables. Here is the baseline without: Test this tree ----------------------------------------------------------------------- 5310.6: simulated clone 14.04(5.78+1.79) 5310.7: simulated fetch 1.95(3.05+0.20) 5310.8: pack to file (bitmap) 44.73(20.55+7.45) 5310.9: rev-list (commits) 0.78(0.46+0.10) 5310.10: rev-list (objects) 4.07(3.97+0.08) 5310.11: rev-list with tag negated via --not 0.06(0.02+0.03) --all (objects) 5310.12: rev-list with negative tag (objects) 0.21(0.15+0.05) 5310.13: rev-list count with blob:none 0.24(0.17+0.06) 5310.14: rev-list count with blob:limit=1k 7.07(5.92+0.48) 5310.15: rev-list count with tree:0 0.25(0.17+0.07) 5310.16: simulated partial clone 5.67(3.28+0.64) 5310.18: clone (partial bitmap) 16.05(8.34+1.86) 5310.19: pack to file (partial bitmap) 59.76(27.22+7.43) 5310.20: rev-list with tree filter (partial bitmap) 0.90(0.18+0.16) , and here is the same set of tests, this time with the lookup table enabled: Test this tree ----------------------------------------------------------------------- 5310.26: simulated clone 13.69(5.72+1.78) 5310.27: simulated fetch 1.84(3.02+0.16) 5310.28: pack to file (bitmap) 45.63(20.67+7.50) 5310.29: rev-list (commits) 0.56(0.39+0.8) 5310.30: rev-list (objects) 3.77(3.74+0.08) 5310.31: rev-list with tag negated via --not 0.05(0.02+0.03) --all (objects) 5310.32: rev-list with negative tag (objects) 0.21(0.15+0.05) 5310.33: rev-list count with blob:none 0.23(0.17+0.05) 5310.34: rev-list count with blob:limit=1k 6.65(5.72+0.40) 5310.35: rev-list count with tree:0 0.23(0.16+0.06) 5310.36: simulated partial clone 5.57(3.26+0.59) 5310.38: clone (partial bitmap) 15.89(8.39+1.84) 5310.39: pack to file (partial bitmap) 58.32(27.55+7.47) 5310.40: rev-list with tree filter (partial bitmap) 0.73(0.18+0.15) (All numbers here come from a ~2022-era copy of the kernel, via Abhradeep Chakraborty who implemented the lookup table extension). In the almost three years since lookup tables were introduced, GitHub has used them in production without issue, taking advantage of the above performance benefits along the way. Since this feature has had sufficient time to flush out any bugs and/or performance regressions, let's enable it by default so that all bitmap users can reap similar performance benefits. [1]: https://lore.kernel.org/git/pull.1266.git.1655728395.gitgitgadget@gmail.com/ Signed-off-by: Taylor Blau <me@ttaylorr.com> --- Documentation/config/pack.adoc | 2 +- builtin/multi-pack-index.c | 1 + builtin/pack-objects.c | 2 +- 3 files changed, 3 insertions(+), 2 deletions(-) diff --git a/Documentation/config/pack.adoc b/Documentation/config/pack.adoc index da527377fa..ba538e3d9c 100644 --- a/Documentation/config/pack.adoc +++ b/Documentation/config/pack.adoc @@ -191,7 +191,7 @@ pack.writeBitmapLookupTable:: bitmap index (if one is written). This table is used to defer loading individual bitmaps as late as possible. This can be beneficial in repositories that have relatively large bitmap - indexes. Defaults to false. + indexes. Defaults to true. pack.readReverseIndex:: When true, git will read any .rev file(s) that may be available diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index 2a938466f5..6ad6f814e3 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -142,6 +142,7 @@ static int cmd_multi_pack_index_write(int argc, const char **argv, int ret; opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE; + opts.flags |= MIDX_WRITE_BITMAP_LOOKUP_TABLE; git_config(git_multi_pack_index_write_config, NULL); diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 3973267e9e..384fefbb1d 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -239,7 +239,7 @@ static enum { WRITE_BITMAP_QUIET, WRITE_BITMAP_TRUE, } write_bitmap_index; -static uint16_t write_bitmap_options = BITMAP_OPT_HASH_CACHE; +static uint16_t write_bitmap_options = BITMAP_OPT_HASH_CACHE | BITMAP_OPT_LOOKUP_TABLE; static int exclude_promisor_objects; static int exclude_promisor_objects_best_effort; -- 2.49.0.226.g0e6cae136d ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 1/4] pack-bitmap: write lookup table extension by default 2025-04-17 21:12 ` [PATCH 1/4] pack-bitmap: write lookup table extension by default Taylor Blau @ 2025-04-17 22:04 ` Junio C Hamano 2025-04-18 9:33 ` Jeff King 0 siblings, 1 reply; 18+ messages in thread From: Junio C Hamano @ 2025-04-17 22:04 UTC (permalink / raw) To: Taylor Blau; +Cc: git, Elijah Newren, Jeff King Taylor Blau <me@ttaylorr.com> writes: > diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c > index 2a938466f5..6ad6f814e3 100644 > --- a/builtin/multi-pack-index.c > +++ b/builtin/multi-pack-index.c > @@ -142,6 +142,7 @@ static int cmd_multi_pack_index_write(int argc, const char **argv, > int ret; > > opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE; > + opts.flags |= MIDX_WRITE_BITMAP_LOOKUP_TABLE; > > git_config(git_multi_pack_index_write_config, NULL); > > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c > index 3973267e9e..384fefbb1d 100644 > --- a/builtin/pack-objects.c > +++ b/builtin/pack-objects.c > @@ -239,7 +239,7 @@ static enum { > WRITE_BITMAP_QUIET, > WRITE_BITMAP_TRUE, > } write_bitmap_index; > -static uint16_t write_bitmap_options = BITMAP_OPT_HASH_CACHE; > +static uint16_t write_bitmap_options = BITMAP_OPT_HASH_CACHE | BITMAP_OPT_LOOKUP_TABLE; Are these two hunks required to be kept in sync? If so, I am wondering what is the right approach to make sure they are. The definition of MIDX_WRITE_BITMAP_* flag bits in midx.h and BITMAP_OPT_* flag bits in write_bitmap_index enum are distinct two sets, and we need a way to somehow convert between them back and forth if we really wanted to ensure that these "default" values are kept in sync automatically. The reason I ask is mostly because I do not know offhand, and I would imagine that it would be hard for third-parties to verify, if these are only two places that need to be updated in order for lookup table extensions to be written by default, when somebody new wants to further update the default. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/4] pack-bitmap: write lookup table extension by default 2025-04-17 22:04 ` Junio C Hamano @ 2025-04-18 9:33 ` Jeff King 2025-04-18 15:44 ` Junio C Hamano 0 siblings, 1 reply; 18+ messages in thread From: Jeff King @ 2025-04-18 9:33 UTC (permalink / raw) To: Junio C Hamano; +Cc: Taylor Blau, git, Elijah Newren On Thu, Apr 17, 2025 at 03:04:00PM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c > > index 2a938466f5..6ad6f814e3 100644 > > --- a/builtin/multi-pack-index.c > > +++ b/builtin/multi-pack-index.c > > @@ -142,6 +142,7 @@ static int cmd_multi_pack_index_write(int argc, const char **argv, > > int ret; > > > > opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE; > > + opts.flags |= MIDX_WRITE_BITMAP_LOOKUP_TABLE; > > > > git_config(git_multi_pack_index_write_config, NULL); > > > > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c > > index 3973267e9e..384fefbb1d 100644 > > --- a/builtin/pack-objects.c > > +++ b/builtin/pack-objects.c > > @@ -239,7 +239,7 @@ static enum { > > WRITE_BITMAP_QUIET, > > WRITE_BITMAP_TRUE, > > } write_bitmap_index; > > -static uint16_t write_bitmap_options = BITMAP_OPT_HASH_CACHE; > > +static uint16_t write_bitmap_options = BITMAP_OPT_HASH_CACHE | BITMAP_OPT_LOOKUP_TABLE; > > Are these two hunks required to be kept in sync? They're not technically required to be in sync. It is OK for the midx bitmaps to have different options than the ones we make for packs. And in theory they could intentionally diverge, though in practice I don't think we (yet) have any extensions or options that would be more appropriate for one or the other. So if we did want to join them, I think it would make sense to still be able to use different flags for each situation, but initialize them from a common definition. > If so, I am wondering what is the right approach to make sure they > are. The definition of MIDX_WRITE_BITMAP_* flag bits in midx.h and > BITMAP_OPT_* flag bits in write_bitmap_index enum are distinct two > sets, and we need a way to somehow convert between them back and > forth if we really wanted to ensure that these "default" values are > kept in sync automatically. Yeah, I think that would be much simpler if the bitmap options were held separately from the other midx flags, and then we could use the same values for each (and a common "defaults" definition). Which is conceptually simple, but means we have to plumb a separate variable through a bunch of intermediate functions. See the (untested) patch below. I dunno if it's worth it for something that doesn't really change all that often. OTOH, it does remove the extra layer of translation in write_midx_bitmap(). diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index 2a938466f5..fb073a946a 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -10,6 +10,7 @@ #include "object-store-ll.h" #include "replace-object.h" #include "repository.h" +#include "pack-bitmap.h" #define BUILTIN_MIDX_WRITE_USAGE \ N_("git multi-pack-index [<options>] write [--preferred-pack=<pack>]" \ @@ -54,6 +55,7 @@ static struct opts_multi_pack_index { char *refs_snapshot; unsigned long batch_size; unsigned flags; + uint16_t bitmap_options; int stdin_packs; } opts; @@ -89,16 +91,16 @@ static int git_multi_pack_index_write_config(const char *var, const char *value, { if (!strcmp(var, "pack.writebitmaphashcache")) { if (git_config_bool(var, value)) - opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE; + opts.bitmap_options |= BITMAP_OPT_HASH_CACHE; else - opts.flags &= ~MIDX_WRITE_BITMAP_HASH_CACHE; + opts.bitmap_options &= ~BITMAP_OPT_HASH_CACHE; } if (!strcmp(var, "pack.writebitmaplookuptable")) { if (git_config_bool(var, value)) - opts.flags |= MIDX_WRITE_BITMAP_LOOKUP_TABLE; + opts.bitmap_options |= BITMAP_OPT_LOOKUP_TABLE; else - opts.flags &= ~MIDX_WRITE_BITMAP_LOOKUP_TABLE; + opts.bitmap_options &= ~BITMAP_OPT_LOOKUP_TABLE; } /* @@ -141,7 +143,7 @@ static int cmd_multi_pack_index_write(int argc, const char **argv, }; int ret; - opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE; + opts.bitmap_options |= BITMAP_OPT_HASH_CACHE; git_config(git_multi_pack_index_write_config, NULL); @@ -167,7 +169,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv, ret = write_midx_file_only(repo, opts.object_dir, &packs, opts.preferred_pack, - opts.refs_snapshot, opts.flags); + opts.refs_snapshot, opts.flags, + opts.bitmap_options); string_list_clear(&packs, 0); free(opts.refs_snapshot); @@ -177,7 +180,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv, } ret = write_midx_file(repo, opts.object_dir, opts.preferred_pack, - opts.refs_snapshot, opts.flags); + opts.refs_snapshot, opts.flags, + opts.bitmap_options); free(opts.refs_snapshot); return ret; @@ -236,7 +240,8 @@ static int cmd_multi_pack_index_expire(int argc, const char **argv, FREE_AND_NULL(options); - return expire_midx_packs(the_repository, opts.object_dir, opts.flags); + return expire_midx_packs(the_repository, opts.object_dir, opts.flags, + opts.bitmap_options); } static int cmd_multi_pack_index_repack(int argc, const char **argv, @@ -269,7 +274,8 @@ static int cmd_multi_pack_index_repack(int argc, const char **argv, FREE_AND_NULL(options); return midx_repack(the_repository, opts.object_dir, - (size_t)opts.batch_size, opts.flags); + (size_t)opts.batch_size, opts.flags, + opts.bitmap_options); } int cmd_multi_pack_index(int argc, diff --git a/builtin/repack.c b/builtin/repack.c index f3330ade7b..69c9a8bc4c 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -1562,7 +1562,7 @@ int cmd_repack(int argc, if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL, 0)) flags |= MIDX_WRITE_INCREMENTAL; write_midx_file(the_repository, repo_get_object_directory(the_repository), - NULL, NULL, flags); + NULL, NULL, flags, 0); } cleanup: diff --git a/midx-write.c b/midx-write.c index 48a4dc5e94..5729b989cb 100644 --- a/midx-write.c +++ b/midx-write.c @@ -841,10 +841,10 @@ static int write_midx_bitmap(struct write_midx_context *ctx, struct packing_data *pdata, struct commit **commits, uint32_t commits_nr, - unsigned flags) + unsigned flags, + uint16_t options) { int ret, i; - uint16_t options = 0; struct bitmap_writer writer; struct pack_idx_entry **index; struct strbuf bitmap_name = STRBUF_INIT; @@ -859,12 +859,6 @@ static int write_midx_bitmap(struct write_midx_context *ctx, get_midx_filename_ext(ctx->repo->hash_algo, &bitmap_name, object_dir, midx_hash, MIDX_EXT_BITMAP); - if (flags & MIDX_WRITE_BITMAP_HASH_CACHE) - options |= BITMAP_OPT_HASH_CACHE; - - if (flags & MIDX_WRITE_BITMAP_LOOKUP_TABLE) - options |= BITMAP_OPT_LOOKUP_TABLE; - /* * Build the MIDX-order index based on pdata.objects (which is already * in MIDX order; c.f., 'midx_pack_order_cmp()' for the definition of @@ -1070,7 +1064,8 @@ static int write_midx_internal(struct repository *r, const char *object_dir, struct string_list *packs_to_drop, const char *preferred_pack_name, const char *refs_snapshot, - unsigned flags) + unsigned flags, + uint16_t bitmap_options) { struct strbuf midx_name = STRBUF_INIT; unsigned char midx_hash[GIT_MAX_RAWSZ]; @@ -1433,7 +1428,7 @@ static int write_midx_internal(struct repository *r, const char *object_dir, if (write_midx_bitmap(&ctx, object_dir, midx_hash, &pdata, commits, commits_nr, - flags) < 0) { + flags, bitmap_options) < 0) { error(_("could not write multi-pack bitmap")); result = 1; clear_packing_data(&pdata); @@ -1529,23 +1524,27 @@ static int write_midx_internal(struct repository *r, const char *object_dir, int write_midx_file(struct repository *r, const char *object_dir, const char *preferred_pack_name, - const char *refs_snapshot, unsigned flags) + const char *refs_snapshot, unsigned flags, + uint16_t bitmap_options) { return write_midx_internal(r, object_dir, NULL, NULL, preferred_pack_name, refs_snapshot, - flags); + flags, bitmap_options); } int write_midx_file_only(struct repository *r, const char *object_dir, struct string_list *packs_to_include, const char *preferred_pack_name, - const char *refs_snapshot, unsigned flags) + const char *refs_snapshot, unsigned flags, + uint16_t bitmap_options) { return write_midx_internal(r, object_dir, packs_to_include, NULL, - preferred_pack_name, refs_snapshot, flags); + preferred_pack_name, refs_snapshot, flags, + bitmap_options); } -int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags) +int expire_midx_packs(struct repository *r, const char *object_dir, + unsigned flags, uint16_t bitmap_options) { uint32_t i, *count, result = 0; struct string_list packs_to_drop = STRING_LIST_INIT_DUP; @@ -1603,7 +1602,8 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla if (packs_to_drop.nr) result = write_midx_internal(r, object_dir, NULL, - &packs_to_drop, NULL, NULL, flags); + &packs_to_drop, NULL, NULL, flags, + bitmap_options); string_list_clear(&packs_to_drop, 0); @@ -1718,7 +1718,8 @@ static void fill_included_packs_batch(struct repository *r, free(pack_info); } -int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned flags) +int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, + unsigned flags, uint16_t bitmap_options) { int result = 0; uint32_t i, packs_to_repack = 0; @@ -1801,7 +1802,7 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, } result = write_midx_internal(r, object_dir, NULL, NULL, NULL, NULL, - flags); + flags, bitmap_options); cleanup: free(include_pack); diff --git a/midx.h b/midx.h index 9d1374cbd5..5a79302a7a 100644 --- a/midx.h +++ b/midx.h @@ -81,8 +81,6 @@ struct multi_pack_index { #define MIDX_PROGRESS (1 << 0) #define MIDX_WRITE_REV_INDEX (1 << 1) #define MIDX_WRITE_BITMAP (1 << 2) -#define MIDX_WRITE_BITMAP_HASH_CACHE (1 << 3) -#define MIDX_WRITE_BITMAP_LOOKUP_TABLE (1 << 4) #define MIDX_WRITE_INCREMENTAL (1 << 5) #define MIDX_EXT_REV "rev" @@ -131,15 +129,16 @@ int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, i */ int write_midx_file(struct repository *r, const char *object_dir, const char *preferred_pack_name, const char *refs_snapshot, - unsigned flags); + unsigned flags, uint16_t bitmap_options); int write_midx_file_only(struct repository *r, const char *object_dir, struct string_list *packs_to_include, const char *preferred_pack_name, - const char *refs_snapshot, unsigned flags); + const char *refs_snapshot, unsigned flags, + uint16_t bitmap_options); void clear_midx_file(struct repository *r); int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags); -int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags); -int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned flags); +int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags, uint16_t bitmap_options); +int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned flags, uint16_t bitmap_options); void close_midx(struct multi_pack_index *m); ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 1/4] pack-bitmap: write lookup table extension by default 2025-04-18 9:33 ` Jeff King @ 2025-04-18 15:44 ` Junio C Hamano 2025-04-18 21:52 ` Taylor Blau 0 siblings, 1 reply; 18+ messages in thread From: Junio C Hamano @ 2025-04-18 15:44 UTC (permalink / raw) To: Jeff King; +Cc: Taylor Blau, git, Elijah Newren Jeff King <peff@peff.net> writes: > They're not technically required to be in sync. It is OK for the midx > bitmaps to have different options than the ones we make for packs. And > in theory they could intentionally diverge, though in practice I don't > think we (yet) have any extensions or options that would be more > appropriate for one or the other. > > So if we did want to join them, I think it would make sense to still be > able to use different flags for each situation, but initialize them from > a common definition. Thanks for great explanation---I guess it is not worth pursuing, then. It is not like it would make the system misbehave when two are set differently. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/4] pack-bitmap: write lookup table extension by default 2025-04-18 15:44 ` Junio C Hamano @ 2025-04-18 21:52 ` Taylor Blau 0 siblings, 0 replies; 18+ messages in thread From: Taylor Blau @ 2025-04-18 21:52 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, git, Elijah Newren On Fri, Apr 18, 2025 at 08:44:29AM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > They're not technically required to be in sync. It is OK for the midx > > bitmaps to have different options than the ones we make for packs. And > > in theory they could intentionally diverge, though in practice I don't > > think we (yet) have any extensions or options that would be more > > appropriate for one or the other. > > > > So if we did want to join them, I think it would make sense to still be > > able to use different flags for each situation, but initialize them from > > a common definition. > > Thanks for great explanation---I guess it is not worth pursuing, > then. It is not like it would make the system misbehave when two > are set differently. Hah, Peff beat me to it. I saw your reply last night and was going to write you a very similar response. I think the summary from my perspective would be that: the two could fall out-of-sync intentionally if we want the two commands to ever have different defaults. Tangentially we could use some common "bitmap_flags" field whose bits are defined in pack-bitmap.h and used in both places. The latter is a bit awkward currently because the current "flags" that we pass into the MIDX machinery from the builtin all have MIDX-specific meanings. So we would have to either make sure that MIDX uses separate bit positions (which is awful and far too fragile for my comfort/taste) or store them as a separate set of flags (I think what Peff is getting at above). Thanks, Taylor ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 2/4] p5312: removed duplicate performance test script 2025-04-17 21:12 [PATCH 0/4] pack-bitmap: enable lookup tables by default, misc. cleanups Taylor Blau 2025-04-17 21:12 ` [PATCH 1/4] pack-bitmap: write lookup table extension by default Taylor Blau @ 2025-04-17 21:12 ` Taylor Blau 2025-04-17 22:08 ` Junio C Hamano 2025-04-17 21:12 ` [PATCH 3/4] t/perf: avoid testing bitmaps without lookup table Taylor Blau ` (2 subsequent siblings) 4 siblings, 1 reply; 18+ messages in thread From: Taylor Blau @ 2025-04-17 21:12 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Jeff King, Junio C Hamano When the reachability bitmap format learned to read and write a lookup table containing the set of commits which received reachability bitmaps, commit 761416ef91 (bitmap-lookup-table: add performance tests for lookup table, 2022-08-14) added that mirrored p5310 but with reverse indexes enabled. Later on in a8dd7e05b1 (config: enable `pack.writeReverseIndex` by default, 2023-04-12), we enabled reverse indexes by default, which made these two tests indistinguishable from one another. Commit a8dd7e05b1 should have removed p5312 as a duplicate, but didn't do so. Correct that by removing p5312 as a functional duplicate of p5310. Signed-off-by: Taylor Blau <me@ttaylorr.com> --- t/perf/p5312-pack-bitmaps-revs.sh | 34 ------------------------------- 1 file changed, 34 deletions(-) delete mode 100755 t/perf/p5312-pack-bitmaps-revs.sh diff --git a/t/perf/p5312-pack-bitmaps-revs.sh b/t/perf/p5312-pack-bitmaps-revs.sh deleted file mode 100755 index ceec60656b..0000000000 --- a/t/perf/p5312-pack-bitmaps-revs.sh +++ /dev/null @@ -1,34 +0,0 @@ -#!/bin/sh - -test_description='Tests pack performance using bitmaps (rev index enabled)' -. ./perf-lib.sh -. "${TEST_DIRECTORY}/perf/lib-bitmap.sh" - -test_lookup_pack_bitmap () { - test_expect_success 'start the test from scratch' ' - rm -rf * .git - ' - - test_perf_large_repo - - test_expect_success 'setup bitmap config' ' - git config pack.writebitmaps true - ' - - # we need to create the tag up front such that it is covered by the repack and - # thus by generated bitmaps. - test_expect_success 'create tags' ' - git tag --message="tag pointing to HEAD" perf-tag HEAD - ' - - test_perf "enable lookup table: $1" ' - git config pack.writeBitmapLookupTable '"$1"' - ' - - test_pack_bitmap -} - -test_lookup_pack_bitmap false -test_lookup_pack_bitmap true - -test_done -- 2.49.0.226.g0e6cae136d ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 2/4] p5312: removed duplicate performance test script 2025-04-17 21:12 ` [PATCH 2/4] p5312: removed duplicate performance test script Taylor Blau @ 2025-04-17 22:08 ` Junio C Hamano 2025-04-18 21:57 ` Taylor Blau 0 siblings, 1 reply; 18+ messages in thread From: Junio C Hamano @ 2025-04-17 22:08 UTC (permalink / raw) To: Taylor Blau; +Cc: git, Elijah Newren, Jeff King Taylor Blau <me@ttaylorr.com> writes: > Subject: Re: [PATCH 2/4] p5312: removed duplicate performance test script "removed" -> "remove"??? > When the reachability bitmap format learned to read and write a lookup > table containing the set of commits which received reachability bitmaps, > commit 761416ef91 (bitmap-lookup-table: add performance tests for lookup > table, 2022-08-14) added that mirrored p5310 but with reverse indexes > enabled. "added that" -> "added a <something> that"??? > Later on in a8dd7e05b1 (config: enable `pack.writeReverseIndex` by > default, 2023-04-12), we enabled reverse indexes by default, which made > these two tests indistinguishable from one another. Commit a8dd7e05b1 > should have removed p5312 as a duplicate, but didn't do so. Or to retain the same coverage, it should have explicitly disabled reverse index in one of the tests, while allowing the other to use the reverse index enabled by default, perhaps? > Correct that by removing p5312 as a functional duplicate of p5310. > > Signed-off-by: Taylor Blau <me@ttaylorr.com> > --- > t/perf/p5312-pack-bitmaps-revs.sh | 34 ------------------------------- > 1 file changed, 34 deletions(-) > delete mode 100755 t/perf/p5312-pack-bitmaps-revs.sh > > diff --git a/t/perf/p5312-pack-bitmaps-revs.sh b/t/perf/p5312-pack-bitmaps-revs.sh > deleted file mode 100755 > index ceec60656b..0000000000 > --- a/t/perf/p5312-pack-bitmaps-revs.sh > +++ /dev/null > @@ -1,34 +0,0 @@ > -#!/bin/sh > - > -test_description='Tests pack performance using bitmaps (rev index enabled)' > -. ./perf-lib.sh > -. "${TEST_DIRECTORY}/perf/lib-bitmap.sh" > - > -test_lookup_pack_bitmap () { > - test_expect_success 'start the test from scratch' ' > - rm -rf * .git > - ' > - > - test_perf_large_repo > - > - test_expect_success 'setup bitmap config' ' > - git config pack.writebitmaps true > - ' > - > - # we need to create the tag up front such that it is covered by the repack and > - # thus by generated bitmaps. > - test_expect_success 'create tags' ' > - git tag --message="tag pointing to HEAD" perf-tag HEAD > - ' > - > - test_perf "enable lookup table: $1" ' > - git config pack.writeBitmapLookupTable '"$1"' > - ' > - > - test_pack_bitmap > -} > - > -test_lookup_pack_bitmap false > -test_lookup_pack_bitmap true > - > -test_done ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/4] p5312: removed duplicate performance test script 2025-04-17 22:08 ` Junio C Hamano @ 2025-04-18 21:57 ` Taylor Blau 0 siblings, 0 replies; 18+ messages in thread From: Taylor Blau @ 2025-04-18 21:57 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Elijah Newren, Jeff King On Thu, Apr 17, 2025 at 03:08:59PM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > Subject: Re: [PATCH 2/4] p5312: removed duplicate performance test script > > "removed" -> "remove"??? > > > When the reachability bitmap format learned to read and write a lookup > > table containing the set of commits which received reachability bitmaps, > > commit 761416ef91 (bitmap-lookup-table: add performance tests for lookup > > table, 2022-08-14) added that mirrored p5310 but with reverse indexes > > enabled. > > "added that" -> "added a <something> that"??? I am embarrassed. These are both awful. Here's the relevant portion of the range-diff: 2: 51c4604e16 ! 2: a80a7b5e60 p5312: removed duplicate performance test script @@ Metadata Author: Taylor Blau <me@ttaylorr.com> ## Commit message ## - p5312: removed duplicate performance test script + p5312: remove duplicate performance test script When the reachability bitmap format learned to read and write a lookup table containing the set of commits which received reachability bitmaps, commit 761416ef91 (bitmap-lookup-table: add performance tests for lookup - table, 2022-08-14) added that mirrored p5310 but with reverse indexes - enabled. + table, 2022-08-14) added a new performance test script mirroring p5310 + but with reverse indexes enabled. Later on in a8dd7e05b1 (config: enable `pack.writeReverseIndex` by default, 2023-04-12), we enabled reverse indexes by default, which made > > Later on in a8dd7e05b1 (config: enable `pack.writeReverseIndex` by > > default, 2023-04-12), we enabled reverse indexes by default, which made > > these two tests indistinguishable from one another. Commit a8dd7e05b1 > > should have removed p5312 as a duplicate, but didn't do so. > > Or to retain the same coverage, it should have explicitly disabled > reverse index in one of the tests, while allowing the other to use > the reverse index enabled by default, perhaps? I don't think we necessarily would benefit from having two variants of this performance test. It is a little annoying to maintain, but that isn't the main reason that I proposed removing it here. I think that the pair of performance tests were useful in proving out the lookup table extension as useful to bitmaps' performance characteristics by comparison to the non-lookup table version. In that sense, I think the pair of performance tests were useful as a contrast to one another. Since we have evidence of their usefulness, the contrast is less important IMHO. I think we still want to keep the "lookup table enabled" version to prevent regressions, though. Thanks, Taylor ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 3/4] t/perf: avoid testing bitmaps without lookup table 2025-04-17 21:12 [PATCH 0/4] pack-bitmap: enable lookup tables by default, misc. cleanups Taylor Blau 2025-04-17 21:12 ` [PATCH 1/4] pack-bitmap: write lookup table extension by default Taylor Blau 2025-04-17 21:12 ` [PATCH 2/4] p5312: removed duplicate performance test script Taylor Blau @ 2025-04-17 21:12 ` Taylor Blau 2025-04-17 22:21 ` Junio C Hamano 2025-04-17 21:12 ` [PATCH 4/4] t/perf/lib-bitmap.sh: avoid test_perf during setup Taylor Blau 2025-05-02 21:21 ` [PATCH 0/4] pack-bitmap: enable lookup tables by default, misc. cleanups Junio C Hamano 4 siblings, 1 reply; 18+ messages in thread From: Taylor Blau @ 2025-04-17 21:12 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Jeff King, Junio C Hamano In a previous commit, the setting which controls whether or not the pack- and MIDX-bitmap machinery writes a lookup table, 'pack.writeBitmapLookupTable' was enabled by default. As a result, we can clean up many of our bitmap-related performance tests. Many of the relevant performance tests look something like: test_it () { test_expect_success 'setup pack.writeBitmapLookupTable' ' git config pack.writeBitmapLookupTable '"$1"' ' # ... } test_it true test_it false , where the two invocations of 'test_it' run the tests with and without bitmap lookup tables enabled. But now that lookup tables are enabled by default and have proven to be a performance win, let's avoid benchmarking what is now an uncommon and non-default scenario. Signed-off-by: Taylor Blau <me@ttaylorr.com> --- t/perf/p5310-pack-bitmaps.sh | 47 +++++------- t/perf/p5311-pack-bitmaps-fetch.sh | 76 +++++++++---------- t/perf/p5326-multi-pack-bitmaps.sh | 107 ++++++++++++--------------- t/perf/p5333-pseudo-merge-bitmaps.sh | 1 - 4 files changed, 102 insertions(+), 129 deletions(-) diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh index b1399f1007..52f9fca14b 100755 --- a/t/perf/p5310-pack-bitmaps.sh +++ b/t/perf/p5310-pack-bitmaps.sh @@ -4,37 +4,28 @@ test_description='Tests pack performance using bitmaps' . ./perf-lib.sh . "${TEST_DIRECTORY}/perf/lib-bitmap.sh" -test_lookup_pack_bitmap () { - test_expect_success 'start the test from scratch' ' - rm -rf * .git - ' +test_expect_success 'start the test from scratch' ' + rm -rf * .git +' - test_perf_large_repo +test_perf_large_repo - # note that we do everything through config, - # since we want to be able to compare bitmap-aware - # git versus non-bitmap git - # - # We intentionally use the deprecated pack.writebitmaps - # config so that we can test against older versions of git. - test_expect_success 'setup bitmap config' ' - git config pack.writebitmaps true - ' +# note that we do everything through config, +# since we want to be able to compare bitmap-aware +# git versus non-bitmap git +# +# We intentionally use the deprecated pack.writebitmaps +# config so that we can test against older versions of git. +test_expect_success 'setup bitmap config' ' + git config pack.writebitmaps true +' - # we need to create the tag up front such that it is covered by the repack and - # thus by generated bitmaps. - test_expect_success 'create tags' ' - git tag --message="tag pointing to HEAD" perf-tag HEAD - ' +# we need to create the tag up front such that it is covered by the repack and +# thus by generated bitmaps. +test_expect_success 'create tags' ' + git tag --message="tag pointing to HEAD" perf-tag HEAD +' - test_perf "enable lookup table: $1" ' - git config pack.writeBitmapLookupTable '"$1"' - ' - - test_pack_bitmap -} - -test_lookup_pack_bitmap false -test_lookup_pack_bitmap true +test_pack_bitmap test_done diff --git a/t/perf/p5311-pack-bitmaps-fetch.sh b/t/perf/p5311-pack-bitmaps-fetch.sh index 047efb995d..75b05a600e 100755 --- a/t/perf/p5311-pack-bitmaps-fetch.sh +++ b/t/perf/p5311-pack-bitmaps-fetch.sh @@ -3,52 +3,46 @@ test_description='performance of fetches from bitmapped packs' . ./perf-lib.sh -test_fetch_bitmaps () { - test_expect_success 'setup test directory' ' - rm -fr * .git - ' - - test_perf_default_repo +test_expect_success 'setup test directory' ' + rm -fr * .git +' - test_expect_success 'create bitmapped server repo' ' - git config pack.writebitmaps true && - git config pack.writeBitmapLookupTable '"$1"' && - git repack -ad - ' +test_perf_default_repo - # simulate a fetch from a repository that last fetched N days ago, for - # various values of N. We do so by following the first-parent chain, - # and assume the first entry in the chain that is N days older than the current - # HEAD is where the HEAD would have been then. - for days in 1 2 4 8 16 32 64 128; do - title=$(printf '%10s' "($days days)") - test_expect_success "setup revs from $days days ago" ' - now=$(git log -1 --format=%ct HEAD) && - then=$(($now - ($days * 86400))) && - tip=$(git rev-list -1 --first-parent --until=$then HEAD) && - { - echo HEAD && - echo ^$tip - } >revs - ' +test_expect_success 'create bitmapped server repo' ' + git config pack.writebitmaps true && + git repack -ad +' - test_perf "server $title (lookup=$1)" ' - git pack-objects --stdout --revs \ - --thin --delta-base-offset \ - <revs >tmp.pack - ' +# simulate a fetch from a repository that last fetched N days ago, for +# various values of N. We do so by following the first-parent chain, +# and assume the first entry in the chain that is N days older than the current +# HEAD is where the HEAD would have been then. +for days in 1 2 4 8 16 32 64 128; do + title=$(printf '%10s' "($days days)") + test_expect_success "setup revs from $days days ago" ' + now=$(git log -1 --format=%ct HEAD) && + then=$(($now - ($days * 86400))) && + tip=$(git rev-list -1 --first-parent --until=$then HEAD) && + { + echo HEAD && + echo ^$tip + } >revs + ' - test_size "size $title" ' - test_file_size tmp.pack - ' + test_perf "server $title" ' + git pack-objects --stdout --revs \ + --thin --delta-base-offset \ + <revs >tmp.pack + ' - test_perf "client $title (lookup=$1)" ' - git index-pack --stdin --fix-thin <tmp.pack - ' - done -} + test_size "size $title" ' + test_file_size tmp.pack + ' -test_fetch_bitmaps true -test_fetch_bitmaps false + test_perf "client $title" ' + git index-pack --stdin --fix-thin <tmp.pack + ' +done test_done diff --git a/t/perf/p5326-multi-pack-bitmaps.sh b/t/perf/p5326-multi-pack-bitmaps.sh index d082e6cacb..9f32582dec 100755 --- a/t/perf/p5326-multi-pack-bitmaps.sh +++ b/t/perf/p5326-multi-pack-bitmaps.sh @@ -4,64 +4,53 @@ test_description='Tests performance using midx bitmaps' . ./perf-lib.sh . "${TEST_DIRECTORY}/perf/lib-bitmap.sh" -test_bitmap () { - local enabled="$1" - - test_expect_success "remove existing repo (lookup=$enabled)" ' - rm -fr * .git - ' - - test_perf_large_repo - - # we need to create the tag up front such that it is covered by the repack and - # thus by generated bitmaps. - test_expect_success 'create tags' ' - git tag --message="tag pointing to HEAD" perf-tag HEAD - ' - - test_expect_success "use lookup table: $enabled" ' - git config pack.writeBitmapLookupTable '"$enabled"' - ' - - test_expect_success "start with bitmapped pack (lookup=$enabled)" ' - git repack -adb - ' - - test_perf "setup multi-pack index (lookup=$enabled)" ' - git multi-pack-index write --bitmap - ' - - test_expect_success "drop pack bitmap (lookup=$enabled)" ' - rm -f .git/objects/pack/pack-*.bitmap - ' - - test_full_bitmap - - test_expect_success "create partial bitmap state (lookup=$enabled)" ' - # pick a commit to represent the repo tip in the past - cutoff=$(git rev-list HEAD~100 -1) && - orig_tip=$(git rev-parse HEAD) && - - # now pretend we have just one tip - rm -rf .git/logs .git/refs/* .git/packed-refs && - git update-ref HEAD $cutoff && - - # and then repack, which will leave us with a nice - # big bitmap pack of the "old" history, and all of - # the new history will be loose, as if it had been pushed - # up incrementally and exploded via unpack-objects - git repack -Ad && - git multi-pack-index write --bitmap && - - # and now restore our original tip, as if the pushes - # had happened - git update-ref HEAD $orig_tip - ' - - test_partial_bitmap -} - -test_bitmap false -test_bitmap true +test_expect_success "remove existing repo" ' + rm -fr * .git +' + +test_perf_large_repo + +# we need to create the tag up front such that it is covered by the repack and +# thus by generated bitmaps. +test_expect_success 'create tags' ' + git tag --message="tag pointing to HEAD" perf-tag HEAD +' + +test_expect_success "start with bitmapped pack" ' + git repack -adb +' + +test_perf "setup multi-pack index" ' + git multi-pack-index write --bitmap +' + +test_expect_success "drop pack bitmap" ' + rm -f .git/objects/pack/pack-*.bitmap +' + +test_full_bitmap + +test_expect_success "create partial bitmap state" ' + # pick a commit to represent the repo tip in the past + cutoff=$(git rev-list HEAD~100 -1) && + orig_tip=$(git rev-parse HEAD) && + + # now pretend we have just one tip + rm -rf .git/logs .git/refs/* .git/packed-refs && + git update-ref HEAD $cutoff && + + # and then repack, which will leave us with a nice + # big bitmap pack of the "old" history, and all of + # the new history will be loose, as if it had been pushed + # up incrementally and exploded via unpack-objects + git repack -Ad && + git multi-pack-index write --bitmap && + + # and now restore our original tip, as if the pushes + # had happened + git update-ref HEAD $orig_tip +' + +test_partial_bitmap test_done diff --git a/t/perf/p5333-pseudo-merge-bitmaps.sh b/t/perf/p5333-pseudo-merge-bitmaps.sh index 2e8b1d2635..91b1c06745 100755 --- a/t/perf/p5333-pseudo-merge-bitmaps.sh +++ b/t/perf/p5333-pseudo-merge-bitmaps.sh @@ -11,7 +11,6 @@ test_expect_success 'setup' ' -c bitmapPseudoMerge.all.threshold=now \ -c bitmapPseudoMerge.all.stableThreshold=never \ -c bitmapPseudoMerge.all.maxMerges=64 \ - -c pack.writeBitmapLookupTable=true \ repack -adb ' -- 2.49.0.226.g0e6cae136d ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 3/4] t/perf: avoid testing bitmaps without lookup table 2025-04-17 21:12 ` [PATCH 3/4] t/perf: avoid testing bitmaps without lookup table Taylor Blau @ 2025-04-17 22:21 ` Junio C Hamano 2025-04-18 4:24 ` Junio C Hamano 0 siblings, 1 reply; 18+ messages in thread From: Junio C Hamano @ 2025-04-17 22:21 UTC (permalink / raw) To: Taylor Blau; +Cc: git, Elijah Newren, Jeff King Taylor Blau <me@ttaylorr.com> writes: > In a previous commit, the setting which controls whether or not the > pack- and MIDX-bitmap machinery writes a lookup table, > 'pack.writeBitmapLookupTable' was enabled by default. > > As a result, we can clean up many of our bitmap-related performance > tests. Many of the relevant performance tests look something like: > > test_it () { > test_expect_success 'setup pack.writeBitmapLookupTable' ' > git config pack.writeBitmapLookupTable '"$1"' > ' > > # ... > } > > test_it true > test_it false > > , where the two invocations of 'test_it' run the tests with and without > bitmap lookup tables enabled. > > But now that lookup tables are enabled by default and have proven to be > a performance win, let's avoid benchmarking what is now an uncommon and > non-default scenario. > > Signed-off-by: Taylor Blau <me@ttaylorr.com> > --- Hmph, how costly are these tests to run and maintain? I somehow have a feeling that removal of these "performance" tests is less worrysome than removing correctness tests, but as long as we claim to support both configurations (i.e. with and without lookup tables), it feels a bit premature to remove tests for one of them. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 3/4] t/perf: avoid testing bitmaps without lookup table 2025-04-17 22:21 ` Junio C Hamano @ 2025-04-18 4:24 ` Junio C Hamano 2025-04-18 10:02 ` Jeff King 0 siblings, 1 reply; 18+ messages in thread From: Junio C Hamano @ 2025-04-18 4:24 UTC (permalink / raw) To: Taylor Blau; +Cc: git, Elijah Newren, Jeff King Junio C Hamano <gitster@pobox.com> writes: > I somehow have a feeling that removal of these "performance" tests > is less worrysome than removing correctness tests, but as long as we > claim to support both configurations (i.e. with and without lookup > tables), it feels a bit premature to remove tests for one of them. In case the implication was missed, I was hinting that in the longer term, once one variant proves to be better than the other variant(s) in any and all aspects, it would be a great move to remove the other one(s). It is exactly what is happening on the recursive-ort front. Once we become so confident about correctness and performance with the configuration with lookup tables that we are willing to lose an escape hatch to operate without them, we can obviously remove these tests for configuration without lookup tables. If we are not there yet, and still rely on the "escape hatch" value of the configuration that does not use the lookup tables, we want to make sure that the escape hatch still functions, right ;-)? Thanks. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 3/4] t/perf: avoid testing bitmaps without lookup table 2025-04-18 4:24 ` Junio C Hamano @ 2025-04-18 10:02 ` Jeff King 0 siblings, 0 replies; 18+ messages in thread From: Jeff King @ 2025-04-18 10:02 UTC (permalink / raw) To: Junio C Hamano; +Cc: Taylor Blau, git, Elijah Newren On Thu, Apr 17, 2025 at 09:24:46PM -0700, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > > > I somehow have a feeling that removal of these "performance" tests > > is less worrysome than removing correctness tests, but as long as we > > claim to support both configurations (i.e. with and without lookup > > tables), it feels a bit premature to remove tests for one of them. > > In case the implication was missed, I was hinting that in the longer > term, once one variant proves to be better than the other variant(s) > in any and all aspects, it would be a great move to remove the other > one(s). It is exactly what is happening on the recursive-ort front. > > Once we become so confident about correctness and performance with > the configuration with lookup tables that we are willing to lose an > escape hatch to operate without them, we can obviously remove these > tests for configuration without lookup tables. If we are not there > yet, and still rely on the "escape hatch" value of the configuration > that does not use the lookup tables, we want to make sure that the > escape hatch still functions, right ;-)? I think the perf tests differ from the correctness tests in that a single run is not all that interesting. You can see how long something takes, but that's meaningless without a baseline. The interesting results come from comparing two versions. So in this case: - running the simplified test script comparing an old version (where lookup tables were not the default) with one where they are (i.e., one with the first patch from this series). That will show off the perf improvement from the lookup tables (and in a better way than the original, because we'll actually compute the time difference between the two versions, rather than showing them as separate lines which the perf suite does not realize are related). - going forward, comparing two "new" versions will show us if the operations regress in performance, using config both from Git's defaults but also the repo. So most of the time, you'd be testing the default case, where we do generate the lookup tables (because they're now the default). But if you have a particular repo or config setup you care about, you can provide a repo with pack.writeBitmapLookupTable set as you like. That does create a blind spot if no developers running the perf suite ever do the "you can provide..." step. But I think that is the tip of the iceberg in terms of repo and config blind spots in the perf suite. There are so many possible combinations, and it's expensive to test them all. I don't think we have any particular reason to think that the non-lookup-table code path would significantly regress in performance. You asked earlier how much these tests cost to run. It's basically doubling the cost of each script, since we're running everything twice. So p5310 using linux.git, for instance, just took ~10 minutes after this patch. And that's with PERF_REPEAT_COUNT set to 1! The default would have been 30 minutes (and thus prior to this patch, 60 minutes total). That's a lot of minutes. -Peff ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 4/4] t/perf/lib-bitmap.sh: avoid test_perf during setup 2025-04-17 21:12 [PATCH 0/4] pack-bitmap: enable lookup tables by default, misc. cleanups Taylor Blau ` (2 preceding siblings ...) 2025-04-17 21:12 ` [PATCH 3/4] t/perf: avoid testing bitmaps without lookup table Taylor Blau @ 2025-04-17 21:12 ` Taylor Blau 2025-04-17 22:22 ` Junio C Hamano 2025-04-18 10:17 ` Jeff King 2025-05-02 21:21 ` [PATCH 0/4] pack-bitmap: enable lookup tables by default, misc. cleanups Junio C Hamano 4 siblings, 2 replies; 18+ messages in thread From: Taylor Blau @ 2025-04-17 21:12 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Jeff King, Junio C Hamano In the test_pack_bitmap() helper function, we first repack the repository under test for consistency and to eliminate any effects from different distributions of objects among packs. This step is performed with test_perf, so it is repeated $GIT_PERF_REPEAT_COUNT number of times. But we do not care about timing this portion of the setup phase, and repeating the process does not change the outcome. Use test_expect_success to avoid spending time repeating an idempotent portion of the setup for performance tests that use test_pack_bitmap(). Signed-off-by: Taylor Blau <me@ttaylorr.com> --- t/perf/lib-bitmap.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/perf/lib-bitmap.sh b/t/perf/lib-bitmap.sh index 55a8feb1dc..fdf5f35f1b 100644 --- a/t/perf/lib-bitmap.sh +++ b/t/perf/lib-bitmap.sh @@ -69,7 +69,7 @@ test_partial_bitmap () { } test_pack_bitmap () { - test_perf "repack to disk" ' + test_expect_success "repack to disk" ' git repack -ad ' -- 2.49.0.226.g0e6cae136d ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 4/4] t/perf/lib-bitmap.sh: avoid test_perf during setup 2025-04-17 21:12 ` [PATCH 4/4] t/perf/lib-bitmap.sh: avoid test_perf during setup Taylor Blau @ 2025-04-17 22:22 ` Junio C Hamano 2025-04-18 10:17 ` Jeff King 1 sibling, 0 replies; 18+ messages in thread From: Junio C Hamano @ 2025-04-17 22:22 UTC (permalink / raw) To: Taylor Blau; +Cc: git, Elijah Newren, Jeff King Taylor Blau <me@ttaylorr.com> writes: > In the test_pack_bitmap() helper function, we first repack the > repository under test for consistency and to eliminate any effects from > different distributions of objects among packs. > > This step is performed with test_perf, so it is repeated > $GIT_PERF_REPEAT_COUNT number of times. But we do not care about timing > this portion of the setup phase, and repeating the process does not > change the outcome. > > Use test_expect_success to avoid spending time repeating an idempotent > portion of the setup for performance tests that use test_pack_bitmap(). > > Signed-off-by: Taylor Blau <me@ttaylorr.com> > --- > t/perf/lib-bitmap.sh | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) OK. > > diff --git a/t/perf/lib-bitmap.sh b/t/perf/lib-bitmap.sh > index 55a8feb1dc..fdf5f35f1b 100644 > --- a/t/perf/lib-bitmap.sh > +++ b/t/perf/lib-bitmap.sh > @@ -69,7 +69,7 @@ test_partial_bitmap () { > } > > test_pack_bitmap () { > - test_perf "repack to disk" ' > + test_expect_success "repack to disk" ' > git repack -ad > ' ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 4/4] t/perf/lib-bitmap.sh: avoid test_perf during setup 2025-04-17 21:12 ` [PATCH 4/4] t/perf/lib-bitmap.sh: avoid test_perf during setup Taylor Blau 2025-04-17 22:22 ` Junio C Hamano @ 2025-04-18 10:17 ` Jeff King 1 sibling, 0 replies; 18+ messages in thread From: Jeff King @ 2025-04-18 10:17 UTC (permalink / raw) To: Taylor Blau; +Cc: git, Elijah Newren, Junio C Hamano On Thu, Apr 17, 2025 at 05:12:23PM -0400, Taylor Blau wrote: > In the test_pack_bitmap() helper function, we first repack the > repository under test for consistency and to eliminate any effects from > different distributions of objects among packs. > > This step is performed with test_perf, so it is repeated > $GIT_PERF_REPEAT_COUNT number of times. But we do not care about timing > this portion of the setup phase, and repeating the process does not > change the outcome. Isn't this also where we actually generate the bitmaps? I.e., it is where we would see a performance regression in the bitmap writing process (whereas the rest of the script is about the reading side). That said, I don't think it's even doing that very well. It is mutating the on-disk state, so the first run will potentially be much slower than subsequent runs (since everything is now in one big pack with bitmaps, and we try to reuse deltas and bitmap data as much as possible). And since we take best-of-N, we're basically just measuring those subsequent noop repacks (unless you set the repeat count to 1!). I think we've run into this before, e.g. in 775c71e16d (p5302: create the repo in each index-pack test, 2019-04-22). And there the solution was to reset the repo state before each timing, assuming it is quick enough not to affect the test too much. Our perf suite doesn't provide much support there (we'd want something like hyperfine's --prepare option). So I dunno. It is possible for timing this operation to provide some value, but I don't think the current implementation is doing that. And it's quite expensive to run. > diff --git a/t/perf/lib-bitmap.sh b/t/perf/lib-bitmap.sh > index 55a8feb1dc..fdf5f35f1b 100644 > --- a/t/perf/lib-bitmap.sh > +++ b/t/perf/lib-bitmap.sh > @@ -69,7 +69,7 @@ test_partial_bitmap () { > } > > test_pack_bitmap () { > - test_perf "repack to disk" ' > + test_expect_success "repack to disk" ' > git repack -ad > ' The same issue exists in t5326, which calls "multi-pack-index write" with the "--bitmap" flag, I think. So if we are going to do this, we'd probably want the same there. -Peff ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 0/4] pack-bitmap: enable lookup tables by default, misc. cleanups 2025-04-17 21:12 [PATCH 0/4] pack-bitmap: enable lookup tables by default, misc. cleanups Taylor Blau ` (3 preceding siblings ...) 2025-04-17 21:12 ` [PATCH 4/4] t/perf/lib-bitmap.sh: avoid test_perf during setup Taylor Blau @ 2025-05-02 21:21 ` Junio C Hamano 2025-05-05 7:11 ` Patrick Steinhardt 4 siblings, 1 reply; 18+ messages in thread From: Junio C Hamano @ 2025-05-02 21:21 UTC (permalink / raw) To: Taylor Blau; +Cc: git, Elijah Newren, Jeff King Taylor Blau <me@ttaylorr.com> writes: > This is a short series I extracted from a larger topic on reusing > "external"[^1] deltas during verbatim pack reuse. > > As part of performance-testing that series, I realized that bitmap > lookup tables are not written by default. Since it has been a > significant period of time since their introduction, the first patch of > this series makes writing the lookup table extension the default > behavior. This is: > > * pack-bitmap: write lookup table extension by default > > The next three patches clean up some t/perf scripts that were redundant > now that lookup tables are the default behavior. Those are: > > * p5312: removed duplicate performance test script > * t/perf: avoid testing bitmaps without lookup table > * t/perf/lib-bitmap.sh: avoid test_perf during setup > > Thanks in advance for your review :-). > > [^1]: The term I'm using to describe delta/base pairs which either (a) > are represented from different packs in a MIDX bitmap, or (b) the client > is known to already have the base. > > Taylor Blau (4): > pack-bitmap: write lookup table extension by default > p5312: removed duplicate performance test script > t/perf: avoid testing bitmaps without lookup table > t/perf/lib-bitmap.sh: avoid test_perf during setup Peff and I were the only two people who read these patches? Is this topic still viable, or has it been backburnered? Thanks. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 0/4] pack-bitmap: enable lookup tables by default, misc. cleanups 2025-05-02 21:21 ` [PATCH 0/4] pack-bitmap: enable lookup tables by default, misc. cleanups Junio C Hamano @ 2025-05-05 7:11 ` Patrick Steinhardt 0 siblings, 0 replies; 18+ messages in thread From: Patrick Steinhardt @ 2025-05-05 7:11 UTC (permalink / raw) To: Junio C Hamano; +Cc: Taylor Blau, git, Elijah Newren, Jeff King On Fri, May 02, 2025 at 02:21:47PM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > This is a short series I extracted from a larger topic on reusing > > "external"[^1] deltas during verbatim pack reuse. > > > > As part of performance-testing that series, I realized that bitmap > > lookup tables are not written by default. Since it has been a > > significant period of time since their introduction, the first patch of > > this series makes writing the lookup table extension the default > > behavior. This is: > > > > * pack-bitmap: write lookup table extension by default > > > > The next three patches clean up some t/perf scripts that were redundant > > now that lookup tables are the default behavior. Those are: > > > > * p5312: removed duplicate performance test script > > * t/perf: avoid testing bitmaps without lookup table > > * t/perf/lib-bitmap.sh: avoid test_perf during setup > > > > Thanks in advance for your review :-). > > > > [^1]: The term I'm using to describe delta/base pairs which either (a) > > are represented from different packs in a MIDX bitmap, or (b) the client > > is known to already have the base. > > > > Taylor Blau (4): > > pack-bitmap: write lookup table extension by default > > p5312: removed duplicate performance test script > > t/perf: avoid testing bitmaps without lookup table > > t/perf/lib-bitmap.sh: avoid test_perf during setup > > Peff and I were the only two people who read these patches? > Is this topic still viable, or has it been backburnered? I read through the patch series, but didn't have anything to add over what has already been discussed. Overall I think that it makes sense to enable lookup tables -- we've had them enabled since February 2023 for all users of GitLab. Patrick ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2025-05-05 7:11 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-04-17 21:12 [PATCH 0/4] pack-bitmap: enable lookup tables by default, misc. cleanups Taylor Blau 2025-04-17 21:12 ` [PATCH 1/4] pack-bitmap: write lookup table extension by default Taylor Blau 2025-04-17 22:04 ` Junio C Hamano 2025-04-18 9:33 ` Jeff King 2025-04-18 15:44 ` Junio C Hamano 2025-04-18 21:52 ` Taylor Blau 2025-04-17 21:12 ` [PATCH 2/4] p5312: removed duplicate performance test script Taylor Blau 2025-04-17 22:08 ` Junio C Hamano 2025-04-18 21:57 ` Taylor Blau 2025-04-17 21:12 ` [PATCH 3/4] t/perf: avoid testing bitmaps without lookup table Taylor Blau 2025-04-17 22:21 ` Junio C Hamano 2025-04-18 4:24 ` Junio C Hamano 2025-04-18 10:02 ` Jeff King 2025-04-17 21:12 ` [PATCH 4/4] t/perf/lib-bitmap.sh: avoid test_perf during setup Taylor Blau 2025-04-17 22:22 ` Junio C Hamano 2025-04-18 10:17 ` Jeff King 2025-05-02 21:21 ` [PATCH 0/4] pack-bitmap: enable lookup tables by default, misc. cleanups Junio C Hamano 2025-05-05 7:11 ` Patrick Steinhardt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).