From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, peff@peff.net
Subject: Re: [PATCH 3/4] midx.c: respect 'pack.writeBitmapHashcache' when writing bitmaps
Date: Thu, 09 Sep 2021 11:34:16 +0200
Message-ID: <87v93adr8r.fsf@evledraar.gmail.com>
In-Reply-To: <87zgsmdu6d.fsf@evledraar.gmail.com>


On Thu, Sep 09 2021, Ævar Arnfjörð Bjarmason wrote:

> On Tue, Sep 07 2021, Taylor Blau wrote:
>
>> On Wed, Sep 08, 2021 at 03:40:19AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> On Tue, Sep 07 2021, Taylor Blau wrote:
>>>
>>> > +static int git_multi_pack_index_write_config(const char *var, const char *value,
>>> > +					     void *cb)
>>> > +{
>>> > +	if (!strcmp(var, "pack.writebitmaphashcache")) {
>>> > +		if (git_config_bool(var, value))
>>> > +			opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE;
>>> > +		else
>>> > +			opts.flags &= ~MIDX_WRITE_BITMAP_HASH_CACHE;
>>> > +	}
>>> > +
>>> > +	/*
>>> > +	 * No need to fall-back to 'git_default_config', since this was already
>>> > +	 * called in 'cmd_multi_pack_index()'.
>>> > +	 */
>>> > +	return 0;
>>> > +}
>>> > +
>>> >  static int cmd_multi_pack_index_write(int argc, const char **argv)
>>> >  {
>>> >  	struct option *options;
>>> > @@ -73,6 +90,10 @@ static int cmd_multi_pack_index_write(int argc, const char **argv)
>>> >  		OPT_END(),
>>> >  	};
>>> >
>>> > +	opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE;
>>> > +
>>> > +	git_config(git_multi_pack_index_write_config, NULL);
>>> > +
>>>
>>> Since this is a write-only config option it would seem more logical to
>>> just call git_config() once, and have a git_multi_pack_index_config,
>>> which would then fall back on git_default_config, so we iterate it once,
>>> and there's no need for a comment about the oddity.
>>
>> Perhaps, but I'm not crazy about each sub-command having to call
>> git_config() itself when 'write' is the only one that actually has any
>> values to read.
>>
>> FWIW, the commit-graph builtin does the same thing as is written here
>> (calling git_config() twice, once in cmd_commit_graph() with
>> git_default_config as the callback and again in cmd_commit_graph_write()
>> with git_commit_graph_write_config as the callback).
>
> I didn't notice your earlier d356d5debe5 (commit-graph: introduce
> 'commitGraph.maxNewFilters', 2020-09-17). As an aside, the test added in
> that commit seems to be broken, or isn't testing that code change at
> all: if I comment out the git_config(git_commit_graph_write_config, &opts)
> it'll still pass.
>
> As a comment on this series, I'd find 4/4 squashed into 3/4 easier to
> read. When I did a "git blame" and found d356d5debe5 I discovered the
> test right away. If and when this gets merged someone might do the same,
> but not find the test as easily (they'd probably then grep the config
> variable name and find it eventually...).
>
> More importantly, the same issue as with the commit-graph test seems to
> apply here: if I comment out the added config reading code it'll
> still pass. It seems to be testing something, but not that the config is
> being read.
>
>> So I'm not opposed to cleaning it up, but I'd rather be consistent with
>> the existing behavior. To be honest, I'm not at all convinced that
>> reading the config twice is a bottleneck here when compared to
>> generating a MIDX.
>
> It's never going to matter at all for performance, I should have been
> clearer with my comments. I meant them purely as a "this code is hard to
> follow" comment.
>
> I.e. since we read the config twice, and in both commit-graph.c and
> multi-pack-index.c munge and write to the "opts" struct on
> parse_options(), you'll need to follow logic like:
>
>     1. Read config in cmd_X(), might set variable xyz
>     2. Do parse_options() in cmd_X(), might set variable xyz also
>     3. Now in cmd_X_subcmd(), read config, might set variable xyz
>     4. Do parse_options() in cmd_X_subcmd(), might set variable xyz also
>
> Of course in this case the relevant opts.flags only matters for the
> "write" subcommand, so on more careful reading we don't need to worry
> about the value flip-flopping between config defaults and getopts
> settings, but just in terms of establishing a pattern we'll be following
> in the subcommand built-ins I think this is setting us up for more
> complexity than is needed.
>
> As far as being consistent with existing behavior goes, git-worktree and
> git-stash, which are both similarly structured subcommands, follow the
> pattern of calling git_config() once. It seems better to me to follow
> that pattern than the one in d356d5debe5 if the config can be
> unambiguously parsed in one pass.

In a similar spirit to my
https://lore.kernel.org/git/87v93bidhn.fsf@evledraar.gmail.com/ I
looked at whether it would be better to have getopt set plain variables
and build the flags from them later, instead of having it set the flags
directly, and came up with this on top. It's not meant for this series,
it's more to muse on how we can write these subcommands in a simpler
manner (or not).
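
To show the idea in isolation, here is a minimal standalone sketch. It
is not git code: the SKETCH_* bits, "struct sketch_opts" and
sketch_flags() are made-up stand-ins for the MIDX_* flags and the
"opts" struct, and the -1 value is assumed to mean "not set in the
config".

```c
#include <assert.h>

/* Hypothetical stand-ins for the MIDX_* bits; the values are made up. */
#define SKETCH_PROGRESS         (1u << 0)
#define SKETCH_WRITE_BITMAP     (1u << 1)
#define SKETCH_WRITE_HASH_CACHE (1u << 2)

struct sketch_opts {
	int progress;         /* 0 or 1, e.g. from --progress or isatty() */
	int write_bitmap;     /* 0 or 1, e.g. from --bitmap */
	int write_hash_cache; /* -1 = not configured, otherwise 0 or 1 */
};

/* Collapse the plain variables into one flags word just before the call. */
static unsigned sketch_flags(const struct sketch_opts *o)
{
	unsigned flags = 0;

	assert(o->write_hash_cache >= -1 && o->write_hash_cache <= 1);

	if (o->progress)
		flags |= SKETCH_PROGRESS;
	if (o->write_bitmap)
		flags |= SKETCH_WRITE_BITMAP;
	/* Non-zero covers both the -1 default and an explicit "true". */
	if (o->write_hash_cache)
		flags |= SKETCH_WRITE_HASH_CACHE;

	return flags;
}
```

With this shape the config callback and parse_options() only ever
write to plain ints, and the one place that knows about the bit layout
is the function that builds the flags.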

I may have discovered a subtle bug in the process: in
cmd_multi_pack_index_repack() we end up calling write_midx_internal(),
which cares about MIDX_WRITE_REV_INDEX, but only
cmd_multi_pack_index_write() will set that flag, both before & after my
patch. Are we using the wrong flags during repack as a result?
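
The shape of what I'm asking about can be sketched like this. Again
this is a made-up model, not the real call chain: sketch_write_internal()
stands in for a shared writer that looks at a "rev index" bit, and the
two sketch_cmd_*() functions stand in for the subcommands. Whether the
real repack path actually needs the flag is exactly the open question.

```c
#include <assert.h>

/* Hypothetical flag bits; not the real MIDX_* values. */
#define SKETCH_PROGRESS  (1u << 0)
#define SKETCH_REV_INDEX (1u << 1)

/* Stand-in for a shared writer: returns 1 if it would write a rev index. */
static int sketch_write_internal(unsigned flags)
{
	return (flags & SKETCH_REV_INDEX) != 0;
}

/* "write" has an option that can turn the bit on. */
static int sketch_cmd_write(unsigned common_flags, int want_bitmap)
{
	unsigned flags = common_flags;

	if (want_bitmap)
		flags |= SKETCH_REV_INDEX;
	return sketch_write_internal(flags);
}

/* "repack" reaches the same writer, but nothing here sets the bit. */
static int sketch_cmd_repack(unsigned common_flags)
{
	return sketch_write_internal(common_flags);
}

/* Small usage example: the same writer behaves differently per caller. */
static void sketch_demo(void)
{
	assert(sketch_cmd_write(SKETCH_PROGRESS, 1) == 1);
	assert(sketch_cmd_repack(SKETCH_PROGRESS) == 0);
}
```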

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index dd1652531bf..1b97b2ee4e1 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -45,14 +45,16 @@ static char const * const builtin_multi_pack_index_usage[] = {
 static struct opts_multi_pack_index {
 	const char *object_dir;
 	const char *preferred_pack;
-	unsigned long batch_size;
-	unsigned flags;
-} opts;
+	int progress;
+	int write_bitmap_hash_cache;
+} opts = {
+	.write_bitmap_hash_cache = -1,
+};
 
 static struct option common_opts[] = {
 	OPT_FILENAME(0, "object-dir", &opts.object_dir,
 	  N_("object directory containing set of packfile and pack-index pairs")),
-	OPT_BIT(0, "progress", &opts.flags, N_("force progress reporting"), MIDX_PROGRESS),
+	OPT_BOOL(0, "progress", &opts.progress, N_("force progress reporting")),
 	OPT_END(),
 };
 
@@ -61,38 +63,29 @@ static struct option *add_common_options(struct option *prev)
 	return parse_options_concat(common_opts, prev);
 }
 
-static int git_multi_pack_index_write_config(const char *var, const char *value,
-					     void *cb)
+static int git_multi_pack_index_config(const char *var, const char *value,
+				       void *cb)
 {
 	if (!strcmp(var, "pack.writebitmaphashcache")) {
-		if (git_config_bool(var, value))
-			opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE;
-		else
-			opts.flags &= ~MIDX_WRITE_BITMAP_HASH_CACHE;
+		opts.write_bitmap_hash_cache = git_config_bool(var, value);
+		return 0;
 	}
 
-	/*
-	 * No need to fall-back to 'git_default_config', since this was already
-	 * called in 'cmd_multi_pack_index()'.
-	 */
-	return 0;
+	return git_default_config(var, value, NULL);
 }
 
 static int cmd_multi_pack_index_write(int argc, const char **argv)
 {
 	struct option *options;
+	static int write_bitmap = 0;
 	static struct option builtin_multi_pack_index_write_options[] = {
 		OPT_STRING(0, "preferred-pack", &opts.preferred_pack,
 			   N_("preferred-pack"),
 			   N_("pack for reuse when computing a multi-pack bitmap")),
-		OPT_BIT(0, "bitmap", &opts.flags, N_("write multi-pack bitmap"),
-			MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX),
+		OPT_BOOL(0, "bitmap", &write_bitmap, N_("write multi-pack bitmap")),
 		OPT_END(),
 	};
-
-	opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE;
-
-	git_config(git_multi_pack_index_write_config, NULL);
+	unsigned flags = 0;
 
 	options = add_common_options(builtin_multi_pack_index_write_options);
 
@@ -107,8 +100,15 @@ static int cmd_multi_pack_index_write(int argc, const char **argv)
 
 	FREE_AND_NULL(options);
 
-	return write_midx_file(opts.object_dir, opts.preferred_pack,
-			       opts.flags);
+	if (opts.progress)
+		flags |= MIDX_PROGRESS;
+	/* Both -1 default and 1 via config */
+	if (opts.write_bitmap_hash_cache)
+		flags |= MIDX_WRITE_BITMAP_HASH_CACHE;
+	if (write_bitmap)
+		flags |= MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX;
+
+	return write_midx_file(opts.object_dir, opts.preferred_pack, flags);
 }
 
 static int cmd_multi_pack_index_verify(int argc, const char **argv)
@@ -124,7 +124,7 @@ static int cmd_multi_pack_index_verify(int argc, const char **argv)
 		usage_with_options(builtin_multi_pack_index_verify_usage,
 				   options);
 
-	return verify_midx_file(the_repository, opts.object_dir, opts.flags);
+	return verify_midx_file(the_repository, opts.object_dir, opts.progress);
 }
 
 static int cmd_multi_pack_index_expire(int argc, const char **argv)
@@ -140,14 +140,15 @@ static int cmd_multi_pack_index_expire(int argc, const char **argv)
 		usage_with_options(builtin_multi_pack_index_expire_usage,
 				   options);
 
-	return expire_midx_packs(the_repository, opts.object_dir, opts.flags);
+	return expire_midx_packs(the_repository, opts.object_dir, opts.progress ? MIDX_PROGRESS : 0);
 }
 
 static int cmd_multi_pack_index_repack(int argc, const char **argv)
 {
+	static unsigned long batch_size = 0;
 	struct option *options;
 	static struct option builtin_multi_pack_index_repack_options[] = {
-		OPT_MAGNITUDE(0, "batch-size", &opts.batch_size,
+		OPT_MAGNITUDE(0, "batch-size", &batch_size,
 		  N_("during repack, collect pack-files of smaller size into a batch that is larger than this size")),
 		OPT_END(),
 	};
@@ -167,7 +168,8 @@ static int cmd_multi_pack_index_repack(int argc, const char **argv)
 	FREE_AND_NULL(options);
 
 	return midx_repack(the_repository, opts.object_dir,
-			   (size_t)opts.batch_size, opts.flags);
+			   (size_t)batch_size,
+			   opts.progress ? MIDX_PROGRESS : 0);
 }
 
 int cmd_multi_pack_index(int argc, const char **argv,
@@ -175,10 +177,10 @@ int cmd_multi_pack_index(int argc, const char **argv,
 {
 	struct option *builtin_multi_pack_index_options = common_opts;
 
-	git_config(git_default_config, NULL);
+	git_config(git_multi_pack_index_config, NULL);
 
 	if (isatty(2))
-		opts.flags |= MIDX_PROGRESS;
+		opts.progress = 1;
 	argc = parse_options(argc, argv, prefix,
 			     builtin_multi_pack_index_options,
 			     builtin_multi_pack_index_usage,
diff --git a/midx.c b/midx.c
index 6c35dcd557c..3e722888d69 100644
--- a/midx.c
+++ b/midx.c
@@ -1482,7 +1482,7 @@ static int compare_pair_pos_vs_id(const void *_a, const void *_b)
 			display_progress(progress, _n); \
 	} while (0)
 
-int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags)
+int verify_midx_file(struct repository *r, const char *object_dir, int opt_progress)
 {
 	struct pair_pos_vs_id *pairs = NULL;
 	uint32_t i;
@@ -1505,7 +1505,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 	if (!midx_checksum_valid(m))
 		midx_report(_("incorrect checksum"));
 
-	if (flags & MIDX_PROGRESS)
+	if (opt_progress)
 		progress = start_delayed_progress(_("Looking for referenced packfiles"),
 					  m->num_packs);
 	for (i = 0; i < m->num_packs; i++) {
@@ -1534,7 +1534,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 		return verify_midx_error;
 	}
 
-	if (flags & MIDX_PROGRESS)
+	if (opt_progress)
 		progress = start_sparse_progress(_("Verifying OID order in multi-pack-index"),
 						 m->num_objects - 1);
 	for (i = 0; i < m->num_objects - 1; i++) {
@@ -1563,14 +1563,14 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 		pairs[i].pack_int_id = nth_midxed_pack_int_id(m, i);
 	}
 
-	if (flags & MIDX_PROGRESS)
+	if (opt_progress)
 		progress = start_sparse_progress(_("Sorting objects by packfile"),
 						 m->num_objects);
 	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
 	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
 	stop_progress(&progress);
 
-	if (flags & MIDX_PROGRESS)
+	if (opt_progress)
 		progress = start_sparse_progress(_("Verifying object offsets"), m->num_objects);
 	for (i = 0; i < m->num_objects; i++) {
 		struct object_id oid;
diff --git a/midx.h b/midx.h
index 541d9ac728d..0dfe6a54ef3 100644
--- a/midx.h
+++ b/midx.h
@@ -64,7 +64,7 @@ int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, i
 
 int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags);
 void clear_midx_file(struct repository *r);
-int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags);
+int verify_midx_file(struct repository *r, const char *object_dir, int opt_progress);
 int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags);
 int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned flags);
 
