git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] propagate fsck message severity for bundle fetch
@ 2024-11-21 20:41 Justin Tobler
  2024-11-21 20:41 ` [PATCH 1/5] bundle: add bundle verification options type Justin Tobler
                   ` (5 more replies)
  0 siblings, 6 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-21 20:41 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

Greetings,

With 63d903ff52 (unbundle: extend object verification for fetches,
2024-06-19), fsck checks are now performed on fetched bundles depending
on `transfer.fsckObjects` and `fetch.fsckObjects` configuration. This
works, but provides no means to override the default fsck message
severity as is done for other git-fetch(1) operations. This series aims
to propagate fsck message severity configuration to the underlying
git-index-pack(1) process executed on the bundle in a similar manner as
done with git-fetch-pack(1).

  - Patches 1 and 2 adapt the bundle subsystem to support additional
    options when performing `unbundle()`.

  - Patches 3 and 4 adapt the fetch-pack subsystem to expose a means to
    generate the message configuration arguments.

  - Patch 5 wires the newly generated fsck configuration options to the
    bundle options used when fetching from a bundle.

Thanks
-Justin

Justin Tobler (5):
  bundle: add bundle verification options type
  bundle: support fsck message configuration
  fetch-pack: introduce `fetch_pack_options`
  fetch-pack: expose `fetch_pack_config_cb()`
  transport: propagate fsck configuration during bundle fetch

 builtin/bundle.c        |  2 +-
 bundle-uri.c            | 13 ++++++++-----
 bundle.c                | 17 +++++++++++------
 bundle.h                | 15 ++++++++++-----
 fetch-pack.c            | 20 +++++++++++---------
 fetch-pack.h            | 12 ++++++++++++
 t/t5607-clone-bundle.sh |  7 +++++++
 transport.c             | 11 +++++++++--
 8 files changed, 69 insertions(+), 28 deletions(-)


base-commit: 4083a6f05206077a50af7658bedc17a94c54607d
-- 
2.47.0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 1/5] bundle: add bundle verification options type
  2024-11-21 20:41 [PATCH 0/5] propagate fsck message severity for bundle fetch Justin Tobler
@ 2024-11-21 20:41 ` Justin Tobler
  2024-11-22  1:21   ` Junio C Hamano
  2024-11-26  9:08   ` Patrick Steinhardt
  2024-11-21 20:41 ` [PATCH 2/5] bundle: support fsck message configuration Justin Tobler
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-21 20:41 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

Bundle verification performed as part of `unbundle()` is configurable
via providing `verify_bundle_flags`. This is done by invoking
`verify_bundle()` and propagating the set flags. If the
`VERIFY_BUNDLE_FSCK` flag is provided, the `fsck-objects` flag is
specified when invoking git-index-pack(1) to perform fsck checks on the
objects in the bundle.

Introduce a new type, `verify_bundle_opts`, and update `unbundle()` to
accept this instead of `verify_bundle_flags` to perform the same
verification configuration. In a subsequent commit, `verify_bundle_opts`
will be extended to support configuration of fsck message severity.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 builtin/bundle.c |  2 +-
 bundle-uri.c     | 13 ++++++++-----
 bundle.c         | 14 +++++++++-----
 bundle.h         | 14 +++++++++-----
 transport.c      |  6 ++++--
 5 files changed, 31 insertions(+), 18 deletions(-)

diff --git a/builtin/bundle.c b/builtin/bundle.c
index 127518c2a8..15ac75ab51 100644
--- a/builtin/bundle.c
+++ b/builtin/bundle.c
@@ -218,7 +218,7 @@ static int cmd_bundle_unbundle(int argc, const char **argv, const char *prefix)
 		strvec_pushl(&extra_index_pack_args, "-v", "--progress-title",
 			     _("Unbundling objects"), NULL);
 	ret = !!unbundle(the_repository, &header, bundle_fd,
-			 &extra_index_pack_args, 0) ||
+			 &extra_index_pack_args, NULL) ||
 		list_bundle_refs(&header, argc, argv);
 	bundle_header_release(&header);
 
diff --git a/bundle-uri.c b/bundle-uri.c
index 0df66e2872..ed3afcaeb3 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -361,12 +361,16 @@ static int copy_uri_to_file(const char *filename, const char *uri)
 
 static int unbundle_from_file(struct repository *r, const char *file)
 {
-	int result = 0;
-	int bundle_fd;
+	struct verify_bundle_opts opts = {
+		.flags = VERIFY_BUNDLE_QUIET |
+			 (fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0)
+	};
 	struct bundle_header header = BUNDLE_HEADER_INIT;
-	struct string_list_item *refname;
 	struct strbuf bundle_ref = STRBUF_INIT;
+	struct string_list_item *refname;
 	size_t bundle_prefix_len;
+	int result = 0;
+	int bundle_fd;
 
 	bundle_fd = read_bundle_header(file, &header);
 	if (bundle_fd < 0) {
@@ -379,8 +383,7 @@ static int unbundle_from_file(struct repository *r, const char *file)
 	 * a reachable ref pointing to the new tips, which will reach
 	 * the prerequisite commits.
 	 */
-	result = unbundle(r, &header, bundle_fd, NULL,
-			  VERIFY_BUNDLE_QUIET | (fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0));
+	result = unbundle(r, &header, bundle_fd, NULL, &opts);
 	if (result) {
 		result = 1;
 		goto cleanup;
diff --git a/bundle.c b/bundle.c
index 4773b51eb1..db17f50ee0 100644
--- a/bundle.c
+++ b/bundle.c
@@ -626,13 +626,17 @@ int create_bundle(struct repository *r, const char *path,
 	return ret;
 }
 
-int unbundle(struct repository *r, struct bundle_header *header,
-	     int bundle_fd, struct strvec *extra_index_pack_args,
-	     enum verify_bundle_flags flags)
+int unbundle(struct repository *r, struct bundle_header *header, int bundle_fd,
+	     struct strvec *extra_index_pack_args,
+	     struct verify_bundle_opts *_opts)
 {
 	struct child_process ip = CHILD_PROCESS_INIT;
+	struct verify_bundle_opts opts = { 0 };
 
-	if (verify_bundle(r, header, flags))
+	if (_opts)
+		opts = *_opts;
+
+	if (verify_bundle(r, header, opts.flags))
 		return -1;
 
 	strvec_pushl(&ip.args, "index-pack", "--fix-thin", "--stdin", NULL);
@@ -641,7 +645,7 @@ int unbundle(struct repository *r, struct bundle_header *header,
 	if (header->filter.choice)
 		strvec_push(&ip.args, "--promisor=from-bundle");
 
-	if (flags & VERIFY_BUNDLE_FSCK)
+	if (opts.flags & VERIFY_BUNDLE_FSCK)
 		strvec_push(&ip.args, "--fsck-objects");
 
 	if (extra_index_pack_args)
diff --git a/bundle.h b/bundle.h
index 5ccc9a061a..bddf44c267 100644
--- a/bundle.h
+++ b/bundle.h
@@ -39,6 +39,10 @@ enum verify_bundle_flags {
 int verify_bundle(struct repository *r, struct bundle_header *header,
 		  enum verify_bundle_flags flags);
 
+struct verify_bundle_opts {
+	enum verify_bundle_flags flags;
+};
+
 /**
  * Unbundle after reading the header with read_bundle_header().
  *
@@ -49,12 +53,12 @@ int verify_bundle(struct repository *r, struct bundle_header *header,
  * (e.g. "-v" for verbose/progress), NULL otherwise. The provided
  * "extra_index_pack_args" (if any) will be strvec_clear()'d for you.
  *
- * Before unbundling, this method will call verify_bundle() with the
- * given 'flags'.
+ * Before unbundling, this method will call verify_bundle() with 'flags'
+ * provided in 'opts'.
  */
-int unbundle(struct repository *r, struct bundle_header *header,
-	     int bundle_fd, struct strvec *extra_index_pack_args,
-	     enum verify_bundle_flags flags);
+int unbundle(struct repository *r, struct bundle_header *header, int bundle_fd,
+	     struct strvec *extra_index_pack_args,
+	     struct verify_bundle_opts *opts);
 int list_bundle_refs(struct bundle_header *header,
 		int argc, const char **argv);
 
diff --git a/transport.c b/transport.c
index 47fda6a773..7e0ec4adc9 100644
--- a/transport.c
+++ b/transport.c
@@ -176,6 +176,8 @@ static int fetch_refs_from_bundle(struct transport *transport,
 				  int nr_heads UNUSED,
 				  struct ref **to_fetch UNUSED)
 {
+	struct verify_bundle_opts opts = { .flags = fetch_pack_fsck_objects() ?
+							    VERIFY_BUNDLE_FSCK : 0 };
 	struct bundle_transport_data *data = transport->data;
 	struct strvec extra_index_pack_args = STRVEC_INIT;
 	int ret;
@@ -185,9 +187,9 @@ static int fetch_refs_from_bundle(struct transport *transport,
 
 	if (!data->get_refs_from_bundle_called)
 		get_refs_from_bundle_inner(transport);
+
 	ret = unbundle(the_repository, &data->header, data->fd,
-		       &extra_index_pack_args,
-		       fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0);
+		       &extra_index_pack_args, &opts);
 	transport->hash_algo = data->header.hash_algo;
 
 	strvec_clear(&extra_index_pack_args);
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 2/5] bundle: support fsck message configuration
  2024-11-21 20:41 [PATCH 0/5] propagate fsck message severity for bundle fetch Justin Tobler
  2024-11-21 20:41 ` [PATCH 1/5] bundle: add bundle verification options type Justin Tobler
@ 2024-11-21 20:41 ` Justin Tobler
  2024-11-22  1:30   ` Junio C Hamano
  2024-11-21 20:41 ` [PATCH 3/5] fetch-pack: introduce `fetch_pack_options` Justin Tobler
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 41+ messages in thread
From: Justin Tobler @ 2024-11-21 20:41 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

The git-index-pack(1) spawned during `unbundle()` can be optionally
configured with `--fsck-objects` to perform fsck checks on the bundle.
This does not propagate fsck message severity configuration though.

Extend `verify_bundle_opts` to store this information and update
`unbundle()` to configure the `--fsck-objects` option appropriately.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 bundle.c | 3 ++-
 bundle.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/bundle.c b/bundle.c
index db17f50ee0..97b70e2e51 100644
--- a/bundle.c
+++ b/bundle.c
@@ -646,7 +646,8 @@ int unbundle(struct repository *r, struct bundle_header *header, int bundle_fd,
 		strvec_push(&ip.args, "--promisor=from-bundle");
 
 	if (opts.flags & VERIFY_BUNDLE_FSCK)
-		strvec_push(&ip.args, "--fsck-objects");
+		strvec_pushf(&ip.args, "--fsck-objects%s",
+			     opts.fsck_msg_types ? opts.fsck_msg_types : "");
 
 	if (extra_index_pack_args)
 		strvec_pushv(&ip.args, extra_index_pack_args->v);
diff --git a/bundle.h b/bundle.h
index bddf44c267..2a7b556f83 100644
--- a/bundle.h
+++ b/bundle.h
@@ -41,6 +41,7 @@ int verify_bundle(struct repository *r, struct bundle_header *header,
 
 struct verify_bundle_opts {
 	enum verify_bundle_flags flags;
+	const char *fsck_msg_types;
 };
 
 /**
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 3/5] fetch-pack: introduce `fetch_pack_options`
  2024-11-21 20:41 [PATCH 0/5] propagate fsck message severity for bundle fetch Justin Tobler
  2024-11-21 20:41 ` [PATCH 1/5] bundle: add bundle verification options type Justin Tobler
  2024-11-21 20:41 ` [PATCH 2/5] bundle: support fsck message configuration Justin Tobler
@ 2024-11-21 20:41 ` Justin Tobler
  2024-11-22  1:46   ` Junio C Hamano
  2024-11-21 20:41 ` [PATCH 4/5] fetch-pack: expose `fetch_pack_config_cb()` Justin Tobler
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 41+ messages in thread
From: Justin Tobler @ 2024-11-21 20:41 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

When `fetch_pack_config()` is invoked, fetch-pack configuration is
parsed from the config. As part of this operation, fsck message severity
configuration is assigned to the `fsck_msg_types` global variable. This
is optionally used to configure the downstream git-index-pack(1) when
the `--strict` option is specified.

In a subsequent commit, the same fetch-pack fsck message configuration
needs to be reused. To facilitate this, introduce `fetch_pack_options`
which gets written to during the `fetch_pack_config_cb()` instead of
directly modifying the `fsck_msg_types` global.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 fetch-pack.c | 16 +++++++++-------
 fetch-pack.h |  8 ++++++++
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index fe1fb3c1b7..73309f8043 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -48,8 +48,8 @@ static int server_supports_filtering;
 static int advertise_sid;
 static struct shallow_lock shallow_lock;
 static const char *alternate_shallow_file;
+static struct fetch_pack_options fetch_pack_options = FETCH_PACK_OPTIONS_INIT;
 static struct fsck_options fsck_options = FSCK_OPTIONS_MISSING_GITMODULES;
-static struct strbuf fsck_msg_types = STRBUF_INIT;
 static struct string_list uri_protocols = STRING_LIST_INIT_DUP;
 
 /* Remember to update object flag allocation in object.h */
@@ -1017,7 +1017,7 @@ static int get_pack(struct fetch_pack_args *args,
 			strvec_push(&cmd.args, "--fsck-objects");
 		else
 			strvec_pushf(&cmd.args, "--strict%s",
-				     fsck_msg_types.buf);
+				     fetch_pack_options.fsck_msg_types.buf);
 	}
 
 	if (index_pack_args) {
@@ -1860,6 +1860,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 static int fetch_pack_config_cb(const char *var, const char *value,
 				const struct config_context *ctx, void *cb)
 {
+	struct fetch_pack_options *opts = cb;
 	const char *msg_id;
 
 	if (strcmp(var, "fetch.fsck.skiplist") == 0) {
@@ -1867,8 +1868,8 @@ static int fetch_pack_config_cb(const char *var, const char *value,
 
 		if (git_config_pathname(&path, var, value))
 			return 1;
-		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
-			fsck_msg_types.len ? ',' : '=', path);
+		strbuf_addf(&opts->fsck_msg_types, "%cskiplist=%s",
+			    opts->fsck_msg_types.len ? ',' : '=', path);
 		free(path);
 		return 0;
 	}
@@ -1877,8 +1878,9 @@ static int fetch_pack_config_cb(const char *var, const char *value,
 		if (!value)
 			return config_error_nonbool(var);
 		if (is_valid_msg_type(msg_id, value))
-			strbuf_addf(&fsck_msg_types, "%c%s=%s",
-				fsck_msg_types.len ? ',' : '=', msg_id, value);
+			strbuf_addf(&opts->fsck_msg_types, "%c%s=%s",
+				    opts->fsck_msg_types.len ? ',' : '=',
+				    msg_id, value);
 		else
 			warning("Skipping unknown msg id '%s'", msg_id);
 		return 0;
@@ -1904,7 +1906,7 @@ static void fetch_pack_config(void)
 		}
 	}
 
-	git_config(fetch_pack_config_cb, NULL);
+	git_config(fetch_pack_config_cb, &fetch_pack_options);
 }
 
 static void fetch_pack_setup(void)
diff --git a/fetch-pack.h b/fetch-pack.h
index b5c579cdae..8243b754ce 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -106,4 +106,12 @@ int report_unmatched_refs(struct ref **sought, int nr_sought);
  */
 int fetch_pack_fsck_objects(void);
 
+struct fetch_pack_options {
+	struct strbuf fsck_msg_types;
+};
+
+#define FETCH_PACK_OPTIONS_INIT { \
+	.fsck_msg_types = STRBUF_INIT, \
+}
+
 #endif
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 4/5] fetch-pack: expose `fetch_pack_config_cb()`
  2024-11-21 20:41 [PATCH 0/5] propagate fsck message severity for bundle fetch Justin Tobler
                   ` (2 preceding siblings ...)
  2024-11-21 20:41 ` [PATCH 3/5] fetch-pack: introduce `fetch_pack_options` Justin Tobler
@ 2024-11-21 20:41 ` Justin Tobler
  2024-11-22  1:57   ` Junio C Hamano
  2024-11-22 16:45   ` shejialuo
  2024-11-21 20:41 ` [PATCH 5/5] transport: propagate fsck configuration during bundle fetch Justin Tobler
  2024-11-27  0:57 ` [PATCH v2 0/4] propagate fsck message severity for " Justin Tobler
  5 siblings, 2 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-21 20:41 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

During fetch-pack operations, git-index-pack(1) may be spawned and
perform fsck checks. The message severity of these checks is
configurable and propagated via appending it to the `--fsck-objects`
option.

With `fetch_pack_config_cb()`, fsck configuration gets populated to a
`fetch_pack_options`. Expose `fetch_pack_config_cb()`, to facilitate
formatted fsck message configuration generation. In a subsequent commit,
this is used to wire message configuration to `unbundle()` during bundle
fetches.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 fetch-pack.c | 4 ++--
 fetch-pack.h | 4 ++++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 73309f8043..10b66795bc 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1857,8 +1857,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 	return ref;
 }
 
-static int fetch_pack_config_cb(const char *var, const char *value,
-				const struct config_context *ctx, void *cb)
+int fetch_pack_config_cb(const char *var, const char *value,
+			 const struct config_context *ctx, void *cb)
 {
 	struct fetch_pack_options *opts = cb;
 	const char *msg_id;
diff --git a/fetch-pack.h b/fetch-pack.h
index 8243b754ce..f35a75a3c5 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -1,6 +1,7 @@
 #ifndef FETCH_PACK_H
 #define FETCH_PACK_H
 
+#include "config.h"
 #include "string-list.h"
 #include "protocol.h"
 #include "list-objects-filter-options.h"
@@ -114,4 +115,7 @@ struct fetch_pack_options {
 	.fsck_msg_types = STRBUF_INIT, \
 }
 
+int fetch_pack_config_cb(const char *var, const char *value,
+			 const struct config_context *ctx, void *cb);
+
 #endif
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 5/5] transport: propagate fsck configuration during bundle fetch
  2024-11-21 20:41 [PATCH 0/5] propagate fsck message severity for bundle fetch Justin Tobler
                   ` (3 preceding siblings ...)
  2024-11-21 20:41 ` [PATCH 4/5] fetch-pack: expose `fetch_pack_config_cb()` Justin Tobler
@ 2024-11-21 20:41 ` Justin Tobler
  2024-11-22  1:59   ` Junio C Hamano
  2024-11-27  0:57 ` [PATCH v2 0/4] propagate fsck message severity for " Justin Tobler
  5 siblings, 1 reply; 41+ messages in thread
From: Justin Tobler @ 2024-11-21 20:41 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

When fetching directly from a bundle, fsck message severity
configuration is not propagated to the underlying git-index-pack(1). It
is only capable of enabling or disabling fsck checks entirely. This does
not align with the fsck behavior for fetches through git-fetch-pack(1).

Use the message configuration from fetch-pack and wire it through to
`unbundle()` to enable the same fsck configuration as done through
fetch-pack.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 t/t5607-clone-bundle.sh | 7 +++++++
 transport.c             | 5 +++++
 2 files changed, 12 insertions(+)

diff --git a/t/t5607-clone-bundle.sh b/t/t5607-clone-bundle.sh
index 7ceaa8194d..c69aa88eae 100755
--- a/t/t5607-clone-bundle.sh
+++ b/t/t5607-clone-bundle.sh
@@ -171,6 +171,13 @@ test_expect_success 'clone bundle with different fsckObjects configurations' '
 
 	test_must_fail git -c transfer.fsckObjects=true \
 		clone bundle-fsck/bad.bundle bundle-transfer-fsck 2>err &&
+	test_grep "missingEmail" err &&
+
+	git -c fetch.fsckObjects=true -c fetch.fsck.missingEmail=ignore \
+		clone bundle-fsck/bad.bundle bundle-fsck-ignore &&
+
+	test_must_fail git -c fetch.fsckObjects=true -c fetch.fsck.missingEmail=error \
+		clone bundle-fsck/bad.bundle bundle-fsck-error 2>err &&
 	test_grep "missingEmail" err
 '
 
diff --git a/transport.c b/transport.c
index 7e0ec4adc9..a9e03c3964 100644
--- a/transport.c
+++ b/transport.c
@@ -178,6 +178,7 @@ static int fetch_refs_from_bundle(struct transport *transport,
 {
 	struct verify_bundle_opts opts = { .flags = fetch_pack_fsck_objects() ?
 							    VERIFY_BUNDLE_FSCK : 0 };
+	struct fetch_pack_options fetch_pack_options = FETCH_PACK_OPTIONS_INIT;
 	struct bundle_transport_data *data = transport->data;
 	struct strvec extra_index_pack_args = STRVEC_INIT;
 	int ret;
@@ -188,11 +189,15 @@ static int fetch_refs_from_bundle(struct transport *transport,
 	if (!data->get_refs_from_bundle_called)
 		get_refs_from_bundle_inner(transport);
 
+	git_config(fetch_pack_config_cb, &fetch_pack_options);
+	opts.fsck_msg_types = fetch_pack_options.fsck_msg_types.buf;
+
 	ret = unbundle(the_repository, &data->header, data->fd,
 		       &extra_index_pack_args, &opts);
 	transport->hash_algo = data->header.hash_algo;
 
 	strvec_clear(&extra_index_pack_args);
+	strbuf_release(&fetch_pack_options.fsck_msg_types);
 	return ret;
 }
 
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/5] bundle: add bundle verification options type
  2024-11-21 20:41 ` [PATCH 1/5] bundle: add bundle verification options type Justin Tobler
@ 2024-11-22  1:21   ` Junio C Hamano
  2024-11-22 15:22     ` Justin Tobler
  2024-11-26  9:08   ` Patrick Steinhardt
  1 sibling, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2024-11-22  1:21 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

Justin Tobler <jltobler@gmail.com> writes:

> diff --git a/bundle-uri.c b/bundle-uri.c
> index 0df66e2872..ed3afcaeb3 100644
> --- a/bundle-uri.c
> +++ b/bundle-uri.c
> @@ -361,12 +361,16 @@ static int copy_uri_to_file(const char *filename, const char *uri)
>  
>  static int unbundle_from_file(struct repository *r, const char *file)
>  {
> -	int result = 0;
> -	int bundle_fd;
> +	struct verify_bundle_opts opts = {
> +		.flags = VERIFY_BUNDLE_QUIET |
> +			 (fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0)
> +	};
>  	struct bundle_header header = BUNDLE_HEADER_INIT;
> -	struct string_list_item *refname;
>  	struct strbuf bundle_ref = STRBUF_INIT;
> +	struct string_list_item *refname;
>  	size_t bundle_prefix_len;
> +	int result = 0;
> +	int bundle_fd;

Unrelated changes to reorder the lines, without any justification
worth describing in the proposed commit log message, distracts and
discourages the reviewers from reading further on.  I would avoid
making such changes if I were doing this patch.

The _real_ change in the above hunk is that a new struct instance
"opts" is defined, with its .flags member initialized based on what
fetch_pack_fsck_object() says.  That helper function requires us to
be in a repository, but because you must have a repository to
unbundle into, that call is safe.

> @@ -379,8 +383,7 @@ static int unbundle_from_file(struct repository *r, const char *file)
>  	 * a reachable ref pointing to the new tips, which will reach
>  	 * the prerequisite commits.
>  	 */
> -	result = unbundle(r, &header, bundle_fd, NULL,
> -			  VERIFY_BUNDLE_QUIET | (fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0));
> +	result = unbundle(r, &header, bundle_fd, NULL, &opts);

We can see that .flags in the new structure gets the same value we
used to pass in the original, which is good.

> diff --git a/bundle.c b/bundle.c
> index 4773b51eb1..db17f50ee0 100644
> --- a/bundle.c
> +++ b/bundle.c
> @@ -626,13 +626,17 @@ int create_bundle(struct repository *r, const char *path,
>  	return ret;
>  }
>  
> -int unbundle(struct repository *r, struct bundle_header *header,
> -	     int bundle_fd, struct strvec *extra_index_pack_args,
> -	     enum verify_bundle_flags flags)
> +int unbundle(struct repository *r, struct bundle_header *header, int bundle_fd,
> +	     struct strvec *extra_index_pack_args,
> +	     struct verify_bundle_opts *_opts)

Again, unrelated rewrapping of lines distracts and discourages the
reviewers from reading further on.  It looked as if the patch is
adding an extra parameter, until I read it again.

The real change here is that the enum is replaced with a struct that
has the same enum as one of its members, which is good.

Name the external-facing one (like this new parameter) _without_
funnies, and call it straight "opts".  The internal stand-in object
you create below can use funny convention but using "_" as prefix is
usually for system stuff (and the language standard forbids it, even
though people often do so in practice, from programs).

>  {
>  	struct child_process ip = CHILD_PROCESS_INIT;
> +	struct verify_bundle_opts opts = { 0 };
>  
> -	if (verify_bundle(r, header, flags))
> +	if (_opts)
> +		opts = *_opts;
> +
> +	if (verify_bundle(r, header, opts.flags))
>  		return -1;

This is an arrangement that looks strange, especially at this stage
of the series without reading the rest.  If verify_bundle() takes
the enum and not &opts, there is no need for stand-in opts struct.
You can have a local enum "flags" that is initialized to 0 and only
when parameter "opts" is not NULL, assign opts->flags to it and use
it throughout the rest of the function.  Reviewers will be left
confused wondering why the patch does this in an unnecessarily more
complex way by using an extra structure instance.

Until you start needing other fields of opts in the function,
perhaps in a later step, that is.

> @@ -641,7 +645,7 @@ int unbundle(struct repository *r, struct bundle_header *header,
>  	if (header->filter.choice)
>  		strvec_push(&ip.args, "--promisor=from-bundle");
>  
> -	if (flags & VERIFY_BUNDLE_FSCK)
> +	if (opts.flags & VERIFY_BUNDLE_FSCK)
>  		strvec_push(&ip.args, "--fsck-objects");

And this is a fallout of the above "strange" arrangement.

>  {
> +	struct verify_bundle_opts opts = { .flags = fetch_pack_fsck_objects() ?
> +							    VERIFY_BUNDLE_FSCK : 0 };

	struct verify_bundle_opts opts = {
		.flags = fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0,
	};

to avoid overly long lines, and prepare for a future you add more
members to the structure (the trailing comma helps).


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/5] bundle: support fsck message configuration
  2024-11-21 20:41 ` [PATCH 2/5] bundle: support fsck message configuration Justin Tobler
@ 2024-11-22  1:30   ` Junio C Hamano
  2024-11-22 15:44     ` Justin Tobler
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2024-11-22  1:30 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

Justin Tobler <jltobler@gmail.com> writes:

> The git-index-pack(1) spawned during `unbundle()` can be optionally
> configured with `--fsck-objects` to perform fsck checks on the bundle.
> This does not propagate fsck message severity configuration though.
>
> Extend `verify_bundle_opts` to store this information and update
> `unbundle()` to configure the `--fsck-objects` option appropriately.
>
> Signed-off-by: Justin Tobler <jltobler@gmail.com>
> ---
>  bundle.c | 3 ++-
>  bundle.h | 1 +
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/bundle.c b/bundle.c
> index db17f50ee0..97b70e2e51 100644
> --- a/bundle.c
> +++ b/bundle.c
> @@ -646,7 +646,8 @@ int unbundle(struct repository *r, struct bundle_header *header, int bundle_fd,
>  		strvec_push(&ip.args, "--promisor=from-bundle");
>  
>  	if (opts.flags & VERIFY_BUNDLE_FSCK)
> -		strvec_push(&ip.args, "--fsck-objects");
> +		strvec_pushf(&ip.args, "--fsck-objects%s",
> +			     opts.fsck_msg_types ? opts.fsck_msg_types : "");

OK, having %s immediately after --option-name means that anybody who
is adding anything to fsck_msg_types is responsible for starting it
with an "=" equals sign, but that is in line with how existing code
does, e.g. receive-pack drives unpack-objcts/index-pack with the
"--strict%s" option with a potential value for fsck_msg_types).

> diff --git a/bundle.h b/bundle.h
> index bddf44c267..2a7b556f83 100644
> --- a/bundle.h
> +++ b/bundle.h
> @@ -41,6 +41,7 @@ int verify_bundle(struct repository *r, struct bundle_header *header,
>  
>  struct verify_bundle_opts {
>  	enum verify_bundle_flags flags;
> +	const char *fsck_msg_types;
>  };
>  
>  /**

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 3/5] fetch-pack: introduce `fetch_pack_options`
  2024-11-21 20:41 ` [PATCH 3/5] fetch-pack: introduce `fetch_pack_options` Justin Tobler
@ 2024-11-22  1:46   ` Junio C Hamano
  2024-11-22  3:46     ` Junio C Hamano
  2024-11-22 17:31     ` Justin Tobler
  0 siblings, 2 replies; 41+ messages in thread
From: Junio C Hamano @ 2024-11-22  1:46 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

Justin Tobler <jltobler@gmail.com> writes:

> When `fetch_pack_config()` is invoked, fetch-pack configuration is
> parsed from the config. As part of this operation, fsck message severity
> configuration is assigned to the `fsck_msg_types` global variable. This
> is optionally used to configure the downstream git-index-pack(1) when
> the `--strict` option is specified.
>
> In a subsequent commit, the same fetch-pack fsck message configuration
> needs to be reused. To facilitate this, introduce `fetch_pack_options`
> which gets written to during the `fetch_pack_config_cb()` instead of
> directly modifying the `fsck_msg_types` global.

It is unclear how it facilitates to replace one global with another
global that has the data that was previously global as one of its
members.  With the above I was somehow expecting that the option
struct instance is allocated on the stack of a function common to
both callers of the configuration reader (i.e. fetch_pack_config())
as well as the configuration user (i.e. get_pack()).  If we were to
allow the latter to keep accessing the global (which is perfectly
fine), wouldn't it be sufficient for the purpose of this series
(which I am imagining wants to call fetch_pack_config() from the
sideways and grab what came from the configuration) to just pass the
fsck_msg_types strbuf through the call chain of the reaading side?

That is,

 - fetch_pack_config()'s current callers pass the address of
   fsck_msg_types to a new parameter.

 - fetch_pack_config() passes that new parameter when calling
   git_config();

 - fetch_pack_config_cb() uses the cb parameter and stuff its
   findings there;

 - a third-party caller calls fetch_pack_config() with its own
   fsck_msg_types instance (presumably in this series, it would be
   the opts.fsck_msg_types member introduced earlier in the bundle
   code).

or something like that?

So, the reason for existence of the shell around the fsck_msg_types
needs to be explained.  It is perfectly fine to say "we'll add THIS
THING in a later step", if that were the case, but a reviewer tends
to start reading from the front, so the presentation order matters.

Leaving many questions unsolved tangling may be a good way to keep
readers engaged when writing a mystery novel, but not a patch
series.  Having to keep too many things in head, especially when
many of them are not explained well (hence raises "why should I keep
these in my head?" question), is another distraction and discourages
the reviewers from reading further on.

Assuming that the shell structure is necessary around it, the code
changes in this patch looks sensible to me.

Thanks.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 4/5] fetch-pack: expose `fetch_pack_config_cb()`
  2024-11-21 20:41 ` [PATCH 4/5] fetch-pack: expose `fetch_pack_config_cb()` Justin Tobler
@ 2024-11-22  1:57   ` Junio C Hamano
  2024-11-22 17:41     ` Justin Tobler
  2024-11-22 16:45   ` shejialuo
  1 sibling, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2024-11-22  1:57 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

Justin Tobler <jltobler@gmail.com> writes:

> With `fetch_pack_config_cb()`, fsck configuration gets populated to a
> `fetch_pack_options`. Expose `fetch_pack_config_cb()`, to facilitate
> formatted fsck message configuration generation. In a subsequent commit,
> this is used to wire message configuration to `unbundle()` during bundle
> fetches.

This is generally going in the right direction, but this particular
iteration is highly disappointing for two reasons.

 - The callback calls git_default_config() at end.  Other callers
   may not want it happen.  Think of the reason why a new caller may
   want to use this callback (see the next item).

 - fetch_pack_config_cb() was perfectly good name back when it was
   hidden inside fetch-pack.c, as a private internal implementation
   detail, EVEN THOUGH it did not give its callers everything that
   tries to configure the behaviour of fetch-pack.  It ONLY is about
   how fsck behaviour is affected.  It must be renamed so that any
   new caller can realize that it is configuring fsck checking
   machinery and not general fetch-pack features.

So, I would suggest making at least two changes.

 - rename it to a more sensible name that includes "fsck" somewhere
   (as it is about "fetch.fsck.*" configuration variables, "fetch"
   should also stay in the name).  Let's tentatively call it foo().

 - stop calling git_default_config() from foo().  Instead, return
   some special value foo() does not currently return, let's say -1
   to signal that the key was something foo() was not interested in,
   and write a thin replacement helper

    static int fetch_pack_config_cb(...)
    {
	int st = foo(...);
	if (st < 0)
	    return git_default_config(var, value, ctx, cb);
	return st;
    }

   and call that from fetch_pack_config().

No, "foo()" has neither "fetch" or "fsck" in it; I am not suggesting
to use that as the final name ;-).

Thanks.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/5] transport: propagate fsck configuration during bundle fetch
  2024-11-21 20:41 ` [PATCH 5/5] transport: propagate fsck configuration during bundle fetch Justin Tobler
@ 2024-11-22  1:59   ` Junio C Hamano
  0 siblings, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2024-11-22  1:59 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

Justin Tobler <jltobler@gmail.com> writes:

> When fetching directly from a bundle, fsck message severity
> configuration is not propagated to the underlying git-index-pack(1). It
> is only capable of enabling or disabling fsck checks entirely. This does
> not align with the fsck behavior for fetches through git-fetch-pack(1).
>
> Use the message configuration from fetch-pack and wire it through to
> `unbundle()` to enable the same fsck configuration as done through
> fetch-pack.
>
> Signed-off-by: Justin Tobler <jltobler@gmail.com>
> ---
>  t/t5607-clone-bundle.sh | 7 +++++++
>  transport.c             | 5 +++++
>  2 files changed, 12 insertions(+)

Nicely done.

>
> diff --git a/t/t5607-clone-bundle.sh b/t/t5607-clone-bundle.sh
> index 7ceaa8194d..c69aa88eae 100755
> --- a/t/t5607-clone-bundle.sh
> +++ b/t/t5607-clone-bundle.sh
> @@ -171,6 +171,13 @@ test_expect_success 'clone bundle with different fsckObjects configurations' '
>  
>  	test_must_fail git -c transfer.fsckObjects=true \
>  		clone bundle-fsck/bad.bundle bundle-transfer-fsck 2>err &&
> +	test_grep "missingEmail" err &&
> +
> +	git -c fetch.fsckObjects=true -c fetch.fsck.missingEmail=ignore \
> +		clone bundle-fsck/bad.bundle bundle-fsck-ignore &&
> +
> +	test_must_fail git -c fetch.fsckObjects=true -c fetch.fsck.missingEmail=error \
> +		clone bundle-fsck/bad.bundle bundle-fsck-error 2>err &&
>  	test_grep "missingEmail" err
>  '
>  
> diff --git a/transport.c b/transport.c
> index 7e0ec4adc9..a9e03c3964 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -178,6 +178,7 @@ static int fetch_refs_from_bundle(struct transport *transport,
>  {
>  	struct verify_bundle_opts opts = { .flags = fetch_pack_fsck_objects() ?
>  							    VERIFY_BUNDLE_FSCK : 0 };
> +	struct fetch_pack_options fetch_pack_options = FETCH_PACK_OPTIONS_INIT;
>  	struct bundle_transport_data *data = transport->data;
>  	struct strvec extra_index_pack_args = STRVEC_INIT;
>  	int ret;
> @@ -188,11 +189,15 @@ static int fetch_refs_from_bundle(struct transport *transport,
>  	if (!data->get_refs_from_bundle_called)
>  		get_refs_from_bundle_inner(transport);
>  
> +	git_config(fetch_pack_config_cb, &fetch_pack_options);
> +	opts.fsck_msg_types = fetch_pack_options.fsck_msg_types.buf;
> +
>  	ret = unbundle(the_repository, &data->header, data->fd,
>  		       &extra_index_pack_args, &opts);
>  	transport->hash_algo = data->header.hash_algo;
>  
>  	strvec_clear(&extra_index_pack_args);
> +	strbuf_release(&fetch_pack_options.fsck_msg_types);
>  	return ret;
>  }

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 3/5] fetch-pack: introduce `fetch_pack_options`
  2024-11-22  1:46   ` Junio C Hamano
@ 2024-11-22  3:46     ` Junio C Hamano
  2024-11-22 17:31     ` Justin Tobler
  1 sibling, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2024-11-22  3:46 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Assuming that the shell structure is necessary around it, the code
> changes in this patch looks sensible to me.

Ah, another thing.  It would make more sense to do

    good_example(args, struct foo *opt)
    {
	struct foo opt_fallback = { ... init ... };

	if (!opt)
		opt = &opt_fallback;
	...
	use opt->foo and opt->bar
    }

instead of what the patch did with structure assignment, i.e.

    bad_example(args, struct foo *opt_)
    {
	struct foo opt = { ... init ... };

	if (opt_)
		opt = *opt_;
	...
	use opt.foo and opt.bar
    }

because the latter, via structure assignment, always raises "who
owns this piece of data?" question once you start adding more
complex things in the "struct foo", like a strbuf that holds the
fsck configuration data.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/5] bundle: add bundle verification options type
  2024-11-22  1:21   ` Junio C Hamano
@ 2024-11-22 15:22     ` Justin Tobler
  0 siblings, 0 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-22 15:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 24/11/22 10:21AM, Junio C Hamano wrote:
> Justin Tobler <jltobler@gmail.com> writes:
> 
> > diff --git a/bundle-uri.c b/bundle-uri.c
> > index 0df66e2872..ed3afcaeb3 100644
> > --- a/bundle-uri.c
> > +++ b/bundle-uri.c
> > @@ -361,12 +361,16 @@ static int copy_uri_to_file(const char *filename, const char *uri)
> >  
> >  static int unbundle_from_file(struct repository *r, const char *file)
> >  {
> > -	int result = 0;
> > -	int bundle_fd;
> > +	struct verify_bundle_opts opts = {
> > +		.flags = VERIFY_BUNDLE_QUIET |
> > +			 (fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0)
> > +	};
> >  	struct bundle_header header = BUNDLE_HEADER_INIT;
> > -	struct string_list_item *refname;
> >  	struct strbuf bundle_ref = STRBUF_INIT;
> > +	struct string_list_item *refname;
> >  	size_t bundle_prefix_len;
> > +	int result = 0;
> > +	int bundle_fd;
> 
> Unrelated changes to reorder the lines, without any justification
> worth describing in the proposed commit log message, distracts and
> discourages the reviewers from reading further on.  I would avoid
> making such changes if I were doing this patch.

Thanks for the feedback. I'll revert the unnecessary changes in the next
version and avoid doing this in the future.

> > -int unbundle(struct repository *r, struct bundle_header *header,
> > -	     int bundle_fd, struct strvec *extra_index_pack_args,
> > -	     enum verify_bundle_flags flags)
> > +int unbundle(struct repository *r, struct bundle_header *header, int bundle_fd,
> > +	     struct strvec *extra_index_pack_args,
> > +	     struct verify_bundle_opts *_opts)
> 
> Again, unrelated rewrapping of lines distracts and discourages the
> reviewers from reading further on.  It looked as if the patch is
> adding an extra parameter, until I read it again.
> 
> The real change here is that the enum is replaced with a struct that
> has the same enum as one of its members, which is good.
> 
> Name the external-facing one (like this new parameter) _without_
> funnies, and call it straight "opts".  The internal stand-in object
> you create below can use funny convention but using "_" as prefix is
> usually for system stuff (and the language standard forbids it, even
> though people often do so in practice, from programs).

Good to know, I saw this pattern used in the reftable library. I'll
update per your suggestion in the next version.

> >  {
> >  	struct child_process ip = CHILD_PROCESS_INIT;
> > +	struct verify_bundle_opts opts = { 0 };
> >  
> > -	if (verify_bundle(r, header, flags))
> > +	if (_opts)
> > +		opts = *_opts;
> > +
> > +	if (verify_bundle(r, header, opts.flags))
> >  		return -1;
> 
> This is an arrangement that looks strange, especially at this stage
> of the series without reading the rest.  If verify_bundle() takes
> the enum and not &opts, there is no need for stand-in opts struct.
> You can have a local enum "flags" that is initialized to 0 and only
> when parameter "opts" is not NULL, assign opts->flags to it and use
> it throughout the rest of the function.  Reviewers will be left
> confused wondering why the patch does this in an unnecessarily more
> complex way by using an extra structure instance.

Good point, at this point in the series its not obvious why the added
compexity is good or worth it. I'll defer making this change to the
following patch in the next version.

> 
> Until you start needing other fields of opts in the function,
> perhaps in a later step, that is.

Ya, this is its intent. Having a default options to fallback to is more
useful once there are other fields present.

-Justin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/5] bundle: support fsck message configuration
  2024-11-22  1:30   ` Junio C Hamano
@ 2024-11-22 15:44     ` Justin Tobler
  2024-11-25  1:33       ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Justin Tobler @ 2024-11-22 15:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 24/11/22 10:30AM, Junio C Hamano wrote:
> Justin Tobler <jltobler@gmail.com> writes:
> 
> >  	if (opts.flags & VERIFY_BUNDLE_FSCK)
> > -		strvec_push(&ip.args, "--fsck-objects");
> > +		strvec_pushf(&ip.args, "--fsck-objects%s",
> > +			     opts.fsck_msg_types ? opts.fsck_msg_types : "");
> 
> OK, having %s immediately after --option-name means that anybody who
> is adding anything to fsck_msg_types is responsible for starting it
> with an "=" equals sign, but that is in line with how existing code
> does, e.g. receive-pack drives unpack-objcts/index-pack with the
> "--strict%s" option with a potential value for fsck_msg_types).

I was considering making it the responsibility of the `fsck_msg_types`
consumer to conditionally preprend the '='. This would mean that
fetch-pack, in the later patch, should also be updated to not
automatically prepend '=' when parsing the config and be adjusted
accordingly. This does feel more intuitive to me, but I've not sure it
warrants the added fallout and complexity.

In the next version I'll add a comment to the type definition indicating
that any value for `fsck_msg_types` other than NULL, is expected to be
prefixed with '='. I am open to changing how this works altogether
though if we think that is best. :)

-Justin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 4/5] fetch-pack: expose `fetch_pack_config_cb()`
  2024-11-21 20:41 ` [PATCH 4/5] fetch-pack: expose `fetch_pack_config_cb()` Justin Tobler
  2024-11-22  1:57   ` Junio C Hamano
@ 2024-11-22 16:45   ` shejialuo
  2024-11-27  1:21     ` Justin Tobler
  1 sibling, 1 reply; 41+ messages in thread
From: shejialuo @ 2024-11-22 16:45 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

On Thu, Nov 21, 2024 at 02:41:18PM -0600, Justin Tobler wrote:
> During fetch-pack operations, git-index-pack(1) may be spawned and
> perform fsck checks. The message severity of these checks is
> configurable and propagated via appending it to the `--fsck-objects`
> option.
> 
> With `fetch_pack_config_cb()`, fsck configuration gets populated to a
> `fetch_pack_options`. Expose `fetch_pack_config_cb()`, to facilitate
> formatted fsck message configuration generation. In a subsequent commit,
> this is used to wire message configuration to `unbundle()` during bundle
> fetches.
> 

In my perspective, we may not separate [PATCH 4/5] and [PATCH 5/5].
Should the reason why we want to expose `fetch_pack_config_cb` is that
we need to propagate the fsck severity to `unbundle`? Without the
information of the last patch, we cannot know any detail thing. So, they
are highly relevant.

However, from the comment of the Junio, there are a lot of things need
to be changed. So, just a comment.

Thanks,
Jialuo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 3/5] fetch-pack: introduce `fetch_pack_options`
  2024-11-22  1:46   ` Junio C Hamano
  2024-11-22  3:46     ` Junio C Hamano
@ 2024-11-22 17:31     ` Justin Tobler
  1 sibling, 0 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-22 17:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 24/11/22 10:46AM, Junio C Hamano wrote:
> Justin Tobler <jltobler@gmail.com> writes:
> 
> > When `fetch_pack_config()` is invoked, fetch-pack configuration is
> > parsed from the config. As part of this operation, fsck message severity
> > configuration is assigned to the `fsck_msg_types` global variable. This
> > is optionally used to configure the downstream git-index-pack(1) when
> > the `--strict` option is specified.
> >
> > In a subsequent commit, the same fetch-pack fsck message configuration
> > needs to be reused. To facilitate this, introduce `fetch_pack_options`
> > which gets written to during the `fetch_pack_config_cb()` instead of
> > directly modifying the `fsck_msg_types` global.
> 
> It is unclear how it facilitates to replace one global with another
> global that has the data that was previously global as one of its
> members.  With the above I was somehow expecting that the option
> struct instance is allocated on the stack of a function common to
> both callers of the configuration reader (i.e. fetch_pack_config())
> as well as the configuration user (i.e. get_pack()).  If we were to
> allow the latter to keep accessing the global (which is perfectly
> fine), wouldn't it be sufficient for the purpose of this series
> (which I am imagining wants to call fetch_pack_config() from the
> sideways and grab what came from the configuration) to just pass the
> fsck_msg_types strbuf through the call chain of the reaading side?

For the purposes of this series, the addition of the
`fetch_pack_options` structure as a wrapper around `fsck_msg_types` is
not needed. As mentioned, it would be sufficient to just pass the
`strbuf` directly and have it modified by `fetch_pack_config_cb()`.

The original intent of providing the shell structure was to allow for
more easy extension of the fetch-pack config parsing in the future, but
it doesn't probably make much sense to do it now and its purpose wasn't
explained.

In the next version I'll drop the use of shell structure in favor of
passing an instance of `strbuf` directly.

-Justin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 4/5] fetch-pack: expose `fetch_pack_config_cb()`
  2024-11-22  1:57   ` Junio C Hamano
@ 2024-11-22 17:41     ` Justin Tobler
  0 siblings, 0 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-22 17:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 24/11/22 10:57AM, Junio C Hamano wrote:
> Justin Tobler <jltobler@gmail.com> writes:
> 
> > With `fetch_pack_config_cb()`, fsck configuration gets populated to a
> > `fetch_pack_options`. Expose `fetch_pack_config_cb()`, to facilitate
> > formatted fsck message configuration generation. In a subsequent commit,
> > this is used to wire message configuration to `unbundle()` during bundle
> > fetches.
> 
> This is generally going in the right direction, but this particular
> iteration is highly disappointing for two reasons.
> 
>  - The callback calls git_default_config() at end.  Other callers
>    may not want it happen.  Think of the reason why a new caller may
>    want to use this callback (see the next item).
> 
>  - fetch_pack_config_cb() was perfectly good name back when it was
>    hidden inside fetch-pack.c, as a private internal implementation
>    detail, EVEN THOUGH it did not give its callers everything that
>    tries to configure the behaviour of fetch-pack.  It ONLY is about
>    how fsck behaviour is affected.  It must be renamed so that any
>    new caller can realize that it is configuring fsck checking
>    machinery and not general fetch-pack features.
> 
> So, I would suggest making at least two changes.
> 
>  - rename it to a more sensible name that includes "fsck" somewhere
>    (as it is about "fetch.fsck.*" configuration variables, "fetch"
>    should also stay in the name).  Let's tentatively call it foo().
> 
>  - stop calling git_default_config() from foo().  Instead, return
>    some special value foo() does not currently return, let's say -1
>    to signal that the key was something foo() was not interested in,
>    and write a thin replacement helper
> 
>     static int fetch_pack_config_cb(...)
>     {
> 	int st = foo(...);
> 	if (st < 0)
> 	    return git_default_config(var, value, ctx, cb);
> 	return st;
>     }
> 
>    and call that from fetch_pack_config().
> 
> No, "foo()" has neither "fetch" or "fsck" in it; I am not suggesting
> to use that as the final name ;-).
> 
> Thanks.

Thanks for the suggestions. I'll factor out the fsck configuration bit
as suggested and name it something like "fetch_pack_fsck_config()". The
new name should be more clear and this change will also ensure only the
desired fsck configuration is being parsed which makes more sense. :)

-Justin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/5] bundle: support fsck message configuration
  2024-11-22 15:44     ` Justin Tobler
@ 2024-11-25  1:33       ` Junio C Hamano
  0 siblings, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2024-11-25  1:33 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

Justin Tobler <jltobler@gmail.com> writes:

> I was considering making it the responsibility of the `fsck_msg_types`
> consumer to conditionally preprend the '='.

In my initial reading, I too thought it might be logically more
clear, but I changed my mind and the patch posted as is is fine
(and that last part of the sentence is what I wanted to say).

Because fsck_msg_types starts as an empty strbuf and accumulates
elements one at a time, each time we add something to it, we'd need
to check if we are adding the first element (in which case we do not
want to terminate the existing list with ",") or we already have
something in there (in which case we do want to add "," before our
new element).  Because we are doing the check anyway, instead of
saying "ah, this is the first one, so let's not add ','", saying
"the first one?  we need '='" is not too bad.

And the consuming side would not have to have a conditional "if the
string is empty, do nothing, otherwise add '=' and then the string".
The consumers can just "concatenate the string, which is possibly
empty, after the option".

So in the end, the complexity for the producer is the same, and the
consumer becomes much simpler.

And this exactly pattern (which I personally find a bit ugly) is
used by receive-pack to drive unpack-objects and index-pack, which
makes it doubly OK to use it.

Adding a comment to describe what is expected in the variable is
indeed very much appreciated, though.

Thanks.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/5] bundle: add bundle verification options type
  2024-11-21 20:41 ` [PATCH 1/5] bundle: add bundle verification options type Justin Tobler
  2024-11-22  1:21   ` Junio C Hamano
@ 2024-11-26  9:08   ` Patrick Steinhardt
  2024-11-26 15:59     ` Justin Tobler
  1 sibling, 1 reply; 41+ messages in thread
From: Patrick Steinhardt @ 2024-11-26  9:08 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

On Thu, Nov 21, 2024 at 02:41:15PM -0600, Justin Tobler wrote:
> diff --git a/bundle-uri.c b/bundle-uri.c
> index 0df66e2872..ed3afcaeb3 100644
> --- a/bundle-uri.c
> +++ b/bundle-uri.c
> @@ -361,12 +361,16 @@ static int copy_uri_to_file(const char *filename, const char *uri)
>  
>  static int unbundle_from_file(struct repository *r, const char *file)
>  {
> -	int result = 0;
> -	int bundle_fd;
> +	struct verify_bundle_opts opts = {
> +		.flags = VERIFY_BUNDLE_QUIET |
> +			 (fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0)

This is missing its trailing comma.

> diff --git a/bundle.h b/bundle.h
> index 5ccc9a061a..bddf44c267 100644
> --- a/bundle.h
> +++ b/bundle.h
> @@ -39,6 +39,10 @@ enum verify_bundle_flags {
>  int verify_bundle(struct repository *r, struct bundle_header *header,
>  		  enum verify_bundle_flags flags);

Curious. I would have expected that `verify_bundle()` should receive the
full `verify_bundle_opts` because these are so similarly named. Like
this we have the weird situation where `unbundle()` seemingly receives
the options that are intended for `verify_bundle()`, but we never pass
them through in the first place.

> +struct verify_bundle_opts {
> +	enum verify_bundle_flags flags;
> +};
> +
>  /**
>   * Unbundle after reading the header with read_bundle_header().
>   *
> diff --git a/transport.c b/transport.c
> index 47fda6a773..7e0ec4adc9 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -176,6 +176,8 @@ static int fetch_refs_from_bundle(struct transport *transport,
>  				  int nr_heads UNUSED,
>  				  struct ref **to_fetch UNUSED)
>  {
> +	struct verify_bundle_opts opts = { .flags = fetch_pack_fsck_objects() ?
> +							    VERIFY_BUNDLE_FSCK : 0 };

This is weirdly formatted and should rather follow what you have done
further up in `unbundle_from_file()`.

Patrick

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/5] bundle: add bundle verification options type
  2024-11-26  9:08   ` Patrick Steinhardt
@ 2024-11-26 15:59     ` Justin Tobler
  0 siblings, 0 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-26 15:59 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 24/11/26 10:08AM, Patrick Steinhardt wrote:
> On Thu, Nov 21, 2024 at 02:41:15PM -0600, Justin Tobler wrote:
> > diff --git a/bundle.h b/bundle.h
> > index 5ccc9a061a..bddf44c267 100644
> > --- a/bundle.h
> > +++ b/bundle.h
> > @@ -39,6 +39,10 @@ enum verify_bundle_flags {
> >  int verify_bundle(struct repository *r, struct bundle_header *header,
> >  		  enum verify_bundle_flags flags);
> 
> Curious. I would have expected that `verify_bundle()` should receive the
> full `verify_bundle_opts` because these are so similarly named. Like
> this we have the weird situation where `unbundle()` seemingly receives
> the options that are intended for `verify_bundle()`, but we never pass
> them through in the first place.

The `verify_bundle_opts` type should probably be renamed to
`unbundle_opts`. As you mentioned, these options are not used directly
by `verify_bundle()` as the current naming would suggest which is
confusing. Really the intent is allow further configuration of
`unbundle()`. This is evident in the next patch where the options are
extended to further configure the git-index-pack(1) spawned through
`unbundle()`. I'll update the naming in the next version.

-Justin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v2 0/4] propagate fsck message severity for bundle fetch
  2024-11-21 20:41 [PATCH 0/5] propagate fsck message severity for bundle fetch Justin Tobler
                   ` (4 preceding siblings ...)
  2024-11-21 20:41 ` [PATCH 5/5] transport: propagate fsck configuration during bundle fetch Justin Tobler
@ 2024-11-27  0:57 ` Justin Tobler
  2024-11-27  0:57   ` [PATCH v2 1/4] bundle: add bundle verification options type Justin Tobler
                     ` (4 more replies)
  5 siblings, 5 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-27  0:57 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

Greetings,

With 63d903ff52 (unbundle: extend object verification for fetches,
2024-06-19), fsck checks are now performed on fetched bundles depending
on `transfer.fsckObjects` and `fetch.fsckObjects` configuration. This
works, but provides no means to override the default fsck message
severity as is done for other git-fetch(1) operations. This series aims
to propagate fsck message severity configuration to the underlying
git-index-pack(1) process executed on the bundle in a similar manner as
done with git-fetch-pack(1).

  - Patches 1 and 2 adapt the bundle subsystem to support additional
    options when performing `unbundle()`.

  - Patch 3 adapts the fetch-pack subsystem to expose a means to
    generate the message configuration arguments.

  - Patch 4 wires the newly generated fsck configuration options to the
    bundle options used when fetching from a bundle.

Changes in V2:

  - Reverted unnecessary line formatted changes.

  - Simplified fallback logic for default `unbundle()` options.

  - Dropped unncessary shell structure in the third patch to keep the
    patch more straightforward.

  - Instead of exposing `fetch_pack_config_cb()` directly, the fetch
    pack fsck config parsing logic is split and exposed. This also
    addresses the issue of unwanted default Git config parsing occuring.
    Consequently the previous third and fourth patch were condensed into
    a single patch.

  - Improved explainations in the patches, naming, and various style
    fixes.

Due to all rearranging in this version, I opted not to include a
range-diff.

Thanks
-Justin

Justin Tobler (4):
  bundle: add bundle verification options type
  bundle: support fsck message configuration
  fetch-pack: split out fsck config parsing
  transport: propagate fsck configuration during bundle fetch

 builtin/bundle.c        |  2 +-
 bundle-uri.c            |  7 +++++--
 bundle.c                | 13 +++++++++----
 bundle.h                | 17 ++++++++++++++---
 fetch-pack.c            | 24 +++++++++++++++++-------
 fetch-pack.h            |  3 +++
 t/t5607-clone-bundle.sh |  7 +++++++
 transport.c             | 26 ++++++++++++++++++++++++--
 8 files changed, 80 insertions(+), 19 deletions(-)


base-commit: 4083a6f05206077a50af7658bedc17a94c54607d
-- 
2.47.0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v2 1/4] bundle: add bundle verification options type
  2024-11-27  0:57 ` [PATCH v2 0/4] propagate fsck message severity for " Justin Tobler
@ 2024-11-27  0:57   ` Justin Tobler
  2024-11-27  0:57   ` [PATCH v2 2/4] bundle: support fsck message configuration Justin Tobler
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-27  0:57 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

When `unbundle()` is invoked, fsck verification may be configured by
passing the `VERIFY_BUNDLE_FSCK` flag. This mechanism allows fsck checks
on the bundle to be enabled or disabled entirely. To facilitate more
fine-grained fsck configuration, additional context must be provided to
`unbundle()`.

Introduce the `unbundle_opts` type, which wraps the existing
`verify_bundle_flags`, to facilitate future extension of `unbundle()`
configuration. Also update `unbundle()` and its call sites to accept
this new options type instead of the flags directly. The end behavior is
functionally the same, but allows for the set of configurable options to
be extended. This is leveraged in a subsequent commit to enable fsck
message severity configuration.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 builtin/bundle.c |  2 +-
 bundle-uri.c     |  7 +++++--
 bundle.c         |  6 +++++-
 bundle.h         | 10 +++++++---
 transport.c      |  6 ++++--
 5 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/builtin/bundle.c b/builtin/bundle.c
index 127518c2a8..15ac75ab51 100644
--- a/builtin/bundle.c
+++ b/builtin/bundle.c
@@ -218,7 +218,7 @@ static int cmd_bundle_unbundle(int argc, const char **argv, const char *prefix)
 		strvec_pushl(&extra_index_pack_args, "-v", "--progress-title",
 			     _("Unbundling objects"), NULL);
 	ret = !!unbundle(the_repository, &header, bundle_fd,
-			 &extra_index_pack_args, 0) ||
+			 &extra_index_pack_args, NULL) ||
 		list_bundle_refs(&header, argc, argv);
 	bundle_header_release(&header);
 
diff --git a/bundle-uri.c b/bundle-uri.c
index 0df66e2872..cdf9e4f9e1 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -367,6 +367,10 @@ static int unbundle_from_file(struct repository *r, const char *file)
 	struct string_list_item *refname;
 	struct strbuf bundle_ref = STRBUF_INIT;
 	size_t bundle_prefix_len;
+	struct unbundle_opts opts = {
+		.flags = VERIFY_BUNDLE_QUIET |
+			 (fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0),
+	};
 
 	bundle_fd = read_bundle_header(file, &header);
 	if (bundle_fd < 0) {
@@ -379,8 +383,7 @@ static int unbundle_from_file(struct repository *r, const char *file)
 	 * a reachable ref pointing to the new tips, which will reach
 	 * the prerequisite commits.
 	 */
-	result = unbundle(r, &header, bundle_fd, NULL,
-			  VERIFY_BUNDLE_QUIET | (fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0));
+	result = unbundle(r, &header, bundle_fd, NULL, &opts);
 	if (result) {
 		result = 1;
 		goto cleanup;
diff --git a/bundle.c b/bundle.c
index 4773b51eb1..485033ea3f 100644
--- a/bundle.c
+++ b/bundle.c
@@ -628,9 +628,13 @@ int create_bundle(struct repository *r, const char *path,
 
 int unbundle(struct repository *r, struct bundle_header *header,
 	     int bundle_fd, struct strvec *extra_index_pack_args,
-	     enum verify_bundle_flags flags)
+	     struct unbundle_opts *opts)
 {
 	struct child_process ip = CHILD_PROCESS_INIT;
+	enum verify_bundle_flags flags = 0;
+
+	if (opts)
+		flags = opts->flags;
 
 	if (verify_bundle(r, header, flags))
 		return -1;
diff --git a/bundle.h b/bundle.h
index 5ccc9a061a..6a09cc7bfb 100644
--- a/bundle.h
+++ b/bundle.h
@@ -39,6 +39,10 @@ enum verify_bundle_flags {
 int verify_bundle(struct repository *r, struct bundle_header *header,
 		  enum verify_bundle_flags flags);
 
+struct unbundle_opts {
+	enum verify_bundle_flags flags;
+};
+
 /**
  * Unbundle after reading the header with read_bundle_header().
  *
@@ -49,12 +53,12 @@ int verify_bundle(struct repository *r, struct bundle_header *header,
  * (e.g. "-v" for verbose/progress), NULL otherwise. The provided
  * "extra_index_pack_args" (if any) will be strvec_clear()'d for you.
  *
- * Before unbundling, this method will call verify_bundle() with the
- * given 'flags'.
+ * Before unbundling, this method will call verify_bundle() with 'flags'
+ * provided in 'opts'.
  */
 int unbundle(struct repository *r, struct bundle_header *header,
 	     int bundle_fd, struct strvec *extra_index_pack_args,
-	     enum verify_bundle_flags flags);
+	     struct unbundle_opts *opts);
 int list_bundle_refs(struct bundle_header *header,
 		int argc, const char **argv);
 
diff --git a/transport.c b/transport.c
index 47fda6a773..8536b14181 100644
--- a/transport.c
+++ b/transport.c
@@ -176,6 +176,9 @@ static int fetch_refs_from_bundle(struct transport *transport,
 				  int nr_heads UNUSED,
 				  struct ref **to_fetch UNUSED)
 {
+	struct unbundle_opts opts = {
+		.flags = fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0,
+	};
 	struct bundle_transport_data *data = transport->data;
 	struct strvec extra_index_pack_args = STRVEC_INIT;
 	int ret;
@@ -186,8 +189,7 @@ static int fetch_refs_from_bundle(struct transport *transport,
 	if (!data->get_refs_from_bundle_called)
 		get_refs_from_bundle_inner(transport);
 	ret = unbundle(the_repository, &data->header, data->fd,
-		       &extra_index_pack_args,
-		       fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0);
+		       &extra_index_pack_args, &opts);
 	transport->hash_algo = data->header.hash_algo;
 
 	strvec_clear(&extra_index_pack_args);
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 2/4] bundle: support fsck message configuration
  2024-11-27  0:57 ` [PATCH v2 0/4] propagate fsck message severity for " Justin Tobler
  2024-11-27  0:57   ` [PATCH v2 1/4] bundle: add bundle verification options type Justin Tobler
@ 2024-11-27  0:57   ` Justin Tobler
  2024-11-27  5:44     ` Patrick Steinhardt
  2024-11-27  0:57   ` [PATCH v2 3/4] fetch-pack: split out fsck config parsing Justin Tobler
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 41+ messages in thread
From: Justin Tobler @ 2024-11-27  0:57 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

If the `VERIFY_BUNDLE_FLAG` is set during `unbundle()`, the
git-index-pack(1) spawned is configured with the `--fsck-options` flag
to perform fsck verification. With this flag enabled, there is not a way
to configure fsck message severity though.

Extend the `unbundle_opts` type to store fsck message severity
configuration and update `unbundle()` to conditionally append it to the
`--fsck-objects` flag if provided. This enables `unbundle()` call sites
to support optionally setting the severity for specific fsck messages.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 bundle.c | 13 +++++++------
 bundle.h |  7 +++++++
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/bundle.c b/bundle.c
index 485033ea3f..4e53ddfca2 100644
--- a/bundle.c
+++ b/bundle.c
@@ -631,12 +631,12 @@ int unbundle(struct repository *r, struct bundle_header *header,
 	     struct unbundle_opts *opts)
 {
 	struct child_process ip = CHILD_PROCESS_INIT;
-	enum verify_bundle_flags flags = 0;
+	struct unbundle_opts opts_fallback = { 0 };
 
-	if (opts)
-		flags = opts->flags;
+	if (!opts)
+		opts = &opts_fallback;
 
-	if (verify_bundle(r, header, flags))
+	if (verify_bundle(r, header, opts->flags))
 		return -1;
 
 	strvec_pushl(&ip.args, "index-pack", "--fix-thin", "--stdin", NULL);
@@ -645,8 +645,9 @@ int unbundle(struct repository *r, struct bundle_header *header,
 	if (header->filter.choice)
 		strvec_push(&ip.args, "--promisor=from-bundle");
 
-	if (flags & VERIFY_BUNDLE_FSCK)
-		strvec_push(&ip.args, "--fsck-objects");
+	if (opts->flags & VERIFY_BUNDLE_FSCK)
+		strvec_pushf(&ip.args, "--fsck-objects%s",
+			     opts->fsck_msg_types ? opts->fsck_msg_types : "");
 
 	if (extra_index_pack_args)
 		strvec_pushv(&ip.args, extra_index_pack_args->v);
diff --git a/bundle.h b/bundle.h
index 6a09cc7bfb..df17326b09 100644
--- a/bundle.h
+++ b/bundle.h
@@ -41,6 +41,13 @@ int verify_bundle(struct repository *r, struct bundle_header *header,
 
 struct unbundle_opts {
 	enum verify_bundle_flags flags;
+	/**
+	 * fsck_msg_types may optionally contain fsck message severity
+	 * configuration. If present, this configuration gets directly appended
+	 * to a '--fsck-objects' option and therefore must be prefixed with '='.
+	 * (E.g. "=missingEmail=ignore,gitmodulesUrl=ignore")
+	 */
+	const char *fsck_msg_types;
 };
 
 /**
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 3/4] fetch-pack: split out fsck config parsing
  2024-11-27  0:57 ` [PATCH v2 0/4] propagate fsck message severity for " Justin Tobler
  2024-11-27  0:57   ` [PATCH v2 1/4] bundle: add bundle verification options type Justin Tobler
  2024-11-27  0:57   ` [PATCH v2 2/4] bundle: support fsck message configuration Justin Tobler
@ 2024-11-27  0:57   ` Justin Tobler
  2024-11-27  5:44     ` Patrick Steinhardt
  2024-11-27  0:57   ` [PATCH v2 4/4] transport: propagate fsck configuration during bundle fetch Justin Tobler
  2024-11-27 23:33   ` [PATCH v3 0/4] propagate fsck message severity for " Justin Tobler
  4 siblings, 1 reply; 41+ messages in thread
From: Justin Tobler @ 2024-11-27  0:57 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

When `fetch_pack_config()` is invoked, fetch-pack configuration is
parsed from the config. As part of this operation, fsck message severity
configuration is assigned to the `fsck_msg_types` global variable. This
is optionally used to configure the downstream git-index-pack(1) when
the `--strict` option is specified.

The same parsed fsck message severity configuration is also needed
outside of fetch-pack. Instead of exposing/relying on the existing
global state, split out the fsck config parsing logic into
`fetch_pack_fsck_config()` and expose it. In a subsequent commit, this
is used to provide fsck configuration when invoking `unbundle()`.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 fetch-pack.c | 24 +++++++++++++++++-------
 fetch-pack.h |  3 +++
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index fe1fb3c1b7..e7d4f6e6e2 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1857,8 +1857,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 	return ref;
 }
 
-static int fetch_pack_config_cb(const char *var, const char *value,
-				const struct config_context *ctx, void *cb)
+int fetch_pack_fsck_config(const char *var, const char *value,
+			   struct strbuf *msg_types)
 {
 	const char *msg_id;
 
@@ -1867,8 +1867,8 @@ static int fetch_pack_config_cb(const char *var, const char *value,
 
 		if (git_config_pathname(&path, var, value))
 			return 1;
-		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
-			fsck_msg_types.len ? ',' : '=', path);
+		strbuf_addf(msg_types, "%cskiplist=%s",
+			msg_types->len ? ',' : '=', path);
 		free(path);
 		return 0;
 	}
@@ -1877,14 +1877,24 @@ static int fetch_pack_config_cb(const char *var, const char *value,
 		if (!value)
 			return config_error_nonbool(var);
 		if (is_valid_msg_type(msg_id, value))
-			strbuf_addf(&fsck_msg_types, "%c%s=%s",
-				fsck_msg_types.len ? ',' : '=', msg_id, value);
+			strbuf_addf(msg_types, "%c%s=%s",
+				msg_types->len ? ',' : '=', msg_id, value);
 		else
 			warning("Skipping unknown msg id '%s'", msg_id);
 		return 0;
 	}
 
-	return git_default_config(var, value, ctx, cb);
+	return -1;
+}
+
+static int fetch_pack_config_cb(const char *var, const char *value,
+				const struct config_context *ctx, void *cb)
+{
+	int ret = fetch_pack_fsck_config(var, value, &fsck_msg_types);
+	if (ret < 0)
+		return git_default_config(var, value, ctx, cb);
+
+	return ret;
 }
 
 static void fetch_pack_config(void)
diff --git a/fetch-pack.h b/fetch-pack.h
index b5c579cdae..c667b6fbf3 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -106,4 +106,7 @@ int report_unmatched_refs(struct ref **sought, int nr_sought);
  */
 int fetch_pack_fsck_objects(void);
 
+int fetch_pack_fsck_config(const char *var, const char *value,
+			   struct strbuf *msg_types);
+
 #endif
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 4/4] transport: propagate fsck configuration during bundle fetch
  2024-11-27  0:57 ` [PATCH v2 0/4] propagate fsck message severity for " Justin Tobler
                     ` (2 preceding siblings ...)
  2024-11-27  0:57   ` [PATCH v2 3/4] fetch-pack: split out fsck config parsing Justin Tobler
@ 2024-11-27  0:57   ` Justin Tobler
  2024-11-27  1:39     ` Junio C Hamano
  2024-11-27 23:33   ` [PATCH v3 0/4] propagate fsck message severity for " Justin Tobler
  4 siblings, 1 reply; 41+ messages in thread
From: Justin Tobler @ 2024-11-27  0:57 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

When fetching directly from a bundle, fsck message severity
configuration is not propagated to the underlying git-index-pack(1). It
is only capable of enabling or disabling fsck checks entirely. This does
not align with the fsck behavior for fetches through git-fetch-pack(1).

Use the fsck config parsing from fetch-pack to populate fsck message
severity configuration and wire it through to `unbundle()` to enable the
same fsck verification as done through fetch-pack.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 t/t5607-clone-bundle.sh |  7 +++++++
 transport.c             | 20 ++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/t/t5607-clone-bundle.sh b/t/t5607-clone-bundle.sh
index 7ceaa8194d..c69aa88eae 100755
--- a/t/t5607-clone-bundle.sh
+++ b/t/t5607-clone-bundle.sh
@@ -171,6 +171,13 @@ test_expect_success 'clone bundle with different fsckObjects configurations' '
 
 	test_must_fail git -c transfer.fsckObjects=true \
 		clone bundle-fsck/bad.bundle bundle-transfer-fsck 2>err &&
+	test_grep "missingEmail" err &&
+
+	git -c fetch.fsckObjects=true -c fetch.fsck.missingEmail=ignore \
+		clone bundle-fsck/bad.bundle bundle-fsck-ignore &&
+
+	test_must_fail git -c fetch.fsckObjects=true -c fetch.fsck.missingEmail=error \
+		clone bundle-fsck/bad.bundle bundle-fsck-error 2>err &&
 	test_grep "missingEmail" err
 '
 
diff --git a/transport.c b/transport.c
index 8536b14181..0209cac5e2 100644
--- a/transport.c
+++ b/transport.c
@@ -19,6 +19,7 @@
 #include "branch.h"
 #include "url.h"
 #include "submodule.h"
+#include "strbuf.h"
 #include "string-list.h"
 #include "oid-array.h"
 #include "sigchain.h"
@@ -172,6 +173,19 @@ static struct ref *get_refs_from_bundle(struct transport *transport,
 	return result;
 }
 
+static int fetch_fsck_config_cb(const char *var, const char *value,
+				const struct config_context *ctx UNUSED, void *cb)
+{
+	struct strbuf *msg_types = cb;
+	int ret;
+
+	ret = fetch_pack_fsck_config(var, value, msg_types);
+	if (ret < 0)
+		return 0;
+
+	return ret;
+}
+
 static int fetch_refs_from_bundle(struct transport *transport,
 				  int nr_heads UNUSED,
 				  struct ref **to_fetch UNUSED)
@@ -181,6 +195,7 @@ static int fetch_refs_from_bundle(struct transport *transport,
 	};
 	struct bundle_transport_data *data = transport->data;
 	struct strvec extra_index_pack_args = STRVEC_INIT;
+	struct strbuf msg_types = STRBUF_INIT;
 	int ret;
 
 	if (transport->progress)
@@ -188,11 +203,16 @@ static int fetch_refs_from_bundle(struct transport *transport,
 
 	if (!data->get_refs_from_bundle_called)
 		get_refs_from_bundle_inner(transport);
+
+	git_config(fetch_fsck_config_cb, &msg_types);
+	opts.fsck_msg_types = msg_types.buf;
+
 	ret = unbundle(the_repository, &data->header, data->fd,
 		       &extra_index_pack_args, &opts);
 	transport->hash_algo = data->header.hash_algo;
 
 	strvec_clear(&extra_index_pack_args);
+	strbuf_release(&msg_types);
 	return ret;
 }
 
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH 4/5] fetch-pack: expose `fetch_pack_config_cb()`
  2024-11-22 16:45   ` shejialuo
@ 2024-11-27  1:21     ` Justin Tobler
  0 siblings, 0 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-27  1:21 UTC (permalink / raw)
  To: shejialuo; +Cc: git

On 24/11/23 12:45AM, shejialuo wrote:
> On Thu, Nov 21, 2024 at 02:41:18PM -0600, Justin Tobler wrote:
> > During fetch-pack operations, git-index-pack(1) may be spawned and
> > perform fsck checks. The message severity of these checks is
> > configurable and propagated via appending it to the `--fsck-objects`
> > option.
> > 
> > With `fetch_pack_config_cb()`, fsck configuration gets populated to a
> > `fetch_pack_options`. Expose `fetch_pack_config_cb()`, to facilitate
> > formatted fsck message configuration generation. In a subsequent commit,
> > this is used to wire message configuration to `unbundle()` during bundle
> > fetches.
> > 
> 
> In my perspective, we may not separate [PATCH 4/5] and [PATCH 5/5].
> Should the reason why we want to expose `fetch_pack_config_cb` is that
> we need to propagate the fsck severity to `unbundle`? Without the
> information of the last patch, we cannot know any detail thing. So, they
> are highly relevant.

Thanks for the feedback. Due to other suggested changes, I've opted to
condense the third and fourth patch into one in the next version which I
think helps somewhat. I've also tried to update the commit messages to
provide additional context. Hopefully this helps :)

-Justin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 4/4] transport: propagate fsck configuration during bundle fetch
  2024-11-27  0:57   ` [PATCH v2 4/4] transport: propagate fsck configuration during bundle fetch Justin Tobler
@ 2024-11-27  1:39     ` Junio C Hamano
  0 siblings, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2024-11-27  1:39 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

Justin Tobler <jltobler@gmail.com> writes:

> +static int fetch_fsck_config_cb(const char *var, const char *value,
> +				const struct config_context *ctx UNUSED, void *cb)
> +{
> +	struct strbuf *msg_types = cb;
> +	int ret;
> +
> +	ret = fetch_pack_fsck_config(var, value, msg_types);
> +	if (ret < 0)
> +		return 0;
> +
> +	return ret;
> +}

Looks good.  Both callers of fetch_pack_fsck_config() must deal with
the unusual -1 return from it that signals "I didn't find what I
care", and the original caller does so by calling the default callback
while this one just says "everything is well".

I wasn't carefully looking for typoes and style violations, but the
logic flow in all these four patches looked perfect in this round.

Thanks for a pleasant read.  Will queue.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 2/4] bundle: support fsck message configuration
  2024-11-27  0:57   ` [PATCH v2 2/4] bundle: support fsck message configuration Justin Tobler
@ 2024-11-27  5:44     ` Patrick Steinhardt
  0 siblings, 0 replies; 41+ messages in thread
From: Patrick Steinhardt @ 2024-11-27  5:44 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

On Tue, Nov 26, 2024 at 06:57:05PM -0600, Justin Tobler wrote:
> diff --git a/bundle.c b/bundle.c
> index 485033ea3f..4e53ddfca2 100644
> --- a/bundle.c
> +++ b/bundle.c
> @@ -631,12 +631,12 @@ int unbundle(struct repository *r, struct bundle_header *header,
>  	     struct unbundle_opts *opts)
>  {
>  	struct child_process ip = CHILD_PROCESS_INIT;
> -	enum verify_bundle_flags flags = 0;
> +	struct unbundle_opts opts_fallback = { 0 };
>  
> -	if (opts)
> -		flags = opts->flags;
> +	if (!opts)
> +		opts = &opts_fallback;

Tiny nit: you could've introduced the fallback in the first commit
already. Like this you first introduce the code pattern and then change
it immediately in the subsequent commit.

Not worth a reroll though.

> diff --git a/bundle.h b/bundle.h
> index 6a09cc7bfb..df17326b09 100644
> --- a/bundle.h
> +++ b/bundle.h
> @@ -41,6 +41,13 @@ int verify_bundle(struct repository *r, struct bundle_header *header,
>  
>  struct unbundle_opts {
>  	enum verify_bundle_flags flags;
> +	/**

Nit: s|/**/|/*|

Again, not worth a reroll from my point of view, also with the recent
discussion at <877c8yti5n.fsf@iotcl.com> in mind where we basically went
"We don't care about them".

Patrick

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 3/4] fetch-pack: split out fsck config parsing
  2024-11-27  0:57   ` [PATCH v2 3/4] fetch-pack: split out fsck config parsing Justin Tobler
@ 2024-11-27  5:44     ` Patrick Steinhardt
  2024-11-27 17:37       ` Justin Tobler
  0 siblings, 1 reply; 41+ messages in thread
From: Patrick Steinhardt @ 2024-11-27  5:44 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

On Tue, Nov 26, 2024 at 06:57:06PM -0600, Justin Tobler wrote:
> diff --git a/fetch-pack.c b/fetch-pack.c
> index fe1fb3c1b7..e7d4f6e6e2 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1877,14 +1877,24 @@ static int fetch_pack_config_cb(const char *var, const char *value,
>  		if (!value)
>  			return config_error_nonbool(var);
>  		if (is_valid_msg_type(msg_id, value))
> -			strbuf_addf(&fsck_msg_types, "%c%s=%s",
> -				fsck_msg_types.len ? ',' : '=', msg_id, value);
> +			strbuf_addf(msg_types, "%c%s=%s",
> +				msg_types->len ? ',' : '=', msg_id, value);
>  		else
>  			warning("Skipping unknown msg id '%s'", msg_id);
>  		return 0;
>  	}
>  
> -	return git_default_config(var, value, ctx, cb);
> +	return -1;
> +}
> +
> +static int fetch_pack_config_cb(const char *var, const char *value,
> +				const struct config_context *ctx, void *cb)
> +{
> +	int ret = fetch_pack_fsck_config(var, value, &fsck_msg_types);
> +	if (ret < 0)
> +		return git_default_config(var, value, ctx, cb);
> +
> +	return ret;
>  }
>  
>  static void fetch_pack_config(void)

Okay, this now reads a lot nicer. But I'm sceptical whether we should
return -1 for the case where the value wasn't handled because we now
cannot discern the case where the function returns an error from the
case where it simply didn't handle the value.

In fact, we cannot even use positive values right now as far as I can
see:

  - We return 0 on success.

  - We return 1 in case `git_config_pathname()` indicates an error.

  - We return -1 when there is no value with "fetch.fsck.".

I'd propose to have a look at whether the positive return value from the
second case is actually used anywhere. If not, we can refactor this case
so that we always return negative on errors. And then we could further
adapt the function to return positive in case it didn't handle the
value.

> diff --git a/fetch-pack.h b/fetch-pack.h
> index b5c579cdae..c667b6fbf3 100644
> --- a/fetch-pack.h
> +++ b/fetch-pack.h
> @@ -106,4 +106,7 @@ int report_unmatched_refs(struct ref **sought, int nr_sought);
>   */
>  int fetch_pack_fsck_objects(void);
>  
> +int fetch_pack_fsck_config(const char *var, const char *value,
> +			   struct strbuf *msg_types);

We should also add some docs here, at the least to document the error
codes.

Patrick

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 3/4] fetch-pack: split out fsck config parsing
  2024-11-27  5:44     ` Patrick Steinhardt
@ 2024-11-27 17:37       ` Justin Tobler
  0 siblings, 0 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-27 17:37 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 24/11/27 06:44AM, Patrick Steinhardt wrote:
> On Tue, Nov 26, 2024 at 06:57:06PM -0600, Justin Tobler wrote:
> > diff --git a/fetch-pack.c b/fetch-pack.c
> > index fe1fb3c1b7..e7d4f6e6e2 100644
> > --- a/fetch-pack.c
> > +++ b/fetch-pack.c
> > @@ -1877,14 +1877,24 @@ static int fetch_pack_config_cb(const char *var, const char *value,
> >  		if (!value)
> >  			return config_error_nonbool(var);
> >  		if (is_valid_msg_type(msg_id, value))
> > -			strbuf_addf(&fsck_msg_types, "%c%s=%s",
> > -				fsck_msg_types.len ? ',' : '=', msg_id, value);
> > +			strbuf_addf(msg_types, "%c%s=%s",
> > +				msg_types->len ? ',' : '=', msg_id, value);
> >  		else
> >  			warning("Skipping unknown msg id '%s'", msg_id);
> >  		return 0;
> >  	}
> >  
> > -	return git_default_config(var, value, ctx, cb);
> > +	return -1;
> > +}
> > +
> > +static int fetch_pack_config_cb(const char *var, const char *value,
> > +				const struct config_context *ctx, void *cb)
> > +{
> > +	int ret = fetch_pack_fsck_config(var, value, &fsck_msg_types);
> > +	if (ret < 0)
> > +		return git_default_config(var, value, ctx, cb);
> > +
> > +	return ret;
> >  }
> >  
> >  static void fetch_pack_config(void)
> 
> Okay, this now reads a lot nicer. But I'm sceptical whether we should
> return -1 for the case where the value wasn't handled because we now
> cannot discern the case where the function returns an error from the
> case where it simply didn't handle the value.
> 
> In fact, we cannot even use positive values right now as far as I can
> see:
> 
>   - We return 0 on success.
> 
>   - We return 1 in case `git_config_pathname()` indicates an error.
> 
>   - We return -1 when there is no value with "fetch.fsck.".
> 
> I'd propose to have a look at whether the positive return value from the
> second case is actually used anywhere. If not, we can refactor this case
> so that we always return negative on errors. And then we could further
> adapt the function to return positive in case it didn't handle the
> value.

The `fetch_pack_fsck_config()` function is only used through callback
functions for `git_config()`. The `config_fn_t` callback is expected to
return 0 for success, or -1 if the variable could not be parsed
properly. Taking a look at `configset_iter()`, in practice any return
value >= 0 is treated as success.

This means that because the `git_config_pathname()` errors in
`fetch_pack_fsck_config()` return 1, an error is printed, but the
operation is able to proceed.

  # Skip list is missing path value.
  $ git -c fetch.fsck.skipList fetch
  error: missing value for 'fetch.fsck.skiplist'
  <fetch continues...>

This skip list flag behavior is consistent across other fsck operations.
Changing it here to return -1 would make config parsing more strict and
result in a missing skip list path causing the whole operation to die.

Since returning 1 is treated as a success anyway, we could change it to
return 0. This would result in functionally the same behavior and free
up `fetch_pack_fsck_config()` to return 1 to indicate that there is no
value with "fetch.fsck.". This would allow us to properly differentiate
between the "didn't handle the value" and error cases.

I'll send another version making this change and add some comments to
document the behavior.

-Justin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v3 0/4] propagate fsck message severity for bundle fetch
  2024-11-27  0:57 ` [PATCH v2 0/4] propagate fsck message severity for " Justin Tobler
                     ` (3 preceding siblings ...)
  2024-11-27  0:57   ` [PATCH v2 4/4] transport: propagate fsck configuration during bundle fetch Justin Tobler
@ 2024-11-27 23:33   ` Justin Tobler
  2024-11-27 23:33     ` [PATCH v3 1/4] bundle: add bundle verification options type Justin Tobler
                       ` (3 more replies)
  4 siblings, 4 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-27 23:33 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

Greetings,

With 63d903ff52 (unbundle: extend object verification for fetches,
2024-06-19), fsck checks are now performed on fetched bundles depending
on `transfer.fsckObjects` and `fetch.fsckObjects` configuration. This
works, but provides no means to override the default fsck message
severity as is done for other git-fetch(1) operations. This series aims
to propagate fsck message severity configuration to the underlying
git-index-pack(1) process executed on the bundle in a similar manner as
done with git-fetch-pack(1).

  - Patches 1 and 2 adapt the bundle subsystem to support additional
    options when performing `unbundle()`.

  - Patch 3 adapts the fetch-pack subsystem to expose a means to
    generate the message configuration arguments.

  - Patch 4 wires the newly generated fsck configuration options to the
    bundle options used when fetching from a bundle.

Changes in V3:

  - The `fetch_pack_fsck_config()` has been updated to now return 1 when
    the provided config variable is undhandled instead of returning -1.
    This allows call sites to properly differentiate between errors and
    unhandled config variables. To make this change, the handling of
    `git_config_path()` errors was updated to return 0 instead of 1.
    Both of these return values considered by `git_config()` to be a
    success so there is no functional change. This allows returning 1 to
    now indicate that the config variable was not process only.

  - Added comment to document expected `fetch_pack_fsck_config()`
    behavior and return values.

  - Small comment style change.

Thanks
-Justin

Justin Tobler (4):
  bundle: add bundle verification options type
  bundle: support fsck message configuration
  fetch-pack: split out fsck config parsing
  transport: propagate fsck configuration during bundle fetch

 builtin/bundle.c        |  2 +-
 bundle-uri.c            |  7 +++++--
 bundle.c                | 13 +++++++++----
 bundle.h                | 17 ++++++++++++++---
 fetch-pack.c            | 26 ++++++++++++++++++--------
 fetch-pack.h            | 11 +++++++++++
 t/t5607-clone-bundle.sh |  7 +++++++
 transport.c             | 26 ++++++++++++++++++++++++--
 8 files changed, 89 insertions(+), 20 deletions(-)

Range-diff against v2:
1:  da47f0aa0f = 1:  da47f0aa0f bundle: add bundle verification options type
2:  19e91c9f99 ! 2:  5dbd0fa6b7 bundle: support fsck message configuration
    @@ bundle.h: int verify_bundle(struct repository *r, struct bundle_header *header,
      
      struct unbundle_opts {
      	enum verify_bundle_flags flags;
    -+	/**
    ++	/*
     +	 * fsck_msg_types may optionally contain fsck message severity
     +	 * configuration. If present, this configuration gets directly appended
     +	 * to a '--fsck-objects' option and therefore must be prefixed with '='.
3:  527874e73d ! 3:  b8db9af9e7 fetch-pack: split out fsck config parsing
    @@ Commit message
         `fetch_pack_fsck_config()` and expose it. In a subsequent commit, this
         is used to provide fsck configuration when invoking `unbundle()`.
     
    +    For `fetch_pack_fsck_config()` to discern between errors and unhandled
    +    config variables, the return code when `git_config_path()` errors is
    +    changed to a different value also indicating success. This frees up the
    +    previous return code to now indicate the provided config variable
    +    was unhandled. The behavior remains functionally the same.
    +
         Signed-off-by: Justin Tobler <jltobler@gmail.com>
     
      ## fetch-pack.c ##
    @@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
      	const char *msg_id;
      
     @@ fetch-pack.c: static int fetch_pack_config_cb(const char *var, const char *value,
    + 		char *path ;
      
      		if (git_config_pathname(&path, var, value))
    - 			return 1;
    +-			return 1;
     -		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
     -			fsck_msg_types.len ? ',' : '=', path);
    ++			return 0;
     +		strbuf_addf(msg_types, "%cskiplist=%s",
     +			msg_types->len ? ',' : '=', path);
      		free(path);
    @@ fetch-pack.c: static int fetch_pack_config_cb(const char *var, const char *value
      	}
      
     -	return git_default_config(var, value, ctx, cb);
    -+	return -1;
    ++	return 1;
     +}
     +
     +static int fetch_pack_config_cb(const char *var, const char *value,
     +				const struct config_context *ctx, void *cb)
     +{
     +	int ret = fetch_pack_fsck_config(var, value, &fsck_msg_types);
    -+	if (ret < 0)
    ++	if (ret > 0)
     +		return git_default_config(var, value, ctx, cb);
     +
     +	return ret;
    @@ fetch-pack.h: int report_unmatched_refs(struct ref **sought, int nr_sought);
       */
      int fetch_pack_fsck_objects(void);
      
    ++/*
    ++ * Check if the provided config variable pertains to fetch fsck and if so append
    ++ * the configuration to the provided strbuf.
    ++ *
    ++ * When a fetch fsck config option is successfully processed the function
    ++ * returns 0. If the provided config option is unrelated to fetch fsck, 1 is
    ++ * returned. Errors return -1.
    ++ */
     +int fetch_pack_fsck_config(const char *var, const char *value,
     +			   struct strbuf *msg_types);
     +
4:  b1a3f73561 ! 4:  cc8ae0a1c4 transport: propagate fsck configuration during bundle fetch
    @@ transport.c: static struct ref *get_refs_from_bundle(struct transport *transport
     +	int ret;
     +
     +	ret = fetch_pack_fsck_config(var, value, msg_types);
    -+	if (ret < 0)
    ++	if (ret > 0)
     +		return 0;
     +
     +	return ret;

base-commit: 4083a6f05206077a50af7658bedc17a94c54607d
-- 
2.47.0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v3 1/4] bundle: add bundle verification options type
  2024-11-27 23:33   ` [PATCH v3 0/4] propagate fsck message severity for " Justin Tobler
@ 2024-11-27 23:33     ` Justin Tobler
  2024-11-27 23:33     ` [PATCH v3 2/4] bundle: support fsck message configuration Justin Tobler
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-27 23:33 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

When `unbundle()` is invoked, fsck verification may be configured by
passing the `VERIFY_BUNDLE_FSCK` flag. This mechanism allows fsck checks
on the bundle to be enabled or disabled entirely. To facilitate more
fine-grained fsck configuration, additional context must be provided to
`unbundle()`.

Introduce the `unbundle_opts` type, which wraps the existing
`verify_bundle_flags`, to facilitate future extension of `unbundle()`
configuration. Also update `unbundle()` and its call sites to accept
this new options type instead of the flags directly. The end behavior is
functionally the same, but allows for the set of configurable options to
be extended. This is leveraged in a subsequent commit to enable fsck
message severity configuration.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 builtin/bundle.c |  2 +-
 bundle-uri.c     |  7 +++++--
 bundle.c         |  6 +++++-
 bundle.h         | 10 +++++++---
 transport.c      |  6 ++++--
 5 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/builtin/bundle.c b/builtin/bundle.c
index 127518c2a8..15ac75ab51 100644
--- a/builtin/bundle.c
+++ b/builtin/bundle.c
@@ -218,7 +218,7 @@ static int cmd_bundle_unbundle(int argc, const char **argv, const char *prefix)
 		strvec_pushl(&extra_index_pack_args, "-v", "--progress-title",
 			     _("Unbundling objects"), NULL);
 	ret = !!unbundle(the_repository, &header, bundle_fd,
-			 &extra_index_pack_args, 0) ||
+			 &extra_index_pack_args, NULL) ||
 		list_bundle_refs(&header, argc, argv);
 	bundle_header_release(&header);
 
diff --git a/bundle-uri.c b/bundle-uri.c
index 0df66e2872..cdf9e4f9e1 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -367,6 +367,10 @@ static int unbundle_from_file(struct repository *r, const char *file)
 	struct string_list_item *refname;
 	struct strbuf bundle_ref = STRBUF_INIT;
 	size_t bundle_prefix_len;
+	struct unbundle_opts opts = {
+		.flags = VERIFY_BUNDLE_QUIET |
+			 (fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0),
+	};
 
 	bundle_fd = read_bundle_header(file, &header);
 	if (bundle_fd < 0) {
@@ -379,8 +383,7 @@ static int unbundle_from_file(struct repository *r, const char *file)
 	 * a reachable ref pointing to the new tips, which will reach
 	 * the prerequisite commits.
 	 */
-	result = unbundle(r, &header, bundle_fd, NULL,
-			  VERIFY_BUNDLE_QUIET | (fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0));
+	result = unbundle(r, &header, bundle_fd, NULL, &opts);
 	if (result) {
 		result = 1;
 		goto cleanup;
diff --git a/bundle.c b/bundle.c
index 4773b51eb1..485033ea3f 100644
--- a/bundle.c
+++ b/bundle.c
@@ -628,9 +628,13 @@ int create_bundle(struct repository *r, const char *path,
 
 int unbundle(struct repository *r, struct bundle_header *header,
 	     int bundle_fd, struct strvec *extra_index_pack_args,
-	     enum verify_bundle_flags flags)
+	     struct unbundle_opts *opts)
 {
 	struct child_process ip = CHILD_PROCESS_INIT;
+	enum verify_bundle_flags flags = 0;
+
+	if (opts)
+		flags = opts->flags;
 
 	if (verify_bundle(r, header, flags))
 		return -1;
diff --git a/bundle.h b/bundle.h
index 5ccc9a061a..6a09cc7bfb 100644
--- a/bundle.h
+++ b/bundle.h
@@ -39,6 +39,10 @@ enum verify_bundle_flags {
 int verify_bundle(struct repository *r, struct bundle_header *header,
 		  enum verify_bundle_flags flags);
 
+struct unbundle_opts {
+	enum verify_bundle_flags flags;
+};
+
 /**
  * Unbundle after reading the header with read_bundle_header().
  *
@@ -49,12 +53,12 @@ int verify_bundle(struct repository *r, struct bundle_header *header,
  * (e.g. "-v" for verbose/progress), NULL otherwise. The provided
  * "extra_index_pack_args" (if any) will be strvec_clear()'d for you.
  *
- * Before unbundling, this method will call verify_bundle() with the
- * given 'flags'.
+ * Before unbundling, this method will call verify_bundle() with 'flags'
+ * provided in 'opts'.
  */
 int unbundle(struct repository *r, struct bundle_header *header,
 	     int bundle_fd, struct strvec *extra_index_pack_args,
-	     enum verify_bundle_flags flags);
+	     struct unbundle_opts *opts);
 int list_bundle_refs(struct bundle_header *header,
 		int argc, const char **argv);
 
diff --git a/transport.c b/transport.c
index 47fda6a773..8536b14181 100644
--- a/transport.c
+++ b/transport.c
@@ -176,6 +176,9 @@ static int fetch_refs_from_bundle(struct transport *transport,
 				  int nr_heads UNUSED,
 				  struct ref **to_fetch UNUSED)
 {
+	struct unbundle_opts opts = {
+		.flags = fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0,
+	};
 	struct bundle_transport_data *data = transport->data;
 	struct strvec extra_index_pack_args = STRVEC_INIT;
 	int ret;
@@ -186,8 +189,7 @@ static int fetch_refs_from_bundle(struct transport *transport,
 	if (!data->get_refs_from_bundle_called)
 		get_refs_from_bundle_inner(transport);
 	ret = unbundle(the_repository, &data->header, data->fd,
-		       &extra_index_pack_args,
-		       fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0);
+		       &extra_index_pack_args, &opts);
 	transport->hash_algo = data->header.hash_algo;
 
 	strvec_clear(&extra_index_pack_args);
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v3 2/4] bundle: support fsck message configuration
  2024-11-27 23:33   ` [PATCH v3 0/4] propagate fsck message severity for " Justin Tobler
  2024-11-27 23:33     ` [PATCH v3 1/4] bundle: add bundle verification options type Justin Tobler
@ 2024-11-27 23:33     ` Justin Tobler
  2024-11-27 23:33     ` [PATCH v3 3/4] fetch-pack: split out fsck config parsing Justin Tobler
  2024-11-27 23:33     ` [PATCH v3 4/4] transport: propagate fsck configuration during bundle fetch Justin Tobler
  3 siblings, 0 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-27 23:33 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

If the `VERIFY_BUNDLE_FLAG` is set during `unbundle()`, the
git-index-pack(1) spawned is configured with the `--fsck-options` flag
to perform fsck verification. With this flag enabled, there is not a way
to configure fsck message severity though.

Extend the `unbundle_opts` type to store fsck message severity
configuration and update `unbundle()` to conditionally append it to the
`--fsck-objects` flag if provided. This enables `unbundle()` call sites
to support optionally setting the severity for specific fsck messages.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 bundle.c | 13 +++++++------
 bundle.h |  7 +++++++
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/bundle.c b/bundle.c
index 485033ea3f..4e53ddfca2 100644
--- a/bundle.c
+++ b/bundle.c
@@ -631,12 +631,12 @@ int unbundle(struct repository *r, struct bundle_header *header,
 	     struct unbundle_opts *opts)
 {
 	struct child_process ip = CHILD_PROCESS_INIT;
-	enum verify_bundle_flags flags = 0;
+	struct unbundle_opts opts_fallback = { 0 };
 
-	if (opts)
-		flags = opts->flags;
+	if (!opts)
+		opts = &opts_fallback;
 
-	if (verify_bundle(r, header, flags))
+	if (verify_bundle(r, header, opts->flags))
 		return -1;
 
 	strvec_pushl(&ip.args, "index-pack", "--fix-thin", "--stdin", NULL);
@@ -645,8 +645,9 @@ int unbundle(struct repository *r, struct bundle_header *header,
 	if (header->filter.choice)
 		strvec_push(&ip.args, "--promisor=from-bundle");
 
-	if (flags & VERIFY_BUNDLE_FSCK)
-		strvec_push(&ip.args, "--fsck-objects");
+	if (opts->flags & VERIFY_BUNDLE_FSCK)
+		strvec_pushf(&ip.args, "--fsck-objects%s",
+			     opts->fsck_msg_types ? opts->fsck_msg_types : "");
 
 	if (extra_index_pack_args)
 		strvec_pushv(&ip.args, extra_index_pack_args->v);
diff --git a/bundle.h b/bundle.h
index 6a09cc7bfb..a80aa8ad9b 100644
--- a/bundle.h
+++ b/bundle.h
@@ -41,6 +41,13 @@ int verify_bundle(struct repository *r, struct bundle_header *header,
 
 struct unbundle_opts {
 	enum verify_bundle_flags flags;
+	/*
+	 * fsck_msg_types may optionally contain fsck message severity
+	 * configuration. If present, this configuration gets directly appended
+	 * to a '--fsck-objects' option and therefore must be prefixed with '='.
+	 * (E.g. "=missingEmail=ignore,gitmodulesUrl=ignore")
+	 */
+	const char *fsck_msg_types;
 };
 
 /**
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v3 3/4] fetch-pack: split out fsck config parsing
  2024-11-27 23:33   ` [PATCH v3 0/4] propagate fsck message severity for " Justin Tobler
  2024-11-27 23:33     ` [PATCH v3 1/4] bundle: add bundle verification options type Justin Tobler
  2024-11-27 23:33     ` [PATCH v3 2/4] bundle: support fsck message configuration Justin Tobler
@ 2024-11-27 23:33     ` Justin Tobler
  2024-11-28  3:25       ` Junio C Hamano
  2024-11-27 23:33     ` [PATCH v3 4/4] transport: propagate fsck configuration during bundle fetch Justin Tobler
  3 siblings, 1 reply; 41+ messages in thread
From: Justin Tobler @ 2024-11-27 23:33 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

When `fetch_pack_config()` is invoked, fetch-pack configuration is
parsed from the config. As part of this operation, fsck message severity
configuration is assigned to the `fsck_msg_types` global variable. This
is optionally used to configure the downstream git-index-pack(1) when
the `--strict` option is specified.

The same parsed fsck message severity configuration is also needed
outside of fetch-pack. Instead of exposing/relying on the existing
global state, split out the fsck config parsing logic into
`fetch_pack_fsck_config()` and expose it. In a subsequent commit, this
is used to provide fsck configuration when invoking `unbundle()`.

For `fetch_pack_fsck_config()` to discern between errors and unhandled
config variables, the return code when `git_config_path()` errors is
changed to a different value also indicating success. This frees up the
previous return code to now indicate the provided config variable
was unhandled. The behavior remains functionally the same.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 fetch-pack.c | 26 ++++++++++++++++++--------
 fetch-pack.h | 11 +++++++++++
 2 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index fe1fb3c1b7..c095f3a84b 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1857,8 +1857,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 	return ref;
 }
 
-static int fetch_pack_config_cb(const char *var, const char *value,
-				const struct config_context *ctx, void *cb)
+int fetch_pack_fsck_config(const char *var, const char *value,
+			   struct strbuf *msg_types)
 {
 	const char *msg_id;
 
@@ -1866,9 +1866,9 @@ static int fetch_pack_config_cb(const char *var, const char *value,
 		char *path ;
 
 		if (git_config_pathname(&path, var, value))
-			return 1;
-		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
-			fsck_msg_types.len ? ',' : '=', path);
+			return 0;
+		strbuf_addf(msg_types, "%cskiplist=%s",
+			msg_types->len ? ',' : '=', path);
 		free(path);
 		return 0;
 	}
@@ -1877,14 +1877,24 @@ static int fetch_pack_config_cb(const char *var, const char *value,
 		if (!value)
 			return config_error_nonbool(var);
 		if (is_valid_msg_type(msg_id, value))
-			strbuf_addf(&fsck_msg_types, "%c%s=%s",
-				fsck_msg_types.len ? ',' : '=', msg_id, value);
+			strbuf_addf(msg_types, "%c%s=%s",
+				msg_types->len ? ',' : '=', msg_id, value);
 		else
 			warning("Skipping unknown msg id '%s'", msg_id);
 		return 0;
 	}
 
-	return git_default_config(var, value, ctx, cb);
+	return 1;
+}
+
+static int fetch_pack_config_cb(const char *var, const char *value,
+				const struct config_context *ctx, void *cb)
+{
+	int ret = fetch_pack_fsck_config(var, value, &fsck_msg_types);
+	if (ret > 0)
+		return git_default_config(var, value, ctx, cb);
+
+	return ret;
 }
 
 static void fetch_pack_config(void)
diff --git a/fetch-pack.h b/fetch-pack.h
index b5c579cdae..9d3470366f 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -106,4 +106,15 @@ int report_unmatched_refs(struct ref **sought, int nr_sought);
  */
 int fetch_pack_fsck_objects(void);
 
+/*
+ * Check if the provided config variable pertains to fetch fsck and if so append
+ * the configuration to the provided strbuf.
+ *
+ * When a fetch fsck config option is successfully processed the function
+ * returns 0. If the provided config option is unrelated to fetch fsck, 1 is
+ * returned. Errors return -1.
+ */
+int fetch_pack_fsck_config(const char *var, const char *value,
+			   struct strbuf *msg_types);
+
 #endif
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v3 4/4] transport: propagate fsck configuration during bundle fetch
  2024-11-27 23:33   ` [PATCH v3 0/4] propagate fsck message severity for " Justin Tobler
                       ` (2 preceding siblings ...)
  2024-11-27 23:33     ` [PATCH v3 3/4] fetch-pack: split out fsck config parsing Justin Tobler
@ 2024-11-27 23:33     ` Justin Tobler
  3 siblings, 0 replies; 41+ messages in thread
From: Justin Tobler @ 2024-11-27 23:33 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler

When fetching directly from a bundle, fsck message severity
configuration is not propagated to the underlying git-index-pack(1). It
is only capable of enabling or disabling fsck checks entirely. This does
not align with the fsck behavior for fetches through git-fetch-pack(1).

Use the fsck config parsing from fetch-pack to populate fsck message
severity configuration and wire it through to `unbundle()` to enable the
same fsck verification as done through fetch-pack.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 t/t5607-clone-bundle.sh |  7 +++++++
 transport.c             | 20 ++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/t/t5607-clone-bundle.sh b/t/t5607-clone-bundle.sh
index 7ceaa8194d..c69aa88eae 100755
--- a/t/t5607-clone-bundle.sh
+++ b/t/t5607-clone-bundle.sh
@@ -171,6 +171,13 @@ test_expect_success 'clone bundle with different fsckObjects configurations' '
 
 	test_must_fail git -c transfer.fsckObjects=true \
 		clone bundle-fsck/bad.bundle bundle-transfer-fsck 2>err &&
+	test_grep "missingEmail" err &&
+
+	git -c fetch.fsckObjects=true -c fetch.fsck.missingEmail=ignore \
+		clone bundle-fsck/bad.bundle bundle-fsck-ignore &&
+
+	test_must_fail git -c fetch.fsckObjects=true -c fetch.fsck.missingEmail=error \
+		clone bundle-fsck/bad.bundle bundle-fsck-error 2>err &&
 	test_grep "missingEmail" err
 '
 
diff --git a/transport.c b/transport.c
index 8536b14181..6966df51a8 100644
--- a/transport.c
+++ b/transport.c
@@ -19,6 +19,7 @@
 #include "branch.h"
 #include "url.h"
 #include "submodule.h"
+#include "strbuf.h"
 #include "string-list.h"
 #include "oid-array.h"
 #include "sigchain.h"
@@ -172,6 +173,19 @@ static struct ref *get_refs_from_bundle(struct transport *transport,
 	return result;
 }
 
+static int fetch_fsck_config_cb(const char *var, const char *value,
+				const struct config_context *ctx UNUSED, void *cb)
+{
+	struct strbuf *msg_types = cb;
+	int ret;
+
+	ret = fetch_pack_fsck_config(var, value, msg_types);
+	if (ret > 0)
+		return 0;
+
+	return ret;
+}
+
 static int fetch_refs_from_bundle(struct transport *transport,
 				  int nr_heads UNUSED,
 				  struct ref **to_fetch UNUSED)
@@ -181,6 +195,7 @@ static int fetch_refs_from_bundle(struct transport *transport,
 	};
 	struct bundle_transport_data *data = transport->data;
 	struct strvec extra_index_pack_args = STRVEC_INIT;
+	struct strbuf msg_types = STRBUF_INIT;
 	int ret;
 
 	if (transport->progress)
@@ -188,11 +203,16 @@ static int fetch_refs_from_bundle(struct transport *transport,
 
 	if (!data->get_refs_from_bundle_called)
 		get_refs_from_bundle_inner(transport);
+
+	git_config(fetch_fsck_config_cb, &msg_types);
+	opts.fsck_msg_types = msg_types.buf;
+
 	ret = unbundle(the_repository, &data->header, data->fd,
 		       &extra_index_pack_args, &opts);
 	transport->hash_algo = data->header.hash_algo;
 
 	strvec_clear(&extra_index_pack_args);
+	strbuf_release(&msg_types);
 	return ret;
 }
 
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/4] fetch-pack: split out fsck config parsing
  2024-11-27 23:33     ` [PATCH v3 3/4] fetch-pack: split out fsck config parsing Justin Tobler
@ 2024-11-28  3:25       ` Junio C Hamano
  2024-12-03  9:34         ` Patrick Steinhardt
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2024-11-28  3:25 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

Justin Tobler <jltobler@gmail.com> writes:

> For `fetch_pack_fsck_config()` to discern between errors and unhandled
> config variables, the return code when `git_config_path()` errors is
> changed to a different value also indicating success. This frees up the
> previous return code to now indicate the provided config variable
> was unhandled. The behavior remains functionally the same.

> @@ -1866,9 +1866,9 @@ static int fetch_pack_config_cb(const char *var, const char *value,
>  		char *path ;
>  
>  		if (git_config_pathname(&path, var, value))
> -			return 1;
> +			return 0;

So, after getting complaint about a misconfiguration, the caller
used to be able to, if they wanted to, tell two cases apart: a
misconfigured variable was ignored here, and we happily parsed the
configuration.  Now they no longer can tell them apart, because we
return 0 for both cases.

> -		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
> -			fsck_msg_types.len ? ',' : '=', path);
> +		strbuf_addf(msg_types, "%cskiplist=%s",
> +			msg_types->len ? ',' : '=', path);
>  		free(path);
>  		return 0;
>  	}
> @@ -1877,14 +1877,24 @@ static int fetch_pack_config_cb(const char *var, const char *value,
>  		if (!value)
>  			return config_error_nonbool(var);
>  		if (is_valid_msg_type(msg_id, value))
> -			strbuf_addf(&fsck_msg_types, "%c%s=%s",
> -				fsck_msg_types.len ? ',' : '=', msg_id, value);
> +			strbuf_addf(msg_types, "%c%s=%s",
> +				msg_types->len ? ',' : '=', msg_id, value);
>  		else
>  			warning("Skipping unknown msg id '%s'", msg_id);
>  		return 0;
>  	}
>  
> -	return git_default_config(var, value, ctx, cb);
> +	return 1;
> +}

And a 1 is returned to signal a "we didn't see what we care about".

> +static int fetch_pack_config_cb(const char *var, const char *value,
> +				const struct config_context *ctx, void *cb)
> +{
> +	int ret = fetch_pack_fsck_config(var, value, &fsck_msg_types);
> +	if (ret > 0)
> +		return git_default_config(var, value, ctx, cb);
> +
> +	return ret;
>  }

And we catch that 1 and ask git_default_config() to use what we
didn't use.

OK.  If I were doing this patch, I would probably have chosen not to
change the value used to signal "a misconfiguration but that is not
too serious so we'll ignore after warning", picked another and new
value, like 2, to signal "the key is not something we care about",
which would mean fetch_pack_config_cb() would call "default" only
when it sees 2.  But with the current callers, and with the callers
after this series, it is not necessary.  If we need to accomodate
other new callers later, we can make such an update then.

This step, together with other patches, are nicely done.

Will queue.  Thanks.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/4] fetch-pack: split out fsck config parsing
  2024-11-28  3:25       ` Junio C Hamano
@ 2024-12-03  9:34         ` Patrick Steinhardt
  2024-12-03 14:23           ` Justin Tobler
  0 siblings, 1 reply; 41+ messages in thread
From: Patrick Steinhardt @ 2024-12-03  9:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Justin Tobler, git

On Thu, Nov 28, 2024 at 12:25:44PM +0900, Junio C Hamano wrote:
> Justin Tobler <jltobler@gmail.com> writes:
> 
> > For `fetch_pack_fsck_config()` to discern between errors and unhandled
> > config variables, the return code when `git_config_path()` errors is
> > changed to a different value also indicating success. This frees up the
> > previous return code to now indicate the provided config variable
> > was unhandled. The behavior remains functionally the same.
> 
> > @@ -1866,9 +1866,9 @@ static int fetch_pack_config_cb(const char *var, const char *value,
> >  		char *path ;
> >  
> >  		if (git_config_pathname(&path, var, value))
> > -			return 1;
> > +			return 0;
> 
> So, after getting complaint about a misconfiguration, the caller
> used to be able to, if they wanted to, tell two cases apart: a
> misconfigured variable was ignored here, and we happily parsed the
> configuration.  Now they no longer can tell them apart, because we
> return 0 for both cases.

I wonder why we want to ignore these errors though. Grepping through the
codebase surfaces that all other callsites of `git_config_pathname()`
return its return value directly, which means that we'll die in case the
path name cannot be parsed. Shouldn't we do the same here, or is there a
good reason why we need to ignore it other than "We used to do it like
that"? In other words, I would have the below diff.

Other than that this patch series looks good to me.

Patrick

diff --git a/fetch-pack.c b/fetch-pack.c
index fe1fb3c1b7c..c60627bba4e 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1863,10 +1863,13 @@ static int fetch_pack_config_cb(const char *var, const char *value,
 	const char *msg_id;
 
 	if (strcmp(var, "fetch.fsck.skiplist") == 0) {
-		char *path ;
+		char *path;
+		int err;
+
+		err = git_config_pathname(&path, var, value);
+		if (err < 0)
+			return -1;
 
-		if (git_config_pathname(&path, var, value))
-			return 1;
 		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
 			fsck_msg_types.len ? ',' : '=', path);
 		free(path);
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 138e6778a47..73d6b3874de 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -240,6 +240,15 @@ test_expect_success 'push with receive.fsck.skipList' '
 	git push --porcelain dst bogus
 '
 
+test_expect_success 'fetch with implitcit fetch.fsck.skipList value fails' '
+	test_when_finished "rm -rf source repo" &&
+	git init source &&
+	test_commit -C source initial &&
+	git init repo &&
+	test_must_fail git -C repo -c fetch.fsck.skipList fetch "file://$(pwd)/source" 2>err &&
+	test_grep "unable to parse ${SQ}fetch.fsck.skiplist${SQ}" err
+'
+
 test_expect_success 'fetch with fetch.fsck.skipList' '
 	refspec=refs/heads/bogus:refs/heads/bogus &&
 	git push . $commit:refs/heads/bogus &&

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/4] fetch-pack: split out fsck config parsing
  2024-12-03  9:34         ` Patrick Steinhardt
@ 2024-12-03 14:23           ` Justin Tobler
  2024-12-03 14:28             ` Patrick Steinhardt
  0 siblings, 1 reply; 41+ messages in thread
From: Justin Tobler @ 2024-12-03 14:23 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Junio C Hamano, git

On 24/12/03 10:34AM, Patrick Steinhardt wrote:
> On Thu, Nov 28, 2024 at 12:25:44PM +0900, Junio C Hamano wrote:
> > Justin Tobler <jltobler@gmail.com> writes:
> > 
> > > For `fetch_pack_fsck_config()` to discern between errors and unhandled
> > > config variables, the return code when `git_config_path()` errors is
> > > changed to a different value also indicating success. This frees up the
> > > previous return code to now indicate the provided config variable
> > > was unhandled. The behavior remains functionally the same.
> > 
> > > @@ -1866,9 +1866,9 @@ static int fetch_pack_config_cb(const char *var, const char *value,
> > >  		char *path ;
> > >  
> > >  		if (git_config_pathname(&path, var, value))
> > > -			return 1;
> > > +			return 0;
> > 
> > So, after getting complaint about a misconfiguration, the caller
> > used to be able to, if they wanted to, tell two cases apart: a
> > misconfigured variable was ignored here, and we happily parsed the
> > configuration.  Now they no longer can tell them apart, because we
> > return 0 for both cases.
> 
> I wonder why we want to ignore these errors though. Grepping through the
> codebase surfaces that all other callsites of `git_config_pathname()`
> return its return value directly, which means that we'll die in case the
> path name cannot be parsed. Shouldn't we do the same here, or is there a
> good reason why we need to ignore it other than "We used to do it like
> that"? In other words, I would have the below diff.

In both "fsck.c:git_fsck_config()" and
"receive-pack:receive_pack_config()", the `git_config_pathname()`
callsites are set to return 1 on error. The main reason for ignoring the
error was to keep it consistent with the other fsck operations usage.
Outside of that, I also wasn't quite sure why specifically
"fsck.skipList" parsing is not as strict.

I do think being consistent would be nice, but I don't feel super
strongly either way. I'm open to changing it in another version if that
is best. :)

-Justin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/4] fetch-pack: split out fsck config parsing
  2024-12-03 14:23           ` Justin Tobler
@ 2024-12-03 14:28             ` Patrick Steinhardt
  2024-12-03 23:17               ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Patrick Steinhardt @ 2024-12-03 14:28 UTC (permalink / raw)
  To: Justin Tobler; +Cc: Junio C Hamano, git

On Tue, Dec 03, 2024 at 08:23:11AM -0600, Justin Tobler wrote:
> On 24/12/03 10:34AM, Patrick Steinhardt wrote:
> > On Thu, Nov 28, 2024 at 12:25:44PM +0900, Junio C Hamano wrote:
> > > Justin Tobler <jltobler@gmail.com> writes:
> > > 
> > > > For `fetch_pack_fsck_config()` to discern between errors and unhandled
> > > > config variables, the return code when `git_config_path()` errors is
> > > > changed to a different value also indicating success. This frees up the
> > > > previous return code to now indicate the provided config variable
> > > > was unhandled. The behavior remains functionally the same.
> > > 
> > > > @@ -1866,9 +1866,9 @@ static int fetch_pack_config_cb(const char *var, const char *value,
> > > >  		char *path ;
> > > >  
> > > >  		if (git_config_pathname(&path, var, value))
> > > > -			return 1;
> > > > +			return 0;
> > > 
> > > So, after getting complaint about a misconfiguration, the caller
> > > used to be able to, if they wanted to, tell two cases apart: a
> > > misconfigured variable was ignored here, and we happily parsed the
> > > configuration.  Now they no longer can tell them apart, because we
> > > return 0 for both cases.
> > 
> > I wonder why we want to ignore these errors though. Grepping through the
> > codebase surfaces that all other callsites of `git_config_pathname()`
> > return its return value directly, which means that we'll die in case the
> > path name cannot be parsed. Shouldn't we do the same here, or is there a
> > good reason why we need to ignore it other than "We used to do it like
> > that"? In other words, I would have the below diff.
> 
> In both "fsck.c:git_fsck_config()" and
> "receive-pack:receive_pack_config()", the `git_config_pathname()`
> callsites are set to return 1 on error. The main reason for ignoring the
> error was to keep it consistent with the other fsck operations usage.
> Outside of that, I also wasn't quite sure why specifically
> "fsck.skipList" parsing is not as strict.

Okay, so your refactorings are simply keeping the status quo and are
consistent with what we do elsewhere where we parse the same config key.

> I do think being consistent would be nice, but I don't feel super
> strongly either way. I'm open to changing it in another version if that
> is best. :)

I think the current version of this patch series is fine as-is then. It
might be a good idea to adapt this in a follow-up series, unless there
is a good reason not to.

Patrick

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/4] fetch-pack: split out fsck config parsing
  2024-12-03 14:28             ` Patrick Steinhardt
@ 2024-12-03 23:17               ` Junio C Hamano
  2024-12-04  2:39                 ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2024-12-03 23:17 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Justin Tobler, git

Patrick Steinhardt <ps@pks.im> writes:

> I think the current version of this patch series is fine as-is then. It
> might be a good idea to adapt this in a follow-up series, unless there
> is a good reason not to.

Sounds good.  We may want to tighten the rules a bit to reject
obvious misconfigurations, but that is outside the scope of this
topic.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/4] fetch-pack: split out fsck config parsing
  2024-12-03 23:17               ` Junio C Hamano
@ 2024-12-04  2:39                 ` Junio C Hamano
  0 siblings, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2024-12-04  2:39 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Justin Tobler, git

Junio C Hamano <gitster@pobox.com> writes:

> Patrick Steinhardt <ps@pks.im> writes:
>
>> I think the current version of this patch series is fine as-is then. It
>> might be a good idea to adapt this in a follow-up series, unless there
>> is a good reason not to.
>
> Sounds good.  We may want to tighten the rules a bit to reject
> obvious misconfigurations, but that is outside the scope of this
> topic.

This reminds me of a slightly related tangent.

There are pathname-valued ones that we added deliberately as an
optional configuration variable, IIRC, and fsck.skiplist may be one
of them.  In other words, they mean "In a repository that I want the
configuration to affect, I'll have the configuration file at this
path, but it is merely optional.  If it is missing, pretend as if
the configuration variable does not even exist".

In retrospect, it was a mistake to define that this variable is
required and triggers an error when misconfigured, and that variable
is optional and triggers only a warning when misconfigured.  That is
one more thing for the user to remember and does not scale.

We probably should have done something like

 - the value given to all pathname valued configuration variables
   and command name options MUST correctly name a path that the
   command can read, and a misconfigured configuration variable or
   an invalid command line option should trigger an error when the
   command needs to access it.

 - the value (not the variable) can say "I am optional--if the path
   does not appear on this system, or if it is unreadable, pretend
   as if this configuration variable or the command line option were
   not given".

The "when the command needs to" part is important.  Ideally, "git
log" should not fail when core.hookspath is misconfigured, because
it does not trigger any hook, but it currently dies while parsing
the configuration files in git_default_config():

 $ git -c core.hookspath log
 error: missing value for 'core.hookspath'
 fatal: unable to parse 'core.hookspath' from command-line config.

which we may want to fix.  Unsurprisingly there are other variables
that do behave correctly, like

 $ git -c core.pager --no-pager log -20 >/dev/null
 $ git -c core.pager log -20 >/dev/null
 error: missing value for 'core.pager'
 fatal: unable to parse command-line config

where a misconfigured core.pager does not cause any trouble when the
pager is not in use.

Any of the above are not within the scope of this topic, obviously.

cf. <20241014204427.1712182-1-gitster@pobox.com>

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2024-12-04  2:39 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-21 20:41 [PATCH 0/5] propagate fsck message severity for bundle fetch Justin Tobler
2024-11-21 20:41 ` [PATCH 1/5] bundle: add bundle verification options type Justin Tobler
2024-11-22  1:21   ` Junio C Hamano
2024-11-22 15:22     ` Justin Tobler
2024-11-26  9:08   ` Patrick Steinhardt
2024-11-26 15:59     ` Justin Tobler
2024-11-21 20:41 ` [PATCH 2/5] bundle: support fsck message configuration Justin Tobler
2024-11-22  1:30   ` Junio C Hamano
2024-11-22 15:44     ` Justin Tobler
2024-11-25  1:33       ` Junio C Hamano
2024-11-21 20:41 ` [PATCH 3/5] fetch-pack: introduce `fetch_pack_options` Justin Tobler
2024-11-22  1:46   ` Junio C Hamano
2024-11-22  3:46     ` Junio C Hamano
2024-11-22 17:31     ` Justin Tobler
2024-11-21 20:41 ` [PATCH 4/5] fetch-pack: expose `fetch_pack_config_cb()` Justin Tobler
2024-11-22  1:57   ` Junio C Hamano
2024-11-22 17:41     ` Justin Tobler
2024-11-22 16:45   ` shejialuo
2024-11-27  1:21     ` Justin Tobler
2024-11-21 20:41 ` [PATCH 5/5] transport: propagate fsck configuration during bundle fetch Justin Tobler
2024-11-22  1:59   ` Junio C Hamano
2024-11-27  0:57 ` [PATCH v2 0/4] propagate fsck message severity for " Justin Tobler
2024-11-27  0:57   ` [PATCH v2 1/4] bundle: add bundle verification options type Justin Tobler
2024-11-27  0:57   ` [PATCH v2 2/4] bundle: support fsck message configuration Justin Tobler
2024-11-27  5:44     ` Patrick Steinhardt
2024-11-27  0:57   ` [PATCH v2 3/4] fetch-pack: split out fsck config parsing Justin Tobler
2024-11-27  5:44     ` Patrick Steinhardt
2024-11-27 17:37       ` Justin Tobler
2024-11-27  0:57   ` [PATCH v2 4/4] transport: propagate fsck configuration during bundle fetch Justin Tobler
2024-11-27  1:39     ` Junio C Hamano
2024-11-27 23:33   ` [PATCH v3 0/4] propagate fsck message severity for " Justin Tobler
2024-11-27 23:33     ` [PATCH v3 1/4] bundle: add bundle verification options type Justin Tobler
2024-11-27 23:33     ` [PATCH v3 2/4] bundle: support fsck message configuration Justin Tobler
2024-11-27 23:33     ` [PATCH v3 3/4] fetch-pack: split out fsck config parsing Justin Tobler
2024-11-28  3:25       ` Junio C Hamano
2024-12-03  9:34         ` Patrick Steinhardt
2024-12-03 14:23           ` Justin Tobler
2024-12-03 14:28             ` Patrick Steinhardt
2024-12-03 23:17               ` Junio C Hamano
2024-12-04  2:39                 ` Junio C Hamano
2024-11-27 23:33     ` [PATCH v3 4/4] transport: propagate fsck configuration during bundle fetch Justin Tobler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).