All of lore.kernel.org
 help / color / mirror / Atom feed
From: "blanet via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>,
	Karthik Nayak <karthik.188@gmail.com>,
	blanet <bupt_xingxin@163.com>
Subject: [PATCH v4 0/4] object checking related additions and fixes for bundles in fetches
Date: Thu, 30 May 2024 08:21:26 +0000	[thread overview]
Message-ID: <pull.1730.v4.git.1717057290.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1730.v3.git.1716824518.gitgitgadget@gmail.com>

While attempting to fix a reference negotiation bug in bundle-uri, we
identified that the fetch process lacks some crucial object validation
checks when processing bundles. The primary issues are:

 1. In the bundle-uri scenario, object IDs were not validated before writing
    bundle references. This was the root cause of the original negotiation
    bug in bundle-uri and could lead to potential repository corruption.
 2. The existing "fetch.fsckObjects" and "transfer.fsckObjects"
    configurations were not applied when directly fetching bundles or
    fetching with bundle-uri enabled. In fact, there were no object
    validation supports for unbundle.

The first patch addresses the bundle-uri negotiation issue by removing the
REF_SKIP_OID_VERIFICATION flag when writing bundle references.

Patches 2 through 4 extend verify_bundle_flags for bundle.c:unbundle to add
support for object validation (fsck) in different scenarios, mainly
following the suggestions from Junio on the mailing list.

Xing Xin (4):
  bundle-uri: verify oid before writing refs
  unbundle: extend verify_bundle_flags to support fsck-objects
  fetch-pack: expose fsckObjects configuration logic
  unbundle: introduce option VERIFY_BUNDLE_FSCK_FOLLOW_FETCH

 bundle-uri.c                |   5 +-
 bundle.c                    |  10 ++
 bundle.h                    |   2 +
 fetch-pack.c                |  17 ++--
 fetch-pack.h                |   5 +
 t/t5558-clone-bundle-uri.sh | 186 +++++++++++++++++++++++++++++++++++-
 t/t5607-clone-bundle.sh     |  33 +++++++
 transport.c                 |   2 +-
 8 files changed, 246 insertions(+), 14 deletions(-)


base-commit: b9cfe4845cb2562584837bc0101c0ab76490a239
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1730%2Fblanet%2Fxx%2Fbundle-uri-bug-using-bundle-list-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1730/blanet/xx/bundle-uri-bug-using-bundle-list-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1730

Range-diff vs v3:

 1:  8f488a5eeaa ! 1:  e958a3ab20c bundle-uri: verify oid before writing refs
     @@ Commit message
          When using the bundle-uri mechanism with a bundle list containing
          multiple interrelated bundles, we encountered a bug where tips from
          downloaded bundles were not discovered, thus resulting in rather slow
     -    clones. This was particularly problematic when employing the heuristic
     -    `creationTokens`.
     +    clones. This was particularly problematic when employing the
     +    "creationTokens" heuristic.
      
     -    And this is easy to reproduce. Suppose we have a repository with a
     -    single branch `main` pointing to commit `A`, firstly we create a base
     -    bundle with
     +    To reproduce this issue, consider a repository with a single branch
     +    "main" pointing to commit "A". Firstly, create a base bundle with:
      
            git bundle create base.bundle main
      
     -    Then let's add a new commit `B` on top of `A`, so that an incremental
     -    bundle for `main` can be created with
     +    Then, add a new commit "B" on top of "A", and create an incremental
     +    bundle for "main":
      
            git bundle create incr.bundle A..main
      
     -    Now we can generate a bundle list with the following content:
     +    Now, generate a bundle list with the following content:
      
            [bundle]
                version = 1
     @@ Commit message
                uri = incr.bundle
                creationToken = 2
      
     -    A fresh clone with the bundle list above would give the expected
     -    `refs/bundles/main` pointing at `B` in new repository, in other words we
     -    already had everything locally from the bundles, but git would still
     -    download everything from server as if we got nothing.
     +    A fresh clone with the bundle list above should result in a reference
     +    "refs/bundles/main" pointing to "B" in the new repository. However, git
     +    would still download everything from the server, as if it had fetched
     +    nothing locally.
      
     -    So why the `refs/bundles/main` is not discovered? After some digging I
     +    So why the "refs/bundles/main" is not discovered? After some digging I
          found that:
      
          1. Bundles in bundle list are downloaded to local files via
     -       `download_bundle_list` or via `fetch_bundles_by_token` for the
     -       creationToken heuristic case.
     -    2. Then it tries to unbundle each bundle via `unbundle_from_file`, which
     -       is called by `unbundle_all_bundles` or called within
     -       `fetch_bundles_by_token` for the creationToken heuristic case.
     -    3. Here, we first read the bundle header to get all the prerequisites
     -       for the bundle, this is done in `read_bundle_header`.
     -    4. Then we call `unbundle`, which calls `verify_bundle` to ensure that
     -       the repository does indeed contain the prerequisites mentioned in the
     +       `bundle-uri.c:download_bundle_list` or via
     +       `bundle-uri.c:fetch_bundles_by_token` for the "creationToken"
     +       heuristic.
     +    2. Each bundle is unbundled via `bundle-uri.c:unbundle_from_file`, which
     +       is called by `bundle-uri.c:unbundle_all_bundles` or called within
     +       `bundle-uri.c:fetch_bundles_by_token` for the "creationToken"
     +       heuristic.
     +    3. To get all prerequisites of the bundle, the bundle header is read
     +       inside `bundle-uri.c:unbundle_from_file` to by calling
     +       `bundle.c:read_bundle_header`.
     +    4. Then it calls `bundle.c:unbundle`, which calls
     +       `bundle.c:verify_bundle` to ensure the repository contains all the
     +       prerequisites.
     +    5. `bundle.c:verify_bundle` calls `parse_object`, which eventually
     +       invokes `packfile.c:prepare_packed_git` or
     +       `packfile.c:reprepare_packed_git`, filling
     +       `raw_object_store->packed_git` and setting `packed_git_initialized`.
     +    6. If `bundle.c:unbundle` succeeds, it writes refs via
     +       `refs.c:refs_update_ref` with `REF_SKIP_OID_VERIFICATION` set. Here
     +       bundle refs which can target arbitrary objects are written to the
     +       repository.
     +    7. Finally, in `fetch-pack.c:do_fetch_pack_v2`, the functions
     +       `fetch-pack.c:mark_complete_and_common_ref` and
     +       `fetch-pack.c:mark_tips` are called with `OBJECT_INFO_QUICK` set to
     +       find local tips for negotiation. The `OBJECT_INFO_QUICK` flag
     +       prevents `packfile.c:reprepare_packed_git` from being called,
     +       resulting in failures to parse OIDs that reside only in the latest
             bundle.
     -    5. The `verify_bundle` will call `parse_object`, within which the
     -       `prepare_packed_git` or `reprepare_packed_git` is eventually called,
     -       which means that the `raw_object_store->packed_git` data gets filled
     -       in and ``packed_git_initialized` is set. This also means consecutive
     -       calls to `prepare_packed_git` doesn't re-initiate
     -       `raw_object_store->packed_git` since `packed_git_initialized` already
     -       is set.
     -    6. If `unbundle` succeeds, it writes some refs via `refs_update_ref`
     -       with `REF_SKIP_OID_VERIFICATION` set. So the bundle refs which can
     -       target arbitrary objects are written to the repository.
     -    7. Finally in `do_fetch_pack_v2`, `mark_complete_and_common_ref` and
     -       `mark_tips` are called with `OBJECT_INFO_QUICK` set to find local
     -       tips. Here it won't call `reprepare_packed_git` anymore so it would
     -       fail to parse oids that only reside in the last bundle.
     -
     -    Back to the example above, when unbunding `incr.bundle`, `base.pack` is
     -    enlisted to `packed_git` bacause of the prerequisites to verify. While
     -    we can not find `B` for negotiation at a latter time because `B` exists
     -    in `incr.pack` which is not enlisted in `packed_git`.
     -
     -    This commit fixes this bug by dropping the `REF_SKIP_OID_VERIFICATION`
     -    flag when writing bundle refs, so we can:
     -
     -    1. Ensure that the bundle refs we are writing are pointing to valid
     -       objects.
     -    2. Ensure all the tips from bundle refs can be correctly parsed.
     -
     -    And a set of negotiation related tests for bundle-uri are added.
      
     +    In the example above, when unbunding "incr.bundle", "base.pack" is added
     +    to `packed_git` due to prerequisites verification. However, "B" cannot
     +    be found for negotiation because it exists in "incr.pack", which is not
     +    included in `packed_git`.
     +
     +    This commit fixes the bug by removing `REF_SKIP_OID_VERIFICATION` flag
     +    when writing bundle refs. When `refs.c:refs_update_ref` is called to to
     +    write the corresponding bundle refs, it triggers
     +    `refs.c:ref_transaction_commit`.  This, in turn, invokes
     +    `refs.c:ref_transaction_prepare`, which calls `transaction_prepare` of
     +    the refs storage backend. For files backend, this function is
     +    `files-backend.c:files_transaction_prepare`, and for reftable backend,
     +    it is `reftable-backend.c:reftable_be_transaction_prepare`. Both
     +    functions eventually call `object.c:parse_object`, which can invoke
     +    `packfile.c:reprepare_packed_git` to refresh `packed_git`. This ensures
     +    that bundle refs point to valid objects and that all tips from bundle
     +    refs are correctly parsed during subsequent negotiations.
     +
     +    A test has been added to demonstrate that bundles with incorrect
     +    headers, where refs point to non-existent objects, do not result in any
     +    bundle refs being created in the repository. Additionally, a set of
     +    negotiation-related tests for fetching with bundle-uri has been
     +    included.
     +
     +    Reviewed-by: Karthik Nayak <karthik.188@gmail.com>
     +    Reviewed-by: Patrick Steinhardt <ps@pks.im>
          Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
      
       ## bundle-uri.c ##
     @@ bundle-uri.c: static int unbundle_from_file(struct repository *r, const char *fi
      
       ## t/t5558-clone-bundle-uri.sh ##
      @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'fail to clone from non-bundle file' '
     + 
       test_expect_success 'create bundle' '
       	git init clone-from &&
     - 	git -C clone-from checkout -b topic &&
     +-	git -C clone-from checkout -b topic &&
     +-	test_commit -C clone-from A &&
     +-	test_commit -C clone-from B &&
     +-	git -C clone-from bundle create B.bundle topic
     ++	(
     ++		cd clone-from &&
     ++		git checkout -b topic &&
     ++
     ++		test_commit A &&
     ++		git bundle create A.bundle topic &&
      +
     - 	test_commit -C clone-from A &&
     -+	git -C clone-from bundle create A.bundle topic &&
     ++		test_commit B &&
     ++		git bundle create B.bundle topic &&
      +
     - 	test_commit -C clone-from B &&
     - 	git -C clone-from bundle create B.bundle topic
     ++		# Create a bundle with reference pointing to non-existent object.
     ++		sed "s/$(git rev-parse A)/$(git rev-parse B)/" <A.bundle >bad-header.bundle
     ++	)
       '
     + 
     + test_expect_success 'clone with path bundle' '
     +@@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone with path bundle' '
     + 	test_cmp expect actual
     + '
     + 
     ++test_expect_success 'clone with bundle that has bad header' '
     ++	git clone --bundle-uri="clone-from/bad-header.bundle" \
     ++		clone-from clone-bad-header 2>err &&
     ++	# Write bundle ref fails, but clone can still proceed.
     ++	commit_b=$(git -C clone-from rev-parse B) &&
     ++	test_grep "trying to write ref '\''refs/bundles/topic'\'' with nonexistent object $commit_b" err &&
     ++	git -C clone-bad-header for-each-ref --format="%(refname)" >refs &&
     ++	! grep "refs/bundles/" refs
     ++'
     ++
     + test_expect_success 'clone with path bundle and non-default hash' '
     + 	test_when_finished "rm -rf clone-path-non-default-hash" &&
     + 	GIT_DEFAULT_HASH=sha256 git clone --bundle-uri="clone-from/B.bundle" \
      @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone bundle list (file, any mode, all failures)' '
       	! grep "refs/bundles/" refs
       '
 2:  057c697970f ! 2:  beb70735811 unbundle: introduce unbundle_fsck_flags for fsckobjects handling
     @@ Metadata
      Author: Xing Xin <xingxin.xx@bytedance.com>
      
       ## Commit message ##
     -    unbundle: introduce unbundle_fsck_flags for fsckobjects handling
     +    unbundle: extend verify_bundle_flags to support fsck-objects
      
     -    This commit adds a new enum `unbundle_fsck_flags` which is designed to
     -    control the fsck behavior when unbundling. `unbundle` can use this newly
     -    passed in enum to further decide whether to enable `--fsck-objects` for
     -    "git-index-pack".
     +    This commit extends `verify_bundle_flags` by adding a new option
     +    `VERIFY_BUNDLE_FSCK_ALWAYS`, which enables checks for broken objects in
     +    `bundle.c:unbundle`. This option is now used as the default for fetches
     +    involving bundles, specifically by `transport.c:fetch_refs_from_bundle`
     +    for direct bundle fetches and by `bundle-uri.c:unbundle_from_file` for
     +    _bundle-uri_ enabled fetches.
      
     -    Currently only `UNBUNDLE_FSCK_NEVER` and `UNBUNDLE_FSCK_ALWAYS` are
     -    supported as the very basic options. Another interesting option would be
     -    added in later commits.
     +    Upcoming commits will introduce another option as a replacement that
     +    fits better with fetch operations. `VERIFY_BUNDLE_FSCK_ALWAYS` will be
     +    further used to add "--fsck-objects" support for "git bundle unbundle"
     +    and "git bundle verify".
      
     +    Reviewed-by: Patrick Steinhardt <ps@pks.im>
          Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
      
     - ## builtin/bundle.c ##
     -@@ builtin/bundle.c: static int cmd_bundle_unbundle(int argc, const char **argv, const char *prefix)
     - 		strvec_pushl(&extra_index_pack_args, "-v", "--progress-title",
     - 			     _("Unbundling objects"), NULL);
     - 	ret = !!unbundle(the_repository, &header, bundle_fd,
     --			 &extra_index_pack_args, 0) ||
     -+			 &extra_index_pack_args, 0, UNBUNDLE_FSCK_NEVER) ||
     - 		list_bundle_refs(&header, argc, argv);
     - 	bundle_header_release(&header);
     - cleanup:
     -
       ## bundle-uri.c ##
      @@ bundle-uri.c: static int unbundle_from_file(struct repository *r, const char *file)
       	 * the prerequisite commits.
       	 */
       	if ((result = unbundle(r, &header, bundle_fd, NULL,
      -			       VERIFY_BUNDLE_QUIET)))
     -+			       VERIFY_BUNDLE_QUIET, UNBUNDLE_FSCK_ALWAYS)))
     ++			       VERIFY_BUNDLE_QUIET | VERIFY_BUNDLE_FSCK_ALWAYS)))
       		return 1;
       
       	/*
      
       ## bundle.c ##
     -@@ bundle.c: int create_bundle(struct repository *r, const char *path,
     - 
     - int unbundle(struct repository *r, struct bundle_header *header,
     - 	     int bundle_fd, struct strvec *extra_index_pack_args,
     --	     enum verify_bundle_flags flags)
     -+	     enum verify_bundle_flags flags,
     -+	     enum unbundle_fsck_flags fsck_flags)
     - {
     - 	struct child_process ip = CHILD_PROCESS_INIT;
     - 
      @@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
       	if (header->filter.choice)
       		strvec_push(&ip.args, "--promisor=from-bundle");
       
     -+	switch (fsck_flags) {
     -+	case UNBUNDLE_FSCK_ALWAYS:
     ++	if (flags & VERIFY_BUNDLE_FSCK_ALWAYS)
      +		strvec_push(&ip.args, "--fsck-objects");
     -+		break;
     -+	case UNBUNDLE_FSCK_NEVER:
     -+	default:
     -+		break;
     -+	}
      +
       	if (extra_index_pack_args) {
       		strvec_pushv(&ip.args, extra_index_pack_args->v);
     @@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
      
       ## bundle.h ##
      @@ bundle.h: int create_bundle(struct repository *r, const char *path,
     - 		  int argc, const char **argv, struct strvec *pack_options,
     - 		  int version);
     - 
     -+enum unbundle_fsck_flags {
     -+	UNBUNDLE_FSCK_NEVER = 0,
     -+	UNBUNDLE_FSCK_ALWAYS,
     -+};
     -+
       enum verify_bundle_flags {
       	VERIFY_BUNDLE_VERBOSE = (1 << 0),
       	VERIFY_BUNDLE_QUIET = (1 << 1),
     -@@ bundle.h: int verify_bundle(struct repository *r, struct bundle_header *header,
     -  */
     - int unbundle(struct repository *r, struct bundle_header *header,
     - 	     int bundle_fd, struct strvec *extra_index_pack_args,
     --	     enum verify_bundle_flags flags);
     -+	     enum verify_bundle_flags flags,
     -+	     enum unbundle_fsck_flags fsck_flags);
     - int list_bundle_refs(struct bundle_header *header,
     - 		int argc, const char **argv);
     ++	VERIFY_BUNDLE_FSCK_ALWAYS = (1 << 2),
     + };
       
     + int verify_bundle(struct repository *r, struct bundle_header *header,
      
       ## transport.c ##
      @@ transport.c: static int fetch_refs_from_bundle(struct transport *transport,
     @@ transport.c: static int fetch_refs_from_bundle(struct transport *transport,
       		get_refs_from_bundle_inner(transport);
       	ret = unbundle(the_repository, &data->header, data->fd,
      -		       &extra_index_pack_args, 0);
     -+		       &extra_index_pack_args, 0, UNBUNDLE_FSCK_ALWAYS);
     ++		       &extra_index_pack_args, VERIFY_BUNDLE_FSCK_ALWAYS);
       	transport->hash_algo = data->header.hash_algo;
       	return ret;
       }
 3:  67401d4fbcb ! 3:  5ddc894c2c1 fetch-pack: expose fsckObjects configuration logic
     @@ Metadata
       ## Commit message ##
          fetch-pack: expose fsckObjects configuration logic
      
     -    Currently we can use "transfer.fsckObjects" or "fetch.fsckObjects" to
     -    control whether to enable checks for broken objects during fetching. But
     -    these configs are only acknowledged by `fetch-pack.c:get_pack` and do
     -    not make sense when fetching from bundles or using bundle-uris.
     +    Currently, we can use "transfer.fsckObjects" and the more specific
     +    "fetch.fsckObjects" to control checks for broken objects in received
     +    packs during fetches. However, these configurations were only
     +    acknowledged by `fetch-pack.c:get_pack` and did not take effect in
     +    direct bundle fetches and fetches with _bundle-uri_ enabled.
      
     -    This commit exposed the fetch-then-transfer configuration logic by
     -    adding a new function `fetch_pack_fsck_objects` in fetch-pack.h. In next
     -    commit, this new function will be used by `unbundle` in fetching
     -    scenarios.
     +    This commit exposes the fetch-then-transfer configuration logic by
     +    adding a new function `fetch_pack_fsck_objects` in fetch-pack.h. This
     +    new function is used to replace the assignment for `fsck_objects` in
     +    `fetch-pack.c:get_pack`. In the next commit, it will also be used by
     +    `bundle.c:unbundle` to better fit fetching scenarios.
      
     +    Helped-by: Junio C Hamano <gitster@pobox.com>
     +    Helped-by: Patrick Steinhardt <ps@pks.im>
          Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
      
       ## fetch-pack.c ##
     @@ fetch-pack.c: static const struct object_id *iterate_ref_map(void *cb_data)
      +int fetch_pack_fsck_objects(void)
      +{
      +	fetch_pack_setup();
     -+
     -+	return fetch_fsck_objects >= 0
     -+	       ? fetch_fsck_objects
     -+	       : transfer_fsck_objects >= 0
     -+	       ? transfer_fsck_objects
     -+	       : 0;
     ++	if (fetch_fsck_objects >= 0)
     ++		return fetch_fsck_objects;
     ++	if (transfer_fsck_objects >= 0)
     ++		return transfer_fsck_objects;
     ++	return 0;
      +}
      +
       struct ref *fetch_pack(struct fetch_pack_args *args,
     @@ fetch-pack.h: void negotiate_using_fetch(const struct oid_array *negotiation_tip
        */
       int report_unmatched_refs(struct ref **sought, int nr_sought);
       
     ++/*
     ++ * Return true if checks for broken objects in received pack are required.
     ++ */
      +int fetch_pack_fsck_objects(void);
      +
       #endif
 4:  c19b8f633cb ! 4:  68b9bca9f8b unbundle: introduce new option UNBUNDLE_FSCK_FOLLOW_FETCH
     @@ Metadata
      Author: Xing Xin <xingxin.xx@bytedance.com>
      
       ## Commit message ##
     -    unbundle: introduce new option UNBUNDLE_FSCK_FOLLOW_FETCH
     +    unbundle: introduce option VERIFY_BUNDLE_FSCK_FOLLOW_FETCH
      
     -    This commit adds a new option `UNBUNDLE_FSCK_FOLLOW_FETCH` to
     -    `unbundle_fsck_flags`, this new flag is currently used in the _fetch_
     -    process by
     +    This commit introduces a new option `VERIFY_BUNDLE_FSCK_FOLLOW_FETCH` to
     +    `verify_bundle_flags`. In `bundle.c:unbundle`, this new option controls
     +    whether broken object checks should be enabled by invoking
     +    `fetch-pack.c:fetch_pack_fsck_objects`. Note that the option
     +    `VERIFY_BUNDLE_FSCK_ALWAYS` takes precedence over
     +    `VERIFY_BUNDLE_FSCK_FOLLOW_FETCH`.
      
     -    - `transport.c:fetch_refs_from_bundle` for fetching directly from a
     -      bundle.
     -    - `bundle-uri.c:unbundle_from_file` for unbundling bundles downloaded
     -      from bundle-uri.
     +    This flag is now used in the fetching process by:
      
     -    So we now have a relatively consistent logic for checking objects during
     -    fetching. Add tests for the above two situations are added.
     +    - `transport.c:fetch_refs_from_bundle` for direct bundle fetches.
     +    - `bundle-uri.c:unbundle_from_file` for bundle-uri enabled fetches.
     +
     +    This addition ensures a consistent logic for object verification during
     +    fetch operations. Tests have been added to confirm functionality in the
     +    scenarios mentioned above.
      
          Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
      
     @@ bundle-uri.c: static int unbundle_from_file(struct repository *r, const char *fi
       	 * the prerequisite commits.
       	 */
       	if ((result = unbundle(r, &header, bundle_fd, NULL,
     --			       VERIFY_BUNDLE_QUIET, UNBUNDLE_FSCK_ALWAYS)))
     -+			       VERIFY_BUNDLE_QUIET, UNBUNDLE_FSCK_FOLLOW_FETCH)))
     +-			       VERIFY_BUNDLE_QUIET | VERIFY_BUNDLE_FSCK_ALWAYS)))
     ++			       VERIFY_BUNDLE_QUIET | VERIFY_BUNDLE_FSCK_FOLLOW_FETCH)))
       		return 1;
       
       	/*
     @@ bundle.c
       static const char v2_bundle_signature[] = "# v2 git bundle\n";
       static const char v3_bundle_signature[] = "# v3 git bundle\n";
      @@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
     - 	     enum unbundle_fsck_flags fsck_flags)
     + 	     enum verify_bundle_flags flags)
       {
       	struct child_process ip = CHILD_PROCESS_INIT;
      +	int fsck_objects = 0;
     @@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
       	if (verify_bundle(r, header, flags))
       		return -1;
      @@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
     + 		strvec_push(&ip.args, "--promisor=from-bundle");
       
     - 	switch (fsck_flags) {
     - 	case UNBUNDLE_FSCK_ALWAYS:
     --		strvec_push(&ip.args, "--fsck-objects");
     + 	if (flags & VERIFY_BUNDLE_FSCK_ALWAYS)
      +		fsck_objects = 1;
     -+		break;
     -+	case UNBUNDLE_FSCK_FOLLOW_FETCH:
     ++	else if (flags & VERIFY_BUNDLE_FSCK_FOLLOW_FETCH)
      +		fsck_objects = fetch_pack_fsck_objects();
     - 		break;
     - 	case UNBUNDLE_FSCK_NEVER:
     - 	default:
     - 		break;
     - 	}
     - 
     -+	if (fsck_objects)
     -+		strvec_push(&ip.args, "--fsck-objects");
      +
     ++	if (fsck_objects)
     + 		strvec_push(&ip.args, "--fsck-objects");
     + 
       	if (extra_index_pack_args) {
     - 		strvec_pushv(&ip.args, extra_index_pack_args->v);
     - 		strvec_clear(extra_index_pack_args);
      
       ## bundle.h ##
     -@@ bundle.h: int create_bundle(struct repository *r, const char *path,
     - enum unbundle_fsck_flags {
     - 	UNBUNDLE_FSCK_NEVER = 0,
     - 	UNBUNDLE_FSCK_ALWAYS,
     -+	UNBUNDLE_FSCK_FOLLOW_FETCH,
     +@@ bundle.h: enum verify_bundle_flags {
     + 	VERIFY_BUNDLE_VERBOSE = (1 << 0),
     + 	VERIFY_BUNDLE_QUIET = (1 << 1),
     + 	VERIFY_BUNDLE_FSCK_ALWAYS = (1 << 2),
     ++	VERIFY_BUNDLE_FSCK_FOLLOW_FETCH = (1 << 3),
       };
       
     - enum verify_bundle_flags {
     + int verify_bundle(struct repository *r, struct bundle_header *header,
      
       ## t/t5558-clone-bundle-uri.sh ##
     -@@ t/t5558-clone-bundle-uri.sh: test_expect_success 'fail to clone from non-bundle file' '
     +@@ t/t5558-clone-bundle-uri.sh: test_expect_success 'create bundle' '
     + 		git bundle create B.bundle topic &&
       
     - test_expect_success 'create bundle' '
     - 	git init clone-from &&
     --	git -C clone-from checkout -b topic &&
     -+	(
     -+		cd clone-from &&
     -+		git checkout -b topic &&
     -+
     -+		test_commit A &&
     -+		git bundle create A.bundle topic &&
     -+
     -+		test_commit B &&
     -+		git bundle create B.bundle topic &&
     + 		# Create a bundle with reference pointing to non-existent object.
     +-		sed "s/$(git rev-parse A)/$(git rev-parse B)/" <A.bundle >bad-header.bundle
     ++		sed "s/$(git rev-parse A)/$(git rev-parse B)/" <A.bundle >bad-header.bundle &&
      +
      +		cat >data <<-EOF &&
      +		tree $(git rev-parse HEAD^{tree})
      +		parent $(git rev-parse HEAD)
      +		author A U Thor
      +		committer A U Thor
     - 
     --	test_commit -C clone-from A &&
     --	git -C clone-from bundle create A.bundle topic &&
     ++
      +		commit: this is a commit with bad emails
     - 
     --	test_commit -C clone-from B &&
     --	git -C clone-from bundle create B.bundle topic
     ++
      +		EOF
      +		git hash-object --literally -t commit -w --stdin <data >commit &&
      +		git branch bad $(cat commit) &&
     -+		git bundle create bad.bundle bad &&
     ++		git bundle create bad-object.bundle bad &&
      +		git update-ref -d refs/heads/bad
     -+	)
     + 	)
       '
       
     - test_expect_success 'clone with path bundle' '
     -@@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone with path bundle' '
     - 	test_cmp expect actual
     +@@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone with bundle that has bad header' '
     + 	! grep "refs/bundles/" refs
       '
       
     -+test_expect_success 'clone with bad bundle' '
     -+	git -c fetch.fsckObjects=true clone --bundle-uri="clone-from/bad.bundle" \
     -+		clone-from clone-bad 2>err &&
     -+	# Unbundle fails, but clone can still proceed.
     ++test_expect_success 'clone with bundle that has bad object' '
     ++	# Unbundle succeeds if no fsckObjects confugured.
     ++	git clone --bundle-uri="clone-from/bad-object.bundle" \
     ++		clone-from clone-bad-object-no-fsck &&
     ++	git -C clone-bad-object-no-fsck for-each-ref --format="%(refname)" >refs &&
     ++	grep "refs/bundles/" refs >actual &&
     ++	cat >expect <<-\EOF &&
     ++	refs/bundles/bad
     ++	EOF
     ++	test_cmp expect actual &&
     ++
     ++	# Unbundle fails with fsckObjects set true, but clone can still proceed.
     ++	git -c fetch.fsckObjects=true clone --bundle-uri="clone-from/bad-object.bundle" \
     ++		clone-from clone-bad-object-fsck 2>err &&
      +	test_grep "missingEmail" err &&
     -+	git -C clone-bad for-each-ref --format="%(refname)" >refs &&
     ++	git -C clone-bad-object-fsck for-each-ref --format="%(refname)" >refs &&
      +	! grep "refs/bundles/" refs
      +'
      +
     @@ t/t5607-clone-bundle.sh: test_expect_success 'fetch SHA-1 from bundle' '
       	git fetch --no-tags foo/tip.bundle "$(cat hash)"
       '
       
     -+test_expect_success 'clone bundle with fetch.fsckObjects' '
     ++test_expect_success 'clone bundle with different fsckObjects configurations' '
      +	test_create_repo bundle-fsck &&
      +	(
      +		cd bundle-fsck &&
     @@ t/t5607-clone-bundle.sh: test_expect_success 'fetch SHA-1 from bundle' '
      +		git branch bad $(cat commit) &&
      +		git bundle create bad.bundle bad
      +	) &&
     ++
     ++	git clone bundle-fsck/bad.bundle bundle-no-fsck &&
     ++
     ++	git -c fetch.fsckObjects=false -c transfer.fsckObjects=true \
     ++		clone bundle-fsck/bad.bundle bundle-fetch-no-fsck &&
     ++
      +	test_must_fail git -c fetch.fsckObjects=true \
     -+		clone bundle-fsck/bad.bundle bundle-fsck-clone 2>err &&
     ++		clone bundle-fsck/bad.bundle bundle-fetch-fsck 2>err &&
     ++	test_grep "missingEmail" err &&
     ++
     ++	test_must_fail git -c transfer.fsckObjects=true \
     ++		clone bundle-fsck/bad.bundle bundle-transfer-fsck 2>err &&
      +	test_grep "missingEmail" err
      +'
      +
     @@ transport.c: static int fetch_refs_from_bundle(struct transport *transport,
       	if (!data->get_refs_from_bundle_called)
       		get_refs_from_bundle_inner(transport);
       	ret = unbundle(the_repository, &data->header, data->fd,
     --		       &extra_index_pack_args, 0, UNBUNDLE_FSCK_ALWAYS);
     -+		       &extra_index_pack_args, 0, UNBUNDLE_FSCK_FOLLOW_FETCH);
     +-		       &extra_index_pack_args, VERIFY_BUNDLE_FSCK_ALWAYS);
     ++		       &extra_index_pack_args, VERIFY_BUNDLE_FSCK_FOLLOW_FETCH);
       	transport->hash_algo = data->header.hash_algo;
       	return ret;
       }

-- 
gitgitgadget

  parent reply	other threads:[~2024-05-30  8:21 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-15  3:01 [PATCH] bundle-uri: refresh packed_git if unbundle succeed blanet via GitGitGadget
2024-05-17  5:00 ` Patrick Steinhardt
2024-05-17 16:14   ` Junio C Hamano
2024-05-20 11:48     ` Xing Xin
2024-05-20 17:19       ` Junio C Hamano
2024-05-27 16:04         ` Xing Xin
2024-05-20  9:41   ` Xing Xin
2024-05-17  7:36 ` Karthik Nayak
2024-05-20 10:19   ` Xing Xin
2024-05-20 12:36 ` [PATCH v2] bundle-uri: verify oid before writing refs blanet via GitGitGadget
2024-05-21 15:41   ` Karthik Nayak
2024-05-27 15:41   ` [PATCH v3 0/4] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-05-27 15:41     ` [PATCH v3 1/4] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-05-28 11:55       ` Patrick Steinhardt
2024-05-30  8:32         ` Xing Xin
2024-05-27 15:41     ` [PATCH v3 2/4] unbundle: introduce unbundle_fsck_flags for fsckobjects handling Xing Xin via GitGitGadget
2024-05-28 12:03       ` Patrick Steinhardt
2024-05-29 18:12         ` Xing Xin
2024-05-30  4:38           ` Patrick Steinhardt
2024-05-30  8:46             ` Xing Xin
2024-05-27 15:41     ` [PATCH v3 3/4] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-05-28 12:03       ` Patrick Steinhardt
2024-05-28 17:10         ` Junio C Hamano
2024-05-28 17:24           ` Junio C Hamano
2024-05-29  5:52             ` Patrick Steinhardt
2024-05-30  8:48             ` Xing Xin
2024-05-29  5:52           ` Patrick Steinhardt
2024-05-27 15:41     ` [PATCH v3 4/4] unbundle: introduce new option UNBUNDLE_FSCK_FOLLOW_FETCH Xing Xin via GitGitGadget
2024-05-28 12:05       ` Patrick Steinhardt
2024-05-30  8:54         ` Xing Xin
2024-05-30  8:21     ` blanet via GitGitGadget [this message]
2024-05-30  8:21       ` [PATCH v4 1/4] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-05-30  8:21       ` [PATCH v4 2/4] unbundle: extend verify_bundle_flags to support fsck-objects Xing Xin via GitGitGadget
2024-06-06 12:06         ` Patrick Steinhardt
2024-06-11  6:46           ` Xing Xin
2024-05-30  8:21       ` [PATCH v4 3/4] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-05-30  8:21       ` [PATCH v4 4/4] unbundle: introduce option VERIFY_BUNDLE_FSCK_FOLLOW_FETCH Xing Xin via GitGitGadget
2024-06-06 12:06         ` Patrick Steinhardt
2024-06-11  6:46           ` Xing Xin
2024-06-11  6:42       ` [PATCH v5 0/4] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-06-11  6:42         ` [PATCH v5 1/4] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-11  6:42         ` [PATCH v5 2/4] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-11  6:42         ` [PATCH v5 3/4] unbundle: extend options to support object verification Xing Xin via GitGitGadget
2024-06-11  9:11           ` Patrick Steinhardt
2024-06-11 12:47             ` Xing Xin
2024-06-11  6:42         ` [PATCH v5 4/4] unbundle: use VERIFY_BUNDLE_FSCK_FOLLOW_FETCH for fetches Xing Xin via GitGitGadget
2024-06-11 12:45         ` [PATCH v6 0/3] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-06-11 12:45           ` [PATCH v6 1/3] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-11 19:08             ` Junio C Hamano
2024-06-17 13:53               ` Xing Xin
2024-06-11 12:45           ` [PATCH v6 2/3] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-11 19:20             ` Junio C Hamano
2024-06-11 12:45           ` [PATCH v6 3/3] unbundle: support object verification for fetches Xing Xin via GitGitGadget
2024-06-11 20:05             ` Junio C Hamano
2024-06-12 18:33               ` Xing Xin
2024-06-11 13:14           ` [PATCH v6 0/3] object checking related additions and fixes for bundles in fetches Patrick Steinhardt
2024-06-17 13:55           ` [PATCH v7 " blanet via GitGitGadget
2024-06-17 13:55             ` [PATCH v7 1/3] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-18 17:37               ` Junio C Hamano
2024-06-19  6:30                 ` Xing Xin
2024-06-17 13:55             ` [PATCH v7 2/3] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-17 13:55             ` [PATCH v7 3/3] unbundle: extend object verification for fetches Xing Xin via GitGitGadget
2024-06-19  4:07             ` [PATCH v8 0/3] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-06-19  4:07               ` [PATCH v8 1/3] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-19  4:07               ` [PATCH v8 2/3] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-19  4:07               ` [PATCH v8 3/3] unbundle: extend object verification for fetches Xing Xin via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1730.v4.git.1717057290.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=bupt_xingxin@163.com \
    --cc=git@vger.kernel.org \
    --cc=karthik.188@gmail.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.