All of lore.kernel.org
 help / color / mirror / Atom feed
From: "blanet via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>,
	Karthik Nayak <karthik.188@gmail.com>,
	blanet <bupt_xingxin@163.com>
Subject: [PATCH v5 0/4] object checking related additions and fixes for bundles in fetches
Date: Tue, 11 Jun 2024 06:42:02 +0000	[thread overview]
Message-ID: <pull.1730.v5.git.1718088126.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1730.v4.git.1717057290.gitgitgadget@gmail.com>

While attempting to fix a reference negotiation bug in bundle-uri, we
identified that the fetch process lacks some crucial object validation
checks when processing bundles. The primary issues are:

 1. In the bundle-uri scenario, object IDs were not validated before writing
    bundle references. This was the root cause of the original negotiation
    bug in bundle-uri and could lead to potential repository corruption.
 2. The existing "fetch.fsckObjects" and "transfer.fsckObjects"
    configurations were not applied when directly fetching bundles or
    fetching with bundle-uri enabled. In fact, there were no object
    validation supports for unbundle.

The first patch addresses the bundle-uri negotiation issue by removing the
REF_SKIP_OID_VERIFICATION flag when writing bundle references.

Patches 2 through 4 extend verify_bundle_flags for bundle.c:unbundle to add
support for object validation (fsck) in different scenarios, mainly
following the suggestions from Junio on the mailing list.

Xing Xin (4):
  bundle-uri: verify oid before writing refs
  fetch-pack: expose fsckObjects configuration logic
  unbundle: extend options to support object verification
  unbundle: use VERIFY_BUNDLE_FSCK_FOLLOW_FETCH for fetches

 bundle-uri.c                |   5 +-
 bundle.c                    |  10 ++
 bundle.h                    |   2 +
 fetch-pack.c                |  17 ++--
 fetch-pack.h                |   5 +
 t/t5558-clone-bundle-uri.sh | 186 +++++++++++++++++++++++++++++++++++-
 t/t5607-clone-bundle.sh     |  33 +++++++
 transport.c                 |   2 +-
 8 files changed, 246 insertions(+), 14 deletions(-)


base-commit: b9cfe4845cb2562584837bc0101c0ab76490a239
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1730%2Fblanet%2Fxx%2Fbundle-uri-bug-using-bundle-list-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1730/blanet/xx/bundle-uri-bug-using-bundle-list-v5
Pull-Request: https://github.com/gitgitgadget/git/pull/1730

Range-diff vs v4:

 1:  e958a3ab20c = 1:  e958a3ab20c bundle-uri: verify oid before writing refs
 3:  5ddc894c2c1 = 2:  d21c236b8de fetch-pack: expose fsckObjects configuration logic
 2:  beb70735811 ! 3:  0a18d7839be unbundle: extend verify_bundle_flags to support fsck-objects
     @@ Metadata
      Author: Xing Xin <xingxin.xx@bytedance.com>
      
       ## Commit message ##
     -    unbundle: extend verify_bundle_flags to support fsck-objects
     +    unbundle: extend options to support object verification
      
     -    This commit extends `verify_bundle_flags` by adding a new option
     -    `VERIFY_BUNDLE_FSCK_ALWAYS`, which enables checks for broken objects in
     -    `bundle.c:unbundle`. This option is now used as the default for fetches
     -    involving bundles, specifically by `transport.c:fetch_refs_from_bundle`
     -    for direct bundle fetches and by `bundle-uri.c:unbundle_from_file` for
     -    _bundle-uri_ enabled fetches.
     +    This commit extends object verification support in `bundle.c:unbundle`
     +    by adding two new options to `verify_bundle_flags`:
      
     -    Upcoming commits will introduce another option as a replacement that
     -    fits better with fetch operations. `VERIFY_BUNDLE_FSCK_ALWAYS` will be
     -    further used to add "--fsck-objects" support for "git bundle unbundle"
     -    and "git bundle verify".
     +    - `VERIFY_BUNDLE_FSCK_ALWAYS` explicitly enables checks for broken
     +      objects. It will be used to add "--fsck-objects" support for "git
     +      bundle unbundle" in a separate series.
     +    - `VERIFY_BUNDLE_FSCK_FOLLOW_FETCH` is designed to be used during fetch
     +      operations, specifically for direct bundle fetches and _bundle-uri_
     +      enabled fetches. When enabled, `bundle.c:unbundle` invokes
     +      `fetch-pack.c:fetch_pack_fsck_objects` to determine whether to enable
     +      checks for broken objects. Passing this flag during fetching will be
     +      implemented in a subsequent commit.
     +
     +    Note that the option `VERIFY_BUNDLE_FSCK_ALWAYS` takes precedence over
     +    `VERIFY_BUNDLE_FSCK_FOLLOW_FETCH`.
      
          Reviewed-by: Patrick Steinhardt <ps@pks.im>
          Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
      
     - ## bundle-uri.c ##
     -@@ bundle-uri.c: static int unbundle_from_file(struct repository *r, const char *file)
     - 	 * the prerequisite commits.
     - 	 */
     - 	if ((result = unbundle(r, &header, bundle_fd, NULL,
     --			       VERIFY_BUNDLE_QUIET)))
     -+			       VERIFY_BUNDLE_QUIET | VERIFY_BUNDLE_FSCK_ALWAYS)))
     - 		return 1;
     - 
     - 	/*
     -
       ## bundle.c ##
     +@@
     + #include "list-objects-filter-options.h"
     + #include "connected.h"
     + #include "write-or-die.h"
     ++#include "fetch-pack.h"
     + 
     + static const char v2_bundle_signature[] = "# v2 git bundle\n";
     + static const char v3_bundle_signature[] = "# v3 git bundle\n";
     +@@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
     + 	     enum verify_bundle_flags flags)
     + {
     + 	struct child_process ip = CHILD_PROCESS_INIT;
     ++	int fsck_objects = 0;
     + 
     + 	if (verify_bundle(r, header, flags))
     + 		return -1;
      @@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
       	if (header->filter.choice)
       		strvec_push(&ip.args, "--promisor=from-bundle");
       
      +	if (flags & VERIFY_BUNDLE_FSCK_ALWAYS)
     ++		fsck_objects = 1;
     ++	else if (flags & VERIFY_BUNDLE_FSCK_FOLLOW_FETCH)
     ++		fsck_objects = fetch_pack_fsck_objects();
     ++
     ++	if (fsck_objects)
      +		strvec_push(&ip.args, "--fsck-objects");
      +
       	if (extra_index_pack_args) {
     @@ bundle.h: int create_bundle(struct repository *r, const char *path,
       	VERIFY_BUNDLE_VERBOSE = (1 << 0),
       	VERIFY_BUNDLE_QUIET = (1 << 1),
      +	VERIFY_BUNDLE_FSCK_ALWAYS = (1 << 2),
     ++	VERIFY_BUNDLE_FSCK_FOLLOW_FETCH = (1 << 3),
       };
       
       int verify_bundle(struct repository *r, struct bundle_header *header,
     -
     - ## transport.c ##
     -@@ transport.c: static int fetch_refs_from_bundle(struct transport *transport,
     - 	if (!data->get_refs_from_bundle_called)
     - 		get_refs_from_bundle_inner(transport);
     - 	ret = unbundle(the_repository, &data->header, data->fd,
     --		       &extra_index_pack_args, 0);
     -+		       &extra_index_pack_args, VERIFY_BUNDLE_FSCK_ALWAYS);
     - 	transport->hash_algo = data->header.hash_algo;
     - 	return ret;
     - }
 4:  68b9bca9f8b ! 4:  eb9f21f16b5 unbundle: introduce option VERIFY_BUNDLE_FSCK_FOLLOW_FETCH
     @@ Metadata
      Author: Xing Xin <xingxin.xx@bytedance.com>
      
       ## Commit message ##
     -    unbundle: introduce option VERIFY_BUNDLE_FSCK_FOLLOW_FETCH
     +    unbundle: use VERIFY_BUNDLE_FSCK_FOLLOW_FETCH for fetches
      
     -    This commit introduces a new option `VERIFY_BUNDLE_FSCK_FOLLOW_FETCH` to
     -    `verify_bundle_flags`. In `bundle.c:unbundle`, this new option controls
     -    whether broken object checks should be enabled by invoking
     -    `fetch-pack.c:fetch_pack_fsck_objects`. Note that the option
     -    `VERIFY_BUNDLE_FSCK_ALWAYS` takes precedence over
     -    `VERIFY_BUNDLE_FSCK_FOLLOW_FETCH`.
     -
     -    This flag is now used in the fetching process by:
     +    This commit passes `VERIFY_BUNDLE_FSCK_FOLLOW_FETCH` to `unbundle` in
     +    the fetching process, including:
      
          - `transport.c:fetch_refs_from_bundle` for direct bundle fetches.
          - `bundle-uri.c:unbundle_from_file` for bundle-uri enabled fetches.
     @@ Commit message
          fetch operations. Tests have been added to confirm functionality in the
          scenarios mentioned above.
      
     +    Reviewed-by: Patrick Steinhardt <ps@pks.im>
          Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
      
       ## bundle-uri.c ##
     @@ bundle-uri.c: static int unbundle_from_file(struct repository *r, const char *fi
       	 * the prerequisite commits.
       	 */
       	if ((result = unbundle(r, &header, bundle_fd, NULL,
     --			       VERIFY_BUNDLE_QUIET | VERIFY_BUNDLE_FSCK_ALWAYS)))
     +-			       VERIFY_BUNDLE_QUIET)))
      +			       VERIFY_BUNDLE_QUIET | VERIFY_BUNDLE_FSCK_FOLLOW_FETCH)))
       		return 1;
       
       	/*
      
     - ## bundle.c ##
     -@@
     - #include "list-objects-filter-options.h"
     - #include "connected.h"
     - #include "write-or-die.h"
     -+#include "fetch-pack.h"
     - 
     - static const char v2_bundle_signature[] = "# v2 git bundle\n";
     - static const char v3_bundle_signature[] = "# v3 git bundle\n";
     -@@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
     - 	     enum verify_bundle_flags flags)
     - {
     - 	struct child_process ip = CHILD_PROCESS_INIT;
     -+	int fsck_objects = 0;
     - 
     - 	if (verify_bundle(r, header, flags))
     - 		return -1;
     -@@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
     - 		strvec_push(&ip.args, "--promisor=from-bundle");
     - 
     - 	if (flags & VERIFY_BUNDLE_FSCK_ALWAYS)
     -+		fsck_objects = 1;
     -+	else if (flags & VERIFY_BUNDLE_FSCK_FOLLOW_FETCH)
     -+		fsck_objects = fetch_pack_fsck_objects();
     -+
     -+	if (fsck_objects)
     - 		strvec_push(&ip.args, "--fsck-objects");
     - 
     - 	if (extra_index_pack_args) {
     -
     - ## bundle.h ##
     -@@ bundle.h: enum verify_bundle_flags {
     - 	VERIFY_BUNDLE_VERBOSE = (1 << 0),
     - 	VERIFY_BUNDLE_QUIET = (1 << 1),
     - 	VERIFY_BUNDLE_FSCK_ALWAYS = (1 << 2),
     -+	VERIFY_BUNDLE_FSCK_FOLLOW_FETCH = (1 << 3),
     - };
     - 
     - int verify_bundle(struct repository *r, struct bundle_header *header,
     -
       ## t/t5558-clone-bundle-uri.sh ##
      @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'create bundle' '
       		git bundle create B.bundle topic &&
     @@ transport.c: static int fetch_refs_from_bundle(struct transport *transport,
       	if (!data->get_refs_from_bundle_called)
       		get_refs_from_bundle_inner(transport);
       	ret = unbundle(the_repository, &data->header, data->fd,
     --		       &extra_index_pack_args, VERIFY_BUNDLE_FSCK_ALWAYS);
     +-		       &extra_index_pack_args, 0);
      +		       &extra_index_pack_args, VERIFY_BUNDLE_FSCK_FOLLOW_FETCH);
       	transport->hash_algo = data->header.hash_algo;
       	return ret;

-- 
gitgitgadget

  parent reply	other threads:[~2024-06-11  6:42 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-15  3:01 [PATCH] bundle-uri: refresh packed_git if unbundle succeed blanet via GitGitGadget
2024-05-17  5:00 ` Patrick Steinhardt
2024-05-17 16:14   ` Junio C Hamano
2024-05-20 11:48     ` Xing Xin
2024-05-20 17:19       ` Junio C Hamano
2024-05-27 16:04         ` Xing Xin
2024-05-20  9:41   ` Xing Xin
2024-05-17  7:36 ` Karthik Nayak
2024-05-20 10:19   ` Xing Xin
2024-05-20 12:36 ` [PATCH v2] bundle-uri: verify oid before writing refs blanet via GitGitGadget
2024-05-21 15:41   ` Karthik Nayak
2024-05-27 15:41   ` [PATCH v3 0/4] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-05-27 15:41     ` [PATCH v3 1/4] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-05-28 11:55       ` Patrick Steinhardt
2024-05-30  8:32         ` Xing Xin
2024-05-27 15:41     ` [PATCH v3 2/4] unbundle: introduce unbundle_fsck_flags for fsckobjects handling Xing Xin via GitGitGadget
2024-05-28 12:03       ` Patrick Steinhardt
2024-05-29 18:12         ` Xing Xin
2024-05-30  4:38           ` Patrick Steinhardt
2024-05-30  8:46             ` Xing Xin
2024-05-27 15:41     ` [PATCH v3 3/4] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-05-28 12:03       ` Patrick Steinhardt
2024-05-28 17:10         ` Junio C Hamano
2024-05-28 17:24           ` Junio C Hamano
2024-05-29  5:52             ` Patrick Steinhardt
2024-05-30  8:48             ` Xing Xin
2024-05-29  5:52           ` Patrick Steinhardt
2024-05-27 15:41     ` [PATCH v3 4/4] unbundle: introduce new option UNBUNDLE_FSCK_FOLLOW_FETCH Xing Xin via GitGitGadget
2024-05-28 12:05       ` Patrick Steinhardt
2024-05-30  8:54         ` Xing Xin
2024-05-30  8:21     ` [PATCH v4 0/4] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-05-30  8:21       ` [PATCH v4 1/4] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-05-30  8:21       ` [PATCH v4 2/4] unbundle: extend verify_bundle_flags to support fsck-objects Xing Xin via GitGitGadget
2024-06-06 12:06         ` Patrick Steinhardt
2024-06-11  6:46           ` Xing Xin
2024-05-30  8:21       ` [PATCH v4 3/4] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-05-30  8:21       ` [PATCH v4 4/4] unbundle: introduce option VERIFY_BUNDLE_FSCK_FOLLOW_FETCH Xing Xin via GitGitGadget
2024-06-06 12:06         ` Patrick Steinhardt
2024-06-11  6:46           ` Xing Xin
2024-06-11  6:42       ` blanet via GitGitGadget [this message]
2024-06-11  6:42         ` [PATCH v5 1/4] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-11  6:42         ` [PATCH v5 2/4] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-11  6:42         ` [PATCH v5 3/4] unbundle: extend options to support object verification Xing Xin via GitGitGadget
2024-06-11  9:11           ` Patrick Steinhardt
2024-06-11 12:47             ` Xing Xin
2024-06-11  6:42         ` [PATCH v5 4/4] unbundle: use VERIFY_BUNDLE_FSCK_FOLLOW_FETCH for fetches Xing Xin via GitGitGadget
2024-06-11 12:45         ` [PATCH v6 0/3] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-06-11 12:45           ` [PATCH v6 1/3] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-11 19:08             ` Junio C Hamano
2024-06-17 13:53               ` Xing Xin
2024-06-11 12:45           ` [PATCH v6 2/3] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-11 19:20             ` Junio C Hamano
2024-06-11 12:45           ` [PATCH v6 3/3] unbundle: support object verification for fetches Xing Xin via GitGitGadget
2024-06-11 20:05             ` Junio C Hamano
2024-06-12 18:33               ` Xing Xin
2024-06-11 13:14           ` [PATCH v6 0/3] object checking related additions and fixes for bundles in fetches Patrick Steinhardt
2024-06-17 13:55           ` [PATCH v7 " blanet via GitGitGadget
2024-06-17 13:55             ` [PATCH v7 1/3] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-18 17:37               ` Junio C Hamano
2024-06-19  6:30                 ` Xing Xin
2024-06-17 13:55             ` [PATCH v7 2/3] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-17 13:55             ` [PATCH v7 3/3] unbundle: extend object verification for fetches Xing Xin via GitGitGadget
2024-06-19  4:07             ` [PATCH v8 0/3] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-06-19  4:07               ` [PATCH v8 1/3] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-19  4:07               ` [PATCH v8 2/3] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-19  4:07               ` [PATCH v8 3/3] unbundle: extend object verification for fetches Xing Xin via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1730.v5.git.1718088126.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=bupt_xingxin@163.com \
    --cc=git@vger.kernel.org \
    --cc=karthik.188@gmail.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.