* [RFC PATCH 0/2] builtin/receive-pack: introduce option to skip connectivity checks @ 2025-05-07 3:02 Justin Tobler 2025-05-07 3:02 ` [RFC PATCH 1/2] t5412: test receive-pack connectivity check Justin Tobler ` (2 more replies) 0 siblings, 3 replies; 25+ messages in thread From: Justin Tobler @ 2025-05-07 3:02 UTC (permalink / raw) To: git; +Cc: ps, karthik.188, Justin Tobler Greetings, At GitLab, we are interested in introducing an optional means to bypass the connectivity checks performed through git-receive-pack(1). This series implements a `--skip-connectivity-check` option to facilitate this. I'm interested in collecting some thoughts before pursuing further though. For some background, we have a transaction management system that runs in our Git RPC service and wraps all repository operations. Operations that write to a repository are first recorded and staged outside of the repository. When committing a transaction, the connectivity of newly written objects is checked by walking the object graph containing only the new objects from the updated tips and identifying the missing objects which represent the boundary between the new objects and the repository. The boundary objects are then checked in the canonical repository to ensure the new objects will connect as expected. All repository operations are run in a transaction and conflict checked before being applied serially to the canonical repository. This ensures that operations that would break the repository are not applied. For our specific use case, the conflict checks performed by git-receive-pack(1) are redundant and thus we would like to introduce an option that allows connectivity checks to be skipped. One concern I see could be the `--receive-pack` flag for git-push(1) or the `remote.<name>.receivepack` option which configures the receive-pack program that gets executed on the remote side for some protocols. This could provide a way for users to enable such a flag if the remote doesn't protect against arbitrary arguments being added. I don't think git-shell(1) would protect against this, so maybe instead of a flag a config option could be used or an ENV var? Thanks for taking a look. -Justin Justin Tobler (2): t5412: test receive-pack connectivity check builtin/receive-pack: add option to skip connectivity check builtin/receive-pack.c | 40 +++++++++++++++++++---------------- t/meson.build | 1 + t/t5412-receive-pack.sh | 47 +++++++++++++++++++++++++++++++++++++++++ 3 files changed, 70 insertions(+), 18 deletions(-) create mode 100755 t/t5412-receive-pack.sh base-commit: 6c0bd1fc70efaf053abe4e57c976afdc72d15377 -- 2.49.0.111.g5b97a56fa0 ^ permalink raw reply [flat|nested] 25+ messages in thread
* [RFC PATCH 1/2] t5412: test receive-pack connectivity check 2025-05-07 3:02 [RFC PATCH 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler @ 2025-05-07 3:02 ` Justin Tobler 2025-05-07 13:28 ` Patrick Steinhardt 2025-05-07 3:02 ` [RFC PATCH 2/2] builtin/receive-pack: add option to skip " Justin Tobler 2025-05-20 1:49 ` [PATCH 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler 2 siblings, 1 reply; 25+ messages in thread From: Justin Tobler @ 2025-05-07 3:02 UTC (permalink / raw) To: git; +Cc: ps, karthik.188, Justin Tobler As part of git-recieve-pack(1), the connectivity of objects is checked. Add a test validating that git-receive-pack(1) fails due to an incoming packfile that would leave the repository with missing objects. Signed-off-by: Justin Tobler <jltobler@gmail.com> --- t/meson.build | 1 + t/t5412-receive-pack.sh | 27 +++++++++++++++++++++++++++ 2 files changed, 28 insertions(+) create mode 100755 t/t5412-receive-pack.sh diff --git a/t/meson.build b/t/meson.build index 43c9750b88..81066668b9 100644 --- a/t/meson.build +++ b/t/meson.build @@ -630,6 +630,7 @@ integration_tests = [ 't5409-colorize-remote-messages.sh', 't5410-receive-pack-alternates.sh', 't5411-proc-receive-hook.sh', + 't5412-receive-pack.sh', 't5500-fetch-pack.sh', 't5501-fetch-push-alternates.sh', 't5502-quickfetch.sh', diff --git a/t/t5412-receive-pack.sh b/t/t5412-receive-pack.sh new file mode 100755 index 0000000000..190c7d3624 --- /dev/null +++ b/t/t5412-receive-pack.sh @@ -0,0 +1,27 @@ +#!/bin/sh + +test_description='git receive-pack connectivity checks' + +. ./test-lib.sh + +test_expect_success 'receive-pack missing objects fails connectivity check' ' + test_when_finished rm -rf repo remote.git setup.git && + + git init repo && + git -C repo commit --allow-empty -m 1 && + git clone --bare repo setup.git && + git -C repo commit --allow-empty -m 2 && + + # Capture git-send-pack(1) output sent to git-receive-pack(1). + git -C repo send-pack ../setup.git --all \ + --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && + + # Replay captured git-send-pack(1) output on new empty repository. + git init --bare remote.git && + git receive-pack remote.git <out >actual && + + test_grep "fatal: Failed to traverse parents" actual && + test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) +' + +test_done -- 2.49.0.111.g5b97a56fa0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [RFC PATCH 1/2] t5412: test receive-pack connectivity check 2025-05-07 3:02 ` [RFC PATCH 1/2] t5412: test receive-pack connectivity check Justin Tobler @ 2025-05-07 13:28 ` Patrick Steinhardt 2025-05-19 21:08 ` Justin Tobler 0 siblings, 1 reply; 25+ messages in thread From: Patrick Steinhardt @ 2025-05-07 13:28 UTC (permalink / raw) To: Justin Tobler; +Cc: git, karthik.188 On Tue, May 06, 2025 at 10:02:48PM -0500, Justin Tobler wrote: > As part of git-recieve-pack(1), the connectivity of objects is checked. > Add a test validating that git-receive-pack(1) fails due to an incoming > packfile that would leave the repository with missing objects. > > Signed-off-by: Justin Tobler <jltobler@gmail.com> > --- > t/meson.build | 1 + > t/t5412-receive-pack.sh | 27 +++++++++++++++++++++++++++ > 2 files changed, 28 insertions(+) > create mode 100755 t/t5412-receive-pack.sh > > diff --git a/t/meson.build b/t/meson.build > index 43c9750b88..81066668b9 100644 > --- a/t/meson.build > +++ b/t/meson.build > @@ -630,6 +630,7 @@ integration_tests = [ > 't5409-colorize-remote-messages.sh', > 't5410-receive-pack-alternates.sh', > 't5411-proc-receive-hook.sh', > + 't5412-receive-pack.sh', Instead of creating a new test file, do we maybe want to generalize "t5410-receive-pack-alternates.sh"? Just a suggestion, this is not a strong requirement from my side. > diff --git a/t/t5412-receive-pack.sh b/t/t5412-receive-pack.sh > new file mode 100755 > index 0000000000..190c7d3624 > --- /dev/null > +++ b/t/t5412-receive-pack.sh > @@ -0,0 +1,27 @@ > +#!/bin/sh > + > +test_description='git receive-pack connectivity checks' The description is way more specific than the file name suggests. > +. ./test-lib.sh > + > +test_expect_success 'receive-pack missing objects fails connectivity check' ' > + test_when_finished rm -rf repo remote.git setup.git && > + > + git init repo && > + git -C repo commit --allow-empty -m 1 && > + git clone --bare repo setup.git && > + git -C repo commit --allow-empty -m 2 && Okay, we create two repositories. "repo" contains the full history, "setup.git" only contains the first commit. > + # Capture git-send-pack(1) output sent to git-receive-pack(1). > + git -C repo send-pack ../setup.git --all \ The `-C repo` shouldn't be necessary at all, should it? The repository in which it runs is specified via the first parameter. > + --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && > + > + # Replay captured git-send-pack(1) output on new empty repository. > + git init --bare remote.git && > + git receive-pack remote.git <out >actual && And then we reply the packfile that only contains the second commit onto an empty repository, which should of course fail because we don't have all files. > + test_grep "fatal: Failed to traverse parents" actual && > + test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) I'm a bit surprised by the error message though. First, why is it on stdout? Second, shouldn't there be some hint that the connectivity check has failed in the error message? Patrick ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC PATCH 1/2] t5412: test receive-pack connectivity check 2025-05-07 13:28 ` Patrick Steinhardt @ 2025-05-19 21:08 ` Justin Tobler 0 siblings, 0 replies; 25+ messages in thread From: Justin Tobler @ 2025-05-19 21:08 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git, karthik.188 On 25/05/07 03:28PM, Patrick Steinhardt wrote: > On Tue, May 06, 2025 at 10:02:48PM -0500, Justin Tobler wrote: > > @@ -630,6 +630,7 @@ integration_tests = [ > > 't5409-colorize-remote-messages.sh', > > 't5410-receive-pack-alternates.sh', > > 't5411-proc-receive-hook.sh', > > + 't5412-receive-pack.sh', > > Instead of creating a new test file, do we maybe want to generalize > "t5410-receive-pack-alternates.sh"? Just a suggestion, this is not a > strong requirement from my side. That makes sense to me. I'll generalize "t5410" and add the test there. > > + # Capture git-send-pack(1) output sent to git-receive-pack(1). > > + git -C repo send-pack ../setup.git --all \ > > The `-C repo` shouldn't be necessary at all, should it? The repository > in which it runs is specified via the first parameter. The repository specified by the first parameter is the repository the pack is being sent to. We still need to define the repository on the sending side. > > + test_grep "fatal: Failed to traverse parents" actual && > > + test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) > > I'm a bit surprised by the error message though. First, why is it on > stdout? Second, shouldn't there be some hint that the connectivity check > has failed in the error message? From my understanding, git-receive-pack(1) also writes the errors it encounters on stdout using sideband. On stderr, git-receive-pack(1) does print the git-rev-list(1) errors it encounters also so we could just capture that and use it in the assertions instead. When the connectivity check fails, the command error string is set to "missing necessary objects" and outputted on stdout. I will also add a check for this too. -Justin ^ permalink raw reply [flat|nested] 25+ messages in thread
* [RFC PATCH 2/2] builtin/receive-pack: add option to skip connectivity check 2025-05-07 3:02 [RFC PATCH 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler 2025-05-07 3:02 ` [RFC PATCH 1/2] t5412: test receive-pack connectivity check Justin Tobler @ 2025-05-07 3:02 ` Justin Tobler 2025-05-07 13:28 ` Patrick Steinhardt 2025-05-20 1:49 ` [PATCH 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler 2 siblings, 1 reply; 25+ messages in thread From: Justin Tobler @ 2025-05-07 3:02 UTC (permalink / raw) To: git; +Cc: ps, karthik.188, Justin Tobler During git-receive-pack(1), connectivity of the object graph is validated to ensure that the received packfile does not leave the repository in a broken state. Generally, this check is critical to avoid an incomplete receieved packfile from corrupting a repository. In situations where server operators validate the connectivity of incoming objects outside of Git, such a check may be redundant. Introduce the --skip-connectivity-check option for git-receive-pack(1) which bypasses this connectivity check to give more control to on the server-side. Signed-off-by: Justin Tobler <jltobler@gmail.com> --- builtin/receive-pack.c | 40 ++++++++++++++++++++++------------------ t/t5412-receive-pack.sh | 20 ++++++++++++++++++++ 2 files changed, 42 insertions(+), 18 deletions(-) diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index be314879e8..66674bc408 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -81,6 +81,7 @@ static int prefer_ofs_delta = 1; static int auto_update_server_info; static int auto_gc = 1; static int reject_thin; +static int skip_connectivity_check; static int stateless_rpc; static const char *service_dir; static const char *head_name; @@ -1936,27 +1937,29 @@ static void execute_commands(struct command *commands, return; } - if (use_sideband) { - memset(&muxer, 0, sizeof(muxer)); - muxer.proc = copy_to_sideband; - muxer.in = -1; - if (!start_async(&muxer)) - err_fd = muxer.in; - /* ...else, continue without relaying sideband */ - } + if (!skip_connectivity_check) { + if (use_sideband) { + memset(&muxer, 0, sizeof(muxer)); + muxer.proc = copy_to_sideband; + muxer.in = -1; + if (!start_async(&muxer)) + err_fd = muxer.in; + /* ...else, continue without relaying sideband */ + } - data.cmds = commands; - data.si = si; - opt.err_fd = err_fd; - opt.progress = err_fd && !quiet; - opt.env = tmp_objdir_env(tmp_objdir); - opt.exclude_hidden_refs_section = "receive"; + data.cmds = commands; + data.si = si; + opt.err_fd = err_fd; + opt.progress = err_fd && !quiet; + opt.env = tmp_objdir_env(tmp_objdir); + opt.exclude_hidden_refs_section = "receive"; - if (check_connected(iterate_receive_command_list, &data, &opt)) - set_connectivity_errors(commands, si); + if (check_connected(iterate_receive_command_list, &data, &opt)) + set_connectivity_errors(commands, si); - if (use_sideband) - finish_async(&muxer); + if (use_sideband) + finish_async(&muxer); + } reject_updates_to_hidden(commands); @@ -2517,6 +2520,7 @@ int cmd_receive_pack(int argc, struct option options[] = { OPT__QUIET(&quiet, N_("quiet")), + OPT_HIDDEN_BOOL(0, "skip-connectivity-check", &skip_connectivity_check, NULL), OPT_HIDDEN_BOOL(0, "stateless-rpc", &stateless_rpc, NULL), OPT_HIDDEN_BOOL(0, "http-backend-info-refs", &advertise_refs, NULL), OPT_ALIAS(0, "advertise-refs", "http-backend-info-refs"), diff --git a/t/t5412-receive-pack.sh b/t/t5412-receive-pack.sh index 190c7d3624..426a3a3df4 100755 --- a/t/t5412-receive-pack.sh +++ b/t/t5412-receive-pack.sh @@ -24,4 +24,24 @@ test_expect_success 'receive-pack missing objects fails connectivity check' ' test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) ' +test_expect_success 'receive-pack missing objects bypasses connectivity check' ' + test_when_finished rm -rf repo remote.git setup.git && + + git init repo && + git -C repo commit --allow-empty -m 1 && + git clone --bare repo setup.git && + git -C repo commit --allow-empty -m 2 && + + # Capture git-send-pack(1) output sent to git-receive-pack(1). + git -C repo send-pack ../setup.git --all \ + --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && + + # Replay captured git-send-pack(1) output on new empty repository. + git init --bare remote.git && + git receive-pack --skip-connectivity-check remote.git <out >actual && + + test_grep ! "fatal: Failed to traverse parents" actual && + git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) +' + test_done -- 2.49.0.111.g5b97a56fa0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [RFC PATCH 2/2] builtin/receive-pack: add option to skip connectivity check 2025-05-07 3:02 ` [RFC PATCH 2/2] builtin/receive-pack: add option to skip " Justin Tobler @ 2025-05-07 13:28 ` Patrick Steinhardt 2025-05-07 17:20 ` Junio C Hamano 0 siblings, 1 reply; 25+ messages in thread From: Patrick Steinhardt @ 2025-05-07 13:28 UTC (permalink / raw) To: Justin Tobler; +Cc: git, karthik.188 On Tue, May 06, 2025 at 10:02:49PM -0500, Justin Tobler wrote: > During git-receive-pack(1), connectivity of the object graph is > validated to ensure that the received packfile does not leave the > repository in a broken state. > > Generally, this check is critical to avoid an incomplete receieved s/receieved/received/ > packfile from corrupting a repository. In situations where server > operators validate the connectivity of incoming objects outside of Git, > such a check may be redundant. This is a bit handwavy. _I_ know why we at GitLab are doing this, but other readers won't have the necessary context to be able to judge whether this really is a good idea. I think the important question to answer is: why does the server side want to perform the check if Git already does it anyway? Why is it in a better position to do so? And why can't we instead have Git itself perform it in the same "better" way? Ultimately it boils down to having more knowledge around exactly how Git is being used on the server side. With the additional information we can make better decisions and we can make assumptions that a general user of the connectivity check cannot do. Most importantly, we know that all objects in the repository will always be fully connected. Received packfiles get filtered, so we won't ever accept an object that isn't fully connected. Neither will Gitaly write such an object. So what we can do in Gitaly specifically is similar to what I proposed in [1] a while ago: we simply walk all received objects and then verify that the edges resolve to objects in the existing repository. The idea was rightfully shot down because we cannot assume in general that a mere object walk in the quarantine directory really means that all objects are fully connected. But with the extra knowledge we have in Gitaly we can do this optimization indeed. Patrick [1]: <cover.1621451532.git.ps@pks.im> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC PATCH 2/2] builtin/receive-pack: add option to skip connectivity check 2025-05-07 13:28 ` Patrick Steinhardt @ 2025-05-07 17:20 ` Junio C Hamano 0 siblings, 0 replies; 25+ messages in thread From: Junio C Hamano @ 2025-05-07 17:20 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: Justin Tobler, git, karthik.188 Patrick Steinhardt <ps@pks.im> writes: > On Tue, May 06, 2025 at 10:02:49PM -0500, Justin Tobler wrote: >> During git-receive-pack(1), connectivity of the object graph is >> validated to ensure that the received packfile does not leave the >> repository in a broken state. >> >> Generally, this check is critical to avoid an incomplete receieved > > s/receieved/received/ > >> packfile from corrupting a repository. In situations where server >> operators validate the connectivity of incoming objects outside of Git, >> such a check may be redundant. > > This is a bit handwavy. _I_ know why we at GitLab are doing this, but > other readers won't have the necessary context to be able to judge > whether this really is a good idea. I think the important question to > answer is: why does the server side want to perform the check if Git > already does it anyway? Why is it in a better position to do so? And why > can't we instead have Git itself perform it in the same "better" way? All of these would be interesting questions to be answered also in the documentation patch (yet to come, I presume) that warns against use of this new option by mere mortals without thinking things through. "Unless your receiving end has such and such facility to ensure that new data taken from the pack stream really makes the objects at the new tips of refs being proposed by this incoming "git push", you'll risk corrupting your repository" or something. Otherwise, I think the cover letter sells the "feature" nicely and in a convincing way. Thanks. ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 0/2] builtin/receive-pack: introduce option to skip connectivity checks 2025-05-07 3:02 [RFC PATCH 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler 2025-05-07 3:02 ` [RFC PATCH 1/2] t5412: test receive-pack connectivity check Justin Tobler 2025-05-07 3:02 ` [RFC PATCH 2/2] builtin/receive-pack: add option to skip " Justin Tobler @ 2025-05-20 1:49 ` Justin Tobler 2025-05-20 1:49 ` [PATCH 1/2] t5410: test receive-pack connectivity check Justin Tobler ` (2 more replies) 2 siblings, 3 replies; 25+ messages in thread From: Justin Tobler @ 2025-05-20 1:49 UTC (permalink / raw) To: git; +Cc: ps, karthik.188, Justin Tobler Greetings, At GitLab, we are interested in introducing an optional means to bypass the connectivity checks performed through git-receive-pack(1). This series implements a `--skip-connectivity-check` option to facilitate this. For some background, we have a transaction management system that runs in our Git RPC service and wraps all repository operations. Operations that write to a repository are first recorded and staged outside of the repository. When committing a transaction, the connectivity of newly written objects is checked by walking the object graph containing only the new objects from the updated tips and identifying the missing objects which represent the boundary between the new objects and the repository. The boundary objects are then checked in the canonical repository to ensure the new objects will connect as expected. All repository operations are run in a transaction and conflict checked before being applied serially to the canonical repository. This ensures that operations that would break the repository are not applied. For our specific use case, the conflict checks performed by git-receive-pack(1) are redundant and thus we would like to introduce an option that allows connectivity checks to be skipped. Thanks for taking a look. Changes since V1: - Instead of creating a new test file, t5410 is generalized. - Test assertions updated to more explicitly check for object connectivity error messages. - Updated commit message and added documentation for the option. -Justin Justin Tobler (2): t5410: test receive-pack connectivity check builtin/receive-pack: add option to skip connectivity check Documentation/git-receive-pack.adoc | 12 ++++ builtin/receive-pack.c | 40 ++++++++------ t/meson.build | 2 +- t/t5410-receive-pack-alternates.sh | 44 --------------- t/t5410-receive-pack.sh | 86 +++++++++++++++++++++++++++++ 5 files changed, 121 insertions(+), 63 deletions(-) delete mode 100755 t/t5410-receive-pack-alternates.sh create mode 100755 t/t5410-receive-pack.sh base-commit: 6c0bd1fc70efaf053abe4e57c976afdc72d15377 -- 2.49.0.111.g5b97a56fa0 ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 1/2] t5410: test receive-pack connectivity check 2025-05-20 1:49 ` [PATCH 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler @ 2025-05-20 1:49 ` Justin Tobler 2025-05-20 9:11 ` Karthik Nayak 2025-05-20 1:49 ` [PATCH 2/2] builtin/receive-pack: add option to skip " Justin Tobler 2025-05-20 16:32 ` [PATCH v2 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler 2 siblings, 1 reply; 25+ messages in thread From: Justin Tobler @ 2025-05-20 1:49 UTC (permalink / raw) To: git; +Cc: ps, karthik.188, Justin Tobler As part of git-recieve-pack(1), the connectivity of objects is checked. Add a test validating that git-receive-pack(1) fails due to an incoming packfile that would leave the repository with missing objects. Instead of creating a new test file, "t5410" is generalized for receive-pack testing. Signed-off-by: Justin Tobler <jltobler@gmail.com> --- t/meson.build | 2 +- ...ck-alternates.sh => t5410-receive-pack.sh} | 23 ++++++++++++++++++- 2 files changed, 23 insertions(+), 2 deletions(-) rename t/{t5410-receive-pack-alternates.sh => t5410-receive-pack.sh} (57%) diff --git a/t/meson.build b/t/meson.build index 43c9750b88..6b7c0b167b 100644 --- a/t/meson.build +++ b/t/meson.build @@ -628,7 +628,7 @@ integration_tests = [ 't5407-post-rewrite-hook.sh', 't5408-send-pack-stdin.sh', 't5409-colorize-remote-messages.sh', - 't5410-receive-pack-alternates.sh', + 't5410-receive-pack.sh', 't5411-proc-receive-hook.sh', 't5500-fetch-pack.sh', 't5501-fetch-push-alternates.sh', diff --git a/t/t5410-receive-pack-alternates.sh b/t/t5410-receive-pack.sh similarity index 57% rename from t/t5410-receive-pack-alternates.sh rename to t/t5410-receive-pack.sh index 4e82fd102e..9afea54a26 100755 --- a/t/t5410-receive-pack-alternates.sh +++ b/t/t5410-receive-pack.sh @@ -1,6 +1,6 @@ #!/bin/sh -test_description='git receive-pack with alternate ref filtering' +test_description='git receive-pack' GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME @@ -41,4 +41,25 @@ test_expect_success 'with core.alternateRefsPrefixes' ' test_cmp expect actual.haves ' +test_expect_success 'receive-pack missing objects fails connectivity check' ' + test_when_finished rm -rf repo remote.git setup.git && + + git init repo && + git -C repo commit --allow-empty -m 1 && + git clone --bare repo setup.git && + git -C repo commit --allow-empty -m 2 && + + # Capture git-send-pack(1) output sent to git-receive-pack(1). + git -C repo send-pack ../setup.git --all \ + --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && + + # Replay captured git-send-pack(1) output on new empty repository. + git init --bare remote.git && + git receive-pack remote.git <out >actual 2>err && + + test_grep "missing necessary objects" actual && + test_grep "fatal: Failed to traverse parents" err && + test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) +' + test_done -- 2.49.0.111.g5b97a56fa0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] t5410: test receive-pack connectivity check 2025-05-20 1:49 ` [PATCH 1/2] t5410: test receive-pack connectivity check Justin Tobler @ 2025-05-20 9:11 ` Karthik Nayak 0 siblings, 0 replies; 25+ messages in thread From: Karthik Nayak @ 2025-05-20 9:11 UTC (permalink / raw) To: Justin Tobler, git; +Cc: ps [-- Attachment #1: Type: text/plain, Size: 2963 bytes --] Justin Tobler <jltobler@gmail.com> writes: > As part of git-recieve-pack(1), the connectivity of objects is checked. > Add a test validating that git-receive-pack(1) fails due to an incoming > packfile that would leave the repository with missing objects. Instead > of creating a new test file, "t5410" is generalized for receive-pack > testing. > > Signed-off-by: Justin Tobler <jltobler@gmail.com> > --- > t/meson.build | 2 +- > ...ck-alternates.sh => t5410-receive-pack.sh} | 23 ++++++++++++++++++- > 2 files changed, 23 insertions(+), 2 deletions(-) > rename t/{t5410-receive-pack-alternates.sh => t5410-receive-pack.sh} (57%) > > diff --git a/t/meson.build b/t/meson.build > index 43c9750b88..6b7c0b167b 100644 > --- a/t/meson.build > +++ b/t/meson.build > @@ -628,7 +628,7 @@ integration_tests = [ > 't5407-post-rewrite-hook.sh', > 't5408-send-pack-stdin.sh', > 't5409-colorize-remote-messages.sh', > - 't5410-receive-pack-alternates.sh', > + 't5410-receive-pack.sh', I think this is much better that creating a new test file, since currently the file houses only 2 tests. When we do start adding a lot more tests we can branch out as necessary. > 't5411-proc-receive-hook.sh', > 't5500-fetch-pack.sh', > 't5501-fetch-push-alternates.sh', > diff --git a/t/t5410-receive-pack-alternates.sh b/t/t5410-receive-pack.sh > similarity index 57% > rename from t/t5410-receive-pack-alternates.sh > rename to t/t5410-receive-pack.sh > index 4e82fd102e..9afea54a26 100755 > --- a/t/t5410-receive-pack-alternates.sh > +++ b/t/t5410-receive-pack.sh > @@ -1,6 +1,6 @@ > #!/bin/sh > > -test_description='git receive-pack with alternate ref filtering' > +test_description='git receive-pack' > > GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main > export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME > @@ -41,4 +41,25 @@ test_expect_success 'with core.alternateRefsPrefixes' ' > test_cmp expect actual.haves > ' > > +test_expect_success 'receive-pack missing objects fails connectivity check' ' > + test_when_finished rm -rf repo remote.git setup.git && > + > + git init repo && > + git -C repo commit --allow-empty -m 1 && > + git clone --bare repo setup.git && > + git -C repo commit --allow-empty -m 2 && > + > + # Capture git-send-pack(1) output sent to git-receive-pack(1). > + git -C repo send-pack ../setup.git --all \ > + --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && > + We don't remove 'out' with 'test_when_finished' but that's okay since 'tee' overrides the file by default. Makes sense. > + # Replay captured git-send-pack(1) output on new empty repository. > + git init --bare remote.git && > + git receive-pack remote.git <out >actual 2>err && > + > + test_grep "missing necessary objects" actual && > + test_grep "fatal: Failed to traverse parents" err && > + test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) > +' > + > test_done > -- > 2.49.0.111.g5b97a56fa0 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 2/2] builtin/receive-pack: add option to skip connectivity check 2025-05-20 1:49 ` [PATCH 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler 2025-05-20 1:49 ` [PATCH 1/2] t5410: test receive-pack connectivity check Justin Tobler @ 2025-05-20 1:49 ` Justin Tobler 2025-05-20 5:17 ` Patrick Steinhardt 2025-05-20 9:16 ` Karthik Nayak 2025-05-20 16:32 ` [PATCH v2 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler 2 siblings, 2 replies; 25+ messages in thread From: Justin Tobler @ 2025-05-20 1:49 UTC (permalink / raw) To: git; +Cc: ps, karthik.188, Justin Tobler During git-receive-pack(1), connectivity of the object graph is validated to ensure that the received packfile does not leave the repository in a broken state. This is done via git-rev-list(1) and walking the objects which can be expensive for large repositories. Generally, this check is critical to avoid an incomplete received packfile from corrupting a repository. Server operators may have additional knowledge though around exactly how Git is being used on the server-side which can be used to facilitate more efficient connectivity computatation of incoming objects. For example, if it can be ensured that all objects in a repository are connected and do not depend on any missing objects, the connectivity of newly written objects can be checked by walking the object graph containing only the new objects from the updated tips and identifying the missing objects which represent the boundary between the new objects and the repository. These boundary objects can be checked in the canonical repository to ensure the new objects connect as expected and thus avoid walking the rest of the object graph. Git itself cannot make the guarantees required for such an optimization as it is possible for a repository to contain an unreachable object that references a missing object without the repository being considered corrupt. Introduce the --skip-connectivity-check option for git-receive-pack(1) which bypasses this connectivity check to give more control to the server-side. Note that without proper server-side validation of newly received objects handled outside of Git, usage of this option risks corrupting a repository. Signed-off-by: Justin Tobler <jltobler@gmail.com> --- Documentation/git-receive-pack.adoc | 12 +++++++++ builtin/receive-pack.c | 40 ++++++++++++++++------------- t/t5410-receive-pack.sh | 21 +++++++++++++++ 3 files changed, 55 insertions(+), 18 deletions(-) diff --git a/Documentation/git-receive-pack.adoc b/Documentation/git-receive-pack.adoc index 20aca92073..68427d93d9 100644 --- a/Documentation/git-receive-pack.adoc +++ b/Documentation/git-receive-pack.adoc @@ -46,6 +46,18 @@ OPTIONS `$GIT_URL/info/refs?service=git-receive-pack` requests. See `--http-backend-info-refs` in linkgit:git-upload-pack[1]. +--skip-connectivity-check:: + Bypasses the connectivity checks performed to validate incoming + objects. This option exists for server operators that may want to + implement their own object connectivity check outside of Git. This is + useful in such cases where the server-side knows additional information + about how Git is being used and thus can rely on guarantees to more + efficiently compute object connectivity that Git itself cannot make. + Usage of this option without a separate mechanism to validate and + ensure incoming objects connect properly to the references risks a + repository becoming corrupted and should not be used in the general + case. + PRE-RECEIVE HOOK ---------------- Before any ref is updated, if $GIT_DIR/hooks/pre-receive file exists diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index be314879e8..66674bc408 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -81,6 +81,7 @@ static int prefer_ofs_delta = 1; static int auto_update_server_info; static int auto_gc = 1; static int reject_thin; +static int skip_connectivity_check; static int stateless_rpc; static const char *service_dir; static const char *head_name; @@ -1936,27 +1937,29 @@ static void execute_commands(struct command *commands, return; } - if (use_sideband) { - memset(&muxer, 0, sizeof(muxer)); - muxer.proc = copy_to_sideband; - muxer.in = -1; - if (!start_async(&muxer)) - err_fd = muxer.in; - /* ...else, continue without relaying sideband */ - } + if (!skip_connectivity_check) { + if (use_sideband) { + memset(&muxer, 0, sizeof(muxer)); + muxer.proc = copy_to_sideband; + muxer.in = -1; + if (!start_async(&muxer)) + err_fd = muxer.in; + /* ...else, continue without relaying sideband */ + } - data.cmds = commands; - data.si = si; - opt.err_fd = err_fd; - opt.progress = err_fd && !quiet; - opt.env = tmp_objdir_env(tmp_objdir); - opt.exclude_hidden_refs_section = "receive"; + data.cmds = commands; + data.si = si; + opt.err_fd = err_fd; + opt.progress = err_fd && !quiet; + opt.env = tmp_objdir_env(tmp_objdir); + opt.exclude_hidden_refs_section = "receive"; - if (check_connected(iterate_receive_command_list, &data, &opt)) - set_connectivity_errors(commands, si); + if (check_connected(iterate_receive_command_list, &data, &opt)) + set_connectivity_errors(commands, si); - if (use_sideband) - finish_async(&muxer); + if (use_sideband) + finish_async(&muxer); + } reject_updates_to_hidden(commands); @@ -2517,6 +2520,7 @@ int cmd_receive_pack(int argc, struct option options[] = { OPT__QUIET(&quiet, N_("quiet")), + OPT_HIDDEN_BOOL(0, "skip-connectivity-check", &skip_connectivity_check, NULL), OPT_HIDDEN_BOOL(0, "stateless-rpc", &stateless_rpc, NULL), OPT_HIDDEN_BOOL(0, "http-backend-info-refs", &advertise_refs, NULL), OPT_ALIAS(0, "advertise-refs", "http-backend-info-refs"), diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh index 9afea54a26..10c67c2bf8 100755 --- a/t/t5410-receive-pack.sh +++ b/t/t5410-receive-pack.sh @@ -62,4 +62,25 @@ test_expect_success 'receive-pack missing objects fails connectivity check' ' test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) ' +test_expect_success 'receive-pack missing objects bypasses connectivity check' ' + test_when_finished rm -rf repo remote.git setup.git && + + git init repo && + git -C repo commit --allow-empty -m 1 && + git clone --bare repo setup.git && + git -C repo commit --allow-empty -m 2 && + + # Capture git-send-pack(1) output sent to git-receive-pack(1). + git -C repo send-pack ../setup.git --all \ + --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && + + # Replay captured git-send-pack(1) output on new empty repository. + git init --bare remote.git && + git receive-pack --skip-connectivity-check remote.git <out >actual 2>err && + + test_grep ! "missing necessary objects" actual && + test_must_be_empty err && + git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) +' + test_done -- 2.49.0.111.g5b97a56fa0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] builtin/receive-pack: add option to skip connectivity check 2025-05-20 1:49 ` [PATCH 2/2] builtin/receive-pack: add option to skip " Justin Tobler @ 2025-05-20 5:17 ` Patrick Steinhardt 2025-05-20 15:10 ` Justin Tobler 2025-05-20 9:16 ` Karthik Nayak 1 sibling, 1 reply; 25+ messages in thread From: Patrick Steinhardt @ 2025-05-20 5:17 UTC (permalink / raw) To: Justin Tobler; +Cc: git, karthik.188 On Mon, May 19, 2025 at 08:49:20PM -0500, Justin Tobler wrote: > During git-receive-pack(1), connectivity of the object graph is > validated to ensure that the received packfile does not leave the > repository in a broken state. This is done via git-rev-list(1) and > walking the objects which can be expensive for large repositories. s/objects/&,/ > > Generally, this check is critical to avoid an incomplete received > packfile from corrupting a repository. Server operators may have > additional knowledge though around exactly how Git is being used on the > server-side which can be used to facilitate more efficient connectivity > computatation of incoming objects. > > For example, if it can be ensured that all objects in a repository are > connected and do not depend on any missing objects, the connectivity of > newly written objects can be checked by walking the object graph > containing only the new objects from the updated tips and identifying > the missing objects which represent the boundary between the new objects > and the repository. These boundary objects can be checked in the > canonical repository to ensure the new objects connect as expected and > thus avoid walking the rest of the object graph. > > Git itself cannot make the guarantees required for such an optimization > as it is possible for a repository to contain an unreachable object that > references a missing object without the repository being considered > corrupt. Yup, this reads very well to me now, and clearly lays out why it is an assumption that _some_ setups can do, but others can't. > Introduce the --skip-connectivity-check option for git-receive-pack(1) > which bypasses this connectivity check to give more control to the > server-side. Note that without proper server-side validation of newly > received objects handled outside of Git, usage of this option risks > corrupting a repository. > > Signed-off-by: Justin Tobler <jltobler@gmail.com> > --- > Documentation/git-receive-pack.adoc | 12 +++++++++ > builtin/receive-pack.c | 40 ++++++++++++++++------------- > t/t5410-receive-pack.sh | 21 +++++++++++++++ > 3 files changed, 55 insertions(+), 18 deletions(-) > > diff --git a/Documentation/git-receive-pack.adoc b/Documentation/git-receive-pack.adoc > index 20aca92073..68427d93d9 100644 > --- a/Documentation/git-receive-pack.adoc > +++ b/Documentation/git-receive-pack.adoc > @@ -46,6 +46,18 @@ OPTIONS > `$GIT_URL/info/refs?service=git-receive-pack` requests. See > `--http-backend-info-refs` in linkgit:git-upload-pack[1]. > > +--skip-connectivity-check:: > + Bypasses the connectivity checks performed to validate incoming > + objects. This option exists for server operators that may want to > + implement their own object connectivity check outside of Git. This is > + useful in such cases where the server-side knows additional information > + about how Git is being used and thus can rely on guarantees to more > + efficiently compute object connectivity that Git itself cannot make. > + Usage of this option without a separate mechanism to validate and > + ensure incoming objects connect properly to the references risks a > + repository becoming corrupted and should not be used in the general > + case. Nit: the connectivity check doesn't only have to verify that objects connect to existing refs, but also that all objects part of the transitive closure of reachable objects exist. Might be worthwhile to point out here. > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh > index 9afea54a26..10c67c2bf8 100755 > --- a/t/t5410-receive-pack.sh > +++ b/t/t5410-receive-pack.sh > @@ -62,4 +62,25 @@ test_expect_success 'receive-pack missing objects fails connectivity check' ' > test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) > ' > > +test_expect_success 'receive-pack missing objects bypasses connectivity check' ' > + test_when_finished rm -rf repo remote.git setup.git && > + > + git init repo && > + git -C repo commit --allow-empty -m 1 && > + git clone --bare repo setup.git && > + git -C repo commit --allow-empty -m 2 && > + > + # Capture git-send-pack(1) output sent to git-receive-pack(1). > + git -C repo send-pack ../setup.git --all \ > + --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && > + > + # Replay captured git-send-pack(1) output on new empty repository. > + git init --bare remote.git && > + git receive-pack --skip-connectivity-check remote.git <out >actual 2>err && > + > + test_grep ! "missing necessary objects" actual && > + test_must_be_empty err && Yup, the connectivity check shouldn't fail anymore. > + git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) And we do have the object now. Do we maybe also want to have a check though that the repository itself _isn't_ fully connected to ensure that the test setup isn't broken? Patrick ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] builtin/receive-pack: add option to skip connectivity check 2025-05-20 5:17 ` Patrick Steinhardt @ 2025-05-20 15:10 ` Justin Tobler 0 siblings, 0 replies; 25+ messages in thread From: Justin Tobler @ 2025-05-20 15:10 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git, karthik.188 On 25/05/20 07:17AM, Patrick Steinhardt wrote: > On Mon, May 19, 2025 at 08:49:20PM -0500, Justin Tobler wrote: > > diff --git a/Documentation/git-receive-pack.adoc b/Documentation/git-receive-pack.adoc > > index 20aca92073..68427d93d9 100644 > > --- a/Documentation/git-receive-pack.adoc > > +++ b/Documentation/git-receive-pack.adoc > > @@ -46,6 +46,18 @@ OPTIONS > > `$GIT_URL/info/refs?service=git-receive-pack` requests. See > > `--http-backend-info-refs` in linkgit:git-upload-pack[1]. > > > > +--skip-connectivity-check:: > > + Bypasses the connectivity checks performed to validate incoming > > + objects. This option exists for server operators that may want to > > + implement their own object connectivity check outside of Git. This is > > + useful in such cases where the server-side knows additional information > > + about how Git is being used and thus can rely on guarantees to more > > + efficiently compute object connectivity that Git itself cannot make. > > + Usage of this option without a separate mechanism to validate and > > + ensure incoming objects connect properly to the references risks a > > + repository becoming corrupted and should not be used in the general > > + case. > > Nit: the connectivity check doesn't only have to verify that objects > connect to existing refs, but also that all objects part of the > transitive closure of reachable objects exist. Might be worthwhile to > point out here. That's a good point, I'll teak the wording here so something like this: Bypasses the connectivity checks that validate the existence of all objects in the transitive closure of reachable objects. This option is intended for server operators that want to implement their own object connectivity validation outside of Git. This is useful in such cases where the server-side knows additional information about how Git is being used and thus can rely on certain guarantees to more efficiently compute object connectivity that Git itself cannot make. Usage of this option without a reliable external mechanism to ensure full reachable object connectivity risks corrupting the repository and should not be used in the general case. > > + git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) > > And we do have the object now. Do we maybe also want to have a check > though that the repository itself _isn't_ fully connected to ensure that > the test setup isn't broken? That makes sense. I'll do something like this in the next version: diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh index 10c67c2bf8..f76a22943e 100755 --- a/t/t5410-receive-pack.sh +++ b/t/t5410-receive-pack.sh @@ -80,7 +80,8 @@ test_expect_success 'receive-pack missing objects bypasses connectivity check' ' test_grep ! "missing necessary objects" actual && test_must_be_empty err && - git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) + git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) && + test_must_fail git -C remote.git rev-list $(git -C repo rev-parse HEAD) ' test_done Thanks for the review! -Justin ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] builtin/receive-pack: add option to skip connectivity check 2025-05-20 1:49 ` [PATCH 2/2] builtin/receive-pack: add option to skip " Justin Tobler 2025-05-20 5:17 ` Patrick Steinhardt @ 2025-05-20 9:16 ` Karthik Nayak 1 sibling, 0 replies; 25+ messages in thread From: Karthik Nayak @ 2025-05-20 9:16 UTC (permalink / raw) To: Justin Tobler, git; +Cc: ps [-- Attachment #1: Type: text/plain, Size: 1839 bytes --] Justin Tobler <jltobler@gmail.com> writes: > During git-receive-pack(1), connectivity of the object graph is > validated to ensure that the received packfile does not leave the > repository in a broken state. This is done via git-rev-list(1) and > walking the objects which can be expensive for large repositories. > > Generally, this check is critical to avoid an incomplete received > packfile from corrupting a repository. Server operators may have > additional knowledge though around exactly how Git is being used on the > server-side which can be used to facilitate more efficient connectivity > computatation of incoming objects. > s/computatation/computation > > For example, if it can be ensured that all objects in a repository are > connected and do not depend on any missing objects, the connectivity of > newly written objects can be checked by walking the object graph > containing only the new objects from the updated tips and identifying > the missing objects which represent the boundary between the new objects > and the repository. These boundary objects can be checked in the > canonical repository to ensure the new objects connect as expected and > thus avoid walking the rest of the object graph. > > Git itself cannot make the guarantees required for such an optimization > as it is possible for a repository to contain an unreachable object that > references a missing object without the repository being considered > corrupt. > > Introduce the --skip-connectivity-check option for git-receive-pack(1) > which bypasses this connectivity check to give more control to the > server-side. Note that without proper server-side validation of newly > received objects handled outside of Git, usage of this option risks > corrupting a repository. > I don't have any comments for the patch itself, looked good to me! [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v2 0/2] builtin/receive-pack: introduce option to skip connectivity checks 2025-05-20 1:49 ` [PATCH 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler 2025-05-20 1:49 ` [PATCH 1/2] t5410: test receive-pack connectivity check Justin Tobler 2025-05-20 1:49 ` [PATCH 2/2] builtin/receive-pack: add option to skip " Justin Tobler @ 2025-05-20 16:32 ` Justin Tobler 2025-05-20 16:32 ` [PATCH v2 1/2] t5410: test receive-pack connectivity check Justin Tobler ` (2 more replies) 2 siblings, 3 replies; 25+ messages in thread From: Justin Tobler @ 2025-05-20 16:32 UTC (permalink / raw) To: git; +Cc: ps, karthik.188, Justin Tobler Greetings, At GitLab, we are interested in introducing an optional means to bypass the connectivity checks performed through git-receive-pack(1). This series implements a `--skip-connectivity-check` option to facilitate this. For some background, we have a transaction management system that runs in our Git RPC service and wraps all repository operations. Operations that write to a repository are first recorded and staged outside of the repository. When committing a transaction, the connectivity of newly written objects is checked by walking the object graph containing only the new objects from the updated tips and identifying the missing objects which represent the boundary between the new objects and the repository. The boundary objects are then checked in the canonical repository to ensure the new objects will connect as expected. All repository operations are run in a transaction and conflict checked before being applied serially to the canonical repository. This ensures that operations that would break the repository are not applied. For our specific use case, the conflict checks performed by git-receive-pack(1) are redundant and thus we would like to introduce an option that allows connectivity checks to be skipped. Thanks for taking a look. -Justin Justin Tobler (2): t5410: test receive-pack connectivity check builtin/receive-pack: add option to skip connectivity check Documentation/git-receive-pack.adoc | 12 ++++ builtin/receive-pack.c | 40 +++++++------ t/meson.build | 2 +- t/t5410-receive-pack-alternates.sh | 44 --------------- t/t5410-receive-pack.sh | 87 +++++++++++++++++++++++++++++ 5 files changed, 122 insertions(+), 63 deletions(-) delete mode 100755 t/t5410-receive-pack-alternates.sh create mode 100755 t/t5410-receive-pack.sh Range-diff against v1: 1: f659612c9d = 1: f659612c9d t5410: test receive-pack connectivity check 2: 31e5f41983 ! 2: f6dbb02778 builtin/receive-pack: add option to skip connectivity check @@ Commit message During git-receive-pack(1), connectivity of the object graph is validated to ensure that the received packfile does not leave the repository in a broken state. This is done via git-rev-list(1) and - walking the objects which can be expensive for large repositories. + walking the objects, which can be expensive for large repositories. Generally, this check is critical to avoid an incomplete received packfile from corrupting a repository. Server operators may have additional knowledge though around exactly how Git is being used on the server-side which can be used to facilitate more efficient connectivity - computatation of incoming objects. + computation of incoming objects. For example, if it can be ensured that all objects in a repository are connected and do not depend on any missing objects, the connectivity of @@ Documentation/git-receive-pack.adoc: OPTIONS `--http-backend-info-refs` in linkgit:git-upload-pack[1]. +--skip-connectivity-check:: -+ Bypasses the connectivity checks performed to validate incoming -+ objects. This option exists for server operators that may want to -+ implement their own object connectivity check outside of Git. This is -+ useful in such cases where the server-side knows additional information -+ about how Git is being used and thus can rely on guarantees to more -+ efficiently compute object connectivity that Git itself cannot make. -+ Usage of this option without a separate mechanism to validate and -+ ensure incoming objects connect properly to the references risks a -+ repository becoming corrupted and should not be used in the general -+ case. ++ Bypasses the connectivity checks that validate the existence of all ++ objects in the transitive closure of reachable objects. This option is ++ intended for server operators that want to implement their own object ++ connectivity validation outside of Git. This is useful in such cases ++ where the server-side knows additional information about how Git is ++ being used and thus can rely on certain guarantees to more efficiently ++ compute object connectivity that Git itself cannot make. Usage of this ++ option without a reliable external mechanism to ensure full reachable ++ object connectivity risks corrupting the repository and should not be ++ used in the general case. + PRE-RECEIVE HOOK ---------------- @@ t/t5410-receive-pack.sh: test_expect_success 'receive-pack missing objects fails + + test_grep ! "missing necessary objects" actual && + test_must_be_empty err && -+ git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) ++ git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) && ++ test_must_fail git -C remote.git rev-list $(git -C repo rev-parse HEAD) +' + test_done base-commit: 6c0bd1fc70efaf053abe4e57c976afdc72d15377 -- 2.49.0.111.g5b97a56fa0 ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v2 1/2] t5410: test receive-pack connectivity check 2025-05-20 16:32 ` [PATCH v2 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler @ 2025-05-20 16:32 ` Justin Tobler 2025-05-20 16:32 ` [PATCH v2 2/2] builtin/receive-pack: add option to skip " Justin Tobler 2025-05-22 9:09 ` [PATCH v2 0/2] builtin/receive-pack: introduce option to skip connectivity checks Karthik Nayak 2 siblings, 0 replies; 25+ messages in thread From: Justin Tobler @ 2025-05-20 16:32 UTC (permalink / raw) To: git; +Cc: ps, karthik.188, Justin Tobler As part of git-recieve-pack(1), the connectivity of objects is checked. Add a test validating that git-receive-pack(1) fails due to an incoming packfile that would leave the repository with missing objects. Instead of creating a new test file, "t5410" is generalized for receive-pack testing. Signed-off-by: Justin Tobler <jltobler@gmail.com> --- t/meson.build | 2 +- ...ck-alternates.sh => t5410-receive-pack.sh} | 23 ++++++++++++++++++- 2 files changed, 23 insertions(+), 2 deletions(-) rename t/{t5410-receive-pack-alternates.sh => t5410-receive-pack.sh} (57%) diff --git a/t/meson.build b/t/meson.build index 43c9750b88..6b7c0b167b 100644 --- a/t/meson.build +++ b/t/meson.build @@ -628,7 +628,7 @@ integration_tests = [ 't5407-post-rewrite-hook.sh', 't5408-send-pack-stdin.sh', 't5409-colorize-remote-messages.sh', - 't5410-receive-pack-alternates.sh', + 't5410-receive-pack.sh', 't5411-proc-receive-hook.sh', 't5500-fetch-pack.sh', 't5501-fetch-push-alternates.sh', diff --git a/t/t5410-receive-pack-alternates.sh b/t/t5410-receive-pack.sh similarity index 57% rename from t/t5410-receive-pack-alternates.sh rename to t/t5410-receive-pack.sh index 4e82fd102e..9afea54a26 100755 --- a/t/t5410-receive-pack-alternates.sh +++ b/t/t5410-receive-pack.sh @@ -1,6 +1,6 @@ #!/bin/sh -test_description='git receive-pack with alternate ref filtering' +test_description='git receive-pack' GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME @@ -41,4 +41,25 @@ test_expect_success 'with core.alternateRefsPrefixes' ' test_cmp expect actual.haves ' +test_expect_success 'receive-pack missing objects fails connectivity check' ' + test_when_finished rm -rf repo remote.git setup.git && + + git init repo && + git -C repo commit --allow-empty -m 1 && + git clone --bare repo setup.git && + git -C repo commit --allow-empty -m 2 && + + # Capture git-send-pack(1) output sent to git-receive-pack(1). + git -C repo send-pack ../setup.git --all \ + --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && + + # Replay captured git-send-pack(1) output on new empty repository. + git init --bare remote.git && + git receive-pack remote.git <out >actual 2>err && + + test_grep "missing necessary objects" actual && + test_grep "fatal: Failed to traverse parents" err && + test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) +' + test_done -- 2.49.0.111.g5b97a56fa0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v2 2/2] builtin/receive-pack: add option to skip connectivity check 2025-05-20 16:32 ` [PATCH v2 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler 2025-05-20 16:32 ` [PATCH v2 1/2] t5410: test receive-pack connectivity check Justin Tobler @ 2025-05-20 16:32 ` Justin Tobler 2025-06-02 15:01 ` Johannes Schindelin 2025-05-22 9:09 ` [PATCH v2 0/2] builtin/receive-pack: introduce option to skip connectivity checks Karthik Nayak 2 siblings, 1 reply; 25+ messages in thread From: Justin Tobler @ 2025-05-20 16:32 UTC (permalink / raw) To: git; +Cc: ps, karthik.188, Justin Tobler During git-receive-pack(1), connectivity of the object graph is validated to ensure that the received packfile does not leave the repository in a broken state. This is done via git-rev-list(1) and walking the objects, which can be expensive for large repositories. Generally, this check is critical to avoid an incomplete received packfile from corrupting a repository. Server operators may have additional knowledge though around exactly how Git is being used on the server-side which can be used to facilitate more efficient connectivity computation of incoming objects. For example, if it can be ensured that all objects in a repository are connected and do not depend on any missing objects, the connectivity of newly written objects can be checked by walking the object graph containing only the new objects from the updated tips and identifying the missing objects which represent the boundary between the new objects and the repository. These boundary objects can be checked in the canonical repository to ensure the new objects connect as expected and thus avoid walking the rest of the object graph. Git itself cannot make the guarantees required for such an optimization as it is possible for a repository to contain an unreachable object that references a missing object without the repository being considered corrupt. Introduce the --skip-connectivity-check option for git-receive-pack(1) which bypasses this connectivity check to give more control to the server-side. Note that without proper server-side validation of newly received objects handled outside of Git, usage of this option risks corrupting a repository. Signed-off-by: Justin Tobler <jltobler@gmail.com> --- Documentation/git-receive-pack.adoc | 12 +++++++++ builtin/receive-pack.c | 40 ++++++++++++++++------------- t/t5410-receive-pack.sh | 22 ++++++++++++++++ 3 files changed, 56 insertions(+), 18 deletions(-) diff --git a/Documentation/git-receive-pack.adoc b/Documentation/git-receive-pack.adoc index 20aca92073..0956086d61 100644 --- a/Documentation/git-receive-pack.adoc +++ b/Documentation/git-receive-pack.adoc @@ -46,6 +46,18 @@ OPTIONS `$GIT_URL/info/refs?service=git-receive-pack` requests. See `--http-backend-info-refs` in linkgit:git-upload-pack[1]. +--skip-connectivity-check:: + Bypasses the connectivity checks that validate the existence of all + objects in the transitive closure of reachable objects. This option is + intended for server operators that want to implement their own object + connectivity validation outside of Git. This is useful in such cases + where the server-side knows additional information about how Git is + being used and thus can rely on certain guarantees to more efficiently + compute object connectivity that Git itself cannot make. Usage of this + option without a reliable external mechanism to ensure full reachable + object connectivity risks corrupting the repository and should not be + used in the general case. + PRE-RECEIVE HOOK ---------------- Before any ref is updated, if $GIT_DIR/hooks/pre-receive file exists diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index be314879e8..66674bc408 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -81,6 +81,7 @@ static int prefer_ofs_delta = 1; static int auto_update_server_info; static int auto_gc = 1; static int reject_thin; +static int skip_connectivity_check; static int stateless_rpc; static const char *service_dir; static const char *head_name; @@ -1936,27 +1937,29 @@ static void execute_commands(struct command *commands, return; } - if (use_sideband) { - memset(&muxer, 0, sizeof(muxer)); - muxer.proc = copy_to_sideband; - muxer.in = -1; - if (!start_async(&muxer)) - err_fd = muxer.in; - /* ...else, continue without relaying sideband */ - } + if (!skip_connectivity_check) { + if (use_sideband) { + memset(&muxer, 0, sizeof(muxer)); + muxer.proc = copy_to_sideband; + muxer.in = -1; + if (!start_async(&muxer)) + err_fd = muxer.in; + /* ...else, continue without relaying sideband */ + } - data.cmds = commands; - data.si = si; - opt.err_fd = err_fd; - opt.progress = err_fd && !quiet; - opt.env = tmp_objdir_env(tmp_objdir); - opt.exclude_hidden_refs_section = "receive"; + data.cmds = commands; + data.si = si; + opt.err_fd = err_fd; + opt.progress = err_fd && !quiet; + opt.env = tmp_objdir_env(tmp_objdir); + opt.exclude_hidden_refs_section = "receive"; - if (check_connected(iterate_receive_command_list, &data, &opt)) - set_connectivity_errors(commands, si); + if (check_connected(iterate_receive_command_list, &data, &opt)) + set_connectivity_errors(commands, si); - if (use_sideband) - finish_async(&muxer); + if (use_sideband) + finish_async(&muxer); + } reject_updates_to_hidden(commands); @@ -2517,6 +2520,7 @@ int cmd_receive_pack(int argc, struct option options[] = { OPT__QUIET(&quiet, N_("quiet")), + OPT_HIDDEN_BOOL(0, "skip-connectivity-check", &skip_connectivity_check, NULL), OPT_HIDDEN_BOOL(0, "stateless-rpc", &stateless_rpc, NULL), OPT_HIDDEN_BOOL(0, "http-backend-info-refs", &advertise_refs, NULL), OPT_ALIAS(0, "advertise-refs", "http-backend-info-refs"), diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh index 9afea54a26..f76a22943e 100755 --- a/t/t5410-receive-pack.sh +++ b/t/t5410-receive-pack.sh @@ -62,4 +62,26 @@ test_expect_success 'receive-pack missing objects fails connectivity check' ' test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) ' +test_expect_success 'receive-pack missing objects bypasses connectivity check' ' + test_when_finished rm -rf repo remote.git setup.git && + + git init repo && + git -C repo commit --allow-empty -m 1 && + git clone --bare repo setup.git && + git -C repo commit --allow-empty -m 2 && + + # Capture git-send-pack(1) output sent to git-receive-pack(1). + git -C repo send-pack ../setup.git --all \ + --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && + + # Replay captured git-send-pack(1) output on new empty repository. + git init --bare remote.git && + git receive-pack --skip-connectivity-check remote.git <out >actual 2>err && + + test_grep ! "missing necessary objects" actual && + test_must_be_empty err && + git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) && + test_must_fail git -C remote.git rev-list $(git -C repo rev-parse HEAD) +' + test_done -- 2.49.0.111.g5b97a56fa0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v2 2/2] builtin/receive-pack: add option to skip connectivity check 2025-05-20 16:32 ` [PATCH v2 2/2] builtin/receive-pack: add option to skip " Justin Tobler @ 2025-06-02 15:01 ` Johannes Schindelin 2025-06-02 15:59 ` Justin Tobler 0 siblings, 1 reply; 25+ messages in thread From: Johannes Schindelin @ 2025-06-02 15:01 UTC (permalink / raw) To: Justin Tobler; +Cc: git, ps, karthik.188 Hi Justin, On Tue, 20 May 2025, Justin Tobler wrote: > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh > index 9afea54a26..f76a22943e 100755 > --- a/t/t5410-receive-pack.sh > +++ b/t/t5410-receive-pack.sh > @@ -62,4 +62,26 @@ test_expect_success 'receive-pack missing objects fails connectivity check' ' > test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) > ' > > +test_expect_success 'receive-pack missing objects bypasses connectivity check' ' > + test_when_finished rm -rf repo remote.git setup.git && > + > + git init repo && > + git -C repo commit --allow-empty -m 1 && > + git clone --bare repo setup.git && > + git -C repo commit --allow-empty -m 2 && > + > + # Capture git-send-pack(1) output sent to git-receive-pack(1). > + git -C repo send-pack ../setup.git --all \ > + --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && > + > + # Replay captured git-send-pack(1) output on new empty repository. > + git init --bare remote.git && > + git receive-pack --skip-connectivity-check remote.git <out >actual 2>err && > + > + test_grep ! "missing necessary objects" actual && > + test_must_be_empty err && > + git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) && > + test_must_fail git -C remote.git rev-list $(git -C repo rev-parse HEAD) > +' > + > test_done This test case seems to hang occasionally in the "win+Meson test" jobs on GitHub (I tried to find the same failure at https://gitlab.com/gitlab-org/git/-/pipelines but couldn't find any). See for example https://github.com/gitgitgadget/git/actions/runs/15383915635/job/43279134837#step:6:627 Note that this problem afflicts only the "win+Meson test" jobs; The corresponding "win test" job seems not to hang. Even in the Git for Windows project, where the `win+VS test` jobs are run, the t5410 test passes within a dozen seconds or so, see e.g. https://github.com/git-for-windows/git/actions/runs/15383945895/job/43279689086#step:5:143 (confusingly, the subset of tests run in the matrix jobs differs between the `win+Meson test` jobs and the `win+VS test` jobs, but if you click through all of the `win+Meson test` jobs, expand the `test` step, patiently wait a few seconds for the log to be lazy loaded "enough" for the search to work, you will notice that t5410 is not mentioned in any of them, and the only one that times out after 4h37m11s is https://github.com/git-for-windows/git/actions/runs/15383945895/job/43279753911, likely while running 5410, too). Do you have any idea why this particular test case, in conjunction with Windows and Meson (and only on GitHub) acts up like this? Ciao, Johannes ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 2/2] builtin/receive-pack: add option to skip connectivity check 2025-06-02 15:01 ` Johannes Schindelin @ 2025-06-02 15:59 ` Justin Tobler 2025-06-02 16:06 ` Johannes Schindelin 2025-06-03 12:40 ` Patrick Steinhardt 0 siblings, 2 replies; 25+ messages in thread From: Justin Tobler @ 2025-06-02 15:59 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, ps, karthik.188 On 25/06/02 05:01PM, Johannes Schindelin wrote: > Hi Justin, > > On Tue, 20 May 2025, Justin Tobler wrote: > > > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh > > index 9afea54a26..f76a22943e 100755 > > --- a/t/t5410-receive-pack.sh > > +++ b/t/t5410-receive-pack.sh > > @@ -62,4 +62,26 @@ test_expect_success 'receive-pack missing objects fails connectivity check' ' > > test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) > > ' > > > > +test_expect_success 'receive-pack missing objects bypasses connectivity check' ' > > + test_when_finished rm -rf repo remote.git setup.git && > > + > > + git init repo && > > + git -C repo commit --allow-empty -m 1 && > > + git clone --bare repo setup.git && > > + git -C repo commit --allow-empty -m 2 && > > + > > + # Capture git-send-pack(1) output sent to git-receive-pack(1). > > + git -C repo send-pack ../setup.git --all \ > > + --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && > > + > > + # Replay captured git-send-pack(1) output on new empty repository. > > + git init --bare remote.git && > > + git receive-pack --skip-connectivity-check remote.git <out >actual 2>err && > > + > > + test_grep ! "missing necessary objects" actual && > > + test_must_be_empty err && > > + git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) && > > + test_must_fail git -C remote.git rev-list $(git -C repo rev-parse HEAD) > > +' > > + > > test_done > > This test case seems to hang occasionally in the "win+Meson test" jobs on > GitHub (I tried to find the same failure at > https://gitlab.com/gitlab-org/git/-/pipelines but couldn't find any). See > for example > https://github.com/gitgitgadget/git/actions/runs/15383915635/job/43279134837#step:6:627 > > Note that this problem afflicts only the "win+Meson test" jobs; The > corresponding "win test" job seems not to hang. > > Even in the Git for Windows project, where the `win+VS test` jobs are run, > the t5410 test passes within a dozen seconds or so, see e.g. > https://github.com/git-for-windows/git/actions/runs/15383945895/job/43279689086#step:5:143 > (confusingly, the subset of tests run in the matrix jobs differs between > the `win+Meson test` jobs and the `win+VS test` jobs, but if you click > through all of the `win+Meson test` jobs, expand the `test` step, > patiently wait a few seconds for the log to be lazy loaded "enough" for > the search to work, you will notice that t5410 is not mentioned in any of > them, and the only one that times out after 4h37m11s is > https://github.com/git-for-windows/git/actions/runs/15383945895/job/43279753911, > likely while running 5410, too). > > Do you have any idea why this particular test case, in conjunction with > Windows and Meson (and only on GitHub) acts up like this? Thanks Johannes for the report. I'm not quite sure yet what is going on here, but I'll dig into this a bit and see what I can figure out. :) -Justin ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 2/2] builtin/receive-pack: add option to skip connectivity check 2025-06-02 15:59 ` Justin Tobler @ 2025-06-02 16:06 ` Johannes Schindelin 2025-06-02 18:16 ` Junio C Hamano 2025-06-03 12:40 ` Patrick Steinhardt 1 sibling, 1 reply; 25+ messages in thread From: Johannes Schindelin @ 2025-06-02 16:06 UTC (permalink / raw) To: Justin Tobler; +Cc: git, ps, karthik.188 Hi Justin, On Mon, 2 Jun 2025, Justin Tobler wrote: > On 25/06/02 05:01PM, Johannes Schindelin wrote: > > > On Tue, 20 May 2025, Justin Tobler wrote: > > > > > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh > > > index 9afea54a26..f76a22943e 100755 > > > --- a/t/t5410-receive-pack.sh > > > +++ b/t/t5410-receive-pack.sh > > > @@ -62,4 +62,26 @@ test_expect_success 'receive-pack missing objects fails connectivity check' ' > > > test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) > > > ' > > > > > > +test_expect_success 'receive-pack missing objects bypasses connectivity check' ' > > > + test_when_finished rm -rf repo remote.git setup.git && > > > + > > > + git init repo && > > > + git -C repo commit --allow-empty -m 1 && > > > + git clone --bare repo setup.git && > > > + git -C repo commit --allow-empty -m 2 && > > > + > > > + # Capture git-send-pack(1) output sent to git-receive-pack(1). > > > + git -C repo send-pack ../setup.git --all \ > > > + --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && > > > + > > > + # Replay captured git-send-pack(1) output on new empty repository. > > > + git init --bare remote.git && > > > + git receive-pack --skip-connectivity-check remote.git <out >actual 2>err && > > > + > > > + test_grep ! "missing necessary objects" actual && > > > + test_must_be_empty err && > > > + git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) && > > > + test_must_fail git -C remote.git rev-list $(git -C repo rev-parse HEAD) > > > +' > > > + > > > test_done > > > > This test case seems to hang occasionally in the "win+Meson test" jobs on > > GitHub (I tried to find the same failure at > > https://gitlab.com/gitlab-org/git/-/pipelines but couldn't find any). See > > for example > > https://github.com/gitgitgadget/git/actions/runs/15383915635/job/43279134837#step:6:627 > > > > Note that this problem afflicts only the "win+Meson test" jobs; The > > corresponding "win test" job seems not to hang. > > > > Even in the Git for Windows project, where the `win+VS test` jobs are run, > > the t5410 test passes within a dozen seconds or so, see e.g. > > https://github.com/git-for-windows/git/actions/runs/15383945895/job/43279689086#step:5:143 > > (confusingly, the subset of tests run in the matrix jobs differs between > > the `win+Meson test` jobs and the `win+VS test` jobs, but if you click > > through all of the `win+Meson test` jobs, expand the `test` step, > > patiently wait a few seconds for the log to be lazy loaded "enough" for > > the search to work, you will notice that t5410 is not mentioned in any of > > them, and the only one that times out after 4h37m11s is > > https://github.com/git-for-windows/git/actions/runs/15383945895/job/43279753911, > > likely while running 5410, too). > > > > Do you have any idea why this particular test case, in conjunction with > > Windows and Meson (and only on GitHub) acts up like this? > > Thanks Johannes for the report. I'm not quite sure yet what is going on > here, but I'll dig into this a bit and see what I can figure out. :) Thank you so much! Johannes ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 2/2] builtin/receive-pack: add option to skip connectivity check 2025-06-02 16:06 ` Johannes Schindelin @ 2025-06-02 18:16 ` Junio C Hamano 0 siblings, 0 replies; 25+ messages in thread From: Junio C Hamano @ 2025-06-02 18:16 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Justin Tobler, git, ps, karthik.188 Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> > Do you have any idea why this particular test case, in conjunction with >> > Windows and Meson (and only on GitHub) acts up like this? >> >> Thanks Johannes for the report. I'm not quite sure yet what is going on >> here, but I'll dig into this a bit and see what I can figure out. :) > > Thank you so much! Thanks for working well together. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 2/2] builtin/receive-pack: add option to skip connectivity check 2025-06-02 15:59 ` Justin Tobler 2025-06-02 16:06 ` Johannes Schindelin @ 2025-06-03 12:40 ` Patrick Steinhardt 2025-06-05 10:17 ` Johannes Schindelin 1 sibling, 1 reply; 25+ messages in thread From: Patrick Steinhardt @ 2025-06-03 12:40 UTC (permalink / raw) To: Justin Tobler; +Cc: Johannes Schindelin, git, karthik.188 On Mon, Jun 02, 2025 at 10:59:53AM -0500, Justin Tobler wrote: > On 25/06/02 05:01PM, Johannes Schindelin wrote: > > Hi Justin, > > > > On Tue, 20 May 2025, Justin Tobler wrote: > > > > > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh > > > index 9afea54a26..f76a22943e 100755 > > > --- a/t/t5410-receive-pack.sh > > > +++ b/t/t5410-receive-pack.sh > > > @@ -62,4 +62,26 @@ test_expect_success 'receive-pack missing objects fails connectivity check' ' > > > test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) > > > ' > > > > > > +test_expect_success 'receive-pack missing objects bypasses connectivity check' ' > > > + test_when_finished rm -rf repo remote.git setup.git && > > > + > > > + git init repo && > > > + git -C repo commit --allow-empty -m 1 && > > > + git clone --bare repo setup.git && > > > + git -C repo commit --allow-empty -m 2 && > > > + > > > + # Capture git-send-pack(1) output sent to git-receive-pack(1). > > > + git -C repo send-pack ../setup.git --all \ > > > + --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && > > > + > > > + # Replay captured git-send-pack(1) output on new empty repository. > > > + git init --bare remote.git && > > > + git receive-pack --skip-connectivity-check remote.git <out >actual 2>err && > > > + > > > + test_grep ! "missing necessary objects" actual && > > > + test_must_be_empty err && > > > + git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) && > > > + test_must_fail git -C remote.git rev-list $(git -C repo rev-parse HEAD) > > > +' > > > + > > > test_done > > > > This test case seems to hang occasionally in the "win+Meson test" jobs on > > GitHub (I tried to find the same failure at > > https://gitlab.com/gitlab-org/git/-/pipelines but couldn't find any). See > > for example > > https://github.com/gitgitgadget/git/actions/runs/15383915635/job/43279134837#step:6:627 > > > > Note that this problem afflicts only the "win+Meson test" jobs; The > > corresponding "win test" job seems not to hang. > > > > Even in the Git for Windows project, where the `win+VS test` jobs are run, > > the t5410 test passes within a dozen seconds or so, see e.g. > > https://github.com/git-for-windows/git/actions/runs/15383945895/job/43279689086#step:5:143 > > (confusingly, the subset of tests run in the matrix jobs differs between > > the `win+Meson test` jobs and the `win+VS test` jobs, but if you click > > through all of the `win+Meson test` jobs, expand the `test` step, > > patiently wait a few seconds for the log to be lazy loaded "enough" for > > the search to work, you will notice that t5410 is not mentioned in any of > > them, and the only one that times out after 4h37m11s is > > https://github.com/git-for-windows/git/actions/runs/15383945895/job/43279753911, > > likely while running 5410, too). > > > > Do you have any idea why this particular test case, in conjunction with > > Windows and Meson (and only on GitHub) acts up like this? > > Thanks Johannes for the report. I'm not quite sure yet what is going on > here, but I'll dig into this a bit and see what I can figure out. :) I've been banging my head against this issue for a bit today. A couple of findings: - The issue is specific to Git for Windows, I could only reproduce it when working with aa550efd0bb (fixup??? survey: add command line opts to select references, 2025-05-08). - When working on top of the above commit the bug is consistent. It doesn't only happen in GitHub, but also happens in GitLab CI [1]. - That being said, I still can't reproduce it locally?! This one is quite puzzling to me. I have tried to get my environment as close as possible to the environment we have in the CI systems. - I have a fix, see the patch further down. But I don't understand that fix just yet. I saw that all other sites where inject a custom receive-pack command also use a wrapper script, so it's not the worst thing to do. But it would be great to understand why this issue exists in the first place. Patrick [1]: https://github.com/pks-t/git/actions/runs/15416185892/job/43379399861 diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh index f76a22943ef..112da408d45 100755 --- a/t/t5410-receive-pack.sh +++ b/t/t5410-receive-pack.sh @@ -49,9 +49,13 @@ test_expect_success 'receive-pack missing objects fails connectivity check' ' git clone --bare repo setup.git && git -C repo commit --allow-empty -m 2 && + write_script receive-pack-wrapper <<-EOF && + tee "$(pwd)/out" | git-receive-pack "\$@" + EOF + # Capture git-send-pack(1) output sent to git-receive-pack(1). git -C repo send-pack ../setup.git --all \ - --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && + --receive-pack="${SQ}$(pwd)${SQ}/receive-pack-wrapper" && # Replay captured git-send-pack(1) output on new empty repository. git init --bare remote.git && @@ -70,9 +74,13 @@ test_expect_success 'receive-pack missing objects bypasses connectivity check' ' git clone --bare repo setup.git && git -C repo commit --allow-empty -m 2 && + write_script receive-pack-wrapper <<-EOF && + tee "$(pwd)/out" | git-receive-pack "\$@" + EOF + # Capture git-send-pack(1) output sent to git-receive-pack(1). git -C repo send-pack ../setup.git --all \ - --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && + --receive-pack="${SQ}$(pwd)${SQ}/receive-pack-wrapper" && # Replay captured git-send-pack(1) output on new empty repository. git init --bare remote.git && ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v2 2/2] builtin/receive-pack: add option to skip connectivity check 2025-06-03 12:40 ` Patrick Steinhardt @ 2025-06-05 10:17 ` Johannes Schindelin 2025-06-05 11:00 ` Patrick Steinhardt 0 siblings, 1 reply; 25+ messages in thread From: Johannes Schindelin @ 2025-06-05 10:17 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: Justin Tobler, git, karthik.188 Hi Patrick, On Tue, 3 Jun 2025, Patrick Steinhardt wrote: > On Mon, Jun 02, 2025 at 10:59:53AM -0500, Justin Tobler wrote: > > On 25/06/02 05:01PM, Johannes Schindelin wrote: > > > Hi Justin, > > > > > > On Tue, 20 May 2025, Justin Tobler wrote: > > > > > > > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh > > > > index 9afea54a26..f76a22943e 100755 > > > > --- a/t/t5410-receive-pack.sh > > > > +++ b/t/t5410-receive-pack.sh > > > > @@ -62,4 +62,26 @@ test_expect_success 'receive-pack missing objects fails connectivity check' ' > > > > test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) > > > > ' > > > > > > > > +test_expect_success 'receive-pack missing objects bypasses connectivity check' ' > > > > + test_when_finished rm -rf repo remote.git setup.git && > > > > + > > > > + git init repo && > > > > + git -C repo commit --allow-empty -m 1 && > > > > + git clone --bare repo setup.git && > > > > + git -C repo commit --allow-empty -m 2 && > > > > + > > > > + # Capture git-send-pack(1) output sent to git-receive-pack(1). > > > > + git -C repo send-pack ../setup.git --all \ > > > > + --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && > > > > + > > > > + # Replay captured git-send-pack(1) output on new empty repository. > > > > + git init --bare remote.git && > > > > + git receive-pack --skip-connectivity-check remote.git <out >actual 2>err && > > > > + > > > > + test_grep ! "missing necessary objects" actual && > > > > + test_must_be_empty err && > > > > + git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) && > > > > + test_must_fail git -C remote.git rev-list $(git -C repo rev-parse HEAD) > > > > +' > > > > + > > > > test_done > > > > > > This test case seems to hang occasionally in the "win+Meson test" jobs on > > > GitHub (I tried to find the same failure at > > > https://gitlab.com/gitlab-org/git/-/pipelines but couldn't find any). See > > > for example > > > https://github.com/gitgitgadget/git/actions/runs/15383915635/job/43279134837#step:6:627 > > > > > > Note that this problem afflicts only the "win+Meson test" jobs; The > > > corresponding "win test" job seems not to hang. > > > > > > Even in the Git for Windows project, where the `win+VS test` jobs are run, > > > the t5410 test passes within a dozen seconds or so, see e.g. > > > https://github.com/git-for-windows/git/actions/runs/15383945895/job/43279689086#step:5:143 > > > (confusingly, the subset of tests run in the matrix jobs differs between > > > the `win+Meson test` jobs and the `win+VS test` jobs, but if you click > > > through all of the `win+Meson test` jobs, expand the `test` step, > > > patiently wait a few seconds for the log to be lazy loaded "enough" for > > > the search to work, you will notice that t5410 is not mentioned in any of > > > them, and the only one that times out after 4h37m11s is > > > https://github.com/git-for-windows/git/actions/runs/15383945895/job/43279753911, > > > likely while running 5410, too). > > > > > > Do you have any idea why this particular test case, in conjunction with > > > Windows and Meson (and only on GitHub) acts up like this? > > > > Thanks Johannes for the report. I'm not quite sure yet what is going on > > here, but I'll dig into this a bit and see what I can figure out. :) > > I've been banging my head against this issue for a bit today. A couple > of findings: > > - The issue is specific to Git for Windows, I could only reproduce it > when working with aa550efd0bb (fixup??? survey: add command line > opts to select references, 2025-05-08). I can reproduce it consistently with Git's `master`, see e.g. https://github.com/git/git/actions/runs/15454602308/job/43504424816#step:6:627 > - When working on top of the above commit the bug is consistent. It > doesn't only happen in GitHub, but also happens in GitLab CI [1]. > > - That being said, I still can't reproduce it locally?! This one is > quite puzzling to me. I have tried to get my environment as close as > possible to the environment we have in the CI systems. I, too, was unable to reproduce locally (probably because of the way the runners start the processes, without an initial Win32 Console and all). So I took to mxschmitt/action-tmate to debug on the runner itself. It is a bit tricky to do, as MSVC's debugger runs in a graphical IDE and gdb is unable to find the symbols. > - I have a fix, see the patch further down. But I don't understand > that fix just yet. I would like to propose an alternative: https://lore.kernel.org/git/pull.1932.git.1749118606047.gitgitgadget@gmail.com The reason why I prefer that solution is that I suspect the extra script to make the conditions only less likely, but not impossible, for the bug to rear its ugly head. Ciao, Johannes > > I saw that all other sites where inject a custom receive-pack command > also use a wrapper script, so it's not the worst thing to do. But it > would be great to understand why this issue exists in the first place. > > Patrick > > [1]: https://github.com/pks-t/git/actions/runs/15416185892/job/43379399861 > > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh > index f76a22943ef..112da408d45 100755 > --- a/t/t5410-receive-pack.sh > +++ b/t/t5410-receive-pack.sh > @@ -49,9 +49,13 @@ test_expect_success 'receive-pack missing objects fails connectivity check' ' > git clone --bare repo setup.git && > git -C repo commit --allow-empty -m 2 && > > + write_script receive-pack-wrapper <<-EOF && > + tee "$(pwd)/out" | git-receive-pack "\$@" > + EOF > + > # Capture git-send-pack(1) output sent to git-receive-pack(1). > git -C repo send-pack ../setup.git --all \ > - --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && > + --receive-pack="${SQ}$(pwd)${SQ}/receive-pack-wrapper" && > > # Replay captured git-send-pack(1) output on new empty repository. > git init --bare remote.git && > @@ -70,9 +74,13 @@ test_expect_success 'receive-pack missing objects bypasses connectivity check' ' > git clone --bare repo setup.git && > git -C repo commit --allow-empty -m 2 && > > + write_script receive-pack-wrapper <<-EOF && > + tee "$(pwd)/out" | git-receive-pack "\$@" > + EOF > + > # Capture git-send-pack(1) output sent to git-receive-pack(1). > git -C repo send-pack ../setup.git --all \ > - --receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" && > + --receive-pack="${SQ}$(pwd)${SQ}/receive-pack-wrapper" && > > # Replay captured git-send-pack(1) output on new empty repository. > git init --bare remote.git && > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 2/2] builtin/receive-pack: add option to skip connectivity check 2025-06-05 10:17 ` Johannes Schindelin @ 2025-06-05 11:00 ` Patrick Steinhardt 0 siblings, 0 replies; 25+ messages in thread From: Patrick Steinhardt @ 2025-06-05 11:00 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Justin Tobler, git, karthik.188 On Thu, Jun 05, 2025 at 12:17:17PM +0200, Johannes Schindelin wrote: > Hi Patrick, > On Tue, 3 Jun 2025, Patrick Steinhardt wrote: > > On Mon, Jun 02, 2025 at 10:59:53AM -0500, Justin Tobler wrote: > > > On 25/06/02 05:01PM, Johannes Schindelin wrote: > > > Thanks Johannes for the report. I'm not quite sure yet what is going on > > > here, but I'll dig into this a bit and see what I can figure out. :) > > > > I've been banging my head against this issue for a bit today. A couple > > of findings: > > > > - The issue is specific to Git for Windows, I could only reproduce it > > when working with aa550efd0bb (fixup??? survey: add command line > > opts to select references, 2025-05-08). > > I can reproduce it consistently with Git's `master`, see e.g. > https://github.com/git/git/actions/runs/15454602308/job/43504424816#step:6:627 Huh, interesting. Now that makes me wonder why I couldn't. > > - When working on top of the above commit the bug is consistent. It > > doesn't only happen in GitHub, but also happens in GitLab CI [1]. > > > > - That being said, I still can't reproduce it locally?! This one is > > quite puzzling to me. I have tried to get my environment as close as > > possible to the environment we have in the CI systems. > > I, too, was unable to reproduce locally (probably because of the way the > runners start the processes, without an initial Win32 Console and all). So > I took to mxschmitt/action-tmate to debug on the runner itself. It is a > bit tricky to do, as MSVC's debugger runs in a graphical IDE and gdb is > unable to find the symbols. > > > - I have a fix, see the patch further down. But I don't understand > > that fix just yet. > > I would like to propose an alternative: > https://lore.kernel.org/git/pull.1932.git.1749118606047.gitgitgadget@gmail.com > > The reason why I prefer that solution is that I suspect the extra script > to make the conditions only less likely, but not impossible, for the bug > to rear its ugly head. Interesting find! I'll comment on that thread, thanks! Patrick ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 0/2] builtin/receive-pack: introduce option to skip connectivity checks 2025-05-20 16:32 ` [PATCH v2 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler 2025-05-20 16:32 ` [PATCH v2 1/2] t5410: test receive-pack connectivity check Justin Tobler 2025-05-20 16:32 ` [PATCH v2 2/2] builtin/receive-pack: add option to skip " Justin Tobler @ 2025-05-22 9:09 ` Karthik Nayak 2 siblings, 0 replies; 25+ messages in thread From: Karthik Nayak @ 2025-05-22 9:09 UTC (permalink / raw) To: Justin Tobler, git; +Cc: ps [-- Attachment #1: Type: text/plain, Size: 3590 bytes --] Justin Tobler <jltobler@gmail.com> writes: [snip] > Range-diff against v1: > 1: f659612c9d = 1: f659612c9d t5410: test receive-pack connectivity check > 2: 31e5f41983 ! 2: f6dbb02778 builtin/receive-pack: add option to skip connectivity check > @@ Commit message > During git-receive-pack(1), connectivity of the object graph is > validated to ensure that the received packfile does not leave the > repository in a broken state. This is done via git-rev-list(1) and > - walking the objects which can be expensive for large repositories. > + walking the objects, which can be expensive for large repositories. > > Generally, this check is critical to avoid an incomplete received > packfile from corrupting a repository. Server operators may have > additional knowledge though around exactly how Git is being used on the > server-side which can be used to facilitate more efficient connectivity > - computatation of incoming objects. > + computation of incoming objects. > > For example, if it can be ensured that all objects in a repository are > connected and do not depend on any missing objects, the connectivity of > @@ Documentation/git-receive-pack.adoc: OPTIONS > `--http-backend-info-refs` in linkgit:git-upload-pack[1]. > > +--skip-connectivity-check:: > -+ Bypasses the connectivity checks performed to validate incoming > -+ objects. This option exists for server operators that may want to > -+ implement their own object connectivity check outside of Git. This is > -+ useful in such cases where the server-side knows additional information > -+ about how Git is being used and thus can rely on guarantees to more > -+ efficiently compute object connectivity that Git itself cannot make. > -+ Usage of this option without a separate mechanism to validate and > -+ ensure incoming objects connect properly to the references risks a > -+ repository becoming corrupted and should not be used in the general > -+ case. > ++ Bypasses the connectivity checks that validate the existence of all > ++ objects in the transitive closure of reachable objects. This option is > ++ intended for server operators that want to implement their own object > ++ connectivity validation outside of Git. This is useful in such cases > ++ where the server-side knows additional information about how Git is > ++ being used and thus can rely on certain guarantees to more efficiently > ++ compute object connectivity that Git itself cannot make. Usage of this > ++ option without a reliable external mechanism to ensure full reachable > ++ object connectivity risks corrupting the repository and should not be > ++ used in the general case. > + > PRE-RECEIVE HOOK > ---------------- > @@ t/t5410-receive-pack.sh: test_expect_success 'receive-pack missing objects fails > + > + test_grep ! "missing necessary objects" actual && > + test_must_be_empty err && > -+ git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) > ++ git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) && > ++ test_must_fail git -C remote.git rev-list $(git -C repo rev-parse HEAD) > +' > + > test_done > > base-commit: 6c0bd1fc70efaf053abe4e57c976afdc72d15377 > -- > 2.49.0.111.g5b97a56fa0 The range-diff looks good, and seems to address the review comments from the previous iteration. The series looks good to me! Thanks, Karthik [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2025-06-05 11:00 UTC | newest] Thread overview: 25+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-05-07 3:02 [RFC PATCH 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler 2025-05-07 3:02 ` [RFC PATCH 1/2] t5412: test receive-pack connectivity check Justin Tobler 2025-05-07 13:28 ` Patrick Steinhardt 2025-05-19 21:08 ` Justin Tobler 2025-05-07 3:02 ` [RFC PATCH 2/2] builtin/receive-pack: add option to skip " Justin Tobler 2025-05-07 13:28 ` Patrick Steinhardt 2025-05-07 17:20 ` Junio C Hamano 2025-05-20 1:49 ` [PATCH 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler 2025-05-20 1:49 ` [PATCH 1/2] t5410: test receive-pack connectivity check Justin Tobler 2025-05-20 9:11 ` Karthik Nayak 2025-05-20 1:49 ` [PATCH 2/2] builtin/receive-pack: add option to skip " Justin Tobler 2025-05-20 5:17 ` Patrick Steinhardt 2025-05-20 15:10 ` Justin Tobler 2025-05-20 9:16 ` Karthik Nayak 2025-05-20 16:32 ` [PATCH v2 0/2] builtin/receive-pack: introduce option to skip connectivity checks Justin Tobler 2025-05-20 16:32 ` [PATCH v2 1/2] t5410: test receive-pack connectivity check Justin Tobler 2025-05-20 16:32 ` [PATCH v2 2/2] builtin/receive-pack: add option to skip " Justin Tobler 2025-06-02 15:01 ` Johannes Schindelin 2025-06-02 15:59 ` Justin Tobler 2025-06-02 16:06 ` Johannes Schindelin 2025-06-02 18:16 ` Junio C Hamano 2025-06-03 12:40 ` Patrick Steinhardt 2025-06-05 10:17 ` Johannes Schindelin 2025-06-05 11:00 ` Patrick Steinhardt 2025-05-22 9:09 ` [PATCH v2 0/2] builtin/receive-pack: introduce option to skip connectivity checks Karthik Nayak
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).