All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Nikolay Edigaryev via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Derrick Stolee <stolee@gmail.com>,
	Nikolay Edigaryev <edigaryev@gmail.com>,
	Nikolay Edigaryev <edigaryev@gmail.com>
Subject: [PATCH] clone: support cloning of filtered bundles
Date: Sun, 14 Jan 2024 11:16:49 +0000	[thread overview]
Message-ID: <pull.1644.git.git.1705231010118.gitgitgadget@gmail.com> (raw)

From: Nikolay Edigaryev <edigaryev@gmail.com>

f18b512bbb (bundle: create filtered bundles, 2022-03-09) introduced an
incredibly useful ability to create filtered bundles, which advances
the partial clone/promisor support in Git and allows for archiving
large repositories to object storages like S3 in bundles that are:

* easy to manage
  * bundle is just a single file, it's easier to guarantee atomic
    replacements in object storages like S3 and they are faster to
    fetch than a bare repository since there's only a single GET
    request involved
* incredibly tiny
  * no indexes (which may be more than 10 MB for some repositories)
    and other fluff, compared to cloning a bare repository
  * bundle can be filtered to only contain the tips of refs neccessary
    for e.g. code-analysis purposes

However, in 86fdd94d72 (clone: fail gracefully when cloning filtered
bundle, 2022-03-09) the cloning of such bundles was disabled, with a
note that this behavior is not desired, and it the long-term this
should be possible.

The commit above states that it's not possible to have this at the
moment due to lack of remote and a repository-global config that
specifies an object filter, yet it's unclear why a remote-specific
config can't be used instead, which is what this change does.

Signed-off-by: Nikolay Edigaryev <edigaryev@gmail.com>
---
    clone: support cloning of filtered bundles
    
    f18b512bbb (bundle: create filtered bundles, 2022-03-09) introduced an
    incredibly useful ability to create filtered bundles, which advances the
    partial clone/promisor support in Git and allows for archiving large
    repositories to object storages like S3 in bundles that are:
    
     * easy to manage
       * bundle is just a single file, it's easier to guarantee atomic
         replacements in object storages like S3 and they are faster to
         fetch than a bare repository since there's only a single GET
         request involved
     * incredibly tiny
       * no indexes (which may be more than 10 MB for some repositories) and
         other fluff, compared to cloning a bare repository
       * bundle can be filtered to only contain the tips of refs neccessary
         for e.g. code-analysis purposes
    
    However, in 86fdd94d72 (clone: fail gracefully when cloning filtered
    bundle, 2022-03-09) the cloning of such bundles was disabled, with a
    note that this behavior is not desired, and it the long-term this should
    be possible.
    
    The commit above states that it's not possible to have this at the
    moment due to lack of remote and a repository-global config that
    specifies an object filter, yet it's unclear why a remote-specific
    config can't be used instead, which is what this change does.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1644%2Fedigaryev%2Fclone-filtered-bundles-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1644/edigaryev/clone-filtered-bundles-v1
Pull-Request: https://github.com/git/git/pull/1644

 builtin/clone.c        | 13 +++++++++++--
 t/t6020-bundle-misc.sh | 13 +++----------
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index c6357af9498..4b3fedf78ed 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1227,9 +1227,18 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 
 		if (fd > 0)
 			close(fd);
+
+		if (has_filter) {
+			strbuf_addf(&key, "remote.%s.promisor", remote_name);
+			git_config_set(key.buf, "true");
+			strbuf_reset(&key);
+
+			strbuf_addf(&key, "remote.%s.partialclonefilter", remote_name);
+			git_config_set(key.buf, expand_list_objects_filter_spec(&header.filter));
+			strbuf_reset(&key);
+		}
+
 		bundle_header_release(&header);
-		if (has_filter)
-			die(_("cannot clone from filtered bundle"));
 	}
 
 	transport_set_option(transport, TRANS_OPT_KEEP, "yes");
diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
index 3e6bcbf30cd..f449df00642 100755
--- a/t/t6020-bundle-misc.sh
+++ b/t/t6020-bundle-misc.sh
@@ -555,16 +555,9 @@ do
 	'
 done
 
-# NEEDSWORK: 'git clone --bare' should be able to clone from a filtered
-# bundle, but that requires a change to promisor/filter config options.
-# For now, we fail gracefully with a helpful error. This behavior can be
-# changed in the future to succeed as much as possible.
-test_expect_success 'cloning from filtered bundle has useful error' '
-	git bundle create partial.bdl \
-		--all \
-		--filter=blob:none &&
-	test_must_fail git clone --bare partial.bdl partial 2>err &&
-	grep "cannot clone from filtered bundle" err
+test_expect_success 'cloning from filtered bundle works' '
+	git bundle create partial.bdl --all --filter=blob:none &&
+	git clone --bare partial.bdl partial 2>err
 '
 
 test_expect_success 'verify catches unreachable, broken prerequisites' '

base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d
-- 
gitgitgadget

             reply	other threads:[~2024-01-14 11:16 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-14 11:16 Nikolay Edigaryev via GitGitGadget [this message]
2024-01-14 18:00 ` [PATCH] clone: support cloning of filtered bundles Phillip Wood
2024-01-14 19:39   ` Nikolay Edigaryev
2024-01-14 21:26     ` Nikolay Edigaryev
2024-01-15 10:35       ` phillip.wood123
2024-01-15 10:18     ` phillip.wood123
2024-01-15  1:13 ` Junio C Hamano
2024-01-15  2:09   ` Junio C Hamano
2024-01-16 20:06     ` Nikolay Edigaryev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1644.git.git.1705231010118.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=edigaryev@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.