From: Jeff Hostetler <git@jeffhostetler.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net, jonathantanmy@google.com,
Jeff Hostetler <jeffhost@microsoft.com>
Subject: [PATCH v4 10/10] gc: do not repack promisor packfiles
Date: Thu, 16 Nov 2017 18:12:57 +0000 [thread overview]
Message-ID: <20171116181257.61673-11-git@jeffhostetler.com> (raw)
In-Reply-To: <20171116181257.61673-1-git@jeffhostetler.com>
From: Jonathan Tan <jonathantanmy@google.com>
Teach gc to stop traversal at promisor objects, and to leave promisor
packfiles alone. This has the effect of only repacking non-promisor
packfiles, and preserves the distinction between promisor packfiles and
non-promisor packfiles.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
Documentation/git-pack-objects.txt | 12 ++++++++-
builtin/gc.c | 3 +++
builtin/pack-objects.c | 36 ++++++++++++++++++++++++++
builtin/prune.c | 7 +++++
builtin/repack.c | 8 ++++--
t/t0410-partial-clone.sh | 52 +++++++++++++++++++++++++++++++++++++-
6 files changed, 114 insertions(+), 4 deletions(-)
diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 5fad696..33a824e 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -242,9 +242,19 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it creates a bundle.
the resulting packfile. See linkgit:git-rev-list[1] for valid
`<filter-spec>` forms.
---missing=(error|allow-any):
+--missing=(error|allow-any|allow-promisor):
Specifies how missing objects are handled. This is useful, for
example, when there are missing objects from a prior partial clone.
+ This is stronger than `--missing=allow-promisor` because it limits
+ the traversal, rather than just silencing errors about missing
+ objects.
+
+--exclude-promisor-objects::
+ Omit objects that are known to be in the promisor remote". (This
+ option has the purpose of operating only on locally created objects,
+ so that when we repack, we still maintain a distinction between
+ locally created objects [without .promisor] and objects from the
+ promisor remote [with .promisor].) This is used with partial clone.
SEE ALSO
--------
diff --git a/builtin/gc.c b/builtin/gc.c
index 3c5eae0..77fa720 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -458,6 +458,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
argv_array_push(&prune, prune_expire);
if (quiet)
argv_array_push(&prune, "--no-progress");
+ if (repository_format_partial_clone)
+ argv_array_push(&prune,
+ "--exclude-promisor-objects");
if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
return error(FAILED_RUN, prune.argv[0]);
}
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 45ad35d..4534209 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -75,6 +75,8 @@ static int use_bitmap_index = -1;
static int write_bitmap_index;
static uint16_t write_bitmap_options;
+static int exclude_promisor_objects;
+
static unsigned long delta_cache_size = 0;
static unsigned long max_delta_cache_size = 256 * 1024 * 1024;
static unsigned long cache_max_small_delta_size = 1000;
@@ -86,6 +88,7 @@ static struct list_objects_filter_options filter_options;
enum missing_action {
MA_ERROR = 0, /* fail if any missing objects are encountered */
MA_ALLOW_ANY, /* silently allow ALL missing objects */
+ MA_ALLOW_PROMISOR, /* silently allow all missing PROMISOR objects */
};
static enum missing_action arg_missing_action;
static show_object_fn fn_show_object;
@@ -2577,6 +2580,20 @@ static void show_object__ma_allow_any(struct object *obj, const char *name, void
show_object(obj, name, data);
}
+static void show_object__ma_allow_promisor(struct object *obj, const char *name, void *data)
+{
+ assert(arg_missing_action == MA_ALLOW_PROMISOR);
+
+ /*
+ * Quietly ignore EXPECTED missing objects. This avoids problems with
+ * staging them now and getting an odd error later.
+ */
+ if (!has_object_file(&obj->oid) && is_promisor_object(&obj->oid))
+ return;
+
+ show_object(obj, name, data);
+}
+
static int option_parse_missing_action(const struct option *opt,
const char *arg, int unset)
{
@@ -2591,10 +2608,18 @@ static int option_parse_missing_action(const struct option *opt,
if (!strcmp(arg, "allow-any")) {
arg_missing_action = MA_ALLOW_ANY;
+ fetch_if_missing = 0;
fn_show_object = show_object__ma_allow_any;
return 0;
}
+ if (!strcmp(arg, "allow-promisor")) {
+ arg_missing_action = MA_ALLOW_PROMISOR;
+ fetch_if_missing = 0;
+ fn_show_object = show_object__ma_allow_promisor;
+ return 0;
+ }
+
die(_("invalid value for --missing"));
return 0;
}
@@ -3008,6 +3033,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
{ OPTION_CALLBACK, 0, "missing", NULL, N_("action"),
N_("handling for missing objects"), PARSE_OPT_NONEG,
option_parse_missing_action },
+ OPT_BOOL(0, "exclude-promisor-objects", &exclude_promisor_objects,
+ N_("do not pack objects in promisor packfiles")),
OPT_END(),
};
@@ -3053,6 +3080,15 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
argv_array_push(&rp, "--unpacked");
}
+ if (exclude_promisor_objects) {
+ use_internal_rev_list = 1;
+ fetch_if_missing = 0;
+ argv_array_push(&rp, "--exclude-promisor-objects");
+ /* silently override any "--missing=" value */
+ arg_missing_action = MA_ALLOW_PROMISOR;
+ fn_show_object = show_object__ma_allow_promisor;
+ }
+
if (!reuse_object)
reuse_delta = 0;
if (pack_compression_level == -1)
diff --git a/builtin/prune.c b/builtin/prune.c
index cddabf2..be34645 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -101,12 +101,15 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
{
struct rev_info revs;
struct progress *progress = NULL;
+ int exclude_promisor_objects = 0;
const struct option options[] = {
OPT__DRY_RUN(&show_only, N_("do not remove, show only")),
OPT__VERBOSE(&verbose, N_("report pruned objects")),
OPT_BOOL(0, "progress", &show_progress, N_("show progress")),
OPT_EXPIRY_DATE(0, "expire", &expire,
N_("expire objects older than <time>")),
+ OPT_BOOL(0, "exclude-promisor-objects", &exclude_promisor_objects,
+ N_("limit traversal to objects outside promisor packfiles")),
OPT_END()
};
char *s;
@@ -139,6 +142,10 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
show_progress = isatty(2);
if (show_progress)
progress = start_delayed_progress(_("Checking connectivity"), 0);
+ if (exclude_promisor_objects) {
+ fetch_if_missing = 0;
+ revs.exclude_promisor_objects = 1;
+ }
mark_reachable_objects(&revs, 1, expire, progress);
stop_progress(&progress);
diff --git a/builtin/repack.c b/builtin/repack.c
index f17a68a..7bdb401 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -83,7 +83,8 @@ static void remove_pack_on_signal(int signo)
/*
* Adds all packs hex strings to the fname list, which do not
- * have a corresponding .keep file.
+ * have a corresponding .keep or .promisor file. These packs are not to
+ * be kept if we are going to pack everything into one file.
*/
static void get_non_kept_pack_filenames(struct string_list *fname_list)
{
@@ -101,7 +102,8 @@ static void get_non_kept_pack_filenames(struct string_list *fname_list)
fname = xmemdupz(e->d_name, len);
- if (!file_exists(mkpath("%s/%s.keep", packdir, fname)))
+ if (!file_exists(mkpath("%s/%s.keep", packdir, fname)) &&
+ !file_exists(mkpath("%s/%s.promisor", packdir, fname)))
string_list_append_nodup(fname_list, fname);
else
free(fname);
@@ -232,6 +234,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
argv_array_push(&cmd.args, "--all");
argv_array_push(&cmd.args, "--reflog");
argv_array_push(&cmd.args, "--indexed-objects");
+ if (repository_format_partial_clone)
+ argv_array_push(&cmd.args, "--exclude-promisor-objects");
if (window)
argv_array_pushf(&cmd.args, "--window=%s", window);
if (window_memory)
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 3ca6af5..071736c 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -11,13 +11,15 @@ delete_object () {
pack_as_from_promisor () {
HASH=$(git -C repo pack-objects .git/objects/pack/pack) &&
>repo/.git/objects/pack/pack-$HASH.promisor
+ echo $HASH
}
promise_and_delete () {
HASH=$(git -C repo rev-parse "$1") &&
git -C repo tag -a -m message my_annotated_tag "$HASH" &&
git -C repo rev-parse my_annotated_tag | pack_as_from_promisor &&
- git -C repo tag -d my_annotated_tag &&
+ # tag -d prints a message to stdout, so redirect it
+ git -C repo tag -d my_annotated_tag >/dev/null &&
delete_object repo "$HASH"
}
@@ -261,6 +263,54 @@ test_expect_success 'rev-list accepts missing and promised objects on command li
git -C repo rev-list --exclude-promisor-objects --objects "$COMMIT" "$TREE" "$BLOB"
'
+test_expect_success 'gc does not repack promisor objects' '
+ rm -rf repo &&
+ test_create_repo repo &&
+ test_commit -C repo my_commit &&
+
+ TREE_HASH=$(git -C repo rev-parse HEAD^{tree}) &&
+ HASH=$(printf "$TREE_HASH\n" | pack_as_from_promisor) &&
+
+ git -C repo config core.repositoryformatversion 1 &&
+ git -C repo config extensions.partialclone "arbitrary string" &&
+ git -C repo gc &&
+
+ # Ensure that the promisor packfile still exists, and remove it
+ test -e repo/.git/objects/pack/pack-$HASH.pack &&
+ rm repo/.git/objects/pack/pack-$HASH.* &&
+
+ # Ensure that the single other pack contains the commit, but not the tree
+ ls repo/.git/objects/pack/pack-*.pack >packlist &&
+ test_line_count = 1 packlist &&
+ git verify-pack repo/.git/objects/pack/pack-*.pack -v >out &&
+ grep "$(git -C repo rev-parse HEAD)" out &&
+ ! grep "$TREE_HASH" out
+'
+
+test_expect_success 'gc stops traversal when a missing but promised object is reached' '
+ rm -rf repo &&
+ test_create_repo repo &&
+ test_commit -C repo my_commit &&
+
+ TREE_HASH=$(git -C repo rev-parse HEAD^{tree}) &&
+ HASH=$(promise_and_delete $TREE_HASH) &&
+
+ git -C repo config core.repositoryformatversion 1 &&
+ git -C repo config extensions.partialclone "arbitrary string" &&
+ git -C repo gc &&
+
+ # Ensure that the promisor packfile still exists, and remove it
+ test -e repo/.git/objects/pack/pack-$HASH.pack &&
+ rm repo/.git/objects/pack/pack-$HASH.* &&
+
+ # Ensure that the single other pack contains the commit, but not the tree
+ ls repo/.git/objects/pack/pack-*.pack >packlist &&
+ test_line_count = 1 packlist &&
+ git verify-pack repo/.git/objects/pack/pack-*.pack -v >out &&
+ grep "$(git -C repo rev-parse HEAD)" out &&
+ ! grep "$TREE_HASH" out
+'
+
LIB_HTTPD_PORT=12345 # default port, 410, cannot be used as non-root
. "$TEST_DIRECTORY"/lib-httpd.sh
start_httpd
--
2.9.3
next prev parent reply other threads:[~2017-11-16 18:13 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-16 18:12 [PATCH v4 00/10] Partial clone part 2: fsck and promisors Jeff Hostetler
2017-11-16 18:12 ` [PATCH v4 01/10] extension.partialclone: introduce partial clone extension Jeff Hostetler
2017-11-16 18:12 ` [PATCH v4 02/10] fsck: introduce partialclone extension Jeff Hostetler
2017-11-16 18:12 ` [PATCH v4 03/10] fsck: support refs pointing to promisor objects Jeff Hostetler
2017-11-16 18:12 ` [PATCH v4 04/10] fsck: support referenced " Jeff Hostetler
2017-11-16 18:12 ` [PATCH v4 05/10] fsck: support promisor objects as CLI argument Jeff Hostetler
2017-11-16 18:12 ` [PATCH v4 06/10] index-pack: refactor writing of .keep files Jeff Hostetler
2017-11-16 18:12 ` [PATCH v4 07/10] introduce fetch-object: fetch one promisor object Jeff Hostetler
2017-11-16 19:57 ` Ramsay Jones
2017-11-17 19:49 ` Jeff Hostetler
2017-11-17 20:17 ` Stefan Beller
2017-11-20 15:56 ` Jeff Hostetler
2017-11-17 20:22 ` Ramsay Jones
2017-11-16 18:12 ` [PATCH v4 08/10] sha1_file: support lazily fetching missing objects Jeff Hostetler
2017-11-16 18:12 ` [PATCH v4 09/10] rev-list: support termination at promisor objects Jeff Hostetler
2017-11-16 18:12 ` Jeff Hostetler [this message]
2017-11-16 21:57 ` [PATCH v4 00/10] Partial clone part 2: fsck and promisors Jonathan Tan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171116181257.61673-11-git@jeffhostetler.com \
--to=git@jeffhostetler.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jeffhost@microsoft.com \
--cc=jonathantanmy@google.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.