From: Junio C Hamano <gitster@pobox.com>
To: Rafael Silva <rafaeloliveira.cs@gmail.com>
Cc: git@vger.kernel.org, "Jeff King" <peff@peff.net>,
"Jonathan Tan" <jonathantanmy@google.com>,
"SZEDER Gábor" <szeder.dev@gmail.com>
Subject: Re: [PATCH v2 1/1] repack: avoid loosening promisor objects in partial clones
Date: Mon, 19 Apr 2021 16:09:03 -0700
Message-ID: <xmqqa6pt98j4.fsf@gitster.g>
In-Reply-To: <20210418135749.27152-2-rafaeloliveira.cs@gmail.com> (Rafael Silva's message of "Sun, 18 Apr 2021 15:57:49 +0200")

Rafael Silva <rafaeloliveira.cs@gmail.com> writes:

> When `git repack -A -d` is run in a partial clone, `pack-objects`
> is invoked twice: once to repack all promisor objects, and once to
> repack all non-promisor objects. The latter `pack-objects` invocation
> is with --exclude-promisor-objects and --unpack-unreachable, which
> loosens all unused objects. Unfortunately, this includes promisor
> objects.
>
> Because the -d argument to `git repack` subsequently deletes all loose
> objects also in packs, these just-loosened promisor objects will be
> immediately deleted. However, this extra disk churn is unnecessary in
> the first place. For example, a newly-clone partial repo that filters

"in a newly-cloned partial repo", I'd think.

> For testing, we need to validate whether any object was loosened.
> However, the "evidence" (loosened objects) is deleted during the
> process which prevents us from inspecting the object directory.
> Instead, let's teach `pack-objects` to count loosened objects and
> emit via trace2 thus allowing inspecting the debug events after the
> process is finished. This new event is used on the added regression
> test.

Nicely designed.
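
To illustrate why a trace event works as evidence after the loosened objects are gone, here is a minimal sketch of checking the recorded count once the process has finished. It assumes a trace line carries the count as `key:value`; the helper name `find_trace_count` and the sample line in the usage note are hypothetical and not part of Git.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): look for "<key>:<number>" in one
 * line of trace output. Returns the number, or -1 if the key is missing
 * or is not followed by ':' and a non-negative decimal number.
 */
long find_trace_count(const char *line, const char *key)
{
	const char *p = strstr(line, key);
	char *end;
	long value;

	if (!p)
		return -1;
	p += strlen(key);
	if (*p != ':')
		return -1;
	p++;
	value = strtol(p, &end, 10);
	if (end == p || value < 0)
		return -1;
	return value;
}
```

A regression check could then require that `find_trace_count(line, "loosen_unused_packed_objects/loosened")` returns 0 for the matching line, without ever looking at the object directory.
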
> +	uint32_t loosened_objects_nr = 0;
>  	struct object_id oid;
>  
>  	for (p = get_all_packs(the_repository); p; p = p->next) {
> @@ -3492,11 +3493,16 @@ static void loosen_unused_packed_objects(void)
>  			nth_packed_object_id(&oid, p, i);
>  			if (!packlist_find(&to_pack, &oid) &&
>  			    !has_sha1_pack_kept_or_nonlocal(&oid) &&
> -			    !loosened_object_can_be_discarded(&oid, p->mtime))
> +			    !loosened_object_can_be_discarded(&oid, p->mtime)) {
>  				if (force_object_loose(&oid, p->mtime))
>  					die(_("unable to force loose object"));
> +				loosened_objects_nr++;
> +			}
>  		}
>  	}
> +
> +	trace2_data_intmax("pack-objects", the_repository,
> +			   "loosen_unused_packed_objects/loosened", loosened_objects_nr);
>  }

OK, so this is just the "stats".

> diff --git a/builtin/repack.c b/builtin/repack.c
> index 2847fdfbab..5f9bc74adc 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -20,7 +20,7 @@ static int delta_base_offset = 1;
> static int pack_kept_objects = -1;
> static int write_bitmaps = -1;
> static int use_delta_islands;
> -static char *packdir, *packtmp;
> +static char *packdir, *packtmp_name, *packtmp;
>
> static const char *const git_repack_usage[] = {
> N_("git repack [<options>]"),
> @@ -530,7 +530,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
> }
>
> packdir = mkpathdup("%s/pack", get_object_directory());
> - packtmp = mkpathdup("%s/.tmp-%d-pack", packdir, (int)getpid());
> + packtmp_name = xstrfmt(".tmp-%d-pack", (int)getpid());
> + packtmp = mkpathdup("%s/%s", packdir, packtmp_name);

Just a mental note, but we should move away from ".tmp-$$" that is a
remnant from the days back when this was a shell script, and use the
tempfile.h API (#leftoverbits). Such a change must not be part of
this topic, of course.
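
As a rough illustration of the difference, the sketch below contrasts a PID-derived name with a name chosen at creation time. It does not use Git's tempfile.h API; POSIX `mkstemp()` is a stand-in, and the function names and the `.tmp-pack-XXXXXX` template are assumptions made for this example only.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Mirrors the ".tmp-$$-pack" scheme: the name is fully determined by the
 * directory and the PID, and nothing is created on disk here.
 * Returns 0 on success, -1 if buf is too small.
 */
int pid_based_tmp_name(char *buf, size_t len, const char *dir)
{
	int n = snprintf(buf, len, "%s/.tmp-%d-pack", dir, (int)getpid());

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Stand-in for a tempfile API: mkstemp() replaces the XXXXXX suffix with
 * a unique string and creates the file exclusively. On success buf holds
 * the path and the open file descriptor is returned; the caller closes
 * and unlinks it. Returns -1 on failure.
 */
int unique_tmp_file(char *buf, size_t len, const char *dir)
{
	int n = snprintf(buf, len, "%s/.tmp-pack-XXXXXX", dir);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return mkstemp(buf);
}
```

The second helper leaves cleanup to the caller, which is the kind of bookkeeping a dedicated tempfile API would be expected to take over.
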

Thanks. Will queue and see what others say.