* [PATCH 0/2] builtin/repack: fix `--keep-unreachable` when there are no packs
@ 2025-02-03 13:06 Patrick Steinhardt
2025-02-03 13:06 ` [PATCH 1/2] t7700: add tests for `--keep-unreachable` Patrick Steinhardt
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Patrick Steinhardt @ 2025-02-03 13:06 UTC (permalink / raw)
To: git
Hi,
this small patch series fixes `git repack -ad --keep-unreachable` when
there aren't any preexisting packfiles.
Thanks!
Patrick
---
Patrick Steinhardt (2):
t7700: add tests for `--keep-unreachable`
builtin/repack: fix `--keep-unreachable` when there are no packs
builtin/repack.c | 5 +++-
t/t7700-repack.sh | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 77 insertions(+), 1 deletion(-)
---
base-commit: 3b0d05c4a79d0e441283680a864529b02dca5f08
change-id: 20250203-b4-pks-repack-unreachable-objects-wo-packfiles-33c26066e5c3
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/2] t7700: add tests for `--keep-unreachable`
2025-02-03 13:06 [PATCH 0/2] builtin/repack: fix `--keep-unreachable` when there are no packs Patrick Steinhardt
@ 2025-02-03 13:06 ` Patrick Steinhardt
2025-02-03 17:13 ` Junio C Hamano
2025-02-03 18:32 ` Jeff King
2025-02-03 13:06 ` [PATCH 2/2] builtin/repack: fix `--keep-unreachable` when there are no packs Patrick Steinhardt
2025-02-04 7:00 ` [PATCH v2] " Patrick Steinhardt
2 siblings, 2 replies; 15+ messages in thread
From: Patrick Steinhardt @ 2025-02-03 13:06 UTC (permalink / raw)
To: git
We don't have any tests for `git repack --keep-unreachable`. Add three
tests that exercise its behaviour with different packed states for the
unreachable object.
Note that the last test is failing. This will be fixed in the next
commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
t/t7700-repack.sh | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 73 insertions(+)
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index be1188e736..b26566473f 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -826,4 +826,77 @@ test_expect_success '-n overrides repack.updateServerInfo=true' '
test_server_info_missing
'
+expect_object_count () {
+ find .git/objects \( -type d \( -name pack -o -name info \) -prune \) -o -type f -print >objects &&
+ test_line_count = "$1" objects
+}
+
+expect_object_in_idx () {
+ git verify-pack -v "$1" >objects &&
+ test_grep "^$2" objects
+}
+
+test_expect_success '--keep-unreachable appends unreachable packed objects to new pack' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ git config set core.logAllRefUpdates false &&
+
+ # Set up the repo so that all objects, including the
+ # unreachable one, are packed.
+ test_commit --no-tag unreachable &&
+ git repack -ad &&
+ expect_object_count 0 &&
+ unreachable_oid=$(git rev-parse --verify HEAD) &&
+ git commit --amend --message rewritten &&
+
+ git repack -ad --keep-unreachable &&
+ expect_object_count 0 &&
+ expect_object_in_idx .git/objects/pack/*.idx "$unreachable_oid"
+ )
+'
+
+test_expect_success '--keep-unreachable packs unreachable loose object with existing packs' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ git config set core.logAllRefUpdates false &&
+
+ # Set up the repo so that we have an existing packfile with
+ # reachable objects, only. The unreachable object as well as
+ # the rewritten commit are both loose.
+ test_commit --no-tag initial &&
+ git repack -ad &&
+ git commit --amend --message unreachable &&
+ unreachable_oid=$(git rev-parse --verify HEAD) &&
+ git commit --amend --message rewritten &&
+ expect_object_count 2 &&
+
+ git repack -ad --keep-unreachable &&
+ expect_object_count 0 &&
+ expect_object_in_idx .git/objects/pack/*.idx "$unreachable_oid"
+ )
+'
+
+test_expect_failure '--keep-unreachable packs unreachable loose object without existing packs' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ git config set core.logAllRefUpdates false &&
+
+ # Set up the repo so that all objects are unpacked.
+ test_commit --no-tag unreachable &&
+ unreachable_oid=$(git rev-parse --verify HEAD) &&
+ git commit --amend --message rewritten &&
+ expect_object_count 4 &&
+
+ git repack -ad --keep-unreachable &&
+ expect_object_count 0 &&
+ expect_object_in_idx .git/objects/pack/*.idx "$unreachable_oid"
+ )
+'
+
test_done
--
2.48.1.502.g6dc24dfdaf.dirty
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/2] builtin/repack: fix `--keep-unreachable` when there are no packs
2025-02-03 13:06 [PATCH 0/2] builtin/repack: fix `--keep-unreachable` when there are no packs Patrick Steinhardt
2025-02-03 13:06 ` [PATCH 1/2] t7700: add tests for `--keep-unreachable` Patrick Steinhardt
@ 2025-02-03 13:06 ` Patrick Steinhardt
2025-02-04 3:01 ` Jeff King
2025-02-04 13:28 ` Junio C Hamano
2025-02-04 7:00 ` [PATCH v2] " Patrick Steinhardt
2 siblings, 2 replies; 15+ messages in thread
From: Patrick Steinhardt @ 2025-02-03 13:06 UTC (permalink / raw)
To: git
The "--keep-unreachable" flag is supposed to append any unreachable
objects to the newly written pack. This flag is explicitly documented as
appending both packed and loose unreachable objects to the new packfile.
And while this works alright when repacking with preexisting packfiles,
it stops working when the repository does not have any packfiles at all.
The root cause are the conditions used to decide whether or not we want
to append "--pack-loose-unreachable" to git-pack-objects(1). There are
a couple of conditions here:
- `has_existing_non_kept_packs()` checks whether there are existing
packfiles. This condition makes sense to guard "--keep-pack=",
"--unpack-unreachable" and "--keep-unreachable", because all of
these flags only make sense in combination with existing packfiles.
But it does not make sense to disable `--pack-loose-unreachable`
when there aren't any preexisting packfiles, as loose objects can be
packed into the new packfile regardless of that.
- `delete_redundant` checks whether we want to delete any objects or
packs that are about to become redundant. The documentation of
`--keep-unreachable` explicitly says that `git repack -ad` needs to
be executed for the flag to have an effect.
It is not immediately obvious why such redundant objects need to be
deleted in order for "--pack-unreachable-objects" to be effective.
But as things are working as documented this is nothing we'll change
for now.
- `pack_everything & PACK_CRUFT` checks that we're not creating a
cruft pack. This condition makes sense in the context of
"--pack-loose-unreachable", as unreachable objects would end up in
the cruft pack anyway.
So while the second and third condition are sensible, it does not make
any sense to condition `--pack-loose-unreachable` on the existence of
packfiles.
Fix the bug by splitting out the "--pack-loose-unreachable" and only
making it depend on the second and third condition. Like this, loose
unreachable objects will be packed regardless of any preexisting
packfiles.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/repack.c | 5 ++++-
t/t7700-repack.sh | 2 +-
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/builtin/repack.c b/builtin/repack.c
index 81d13630ea..8194344b04 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -1370,9 +1370,12 @@ int cmd_repack(int argc,
"--unpack-unreachable");
} else if (keep_unreachable) {
strvec_push(&cmd.args, "--keep-unreachable");
- strvec_push(&cmd.args, "--pack-loose-unreachable");
}
}
+
+ if (keep_unreachable && delete_redundant &&
+ !(pack_everything & PACK_CRUFT))
+ strvec_push(&cmd.args, "--pack-loose-unreachable");
} else if (geometry.split_factor) {
strvec_push(&cmd.args, "--stdin-packs");
strvec_push(&cmd.args, "--unpacked");
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index b26566473f..57523db696 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -880,7 +880,7 @@ test_expect_success '--keep-unreachable packs unreachable loose object with exis
)
'
-test_expect_failure '--keep-unreachable packs unreachable loose object without existing packs' '
+test_expect_success '--keep-unreachable packs unreachable loose object without existing packs' '
test_when_finished "rm -rf repo" &&
git init repo &&
(
--
2.48.1.502.g6dc24dfdaf.dirty
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] t7700: add tests for `--keep-unreachable`
2025-02-03 13:06 ` [PATCH 1/2] t7700: add tests for `--keep-unreachable` Patrick Steinhardt
@ 2025-02-03 17:13 ` Junio C Hamano
2025-02-03 18:32 ` Jeff King
1 sibling, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2025-02-03 17:13 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
Patrick Steinhardt <ps@pks.im> writes:
> +expect_object_count () {
> + find .git/objects \( -type d \( -name pack -o -name info \) -prune \) -o -type f -print >objects &&
> + test_line_count = "$1" objects
> +}
So this is counting "loose" objects. Do we want to have "loose"
somewhere in its name?
> +expect_object_in_idx () {
> + git verify-pack -v "$1" >objects &&
> + test_grep "^$2" objects
> +}
And this one checks if an object exists in a given pack in somewhat
an expensive way. It can take either .idx or .pack but the name
makes the caller assume it should be called with .idx, which is OK.
Would it achieve the same effect and be faster to do something like
git show-index <"$1" >objects &&
test_grep "^[0-9]* $2 (" objects
instead?
Thanks.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] t7700: add tests for `--keep-unreachable`
2025-02-03 13:06 ` [PATCH 1/2] t7700: add tests for `--keep-unreachable` Patrick Steinhardt
2025-02-03 17:13 ` Junio C Hamano
@ 2025-02-03 18:32 ` Jeff King
2025-02-03 23:53 ` Junio C Hamano
1 sibling, 1 reply; 15+ messages in thread
From: Jeff King @ 2025-02-03 18:32 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Feb 03, 2025 at 02:06:54PM +0100, Patrick Steinhardt wrote:
> We don't have any tests for `git repack --keep-unreachable`. Add three
> tests that exercise its behaviour with different packed states for the
> unreachable object.
There are a few in t7701. It's spelled "-k" there, so a grep for
"--keep-unreachable" would not find them.
> +test_expect_success '--keep-unreachable appends unreachable packed objects to new pack' '
> [...]
> +test_expect_success '--keep-unreachable packs unreachable loose object with existing packs' '
I think these match the two that are in t7701.
-Peff
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] t7700: add tests for `--keep-unreachable`
2025-02-03 18:32 ` Jeff King
@ 2025-02-03 23:53 ` Junio C Hamano
2025-02-04 2:35 ` Jeff King
0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2025-02-03 23:53 UTC (permalink / raw)
To: Jeff King; +Cc: Patrick Steinhardt, git
Jeff King <peff@peff.net> writes:
> On Mon, Feb 03, 2025 at 02:06:54PM +0100, Patrick Steinhardt wrote:
>
>> We don't have any tests for `git repack --keep-unreachable`. Add three
>> tests that exercise its behaviour with different packed states for the
>> unreachable object.
>
> There are a few in t7701. It's spelled "-k" there, so a grep for
> "--keep-unreachable" would not find them.
Ahh, good eyes. Thanks.
>
>> +test_expect_success '--keep-unreachable appends unreachable packed objects to new pack' '
>> [...]
>> +test_expect_success '--keep-unreachable packs unreachable loose object with existing packs' '
>
> I think these match the two that are in t7701.
Doubly thanks.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] t7700: add tests for `--keep-unreachable`
2025-02-03 23:53 ` Junio C Hamano
@ 2025-02-04 2:35 ` Jeff King
2025-02-04 7:00 ` Patrick Steinhardt
0 siblings, 1 reply; 15+ messages in thread
From: Jeff King @ 2025-02-04 2:35 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, git
On Mon, Feb 03, 2025 at 03:53:46PM -0800, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > On Mon, Feb 03, 2025 at 02:06:54PM +0100, Patrick Steinhardt wrote:
> >
> >> We don't have any tests for `git repack --keep-unreachable`. Add three
> >> tests that exercise its behaviour with different packed states for the
> >> unreachable object.
> >
> > There are a few in t7701. It's spelled "-k" there, so a grep for
> > "--keep-unreachable" would not find them.
>
> Ahh, good eyes. Thanks.
I got to cheat a little as the original author of the flag. ;)
-Peff
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] builtin/repack: fix `--keep-unreachable` when there are no packs
2025-02-03 13:06 ` [PATCH 2/2] builtin/repack: fix `--keep-unreachable` when there are no packs Patrick Steinhardt
@ 2025-02-04 3:01 ` Jeff King
2025-02-04 7:01 ` Patrick Steinhardt
2025-02-04 13:28 ` Junio C Hamano
1 sibling, 1 reply; 15+ messages in thread
From: Jeff King @ 2025-02-04 3:01 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Feb 03, 2025 at 02:06:55PM +0100, Patrick Steinhardt wrote:
> The "--keep-unreachable" flag is supposed to append any unreachable
> objects to the newly written pack. This flag is explicitly documented as
> appending both packed and loose unreachable objects to the new packfile.
> And while this works alright when repacking with preexisting packfiles,
> it stops working when the repository does not have any packfiles at all.
OK. I thought from the subject / cover letter this was going to be about
the fact that "git repack -adk" may sometimes say "Nothing new to pack".
And the issue there is that if there are no reachable objects, we don't
actually pack at al.
But this is a separate issue, where we actually do repack, but don't
correctly pass the options. Let's read on...
> The root cause are the conditions used to decide whether or not we want
> to append "--pack-loose-unreachable" to git-pack-objects(1). There are
> a couple of conditions here:
>
> - `has_existing_non_kept_packs()` checks whether there are existing
> packfiles. This condition makes sense to guard "--keep-pack=",
> "--unpack-unreachable" and "--keep-unreachable", because all of
> these flags only make sense in combination with existing packfiles.
> But it does not make sense to disable `--pack-loose-unreachable`
> when there aren't any preexisting packfiles, as loose objects can be
> packed into the new packfile regardless of that.
Yeah, this analysis makes sense, and is the root of the problem.
> - `delete_redundant` checks whether we want to delete any objects or
> packs that are about to become redundant. The documentation of
> `--keep-unreachable` explicitly says that `git repack -ad` needs to
> be executed for the flag to have an effect.
>
> It is not immediately obvious why such redundant objects need to be
> deleted in order for "--pack-unreachable-objects" to be effective.
> But as things are working as documented this is nothing we'll change
> for now.
I don't think it's strictly necessary to require "-d" here. The original
concept of "-k" was to modify "-d" to keep objects instead of loosening
(so an alternative to --unpack-unreachable/-A). And then it expanded to
collecting the loose objects, too, for the reasons given in e26a8c4721
(repack: extend --keep-unreachable to loose objects, 2016-06-13).
I think you could conceive of "-k" as an alternative to "-d", rather
than an alternative to "-A". I.e., so that "repack -ak" did the same as
"repack -adk" does now.
And it would probably not even be a big code change, but it's possible
there would be some unexpected fallout (the logic in repack is quite
tortured and intricate from my recollection).
So I don't know that it's really worth it to change now. Especially
because I think "-k" has mostly outlived its usefulness. Cruft packs
solve the same problem but keep the extra objects in their own pack,
where they're less likely to interfere with normal operations. I'd
recommend anybody considering "-k" now to look into cruft packs instead.
> - `pack_everything & PACK_CRUFT` checks that we're not creating a
> cruft pack. This condition makes sense in the context of
> "--pack-loose-unreachable", as unreachable objects would end up in
> the cruft pack anyway.
>
> So while the second and third condition are sensible, it does not make
> any sense to condition `--pack-loose-unreachable` on the existence of
> packfiles.
Yup, agreed.
> Fix the bug by splitting out the "--pack-loose-unreachable" and only
> making it depend on the second and third condition. Like this, loose
> unreachable objects will be packed regardless of any preexisting
> packfiles.
Makes sense. My only question would be whether there are any gotchas
inside pack-objects about using --pack-loose-unreachable without
--keep-unreachable (since the two were up until now always used
together).
It was added by e26a8c4721. And looking over that patch, I don't see
anything that would let the options be used independently. So this seems
like a good solution.
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 81d13630ea..8194344b04 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -1370,9 +1370,12 @@ int cmd_repack(int argc,
> "--unpack-unreachable");
> } else if (keep_unreachable) {
> strvec_push(&cmd.args, "--keep-unreachable");
> - strvec_push(&cmd.args, "--pack-loose-unreachable");
> }
> }
> +
> + if (keep_unreachable && delete_redundant &&
> + !(pack_everything & PACK_CRUFT))
> + strvec_push(&cmd.args, "--pack-loose-unreachable");
One funny thing here is that previously unpack_unreachable took
precedence over keep_unreachable in the if-else chain. I wondered if we
could end up invoking pack-objects with both --unpack-unreachable and
--pack-loose-unreachable, which is nonsense.
But I think the answer is no, because we forbid --unpack-unreachable/-A
and --keep-unreachable from both being passed up front.
> -test_expect_failure '--keep-unreachable packs unreachable loose object without existing packs' '
> +test_expect_success '--keep-unreachable packs unreachable loose object without existing packs' '
> test_when_finished "rm -rf repo" &&
> git init repo &&
> (
Your test from patch 1 looked reasonable to me. If you fold patch 1 into
the existing tests in t7701, you might want to adjust it to match the
techniques those tests use for checking the object (rather than the new
helpers you added).
-Peff
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2] builtin/repack: fix `--keep-unreachable` when there are no packs
2025-02-03 13:06 [PATCH 0/2] builtin/repack: fix `--keep-unreachable` when there are no packs Patrick Steinhardt
2025-02-03 13:06 ` [PATCH 1/2] t7700: add tests for `--keep-unreachable` Patrick Steinhardt
2025-02-03 13:06 ` [PATCH 2/2] builtin/repack: fix `--keep-unreachable` when there are no packs Patrick Steinhardt
@ 2025-02-04 7:00 ` Patrick Steinhardt
2025-02-04 15:22 ` Jeff King
2 siblings, 1 reply; 15+ messages in thread
From: Patrick Steinhardt @ 2025-02-04 7:00 UTC (permalink / raw)
To: git; +Cc: Jeff King, Junio C Hamano
The "--keep-unreachable" flag is supposed to append any unreachable
objects to the newly written pack. This flag is explicitly documented as
appending both packed and loose unreachable objects to the new packfile.
And while this works alright when repacking with preexisting packfiles,
it stops working when the repository does not have any packfiles at all.
The root cause are the conditions used to decide whether or not we want
to append "--pack-loose-unreachable" to git-pack-objects(1). There are
a couple of conditions here:
- `has_existing_non_kept_packs()` checks whether there are existing
packfiles. This condition makes sense to guard "--keep-pack=",
"--unpack-unreachable" and "--keep-unreachable", because all of
these flags only make sense in combination with existing packfiles.
But it does not make sense to disable `--pack-loose-unreachable`
when there aren't any preexisting packfiles, as loose objects can be
packed into the new packfile regardless of that.
- `delete_redundant` checks whether we want to delete any objects or
packs that are about to become redundant. The documentation of
`--keep-unreachable` explicitly says that `git repack -ad` needs to
be executed for the flag to have an effect.
It is not immediately obvious why such redundant objects need to be
deleted in order for "--pack-unreachable-objects" to be effective.
But as things are working as documented this is nothing we'll change
for now.
- `pack_everything & PACK_CRUFT` checks that we're not creating a
cruft pack. This condition makes sense in the context of
"--pack-loose-unreachable", as unreachable objects would end up in
the cruft pack anyway.
So while the second and third condition are sensible, it does not make
any sense to condition `--pack-loose-unreachable` on the existence of
packfiles.
Fix the bug by splitting out the "--pack-loose-unreachable" and only
making it depend on the second and third condition. Like this, loose
unreachable objects will be packed regardless of any preexisting
packfiles.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
Hi,
this small patch series fixes `git repack -ad --keep-unreachable` when
there aren't any preexisting packfiles.
Changes in v2:
- Merge tests into t7701.
- Link to v1: https://lore.kernel.org/r/20250203-b4-pks-repack-unreachable-objects-wo-packfiles-v1-0-7c4d69c5072c@pks.im
Thanks!
Patrick
---
builtin/repack.c | 5 ++++-
t/t7701-repack-unpack-unreachable.sh | 16 ++++++++++++++++
2 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/builtin/repack.c b/builtin/repack.c
index 81d13630ea..8194344b04 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -1370,9 +1370,12 @@ int cmd_repack(int argc,
"--unpack-unreachable");
} else if (keep_unreachable) {
strvec_push(&cmd.args, "--keep-unreachable");
- strvec_push(&cmd.args, "--pack-loose-unreachable");
}
}
+
+ if (keep_unreachable && delete_redundant &&
+ !(pack_everything & PACK_CRUFT))
+ strvec_push(&cmd.args, "--pack-loose-unreachable");
} else if (geometry.split_factor) {
strvec_push(&cmd.args, "--stdin-packs");
strvec_push(&cmd.args, "--unpacked");
diff --git a/t/t7701-repack-unpack-unreachable.sh b/t/t7701-repack-unpack-unreachable.sh
index 5715f4d69a..5559d4ccb4 100755
--- a/t/t7701-repack-unpack-unreachable.sh
+++ b/t/t7701-repack-unpack-unreachable.sh
@@ -195,4 +195,20 @@ test_expect_success 'repack -k packs unreachable loose objects' '
git cat-file -p $sha1
'
+test_expect_success 'repack -k packs unreachable loose objects without existing packfiles' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+
+ oid=$(echo would-be-deleted-loose | git hash-object -w --stdin) &&
+ objpath=.git/objects/$(echo $sha1 | sed "s,..,&/,") &&
+ test_path_is_file $objpath &&
+
+ git repack -ad --keep-unreachable &&
+ test_path_is_missing $objpath &&
+ git cat-file -p $oid
+ )
+'
+
test_done
---
base-commit: 3b0d05c4a79d0e441283680a864529b02dca5f08
change-id: 20250203-b4-pks-repack-unreachable-objects-wo-packfiles-33c26066e5c3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] t7700: add tests for `--keep-unreachable`
2025-02-04 2:35 ` Jeff King
@ 2025-02-04 7:00 ` Patrick Steinhardt
0 siblings, 0 replies; 15+ messages in thread
From: Patrick Steinhardt @ 2025-02-04 7:00 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git
On Mon, Feb 03, 2025 at 09:35:38PM -0500, Jeff King wrote:
> On Mon, Feb 03, 2025 at 03:53:46PM -0800, Junio C Hamano wrote:
>
> > Jeff King <peff@peff.net> writes:
> >
> > > On Mon, Feb 03, 2025 at 02:06:54PM +0100, Patrick Steinhardt wrote:
> > >
> > >> We don't have any tests for `git repack --keep-unreachable`. Add three
> > >> tests that exercise its behaviour with different packed states for the
> > >> unreachable object.
> > >
> > > There are a few in t7701. It's spelled "-k" there, so a grep for
> > > "--keep-unreachable" would not find them.
> >
> > Ahh, good eyes. Thanks.
>
> I got to cheat a little as the original author of the flag. ;)
The file is even *named* "unpack-unreachable". I didn't figure to grep
for "-k" though. Thanks for the pointer!
Patrick
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] builtin/repack: fix `--keep-unreachable` when there are no packs
2025-02-04 3:01 ` Jeff King
@ 2025-02-04 7:01 ` Patrick Steinhardt
2025-02-04 14:58 ` Jeff King
0 siblings, 1 reply; 15+ messages in thread
From: Patrick Steinhardt @ 2025-02-04 7:01 UTC (permalink / raw)
To: Jeff King; +Cc: git
On Mon, Feb 03, 2025 at 10:01:57PM -0500, Jeff King wrote:
> On Mon, Feb 03, 2025 at 02:06:55PM +0100, Patrick Steinhardt wrote:
> > Fix the bug by splitting out the "--pack-loose-unreachable" and only
> > making it depend on the second and third condition. Like this, loose
> > unreachable objects will be packed regardless of any preexisting
> > packfiles.
>
> Makes sense. My only question would be whether there are any gotchas
> inside pack-objects about using --pack-loose-unreachable without
> --keep-unreachable (since the two were up until now always used
> together).
>
> It was added by e26a8c4721. And looking over that patch, I don't see
> anything that would let the options be used independently. So this seems
> like a good solution.
You probably meant "I don't see anything that would *not* let the
options be used independently." But yeah, they don't seem to require one
another.
> > diff --git a/builtin/repack.c b/builtin/repack.c
> > index 81d13630ea..8194344b04 100644
> > --- a/builtin/repack.c
> > +++ b/builtin/repack.c
> > @@ -1370,9 +1370,12 @@ int cmd_repack(int argc,
> > "--unpack-unreachable");
> > } else if (keep_unreachable) {
> > strvec_push(&cmd.args, "--keep-unreachable");
> > - strvec_push(&cmd.args, "--pack-loose-unreachable");
> > }
> > }
> > +
> > + if (keep_unreachable && delete_redundant &&
> > + !(pack_everything & PACK_CRUFT))
> > + strvec_push(&cmd.args, "--pack-loose-unreachable");
>
> One funny thing here is that previously unpack_unreachable took
> precedence over keep_unreachable in the if-else chain. I wondered if we
> could end up invoking pack-objects with both --unpack-unreachable and
> --pack-loose-unreachable, which is nonsense.
>
> But I think the answer is no, because we forbid --unpack-unreachable/-A
> and --keep-unreachable from both being passed up front.
Yup.
> > -test_expect_failure '--keep-unreachable packs unreachable loose object without existing packs' '
> > +test_expect_success '--keep-unreachable packs unreachable loose object without existing packs' '
> > test_when_finished "rm -rf repo" &&
> > git init repo &&
> > (
>
> Your test from patch 1 looked reasonable to me. If you fold patch 1 into
> the existing tests in t7701, you might want to adjust it to match the
> techniques those tests use for checking the object (rather than the new
> helpers you added).
Will do.
Thanks!
Patrick
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] builtin/repack: fix `--keep-unreachable` when there are no packs
2025-02-03 13:06 ` [PATCH 2/2] builtin/repack: fix `--keep-unreachable` when there are no packs Patrick Steinhardt
2025-02-04 3:01 ` Jeff King
@ 2025-02-04 13:28 ` Junio C Hamano
1 sibling, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2025-02-04 13:28 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
Patrick Steinhardt <ps@pks.im> writes:
> The "--keep-unreachable" flag is supposed to append any unreachable
> objects to the newly written pack. This flag is explicitly documented as
> appending both packed and loose unreachable objects to the new packfile.
> And while this works alright when repacking with preexisting packfiles,
> it stops working when the repository does not have any packfiles at all.
Chuckle. That's a cute corner case the developers never considered,
it seems ;-).
> The root cause are the conditions used to decide whether or not we want
> to append "--pack-loose-unreachable" to git-pack-objects(1). There are
> a couple of conditions here:
>
> - `has_existing_non_kept_packs()` checks whether there are existing
> packfiles. This condition makes sense to guard "--keep-pack=",
> "--unpack-unreachable" and "--keep-unreachable", because all of
> these flags only make sense in combination with existing packfiles.
> But it does not make sense to disable `--pack-loose-unreachable`
> when there aren't any preexisting packfiles, as loose objects can be
> packed into the new packfile regardless of that.
>
> - `delete_redundant` checks whether we want to delete any objects or
> packs that are about to become redundant. The documentation of
> `--keep-unreachable` explicitly says that `git repack -ad` needs to
> be executed for the flag to have an effect.
>
> It is not immediately obvious why such redundant objects need to be
> deleted in order for "--pack-unreachable-objects" to be effective.
> But as things are working as documented this is nothing we'll change
> for now.
>
> - `pack_everything & PACK_CRUFT` checks that we're not creating a
> cruft pack. This condition makes sense in the context of
> "--pack-loose-unreachable", as unreachable objects would end up in
> the cruft pack anyway.
>
> So while the second and third condition are sensible, it does not make
> any sense to condition `--pack-loose-unreachable` on the existence of
> packfiles.
>
> Fix the bug by splitting out the "--pack-loose-unreachable" and only
> making it depend on the second and third condition. Like this, loose
> unreachable objects will be packed regardless of any preexisting
> packfiles.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> builtin/repack.c | 5 ++++-
> t/t7700-repack.sh | 2 +-
> 2 files changed, 5 insertions(+), 2 deletions(-)
Nicely analized and described. Thanks.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] builtin/repack: fix `--keep-unreachable` when there are no packs
2025-02-04 7:01 ` Patrick Steinhardt
@ 2025-02-04 14:58 ` Jeff King
0 siblings, 0 replies; 15+ messages in thread
From: Jeff King @ 2025-02-04 14:58 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Tue, Feb 04, 2025 at 08:01:01AM +0100, Patrick Steinhardt wrote:
> On Mon, Feb 03, 2025 at 10:01:57PM -0500, Jeff King wrote:
> > On Mon, Feb 03, 2025 at 02:06:55PM +0100, Patrick Steinhardt wrote:
> > > Fix the bug by splitting out the "--pack-loose-unreachable" and only
> > > making it depend on the second and third condition. Like this, loose
> > > unreachable objects will be packed regardless of any preexisting
> > > packfiles.
> >
> > Makes sense. My only question would be whether there are any gotchas
> > inside pack-objects about using --pack-loose-unreachable without
> > --keep-unreachable (since the two were up until now always used
> > together).
> >
> > It was added by e26a8c4721. And looking over that patch, I don't see
> > anything that would let the options be used independently. So this seems
> > like a good solution.
>
> You probably meant "I don't see anything that would *not* let the
> options be used independently." But yeah, they don't seem to require one
> another.
Whoops, yes. Last minute rephrasing strikes again. :)
-Peff
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2] builtin/repack: fix `--keep-unreachable` when there are no packs
2025-02-04 7:00 ` [PATCH v2] " Patrick Steinhardt
@ 2025-02-04 15:22 ` Jeff King
2025-02-05 5:36 ` Patrick Steinhardt
0 siblings, 1 reply; 15+ messages in thread
From: Jeff King @ 2025-02-04 15:22 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Junio C Hamano
On Tue, Feb 04, 2025 at 08:00:41AM +0100, Patrick Steinhardt wrote:
> this small patch series fixes `git repack -ad --keep-unreachable` when
> there aren't any preexisting packfiles.
>
> Changes in v2:
> - Merge tests into t7701.
> - Link to v1: https://lore.kernel.org/r/20250203-b4-pks-repack-unreachable-objects-wo-packfiles-v1-0-7c4d69c5072c@pks.im
This looks good to me.
One interesting thing I did notice:
> +test_expect_success 'repack -k packs unreachable loose objects without existing packfiles' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + (
> + cd repo &&
> +
> + oid=$(echo would-be-deleted-loose | git hash-object -w --stdin) &&
> + objpath=.git/objects/$(echo $sha1 | sed "s,..,&/,") &&
> + test_path_is_file $objpath &&
> +
> + git repack -ad --keep-unreachable &&
> + test_path_is_missing $objpath &&
> + git cat-file -p $oid
> + )
> +'
In the test in v1, we had reachable commits to pack. And here we don't.
So before your patch, the behavior in the v1 test was that we'd create a
new pack, but it wouldn't pick up the loose object. But the behavior of
this test is that we say "Nothing new to pack".
I originally thought that output meant that we were not running
pack-objects at all. But looking at builtin/repack.c, we do run it, and
it simply chooses not to make a pack (which makes sense; how would
repack even realize if there was stuff to pack, since pack-objects is
what does the traversal).
So the two outcomes are both the result of the same bug. In both cases
we do not correctly pack the loose objects, so whether we make a pack is
just a question of whether there was other reachable stuff to pack. And
since your patch is fixing the bug at its root, both outcomes are fixed.
And when I suggested in my response to v1 that "Nothing new to pack" in
an empty repo was a separate bug, I was just wrong. ;) There is nothing
else to fix after your patch.
Thanks for finding and fixing.
-Peff
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2] builtin/repack: fix `--keep-unreachable` when there are no packs
2025-02-04 15:22 ` Jeff King
@ 2025-02-05 5:36 ` Patrick Steinhardt
0 siblings, 0 replies; 15+ messages in thread
From: Patrick Steinhardt @ 2025-02-05 5:36 UTC (permalink / raw)
To: Jeff King; +Cc: git, Junio C Hamano
On Tue, Feb 04, 2025 at 10:22:36AM -0500, Jeff King wrote:
> On Tue, Feb 04, 2025 at 08:00:41AM +0100, Patrick Steinhardt wrote:
>
> > this small patch series fixes `git repack -ad --keep-unreachable` when
> > there aren't any preexisting packfiles.
> >
> > Changes in v2:
> > - Merge tests into t7701.
> > - Link to v1: https://lore.kernel.org/r/20250203-b4-pks-repack-unreachable-objects-wo-packfiles-v1-0-7c4d69c5072c@pks.im
>
> This looks good to me.
>
> One interesting thing I did notice:
>
> > +test_expect_success 'repack -k packs unreachable loose objects without existing packfiles' '
> > + test_when_finished "rm -rf repo" &&
> > + git init repo &&
> > + (
> > + cd repo &&
> > +
> > + oid=$(echo would-be-deleted-loose | git hash-object -w --stdin) &&
> > + objpath=.git/objects/$(echo $sha1 | sed "s,..,&/,") &&
> > + test_path_is_file $objpath &&
> > +
> > + git repack -ad --keep-unreachable &&
> > + test_path_is_missing $objpath &&
> > + git cat-file -p $oid
> > + )
> > +'
>
> In the test in v1, we had reachable commits to pack. And here we don't.
> So before your patch, the behavior in the v1 test was that we'd create a
> new pack, but it wouldn't pick up the loose object. But the behavior of
> this test is that we say "Nothing new to pack".
>
> I originally thought that output meant that we were not running
> pack-objects at all. But looking at builtin/repack.c, we do run it, and
> it simply chooses not to make a pack (which makes sense; how would
> repack even realize if there was stuff to pack, since pack-objects is
> what does the traversal).
>
> So the two outcomes are both the result of the same bug. In both cases
> we do not correctly pack the loose objects, so whether we make a pack is
> just a question of whether there was other reachable stuff to pack. And
> since your patch is fixing the bug at its root, both outcomes are fixed.
>
> And when I suggested in my response to v1 that "Nothing new to pack" in
> an empty repo was a separate bug, I was just wrong. ;) There is nothing
> else to fix after your patch.
>
> Thanks for finding and fixing.
Thanks for your thorough review!
Patrick
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-02-05 5:36 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-03 13:06 [PATCH 0/2] builtin/repack: fix `--keep-unreachable` when there are no packs Patrick Steinhardt
2025-02-03 13:06 ` [PATCH 1/2] t7700: add tests for `--keep-unreachable` Patrick Steinhardt
2025-02-03 17:13 ` Junio C Hamano
2025-02-03 18:32 ` Jeff King
2025-02-03 23:53 ` Junio C Hamano
2025-02-04 2:35 ` Jeff King
2025-02-04 7:00 ` Patrick Steinhardt
2025-02-03 13:06 ` [PATCH 2/2] builtin/repack: fix `--keep-unreachable` when there are no packs Patrick Steinhardt
2025-02-04 3:01 ` Jeff King
2025-02-04 7:01 ` Patrick Steinhardt
2025-02-04 14:58 ` Jeff King
2025-02-04 13:28 ` Junio C Hamano
2025-02-04 7:00 ` [PATCH v2] " Patrick Steinhardt
2025-02-04 15:22 ` Jeff King
2025-02-05 5:36 ` Patrick Steinhardt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).