* [PATCH 00/10] add more ref consistency checks
@ 2025-01-05 13:46 shejialuo
2025-01-05 13:49 ` [PATCH 01/10] files-backend: add object check for regular ref shejialuo
` (10 more replies)
0 siblings, 11 replies; 168+ messages in thread
From: shejialuo @ 2025-01-05 13:46 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Hi all:
This patch series mainly does the following three things:
1. Add some extra checks which I have ignored in the previous patches
for files-backend in
2. Add ref checks for packed-backend.
1. Check whether the type of "packed-refs" is correct.
2. Check whether the syntax of "packed-refs" is correct by using the
rules from "packed-backend.c::create_snapshot" and
"packed-backend.c::next_record".
3. Check whether the pointed object exists and whether the
"packed-refs" file is sorted.
3. Call "git refs verify" for "git-fsck(1)".
Although I am not mentored by Patrick and Karthik for this series, I'd
like to add "Mentored-by" trailers for them because this work continues
my GSoC project.
Thanks,
Jialuo
shejialuo (10):
files-backend: add object check for regular ref
builtin/refs.h: get worktrees without reading head info
packed-backend: check whether the "packed-refs" is regular
packed-backend: add "packed-refs" header consistency check
packed-backend: check whether the refname contains NULL binaries
packed-backend: add "packed-refs" entry consistency check
packed-backend: create "fsck_packed_ref_entry" to store parsing info
packed-backend: add check for object consistency
packed-backend: check whether the "packed-refs" is sorted
builtin/fsck: add `git refs verify` child process
Documentation/fsck-msgids.txt | 22 ++
builtin/fsck.c | 28 +++
builtin/refs.c | 2 +-
fsck.h | 8 +
refs/files-backend.c | 54 ++++-
refs/packed-backend.c | 413 +++++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 209 +++++++++++++++++
worktree.c | 5 +
worktree.h | 6 +
9 files changed, 723 insertions(+), 24 deletions(-)
--
2.47.1
* [PATCH 01/10] files-backend: add object check for regular ref
2025-01-05 13:46 [PATCH 00/10] add more ref consistency checks shejialuo
@ 2025-01-05 13:49 ` shejialuo
2025-01-07 14:17 ` Karthik Nayak
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-05 13:49 ` [PATCH 02/10] builtin/refs.h: get worktrees without reading head info shejialuo
` (9 subsequent siblings)
10 siblings, 2 replies; 168+ messages in thread
From: shejialuo @ 2025-01-05 13:49 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Although we use "parse_loose_ref_content" to check whether the object id
is correct, we never parse it into a "struct object", so we never check
whether the object actually exists in the repo and whether its type is
correct.
Use "parse_object" to parse the oid for the regular ref content. If the
object does not exist, report the error to the user by reusing the fsck
message "BAD_REF_CONTENT".
Then, we need to check the type of the object. Just like "git-fsck(1)",
we only report the "not a commit" error when the ref is a branch.
Finally, update the test to exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
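The rule described above can be sketched in isolation. This is a
simplified model, not the code in this patch: the enums and function
names are hypothetical, the object lookup is replaced by a precomputed
"kind", and "is a branch" is reduced to a "refs/heads/" prefix test.

```c
#include <string.h>

/* Hypothetical stand-in for the result of looking up the ref's oid. */
enum obj_kind { KIND_MISSING, KIND_COMMIT, KIND_TREE, KIND_BLOB, KIND_TAG };

/* Hypothetical verdicts; the patch reports both failures as badRefContent. */
enum ref_verdict { REF_OK, REF_MISSING_OBJECT, REF_NOT_A_COMMIT };

/* Simplification: only the "refs/heads/" prefix counts as a branch here. */
static int is_branch_name(const char *refname)
{
	return !strncmp(refname, "refs/heads/", strlen("refs/heads/"));
}

static enum ref_verdict check_ref_target(const char *refname, enum obj_kind kind)
{
	/* A ref pointing to a non-existing object is always an error. */
	if (kind == KIND_MISSING)
		return REF_MISSING_OBJECT;
	/* Only branches are required to point to commits. */
	if (kind != KIND_COMMIT && is_branch_name(refname))
		return REF_NOT_A_COMMIT;
	return REF_OK;
}
```

Under this model a tag that points to a tree is accepted, while a branch
that points to the same tree is reported.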
refs/files-backend.c | 50 ++++++++++++++++++++++++++++++++--------
t/t0602-reffiles-fsck.sh | 30 ++++++++++++++++++++++++
2 files changed, 70 insertions(+), 10 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 64f51f0da9..0a4912c009 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -20,6 +20,7 @@
#include "../lockfile.h"
#include "../object.h"
#include "../object-file.h"
+#include "../packfile.h"
#include "../path.h"
#include "../dir.h"
#include "../chdir-notify.h"
@@ -3589,6 +3590,34 @@ static int files_fsck_symref_target(struct fsck_options *o,
return ret;
}
+static int files_fsck_refs_oid(struct fsck_options *o,
+ struct ref_store *ref_store,
+ struct fsck_ref_report report,
+ const char *target_name,
+ struct object_id *oid)
+{
+ struct object *obj;
+ int ret = 0;
+
+ if (is_promisor_object(ref_store->repo, oid))
+ return 0;
+
+ obj = parse_object(ref_store->repo, oid);
+ if (!obj) {
+ ret |= fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_CONTENT,
+ "points to non-existing object %s",
+ oid_to_hex(oid));
+ } else if (obj->type != OBJ_COMMIT && is_branch(target_name)) {
+ ret |= fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_CONTENT,
+ "points to non-commit object %s",
+ oid_to_hex(oid));
+ }
+
+ return ret;
+}
+
static int files_fsck_refs_content(struct ref_store *ref_store,
struct fsck_options *o,
const char *target_name,
@@ -3654,18 +3683,19 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
}
if (!(type & REF_ISSYMREF)) {
+ ret |= files_fsck_refs_oid(o, ref_store, report, target_name, &oid);
+
if (!*trailing) {
- ret = fsck_report_ref(o, &report,
- FSCK_MSG_REF_MISSING_NEWLINE,
- "misses LF at the end");
- goto cleanup;
- }
- if (*trailing != '\n' || *(trailing + 1)) {
- ret = fsck_report_ref(o, &report,
- FSCK_MSG_TRAILING_REF_CONTENT,
- "has trailing garbage: '%s'", trailing);
- goto cleanup;
+ ret |= fsck_report_ref(o, &report,
+ FSCK_MSG_REF_MISSING_NEWLINE,
+ "misses LF at the end");
+ } else if (*trailing != '\n' || *(trailing + 1)) {
+ ret |= fsck_report_ref(o, &report,
+ FSCK_MSG_TRAILING_REF_CONTENT,
+ "has trailing garbage: '%s'", trailing);
}
+
+ goto cleanup;
} else {
ret = files_fsck_symref_target(o, &report, &referent, 0);
goto cleanup;
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index d4a08b823b..75f234a94a 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -161,8 +161,10 @@ test_expect_success 'regular ref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
cd repo &&
test_commit default &&
+ git branch branch-1 &&
mkdir -p "$branch_dir_prefix/a/b" &&
git refs verify 2>err &&
@@ -198,6 +200,28 @@ test_expect_success 'regular ref content should be checked (individual)' '
rm $branch_dir_prefix/branch-no-newline &&
test_cmp expect err &&
+ for non_existing_oid in "$(test_oid 001)" "$(test_oid 002)"
+ do
+ printf "%s\n" $non_existing_oid >$branch_dir_prefix/invalid-commit &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/invalid-commit: badRefContent: points to non-existing object $non_existing_oid
+ EOF
+ rm $branch_dir_prefix/invalid-commit &&
+ test_cmp expect err || return 1
+ done &&
+
+ for tree_oid in "$(git rev-parse main^{tree})" "$(git rev-parse branch-1^{tree})"
+ do
+ printf "%s\n" $tree_oid >$branch_dir_prefix/branch-tree &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-tree: badRefContent: points to non-commit object $tree_oid
+ EOF
+ rm $branch_dir_prefix/branch-tree &&
+ test_cmp expect err || return 1
+ done &&
+
for trailing_content in " garbage" " more garbage"
do
printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
@@ -244,15 +268,21 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
bad_content_1=$(git rev-parse main)x &&
bad_content_2=xfsazqfxcadas &&
bad_content_3=Xfsazqfxcadas &&
+ non_existing_oid=$(test_oid 001) &&
+ tree_oid=$(git rev-parse main^{tree}) &&
printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
+ printf "%s\n" $non_existing_oid >$branch_dir_prefix/branch-non-existing-oid &&
+ printf "%s\n" $tree_oid >$branch_dir_prefix/branch-tree &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
+ error: refs/heads/branch-non-existing-oid: badRefContent: points to non-existing object $non_existing_oid
+ error: refs/heads/branch-tree: badRefContent: points to non-commit object $tree_oid
error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
--
2.47.1
* [PATCH 02/10] builtin/refs.h: get worktrees without reading head info
2025-01-05 13:46 [PATCH 00/10] add more ref consistency checks shejialuo
2025-01-05 13:49 ` [PATCH 01/10] files-backend: add object check for regular ref shejialuo
@ 2025-01-05 13:49 ` shejialuo
2025-01-07 14:57 ` Karthik Nayak
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-05 13:49 ` [PATCH 03/10] packed-backend: check whether the "packed-refs" is regular shejialuo
` (8 subsequent siblings)
10 siblings, 2 replies; 168+ messages in thread
From: shejialuo @ 2025-01-05 13:49 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c", functions such as "create_snapshot" and
"next_record" check the correctness of the content of the "packed-refs"
file. When anything is wrong, the program dies.
This may seem irrelevant to the feature above, because we are going to
read and parse the raw "packed-refs" file ourselves, without creating
the snapshot or using the ref iterator to check the consistency.
However, when using "get_worktrees" in "builtin/refs", we parse the
head information. If the referent of "HEAD" is inside "packed-refs", we
call "create_snapshot" and "next_record" to parse "packed-refs" and get
the head information. And if something is wrong, the program dies.
Although this behavior does no harm, it short-circuits the program.
When users execute "git refs verify" or "git fsck", we don't want to
simply die but rather show as many warnings and errors as possible to
inform the users. So, we should avoid reading the head info.
Fortunately, in 465a22b338 (worktree: skip reading HEAD when repairing
worktrees, 2023-12-29), we have introduced a function
"get_worktrees_internal" which allows us to get worktrees without
reading head info.
Create a new exposed function "get_worktrees_without_reading_head", then
replace the "get_worktrees" call in "builtin/refs" with the newly
created function.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
builtin/refs.c | 2 +-
worktree.c | 5 +++++
worktree.h | 6 ++++++
3 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/builtin/refs.c b/builtin/refs.c
index a29f195834..55ff5dae11 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -88,7 +88,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix,
git_config(git_fsck_config, &fsck_refs_options);
prepare_repo_settings(the_repository);
- worktrees = get_worktrees();
+ worktrees = get_worktrees_without_reading_head();
for (size_t i = 0; worktrees[i]; i++)
ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
&fsck_refs_options, worktrees[i]);
diff --git a/worktree.c b/worktree.c
index af68b24f9d..74cb463e51 100644
--- a/worktree.c
+++ b/worktree.c
@@ -174,6 +174,11 @@ struct worktree **get_worktrees(void)
return get_worktrees_internal(0);
}
+struct worktree **get_worktrees_without_reading_head(void)
+{
+ return get_worktrees_internal(1);
+}
+
const char *get_worktree_git_dir(const struct worktree *wt)
{
if (!wt)
diff --git a/worktree.h b/worktree.h
index 38145df80f..1ba4a161a0 100644
--- a/worktree.h
+++ b/worktree.h
@@ -30,6 +30,12 @@ struct worktree {
*/
struct worktree **get_worktrees(void);
+/*
+ * Like `get_worktrees`, but does not read HEAD. This is useful when checking
+ * the consistency, as reading HEAD may not be necessary.
+ */
+struct worktree **get_worktrees_without_reading_head(void);
+
/*
* Returns 1 if linked worktrees exist, 0 otherwise.
*/
--
2.47.1
* [PATCH 03/10] packed-backend: check whether the "packed-refs" is regular
2025-01-05 13:46 [PATCH 00/10] add more ref consistency checks shejialuo
2025-01-05 13:49 ` [PATCH 01/10] files-backend: add object check for regular ref shejialuo
2025-01-05 13:49 ` [PATCH 02/10] builtin/refs.h: get worktrees without reading head info shejialuo
@ 2025-01-05 13:49 ` shejialuo
2025-01-07 16:33 ` Karthik Nayak
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-05 13:49 ` [PATCH 04/10] packed-backend: add "packed-refs" header consistency check shejialuo
` (7 subsequent siblings)
10 siblings, 2 replies; 168+ messages in thread
From: shejialuo @ 2025-01-05 13:49 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Although "git-fsck(1)" and "packed-backend.c" check some of the
consistency and correctness of the "packed-refs" file, they never check
its filetype. The user should always use the "git pack-refs" command to
create the raw regular "packed-refs" file, so we need to explicitly
check this in "git refs verify".
Use "lstat" to check the file mode. If we cannot check the file status,
this is OK because there is a chance that there is no "packed-refs" in
the repo.
Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
the user if "packed-refs" is not a regular file.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
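The filetype rule above can be sketched as a standalone POSIX helper.
This is an illustration under assumptions, not the code in this patch:
the function name is hypothetical and the fsck reporting is reduced to
a return value.

```c
/* Request POSIX declarations (lstat) even under strict -std= modes. */
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 when `path` exists but is not a
 * regular file, and 0 otherwise. A path that cannot be stat'ed is
 * treated as "nothing to check", mirroring the reasoning above.
 */
static int packed_refs_filetype_is_bad(const char *path)
{
	struct stat st;

	/* lstat() does not follow symlinks, so a symlink is seen as such. */
	if (lstat(path, &st) < 0)
		return 0;
	return !S_ISREG(st.st_mode);
}
```

Because lstat() inspects the link itself, a symlink named "packed-refs"
is flagged even when its target is a perfectly valid regular file.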
refs/packed-backend.c | 33 +++++++++++++++++++++++++++++----
t/t0602-reffiles-fsck.sh | 20 ++++++++++++++++++++
2 files changed, 49 insertions(+), 4 deletions(-)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 3406f1e71d..d9eb2f8b71 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -4,6 +4,7 @@
#include "../config.h"
#include "../dir.h"
#include "../gettext.h"
+#include "../fsck.h"
#include "../hash.h"
#include "../hex.h"
#include "../refs.h"
@@ -1747,15 +1748,39 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
-static int packed_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED,
+static int packed_fsck(struct ref_store *ref_store,
+ struct fsck_options *o,
struct worktree *wt)
{
+ struct packed_ref_store *refs = packed_downcast(ref_store,
+ REF_STORE_READ, "fsck");
+ struct stat st;
+ int ret = 0;
if (!is_main_worktree(wt))
- return 0;
+ goto cleanup;
- return 0;
+ /*
+ * If the packed-refs file doesn't exist, there's nothing to
+ * check.
+ */
+ if (lstat(refs->path, &st) < 0)
+ goto cleanup;
+
+ if (o->verbose)
+ fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
+
+ if (!S_ISREG(st.st_mode)) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+
+ ret = fsck_report_ref(o, &report, FSCK_MSG_BAD_REF_FILETYPE,
+ "not a regular file");
+ goto cleanup;
+ }
+
+cleanup:
+ return ret;
}
struct ref_storage_be refs_be_packed = {
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 75f234a94a..307f94a3ca 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -626,4 +626,24 @@ test_expect_success 'ref content checks should work with worktrees' '
test_cmp expect err
'
+test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git pack-refs --all &&
+
+ mv .git/packed-refs .git/packed-refs-back &&
+ ln -sf packed-refs-back .git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs: badRefFiletype: not a regular file
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
+'
+
test_done
--
2.47.1
* [PATCH 04/10] packed-backend: add "packed-refs" header consistency check
2025-01-05 13:46 [PATCH 00/10] add more ref consistency checks shejialuo
` (2 preceding siblings ...)
2025-01-05 13:49 ` [PATCH 03/10] packed-backend: check whether the "packed-refs" is regular shejialuo
@ 2025-01-05 13:49 ` shejialuo
2025-01-08 0:54 ` shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-05 13:49 ` [PATCH 05/10] packed-backend: check whether the refname contains NULL binaries shejialuo
` (6 subsequent siblings)
10 siblings, 2 replies; 168+ messages in thread
From: shejialuo @ 2025-01-05 13:49 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c::create_snapshot", if there is a header (the line
which starts with '#'), we will check whether the line starts with "#
pack-refs with:". As we are going to implement the header consistency
check, we should port this check into "packed_fsck".
However, the above check is not enough, because "git pack-refs" always
writes "PACKED_REFS_HEADER", which is a constant string, to the
"packed-refs" file. So, we should check the following things for the
header.
1. If the header does not exist, we may report an error to the user
because it should exist, but we do allow no header in "packed-refs"
file. So, create a new fsck message "packedRefMissingHeader(INFO)" to
warn the user and also keep compatibility.
2. If the header content does not start with "# pack-refs with:", we
should report an error just like what "create_snapshot" does. So,
create a new fsck message "badPackedRefHeader(ERROR)" for this.
3. If the header content is not the same as the constant string
"PACKED_REFS_HEADER", ideally, we should report an error to the user.
However, we allow other contents as long as the header content starts
with "# pack-refs with:". To keep compatibility, create a new fsck
message "unknownPackedRefHeader(INFO)" to warn about this. We may
tighten this rule in the future.
In order to achieve above checks, read the "packed-refs" file via
"strbuf_read_file". Like what "create_snapshot" and other functions do,
we could split the line by finding the next newline in the buf. If we
cannot find a newline, this is an error.
So, create a function "packed_fsck_ref_next_line" to find the next
newline and if there is no such newline, use
"packedRefEntryNotTerminated(ERROR)" to report an error to the user.
Then, parse the first line to apply the above three checks. Update the
test to exercise the code.
However, when adding the new test for a bad header, the program still
dies in "create_snapshot". This is because we check the files-backend
first and use "parse_object" to check whether the object exists and
whether its type is correct. This function eventually calls
"create_snapshot" and "next_record", and if there is something wrong
with the packed-backend, the program just dies.
It's bad to just die because we want to report as many problems as
possible. We should avoid checking the object and its type when the
packed-backend is broken. So, we should check the consistency of the
packed-backend first and the files-backend afterwards.
Add a new flag "safe_object_check" to "fsck_options". When anything goes
wrong during parsing, set this flag to 0 to skip the object checks in
the later checks.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
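The three header cases above can be sketched as a small classifier.
This is a simplified model, not the code in this patch: the enum and
function names are hypothetical, and the value of EXPECTED_HEADER is an
assumption about what "PACKED_REFS_HEADER" contains (note the trailing
space before the newline).

```c
#include <string.h>

/* Assumed copy of the header that `git pack-refs` writes. */
static const char EXPECTED_HEADER[] =
	"# pack-refs with: peeled fully-peeled sorted \n";
static const char HEADER_PREFIX[] = "# pack-refs with:";

/* Hypothetical result codes, one per case in the commit message. */
enum header_status {
	HEADER_OK,
	HEADER_MISSING,	/* first line is not a '#' line: INFO */
	HEADER_BAD,	/* '#' line without the expected prefix: ERROR */
	HEADER_UNKNOWN	/* right prefix, unexpected remainder: INFO */
};

/* `buf` is the NUL-terminated content of the file, starting at line 1. */
static enum header_status classify_header(const char *buf)
{
	if (buf[0] != '#')
		return HEADER_MISSING;
	if (strncmp(buf, HEADER_PREFIX, strlen(HEADER_PREFIX)))
		return HEADER_BAD;
	if (strncmp(buf, EXPECTED_HEADER, strlen(EXPECTED_HEADER)))
		return HEADER_UNKNOWN;
	return HEADER_OK;
}
```

The checks are ordered from the weakest requirement to the strongest, so
each input maps to exactly one of the fsck messages described above.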
Documentation/fsck-msgids.txt | 16 ++++++
fsck.h | 6 ++
refs/files-backend.c | 6 +-
refs/packed-backend.c | 105 ++++++++++++++++++++++++++++++++++
t/t0602-reffiles-fsck.sh | 44 ++++++++++++++
5 files changed, 174 insertions(+), 3 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index b14bc44ca4..34375a3143 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -16,6 +16,10 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefHeader`::
+ (ERROR) The "packed-refs" file contains an invalid
+ header.
+
`badParentSha1`::
(ERROR) A commit object has a bad parent sha1.
@@ -176,6 +180,13 @@
`nullSha1`::
(WARN) Tree contains entries pointing to a null sha1.
+`packedRefEntryNotTerminated`::
+ (ERROR) The "packed-refs" file contains an entry that is
+ not terminated by a newline.
+
+`packedRefMissingHeader`::
+ (INFO) The "packed-refs" file does not contain the header.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
@@ -208,6 +219,11 @@
`treeNotSorted`::
(ERROR) A tree is not properly sorted.
+`unknownPackedRefHeader`::
+ (INFO) The "packed-refs" header starts with "# pack-refs with:"
+ but the remaining content is not the same as what `git pack-refs`
+ would write.
+
`unknownType`::
(ERROR) Found an unknown object type.
diff --git a/fsck.h b/fsck.h
index a44c231a5f..026ad1d537 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
@@ -53,6 +54,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE, ERROR) \
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
+ FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
@@ -90,6 +92,8 @@ enum fsck_msg_type {
FUNC(REF_MISSING_NEWLINE, INFO) \
FUNC(SYMREF_TARGET_IS_NOT_A_REF, INFO) \
FUNC(TRAILING_REF_CONTENT, INFO) \
+ FUNC(UNKNOWN_PACKED_REF_HEADER, INFO) \
+ FUNC(PACKED_REF_MISSING_HEADER, INFO) \
/* ignored (elevated when requested) */ \
FUNC(EXTRA_HEADER_ENTRY, IGNORE)
@@ -163,6 +167,7 @@ struct fsck_options {
fsck_error error_func;
unsigned strict;
unsigned verbose;
+ int safe_object_check;
enum fsck_msg_type *msg_type;
struct oidset skip_oids;
struct oidset gitmodules_found;
@@ -198,6 +203,7 @@ struct fsck_options {
}
#define FSCK_REFS_OPTIONS_DEFAULT { \
.error_func = fsck_refs_error_function, \
+ .safe_object_check = 1, \
}
/* descend in all linked child objects
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0a4912c009..66eae36184 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3599,7 +3599,7 @@ static int files_fsck_refs_oid(struct fsck_options *o,
struct object *obj;
int ret = 0;
- if (is_promisor_object(ref_store->repo, oid))
+ if (!o->safe_object_check || is_promisor_object(ref_store->repo, oid))
return 0;
obj = parse_object(ref_store->repo, oid);
@@ -3819,8 +3819,8 @@ static int files_fsck(struct ref_store *ref_store,
struct files_ref_store *refs =
files_downcast(ref_store, REF_STORE_READ, "fsck");
- return files_fsck_refs(ref_store, o, wt) |
- refs->packed_ref_store->be->fsck(refs->packed_ref_store, o, wt);
+ return refs->packed_ref_store->be->fsck(refs->packed_ref_store, o, wt) |
+ files_fsck_refs(ref_store, o, wt);
}
struct ref_storage_be refs_be_files = {
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index d9eb2f8b71..3b11abe5f8 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1748,12 +1748,100 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
+static int packed_fsck_ref_next_line(struct fsck_options *o,
+ int line_number, const char *start,
+ const char *eof, const char **eol)
+{
+ int ret = 0;
+
+ *eol = memchr(start, '\n', eof - start);
+ if (!*eol) {
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+
+ strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
+ "'%.*s' is not terminated with a newline",
+ (int)(eof - start), start);
+
+ /*
+ * There is no newline but we still want to parse it to the end of
+ * the buffer.
+ */
+ *eol = eof;
+ strbuf_release(&packed_entry);
+ }
+
+ return ret;
+}
+
+static int packed_fsck_ref_header(struct fsck_options *o, const char *start, const char *eol)
+{
+ const char *err_fmt = NULL;
+ int fsck_msg_id = -1;
+
+ if (!starts_with(start, "# pack-refs with:")) {
+ err_fmt = "'%.*s' does not start with '# pack-refs with:'";
+ fsck_msg_id = FSCK_MSG_BAD_PACKED_REF_HEADER;
+ } else if (strncmp(start, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER))) {
+ err_fmt = "'%.*s' is not the official packed-refs header";
+ fsck_msg_id = FSCK_MSG_UNKNOWN_PACKED_REF_HEADER;
+ }
+
+ if (err_fmt && fsck_msg_id >= 0) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs.header";
+
+ return fsck_report_ref(o, &report, fsck_msg_id, err_fmt,
+ (int)(eol - start), start);
+
+ }
+
+ return 0;
+}
+
+static int packed_fsck_ref_content(struct fsck_options *o,
+ const char *start, const char *eof)
+{
+ int line_number = 1;
+ const char *eol;
+ int ret = 0;
+
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ if (*start == '#') {
+ ret |= packed_fsck_ref_header(o, start, eol);
+
+ start = eol + 1;
+ line_number++;
+ } else {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+
+ ret |= fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_MISSING_HEADER,
+ "missing header line");
+ }
+
+ /*
+ * If there is anything wrong during the parsing of the "packed-refs"
+ * file, we should not check the object of the refs.
+ */
+ if (ret)
+ o->safe_object_check = 0;
+
+
+ return ret;
+}
+
static int packed_fsck(struct ref_store *ref_store,
struct fsck_options *o,
struct worktree *wt)
{
struct packed_ref_store *refs = packed_downcast(ref_store,
REF_STORE_READ, "fsck");
+ struct strbuf packed_ref_content = STRBUF_INIT;
struct stat st;
int ret = 0;
@@ -1779,7 +1867,24 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
+ if (strbuf_read_file(&packed_ref_content, refs->path, 0) < 0) {
+ /*
+ * Although we have checked that the file exists, there is a possibility
+ * that it has been removed between the lstat() and the read attempt by
+ * another process. In that case, we should not report an error.
+ */
+ if (errno == ENOENT)
+ goto cleanup;
+
+ ret = error_errno("could not read %s", refs->path);
+ goto cleanup;
+ }
+
+ ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ packed_ref_content.buf + packed_ref_content.len);
+
cleanup:
+ strbuf_release(&packed_ref_content);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 307f94a3ca..6c729e749a 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -646,4 +646,48 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
test_cmp expect err
'
+test_expect_success 'packed-refs header should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ cd repo &&
+ test_commit default &&
+
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
+ printf "$(git rev-parse main) refs/heads/main\n" >.git/packed-refs &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: packed-refs: packedRefMissingHeader: missing header line
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err &&
+
+ for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
+ "# pack-refs with traits: peeled fully-peeled sorted " \
+ "# pack-refs with a: peeled fully-peeled"
+ do
+ printf "%s\n" "$bad_header" >.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs.header: badPackedRefHeader: '\''$bad_header'\'' does not start with '\''# pack-refs with:'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err || return 1
+ done &&
+
+ for unknown_header in "# pack-refs with: peeled fully-peeled sorted garbage" \
+ "# pack-refs with: peeled" \
+ "# pack-refs with: peeled peeled-fully sort"
+ do
+ printf "%s\n" "$unknown_header" >.git/packed-refs &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: packed-refs.header: unknownPackedRefHeader: '\''$unknown_header'\'' is not the official packed-refs header
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err || return 1
+ done
+'
+
test_done
--
2.47.1
* [PATCH 05/10] packed-backend: check whether the refname contains NULL binaries
2025-01-05 13:46 [PATCH 00/10] add more ref consistency checks shejialuo
` (3 preceding siblings ...)
2025-01-05 13:49 ` [PATCH 04/10] packed-backend: add "packed-refs" header consistency check shejialuo
@ 2025-01-05 13:49 ` shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-05 13:49 ` [PATCH 06/10] packed-backend: add "packed-refs" entry consistency check shejialuo
` (5 subsequent siblings)
10 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-05 13:49 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
We have already implemented the header consistency check for the raw
"packed-refs" file. Before we implement the consistency check for each
ref entry, let's analyze [1], which reports that "git fsck" cannot
detect some binary zeros.
"packed-backend.c::next_record" uses "check_refname_format" to check
the consistency of the refname. If it is not OK, the program dies.
So the code path already exists, yet something must be missing.
We use the following code to get the refname:
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
In the above code, `p` is the start pointer of the refname and `eol` is
the next newline pointer. We calculate the length of the refname by
subtracting the two pointers. Then we add the memory range between `p`
and `eol` to get the refname.
However, if there are NUL bytes in the memory range between `p` and
`eol`, we treat the refname as valid as long as the memory range between
`p` and the first NUL byte is a valid refname.
In order to catch the above corruption, create a new function
"refname_contains_null" which checks whether "refname.len" equals the
string length of the raw pointer "refname.buf". If they are not equal,
there must be NUL bytes in the refname.
Use this function in "next_record" to die if "refname_contains_null"
returns true.
[1] https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/
Reported-by: R. Diez <rdiez-temp3@rd10.de>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
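The length mismatch described above can be demonstrated in isolation.
This is a minimal sketch with hypothetical helper names, not the code in
this patch. The first helper scans the span with memchr(), the second
compares strlen() against the recorded length, which is only valid when
the buffer is NUL-terminated at `len`.

```c
#include <stddef.h>
#include <string.h>

/* Scan exactly `len` bytes for a NUL byte. */
static int span_contains_nul(const char *start, size_t len)
{
	return memchr(start, '\0', len) != NULL;
}

/*
 * strlen() stops at the first NUL byte, so it disagrees with the
 * recorded length whenever a NUL byte is embedded in the span.
 * Assumes buf[len] is a terminating NUL.
 */
static int strlen_disagrees(const char *buf, size_t len)
{
	return strlen(buf) != len;
}
```

For a 14-byte span such as "refs/heads/a", a NUL byte, and "b", both
helpers report the corruption, whereas a check that only looks at the
C string sees the valid name "refs/heads/a".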
refs/packed-backend.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 3b11abe5f8..f6142a4402 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -493,6 +493,23 @@ static void verify_buffer_safe(struct snapshot *snapshot)
last_line, eof - last_line);
}
+/*
+ * When parsing the "packed-refs" file, we will parse it line by line.
+ * Because we know the start pointer of the refname and the next
+ * newline pointer, we could calculate the length of the refname by
+ * subtracting the two pointers. However, there is a corner case where
+ * the refname contains corrupted embedded NULL binaries. And
+ * `check_refname_format()` will not catch this when the truncated
+ * refname is still a valid refname. To prevent this, we need to check
+ * whether the refname contains the NULL binaries.
+ */
+static int refname_contains_null(struct strbuf refname)
+{
+ if (refname.len != strlen(refname.buf))
+ return 1;
+ return 0;
+}
+
#define SMALL_FILE_SIZE (32*1024)
/*
@@ -894,6 +911,9 @@ static int next_record(struct packed_ref_iterator *iter)
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
+ if (refname_contains_null(iter->refname_buf))
+ die("packed refname contains embedded NULL: %s", iter->base.refname);
+
if (check_refname_format(iter->base.refname, REFNAME_ALLOW_ONELEVEL)) {
if (!refname_is_safe(iter->base.refname))
die("packed refname is dangerous: %s",
--
2.47.1
* [PATCH 06/10] packed-backend: add "packed-refs" entry consistency check
2025-01-05 13:46 [PATCH 00/10] add more ref consistency checks shejialuo
` (4 preceding siblings ...)
2025-01-05 13:49 ` [PATCH 05/10] packed-backend: check whether the refname contains NULL binaries shejialuo
@ 2025-01-05 13:49 ` shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-05 13:50 ` [PATCH 07/10] packed-backend: create "fsck_packed_ref_entry" to store parsing info shejialuo
` (4 subsequent siblings)
10 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-05 13:49 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
"packed-backend.c::next_record" will parse the ref entry to check the
consistency. This function already checks the following things:
1. It parses the main line of the ref entry and dies if the oid is not
correct. Then it checks whether the character right after the oid is
a space, and finally whether the refname is correct.
2. If the next line starts with '^', it parses the peeled oid and
checks whether the last character is '\n'.
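The main-line steps in item 1 can be sketched as a small standalone
function. This is only an illustration: "check_main_line" and the
"line_status" values are hypothetical names, the hex test uses
isxdigit(3) instead of Git's own oid parser, and the refname itself is
not validated here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum line_status {
	LINE_OK = 0,
	LINE_BAD_OID,
	LINE_NO_SPACE,
	LINE_EMPTY_REFNAME
};

/*
 * Hypothetical stand-in for the main-line check: `start` points at the
 * first byte of the line, `eol` at its terminating '\n', and `hexsz`
 * is the number of hex digits in an object id (40 for SHA-1).
 */
static enum line_status check_main_line(const char *start, const char *eol,
					size_t hexsz)
{
	const char *p = start;

	assert(start && eol && start <= eol);

	/* The line must begin with `hexsz` hex digits. */
	if ((size_t)(eol - start) < hexsz)
		return LINE_BAD_OID;
	for (; p < start + hexsz; p++)
		if (!isxdigit((unsigned char)*p))
			return LINE_BAD_OID;

	/* The oid must be followed by whitespace before the refname. */
	if (p == eol || !isspace((unsigned char)*p))
		return LINE_NO_SPACE;

	/* Whatever remains up to `eol` is the refname. */
	p++;
	return p == eol ? LINE_EMPTY_REFNAME : LINE_OK;
}
```

Returning a status instead of dying is the point of the fsck variant:
the caller can report the problem and move on to the next line.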
We can iterate over each line by using the "packed_fsck_ref_next_line"
function. Create a new fsck message "badPackedRefEntry(ERROR)" to
report to the user when something is wrong.
Create two new functions "packed_fsck_ref_main_line" and
"packed_fsck_ref_peeled_line" for case 1 and case 2 respectively. Last,
update the unit test to exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 3 +
fsck.h | 1 +
refs/packed-backend.c | 105 +++++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 40 +++++++++++++
4 files changed, 148 insertions(+), 1 deletion(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 34375a3143..2a7ec7592e 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -16,6 +16,9 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefEntry`::
+ (ERROR) The "packed-refs" file contains an invalid entry.
+
`badPackedRefHeader`::
(ERROR) The "packed-refs" file contains an invalid
header.
diff --git a/fsck.h b/fsck.h
index 026ad1d537..4fca304b72 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_ENTRY, ERROR) \
FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index f6142a4402..6e521a9f87 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1822,7 +1822,96 @@ static int packed_fsck_ref_header(struct fsck_options *o, const char *start, con
return 0;
}
+static int packed_fsck_ref_peeled_line(struct fsck_options *o,
+ struct ref_store *ref_store, int line_number,
+ const char *start, const char *eol)
+{
+ struct strbuf peeled_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct object_id peeled;
+ const char *p;
+ int ret = 0;
+
+ strbuf_addf(&peeled_entry, "packed-refs line %d", line_number);
+ report.path = peeled_entry.buf;
+
+ start++;
+ if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo)) {
+ ret |= fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid peeled oid",
+ (int)(eol - start), start);
+ goto cleanup;
+ }
+
+ if (p != eol) {
+ ret |= fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has trailing garbage after peeled oid '%.*s'",
+ (int)(eol - p), p);
+ goto cleanup;
+ }
+
+cleanup:
+ strbuf_release(&peeled_entry);
+ return ret;
+}
+
+static int packed_fsck_ref_main_line(struct fsck_options *o,
+ struct ref_store *ref_store, int line_number,
+ const char *start, const char *eol)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct strbuf refname = STRBUF_INIT;
+ struct object_id oid;
+ const char *p;
+ int ret = 0;
+
+ strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
+ report.path = packed_entry.buf;
+
+ if (parse_oid_hex_algop(start, &oid, &p, ref_store->repo->hash_algo)) {
+ ret |= fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid oid",
+ (int)(eol - start), start);
+ goto cleanup;
+ }
+
+ if (p == eol || !isspace(*p)) {
+ ret |= fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has no space after oid '%s' but with '%.*s'",
+ oid_to_hex(&oid), (int)(eol - p), p);
+ goto cleanup;
+ }
+
+ p++;
+ strbuf_add(&refname, p, eol - p);
+ if (refname_contains_null(refname)) {
+ ret |= fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "refname '%s' contains NULL binaries",
+ refname.buf);
+ goto cleanup;
+ }
+
+ if (check_refname_format(refname.buf, 0)) {
+ ret |= fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_NAME,
+ "has bad refname '%s'", refname.buf);
+ goto cleanup;
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ strbuf_release(&refname);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
+ struct ref_store *ref_store,
const char *start, const char *eof)
{
int line_number = 1;
@@ -1844,6 +1933,20 @@ static int packed_fsck_ref_content(struct fsck_options *o,
"missing header line");
}
+ while (start < eof) {
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ ret |= packed_fsck_ref_main_line(o, ref_store, line_number, start, eol);
+ start = eol + 1;
+ line_number++;
+ if (start < eof && *start == '^') {
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ ret |= packed_fsck_ref_peeled_line(o, ref_store, line_number,
+ start, eol);
+ start = eol + 1;
+ line_number++;
+ }
+ }
+
/*
* If there is anything wrong during the parsing of the "packed-refs"
* file, we should not check the object of the refs.
@@ -1900,7 +2003,7 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
- ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ ret = packed_fsck_ref_content(o, ref_store, packed_ref_content.buf,
packed_ref_content.buf + packed_ref_content.len);
cleanup:
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 6c729e749a..7e8b329425 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -690,4 +690,44 @@ test_expect_success 'packed-refs header should be checked' '
done
'
+test_expect_success 'packed-refs content should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ git tag -a annotated-tag-2 -m tag-2 &&
+
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_2_oid=$(git rev-parse annotated-tag-2) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ tag_2_peeled_oid=$(git rev-parse annotated-tag-2^{}) &&
+ short_oid=$(printf "%s" $tag_1_peeled_oid | cut -c 1-4) &&
+
+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
+ printf "%s\n" "$short_oid refs/heads/branch-1" >>.git/packed-refs &&
+ printf "%sx\n" "$branch_1_oid" >>.git/packed-refs &&
+ printf "%s refs/heads/bad-branch\n" "$branch_2_oid" >>.git/packed-refs &&
+ printf "%s refs/heads/branch.\n" "$branch_2_oid" >>.git/packed-refs &&
+ printf "%s refs/tags/annotated-tag-3\n" "$tag_1_oid" >>.git/packed-refs &&
+ printf "^%s\n" "$short_oid" >>.git/packed-refs &&
+ printf "%s refs/tags/annotated-tag-4.\n" "$tag_2_oid" >>.git/packed-refs &&
+ printf "^%s garbage\n" "$tag_2_peeled_oid" >>.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: badPackedRefEntry: '\''$short_oid refs/heads/branch-1'\'' has invalid oid
+ error: packed-refs line 3: badPackedRefEntry: has no space after oid '\''$branch_1_oid'\'' but with '\''x'\''
+ error: packed-refs line 4: badRefName: has bad refname '\'' refs/heads/bad-branch'\''
+ error: packed-refs line 5: badRefName: has bad refname '\''refs/heads/branch.'\''
+ error: packed-refs line 7: badPackedRefEntry: '\''$short_oid'\'' has invalid peeled oid
+ error: packed-refs line 8: badRefName: has bad refname '\''refs/tags/annotated-tag-4.'\''
+ error: packed-refs line 9: badPackedRefEntry: has trailing garbage after peeled oid '\'' garbage'\''
+ EOF
+ test_cmp expect err
+'
+
test_done
--
2.47.1
* [PATCH 07/10] packed-backend: create "fsck_packed_ref_entry" to store parsing info
2025-01-05 13:46 [PATCH 00/10] add more ref consistency checks shejialuo
` (5 preceding siblings ...)
2025-01-05 13:49 ` [PATCH 06/10] packed-backend: add "packed-refs" entry consistency check shejialuo
@ 2025-01-05 13:50 ` shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-05 13:50 ` [PATCH 08/10] packed-backend: add check for object consistency shejialuo
` (3 subsequent siblings)
10 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-05 13:50 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
We have already checked whether the oid hash is well-formed by using
`parse_oid_hex_algop`. However, we don't check whether the object
exists. It may seem that we could do this while parsing the raw
"packed-refs" file, but this is not possible. Let's analyze why.
We would use the "parse_object" function to get the "struct object".
However, this function will eventually call the "create_snapshot" and
"next_record" functions in "packed-backend.c". If there is anything
wrong with the file, they will die. And we don't want to die during
the check.
So, we should store the information gathered during parsing. If nothing
goes wrong during parsing, we can continue with further checks. Create
"fsck_packed_ref_entry" to do this.
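The two-phase idea can be sketched outside of Git as follows. All names
here ("entry_sketch", "collect_entries", "check_entries") are
hypothetical, and the second phase only counts entries where a real
implementation would look up objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct entry_sketch {
	int line_number;
	int has_peeled;
};

struct entry_list {
	struct entry_sketch *items;
	size_t nr, alloc;
};

static void entry_list_append(struct entry_list *list, int line_number)
{
	if (list->nr == list->alloc) {
		size_t alloc = list->alloc ? 2 * list->alloc : 16;
		struct entry_sketch *items =
			realloc(list->items, alloc * sizeof(*items));
		if (!items)
			abort();
		list->items = items;
		list->alloc = alloc;
	}
	list->items[list->nr].line_number = line_number;
	list->items[list->nr].has_peeled = 0;
	list->nr++;
}

/*
 * Phase 1: walk the buffer, record the line on which each entry
 * starts, and return the number of parse errors found.
 */
static int collect_entries(const char *buf, size_t len,
			   struct entry_list *list)
{
	const char *p = buf, *eof = buf + len;
	int line_number = 1, errors = 0;

	assert(buf && list);
	while (p < eof) {
		const char *eol = memchr(p, '\n', (size_t)(eof - p));

		if (*p == '^') {
			if (list->nr)
				list->items[list->nr - 1].has_peeled = 1;
			else
				errors++; /* peeled line without a main line */
		} else {
			entry_list_append(list, line_number);
		}
		if (!eol) {
			errors++; /* last line is not terminated */
			break;
		}
		p = eol + 1;
		line_number++;
	}
	return errors;
}

/*
 * Phase 2: runs only when parsing was clean. Returns the number of
 * entries visited; a real check would look up each object here.
 */
static size_t check_entries(const struct entry_list *list, int parse_errors)
{
	size_t visited = 0;

	if (parse_errors)
		return 0;
	for (size_t i = 0; i < list->nr; i++)
		visited++;
	return visited;
}
```

Keeping the parse results around means the later checks never have to
re-read the file through code paths that die on malformed input.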
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 56 ++++++++++++++++++++++++++++++++++---------
1 file changed, 45 insertions(+), 11 deletions(-)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 6e521a9f87..7386e6bfce 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1768,6 +1768,29 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
+struct fsck_packed_ref_entry {
+ int line_number;
+
+ int has_peeled;
+ struct object_id oid;
+ struct object_id peeled;
+};
+
+static struct fsck_packed_ref_entry *create_fsck_packed_ref_entry(int line_number)
+{
+ struct fsck_packed_ref_entry *entry = xcalloc(1, sizeof(*entry));
+ entry->line_number = line_number;
+ entry->has_peeled = 0;
+ return entry;
+}
+
+static void free_fsck_packed_ref_entries(struct fsck_packed_ref_entry **entries, int nr)
+{
+ for (int i = 0; i < nr; i++)
+ free(entries[i]);
+ free(entries);
+}
+
static int packed_fsck_ref_next_line(struct fsck_options *o,
int line_number, const char *start,
const char *eof, const char **eol)
@@ -1823,20 +1846,20 @@ static int packed_fsck_ref_header(struct fsck_options *o, const char *start, con
}
static int packed_fsck_ref_peeled_line(struct fsck_options *o,
- struct ref_store *ref_store, int line_number,
+ struct ref_store *ref_store,
+ struct fsck_packed_ref_entry *entry,
const char *start, const char *eol)
{
struct strbuf peeled_entry = STRBUF_INIT;
struct fsck_ref_report report = { 0 };
- struct object_id peeled;
const char *p;
int ret = 0;
- strbuf_addf(&peeled_entry, "packed-refs line %d", line_number);
+ strbuf_addf(&peeled_entry, "packed-refs line %d", entry->line_number + 1);
report.path = peeled_entry.buf;
start++;
- if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo)) {
+ if (parse_oid_hex_algop(start, &entry->peeled, &p, ref_store->repo->hash_algo)) {
ret |= fsck_report_ref(o, &report,
FSCK_MSG_BAD_PACKED_REF_ENTRY,
"'%.*s' has invalid peeled oid",
@@ -1858,20 +1881,20 @@ static int packed_fsck_ref_peeled_line(struct fsck_options *o,
}
static int packed_fsck_ref_main_line(struct fsck_options *o,
- struct ref_store *ref_store, int line_number,
+ struct ref_store *ref_store,
+ struct fsck_packed_ref_entry *entry,
const char *start, const char *eol)
{
struct strbuf packed_entry = STRBUF_INIT;
struct fsck_ref_report report = { 0 };
struct strbuf refname = STRBUF_INIT;
- struct object_id oid;
const char *p;
int ret = 0;
- strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
+ strbuf_addf(&packed_entry, "packed-refs line %d", entry->line_number);
report.path = packed_entry.buf;
- if (parse_oid_hex_algop(start, &oid, &p, ref_store->repo->hash_algo)) {
+ if (parse_oid_hex_algop(start, &entry->oid, &p, ref_store->repo->hash_algo)) {
ret |= fsck_report_ref(o, &report,
FSCK_MSG_BAD_PACKED_REF_ENTRY,
"'%.*s' has invalid oid",
@@ -1883,7 +1906,7 @@ static int packed_fsck_ref_main_line(struct fsck_options *o,
ret |= fsck_report_ref(o, &report,
FSCK_MSG_BAD_PACKED_REF_ENTRY,
"has no space after oid '%s' but with '%.*s'",
- oid_to_hex(&oid), (int)(eol - p), p);
+ oid_to_hex(&entry->oid), (int)(eol - p), p);
goto cleanup;
}
@@ -1914,7 +1937,10 @@ static int packed_fsck_ref_content(struct fsck_options *o,
struct ref_store *ref_store,
const char *start, const char *eof)
{
+ struct fsck_packed_ref_entry **entries;
+ int entry_alloc = 20;
int line_number = 1;
+ int entry_nr = 0;
const char *eol;
int ret = 0;
@@ -1933,14 +1959,21 @@ static int packed_fsck_ref_content(struct fsck_options *o,
"missing header line");
}
+ ALLOC_ARRAY(entries, entry_alloc);
while (start < eof) {
+ struct fsck_packed_ref_entry *entry
+ = create_fsck_packed_ref_entry(line_number);
+ ALLOC_GROW(entries, entry_nr + 1, entry_alloc);
+ entries[entry_nr++] = entry;
+
ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
- ret |= packed_fsck_ref_main_line(o, ref_store, line_number, start, eol);
+ ret |= packed_fsck_ref_main_line(o, ref_store, entry, start, eol);
start = eol + 1;
line_number++;
if (start < eof && *start == '^') {
+ entry->has_peeled = 1;
ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
- ret |= packed_fsck_ref_peeled_line(o, ref_store, line_number,
+ ret |= packed_fsck_ref_peeled_line(o, ref_store, entry,
start, eol);
start = eol + 1;
line_number++;
@@ -1955,6 +1988,7 @@ static int packed_fsck_ref_content(struct fsck_options *o,
o->safe_object_check = 0;
+ free_fsck_packed_ref_entries(entries, entry_nr);
return ret;
}
--
2.47.1
* [PATCH 08/10] packed-backend: add check for object consistency
2025-01-05 13:46 [PATCH 00/10] add more ref consistency checks shejialuo
` (6 preceding siblings ...)
2025-01-05 13:50 ` [PATCH 07/10] packed-backend: create "fsck_packed_ref_entry" to store parsing info shejialuo
@ 2025-01-05 13:50 ` shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-05 13:50 ` [PATCH 09/10] packed-backend: check whether the "packed-refs" is sorted shejialuo
` (2 subsequent siblings)
10 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-05 13:50 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
If nothing goes wrong when parsing the raw "packed-refs" file, we can
then iterate over the "entries" to check object consistency. There are
two kinds of ref entries: normal ones and peeled ones. In both cases, we
use the "parse_object" function to look up the object by its object id.
If the object does not exist, we report an error to the user.
Create a new function "packed_fsck_ref_oid" to do the above, then update
the unit test to exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 50 +++++++++++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 35 ++++++++++++++++++++++++++++
2 files changed, 84 insertions(+), 1 deletion(-)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 7386e6bfce..d83ce2838f 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -13,6 +13,7 @@
#include "../iterator.h"
#include "../lockfile.h"
#include "../chdir-notify.h"
+#include "../packfile.h"
#include "../statinfo.h"
#include "../worktree.h"
#include "../wrapper.h"
@@ -1933,6 +1934,52 @@ static int packed_fsck_ref_main_line(struct fsck_options *o,
return ret;
}
+static int packed_fsck_ref_oid(struct fsck_options *o, struct ref_store *ref_store,
+ struct fsck_packed_ref_entry **entries, int nr)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct object *obj;
+ int ret = 0;
+
+ for (int i = 0; i < nr; i++) {
+ struct fsck_packed_ref_entry *entry = entries[i];
+
+ strbuf_release(&packed_entry);
+ strbuf_addf(&packed_entry, "packed-refs line %d", entry->line_number);
+ report.path = packed_entry.buf;
+
+ if (is_promisor_object(ref_store->repo, &entry->oid))
+ continue;
+
+ obj = parse_object(ref_store->repo, &entry->oid);
+ if (!obj) {
+ ret |= fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%s' is not a valid object",
+ oid_to_hex(&entry->oid));
+ }
+ if (entry->has_peeled) {
+ strbuf_reset(&packed_entry);
+ strbuf_addf(&packed_entry, "packed-refs line %d",
+ entry->line_number + 1);
+ report.path = packed_entry.buf;
+
+ obj = parse_object(ref_store->repo, &entry->peeled);
+ if (!obj) {
+ ret |= fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%s' is not a valid object",
+ oid_to_hex(&entry->peeled));
+ }
+ }
+
+ }
+
+ strbuf_release(&packed_entry);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
struct ref_store *ref_store,
const char *start, const char *eof)
@@ -1986,7 +2033,8 @@ static int packed_fsck_ref_content(struct fsck_options *o,
*/
if (ret)
o->safe_object_check = 0;
-
+ else
+ ret |= packed_fsck_ref_oid(o, ref_store, entries, entry_nr);
free_fsck_packed_ref_entries(entries, entry_nr);
return ret;
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 7e8b329425..faa7c80356 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -730,4 +730,39 @@ test_expect_success 'packed-refs content should be checked' '
test_cmp expect err
'
+test_expect_success 'packed-refs objects should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ cd repo &&
+ test_commit default &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+
+ for non_existing_oid in "$(test_oid 001)" "$(test_oid 002)"
+ do
+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
+ printf "%s refs/heads/foo\n" "$non_existing_oid" >>.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: badPackedRefEntry: '\''$non_existing_oid'\'' is not a valid object
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err || return 1
+ done &&
+
+ for non_existing_oid in "$(test_oid 001)" "$(test_oid 002)"
+ do
+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
+ printf "%s refs/tags/foo\n" "$tag_1_oid" >>.git/packed-refs &&
+ printf "^$non_existing_oid\n" >>.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 3: badPackedRefEntry: '\''$non_existing_oid'\'' is not a valid object
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err || return 1
+ done
+'
+
test_done
--
2.47.1
* [PATCH 09/10] packed-backend: check whether the "packed-refs" is sorted
2025-01-05 13:46 [PATCH 00/10] add more ref consistency checks shejialuo
` (7 preceding siblings ...)
2025-01-05 13:50 ` [PATCH 08/10] packed-backend: add check for object consistency shejialuo
@ 2025-01-05 13:50 ` shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-05 13:50 ` [PATCH 10/10] builtin/fsck: add `git refs verify` child process shejialuo
2025-01-30 4:04 ` [PATCH v2 0/8] add more ref consistency checks shejialuo
10 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-05 13:50 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
We always try to keep the "packed-refs" file sorted in increasing order
of refname. So, we should add a check to verify that the "packed-refs"
file is sorted.
It may seem that we could add a new "struct strbuf refname" to
"struct fsck_packed_ref_entry", store the refname in the entry during
parsing, and compare later. However, this is not a good design for the
following reasons:
1. Because we need to keep this state for the whole lifetime of the
check, we would consume a lot of memory if there are many entries
in the "packed-refs" file.
2. More importantly, we could not reuse the existing compare
functions, which would cause repetition.
So, instead of storing a "struct strbuf", let's use the existing
structure "struct snapshot_record". That way we can reuse the logic of
the existing function "cmp_packed_ref_records".
However, this function needs an extra "struct snapshot" parameter.
Extract the common part into a new function "cmp_packed_refname" so
that both callers can use it to compare.
Then, create a new function "packed_fsck_ref_sorted" that uses the new
fsck message "packedRefUnsorted(ERROR)" to report to the user.
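As a rough standalone sketch of the sortedness check, the code below
compares newline-terminated refnames and reports the first adjacent
pair that is out of order. The names "cmp_newline_terminated" and
"first_unsorted" are hypothetical, and the comparison is written in the
spirit of the description above rather than copied from Git.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compare two refnames that are each terminated by '\n' rather than
 * '\0', byte by byte as unsigned chars.
 */
static int cmp_newline_terminated(const char *r1, const char *r2)
{
	assert(r1 && r2);
	for (;; r1++, r2++) {
		if (*r1 == '\n')
			return *r2 == '\n' ? 0 : -1;
		if (*r2 == '\n')
			return 1;
		if (*r1 != *r2)
			return (unsigned char)*r1 < (unsigned char)*r2 ? -1 : 1;
	}
}

/*
 * Return the index of the first record whose refname is not strictly
 * less than its successor's, or -1 when the records are sorted.
 * Each record points at the start of a "<oid> <refname>\n" line, and
 * `hexsz` is the number of hex digits in the oid.
 */
static int first_unsorted(const char *const *records, int nr, size_t hexsz)
{
	for (int i = 1; i < nr; i++) {
		const char *prev = records[i - 1] + hexsz + 1;
		const char *cur = records[i] + hexsz + 1;

		if (cmp_newline_terminated(prev, cur) >= 0)
			return i - 1;
	}
	return -1;
}
```

Because each record is just a pointer into the file buffer, no refname
is copied unless an error actually has to be reported. Duplicate
refnames also count as unsorted, since the comparison must be strictly
increasing.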
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 3 ++
fsck.h | 1 +
refs/packed-backend.c | 78 ++++++++++++++++++++++++++++++-----
t/t0602-reffiles-fsck.sh | 40 ++++++++++++++++++
4 files changed, 111 insertions(+), 11 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 2a7ec7592e..7a11d35c5e 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -190,6 +190,9 @@
`packedRefMissingHeader`::
(INFO) The "packed-refs" file does not contain the header.
+`packedRefUnsorted`::
+ (ERROR) The "packed-refs" file is not sorted.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
diff --git a/fsck.h b/fsck.h
index 4fca304b72..1be7402eb9 100644
--- a/fsck.h
+++ b/fsck.h
@@ -56,6 +56,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
+ FUNC(PACKED_REF_UNSORTED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index d83ce2838f..df65fec5a5 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -300,14 +300,8 @@ struct snapshot_record {
size_t len;
};
-static int cmp_packed_ref_records(const void *v1, const void *v2,
- void *cb_data)
+static int cmp_packed_refname(const char *r1, const char *r2)
{
- const struct snapshot *snapshot = cb_data;
- const struct snapshot_record *e1 = v1, *e2 = v2;
- const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
- const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
-
while (1) {
if (*r1 == '\n')
return *r2 == '\n' ? 0 : -1;
@@ -322,6 +316,17 @@ static int cmp_packed_ref_records(const void *v1, const void *v2,
}
}
+static int cmp_packed_ref_records(const void *v1, const void *v2,
+ void *cb_data)
+{
+ const struct snapshot *snapshot = cb_data;
+ const struct snapshot_record *e1 = v1, *e2 = v2;
+ const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
+ const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+
+ return cmp_packed_refname(r1, r2);
+}
+
/*
* Compare a snapshot record at `rec` to the specified NUL-terminated
* refname.
@@ -1775,13 +1780,17 @@ struct fsck_packed_ref_entry {
int has_peeled;
struct object_id oid;
struct object_id peeled;
+
+ struct snapshot_record record;
};
-static struct fsck_packed_ref_entry *create_fsck_packed_ref_entry(int line_number)
+static struct fsck_packed_ref_entry *create_fsck_packed_ref_entry(int line_number,
+ const char *start)
{
struct fsck_packed_ref_entry *entry = xcalloc(1, sizeof(*entry));
entry->line_number = line_number;
entry->has_peeled = 0;
+ entry->record.start = start;
return entry;
}
@@ -1980,6 +1989,50 @@ static int packed_fsck_ref_oid(struct fsck_options *o, struct ref_store *ref_sto
return ret;
}
+static int packed_fsck_ref_sorted(struct fsck_options *o,
+ struct ref_store *ref_store,
+ struct fsck_packed_ref_entry **entries,
+ int nr)
+{
+ size_t hexsz = ref_store->repo->hash_algo->hexsz;
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct strbuf refname1 = STRBUF_INIT;
+ struct strbuf refname2 = STRBUF_INIT;
+ int ret = 0;
+
+ for (int i = 1; i < nr; i++) {
+ const char *r1 = entries[i - 1]->record.start + hexsz + 1;
+ const char *r2 = entries[i]->record.start + hexsz + 1;
+
+ if (cmp_packed_refname(r1, r2) >= 0) {
+ const char *err_fmt =
+ "refname '%s' is not less than next refname '%s'";
+ const char *eol;
+ eol = memchr(entries[i - 1]->record.start, '\n',
+ entries[i - 1]->record.len);
+ strbuf_add(&refname1, r1, eol - r1);
+ eol = memchr(entries[i]->record.start, '\n',
+ entries[i]->record.len);
+ strbuf_add(&refname2, r2, eol - r2);
+
+ strbuf_addf(&packed_entry, "packed-refs line %d",
+ entries[i - 1]->line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_UNSORTED,
+ err_fmt, refname1.buf, refname2.buf);
+ goto cleanup;
+ }
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ strbuf_release(&refname1);
+ strbuf_release(&refname2);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
struct ref_store *ref_store,
const char *start, const char *eof)
@@ -2009,7 +2062,7 @@ static int packed_fsck_ref_content(struct fsck_options *o,
ALLOC_ARRAY(entries, entry_alloc);
while (start < eof) {
struct fsck_packed_ref_entry *entry
- = create_fsck_packed_ref_entry(line_number);
+ = create_fsck_packed_ref_entry(line_number, start);
ALLOC_GROW(entries, entry_nr + 1, entry_alloc);
entries[entry_nr++] = entry;
@@ -2025,16 +2078,19 @@ static int packed_fsck_ref_content(struct fsck_options *o,
start = eol + 1;
line_number++;
}
+ entry->record.len = start - entry->record.start;
}
/*
* If there is anything wrong during the parsing of the "packed-refs"
* file, we should not check the object of the refs.
*/
- if (ret)
+ if (ret) {
o->safe_object_check = 0;
- else
+ } else {
ret |= packed_fsck_ref_oid(o, ref_store, entries, entry_nr);
+ ret |= packed_fsck_ref_sorted(o, ref_store, entries, entry_nr);
+ }
free_fsck_packed_ref_entries(entries, entry_nr);
return ret;
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index faa7c80356..800a19e4e6 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -765,4 +765,44 @@ test_expect_success 'packed-refs objects should be checked' '
done
'
+test_expect_success 'packed-ref sorted should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
+
+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
+ printf "%s %s\n" "$branch_2_oid" "$refname1" >>.git/packed-refs &&
+ printf "%s %s\n" "$branch_1_oid" "$refname2" >>.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: packedRefUnsorted: refname '\''$refname1'\'' is not less than next refname '\''$refname2'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err &&
+
+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
+ printf "%s %s\n" "$tag_1_oid" "$refname3" >>.git/packed-refs &&
+ printf "^%s\n" "$tag_1_peeled_oid" >>.git/packed-refs &&
+ printf "%s %s\n" "$branch_2_oid" "$refname2" >>.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: packedRefUnsorted: refname '\''$refname3'\'' is not less than next refname '\''$refname2'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
+'
+
test_done
--
2.47.1
* [PATCH 10/10] builtin/fsck: add `git refs verify` child process
2025-01-05 13:46 [PATCH 00/10] add more ref consistency checks shejialuo
` (8 preceding siblings ...)
2025-01-05 13:50 ` [PATCH 09/10] packed-backend: check whether the "packed-refs" is sorted shejialuo
@ 2025-01-05 13:50 ` shejialuo
2025-01-06 22:16 ` Junio C Hamano
2025-01-30 4:04 ` [PATCH v2 0/8] add more ref consistency checks shejialuo
10 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-05 13:50 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
By now, we have implemented the ref consistency checks for both
"files-backend" and "packed-backend". Although some of the checks are
redundant, they won't cause trouble. So, let's integrate them into the
"git-fsck(1)" command to get feedback from users. Also, by calling
"git refs verify" in "git-fsck(1)", we make sure that the newly added
checks don't break.
Introduce a new function "fsck_refs" that initializes and runs a child
process to execute the "git refs verify" command. To give the user
feedback, create a progress meter whose total is 1, because it is hard
to know at this point how many loose refs we will check. We might
improve this later.
We run this function first in "git-fsck(1)" because we don't want the
existing "git-fsck(1)" code, which implicitly checks the consistency of
refs, to die before we can report the problems.
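For readers unfamiliar with the pattern, here is a minimal POSIX sketch
of running a verifier as a child process and folding its exit status
into an error bitmask. It is not Git's code: Git uses its own
run-command API, and the names "run_child", "check_with_child" and
"ERROR_REFS_SKETCH" are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define ERROR_REFS_SKETCH 01

/*
 * Run `argv` as a child process and return its exit code, or -1 if
 * the child could not be started or did not exit normally.
 */
static int run_child(char *const argv[])
{
	int status;
	pid_t pid;

	assert(argv && argv[0]);
	pid = fork();
	if (pid < 0)
		return -1;
	if (!pid) {
		execvp(argv[0], argv);
		_exit(127); /* only reached when exec failed */
	}
	while (waitpid(pid, &status, 0) < 0)
		if (errno != EINTR)
			return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Set the error bit when the child reports failure, then carry on. */
static int check_with_child(char *const argv[], int errors_found)
{
	if (run_child(argv))
		errors_found |= ERROR_REFS_SKETCH;
	return errors_found;
}
```

A failing child only sets a bit in the result, so the parent can keep
going with its remaining checks instead of stopping at the first
problem.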
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
builtin/fsck.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 0196c54eb6..a10e52b601 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -902,6 +902,32 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress)
return res;
}
+static void fsck_refs(void)
+{
+ struct child_process refs_verify = CHILD_PROCESS_INIT;
+ struct progress *progress = NULL;
+
+ if (show_progress)
+ progress = start_progress(_("Checking ref database"), 1);
+
+ if (verbose)
+ fprintf_ln(stderr, _("Checking ref database"));
+
+ child_process_init(&refs_verify);
+ refs_verify.git_cmd = 1;
+ strvec_pushl(&refs_verify.args, "refs", "verify", NULL);
+ if (verbose)
+ strvec_push(&refs_verify.args, "--verbose");
+ if (check_strict)
+ strvec_push(&refs_verify.args, "--strict");
+
+ if (run_command(&refs_verify))
+ errors_found |= ERROR_REFS;
+
+ display_progress(progress, 1);
+ stop_progress(&progress);
+}
+
static char const * const fsck_usage[] = {
N_("git fsck [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]\n"
" [--[no-]full] [--strict] [--verbose] [--lost-found]\n"
@@ -967,6 +993,8 @@ int cmd_fsck(int argc,
git_config(git_fsck_config, &fsck_obj_options);
prepare_repo_settings(the_repository);
+ fsck_refs();
+
if (connectivity_only) {
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
for_each_packed_object(the_repository,
--
2.47.1
* Re: [PATCH 10/10] builtin/fsck: add `git refs verify` child process
2025-01-05 13:50 ` [PATCH 10/10] builtin/fsck: add `git refs verify` child process shejialuo
@ 2025-01-06 22:16 ` Junio C Hamano
2025-01-07 12:00 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Junio C Hamano @ 2025-01-06 22:16 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> builtin/fsck.c | 28 ++++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index 0196c54eb6..a10e52b601 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -902,6 +902,32 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress)
> return res;
> }
>
> +static void fsck_refs(void)
> +{
> + struct child_process refs_verify = CHILD_PROCESS_INIT;
> + struct progress *progress = NULL;
> +
> + if (show_progress)
> + progress = start_progress(_("Checking ref database"), 1);
This had an obvious semantic conflict with a topic in flight.
I've resolved it in the latest integration after pushing out the
2.48-rc2 this morning, so there is no need to resend, but please
remember that it would be a possibility to rebase on top of an
updated 'master' *IF* the other topic graduates to 'master' a lot
earlier than this topic hits 'next' (IOW, until that happens there
is no need to rebase).
Thanks.
* Re: [PATCH 10/10] builtin/fsck: add `git refs verify` child process
2025-01-06 22:16 ` Junio C Hamano
@ 2025-01-07 12:00 ` shejialuo
2025-01-07 15:52 ` Junio C Hamano
0 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-07 12:00 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
On Mon, Jan 06, 2025 at 02:16:22PM -0800, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > builtin/fsck.c | 28 ++++++++++++++++++++++++++++
> > 1 file changed, 28 insertions(+)
> >
> > diff --git a/builtin/fsck.c b/builtin/fsck.c
> > index 0196c54eb6..a10e52b601 100644
> > --- a/builtin/fsck.c
> > +++ b/builtin/fsck.c
> > @@ -902,6 +902,32 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress)
> > return res;
> > }
> >
> > +static void fsck_refs(void)
> > +{
> > + struct child_process refs_verify = CHILD_PROCESS_INIT;
> > + struct progress *progress = NULL;
> > +
> > + if (show_progress)
> > + progress = start_progress(_("Checking ref database"), 1);
>
> This had an obvious semantic conflict with a topic in flight.
>
> I've resolved it in the latest integration after pushing out the
> 2.48-rc2 this morning, so there is no need to resend, but please
> remember that it would be a possibility to rebase on top of an
> updated 'master' *IF* the other topic graduates to 'master' a lot
> earlier than this topic hits 'next' (IOW, until that happens there
> is no need to rebase).
>
Thanks for the careful notification. I'll watch this.
> Thanks.
Thanks.
* Re: [PATCH 01/10] files-backend: add object check for regular ref
2025-01-05 13:49 ` [PATCH 01/10] files-backend: add object check for regular ref shejialuo
@ 2025-01-07 14:17 ` Karthik Nayak
2025-01-16 13:57 ` Patrick Steinhardt
1 sibling, 0 replies; 168+ messages in thread
From: Karthik Nayak @ 2025-01-07 14:17 UTC (permalink / raw)
To: shejialuo, git; +Cc: Patrick Steinhardt, Junio C Hamano, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> Although we use "parse_loose_ref_content" to check whether the object id
> is correct, we never parse it into the "struct object" structure thus we
> ignore checking whether there is a real object existing in the repo and
> whether the object type is correct.
>
> Use "parse_object" to parse the oid for the regular ref content. If the
> object does not exist, report the error to the user by reusing the fsck
> message "BAD_REF_CONTENT".
>
> Then, we need to check the type of the object. Just like "git-fsck(1)",
> we only report "not a commit" error when the ref is a branch. Last,
> update the test to exercise the code.
I found this a bit confusing at first, though the code does clear up the
confusion. Perhaps we can say something like:
Branches that do not point to a commit type are explicitly called out,
similar to 'git-fsck(1)'.
>
> Mentored-by: Patrick Steinhardt <ps@pks.im>
> Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
> refs/files-backend.c | 50 ++++++++++++++++++++++++++++++++--------
> t/t0602-reffiles-fsck.sh | 30 ++++++++++++++++++++++++
> 2 files changed, 70 insertions(+), 10 deletions(-)
>
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 64f51f0da9..0a4912c009 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -20,6 +20,7 @@
> #include "../lockfile.h"
> #include "../object.h"
> #include "../object-file.h"
> +#include "../packfile.h"
> #include "../path.h"
> #include "../dir.h"
> #include "../chdir-notify.h"
> @@ -3589,6 +3590,34 @@ static int files_fsck_symref_target(struct fsck_options *o,
> return ret;
> }
>
> +static int files_fsck_refs_oid(struct fsck_options *o,
> + struct ref_store *ref_store,
> + struct fsck_ref_report report,
> + const char *target_name,
> + struct object_id *oid)
> +{
> + struct object *obj;
> + int ret = 0;
> +
> + if (is_promisor_object(ref_store->repo, oid))
> + return 0;
> +
> + obj = parse_object(ref_store->repo, oid);
> + if (!obj) {
> + ret |= fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_REF_CONTENT,
> + "points to non-existing object %s",
> + oid_to_hex(oid));
Nit: The two conditionals here are mutually exclusive. So we don't have
to do `ret |=`, no? We don't even need `ret` here, we could simply do a
`return fsck_report_ref(...)`.
> + } else if (obj->type != OBJ_COMMIT && is_branch(target_name)) {
> + ret |= fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_REF_CONTENT,
> + "points to non-commit object %s",
> + oid_to_hex(oid));
> + }
Since each branch of this if/else holds a single statement, we can skip
the braces here.
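To make the two nits above concrete, here is a minimal standalone sketch of the same control flow with direct returns and no braces. It is not the patch's code: `enum obj_type`, `report()` and `check_ref_target()` are hypothetical stand-ins for git's object types, `fsck_report_ref()` and `files_fsck_refs_oid()`, chosen only so the snippet compiles on its own.

```c
#include <stdio.h>

/* Hypothetical stand-ins for git's object types (assumption, not git's enum). */
enum obj_type { OBJ_NONE_FOUND, OBJ_COMMIT_T, OBJ_TREE_T };

/* Hypothetical stand-in for fsck_report_ref(): print and return non-zero. */
static int report(const char *refname, const char *msg)
{
	fprintf(stderr, "error: %s: badRefContent: %s\n", refname, msg);
	return 1;
}

/*
 * The two conditions are mutually exclusive, so each one can return
 * the report result directly; no accumulator variable is needed.
 */
static int check_ref_target(const char *refname, enum obj_type type,
			    int is_branch)
{
	if (type == OBJ_NONE_FOUND)
		return report(refname, "points to non-existing object");
	if (type != OBJ_COMMIT_T && is_branch)
		return report(refname, "points to non-commit object");
	return 0;
}
```

With this shape, a tag pointing at a tree is accepted while a branch pointing at a tree is reported, which matches the behavior described in the commit message.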
> + return ret;
> +}
> +
> static int files_fsck_refs_content(struct ref_store *ref_store,
> struct fsck_options *o,
> const char *target_name,
> @@ -3654,18 +3683,19 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
> }
>
> if (!(type & REF_ISSYMREF)) {
> + ret |= files_fsck_refs_oid(o, ref_store, report, target_name, &oid);
> +
> if (!*trailing) {
> - ret = fsck_report_ref(o, &report,
> - FSCK_MSG_REF_MISSING_NEWLINE,
> - "misses LF at the end");
> - goto cleanup;
> - }
> - if (*trailing != '\n' || *(trailing + 1)) {
> - ret = fsck_report_ref(o, &report,
> - FSCK_MSG_TRAILING_REF_CONTENT,
> - "has trailing garbage: '%s'", trailing);
> - goto cleanup;
> + ret |= fsck_report_ref(o, &report,
> + FSCK_MSG_REF_MISSING_NEWLINE,
> + "misses LF at the end");
> + } else if (*trailing != '\n' || *(trailing + 1)) {
> + ret |= fsck_report_ref(o, &report,
> + FSCK_MSG_TRAILING_REF_CONTENT,
> + "has trailing garbage: '%s'", trailing);
> }
> +
> + goto cleanup;
> } else {
> ret = files_fsck_symref_target(o, &report, &referent, 0);
> goto cleanup;
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index d4a08b823b..75f234a94a 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -161,8 +161,10 @@ test_expect_success 'regular ref content should be checked (individual)' '
> test_when_finished "rm -rf repo" &&
> git init repo &&
> branch_dir_prefix=.git/refs/heads &&
> + tag_dir_prefix=.git/refs/tags &&
> cd repo &&
> test_commit default &&
> + git branch branch-1 &&
> mkdir -p "$branch_dir_prefix/a/b" &&
>
> git refs verify 2>err &&
> @@ -198,6 +200,28 @@ test_expect_success 'regular ref content should be checked (individual)' '
> rm $branch_dir_prefix/branch-no-newline &&
> test_cmp expect err &&
>
> + for non_existing_oid in "$(test_oid 001)" "$(test_oid 002)"
> + do
> + printf "%s\n" $non_existing_oid >$branch_dir_prefix/invalid-commit &&
> + test_must_fail git refs verify 2>err &&
> + cat >expect <<-EOF &&
> + error: refs/heads/invalid-commit: badRefContent: points to non-existing object $non_existing_oid
> + EOF
> + rm $branch_dir_prefix/invalid-commit &&
> + test_cmp expect err || return 1
> + done &&
> +
> + for tree_oid in "$(git rev-parse main^{tree})" "$(git rev-parse branch-1^{tree})"
> + do
> + printf "%s\n" $tree_oid >$branch_dir_prefix/branch-tree &&
> + test_must_fail git refs verify 2>err &&
> + cat >expect <<-EOF &&
> + error: refs/heads/branch-tree: badRefContent: points to non-commit object $tree_oid
Reading this error here, I think it would be nicer to say
'badRefContent: branch points to ....' so we know that the specified ref
is a branch.
> + EOF
> + rm $branch_dir_prefix/branch-tree &&
> + test_cmp expect err || return 1
> + done &&
> +
> for trailing_content in " garbage" " more garbage"
> do
> printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
> @@ -244,15 +268,21 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
> bad_content_1=$(git rev-parse main)x &&
> bad_content_2=xfsazqfxcadas &&
> bad_content_3=Xfsazqfxcadas &&
> + non_existing_oid=$(test_oid 001) &&
> + tree_oid=$(git rev-parse main^{tree}) &&
> printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
> printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
> printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
> printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
> printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
> + printf "%s\n" $non_existing_oid >$branch_dir_prefix/branch-non-existing-oid &&
> + printf "%s\n" $tree_oid >$branch_dir_prefix/branch-tree &&
>
> test_must_fail git refs verify 2>err &&
> cat >expect <<-EOF &&
> error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
> + error: refs/heads/branch-non-existing-oid: badRefContent: points to non-existing object $non_existing_oid
> + error: refs/heads/branch-tree: badRefContent: points to non-commit object $tree_oid
> error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
> error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
> warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
> --
> 2.47.1
* Re: [PATCH 02/10] builtin/refs.h: get worktrees without reading head info
2025-01-05 13:49 ` [PATCH 02/10] builtin/refs.h: get worktrees without reading head info shejialuo
@ 2025-01-07 14:57 ` Karthik Nayak
2025-01-07 16:34 ` shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
1 sibling, 1 reply; 168+ messages in thread
From: Karthik Nayak @ 2025-01-07 14:57 UTC (permalink / raw)
To: shejialuo, git; +Cc: Patrick Steinhardt, Junio C Hamano, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> In "packed-backend.c", there are some functions such as "create_snapshot"
> and "next_record" which would check the correctness of the content of
> the "packed-ref" file. When anything is bad, the program will die.
So you're saying, `create_snapshot()` and `next_record()` exit the
program on any error. Okay that seems to be valid.
> It may seem that we have nothing relevant to above feature, because we
> are going to read and parse the raw "packed-ref" file without creating
> the snapshot and using the ref iterator to check the consistency.
>
> However, when using "get_worktrees" in "builtin/refs", we will parse the
> head information. If the referent of the "HEAD" is inside the
> "packed-ref", we will call "create_snapshot" and "next_record" functions
> to parse the "packed-ref" to get the head information. And if there are
> something wrong, the program will die.
>
> Although this behavior has no harm for the program, it will
> short-circuit the program. When the users execute "git refs verify" or
> "git fsck", we don't want to simply die the program but rather show the
> warnings or errors as many as possible to info the users. So, we should
> avoiding reading the head info.
>
This is a bit tricky here. If the information for the `HEAD` ref is
incorrect in the packed-refs, git would exit early. Which is what we're
trying to avoid in this patch, by using the `get_worktrees_internal()`
function.
However, I would question if this is the right approach. Shouldn't
`get_worktrees()` failing indicate that the repository is invalid? In
that case does it really make sense to allow the user to even run `git
refs verify`? Isn't the prerequisite for running the `git-refs(1)`
command a valid repository?
Generally, I'd agree that we try to obtain all errors so that the user
can get a full picture. But exposing internal worktree functions so we
treat invalid repos as valid repos so we can do that, seems a bit of a
stretch.
> Fortunately, in 465a22b338 (worktree: skip reading HEAD when repairing
> worktrees, 2023-12-29), we have introduced a function
> "get_worktrees_internal" which allows us to get worktrees without
> reading head info.
>
> Create a new exposed function "get_worktrees_without_reading_head", then
> replace the "get_worktrees" in "builtin/refs" with the new created
> function.
>
[snip]
* Re: [PATCH 10/10] builtin/fsck: add `git refs verify` child process
2025-01-07 12:00 ` shejialuo
@ 2025-01-07 15:52 ` Junio C Hamano
0 siblings, 0 replies; 168+ messages in thread
From: Junio C Hamano @ 2025-01-07 15:52 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
>> I've resolved it in the latest integration after pushing out the
>> 2.48-rc2 this morning, so there is no need to resend, but please
>> remember that it would be a possibility to rebase on top of an
>> updated 'master' *IF* the other topic graduates to 'master' a lot
>> earlier than this topic hits 'next' (IOW, until that happens there
>> is no need to rebase).
>>
>
> Thanks for the careful notification. I'll watch this.
For future reference and to help those who may be reading from the
sidelines, it is a good practice to see how your topic interacts
with other things in flight by making a trial merge to 'next' and to
'seen'. It would give you an opportunity to learn about what other
people are actively doing in the project.
Thanks.
* Re: [PATCH 03/10] packed-backend: check whether the "packed-refs" is regular
2025-01-05 13:49 ` [PATCH 03/10] packed-backend: check whether the "packed-refs" is regular shejialuo
@ 2025-01-07 16:33 ` Karthik Nayak
2025-01-17 14:00 ` shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
1 sibling, 1 reply; 168+ messages in thread
From: Karthik Nayak @ 2025-01-07 16:33 UTC (permalink / raw)
To: shejialuo, git; +Cc: Patrick Steinhardt, Junio C Hamano, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> Although "git-fsck(1)" and "packed-backend.c" will check some
> consistency and correctness of "packed-refs" file, they never check the
> filetype of the "packed-refs". The user should always use "git
> packed-refs" command to create the raw regular "packed-refs" file, so we
> need to explicitly check this in "git refs verify".
>
> Use "lstat" to check the file mode. If we cannot check the file status,
> this is OK because there is a chance that there is no "packed-refs" in
> the repo.
>
> Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
> the user if "packed-refs" is not a regular file.
>
> Mentored-by: Patrick Steinhardt <ps@pks.im>
> Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
> refs/packed-backend.c | 33 +++++++++++++++++++++++++++++----
> t/t0602-reffiles-fsck.sh | 20 ++++++++++++++++++++
> 2 files changed, 49 insertions(+), 4 deletions(-)
>
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 3406f1e71d..d9eb2f8b71 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -4,6 +4,7 @@
> #include "../config.h"
> #include "../dir.h"
> #include "../gettext.h"
> +#include "../fsck.h"
> #include "../hash.h"
> #include "../hex.h"
> #include "../refs.h"
> @@ -1747,15 +1748,39 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> return empty_ref_iterator_begin();
> }
>
> -static int packed_fsck(struct ref_store *ref_store UNUSED,
> - struct fsck_options *o UNUSED,
> +static int packed_fsck(struct ref_store *ref_store,
> + struct fsck_options *o,
> struct worktree *wt)
> {
> + struct packed_ref_store *refs = packed_downcast(ref_store,
> + REF_STORE_READ, "fsck");
> + struct stat st;
> + int ret = 0;
>
> if (!is_main_worktree(wt))
> - return 0;
> + goto cleanup;
>
> - return 0;
> + /*
> + * If the packed-refs file doesn't exist, there's nothing to
> + * check.
> + */
> + if (lstat(refs->path, &st) < 0)
> + goto cleanup;
Since `lstat` returns '-1' for all errors, we should check that
`errno == ENOENT` before treating the file as missing.
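A minimal sketch of what that could look like, written as a standalone helper rather than as a change to `packed_fsck()`. The name `check_regular_or_missing()` and its return convention are assumptions made for illustration; only `lstat(2)`, `errno`, `ENOENT` and `S_ISREG()` are real interfaces.

```c
#include <errno.h>
#include <sys/stat.h>

/*
 * Hypothetical helper, not part of the patch: classify the file at
 * "path" without following symlinks.
 *
 * Returns  0 if the file is missing (ENOENT) or is a regular file,
 *          1 if it exists but is not a regular file,
 *         -1 if lstat(2) failed for any other reason.
 */
static int check_regular_or_missing(const char *path)
{
	struct stat st;

	if (lstat(path, &st) < 0) {
		/* Only a missing file means "nothing to check". */
		if (errno == ENOENT)
			return 0;
		return -1;
	}

	return S_ISREG(st.st_mode) ? 0 : 1;
}
```

Errors such as EACCES would then surface as a separate failure instead of being silently treated like an absent "packed-refs" file.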
> + if (o->verbose)
> + fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
> +
> + if (!S_ISREG(st.st_mode)) {
> + struct fsck_ref_report report = { 0 };
> + report.path = "packed-refs";
> +
> + ret = fsck_report_ref(o, &report, FSCK_MSG_BAD_REF_FILETYPE,
> + "not a regular file");
> + goto cleanup;
> + }
> +
> +cleanup:
> + return ret;
> }
>
> struct ref_storage_be refs_be_packed = {
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index 75f234a94a..307f94a3ca 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -626,4 +626,24 @@ test_expect_success 'ref content checks should work with worktrees' '
> test_cmp expect err
> '
>
> +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + cd repo &&
This should be in a subshell, so that at the end we can actually remove
the repo. This seems to be applicable to most of the other tests in this
file too. Perhaps, we should clean it up as a precursor commit to this
series?
> + test_commit default &&
> + git branch branch-1 &&
> + git branch branch-2 &&
> + git branch branch-3 &&
> + git pack-refs --all &&
> +
> + mv .git/packed-refs .git/packed-refs-back &&
> + ln -sf packed-refs-bak .git/packed-refs &&
This should be `ln -sf .git/packed-refs-back .git/packed-refs` no?
> + test_must_fail git refs verify 2>err &&
> + cat >expect <<-EOF &&
> + error: packed-refs: badRefFiletype: not a regular file
> + EOF
> + rm .git/packed-refs &&
> + test_cmp expect err
> +'
> +
> test_done
> --
> 2.47.1
* Re: [PATCH 02/10] builtin/refs.h: get worktrees without reading head info
2025-01-07 14:57 ` Karthik Nayak
@ 2025-01-07 16:34 ` shejialuo
2025-01-08 8:40 ` Karthik Nayak
0 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-07 16:34 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, Patrick Steinhardt, Junio C Hamano, Michael Haggerty
On Tue, Jan 07, 2025 at 06:57:08AM -0800, Karthik Nayak wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > In "packed-backend.c", there are some functions such as "create_snapshot"
> > and "next_record" which would check the correctness of the content of
> > the "packed-ref" file. When anything is bad, the program will die.
>
> So you're saying, `create_snapshot()` and `next_record()` exit the
> program on any error. Okay that seems to be valid.
>
> > It may seem that we have nothing relevant to above feature, because we
> > are going to read and parse the raw "packed-ref" file without creating
> > the snapshot and using the ref iterator to check the consistency.
> >
> > However, when using "get_worktrees" in "builtin/refs", we will parse the
> > head information. If the referent of the "HEAD" is inside the
> > "packed-ref", we will call "create_snapshot" and "next_record" functions
> > to parse the "packed-ref" to get the head information. And if there are
> > something wrong, the program will die.
> >
> > Although this behavior has no harm for the program, it will
> > short-circuit the program. When the users execute "git refs verify" or
> > "git fsck", we don't want to simply die the program but rather show the
> > warnings or errors as many as possible to info the users. So, we should
> > avoiding reading the head info.
> >
>
> This is a bit tricky here. If the information for the `HEAD` ref is
> incorrect in the packed-refs, git would exit early. Which is what we're
> trying to avoid in this patch, by using the `get_worktrees_internal()`
> function.
>
I think my commit message may have confused you here. The information
of the "HEAD" ref is never stored in "packed-refs", but to read the head
information we need to parse "packed-refs" via the "create_snapshot"
function. Even when the corresponding referent is correct (and even if
it is not correct, that alone won't make the program die),
"create_snapshot" calls "verify_buffer_safe" to check whether the last
line of the file ends with a newline. If it does not, the program dies.
This is a bad thing. For example, if HEAD points to "refs/heads/main"
and we need to use the code path from packed-backend, we have to call
"create_snapshot". With the file below, the program dies because the
last line lacks a newline, and we cannot tell the user about the other
faults.
```packed-refs
<good_oid> refs/heads/main\n
<bad_oid> <bad_refname>\n
<oid> refs/heads/a
```
So, the motivation here is that we should not read HEAD at all when we
are doing consistency checking to make the code totally independent of
the "create_snapshot" and "next_record".
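As a rough illustration of the alternative, a check that inspects the raw buffer can report the missing newline and carry on instead of dying. The helper below is a hypothetical sketch, not code from "packed-backend.c"; the name `ends_with_newline()` and the choice to accept an empty buffer are assumptions.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return non-zero when the buffer holding the
 * "packed-refs" content ends with LF. An empty buffer is accepted,
 * since there is no last line to be unterminated.
 */
static int ends_with_newline(const char *buf, size_t len)
{
	return len == 0 || buf[len - 1] == '\n';
}
```

A caller could turn a zero result into an fsck report for the last line and still go on to check the earlier entries, such as the bad oid and refname on the second line of the example file.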
> However, I would question if this is the right approach. Shouldn't
> `get_worktrees()` failing indicate that the repository is invalid? In
> that case does it really make sense to allow the user to even run `git
> refs verify`? Isn't the prerequisite for running the `git-refs(1)`
> command a valid repository?
>
As I mentioned above, even when the referent of "HEAD" is good,
"get_worktrees()" will still fail because of some fatal errors in the
"packed-refs" file. I don't think that the repository is invalid in this
situation.
Going further, in what situations do users want to execute "git refs
verify" or "git-fsck"? My intuition is that users run these check
commands when something fails and they want to know why. So these
commands should still run when the repository is invalid, to tell the
user what may be wrong. That is the value of these two commands.
Thanks,
Jialuo
* Re: [PATCH 04/10] packed-backend: add "packed-refs" header consistency check
2025-01-05 13:49 ` [PATCH 04/10] packed-backend: add "packed-refs" header consistency check shejialuo
@ 2025-01-08 0:54 ` shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
1 sibling, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-01-08 0:54 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
On Sun, Jan 05, 2025 at 09:49:37PM +0800, shejialuo wrote:
[snip]
> However, when adding the new test for a bad header, the program will
> still die in the "create_snapshot" method. This is because we have
> checked the files-backend firstly and we use "parse_object" to check
> whether the object exists and whether the type is correct. This function
> will eventually call "create_snapshot" and "next_record" method, if
> there is something wrong with packed-backend, the program just dies.
>
> It's bad to just die the program because we want to report the problems
> as many as possible. We should avoid checking object and its type when
> packed-backend is broken. So, we should first check the consistency of
> the packed-backend then for files-backend.
>
> Add a new flag "safe_object_check" in "fsck_options", when there is
> anything wrong with the parsing process, set this flag to 0 to avoid
> checking objects in the later checks.
>
Here, I made a mistake. The simplest way is to call the
"disable_replace_refs" function in "builtin/refs". So, a lot of code
and commit messages need to be fixed in version 2. I have only just
realized this.
So I am telling the reviewers about this in advance.
Thanks,
Jialuo
* Re: [PATCH 02/10] builtin/refs.h: get worktrees without reading head info
2025-01-07 16:34 ` shejialuo
@ 2025-01-08 8:40 ` Karthik Nayak
0 siblings, 0 replies; 168+ messages in thread
From: Karthik Nayak @ 2025-01-08 8:40 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Junio C Hamano, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> On Tue, Jan 07, 2025 at 06:57:08AM -0800, Karthik Nayak wrote:
>> shejialuo <shejialuo@gmail.com> writes:
>>
>> > In "packed-backend.c", there are some functions such as "create_snapshot"
>> > and "next_record" which would check the correctness of the content of
>> > the "packed-ref" file. When anything is bad, the program will die.
>>
>> So you're saying, `create_snapshot()` and `next_record()` exit the
>> program on any error. Okay that seems to be valid.
>>
>> > It may seem that we have nothing relevant to above feature, because we
>> > are going to read and parse the raw "packed-ref" file without creating
>> > the snapshot and using the ref iterator to check the consistency.
>> >
>> > However, when using "get_worktrees" in "builtin/refs", we will parse the
>> > head information. If the referent of the "HEAD" is inside the
>> > "packed-ref", we will call "create_snapshot" and "next_record" functions
>> > to parse the "packed-ref" to get the head information. And if there are
>> > something wrong, the program will die.
>> >
>> > Although this behavior has no harm for the program, it will
>> > short-circuit the program. When the users execute "git refs verify" or
>> > "git fsck", we don't want to simply die the program but rather show the
>> > warnings or errors as many as possible to info the users. So, we should
>> > avoiding reading the head info.
>> >
>>
>> This is a bit tricky here. If the information for the `HEAD` ref is
>> incorrect in the packed-refs, git would exit early. Which is what we're
>> trying to avoid in this patch, by using the `get_worktrees_internal()`
>> function.
>>
>
> I think my commit message may confuse you here. The information of the
> "HEAD" ref will never be stored in the "packed-refs", but if we need to
> read the head information, we need to parse the "packed-refs" via
> "create_snapshot" method. Even though the corresponding referent is
> correct (and even if it is not correct, it won't let the program die),
> "create_snapshot" will call "verify_buffer_safe" to check whether there
> is a newline in the last line of the file. If not, it will die.
>
> However, this is a bad thing. For example, if the HEAD points to
> "refs/heads/main", now we need to use the code path from packed-backend,
> we have to call "create_snapshot", the program will die. And we cannot
> tell the user the other faults.
>
> ```packed-refs
> <good_oid> refs/heads/main\n
> <bad_oid> <bad_refname>\n
> <oid> refs/heads/a
> ```
>
> So, the motivation here is that we should not read HEAD at all when we
> are doing consistency checking to make the code totally independent of
> the "create_snapshot" and "next_record".
>
Thanks for clarifying. I understand the point better now.
>> However, I would question if this is the right approach. Shouldn't
>> `get_worktrees()` failing indicate that the repository is invalid? In
>> that case does it really make sense to allow the user to even run `git
>> refs verify`? Isn't the prerequisite for running the `git-refs(1)`
>> command a valid repository?
>>
>
> As I have talked about above, even though the referent of "HEAD" is
> good, "get_worktrees()" will still fail because of some fatal errors in
> "packed-refs" file. I don't think that the repository is invalid in this
> situation.
>
> Put it further more, in what situations, the users want to execute "git
> refs verify" or "git-fsck". From my intuitive thinking, the users will
> execute these check commands when something fails. They want to know
> why. So we should execute these commands when the repository is invalid
> to tell the user what may be wrong. And this is the value of these two
> commands.
>
I agree with your inference here, we should try and figure out as much
as we can and report it, so clients can make informed decisions on how
to fix their refdb/repo. Thanks for explaining.
>
> Thanks,
> Jialuo
* Re: [PATCH 01/10] files-backend: add object check for regular ref
2025-01-05 13:49 ` [PATCH 01/10] files-backend: add object check for regular ref shejialuo
2025-01-07 14:17 ` Karthik Nayak
@ 2025-01-16 13:57 ` Patrick Steinhardt
2025-01-17 13:40 ` shejialuo
1 sibling, 1 reply; 168+ messages in thread
From: Patrick Steinhardt @ 2025-01-16 13:57 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Sun, Jan 05, 2025 at 09:49:09PM +0800, shejialuo wrote:
> Although we use "parse_loose_ref_content" to check whether the object id
> is correct, we never parse it into the "struct object" structure thus we
> ignore checking whether there is a real object existing in the repo and
> whether the object type is correct.
>
> Use "parse_object" to parse the oid for the regular ref content. If the
> object does not exist, report the error to the user by reusing the fsck
> message "BAD_REF_CONTENT".
>
> Then, we need to check the type of the object. Just like "git-fsck(1)",
> we only report "not a commit" error when the ref is a branch. Last,
> update the test to exercise the code.
I wonder whether it wouldn't make more sense to put this into a generic
part of `git refs verify`. This isn't a check for whether the format of
the files backend is correct, but rather a check whether the refdb is
sane. As such, it also applies to the reftable backend.
So should we maybe extend `git refs verify` so that it also knows to
perform generic checks that apply independent of the backend in use?
Patrick
* Re: [PATCH 02/10] builtin/refs.h: get worktrees without reading head info
2025-01-05 13:49 ` [PATCH 02/10] builtin/refs.h: get worktrees without reading head info shejialuo
2025-01-07 14:57 ` Karthik Nayak
@ 2025-01-16 13:57 ` Patrick Steinhardt
1 sibling, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-01-16 13:57 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Sun, Jan 05, 2025 at 09:49:19PM +0800, shejialuo wrote:
The commit subject is a bit funny with "builtin/refs.h:". You probably
wanted to say "builtin/refs:".
Patrick
* Re: [PATCH 03/10] packed-backend: check whether the "packed-refs" is regular
2025-01-05 13:49 ` [PATCH 03/10] packed-backend: check whether the "packed-refs" is regular shejialuo
2025-01-07 16:33 ` Karthik Nayak
@ 2025-01-16 13:57 ` Patrick Steinhardt
1 sibling, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-01-16 13:57 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Sun, Jan 05, 2025 at 09:49:28PM +0800, shejialuo wrote:
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 3406f1e71d..d9eb2f8b71 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -4,6 +4,7 @@
> #include "../config.h"
> #include "../dir.h"
> #include "../gettext.h"
> +#include "../fsck.h"
Let's keep the alphabetic ordering here.
Other than that I have nothing to add on top of what Karthik mentioned
already.
Patrick
* Re: [PATCH 04/10] packed-backend: add "packed-refs" header consistency check
2025-01-05 13:49 ` [PATCH 04/10] packed-backend: add "packed-refs" header consistency check shejialuo
2025-01-08 0:54 ` shejialuo
@ 2025-01-16 13:57 ` Patrick Steinhardt
2025-01-17 14:23 ` shejialuo
2025-02-17 13:16 ` shejialuo
1 sibling, 2 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-01-16 13:57 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Sun, Jan 05, 2025 at 09:49:37PM +0800, shejialuo wrote:
> Add a new flag "safe_object_check" in "fsck_options", when there is
> anything wrong with the parsing process, set this flag to 0 to avoid
> checking objects in the later checks.
Okay, I understand the motivation: a corrupted refdb may be completely
bogus, so checking its objects may not be sensible.
For one of the preceding commits I made the suggestion to split out the
object checks into a generic part instead, as they aren't specific to
the backend. With such a scheme we could adapt the logic to first do the
backend-specific checks for the format, and only in case the backend
looks sane to us we'd execute those generic checks for that specific
backend. That'd allow us to get rid of the "safe object check" flag.
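To illustrate the ordering described above, here is a minimal sketch. It assumes a hypothetical `struct backend_checks` with two callbacks; none of these names exist in Git, they only show how running the format check first makes a separate "safe object check" flag unnecessary.

```c
#include <stddef.h>

/* Hypothetical backend descriptor; these names are not Git's real API. */
struct backend_checks {
	const char *name;
	int (*check_format)(void *data);  /* backend-specific on-disk format */
	int (*check_objects)(void *data); /* generic, backend-independent */
};

/*
 * Run the format check first. Only when it reports no problem do we run
 * the generic object checks, so no extra flag has to be carried around.
 * Returns 0 when everything passed, non-zero otherwise.
 */
static int verify_backend(const struct backend_checks *b, void *data)
{
	int ret = b->check_format(data);

	if (ret)
		return ret;
	return b->check_objects(data);
}

/* Tiny stand-ins used to demonstrate the ordering. */
static int object_checks_run;
static int format_ok(void *data) { (void)data; return 0; }
static int format_bad(void *data) { (void)data; return 1; }
static int count_objects(void *data) { (void)data; object_checks_run++; return 0; }
```

With `format_bad` as the format callback, `count_objects` is never reached, which is the behaviour the flag was meant to provide.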
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index d9eb2f8b71..3b11abe5f8 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1748,12 +1748,100 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> return empty_ref_iterator_begin();
> }
>
> +static int packed_fsck_ref_next_line(struct fsck_options *o,
> + int line_number, const char *start,
> + const char *eof, const char **eol)
> +{
> + int ret = 0;
> +
> + *eol = memchr(start, '\n', eof - start);
> + if (!*eol) {
> + struct strbuf packed_entry = STRBUF_INIT;
> + struct fsck_ref_report report = { 0 };
> +
> + strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
> + report.path = packed_entry.buf;
> + ret = fsck_report_ref(o, &report,
> + FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
> + "'%.*s' is not terminated with a newline",
> + (int)(eof - start), start);
> +
> + /*
> + * There is no newline but we still want to parse it to the end of
> + * the buffer.
> + */
> + *eol = eof;
I don't quite understand. We've figured out that there isn't a newline,
so wouldn't that mean that we _are_ at the end of the buffer already?
> + strbuf_release(&packed_entry);
> + }
> +
> + return ret;
> +}
> +
> +static int packed_fsck_ref_header(struct fsck_options *o, const char *start, const char *eol)
> +{
> + const char *err_fmt = NULL;
> + int fsck_msg_id = -1;
> +
> + if (!starts_with(start, "# pack-refs with:")) {
> + err_fmt = "'%.*s' does not start with '# pack-refs with:'";
> + fsck_msg_id = FSCK_MSG_BAD_PACKED_REF_HEADER;
> + } else if (strncmp(start, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER))) {
> + err_fmt = "'%.*s' is not the official packed-refs header";
I wouldn't say "official", because it could totally be that whatever is
official changes in the future, e.g. when a new format is introduced.
Unlikely to happen, but saying "unknown packed-refs header" might be a
bit more future proof.
> + fsck_msg_id = FSCK_MSG_UNKNOWN_PACKED_REF_HEADER;
> + }
> +
> + if (err_fmt && fsck_msg_id >= 0) {
> + struct fsck_ref_report report = { 0 };
> + report.path = "packed-refs.header";
> +
> + return fsck_report_ref(o, &report, fsck_msg_id, err_fmt,
> + (int)(eol - start), start);
> +
> + }
> +
> + return 0;
> +}
> +
> +static int packed_fsck_ref_content(struct fsck_options *o,
> + const char *start, const char *eof)
> +{
> + int line_number = 1;
> + const char *eol;
> + int ret = 0;
> +
> + ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
> + if (*start == '#') {
> + ret |= packed_fsck_ref_header(o, start, eol);
> +
> + start = eol + 1;
> + line_number++;
The header can only appear at the beginning of the file, can't it? But
we accept it in every line here. We should likely verify that it's
actually a header and not a line at some random place.
> + } else {
> + struct fsck_ref_report report = { 0 };
> + report.path = "packed-refs";
> +
> + ret |= fsck_report_ref(o, &report,
> + FSCK_MSG_PACKED_REF_MISSING_HEADER,
> + "missing header line");
> + }
> +
> + /*
> + * If there is anything wrong during the parsing of the "packed-refs"
> + * file, we should not check the object of the refs.
> + */
> + if (ret)
> + o->safe_object_check = 0;
> +
> +
> + return ret;
> +}
> +
> static int packed_fsck(struct ref_store *ref_store,
> struct fsck_options *o,
> struct worktree *wt)
> {
> struct packed_ref_store *refs = packed_downcast(ref_store,
> REF_STORE_READ, "fsck");
> + struct strbuf packed_ref_content = STRBUF_INIT;
> struct stat st;
> int ret = 0;
>
> @@ -1779,7 +1867,24 @@ static int packed_fsck(struct ref_store *ref_store,
> goto cleanup;
> }
>
> + if (strbuf_read_file(&packed_ref_content, refs->path, 0) < 0) {
> + /*
> + * Although we have checked that the file exists, there is a possibility
> + * that it has been removed between the lstat() and the read attempt by
> + * another process. In that case, we should not report an error.
> + */
> + if (errno == ENOENT)
> + goto cleanup;
Unlikely, but good to guard us against that condition regardless. It's
still not entirely race-free though because the file could meanwhile
have changed into a symlink, and we wouldn't notice now. We could fix
that by using open(O_NOFOLLOW), calling fstat() on the returned file
descriptor, and then using `strbuf_read()` to slurp in the file.
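As a rough sketch of that approach, assuming plain POSIX calls instead of Git's strbuf helpers: the function name `read_regular_file_nofollow` is made up for illustration, and the errno reported for a symlink is typically ELOOP but can differ between platforms.

```c
/* Expose O_NOFOLLOW on strict compilers. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read a regular file without following a symlink at the final path
 * component. On success, returns a malloc'd, NUL-terminated buffer and
 * stores its length in *len. On failure, returns NULL with errno set.
 */
static char *read_regular_file_nofollow(const char *path, size_t *len)
{
	struct stat st;
	char *buf = NULL;
	size_t used = 0, alloc = 0;
	int saved_errno;
	int fd = open(path, O_RDONLY | O_NOFOLLOW);

	if (fd < 0)
		return NULL;
	/* fstat() inspects the file we actually opened, not the path. */
	if (fstat(fd, &st) < 0)
		goto fail;
	if (!S_ISREG(st.st_mode)) {
		errno = EINVAL;
		goto fail;
	}
	for (;;) {
		ssize_t n;

		if (alloc - used < 4097) {
			char *tmp;

			alloc = alloc ? alloc * 2 : 8192;
			tmp = realloc(buf, alloc);
			if (!tmp)
				goto fail;
			buf = tmp;
		}
		/* Leave one byte free for the trailing NUL. */
		n = read(fd, buf + used, alloc - used - 1);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			goto fail;
		}
		if (!n)
			break;
		used += (size_t)n;
	}
	buf[used] = '\0';
	*len = used;
	close(fd);
	return buf;

fail:
	saved_errno = errno;
	free(buf);
	close(fd);
	errno = saved_errno;
	return NULL;
}
```

Because the type check happens on the open file descriptor, a later rename or replacement of the path cannot change what is being read.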
> + ret = error_errno("could not read %s", refs->path);
> + goto cleanup;
> + }
> +
> + ret = packed_fsck_ref_content(o, packed_ref_content.buf,
> + packed_ref_content.buf + packed_ref_content.len);
> +
> cleanup:
> + strbuf_release(&packed_ref_content);
> return ret;
> }
>
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index 307f94a3ca..6c729e749a 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -646,4 +646,48 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
> test_cmp expect err
> '
>
> +test_expect_success 'packed-refs header should be checked' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + cd repo &&
The same comment applies here as on a preceding test: cd should be
executed in a subshell.
Patrick
* Re: [PATCH 05/10] packed-backend: check whether the refname contains NULL binaries
2025-01-05 13:49 ` [PATCH 05/10] packed-backend: check whether the refname contains NULL binaries shejialuo
@ 2025-01-16 13:57 ` Patrick Steinhardt
2025-01-17 14:33 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Patrick Steinhardt @ 2025-01-16 13:57 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Sun, Jan 05, 2025 at 09:49:51PM +0800, shejialuo wrote:
> We have already implemented the header consistency check for the raw
> "packed-refs" file. Before we implement the consistency check for each
> ref entry, let's analyze [1], which reports that "git fsck" cannot
> detect some binary zeros.
>
> "packed-backend.c::next_record" will use "check_refname_format" to check
> the consistency of the refname. If it is not OK, the program will die.
> So, we already have the code path and we must be missing something.
>
> We use the following code to get the refname:
>
> strbuf_add(&iter->refname_buf, p, eol - p);
> iter->base.refname = iter->refname_buf.buf
>
> In the above code, `p` is the start pointer of the refname and `eol` is
> the next newline pointer. We calculate the length of the refname by
> subtracting the two pointers. Then we add the memory range between `p`
> and `eol` to get the refname.
>
> However, if there are some NULL binaries in the memory range between `p`
You probably mean NUL characters, not NULL binaries?
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 3b11abe5f8..f6142a4402 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -493,6 +493,23 @@ static void verify_buffer_safe(struct snapshot *snapshot)
> last_line, eof - last_line);
> }
>
> +/*
> + * When parsing the "packed-refs" file, we will parse it line by line.
> + * Because we know the start pointer of the refname and the next
> + * newline pointer, we could calculate the length of the refname by
> + * subtracting the two pointers. However, there is a corner case where
> + * the refname contains corrupted embedded NULL binaries. And
> + * `check_refname_format()` will not catch this when the truncated
> + * refname is still a valid refname. To prevent this, we need to check
> + * whether the refname contains the NULL binaries.
> + */
> +static int refname_contains_null(struct strbuf refname)
> +{
> + if (refname.len != strlen(refname.buf))
> + return 1;
> + return 0;
> +}
> +
> #define SMALL_FILE_SIZE (32*1024)
>
> /*
> @@ -894,6 +911,9 @@ static int next_record(struct packed_ref_iterator *iter)
> strbuf_add(&iter->refname_buf, p, eol - p);
> iter->base.refname = iter->refname_buf.buf;
>
> + if (refname_contains_null(iter->refname_buf))
We can replace this with `memchr(iter->refname_buf.buf, '\0',
iter->refname_buf.len)`, which should be more efficient than using
strlen(3p).
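A minimal sketch of the memchr(3) variant, written against a plain pointer and length rather than Git's `struct strbuf` (the helper name `contains_nul` is only illustrative):

```c
#include <string.h>

/*
 * Return 1 when the first `len` bytes of `buf` contain a NUL byte.
 * memchr() stops at the first match and never reads past `len`.
 */
static int contains_nul(const char *buf, size_t len)
{
	return memchr(buf, '\0', len) != NULL;
}
```

Taking a pointer and a length also avoids copying the whole struct by value, which the helper in the patch does.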
> + die("packed refname contains embedded NULL: %s", iter->base.refname);
> +
I was a bit surprised to find that we modify the way that we read refs
from the packed-refs file instead of adapting the fsck code. But I think
this check is sensible.
Patrick
* Re: [PATCH 06/10] packed-backend: add "packed-refs" entry consistency check
2025-01-05 13:49 ` [PATCH 06/10] packed-backend: add "packed-refs" entry consistency check shejialuo
@ 2025-01-16 13:57 ` Patrick Steinhardt
2025-01-17 14:35 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Patrick Steinhardt @ 2025-01-16 13:57 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Sun, Jan 05, 2025 at 09:49:59PM +0800, shejialuo wrote:
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index f6142a4402..6e521a9f87 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1822,7 +1822,96 @@ static int packed_fsck_ref_header(struct fsck_options *o, const char *start, con
> return 0;
> }
>
> +static int packed_fsck_ref_peeled_line(struct fsck_options *o,
> + struct ref_store *ref_store, int line_number,
> + const char *start, const char *eol)
> +{
> + struct strbuf peeled_entry = STRBUF_INIT;
> + struct fsck_ref_report report = { 0 };
> + struct object_id peeled;
> + const char *p;
> + int ret = 0;
> +
> + strbuf_addf(&peeled_entry, "packed-refs line %d", line_number);
> + report.path = peeled_entry.buf;
> +
> + start++;
> + if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo)) {
> + ret |= fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_PACKED_REF_ENTRY,
> + "'%.*s' has invalid peeled oid",
> + (int)(eol - start), start);
> + goto cleanup;
> + }
> +
> + if (p != eol) {
> + ret |= fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_PACKED_REF_ENTRY,
> + "has trailing garbage after peeled oid '%.*s'",
> + (int)(eol - p), p);
> + goto cleanup;
> + }
> +
> +cleanup:
> + strbuf_release(&peeled_entry);
> + return ret;
> +}
> +
> +static int packed_fsck_ref_main_line(struct fsck_options *o,
> + struct ref_store *ref_store, int line_number,
> + const char *start, const char *eol)
> +{
> + struct strbuf packed_entry = STRBUF_INIT;
> + struct fsck_ref_report report = { 0 };
> + struct strbuf refname = STRBUF_INIT;
It feels quite inefficient to create a separate buffer for every
invocation of this function, as there can be many million refs in a
repo. Might be something to avoid by passing in a scratch buffer.
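One possible shape for this, sketched with a small stand-in for a growable buffer (in Git itself this would presumably be a `struct strbuf` that the caller resets between lines; `struct scratch`, `scratch_set` and `check_line` are hypothetical names):

```c
#include <stdlib.h>
#include <string.h>

/* Minimal growable buffer standing in for Git's strbuf (hypothetical). */
struct scratch {
	char *buf;
	size_t len, alloc;
};

/* Overwrite the buffer with `len` bytes from `src`, reusing the allocation. */
static int scratch_set(struct scratch *s, const char *src, size_t len)
{
	if (len + 1 > s->alloc) {
		char *tmp = realloc(s->buf, len + 1);

		if (!tmp)
			return -1;
		s->buf = tmp;
		s->alloc = len + 1;
	}
	memcpy(s->buf, src, len);
	s->buf[len] = '\0';
	s->len = len;
	return 0;
}

/* Per-line check that borrows the caller's scratch buffer. */
static int check_line(struct scratch *s, const char *line, size_t len)
{
	if (scratch_set(s, line, len) < 0)
		return -1;
	return s->len ? 0 : 1; /* placeholder check: flag empty lines */
}
```

The caller allocates one `struct scratch` before the loop over all lines and frees it once afterwards, so the allocation count no longer grows with the number of refs.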
Patrick
* Re: [PATCH 07/10] packed-backend: create "fsck_packed_ref_entry" to store parsing info
2025-01-05 13:50 ` [PATCH 07/10] packed-backend: create "fsck_packed_ref_entry" to store parsing info shejialuo
@ 2025-01-16 13:57 ` Patrick Steinhardt
0 siblings, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-01-16 13:57 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Sun, Jan 05, 2025 at 09:50:10PM +0800, shejialuo wrote:
> We already check whether the oid hash is correct by using
> `parse_oid_hex_algop`. However, we don't check whether the object
> exists. It may seem that we could do this when we are parsing the raw
> "packed-refs" file. But this is impossible. Let's analyze why.
>
> We will use "parse_object" function to get the "struct object". However,
> this function will eventually call the "create_snapshot" and
> "next_record" function in "packed-backend.c". If there is anything
> wrong, the program will die. And we don't want the program to die
> during the check.
>
> So, we should store the information in the parsing process. And if there
> is nothing wrong in the parsing process, we could continue to check
> things. So, create "fsck_packed_ref_entry" to do this.
This step can be avoided if we made the check generic.
Patrick
* Re: [PATCH 08/10] packed-backend: add check for object consistency
2025-01-05 13:50 ` [PATCH 08/10] packed-backend: add check for object consistency shejialuo
@ 2025-01-16 13:57 ` Patrick Steinhardt
0 siblings, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-01-16 13:57 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Sun, Jan 05, 2025 at 09:50:19PM +0800, shejialuo wrote:
> If there is nothing wrong when parsing the raw file "packed-refs", we
> could then iterate the "entries" to check the object consistency. There
> are two kinds of ref entry: one is the normal and another is peeled. For
> both situations, we need to use "parse_object" function to parse the
> object id to get the object. If the object does not exist, we will
> report an error to the user.
>
> Create a new function "packed_fsck_ref_oid" to do above then update the
> unit test to exercise the code.
This one, as well.
Patrick
* Re: [PATCH 09/10] packed-backend: check whether the "packed-refs" is sorted
2025-01-05 13:50 ` [PATCH 09/10] packed-backend: check whether the "packed-refs" is sorted shejialuo
@ 2025-01-16 13:57 ` Patrick Steinhardt
0 siblings, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-01-16 13:57 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Sun, Jan 05, 2025 at 09:50:31PM +0800, shejialuo wrote:
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index d83ce2838f..df65fec5a5 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1980,6 +1989,50 @@ static int packed_fsck_ref_oid(struct fsck_options *o, struct ref_store *ref_sto
> return ret;
> }
>
> +static int packed_fsck_ref_sorted(struct fsck_options *o,
> + struct ref_store *ref_store,
> + struct fsck_packed_ref_entry **entries,
> + int nr)
> +{
> + size_t hexsz = ref_store->repo->hash_algo->hexsz;
> + struct strbuf packed_entry = STRBUF_INIT;
> + struct fsck_ref_report report = { 0 };
> + struct strbuf refname1 = STRBUF_INIT;
> + struct strbuf refname2 = STRBUF_INIT;
> + int ret = 0;
> +
> + for (int i = 1; i < nr; i++) {
> + const char *r1 = entries[i - 1]->record.start + hexsz + 1;
> + const char *r2 = entries[i]->record.start + hexsz + 1;
> +
> + if (cmp_packed_refname(r1, r2) >= 0) {
Makes sense. This was a source of bugs a couple of years ago, and it can
silently make you receive wrong results, so this is quite a sensible
check to have.
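For illustration, the adjacent-pair comparison can be sketched on NUL-terminated refnames with strcmp(3). This is a simplification: the patch compares newline-terminated records via `cmp_packed_refname`, and `first_unsorted` is a made-up name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the first refname that is not strictly greater than
 * its predecessor, or 0 when the whole array is sorted without duplicates.
 * Index 0 has no predecessor, so it can double as the "all good" value.
 */
static size_t first_unsorted(const char **names, size_t nr)
{
	for (size_t i = 1; i < nr; i++)
		if (strcmp(names[i - 1], names[i]) >= 0)
			return i;
	return 0;
}
```

Using `>= 0` rather than `> 0` means that a duplicated refname is reported as well, matching the comparison in the patch.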
Patrick
* Re: [PATCH 01/10] files-backend: add object check for regular ref
2025-01-16 13:57 ` Patrick Steinhardt
@ 2025-01-17 13:40 ` shejialuo
2025-01-24 7:54 ` Patrick Steinhardt
0 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-17 13:40 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Thu, Jan 16, 2025 at 02:57:25PM +0100, Patrick Steinhardt wrote:
> On Sun, Jan 05, 2025 at 09:49:09PM +0800, shejialuo wrote:
> > Although we use "parse_loose_ref_content" to check whether the object id
> > is correct, we never parse it into the "struct object" structure thus we
> > ignore checking whether there is a real object existing in the repo and
> > whether the object type is correct.
> >
> > Use "parse_object" to parse the oid for the regular ref content. If the
> > object does not exist, report the error to the user by reusing the fsck
> > message "BAD_REF_CONTENT".
> >
> > Then, we need to check the type of the object. Just like "git-fsck(1)",
> > we only report "not a commit" error when the ref is a branch. Last,
> > update the test to exercise the code.
>
> I wonder whether it wouldn't make more sense to put this into a generic
> part of `git refs verify`. This isn't a check for whether the format of
> the files backend is correct, but rather a check whether the refdb is
> sane. As such, it also applies to the reftable backend.
>
> So should we maybe extend `git refs verify` so that it also knows to
> perform generic checks that apply independent of the backend in use?
>
I think I understand what you mean: after checking the format of the
ref files, we could use the internal ref methods to parse the oid. That
way, the two kinds of checks would be completely separate.
However, once we have parsed the raw ref files, we can reuse the parsed
hex and call "parse_object" to look up the object. This is the main
reason why I added this check here.
That said, I agree with you. We could even put this into the object
check part, because "git-fsck(1)" already parses the refdb to find out
whether an object is dangling.
I will postpone these checks to later patches. Thanks a lot for this
great suggestion.
Thanks,
Jialuo
* Re: [PATCH 03/10] packed-backend: check whether the "packed-refs" is regular
2025-01-07 16:33 ` Karthik Nayak
@ 2025-01-17 14:00 ` shejialuo
2025-01-17 22:01 ` Eric Sunshine
0 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-17 14:00 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, Patrick Steinhardt, Junio C Hamano, Michael Haggerty
On Tue, Jan 07, 2025 at 08:33:56AM -0800, Karthik Nayak wrote:
> shejialuo <shejialuo@gmail.com> writes:
[snip]
> > -static int packed_fsck(struct ref_store *ref_store UNUSED,
> > - struct fsck_options *o UNUSED,
> > +static int packed_fsck(struct ref_store *ref_store,
> > + struct fsck_options *o,
> > struct worktree *wt)
> > {
> > + struct packed_ref_store *refs = packed_downcast(ref_store,
> > + REF_STORE_READ, "fsck");
> > + struct stat st;
> > + int ret = 0;
> >
> > if (!is_main_worktree(wt))
> > - return 0;
> > + goto cleanup;
> >
> > - return 0;
> > + /*
> > + * If the packed-refs file doesn't exist, there's nothing to
> > + * check.
> > + */
> > + if (lstat(refs->path, &st) < 0)
> > + goto cleanup;
>
> Since `lstat` return '-1' for all errors, we should check that the
> `errno == ENOENT`.
>
I agree. If lstat() fails with anything other than ENOENT, we should
report an error to the user.
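A small sketch of that distinction, using only lstat(2) and errno (the helper name `classify_path` and its return codes are assumptions made for this example):

```c
#include <errno.h>
#include <sys/stat.h>

/*
 * Classify a path before checking it:
 *    0 -> the file does not exist, nothing to check
 *    1 -> it exists and is a regular file
 *    2 -> it exists but is not a regular file (e.g. a symlink)
 *   -1 -> lstat() failed for a reason other than ENOENT
 */
static int classify_path(const char *path)
{
	struct stat st;

	if (lstat(path, &st) < 0)
		return errno == ENOENT ? 0 : -1;
	return S_ISREG(st.st_mode) ? 1 : 2;
}
```

Only the `0` case should be silent; `-1` would become an error message and `2` an fsck report about the file type.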
[snip]
> > --- a/t/t0602-reffiles-fsck.sh
> > +++ b/t/t0602-reffiles-fsck.sh
> > @@ -626,4 +626,24 @@ test_expect_success 'ref content checks should work with worktrees' '
> > test_cmp expect err
> > '
> >
> > +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
> > + test_when_finished "rm -rf repo" &&
> > + git init repo &&
> > + cd repo &&
>
> This should be in a subshell, so that at the end we can actually remove
> the repo. This seems to be applicable to most of the other tests in this
> file too. Perhaps, we should clean it up as a precursor commit to this
> series?
I have looked at how "test_when_finished" is used, but I still don't see
why we need a subshell. Could you please explain this further?
>
> > + test_commit default &&
> > + git branch branch-1 &&
> > + git branch branch-2 &&
> > + git branch branch-3 &&
> > + git pack-refs --all &&
> > +
> > + mv .git/packed-refs .git/packed-refs-back &&
> > > + ln -sf packed-refs-back .git/packed-refs &&
>
> This should be `ln -sf .git/packed-refs-back .git/packed-refs` no?
>
No, it should not be `ln -sf .git/packed-refs-back .git/packed-refs`,
because this is a relative symlink. A relative target is resolved from
the directory that contains the link, and ".git/packed-refs-back" and
".git/packed-refs" are in the same directory. So, from the perspective
of ".git/packed-refs", the target is just "packed-refs-back".
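The same rule can be demonstrated with symlink(2) directly. This is only a sketch: `link_resolves` is a hypothetical helper, and it expects the caller to have created `<dir>/packed-refs-back` beforehand.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create "<dir>/packed-refs" as a symlink whose stored target is `target`,
 * then report whether the link resolves. Returns 1 if it does, 0 if it is
 * dangling, and -1 if the link could not be created.
 */
static int link_resolves(const char *dir, const char *target)
{
	char link_path[4096];
	struct stat st;

	snprintf(link_path, sizeof(link_path), "%s/packed-refs", dir);
	unlink(link_path); /* like "ln -sf": replace an existing link */
	if (symlink(target, link_path) < 0)
		return -1;
	/* stat() follows the link; a relative target is resolved from `dir`. */
	return stat(link_path, &st) == 0;
}
```

With the target "packed-refs-back" the link resolves, whereas ".git/packed-refs-back" would be looked up as `<dir>/.git/packed-refs-back` and leave the link dangling.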
Thanks,
Jialuo
* Re: [PATCH 04/10] packed-backend: add "packed-refs" header consistency check
2025-01-16 13:57 ` Patrick Steinhardt
@ 2025-01-17 14:23 ` shejialuo
2025-01-24 7:51 ` Patrick Steinhardt
2025-02-17 13:16 ` shejialuo
1 sibling, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-17 14:23 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Thu, Jan 16, 2025 at 02:57:37PM +0100, Patrick Steinhardt wrote:
> On Sun, Jan 05, 2025 at 09:49:37PM +0800, shejialuo wrote:
> > Add a new flag "safe_object_check" in "fsck_options", when there is
> > anything wrong with the parsing process, set this flag to 0 to avoid
> > checking objects in the later checks.
>
> Okay, I understand the motivation: a corrupted refdb may be completely
> bogus, so checking its objects may not be sensible.
>
> For one of the preceding commits I made the suggestion to split out the
> object checks into a generic part instead, as they aren't specific to
> the backend. With such a scheme we could adapt the logic to first do the
> backend-specific checks for the format, and only in case the backend
> looks sane to us we'd execute those generic checks for that specific
> backend. That'd allow us to get rid of the "safe object check" flag.
>
Yes, I agree. I will leave the object checks out of the next version and
let this series concentrate on the "packed-refs" format.
> > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > index d9eb2f8b71..3b11abe5f8 100644
> > --- a/refs/packed-backend.c
> > +++ b/refs/packed-backend.c
> > @@ -1748,12 +1748,100 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> > return empty_ref_iterator_begin();
> > }
> >
> > +static int packed_fsck_ref_next_line(struct fsck_options *o,
> > + int line_number, const char *start,
> > + const char *eof, const char **eol)
> > +{
> > + int ret = 0;
> > +
> > + *eol = memchr(start, '\n', eof - start);
> > + if (!*eol) {
> > + struct strbuf packed_entry = STRBUF_INIT;
> > + struct fsck_ref_report report = { 0 };
> > +
> > + strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
> > + report.path = packed_entry.buf;
> > + ret = fsck_report_ref(o, &report,
> > + FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
> > + "'%.*s' is not terminated with a newline",
> > + (int)(eof - start), start);
> > +
> > + /*
> > + * There is no newline but we still want to parse it to the end of
> > + * the buffer.
> > + */
> > + *eol = eof;
>
> I don't quite understand. We've figured out that there isn't a newline,
> so wouldn't that mean that we _are_ at the end of the buffer already?
>
In the "packed-refs" file, the last line should end with a newline; if
it doesn't, that is a fatal error. I do this so that for every line we
can pass "line_start" and "eol" to the checks. Without a newline, "eol"
would be NULL, so I set it to "eof" to let the last line follow the same
logic as a line where "eol" is not NULL.
I guess handling this inside this function is what causes the confusion.
I will improve this in the next version.
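A stripped-down sketch of that logic, without the fsck reporting (the name `next_line` and the 0/1 return convention are assumptions for this example):

```c
#include <string.h>

/*
 * Find the end of the line starting at `start` in the buffer that ends at
 * `eof` (exclusive). Stores the line end in *eol and returns 1 when the
 * line is newline-terminated. When no newline is left, *eol is set to
 * `eof` and 0 is returned, so callers can still treat [start, *eol) as a
 * line.
 */
static int next_line(const char *start, const char *eof, const char **eol)
{
	*eol = memchr(start, '\n', (size_t)(eof - start));
	if (*eol)
		return 1;
	*eol = eof;
	return 0;
}
```

Note that `start` is usually not equal to `eof` in the unterminated case: the last line still has content, it only lacks the trailing newline.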
> > + strbuf_release(&packed_entry);
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +static int packed_fsck_ref_header(struct fsck_options *o, const char *start, const char *eol)
> > +{
> > + const char *err_fmt = NULL;
> > + int fsck_msg_id = -1;
> > +
> > + if (!starts_with(start, "# pack-refs with:")) {
> > + err_fmt = "'%.*s' does not start with '# pack-refs with:'";
> > + fsck_msg_id = FSCK_MSG_BAD_PACKED_REF_HEADER;
> > + } else if (strncmp(start, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER))) {
> > + err_fmt = "'%.*s' is not the official packed-refs header";
>
> I wouldn't say "official", because it could totally be that whatever is
> official changes in the future, e.g. when a new format is introduced.
> Unlikely to happen, but saying "unknown packed-refs header" might be a
> bit more future proof.
>
I will improve this in the next version.
> > + fsck_msg_id = FSCK_MSG_UNKNOWN_PACKED_REF_HEADER;
> > + }
> > +
> > + if (err_fmt && fsck_msg_id >= 0) {
> > + struct fsck_ref_report report = { 0 };
> > + report.path = "packed-refs.header";
> > +
> > + return fsck_report_ref(o, &report, fsck_msg_id, err_fmt,
> > + (int)(eol - start), start);
> > +
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static int packed_fsck_ref_content(struct fsck_options *o,
> > + const char *start, const char *eof)
> > +{
> > + int line_number = 1;
> > + const char *eol;
> > + int ret = 0;
> > +
> > + ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
> > + if (*start == '#') {
> > + ret |= packed_fsck_ref_header(o, start, eol);
> > +
> > + start = eol + 1;
> > + line_number++;
>
> The header can only appear at the beginning of the file, can't it? But
> we accept it in every line here. We should likely verify that it's
> actually a header and not a line at some random place.
>
Yes, but we don't accept it in every line. Here,
"packed_fsck_ref_next_line" gives us "start" and "eol" of the first line
only, and we check the header consistency only if that line starts with
"#".
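To make the first-line-only behaviour concrete, here is a simplified sketch that classifies just the first line of the buffer. The prefix string is taken from the patch; the function name and return codes are invented for illustration, and the full-header comparison against `PACKED_REFS_HEADER` is left out.

```c
#include <string.h>

/*
 * Classify the first line of a packed-refs buffer:
 *   0 -> the line starts with the expected header prefix
 *   1 -> the line starts with '#' but not with the expected prefix
 *   2 -> the line does not start with '#', so the header is missing
 */
static int classify_first_line(const char *start, const char *eol)
{
	static const char prefix[] = "# pack-refs with:";
	size_t len = (size_t)(eol - start);

	if (!len || *start != '#')
		return 2;
	if (len < sizeof(prefix) - 1 ||
	    memcmp(start, prefix, sizeof(prefix) - 1))
		return 1;
	return 0;
}
```

Any later line that starts with "#" never reaches this function, so it would have to be rejected by the per-entry checks instead.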
> > + } else {
> > + struct fsck_ref_report report = { 0 };
> > + report.path = "packed-refs";
> > +
> > + ret |= fsck_report_ref(o, &report,
> > + FSCK_MSG_PACKED_REF_MISSING_HEADER,
> > + "missing header line");
> > + }
> > +
> > + /*
> > + * If there is anything wrong during the parsing of the "packed-refs"
> > + * file, we should not check the object of the refs.
> > + */
> > + if (ret)
> > + o->safe_object_check = 0;
> > +
> > +
> > + return ret;
> > +}
> > +
> > static int packed_fsck(struct ref_store *ref_store,
> > struct fsck_options *o,
> > struct worktree *wt)
> > {
> > struct packed_ref_store *refs = packed_downcast(ref_store,
> > REF_STORE_READ, "fsck");
> > + struct strbuf packed_ref_content = STRBUF_INIT;
> > struct stat st;
> > int ret = 0;
> >
> > @@ -1779,7 +1867,24 @@ static int packed_fsck(struct ref_store *ref_store,
> > goto cleanup;
> > }
> >
> > + if (strbuf_read_file(&packed_ref_content, refs->path, 0) < 0) {
> > + /*
> > + * Although we have checked that the file exists, there is a possibility
> > + * that it has been removed between the lstat() and the read attempt by
> > + * another process. In that case, we should not report an error.
> > + */
> > + if (errno == ENOENT)
> > + goto cleanup;
>
> Unlikely, but good to guard us against that condition regardless. It's
> still not entirely race-free though because the file could meanwhile
> have changed into a symlink, and we wouldn't notice now. We could fix
> that by using open(O_NOFOLLOW), fstat the returned file descriptor and
> then use `strbuf_read()` to slurp in the file.
>
Wouldn't it be too complicated to avoid this race? We would need a lot
of code to handle the logic above. And even then, the file could be
changed into a symlink right after we have finished reading its content
into memory, and we would not notice. So I'd say we cannot avoid races
entirely. It would be good to avoid them, but my concern is that this
would make the logic too complicated. Could we leave this unchanged?
* Re: [PATCH 05/10] packed-backend: check whether the refname contains NULL binaries
2025-01-16 13:57 ` Patrick Steinhardt
@ 2025-01-17 14:33 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-01-17 14:33 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Thu, Jan 16, 2025 at 02:57:40PM +0100, Patrick Steinhardt wrote:
> On Sun, Jan 05, 2025 at 09:49:51PM +0800, shejialuo wrote:
> > We have already implemented the header consistency check for the raw
> > "packed-refs" file. Before we implement the consistency check for each
> > ref entry, let's analyze [1], which reports that "git fsck" cannot
> > detect some binary zeros.
> >
> > "packed-backend.c::next_record" will use "check_refname_format" to check
> > the consistency of the refname. If it is not OK, the program will die.
> > So, we already have the code path and we must be missing something.
> >
> > We use the following code to get the refname:
> >
> > strbuf_add(&iter->refname_buf, p, eol - p);
> > iter->base.refname = iter->refname_buf.buf
> >
> > In the above code, `p` is the start pointer of the refname and `eol` is
> > the next newline pointer. We calculate the length of the refname by
> > subtracting the two pointers. Then we add the memory range between `p`
> > and `eol` to get the refname.
> >
> > However, if there are some NULL binaries in the memory range between `p`
>
> You probably mean NUL characters, not NULL binaries?
>
Yes, I will improve this in the next version.
> > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > index 3b11abe5f8..f6142a4402 100644
> > --- a/refs/packed-backend.c
> > +++ b/refs/packed-backend.c
> > @@ -493,6 +493,23 @@ static void verify_buffer_safe(struct snapshot *snapshot)
> > last_line, eof - last_line);
> > }
> >
> > +/*
> > + * When parsing the "packed-refs" file, we will parse it line by line.
> > + * Because we know the start pointer of the refname and the next
> > + * newline pointer, we could calculate the length of the refname by
> > + * subtracting the two pointers. However, there is a corner case where
> > + * the refname contains corrupted embedded NULL binaries. And
> > + * `check_refname_format()` will not catch this when the truncated
> > + * refname is still a valid refname. To prevent this, we need to check
> > + * whether the refname contains the NULL binaries.
> > + */
> > +static int refname_contains_null(struct strbuf refname)
> > +{
> > + if (refname.len != strlen(refname.buf))
> > + return 1;
> > + return 0;
> > +}
> > +
> > #define SMALL_FILE_SIZE (32*1024)
> >
> > /*
> > @@ -894,6 +911,9 @@ static int next_record(struct packed_ref_iterator *iter)
> > strbuf_add(&iter->refname_buf, p, eol - p);
> > iter->base.refname = iter->refname_buf.buf;
> >
> > + if (refname_contains_null(iter->refname_buf))
>
> We can replace this with `memchr(iter->refname_buf.buf, '\0',
> iter->refname_buf.len)`, which should be more efficient than using
> strlen(3p).
Thanks for the suggestion. Will improve this in the next version.
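For reference, the memchr(3)-based check could look roughly like the
sketch below. This is only an illustration written against plain
pointers and lengths; the helper name `contains_embedded_nul` is made up
here and is not the name used in the patch or in git.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): return 1 if any of the `len`
 * bytes starting at `buf` is a NUL byte, 0 otherwise.
 *
 * Unlike comparing `len` against strlen(buf), this never reads past
 * `len` bytes and does not need the buffer to be NUL-terminated.
 */
static int contains_embedded_nul(const char *buf, size_t len)
{
	return memchr(buf, '\0', len) != NULL;
}
```

With a strbuf, the call site would presumably pass `refname_buf.buf` and
`refname_buf.len`, as in the suggestion quoted above.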
>
> > + die("packed refname contains embedded NULL: %s", iter->base.refname);
> > +
>
> I was a bit surprised to find that we modify the way that we read refs
> from the packed-refs file instead of adapting the fsck code. But I think
> this check is sensible.
Actually, I was surprised too, and this turned out to be quite
interesting. After I had implemented all of the fsck code, I found that
I still could not detect the error reported in [1], which is the
motivation for adding explicit ref checks in the first place.
So I dug into the code to fix this problem. The reason I put the check
here is that we are going to implement the fsck checks the same way
"next_record" does.
[1] https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH 06/10] packed-backend: add "packed-refs" entry consistency check
2025-01-16 13:57 ` Patrick Steinhardt
@ 2025-01-17 14:35 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-01-17 14:35 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Thu, Jan 16, 2025 at 02:57:43PM +0100, Patrick Steinhardt wrote:
> On Sun, Jan 05, 2025 at 09:49:59PM +0800, shejialuo wrote:
> > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > index f6142a4402..6e521a9f87 100644
> > --- a/refs/packed-backend.c
> > +++ b/refs/packed-backend.c
> > @@ -1822,7 +1822,96 @@ static int packed_fsck_ref_header(struct fsck_options *o, const char *start, con
> > return 0;
> > }
> >
> > +static int packed_fsck_ref_peeled_line(struct fsck_options *o,
> > + struct ref_store *ref_store, int line_number,
> > + const char *start, const char *eol)
> > +{
> > + struct strbuf peeled_entry = STRBUF_INIT;
> > + struct fsck_ref_report report = { 0 };
> > + struct object_id peeled;
> > + const char *p;
> > + int ret = 0;
> > +
> > + strbuf_addf(&peeled_entry, "packed-refs line %d", line_number);
> > + report.path = peeled_entry.buf;
> > +
> > + start++;
> > + if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo)) {
> > + ret |= fsck_report_ref(o, &report,
> > + FSCK_MSG_BAD_PACKED_REF_ENTRY,
> > + "'%.*s' has invalid peeled oid",
> > + (int)(eol - start), start);
> > + goto cleanup;
> > + }
> > +
> > + if (p != eol) {
> > + ret |= fsck_report_ref(o, &report,
> > + FSCK_MSG_BAD_PACKED_REF_ENTRY,
> > + "has trailing garbage after peeled oid '%.*s'",
> > + (int)(eol - p), p);
> > + goto cleanup;
> > + }
> > +
> > +cleanup:
> > + strbuf_release(&peeled_entry);
> > + return ret;
> > +}
> > +
> > +static int packed_fsck_ref_main_line(struct fsck_options *o,
> > + struct ref_store *ref_store, int line_number,
> > + const char *start, const char *eol)
> > +{
> > + struct strbuf packed_entry = STRBUF_INIT;
> > + struct fsck_ref_report report = { 0 };
> > + struct strbuf refname = STRBUF_INIT;
>
> It feels quite inefficient to create a separate buffer for every
> invocation of this function, as there can be many million refs in a
> repo. Might be something to avoid by passing in a scratch buffer.
>
I see. I will improve this in the next version.
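To make the scratch-buffer idea concrete, here is a small stand-alone
sketch in plain C. The `struct scratch` type and `scratch_set()` are
hypothetical names used only for illustration; inside git the same
effect would presumably come from passing one `struct strbuf` down from
the caller and calling `strbuf_reset()` followed by `strbuf_add()` for
each line.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical reusable buffer, owned by the caller of the per-line check. */
struct scratch {
	char *buf;
	size_t alloc;
};

/*
 * Copy `len` bytes from `p` into the scratch buffer and NUL-terminate
 * them. The allocation only grows when a longer entry shows up, so
 * checking millions of refs does not mean millions of allocations.
 * Returns the buffer, or NULL if growing it failed.
 */
static const char *scratch_set(struct scratch *s, const char *p, size_t len)
{
	if (len + 1 > s->alloc) {
		char *tmp = realloc(s->buf, len + 1);
		if (!tmp)
			return NULL;
		s->buf = tmp;
		s->alloc = len + 1;
	}
	memcpy(s->buf, p, len);
	s->buf[len] = '\0';
	return s->buf;
}
```

The caller would initialize the struct to zero once before the loop over
all lines and free `buf` once after it.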
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH 03/10] packed-backend: check whether the "packed-refs" is regular
2025-01-17 14:00 ` shejialuo
@ 2025-01-17 22:01 ` Eric Sunshine
2025-01-18 3:05 ` shejialuo
2025-01-19 8:03 ` Karthik Nayak
0 siblings, 2 replies; 168+ messages in thread
From: Eric Sunshine @ 2025-01-17 22:01 UTC (permalink / raw)
To: shejialuo
Cc: Karthik Nayak, git, Patrick Steinhardt, Junio C Hamano,
Michael Haggerty
On Fri, Jan 17, 2025 at 8:59 AM shejialuo <shejialuo@gmail.com> wrote:
> On Tue, Jan 07, 2025 at 08:33:56AM -0800, Karthik Nayak wrote:
> > shejialuo <shejialuo@gmail.com> writes:
> > > +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
> > > + test_when_finished "rm -rf repo" &&
> > > + git init repo &&
> > > + cd repo &&
> >
> > This should be in a subshell, so that at the end we can actually remove
> > the repo. This seems to be applicable to most of the other tests in this
> > file too. Perhaps, we should clean it up as a precursor commit to this
> > series?
>
> I have searched the usage of "test_when_finished", and I don't know why
> we need to use subshell. Could you please explain this further here.
Karthik may have been thinking about operating systems, such as
Microsoft Windows, which won't allow a directory to be deleted if that
directory is in use. In this case, because the test cd's into "repo"
and never cd's elsewhere, the directory is still in use when
test_when_finished() tries to delete "repo".
However, there is an even more important reason to use a subshell, and
that is because a subshell ensures that the current working directory
is effectively restored to the path which was current before the cd
command. This is important since it guarantees that subsequent tests
will be run in the correct directory even if the preceding test bombed
out part way through. For example:
test_expect_success 'foo' '
git init repo &&
cd repo &&
...some more commands... &&
cd ..
'
If one of the commands in "...some more commands..." fails, then the
`cd ..` will never be reached, and the current working directory will
remain "repo" rather than reverting to the path prior to the cd
command. Thus, any tests which follow this one in the script will end
up running in the wrong directory. The proper way to protect against
this is:
test_expect_success 'foo' '
git init repo &&
(
cd repo &&
...some more commands...
)
'
Exiting the subshell will correctly restore the current working
directory to the original path _regardless_ of whether the test
succeeds or fails somewhere in "...some more commands...". Using a
subshell also means that you don't have to manually restore the
working directory via `cd ..` or similar.
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH 03/10] packed-backend: check whether the "packed-refs" is regular
2025-01-17 22:01 ` Eric Sunshine
@ 2025-01-18 3:05 ` shejialuo
2025-01-19 8:03 ` Karthik Nayak
1 sibling, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-01-18 3:05 UTC (permalink / raw)
To: Eric Sunshine
Cc: Karthik Nayak, git, Patrick Steinhardt, Junio C Hamano,
Michael Haggerty
On Fri, Jan 17, 2025 at 05:01:21PM -0500, Eric Sunshine wrote:
> On Fri, Jan 17, 2025 at 8:59 AM shejialuo <shejialuo@gmail.com> wrote:
> > On Tue, Jan 07, 2025 at 08:33:56AM -0800, Karthik Nayak wrote:
> > > shejialuo <shejialuo@gmail.com> writes:
> > > > +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
> > > > + test_when_finished "rm -rf repo" &&
> > > > + git init repo &&
> > > > + cd repo &&
> > >
> > > This should be in a subshell, so that at the end we can actually remove
> > > the repo. This seems to be applicable to most of the other tests in this
> > > file too. Perhaps, we should clean it up as a precursor commit to this
> > > series?
> >
> > I have searched the usage of "test_when_finished", and I don't know why
> > we need to use subshell. Could you please explain this further here.
>
> Karthik may have been thinking about operating systems, such as
> Microsoft Windows, which won't allow a directory to be deleted if that
> directory is in use. In this case, because the test cd's into "repo"
> and never cd's elsewhere, the directory is still in use when
> test_when_finished() tries to delete "repo".
>
> However, there is an even more important reason to use a subshell, and
> that is because a subshell ensures that the current working directory
> is effectively restored to the path which was current before the cd
> command. This is important since it guarantees that subsequent tests
> will be run in the correct directory even if the preceding test bombed
> out part way through. For example:
>
> test_expect_success 'foo' '
> git init repo &&
> cd repo &&
> ...some more commands... &&
> cd ..
> '
>
> If one of the commands in "...some more commands..." fails, then the
> `cd ..` will never be reached, and the current working directory will
> remain "repo" rather than reverting to the path prior to the cd
> command. Thus, any tests which follow this one in the script will end
> up running in the wrong directory. The proper way to protect against
> this is:
>
> test_expect_success 'foo' '
> git init repo &&
> (
> cd repo &&
> ...some more commands...
> )
> '
>
> Exiting the subshell will correctly restore the current working
> directory to the original path _regardless_ of whether the test
> succeeds or fails somewhere in "...some more commands...". Using a
> subshell also means that you don't have to manually restore the
> working directory via `cd ..` or similar.
Thanks for the detailed explanation above. Now I understand why there
were so many nested "repo/repo/repo" directories when I ran the tests. I
had thought that `test_expect_success` would make the environment of
each test totally independent. I will improve this in the next version.
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH 03/10] packed-backend: check whether the "packed-refs" is regular
2025-01-17 22:01 ` Eric Sunshine
2025-01-18 3:05 ` shejialuo
@ 2025-01-19 8:03 ` Karthik Nayak
1 sibling, 0 replies; 168+ messages in thread
From: Karthik Nayak @ 2025-01-19 8:03 UTC (permalink / raw)
To: Eric Sunshine, shejialuo
Cc: git, Patrick Steinhardt, Junio C Hamano, Michael Haggerty
[-- Attachment #1: Type: text/plain, Size: 2746 bytes --]
Eric Sunshine <sunshine@sunshineco.com> writes:
> On Fri, Jan 17, 2025 at 8:59 AM shejialuo <shejialuo@gmail.com> wrote:
>> On Tue, Jan 07, 2025 at 08:33:56AM -0800, Karthik Nayak wrote:
>> > shejialuo <shejialuo@gmail.com> writes:
>> > > +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
>> > > + test_when_finished "rm -rf repo" &&
>> > > + git init repo &&
>> > > + cd repo &&
>> >
>> > This should be in a subshell, so that at the end we can actually remove
>> > the repo. This seems to be applicable to most of the other tests in this
>> > file too. Perhaps, we should clean it up as a precursor commit to this
>> > series?
>>
>> I have searched the usage of "test_when_finished", and I don't know why
>> we need to use subshell. Could you please explain this further here.
>
> Karthik may have been thinking about operating systems, such as
> Microsoft Windows, which won't allow a directory to be deleted if that
> directory is in use. In this case, because the test cd's into "repo"
> and never cd's elsewhere, the directory is still in use when
> test_when_finished() tries to delete "repo".
>
I didn't know this either. I was mostly talking about what you mentioned
below.
> However, there is an even more important reason to use a subshell, and
> that is because a subshell ensures that the current working directory
> is effectively restored to the path which was current before the cd
> command. This is important since it guarantees that subsequent tests
> will be run in the correct directory even if the preceding test bombed
> out part way through. For example:
>
> test_expect_success 'foo' '
> git init repo &&
> cd repo &&
> ...some more commands... &&
> cd ..
> '
>
> If one of the commands in "...some more commands..." fails, then the
> `cd ..` will never be reached, and the current working directory will
> remain "repo" rather than reverting to the path prior to the cd
> command. Thus, any tests which follow this one in the script will end
> up running in the wrong directory. The proper way to protect against
> this is:
>
> test_expect_success 'foo' '
> git init repo &&
> (
> cd repo &&
> ...some more commands...
> )
> '
>
> Exiting the subshell will correctly restore the current working
> directory to the original path _regardless_ of whether the test
> succeeds or fails somewhere in "...some more commands...". Using a
> subshell also means that you don't have to manually restore the
> working directory via `cd ..` or similar.
This was a super nice explanation compared to my single sentence.
Thanks!
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH 04/10] packed-backend: add "packed-refs" header consistency check
2025-01-17 14:23 ` shejialuo
@ 2025-01-24 7:51 ` Patrick Steinhardt
0 siblings, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-01-24 7:51 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Fri, Jan 17, 2025 at 10:23:06PM +0800, shejialuo wrote:
> On Thu, Jan 16, 2025 at 02:57:37PM +0100, Patrick Steinhardt wrote:
> > On Sun, Jan 05, 2025 at 09:49:37PM +0800, shejialuo wrote:
> > > @@ -1779,7 +1867,24 @@ static int packed_fsck(struct ref_store *ref_store,
> > > goto cleanup;
> > > }
> > >
> > > + if (strbuf_read_file(&packed_ref_content, refs->path, 0) < 0) {
> > > + /*
> > > + * Although we have checked that the file exists, there is a possibility
> > > + * that it has been removed between the lstat() and the read attempt by
> > > + * another process. In that case, we should not report an error.
> > > + */
> > > + if (errno == ENOENT)
> > > + goto cleanup;
> >
> > Unlikely, but good to guard us against that condition regardless. It's
> > still not entirely race-free though because the file could meanwhile
> > have changed into a symlink, and we wouldn't notice now. We could fix
> > that by using open(O_NOFOLLOW), fstat the returned file descriptor and
> > then use `strbuf_read()` to slurp in the file.
> >
>
> Wouldn't avoiding the race condition be too complicated? We would
> introduce a lot of code to handle the above logic. There is still a
> possibility that, right after we finish reading the file content into
> memory, the file is changed into a symlink and we cannot notice. So what
> I want to say is that we can't avoid the race condition entirely. It
> would be good to avoid the race, but my concern is that we would make
> the logic too complicated. So, could we leave it unchanged?
It would ultimately only be two additional function calls, so I don't
think it's going to add a ton of complexity. Whether things are changing
_after_ we have opened and read the file is a different issue, and I
agree that we shouldn't have to care about that case. What we're after
is whether things are correct when running consistency checks, it's
always a possibility that e.g. the packed-refs file gets rewritten while
we do it.
Patrick
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH 01/10] files-backend: add object check for regular ref
2025-01-17 13:40 ` shejialuo
@ 2025-01-24 7:54 ` Patrick Steinhardt
0 siblings, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-01-24 7:54 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Fri, Jan 17, 2025 at 09:40:18PM +0800, shejialuo wrote:
> On Thu, Jan 16, 2025 at 02:57:25PM +0100, Patrick Steinhardt wrote:
> > On Sun, Jan 05, 2025 at 09:49:09PM +0800, shejialuo wrote:
> > > Although we use "parse_loose_ref_content" to check whether the object id
> > > is correct, we never parse it into the "struct object" structure thus we
> > > ignore checking whether there is a real object existing in the repo and
> > > whether the object type is correct.
> > >
> > > Use "parse_object" to parse the oid for the regular ref content. If the
> > > object does not exist, report the error to the user by reusing the fsck
> > > message "BAD_REF_CONTENT".
> > >
> > > Then, we need to check the type of the object. Just like "git-fsck(1)",
> > > we only report "not a commit" error when the ref is a branch. Last,
> > > update the test to exercise the code.
> >
> > I wonder whether it wouldn't make more sense to put this into a generic
> > part of `git refs verify`. This isn't a check for whether the format of
> > the files backend is correct, but rather a check whether the refdb is
> > > sane. As such, it also applies to the reftable backend.
> >
> > So should we maybe extend `git refs verify` so that it also knows to
> > perform generic checks that apply independent of the backend in use?
> >
>
> I think I understand what you mean: we could use the internal ref
> methods to parse the oid after we check the format of the ref files.
> That way, we could keep these two different kinds of checks completely
> separate.
>
> However, if we have already parsed the raw ref files, we could reuse the
> parsed hex and then use "parse_object" to get the object to check.
> This is the main reason why I added this check now.
>
> And I agree with your thinking here. Actually, we may put this into the
> object check part, because in "git-fsck(1)" we parse the refdb to know
> whether an object is dangling or not.
>
> I will postpone these checks to later patches. Many thanks for this
> wonderful suggestion.
Yeah. I'm thinking ahead a bit in this context and want to avoid that we
eventually have to reimplement the same set of checks for every single
ref backend that we have. So separating the backend-generic bits from
the non-generic ones is what I'm after.
Patrick
^ permalink raw reply [flat|nested] 168+ messages in thread
* [PATCH v2 0/8] add more ref consistency checks
2025-01-05 13:46 [PATCH 00/10] add more ref consistency checks shejialuo
` (9 preceding siblings ...)
2025-01-05 13:50 ` [PATCH 10/10] builtin/fsck: add `git refs verify` child process shejialuo
@ 2025-01-30 4:04 ` shejialuo
2025-01-30 4:06 ` [PATCH v2 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
` (8 more replies)
10 siblings, 9 replies; 168+ messages in thread
From: shejialuo @ 2025-01-30 4:04 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Hi All:
This version handles the following things:
1. Remove the code which checks the objects that refs point to, as
suggested by Patrick.
2. Use subshells in the test script to make sure the current working
directory stays consistent.
3. Optimize to avoid allocating too much memory.
This version is rebased onto the latest master due to a semantic
conflict. I don't provide a range-diff here because the number of
commits is reduced. However, this should not put too much burden on the
reviewers since the changes are small.
Thanks,
Jialuo
shejialuo (8):
t0602: use subshell to ensure working directory unchanged
builtin/refs: get worktrees without reading head info
packed-backend: check whether the "packed-refs" is regular
packed-backend: add "packed-refs" header consistency check
packed-backend: check whether the refname contains NUL characters
packed-backend: add "packed-refs" entry consistency check
packed-backend: check whether the "packed-refs" is sorted
builtin/fsck: add `git refs verify` child process
Documentation/fsck-msgids.txt | 22 +
builtin/fsck.c | 30 +
builtin/refs.c | 2 +-
fsck.h | 6 +
refs/packed-backend.c | 343 +++++++++-
t/t0602-reffiles-fsck.sh | 1107 +++++++++++++++++++--------------
worktree.c | 5 +
worktree.h | 6 +
8 files changed, 1040 insertions(+), 481 deletions(-)
--
2.48.1
^ permalink raw reply [flat|nested] 168+ messages in thread
* [PATCH v2 1/8] t0602: use subshell to ensure working directory unchanged
2025-01-30 4:04 ` [PATCH v2 0/8] add more ref consistency checks shejialuo
@ 2025-01-30 4:06 ` shejialuo
2025-01-30 17:53 ` Junio C Hamano
2025-01-30 4:07 ` [PATCH v2 2/8] builtin/refs: get worktrees without reading head info shejialuo
` (7 subsequent siblings)
8 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-30 4:06 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
For every test, we execute the command "cd repo" at the beginning but
we never execute the command "cd .." to restore the working directory.
However, adding "cd .." is not a good idea either, because if any
command fails between "cd repo" and "cd ..", the "cd .." will never be
reached and we cannot correctly restore the working directory.
Let's use a subshell to ensure that the current working directory is
restored to the correct path.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
t/t0602-reffiles-fsck.sh | 967 ++++++++++++++++++++-------------------
1 file changed, 494 insertions(+), 473 deletions(-)
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index d4a08b823b..cf7a202d0d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -14,222 +14,229 @@ test_expect_success 'ref name should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
-
- git commit --allow-empty -m initial &&
- git checkout -b default-branch &&
- git tag default-tag &&
- git tag multi_hierarchy/default-tag &&
-
- cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
- git refs verify 2>err &&
- test_must_be_empty err &&
- rm $branch_dir_prefix/@ &&
-
- cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
- git refs verify 2>err &&
- rm $tag_dir_prefix/tag-1.lock &&
- test_must_be_empty err &&
-
- cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/.lock: badRefName: invalid refname format
- EOF
- rm $tag_dir_prefix/.lock &&
- test_cmp expect err &&
-
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/$refname: badRefName: invalid refname format
- EOF
- rm "$branch_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done &&
+ (
+ cd repo &&
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/$refname: badRefName: invalid refname format
- EOF
- rm "$tag_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done &&
+ git commit --allow-empty -m initial &&
+ git checkout -b default-branch &&
+ git tag default-tag &&
+ git tag multi_hierarchy/default-tag &&
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
- EOF
- rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
- test_cmp expect err || return 1
- done &&
-
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- mkdir "$branch_dir_prefix/$refname" &&
- cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+ rm $branch_dir_prefix/@ &&
+
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
+ git refs verify 2>err &&
+ rm $tag_dir_prefix/tag-1.lock &&
+ test_must_be_empty err &&
+
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ error: refs/tags/.lock: badRefName: invalid refname format
EOF
- rm -r "$branch_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done
+ rm $tag_dir_prefix/.lock &&
+ test_cmp expect err &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname: badRefName: invalid refname format
+ EOF
+ rm "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ mkdir "$branch_dir_prefix/$refname" &&
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ EOF
+ rm -r "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success 'ref name check should be adapted into fsck messages' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- git commit --allow-empty -m initial &&
- git checkout -b branch-1 &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- git -c fsck.badRefName=warn refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/.branch-1: badRefName: invalid refname format
- EOF
- rm $branch_dir_prefix/.branch-1 &&
- test_cmp expect err &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- git -c fsck.badRefName=ignore refs verify 2>err &&
- test_must_be_empty err
+ (
+ cd repo &&
+ git commit --allow-empty -m initial &&
+ git checkout -b branch-1 &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=warn refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/.branch-1: badRefName: invalid refname format
+ EOF
+ rm $branch_dir_prefix/.branch-1 &&
+ test_cmp expect err &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=ignore refs verify 2>err &&
+ test_must_be_empty err
+ )
'
test_expect_success 'ref name check should work for multiple worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
-
- cd repo &&
- test_commit initial &&
- git checkout -b branch-1 &&
- test_commit second &&
- git checkout -b branch-2 &&
- test_commit third &&
- git checkout -b branch-3 &&
- git worktree add ./worktree-1 branch-1 &&
- git worktree add ./worktree-2 branch-2 &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
- (
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-3
- ) &&
(
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-3
- ) &&
-
- cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
- cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
- error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err &&
-
- for worktree in "worktree-1" "worktree-2"
- do
+ cd repo &&
+ test_commit initial &&
+ git checkout -b branch-1 &&
+ test_commit second &&
+ git checkout -b branch-2 &&
+ test_commit third &&
+ git checkout -b branch-3 &&
+ git worktree add ./worktree-1 branch-1 &&
+ git worktree add ./worktree-2 branch-2 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
(
- cd $worktree &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
- error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err || return 1
- )
- done
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+
+ cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
+ cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err &&
+
+ for worktree in "worktree-1" "worktree-2"
+ do
+ (
+ cd $worktree &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err || return 1
+ )
+ done
+ )
'
test_expect_success 'regular ref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
- git refs verify 2>err &&
- test_must_be_empty err &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
- for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/branch-bad &&
- test_cmp expect err || return 1
- done &&
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
- for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/a/b/branch-bad &&
- test_cmp expect err || return 1
- done &&
-
- printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $branch_dir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- for trailing_content in " garbage" " more garbage"
- do
- printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
EOF
- rm $branch_dir_prefix/branch-garbage &&
- test_cmp expect err || return 1
- done &&
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+ printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
- '\''
- EOF
- rm $branch_dir_prefix/branch-garbage-special &&
- test_cmp expect err &&
- printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+ '\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err &&
+
+ printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
- garbage'\''
- EOF
- rm $branch_dir_prefix/branch-garbage-special &&
- test_cmp expect err
+ garbage'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err
+ )
'
test_expect_success 'regular ref content should be checked (aggregate)' '
@@ -237,99 +244,103 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- bad_content_1=$(git rev-parse main)x &&
- bad_content_2=xfsazqfxcadas &&
- bad_content_3=Xfsazqfxcadas &&
- printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
- printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
- printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
- printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
- printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
- error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
- error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
- warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ bad_content_1=$(git rev-parse main)x &&
+ bad_content_2=xfsazqfxcadas &&
+ bad_content_3=Xfsazqfxcadas &&
+ printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
+ printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
+ printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
+ error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
+ error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
'
test_expect_success 'textual symref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
+ do
+ printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
- for good_referent in "refs/heads/branch" "HEAD"
- do
- printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
- rm $branch_dir_prefix/branch-good &&
- test_must_be_empty err || return 1
- done &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
- do
- printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
- test_must_fail git refs verify 2>err &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-1 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
EOF
- rm $branch_dir_prefix/branch-bad &&
- test_cmp expect err || return 1
- done &&
-
- printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $branch_dir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-1 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-2 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-3 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-complicated &&
- test_cmp expect err
+ rm $branch_dir_prefix/a/b/branch-trailing-2 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-3 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-complicated &&
+ test_cmp expect err
+ )
'
test_expect_success 'textual symref content should be checked (aggregate)' '
@@ -337,32 +348,34 @@ test_expect_success 'textual symref content should be checked (aggregate)' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
- printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
- printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
- printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
- printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
- printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
- printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
- printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
- warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
'
test_expect_success 'the target of the textual symref should be checked' '
@@ -370,28 +383,30 @@ test_expect_success 'the target of the textual symref should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
- do
- printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
- git refs verify 2>err &&
- rm $branch_dir_prefix/branch-good &&
- test_must_be_empty err || return 1
- done &&
-
- for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
- do
- printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
- EOF
- rm $branch_dir_prefix/branch-bad-1 &&
- test_cmp expect err || return 1
- done
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
+ do
+ printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success SYMLINKS 'symlink symref content should be checked' '
@@ -399,201 +414,207 @@ test_expect_success SYMLINKS 'symlink symref content should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $branch_dir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
- EOF
- rm $branch_dir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
- error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
- EOF
- rm $branch_dir_prefix/branch-symbolic-bad &&
- test_cmp expect err &&
-
- ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
- error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
- EOF
- rm $tag_dir_prefix/tag-symbolic-1 &&
- test_cmp expect err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
+ error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-bad &&
+ test_cmp expect err &&
+
+ ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
+ error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
+ EOF
+ rm $tag_dir_prefix/tag-symbolic-1 &&
+ test_cmp expect err
+ )
'
test_expect_success SYMLINKS 'symlink symref content should be checked (worktree)' '
test_when_finished "rm -rf repo" &&
git init repo &&
- cd repo &&
- test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
- main_worktree_refdir_prefix=.git/refs/heads &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
(
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
- (
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
-
- ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $worktree1_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $worktree2_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $main_worktree_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
- EOF
- rm $worktree1_refdir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- for bad_referent_name in ".tag" "branch "
- do
- ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ main_worktree_refdir_prefix=.git/refs/heads &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree1_refdir_prefix/bad-symbolic &&
+ rm $worktree1_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree1_refdir_prefix/bad-symbolic &&
+ rm $worktree2_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree2_refdir_prefix/bad-symbolic &&
+ rm $main_worktree_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
EOF
- rm $worktree2_refdir_prefix/bad-symbolic &&
- test_cmp expect err || return 1
- done
+ rm $worktree1_refdir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ for bad_referent_name in ".tag" "branch "
+ do
+ ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
- cd repo &&
- test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
(
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
- (
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
- for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
- test_must_fail git refs verify 2>err &&
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ EOF
+ rm $worktree1_refdir_prefix/bad-branch-1 &&
+ test_cmp expect err || return 1
+ done &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ EOF
+ rm $worktree2_refdir_prefix/bad-branch-2 &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
EOF
- rm $worktree1_refdir_prefix/bad-branch-1 &&
- test_cmp expect err || return 1
- done &&
+ rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
- test_must_fail git refs verify 2>err &&
+ printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
EOF
- rm $worktree2_refdir_prefix/bad-branch-2 &&
- test_cmp expect err || return 1
- done &&
-
- printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $worktree1_refdir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
- EOF
- rm $worktree1_refdir_prefix/branch-garbage &&
- test_cmp expect err
+ rm $worktree1_refdir_prefix/branch-garbage &&
+ test_cmp expect err
+ )
'
test_done
--
2.48.1
* [PATCH v2 2/8] builtin/refs: get worktrees without reading head info
2025-01-30 4:04 ` [PATCH v2 0/8] add more ref consistency checks shejialuo
2025-01-30 4:06 ` [PATCH v2 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
@ 2025-01-30 4:07 ` shejialuo
2025-01-30 18:04 ` Junio C Hamano
2025-01-30 4:07 ` [PATCH v2 3/8] packed-backend: check whether the "packed-refs" is regular shejialuo
` (6 subsequent siblings)
8 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-30 4:07 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c", functions such as "create_snapshot" and
"next_record" check the correctness of the content of the "packed-refs"
file. When anything is wrong, the program dies.
This may seem irrelevant to us, because we are going to read and parse
the raw "packed-refs" file to check its consistency, without creating
the snapshot or using the ref iterator.
However, "get_worktrees", which "builtin/refs" uses, parses the "HEAD"
information. If the referent of "HEAD" is inside the "packed-refs"
file, we call the "create_snapshot" function to parse "packed-refs" and
get the information. Whether or not the entry that "HEAD" points to in
"packed-refs" is correct, "create_snapshot" calls "verify_buffer_safe"
to check whether the last line of the file ends with a newline. If it
does not, the program dies.
Although this behavior has no harm for the program, it will
short-circuit the program. When the users execute "git refs verify" or
"git fsck", we don't want to simply die the program but rather show the
warnings or errors as many as possible to info the users. So, we should
avoid reading the head info.
Fortunately, in 465a22b338 (worktree: skip reading HEAD when repairing
worktrees, 2023-12-29), we have introduced a function
"get_worktrees_internal" which allows us to get worktrees without
reading head info.
Create a new exposed function "get_worktrees_without_reading_head", then
replace the "get_worktrees" in "builtin/refs" with the new created
function.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
builtin/refs.c | 2 +-
worktree.c | 5 +++++
worktree.h | 6 ++++++
3 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/builtin/refs.c b/builtin/refs.c
index a29f195834..55ff5dae11 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -88,7 +88,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix,
git_config(git_fsck_config, &fsck_refs_options);
prepare_repo_settings(the_repository);
- worktrees = get_worktrees();
+ worktrees = get_worktrees_without_reading_head();
for (size_t i = 0; worktrees[i]; i++)
ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
&fsck_refs_options, worktrees[i]);
diff --git a/worktree.c b/worktree.c
index 248bbb39d4..89b7d86cef 100644
--- a/worktree.c
+++ b/worktree.c
@@ -175,6 +175,11 @@ struct worktree **get_worktrees(void)
return get_worktrees_internal(0);
}
+struct worktree **get_worktrees_without_reading_head(void)
+{
+ return get_worktrees_internal(1);
+}
+
const char *get_worktree_git_dir(const struct worktree *wt)
{
if (!wt)
diff --git a/worktree.h b/worktree.h
index 38145df80f..1ba4a161a0 100644
--- a/worktree.h
+++ b/worktree.h
@@ -30,6 +30,12 @@ struct worktree {
*/
struct worktree **get_worktrees(void);
+/*
+ * Like `get_worktrees`, but does not read HEAD. This is useful when checking
+ * the consistency, as reading HEAD may not be necessary.
+ */
+struct worktree **get_worktrees_without_reading_head(void);
+
/*
* Returns 1 if linked worktrees exist, 0 otherwise.
*/
--
2.48.1
^ permalink raw reply related [flat|nested] 168+ messages in thread
* [PATCH v2 3/8] packed-backend: check whether the "packed-refs" is regular
2025-01-30 4:04 ` [PATCH v2 0/8] add more ref consistency checks shejialuo
2025-01-30 4:06 ` [PATCH v2 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
2025-01-30 4:07 ` [PATCH v2 2/8] builtin/refs: get worktrees without reading head info shejialuo
@ 2025-01-30 4:07 ` shejialuo
2025-01-30 18:23 ` Junio C Hamano
2025-02-03 8:40 ` Patrick Steinhardt
2025-01-30 4:07 ` [PATCH v2 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
` (5 subsequent siblings)
8 siblings, 2 replies; 168+ messages in thread
From: shejialuo @ 2025-01-30 4:07 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Although "git-fsck(1)" and "packed-backend.c" check some consistency
and correctness of the "packed-refs" file, they never check its
filetype. Users should always use the "git pack-refs" command to
create the raw regular "packed-refs" file, so we need to check this
explicitly in "git refs verify".
We could check whether "packed-refs" is a regular file in two ways:
1. Use the "lstat" system call to check the file mode.
2. Use the "open_nofollow" wrapper to open the raw "packed-refs" file.
If the returned fd is less than 0, check whether "errno" is "ELOOP"
and report an error to the user.
Method one might seem much easier than method two. However, it has a
significant drawback: after checking the file mode with "lstat", we
still need to read the file content, and the file could be replaced
with a symlink between the check and the read without us noticing.
With method two, we get the "fd" first. Even if the path is later
changed into a symlink, we keep operating on the same "fd", which
stays consistent across the whole check and avoids the race condition.
Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
the user if "packed-refs" is not a regular file.
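
As an illustration of method two, here is a minimal standalone sketch,
assuming a Linux-like system where open(2) with O_NOFOLLOW fails with
ELOOP on a symlink. The helper "check_regular_nofollow" and its enum are
hypothetical names, not part of this patch, and the extra fstat(2) step
goes beyond what the patch itself does:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

enum file_check { FILE_MISSING, FILE_REGULAR, FILE_NOT_REGULAR, FILE_ERROR };

/*
 * Hypothetical helper (not from the patch): classify `path` by opening
 * it without following a symlink in the final path component, then
 * inspecting the opened descriptor rather than the path.
 */
static enum file_check check_regular_nofollow(const char *path)
{
	struct stat st;
	enum file_check res;
	int fd;

	assert(path);
	fd = open(path, O_RDONLY | O_NOFOLLOW);
	if (fd < 0) {
		if (errno == ENOENT)
			return FILE_MISSING;
		if (errno == ELOOP)
			return FILE_NOT_REGULAR; /* the path is a symlink */
		return FILE_ERROR;
	}

	/*
	 * fstat() describes the file we actually opened, so renaming or
	 * replacing the path afterwards cannot change the answer.
	 */
	if (fstat(fd, &st) < 0)
		res = FILE_ERROR;
	else
		res = S_ISREG(st.st_mode) ? FILE_REGULAR : FILE_NOT_REGULAR;
	close(fd);
	return res;
}
```

A real caller would keep the descriptor open and read the content from
it, as "packed_fsck" does, instead of closing it right away.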
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 39 +++++++++++++++++++++++++++++++++++----
t/t0602-reffiles-fsck.sh | 22 ++++++++++++++++++++++
2 files changed, 57 insertions(+), 4 deletions(-)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index a7b6f74b6e..6401cecd5f 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -4,6 +4,7 @@
#include "../git-compat-util.h"
#include "../config.h"
#include "../dir.h"
+#include "../fsck.h"
#include "../gettext.h"
#include "../hash.h"
#include "../hex.h"
@@ -1748,15 +1749,45 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
-static int packed_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED,
+static int packed_fsck(struct ref_store *ref_store,
+ struct fsck_options *o,
struct worktree *wt)
{
+ struct packed_ref_store *refs = packed_downcast(ref_store,
+ REF_STORE_READ, "fsck");
+ int ret = 0;
+ int fd;
if (!is_main_worktree(wt))
- return 0;
+ goto cleanup;
- return 0;
+ if (o->verbose)
+ fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
+
+ fd = open_nofollow(refs->path, O_RDONLY);
+ if (fd < 0) {
+ /*
+ * If the packed-refs file doesn't exist, there's nothing
+ * to check.
+ */
+ if (errno == ENOENT)
+ goto cleanup;
+
+ if (errno == ELOOP) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_FILETYPE,
+ "not a regular file");
+ goto cleanup;
+ }
+
+ ret = error_errno(_("unable to open %s"), refs->path);
+ goto cleanup;
+ }
+
+cleanup:
+ return ret;
}
struct ref_storage_be refs_be_packed = {
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index cf7a202d0d..42c8d4ca1e 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -617,4 +617,26 @@ test_expect_success 'ref content checks should work with worktrees' '
)
'
+test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git pack-refs --all &&
+
+ mv .git/packed-refs .git/packed-refs-back &&
+ ln -sf packed-refs-back .git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs: badRefFiletype: not a regular file
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
^ permalink raw reply related [flat|nested] 168+ messages in thread
* [PATCH v2 4/8] packed-backend: add "packed-refs" header consistency check
2025-01-30 4:04 ` [PATCH v2 0/8] add more ref consistency checks shejialuo
` (2 preceding siblings ...)
2025-01-30 4:07 ` [PATCH v2 3/8] packed-backend: check whether the "packed-refs" is regular shejialuo
@ 2025-01-30 4:07 ` shejialuo
2025-01-30 18:58 ` Junio C Hamano
2025-01-30 4:07 ` [PATCH v2 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
` (4 subsequent siblings)
8 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-30 4:07 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c::create_snapshot", if there is a header (the line
which starts with '#'), we check whether the line starts with "#
pack-refs with:". As we are going to implement the header consistency
check, we should port this check into "packed_fsck".
However, this check alone is not enough, because "git pack-refs"
always writes the constant string "PACKED_REFS_HEADER" to the
"packed-refs" file. So, we should check the following things for the
header.
1. If the header does not exist, we could report an error to the user
because it should exist, but we do allow a "packed-refs" file without
a header. So, create a new fsck message
"packedRefMissingHeader(INFO)" to warn the user while keeping
compatibility.
2. If the header content does not start with "# pack-refs with:", we
should report an error just like "create_snapshot" does. So,
create a new fsck message "badPackedRefHeader(ERROR)" for this.
3. If the header content is not the same as the constant string
"PACKED_REFS_HEADER", ideally, we should report an error to the user.
However, we allow other contents as long as the header content starts
with "# pack-refs with:". To keep compatibility, create a new fsck
message "unknownPackedRefHeader(INFO)" to warn about this. We may
tighten this rule in the future.
To implement the above checks, read the "packed-refs" file via
"strbuf_read". Like "create_snapshot" and other functions, we split
the content into lines by finding the next newline in the buffer. When
we cannot find a newline, we report an error.
So, create a function "packed_fsck_ref_next_line" to find the next
newline and, if there is none, use
"packedRefEntryNotTerminated(ERROR)" to report an error to the user.
Then, parse the first line to apply the above three checks. Update the
test to exercise the code.
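
The three header checks can be sketched as a standalone classifier. This
is only an illustration: "classify_header", the enum, and the
"SKETCH_PACKED_REFS_HEADER" macro are hypothetical names, and the macro's
value is assumed to match the "PACKED_REFS_HEADER" constant in
"refs/packed-backend.c" (it is the string the tests in this series
write):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: same content as PACKED_REFS_HEADER in packed-backend.c. */
#define SKETCH_PACKED_REFS_HEADER \
	"# pack-refs with: peeled fully-peeled sorted \n"

enum header_status {
	HEADER_OK,
	HEADER_MISSING,	/* would map to packedRefMissingHeader (INFO) */
	HEADER_BAD,	/* would map to badPackedRefHeader (ERROR) */
	HEADER_UNKNOWN	/* would map to unknownPackedRefHeader (INFO) */
};

/* Classify the first line of the buffer [start, eof). */
static enum header_status classify_header(const char *start, const char *eof)
{
	static const char prefix[] = "# pack-refs with:";
	size_t prefix_len = strlen(prefix);
	size_t full_len = strlen(SKETCH_PACKED_REFS_HEADER);
	size_t len;

	assert(start && start <= eof);
	len = (size_t)(eof - start);

	if (!len || *start != '#')
		return HEADER_MISSING;
	if (len < prefix_len || memcmp(start, prefix, prefix_len))
		return HEADER_BAD;
	if (len < full_len || memcmp(start, SKETCH_PACKED_REFS_HEADER, full_len))
		return HEADER_UNKNOWN;
	return HEADER_OK;
}
```

Unlike the patch, the sketch compares against explicit lengths instead
of relying on a NUL-terminated buffer, so it also works on a buffer
that is not a C string.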
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 16 +++++++
fsck.h | 4 ++
refs/packed-backend.c | 89 +++++++++++++++++++++++++++++++++++
t/t0602-reffiles-fsck.sh | 46 ++++++++++++++++++
4 files changed, 155 insertions(+)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index b14bc44ca4..34375a3143 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -16,6 +16,10 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefHeader`::
+ (ERROR) The "packed-refs" file contains an invalid
+ header.
+
`badParentSha1`::
(ERROR) A commit object has a bad parent sha1.
@@ -176,6 +180,13 @@
`nullSha1`::
(WARN) Tree contains entries pointing to a null sha1.
+`packedRefEntryNotTerminated`::
+ (ERROR) The "packed-refs" file contains an entry that is
+ not terminated by a newline.
+
+`packedRefMissingHeader`::
+ (INFO) The "packed-refs" file does not contain the header.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
@@ -208,6 +219,11 @@
`treeNotSorted`::
(ERROR) A tree is not properly sorted.
+`unknownPackedRefHeader`::
+ (INFO) The "packed-refs" header starts with "# pack-refs with:"
+ but the remaining content is not the same as what `git pack-refs`
+ would write.
+
`unknownType`::
(ERROR) Found an unknown object type.
diff --git a/fsck.h b/fsck.h
index a44c231a5f..3107a0093d 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
@@ -53,6 +54,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE, ERROR) \
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
+ FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
@@ -90,6 +92,8 @@ enum fsck_msg_type {
FUNC(REF_MISSING_NEWLINE, INFO) \
FUNC(SYMREF_TARGET_IS_NOT_A_REF, INFO) \
FUNC(TRAILING_REF_CONTENT, INFO) \
+ FUNC(UNKNOWN_PACKED_REF_HEADER, INFO) \
+ FUNC(PACKED_REF_MISSING_HEADER, INFO) \
/* ignored (elevated when requested) */ \
FUNC(EXTRA_HEADER_ENTRY, IGNORE)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 6401cecd5f..883189f3a1 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1749,12 +1749,92 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
+static int packed_fsck_ref_next_line(struct fsck_options *o,
+ struct strbuf *packed_entry, const char *start,
+ const char *eof, const char **eol)
+{
+ int ret = 0;
+
+ *eol = memchr(start, '\n', eof - start);
+ if (!*eol) {
+ struct fsck_ref_report report = { 0 };
+
+ report.path = packed_entry->buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
+ "'%.*s' is not terminated with a newline",
+ (int)(eof - start), start);
+
+ /*
+ * There is no newline but we still want to parse it to the end of
+ * the buffer.
+ */
+ *eol = eof;
+ }
+
+ return ret;
+}
+
+static int packed_fsck_ref_header(struct fsck_options *o, const char *start, const char *eol)
+{
+ const char *err_fmt = NULL;
+ int fsck_msg_id = -1;
+
+ if (!starts_with(start, "# pack-refs with:")) {
+ err_fmt = "'%.*s' does not start with '# pack-refs with:'";
+ fsck_msg_id = FSCK_MSG_BAD_PACKED_REF_HEADER;
+ } else if (strncmp(start, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER))) {
+ err_fmt = "'%.*s' is an unknown packed-refs header";
+ fsck_msg_id = FSCK_MSG_UNKNOWN_PACKED_REF_HEADER;
+ }
+
+ if (err_fmt && fsck_msg_id >= 0) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs.header";
+
+ return fsck_report_ref(o, &report, fsck_msg_id, err_fmt,
+ (int)(eol - start), start);
+
+ }
+
+ return 0;
+}
+
+static int packed_fsck_ref_content(struct fsck_options *o,
+ const char *start, const char *eof)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
+ int line_number = 1;
+ const char *eol;
+ int ret = 0;
+
+ strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
+ ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
+ if (*start == '#') {
+ ret |= packed_fsck_ref_header(o, start, eol);
+
+ start = eol + 1;
+ line_number++;
+ } else {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+
+ ret |= fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_MISSING_HEADER,
+ "missing header line");
+ }
+
+ strbuf_release(&packed_entry);
+ return ret;
+}
+
static int packed_fsck(struct ref_store *ref_store,
struct fsck_options *o,
struct worktree *wt)
{
struct packed_ref_store *refs = packed_downcast(ref_store,
REF_STORE_READ, "fsck");
+ struct strbuf packed_ref_content = STRBUF_INIT;
int ret = 0;
int fd;
@@ -1786,7 +1866,16 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
+ if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
+ ret = error_errno(_("unable to read %s"), refs->path);
+ goto cleanup;
+ }
+
+ ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ packed_ref_content.buf + packed_ref_content.len);
+
cleanup:
+ strbuf_release(&packed_ref_content);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 42c8d4ca1e..a7b46b6cb9 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -639,4 +639,50 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
)
'
+test_expect_success 'packed-refs header should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
+ printf "$(git rev-parse main) refs/heads/main\n" >.git/packed-refs &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: packed-refs: packedRefMissingHeader: missing header line
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err &&
+
+ for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
+ "# pack-refs with traits: peeled fully-peeled sorted " \
+ "# pack-refs with a: peeled fully-peeled"
+ do
+ printf "%s\n" "$bad_header" >.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs.header: badPackedRefHeader: '\''$bad_header'\'' does not start with '\''# pack-refs with:'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err || return 1
+ done &&
+
+ for unknown_header in "# pack-refs with: peeled fully-peeled sorted garbage" \
+ "# pack-refs with: peeled" \
+ "# pack-refs with: peeled peeled-fully sort"
+ do
+ printf "%s\n" "$unknown_header" >.git/packed-refs &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: packed-refs.header: unknownPackedRefHeader: '\''$unknown_header'\'' is an unknown packed-refs header
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err || return 1
+ done
+ )
+'
+
test_done
--
2.48.1
^ permalink raw reply related [flat|nested] 168+ messages in thread
* [PATCH v2 5/8] packed-backend: check whether the refname contains NUL characters
2025-01-30 4:04 ` [PATCH v2 0/8] add more ref consistency checks shejialuo
` (3 preceding siblings ...)
2025-01-30 4:07 ` [PATCH v2 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
@ 2025-01-30 4:07 ` shejialuo
2025-02-03 8:40 ` Patrick Steinhardt
2025-01-30 4:07 ` [PATCH v2 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
` (3 subsequent siblings)
8 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-30 4:07 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
We have already implemented the header consistency check for the raw
"packed-refs" file. Before we implement the consistency check for each
ref entry, let's analyze [1], which reports that "git fsck" cannot
detect some NUL characters.
"packed-backend.c::next_record" uses "check_refname_format" to check
the consistency of the refname. If it is not OK, the program dies.
So, the code path already exists and we must be missing something.
We use the following code to get the refname:
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
In the above code, `p` points to the start of the refname and `eol` to
the next newline. We calculate the length of the refname by
subtracting the two pointers, then copy the memory range between `p`
and `eol` to get the refname.
However, if there are NUL characters in the memory range between `p`
and `eol`, we treat the refname as valid as long as the memory range
between `p` and the first NUL character is a valid refname.
To catch this corruption, create a new function
"refname_contains_nul" which searches for the first NUL character. If
it is not at the end of the string, the refname contains embedded NUL
characters.
Use this function in "next_record" to die if "refname_contains_nul"
returns true.
[1] https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/
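
To make the truncation problem concrete, here is a small standalone
sketch. It works on a plain pointer and length rather than a "struct
strbuf", and "span_contains_nul" is a hypothetical name used only for
illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if any of the `len` bytes starting at `name` is a NUL byte.
 * String functions such as strlen() stop at the first NUL, so a check
 * that only looks at the C string sees a truncated refname.
 */
static int span_contains_nul(const char *name, size_t len)
{
	assert(name);
	return memchr(name, '\0', len) != NULL;
}
```

For the bytes "refs/heads/main\0garbage", strlen() reports 15, so a
validity check on the C string only ever sees "refs/heads/main", while
the sketch above inspects all 23 bytes and reports the embedded NUL.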
Reported-by: R. Diez <rdiez-temp3@rd10.de>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 883189f3a1..870c8e7aaa 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -494,6 +494,22 @@ static void verify_buffer_safe(struct snapshot *snapshot)
last_line, eof - last_line);
}
+/*
+ * When parsing the "packed-refs" file, we will parse it line by line.
+ * Because we know the start pointer of the refname and the next
+ * newline pointer, we could calculate the length of the refname by
+ * subtracting the two pointers. However, there is a corner case where
+ * the refname contains corrupted embedded NUL characters. And
+ * `check_refname_format()` will not catch this when the truncated
+ * refname is still a valid refname. To prevent this, we need to check
+ * whether the refname contains the NUL characters.
+ */
+static int refname_contains_nul(struct strbuf *refname)
+{
+ const char *pos = memchr(refname->buf, '\0', refname->len + 1);
+ return pos < refname->buf + refname->len;
+}
+
#define SMALL_FILE_SIZE (32*1024)
/*
@@ -895,6 +911,9 @@ static int next_record(struct packed_ref_iterator *iter)
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
+ if (refname_contains_nul(&iter->refname_buf))
+ die("packed refname contains embedded NUL: %s", iter->base.refname);
+
if (check_refname_format(iter->base.refname, REFNAME_ALLOW_ONELEVEL)) {
if (!refname_is_safe(iter->base.refname))
die("packed refname is dangerous: %s",
--
2.48.1
^ permalink raw reply related [flat|nested] 168+ messages in thread
* [PATCH v2 6/8] packed-backend: add "packed-refs" entry consistency check
2025-01-30 4:04 ` [PATCH v2 0/8] add more ref consistency checks shejialuo
` (4 preceding siblings ...)
2025-01-30 4:07 ` [PATCH v2 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
@ 2025-01-30 4:07 ` shejialuo
2025-02-03 8:40 ` Patrick Steinhardt
2025-01-30 4:08 ` [PATCH v2 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
` (2 subsequent siblings)
8 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-30 4:07 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
"packed-backend.c::next_record" will parse the ref entry to check the
consistency. This function already checks the following things:
1. Parse the main line of the ref entry. If the oid is not correct, the
program dies. Then check whether the character after the oid is a
space, and whether the refname is correct.
2. If the next line starts with '^', parse the peeled oid and check
whether the last character is '\n'.
We can iterate over each line by using the "packed_fsck_ref_next_line"
function. Then, create a new fsck message "badPackedRefEntry(ERROR)" to
report to the user when something is wrong.
Create two new functions "packed_fsck_ref_main_line" and
"packed_fsck_ref_peeled_line" for case 1 and case 2 respectively. Last,
update the unit test to exercise the code.
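
The shape of the main-line check can be sketched without any Git
internals. This is a simplified illustration, not the patch's code:
"check_main_line" and its enum are hypothetical, the oid is validated
only as `hexsz` hex digits (the patch uses "parse_oid_hex_algop"), the
separator must be a literal space (the patch uses isspace()), and the
refname format itself is not checked:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum entry_status {
	ENTRY_OK,
	ENTRY_BAD_OID,
	ENTRY_NO_SPACE,
	ENTRY_EMPTY_REFNAME
};

/*
 * Check one "<oid> <refname>" line given as [start, eol), where `eol`
 * points at the terminating newline (or the end of the buffer).
 */
static enum entry_status check_main_line(const char *start, const char *eol,
					 size_t hexsz)
{
	assert(start && start <= eol);

	if ((size_t)(eol - start) < hexsz)
		return ENTRY_BAD_OID;
	for (size_t i = 0; i < hexsz; i++)
		if (!isxdigit((unsigned char)start[i]))
			return ENTRY_BAD_OID;

	if (start + hexsz == eol || start[hexsz] != ' ')
		return ENTRY_NO_SPACE;
	if (start + hexsz + 1 == eol)
		return ENTRY_EMPTY_REFNAME;
	return ENTRY_OK;
}
```

A peeled line would be handled the same way after skipping the leading
'^', with the extra requirement that nothing follows the oid.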
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 3 ++
fsck.h | 1 +
refs/packed-backend.c | 98 ++++++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 42 +++++++++++++++
4 files changed, 143 insertions(+), 1 deletion(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 34375a3143..2a7ec7592e 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -16,6 +16,9 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefEntry`::
+ (ERROR) The "packed-refs" file contains an invalid entry.
+
`badPackedRefHeader`::
(ERROR) The "packed-refs" file contains an invalid
header.
diff --git a/fsck.h b/fsck.h
index 3107a0093d..40126242a4 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_ENTRY, ERROR) \
FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 870c8e7aaa..271c740728 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1819,10 +1819,86 @@ static int packed_fsck_ref_header(struct fsck_options *o, const char *start, con
return 0;
}
+static int packed_fsck_ref_peeled_line(struct fsck_options *o,
+ struct ref_store *ref_store,
+ struct strbuf *packed_entry,
+ const char *start, const char *eol)
+{
+ struct fsck_ref_report report = { 0 };
+ struct object_id peeled;
+ const char *p;
+
+ report.path = packed_entry->buf;
+
+ start++;
+ if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo)) {
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid peeled oid",
+ (int)(eol - start), start);
+ }
+
+ if (p != eol) {
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has trailing garbage after peeled oid '%.*s'",
+ (int)(eol - p), p);
+ }
+
+ return 0;
+}
+
+static int packed_fsck_ref_main_line(struct fsck_options *o,
+ struct ref_store *ref_store,
+ struct strbuf *packed_entry,
+ struct strbuf *refname,
+ const char *start, const char *eol)
+{
+ struct fsck_ref_report report = { 0 };
+ struct object_id oid;
+ const char *p;
+
+ report.path = packed_entry->buf;
+
+ if (parse_oid_hex_algop(start, &oid, &p, ref_store->repo->hash_algo)) {
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid oid",
+ (int)(eol - start), start);
+ }
+
+ if (p == eol || !isspace(*p)) {
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has no space after oid '%s' but with '%.*s'",
+ oid_to_hex(&oid), (int)(eol - p), p);
+ }
+
+ p++;
+ strbuf_reset(refname);
+ strbuf_add(refname, p, eol - p);
+ if (refname_contains_nul(refname)) {
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "refname '%s' contains NUL characters",
+ refname->buf);
+ }
+
+ if (check_refname_format(refname->buf, 0)) {
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_NAME,
+ "has bad refname '%s'", refname->buf);
+ }
+
+ return 0;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
+ struct ref_store *ref_store,
const char *start, const char *eof)
{
struct strbuf packed_entry = STRBUF_INIT;
+ struct strbuf refname = STRBUF_INIT;
int line_number = 1;
const char *eol;
int ret = 0;
@@ -1843,6 +1919,26 @@ static int packed_fsck_ref_content(struct fsck_options *o,
"missing header line");
}
+ while (start < eof) {
+ strbuf_reset(&packed_entry);
+ strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
+ ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
+ ret |= packed_fsck_ref_main_line(o, ref_store, &packed_entry, &refname, start, eol);
+ start = eol + 1;
+ line_number++;
+ if (start < eof && *start == '^') {
+ strbuf_reset(&packed_entry);
+ strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
+ ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
+ ret |= packed_fsck_ref_peeled_line(o, ref_store, &packed_entry,
+ start, eol);
+ start = eol + 1;
+ line_number++;
+ }
+ }
+
+ strbuf_release(&packed_entry);
+ strbuf_release(&refname);
strbuf_release(&packed_entry);
return ret;
}
@@ -1890,7 +1986,7 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
- ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ ret = packed_fsck_ref_content(o, ref_store, packed_ref_content.buf,
packed_ref_content.buf + packed_ref_content.len);
cleanup:
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index a7b46b6cb9..e4b4a58684 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -685,4 +685,46 @@ test_expect_success 'packed-refs header should be checked' '
)
'
+test_expect_success 'packed-refs content should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ git tag -a annotated-tag-2 -m tag-2 &&
+
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_2_oid=$(git rev-parse annotated-tag-2) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ tag_2_peeled_oid=$(git rev-parse annotated-tag-2^{}) &&
+ short_oid=$(printf "%s" $tag_1_peeled_oid | cut -c 1-4) &&
+
+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
+ printf "%s\n" "$short_oid refs/heads/branch-1" >>.git/packed-refs &&
+ printf "%sx\n" "$branch_1_oid" >>.git/packed-refs &&
+ printf "%s refs/heads/bad-branch\n" "$branch_2_oid" >>.git/packed-refs &&
+ printf "%s refs/heads/branch.\n" "$branch_2_oid" >>.git/packed-refs &&
+ printf "%s refs/tags/annotated-tag-3\n" "$tag_1_oid" >>.git/packed-refs &&
+ printf "^%s\n" "$short_oid" >>.git/packed-refs &&
+ printf "%s refs/tags/annotated-tag-4.\n" "$tag_2_oid" >>.git/packed-refs &&
+ printf "^%s garbage\n" "$tag_2_peeled_oid" >>.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: badPackedRefEntry: '\''$short_oid refs/heads/branch-1'\'' has invalid oid
+ error: packed-refs line 3: badPackedRefEntry: has no space after oid '\''$branch_1_oid'\'' but with '\''x'\''
+ error: packed-refs line 4: badRefName: has bad refname '\'' refs/heads/bad-branch'\''
+ error: packed-refs line 5: badRefName: has bad refname '\''refs/heads/branch.'\''
+ error: packed-refs line 7: badPackedRefEntry: '\''$short_oid'\'' has invalid peeled oid
+ error: packed-refs line 8: badRefName: has bad refname '\''refs/tags/annotated-tag-4.'\''
+ error: packed-refs line 9: badPackedRefEntry: has trailing garbage after peeled oid '\'' garbage'\''
+ EOF
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
^ permalink raw reply related [flat|nested] 168+ messages in thread
* [PATCH v2 7/8] packed-backend: check whether the "packed-refs" is sorted
2025-01-30 4:04 ` [PATCH v2 0/8] add more ref consistency checks shejialuo
` (5 preceding siblings ...)
2025-01-30 4:07 ` [PATCH v2 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
@ 2025-01-30 4:08 ` shejialuo
2025-01-30 19:02 ` Junio C Hamano
2025-02-03 8:40 ` Patrick Steinhardt
2025-01-30 4:08 ` [PATCH v2 8/8] builtin/fsck: add `git refs verify` child process shejialuo
2025-02-06 5:56 ` [PATCH v3 0/8] add more ref consistency checks shejialuo
8 siblings, 2 replies; 168+ messages in thread
From: shejialuo @ 2025-01-30 4:08 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
We always try to sort the "packed-refs" file in ascending order of
refname. So, we should add a check to verify that "packed-refs" is
sorted.
We already have code to parse the content. Let's create a new structure
"fsck_packed_ref_entry" to store the state during the parsing process
for every entry. It may seem that we could just add a new "struct strbuf
refname" to "struct fsck_packed_ref_entry", store the refname there
during parsing, and compare later. However, this is not a good design
for the following reasons:
1. Because we need to keep the state for the whole lifetime of the
check, we would consume a lot of memory if there are many entries
in the "packed-refs" file.
2. More importantly, we could not reuse the existing comparison
functions, which would lead to duplicated code.
So, instead of storing a "struct strbuf", let's use the existing
structure "struct snapshot_record". This way we can use the existing
function "cmp_packed_ref_records".
However, this function needs an extra parameter of type "struct
snapshot". Extract the common part into a new function
"cmp_packed_refname" and reuse it for the comparison.
Then, create a new function "packed_fsck_ref_sorted" which uses the new
fsck message "packedRefUnsorted(ERROR)" to report to the user.
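
The idea of comparing refnames in place, without copying them, can be
sketched as follows. The names "cmp_refname_lines" and "first_unsorted"
are hypothetical, the pointers are assumed to reference
newline-terminated refnames inside the file buffer, and treating equal
neighbours as "in order" is an assumption of this sketch:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compare two newline-terminated refnames byte-wise, similar to
 * strcmp(3) but with '\n' as the terminator instead of NUL.
 */
static int cmp_refname_lines(const char *r1, const char *r2)
{
	assert(r1 && r2);
	while (1) {
		if (*r1 == '\n')
			return *r2 == '\n' ? 0 : -1;
		if (*r1 != *r2) {
			if (*r2 == '\n')
				return 1;
			return (unsigned char)*r1 < (unsigned char)*r2 ? -1 : 1;
		}
		r1++;
		r2++;
	}
}

/*
 * Return the index of the first refname that sorts before its
 * predecessor, or -1 if the sequence is in ascending order.
 */
static int first_unsorted(const char **refnames, size_t nr)
{
	for (size_t i = 1; i < nr; i++)
		if (cmp_refname_lines(refnames[i - 1], refnames[i]) > 0)
			return (int)i;
	return -1;
}
```

Because only a pointer per entry is kept, the memory cost stays small
even for a large "packed-refs" file, which is the motivation for reusing
"struct snapshot_record" given above.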
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 3 +
fsck.h | 1 +
refs/packed-backend.c | 100 +++++++++++++++++++++++++++++++---
t/t0602-reffiles-fsck.sh | 38 +++++++++++++
4 files changed, 135 insertions(+), 7 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 2a7ec7592e..7a11d35c5e 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -190,6 +190,9 @@
`packedRefMissingHeader`::
(INFO) The "packed-refs" file does not contain the header.
+`packedRefUnsorted`::
+ (ERROR) The "packed-refs" file is not sorted.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
diff --git a/fsck.h b/fsck.h
index 40126242a4..0d3d1045ae 100644
--- a/fsck.h
+++ b/fsck.h
@@ -56,6 +56,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
+ FUNC(PACKED_REF_UNSORTED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 271c740728..b250f987b2 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -300,14 +300,9 @@ struct snapshot_record {
size_t len;
};
-static int cmp_packed_ref_records(const void *v1, const void *v2,
- void *cb_data)
-{
- const struct snapshot *snapshot = cb_data;
- const struct snapshot_record *e1 = v1, *e2 = v2;
- const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
- const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+static int cmp_packed_refname(const char *r1, const char *r2)
+{
while (1) {
if (*r1 == '\n')
return *r2 == '\n' ? 0 : -1;
@@ -322,6 +317,17 @@ static int cmp_packed_ref_records(const void *v1, const void *v2,
}
}
+static int cmp_packed_ref_records(const void *v1, const void *v2,
+ void *cb_data)
+{
+ const struct snapshot *snapshot = cb_data;
+ const struct snapshot_record *e1 = v1, *e2 = v2;
+ const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
+ const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+
+ return cmp_packed_refname(r1, r2);
+}
+
/*
* Compare a snapshot record at `rec` to the specified NUL-terminated
* refname.
@@ -1768,6 +1774,28 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
+struct fsck_packed_ref_entry {
+ int line_number;
+
+ struct snapshot_record record;
+};
+
+static struct fsck_packed_ref_entry *create_fsck_packed_ref_entry(int line_number,
+ const char *start)
+{
+ struct fsck_packed_ref_entry *entry = xcalloc(1, sizeof(*entry));
+ entry->line_number = line_number;
+ entry->record.start = start;
+ return entry;
+}
+
+static void free_fsck_packed_ref_entries(struct fsck_packed_ref_entry **entries, int nr)
+{
+ for (int i = 0; i < nr; i++)
+ free(entries[i]);
+ free(entries);
+}
+
static int packed_fsck_ref_next_line(struct fsck_options *o,
struct strbuf *packed_entry, const char *start,
const char *eof, const char **eol)
@@ -1893,13 +1921,60 @@ static int packed_fsck_ref_main_line(struct fsck_options *o,
return 0;
}
+static int packed_fsck_ref_sorted(struct fsck_options *o,
+ struct ref_store *ref_store,
+ struct fsck_packed_ref_entry **entries,
+ int nr)
+{
+ size_t hexsz = ref_store->repo->hash_algo->hexsz;
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct strbuf refname1 = STRBUF_INIT;
+ struct strbuf refname2 = STRBUF_INIT;
+ int ret = 0;
+
+ for (int i = 1; i < nr; i++) {
+ const char *r1 = entries[i - 1]->record.start + hexsz + 1;
+ const char *r2 = entries[i]->record.start + hexsz + 1;
+
+ if (cmp_packed_refname(r1, r2) >= 0) {
+ const char *err_fmt =
+ "refname '%s' is not less than next refname '%s'";
+ const char *eol;
+ eol = memchr(entries[i - 1]->record.start, '\n',
+ entries[i - 1]->record.len);
+ strbuf_add(&refname1, r1, eol - r1);
+ eol = memchr(entries[i]->record.start, '\n',
+ entries[i]->record.len);
+ strbuf_add(&refname2, r2, eol - r2);
+
+ strbuf_addf(&packed_entry, "packed-refs line %d",
+ entries[i - 1]->line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_UNSORTED,
+ err_fmt, refname1.buf, refname2.buf);
+ goto cleanup;
+ }
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ strbuf_release(&refname1);
+ strbuf_release(&refname2);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
struct ref_store *ref_store,
const char *start, const char *eof)
{
struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_packed_ref_entry **entries;
struct strbuf refname = STRBUF_INIT;
+ int entry_alloc = 20;
int line_number = 1;
+ int entry_nr = 0;
const char *eol;
int ret = 0;
@@ -1919,7 +1994,13 @@ static int packed_fsck_ref_content(struct fsck_options *o,
"missing header line");
}
+ ALLOC_ARRAY(entries, entry_alloc);
while (start < eof) {
+ struct fsck_packed_ref_entry *entry
+ = create_fsck_packed_ref_entry(line_number, start);
+
+ ALLOC_GROW(entries, entry_nr + 1, entry_alloc);
+ entries[entry_nr++] = entry;
strbuf_reset(&packed_entry);
strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
@@ -1935,11 +2016,16 @@ static int packed_fsck_ref_content(struct fsck_options *o,
start = eol + 1;
line_number++;
}
+ entry->record.len = start - entry->record.start;
}
+ if (!ret)
+ ret |= packed_fsck_ref_sorted(o, ref_store, entries, entry_nr);
+
strbuf_release(&packed_entry);
strbuf_release(&refname);
strbuf_release(&packed_entry);
+ free_fsck_packed_ref_entries(entries, entry_nr);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index e4b4a58684..9d802d71a9 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -727,4 +727,42 @@ test_expect_success 'packed-refs content should be checked' '
)
'
+test_expect_success 'packed-ref sorted should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
+ printf "%s %s\n" "$branch_2_oid" "$refname1" >>.git/packed-refs &&
+ printf "%s %s\n" "$branch_1_oid" "$refname2" >>.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: packedRefUnsorted: refname '\''$refname1'\'' is not less than next refname '\''$refname2'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err &&
+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
+ printf "%s %s\n" "$tag_1_oid" "$refname3" >>.git/packed-refs &&
+ printf "^%s\n" "$tag_1_peeled_oid" >>.git/packed-refs &&
+ printf "%s %s\n" "$branch_2_oid" "$refname2" >>.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: packedRefUnsorted: refname '\''$refname3'\'' is not less than next refname '\''$refname2'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v2 8/8] builtin/fsck: add `git refs verify` child process
2025-01-30 4:04 ` [PATCH v2 0/8] add more ref consistency checks shejialuo
` (6 preceding siblings ...)
2025-01-30 4:08 ` [PATCH v2 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
@ 2025-01-30 4:08 ` shejialuo
2025-01-30 19:03 ` Junio C Hamano
2025-02-03 8:40 ` Patrick Steinhardt
2025-02-06 5:56 ` [PATCH v3 0/8] add more ref consistency checks shejialuo
8 siblings, 2 replies; 168+ messages in thread
From: shejialuo @ 2025-01-30 4:08 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
So far, we have implemented the ref consistency checks for both
"files-backend" and "packed-backend". Although we would check some
things redundantly, this won't cause trouble. So, let's integrate them
into the "git-fsck(1)" command to get feedback from the users. Also, by
calling "git refs verify" in "git-fsck(1)", we make sure that the newly
added checks don't break.
Introduce a new function "fsck_refs" that initializes and runs a child
process to execute the "git refs verify" command. To give the user
feedback, create a progress meter whose total is 1. It's hard to know
how many loose refs we will check at this point. We might improve this
later.
We run this function first in "git-fsck(1)" because we don't want the
existing code of "git-fsck(1)", which implicitly checks the consistency
of refs, to make the program die before the new checks run.
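As a rough illustration of the "run a child and fold its exit status into an error flag" pattern, here is a self-contained sketch. It deliberately swaps Git's internal run-command API for plain POSIX fork/execvp/waitpid, and "run_child" is a hypothetical helper name, so treat it as an analogy rather than what the patch does.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] with the given NULL-terminated argument vector and wait
 * for it. Returns the child's exit code, 127 if it could not be
 * executed, or -1 if forking or waiting failed or the child was
 * killed by a signal.
 */
static int run_child(char *const argv[])
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (!pid) {
		execvp(argv[0], argv);
		_exit(127);	/* only reached when execvp() fails */
	}
	while (waitpid(pid, &status, 0) < 0)
		if (errno != EINTR)
			return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller in the spirit of "fsck_refs" would pass { "git", "refs", "verify", NULL } and set its own error bit whenever the return value is non-zero, instead of dying.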
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
builtin/fsck.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 7a4dcb0716..9a8613d07f 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -905,6 +905,34 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress)
return res;
}
+static void fsck_refs(struct repository *r)
+{
+ struct child_process refs_verify = CHILD_PROCESS_INIT;
+ struct progress *progress = NULL;
+ uint64_t progress_num = 1;
+
+ if (show_progress)
+ progress = start_progress(r, _("Checking ref database"),
+ progress_num);
+
+ if (verbose)
+ fprintf_ln(stderr, _("Checking ref database"));
+
+ child_process_init(&refs_verify);
+ refs_verify.git_cmd = 1;
+ strvec_pushl(&refs_verify.args, "refs", "verify", NULL);
+ if (verbose)
+ strvec_push(&refs_verify.args, "--verbose");
+ if (check_strict)
+ strvec_push(&refs_verify.args, "--strict");
+
+ if (run_command(&refs_verify))
+ errors_found |= ERROR_REFS;
+
+ display_progress(progress, 1);
+ stop_progress(&progress);
+}
+
static char const * const fsck_usage[] = {
N_("git fsck [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]\n"
" [--[no-]full] [--strict] [--verbose] [--lost-found]\n"
@@ -970,6 +998,8 @@ int cmd_fsck(int argc,
git_config(git_fsck_config, &fsck_obj_options);
prepare_repo_settings(the_repository);
+ fsck_refs(the_repository);
+
if (connectivity_only) {
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
for_each_packed_object(the_repository,
--
2.48.1
* Re: [PATCH v2 1/8] t0602: use subshell to ensure working directory unchanged
2025-01-30 4:06 ` [PATCH v2 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
@ 2025-01-30 17:53 ` Junio C Hamano
0 siblings, 0 replies; 168+ messages in thread
From: Junio C Hamano @ 2025-01-30 17:53 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> For every test, we would execute the command "cd repo" in the first but
> we never execute the command "cd .." to restore the working directory.
> However, it's either not a good idea use above way. Because if any test
> fails between "cd repo" and "cd ..", the "cd .." will never be reached.
> And we cannot correctly restore the working directory.
>
> Let's use subshell to ensure that the current working directory could be
> restored to the correct path.
>
> Mentored-by: Patrick Steinhardt <ps@pks.im>
> Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
> t/t0602-reffiles-fsck.sh | 967 ++++++++++++++++++++-------------------
> 1 file changed, 494 insertions(+), 473 deletions(-)
Note for bystanders who may be interested in helping to ensure
correctness of this step.
The patch meant for the machines we see here is unreadable for
humans [*], but the result of applying it and then running
$ git show -wW t/
gives me a very clear "from here to there, the entire thing now has
a pair of () around it" pattern. If you look at the clean-up step
each test piece defines with test_when_finished at the front, and
compare it with the directory name the test repository "git init"
in each test piece creates and "cd" goes into, it is fairly easy to
see that the patch is doing the right thing without doing anything
unwanted.
All the here-doc in the test are now indented one level deeper, but
you can check that they use "<<-EOF" to be oblivious to the leading
tabs, making this conversion a safe one.
One thing that is hard to validate by code inspection alone is
- This change will change the commit timestamps of the commits
created by "test_commit" helper function, now that they are run
in subshells to get their internal clock reset in each test
piece.
But if the tests rely on the exact commit object names, running the
resulting script just once would be sufficient to notice.
Overall, very nicely done.
Queued. Thanks.
[Footnote]
* No, I do not mean to say that you should spend time trying to
make the message readable by humans in a case like this. A patch
that can be mechanically processed and leave the byte sequence
you intended to give the recipients is exactly what we want.
* Re: [PATCH v2 2/8] builtin/refs: get worktrees without reading head info
2025-01-30 4:07 ` [PATCH v2 2/8] builtin/refs: get worktrees without reading head info shejialuo
@ 2025-01-30 18:04 ` Junio C Hamano
2025-01-31 13:29 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Junio C Hamano @ 2025-01-30 18:04 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> Although this behavior has no harm for the program, it will
> short-circuit the program. When the users execute "git refs verify" or
> "git fsck", we don't want to simply die the program but rather show the
> warnings or errors as many as possible to info the users.
"info" is not a verb; "inform"?
I can understand what you want to say with "show the warnings or
errors as many as possible", but giving errors on the same issue
many times is not what you meant---rather, you want the checker to
keep going and discover errors in many _other things_, after it
finds a single error in "HEAD".
..., we do want to diagnose a broken "HEAD", but we want to
notice as many breakages on other refs as we can instead of
dying after finding the first breakage. Dying on a broken
"HEAD" done by get_worktrees() goes against this goal.
or something, perhaps. Such a rewrite makes the sentence "Although
... short-circuit the program." unnecessary.
> So, we should
> avoid reading the head info.
With one reservation. We still want to diagnose a broken "HEAD", so
I'd probably strike this sentence out, and add a statement that says
we still check the contents of "HEAD" elsewhere as a substitute at
the end of the proposed commit log message, if I were writing it,
after explaining the use of get_worktrees_without_reading_head()
you did in the following two paragraphs (both of which read well).
Thanks.
* Re: [PATCH v2 3/8] packed-backend: check whether the "packed-refs" is regular
2025-01-30 4:07 ` [PATCH v2 3/8] packed-backend: check whether the "packed-refs" is regular shejialuo
@ 2025-01-30 18:23 ` Junio C Hamano
2025-01-31 13:54 ` shejialuo
2025-02-03 8:40 ` Patrick Steinhardt
1 sibling, 1 reply; 168+ messages in thread
From: Junio C Hamano @ 2025-01-30 18:23 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> It might seems that the method one is much easier than method two.
> However, method one has a significant drawback. When we have checked the
> file mode using "lstat", we will need to read the file content, there is
> a possibility that when finishing reading the file content to the
> memory, the file could be changed into a symlink and we cannot notice.
To me, the above sounds like saying:
The user can run 'git refs verify' and it may declare that refs
are all good, and then somebody else can come in and turn the
packed-refs file into a bad one, but the user will not notice
the mischief until the check is run the next time.
It is just the time that somebody else comes in becomes a bit
earlier than the time the 'git refs verify' command finishes, and
there is no fundamental difference.
> With method two, we could get the "fd" firstly. Even if the file is
> changed into a symlink, we could still operate the "fd" in the memory
> which is consistent across the checking which avoids race condition.
The end result is the same as with the lstat(2) approach, isn't it,
though? 'git refs verify' may say "I opened the file without
following symlink and checked the contents, which turned out to be
perfectly fine". But because that somebody else came in just after
the command did nofollow-open and swapped the packed-refs file, the
repository has a packed-refs file that is not a regular file after
the command returns success. So I am not sure if I am following
your argument to favor the latter over the former. What am I
missing?
As long as both approaches are equally portable, I do not think it
matters which one we pick from correctness point of view, and we can
pick the one that is easier to use to implement the feature.
On a platform without O_NOFOLLOW, open_nofollow() falls back to the
lstat and open, so your "open_nofollow() is better than lstat() and
open()" argument does not portably work, though.
> Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
> the user if "packed-refs" is not a regular file.
Good. Say "regular file" on the commit title, too, and it would be
perfect.
> -static int packed_fsck(struct ref_store *ref_store UNUSED,
> - struct fsck_options *o UNUSED,
> +static int packed_fsck(struct ref_store *ref_store,
> + struct fsck_options *o,
> struct worktree *wt)
> {
> + struct packed_ref_store *refs = packed_downcast(ref_store,
> + REF_STORE_READ, "fsck");
> + int ret = 0;
> + int fd;
>
> if (!is_main_worktree(wt))
> - return 0;
> + goto cleanup;
>
> - return 0;
> + if (o->verbose)
> + fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
> +
> + fd = open_nofollow(refs->path, O_RDONLY);
> + if (fd < 0) {
> + /*
> + * If the packed-refs file doesn't exist, there's nothing
> + * to check.
> + */
> + if (errno == ENOENT)
> + goto cleanup;
> +
> + if (errno == ELOOP) {
> + struct fsck_ref_report report = { 0 };
> + report.path = "packed-refs";
> + ret = fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_REF_FILETYPE,
> + "not a regular file");
> + goto cleanup;
> + }
> +
> + ret = error_errno(_("unable to open %s"), refs->path);
> + goto cleanup;
> + }
> +
> +cleanup:
> + return ret;
> }
Looking good.
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index cf7a202d0d..42c8d4ca1e 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -617,4 +617,26 @@ test_expect_success 'ref content checks should work with worktrees' '
> )
> '
>
> +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + (
> + cd repo &&
> + test_commit default &&
> + git branch branch-1 &&
> + git branch branch-2 &&
> + git branch branch-3 &&
> + git pack-refs --all &&
> +
> + mv .git/packed-refs .git/packed-refs-back &&
> + ln -sf packed-refs-bak .git/packed-refs &&
> + test_must_fail git refs verify 2>err &&
> + cat >expect <<-EOF &&
> + error: packed-refs: badRefFiletype: not a regular file
> + EOF
> + rm .git/packed-refs &&
> + test_cmp expect err
> + )
> +'
> +
> test_done
OK. I notice that the previous step did not have any new test
associated with it. Perhaps we can corrupt "HEAD" *and* replace
packed-refs file with a symbolic link (or do some other damage
to the refs) and make sure both breakages are reported?
It does not have to be done in this step, and certainly not as a
part of this single test this step adds, but we'd want it tested
somewhere.
Thanks.
* Re: [PATCH v2 4/8] packed-backend: add "packed-refs" header consistency check
2025-01-30 4:07 ` [PATCH v2 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
@ 2025-01-30 18:58 ` Junio C Hamano
2025-01-31 14:23 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Junio C Hamano @ 2025-01-30 18:58 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> In "packed-backend.c::create_snapshot", if there is a header (the line
> which starts with '#'), we will check whether the line starts with "#
> pack-refs with:". As we are going to implement the header consistency
> check, we should port this check into "packed_fsck".
>
> However, the above check is not enough, this is because "git pack-refs"
> will always write "PACKED_REFS_HEADER" which is a constant string to the
> "packed-refs" file. So, we should check the following things for the
> header.
I haven't done history digging in this area for a while, but we
should make sure we are not flagging a file that was written in
ancient version of Git whose repository is still supported.
> 1. If the header does not exist, we may report an error to the user
> because it should exist, but we do allow no header in "packed-refs"
> file. So, create a new fsck message "packedRefMissingHeader(INFO)" to
> warn the user and also keep compatibility.
Are we sure "it should exist"? I think the header did not exist
before "Git v1.5.0". I didn't check with other reimplementations of
Git (like jgit or libgit2), but as long as our reading side of the
runtime allows a packed-refs file without the header without
complaint, I do not think it is a good idea to treat it as a
report-worthy event from "git fsck".
> 2. If the header content does not start with "# packed-ref with:", we
> should report an error just like what "create_snapshot" does. So,
> create a new fsck message "badPackedRefHeader(ERROR)" for this.
This I can agree with. If the first line begins with "#" but not
with that string (with a trailing SP), that is a sign that it may
not even be a valid packed-refs file, which is a report-worthy
event.
> 3. If the header content is not the same as the constant string
> "PACKED_REFS_HEADER", ideally, we should report an error to the user.
NO. THAT IS NOT IDEAL AT ALL.
The header was written like this:
/* perhaps other traits later as well */
fprintf(cbdata.refs_file, "# pack-refs with: peeled \n");
in the older versions of Git before it was made into a separate
preprocessor macro and lost the comment (the above excerpt is from
"git show v1.5.0:builtin-pack-refs.c").
Notice "other traits later" in the comment?
The thing is _designed_ to be extensible. In fact, these days we
support a few more traits
static const char PACKED_REFS_HEADER[] =
"# pack-refs with: peeled fully-peeled sorted \n";
(an excerpt from the current refs/packed-backend.c).
Reporting an error when you see something written by an older
version of Git is far from ideal.
> However, we allow other contents as long as the header content starts
> with "# packed-ref with:". To keep compatibility, create a new fsck
> message "unknownPackedRefHeader(INFO)" to warn about this. We may
> tighten this rule in the future.
Whatever we do, what we do with an unknown trait should be in line
with what the runtime does. If the runtime failed (we do not, but
this is to illustrate the principle [*]) on a packed-refs file
without "sorted" trait, noticing that "sorted" is not there and
flagging as an error is a good thing to do. But if the runtime
gracefully degrades and sorts the list of refs read from such a
packed-refs file before continuing, then a packed-refs file that
lack "sorted" trait is not a report-worthy event.
I do not offhand recall if we introduced the concept of mandatory vs
optional traits in the packed-refs part of the system (like we have
in the index extension subsystem, where a version of Git that
encounters an unknown *and* mandatory index extension must refuse to
touch the repository), but if there is a mandatory trait declared in
the header that our version of Git does not understand, it is a
report-worthy event that must be flagged with "git refs verify".
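To illustrate what "parse the header to tell what traits it declares" could look like, here is a minimal standalone sketch. It is not the runtime's parser: "header_has_trait" and the HEADER_PREFIX macro are hypothetical names introduced for this example, and the only assumption about the format is the one visible in this thread, namely a "# pack-refs with:" prefix followed by space-separated trait words.

```c
#include <string.h>

#define HEADER_PREFIX "# pack-refs with:"

/*
 * Return 1 if the header line at `line` (terminated by '\n' or NUL)
 * starts with HEADER_PREFIX and lists `trait` as a whole
 * space-separated word, 0 otherwise.
 */
static int header_has_trait(const char *line, const char *trait)
{
	size_t prefix_len = strlen(HEADER_PREFIX);
	size_t trait_len = strlen(trait);
	const char *p;

	if (!trait_len || strncmp(line, HEADER_PREFIX, prefix_len))
		return 0;

	p = line + prefix_len;
	while (*p && *p != '\n') {
		const char *word;

		while (*p == ' ')
			p++;
		word = p;
		while (*p && *p != ' ' && *p != '\n')
			p++;
		/* Whole-word match, so "peeled" does not match "fully-peeled". */
		if ((size_t)(p - word) == trait_len &&
		    !memcmp(word, trait, trait_len))
			return 1;
	}
	return 0;
}
```

With a helper of this shape, a checker can stay silent about traits it does not know and enable trait-dependent checks (for example, the ordering check only when "sorted" is present) instead of comparing the whole line against one fixed string.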
> +static int packed_fsck_ref_header(struct fsck_options *o, const char *start, const char *eol)
> +{
> + const char *err_fmt = NULL;
> + int fsck_msg_id = -1;
> +
> + if (!starts_with(start, "# pack-refs with:")) {
> + err_fmt = "'%.*s' does not start with '# pack-refs with:'";
> + fsck_msg_id = FSCK_MSG_BAD_PACKED_REF_HEADER;
> + } else if (strncmp(start, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER))) {
> + err_fmt = "'%.*s' is an unknown packed-refs header";
> + fsck_msg_id = FSCK_MSG_UNKNOWN_PACKED_REF_HEADER;
> + }
As I outlined above, this is totally unacceptable.
Inspecting the header is good, but if this code claims to be a
checker, it should do at least what the runtime does, i.e. parse the
header to tell what traits the packed-refs file declares, not just
assuming that it is a fixed string. And error on unknown trait(s)
if they are mandatory (if such a concept is implemented in the
runtime reading side). Informing on an unknown and optional
trait(s) I can live with, but personally I wouldn't recommend it.
In other words, report loudly if it is an error, but otherwise stay
silent if we know we tolerate it well.
> +static int packed_fsck_ref_content(struct fsck_options *o,
> + const char *start, const char *eof)
> +{
> + struct strbuf packed_entry = STRBUF_INIT;
> + int line_number = 1;
We limit ourselves with about 1 billion refs in the packed-refs
file, which may be plenty, but I do not quite understand the use of
this variable. There is no loop inside this so ...
> + const char *eol;
> + int ret = 0;
> +
> + strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
... this is always line #1, and then
> + ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
> + if (*start == '#') {
> + ret |= packed_fsck_ref_header(o, start, eol);
> +
> + start = eol + 1;
> + line_number++;
... it may be incremented, but upon returning from the function, it
is lost.
Perhaps you wanted to make it a function-scope static, but then you
are allowed to read one single packed-refs file during the life of
your process before you exit, which I am not sure is what you want?
> + } else {
> + struct fsck_ref_report report = { 0 };
> + report.path = "packed-refs";
> +
> + ret |= fsck_report_ref(o, &report,
> + FSCK_MSG_PACKED_REF_MISSING_HEADER,
> + "missing header line");
> + }
> +
> + strbuf_release(&packed_entry);
> + return ret;
> +}
I'll stop here for now.
Thanks.
* Re: [PATCH v2 7/8] packed-backend: check whether the "packed-refs" is sorted
2025-01-30 4:08 ` [PATCH v2 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
@ 2025-01-30 19:02 ` Junio C Hamano
2025-01-31 14:35 ` shejialuo
2025-02-03 8:40 ` Patrick Steinhardt
1 sibling, 1 reply; 168+ messages in thread
From: Junio C Hamano @ 2025-01-30 19:02 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> We will always try to sort the "packed-refs" increasingly by comparing
> the refname. So, we should add checks to verify whether the "packed-refs"
> is sorted.
Do this _ONLY_ when the packed-refs file has a header that declares
the "sorted" trait. Insisting that a packed-refs file without that
trait be sorted would mean you are stricter than the runtime contract
allows.
> +struct fsck_packed_ref_entry {
> + int line_number;
> +
> + struct snapshot_record record;
> +};
Not a huge deal, as 1 billion is still plenty of a large number, but
the same comment on the line-number applies here. We might want to
consistently use ulong for line numbers of files we read from.
* Re: [PATCH v2 8/8] builtin/fsck: add `git refs verify` child process
2025-01-30 4:08 ` [PATCH v2 8/8] builtin/fsck: add `git refs verify` child process shejialuo
@ 2025-01-30 19:03 ` Junio C Hamano
2025-01-31 14:37 ` shejialuo
2025-02-03 8:40 ` Patrick Steinhardt
1 sibling, 1 reply; 168+ messages in thread
From: Junio C Hamano @ 2025-01-30 19:03 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> +static void fsck_refs(struct repository *r)
> +{
> + struct child_process refs_verify = CHILD_PROCESS_INIT;
> + struct progress *progress = NULL;
> + uint64_t progress_num = 1;
> +
> + if (show_progress)
> + progress = start_progress(r, _("Checking ref database"),
> + progress_num);
I do not see why we need an extra variable progress_num here. Just
passing a literal constant 1 should be sufficient. The called
function has a function prototype to help the compiler promote it to
the appropriate type.
Thanks.
* Re: [PATCH v2 2/8] builtin/refs: get worktrees without reading head info
2025-01-30 18:04 ` Junio C Hamano
@ 2025-01-31 13:29 ` shejialuo
2025-01-31 16:16 ` Junio C Hamano
0 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-31 13:29 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
On Thu, Jan 30, 2025 at 10:04:45AM -0800, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > Although this behavior has no harm for the program, it will
> > short-circuit the program. When the users execute "git refs verify" or
> > "git fsck", we don't want to simply die the program but rather show the
> > warnings or errors as many as possible to info the users.
>
> "info" is not a verb; "inform"?
>
Let me improve this in the next version.
> I can understand what you want to say with "show the warnings or
> errors as many as possible", but giving errors on the same issue
> many times is not what you meant
Yes, this is correct.
> ---rather, you want the checker to
> keep going and discover errors in many _other things_, after it
> finds a single error in "HEAD".
>
I think this is a misunderstanding. Let me explain more to you.
1. If the content of the "HEAD" is not correct, we won't detect the
current directory as a valid Git repository.
2. If the referent of the "HEAD" is not in the "packed-refs", the
referent must be a loose ref or not exist. In this situation, we will
never touch the packed backend.
3. If the referent of the "HEAD" is in the "packed-refs", it will call
"create_snapshot" to create the snapshot. This function checks the
following things:
1. Check the correctness of the header.
2. Check via the "verify_buffer_safe" method.
So, even if the referent entry is not correct in the "packed-refs", the
program won't die. But a failure in either of the above two checks will
make the program die.
I want to say that we cannot find any error in "HEAD" at present, as
described above. From my perspective, we should retain the paragraph:
> Although this behavior has no harm for the program...
But we should change the statement
> we don't want to simply die the program but rather show the
> warnings or errors as many as possible to info the users. So, we should
> avoid reading the head info.
to
We should avoid reading the head information, which may trigger a
read in the packed backend whose stricter checks make the program
die. Instead, we should continue to check the other parts of the
"packed-refs" file completely.
> With one reservation. We still want to diagnose a broken "HEAD", so
> I'd probably strike this sentence out, and add a statement that says
> we still check the contents of "HEAD" elsewhere as a substitute at
> the end of the proposed commit log message, if I were writing it,
> after explaining the use of get_worktrees_without_reading_head()
> you did in the following two paragraphs (both of which read well).
I want to say that we cannot check the content of "HEAD" itself. If
the content of "HEAD" is not correct, we cannot detect the current
directory as a valid git repository. So, there is no need to say "we
will check the contents of 'HEAD' elsewhere".
I think the misunderstanding is that you think that if "HEAD" is not
correct, the program will die, but actually it does not.
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v2 3/8] packed-backend: check whether the "packed-refs" is regular
2025-01-30 18:23 ` Junio C Hamano
@ 2025-01-31 13:54 ` shejialuo
2025-01-31 16:20 ` Junio C Hamano
0 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-01-31 13:54 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
On Thu, Jan 30, 2025 at 10:23:15AM -0800, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > It might seem that method one is much easier than method two.
> > However, method one has a significant drawback. When we have checked the
> > file mode using "lstat", we will need to read the file content, there is
> > a possibility that when finishing reading the file content to the
> > memory, the file could be changed into a symlink and we cannot notice.
>
> To me, the above sounds like saying:
>
> The user can run 'git refs verify' and it may declare that refs
> are all good, and then somebody else can come in and turn the
> packed-refs file into a bad one, but the user will not notice
> the mischief until the check is run the next time.
>
Yes, it is.
> It is just the time that somebody else comes in becomes a bit
> earlier than the time the 'git refs verify' command finishes, and
> there is no fundamental difference.
>
> > With method two, we could get the "fd" firstly. Even if the file is
> > changed into a symlink, we could still operate the "fd" in the memory
> > which is consistent across the checking which avoids race condition.
>
> The end result is the same with the lstat(2) approach, isn't it,
> though? 'git refs verify' may say "I opened the file without
> following symlink and checked the contents, which turned out to be
> perfectly fine". But because that somebody else came in just after
> the command did nofollow-open and swapped the packed-refs file, the
> repository has a packed-refs file that is not a regular file after
> the command returns success. So I am not sure if I am following
> your argument to favor the latter over the former. What am I
> missing?
>
Let me give you some background. In version 1, I used the following
approach:
```c
lstat(...);
if (!S_ISREG(...))
	report_error(...);
strbuf_read(...);
```
Patrick told me that between the `S_ISREG` check and `strbuf_read`,
"packed-refs" could be converted into a symlink.
So, my idea was to use `open_nofollow`: once we have the file
descriptor, no matter what happens to the `packed-refs` file (deleted or
changed into a symlink), we can operate on the file descriptor and read
its content.
However, on a platform with O_NOFOLLOW, this situation will also happen.
So, I think we may just use "open_nofollow" now and don't talk about the
method one at all to avoid confusing readers.
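For readers following the lstat-versus-descriptor discussion, here is a minimal sketch of the descriptor-based approach on a platform that does provide O_NOFOLLOW. The helper name `open_regular_nofollow` is hypothetical and is not Git's `open_nofollow()`; it only illustrates that the file type is checked on the already opened descriptor rather than on the path.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for illustration, not Git code):
 * open `path` without following a trailing symlink, then confirm via
 * the descriptor that it refers to a regular file.
 * Returns the descriptor, or -1 on failure.
 */
static int open_regular_nofollow(const char *path)
{
	struct stat st;
	int fd;

	assert(path);
	fd = open(path, O_RDONLY | O_NOFOLLOW);
	if (fd < 0)
		return -1;	/* open() fails when `path` is a symlink */

	/* fstat() inspects the opened file, not whatever the path names now. */
	if (fstat(fd, &st) < 0 || !S_ISREG(st.st_mode)) {
		close(fd);
		return -1;
	}
	return fd;
}
```

As the replies in this thread point out, this does not stop someone from swapping the file after the check has finished, so it should be read as one way to implement the check and not as a stronger guarantee than lstat(2) followed by open(2).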
> As long as both approaches are equally portable, I do not think it
> matters which one we pick from correctness point of view, and we can
> pick the one that is easier to use to implement the feature.
>
> On a platform without O_NOFOLLOW, open_nofollow() falls back to the
> lstat and open, so your "open_nofollow() is better than lstat() and
> open()" argument does not portably work, though.
>
Yes, actually in my first implementation, I didn't notice this. But the
CI told me that and I finally chose "open_nofollow".
> > Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
> > the user if "packed-refs" is not a regular file.
>
> Good. Say "regular file" on the commit title, too, and it would be
> perfect.
>
Let me improve this in the next version.
> > diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> > index cf7a202d0d..42c8d4ca1e 100755
> > --- a/t/t0602-reffiles-fsck.sh
> > +++ b/t/t0602-reffiles-fsck.sh
> > @@ -617,4 +617,26 @@ test_expect_success 'ref content checks should work with worktrees' '
> > )
> > '
> >
> > +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
> > + test_when_finished "rm -rf repo" &&
> > + git init repo &&
> > + (
> > + cd repo &&
> > + test_commit default &&
> > + git branch branch-1 &&
> > + git branch branch-2 &&
> > + git branch branch-3 &&
> > + git pack-refs --all &&
> > +
> > + mv .git/packed-refs .git/packed-refs-back &&
> > + ln -sf packed-refs-back .git/packed-refs &&
> > + test_must_fail git refs verify 2>err &&
> > + cat >expect <<-EOF &&
> > + error: packed-refs: badRefFiletype: not a regular file
> > + EOF
> > + rm .git/packed-refs &&
> > + test_cmp expect err
> > + )
> > +'
> > +
> > test_done
>
> OK. I notice that the previous step did not have any new test
> associated with it. Perhaps we can corrupt "HEAD" *and* replace
> packed-refs file with a symbolic link (or do some other damage
> to the refs) and make sure both breakages are reported?
>
As I have said in the previous comment, we cannot detect the error if
"HEAD" itself is corrupted. However, we will check the referent
later. So, we don't need to do this.
> It does not have to be done in this step, and certainly not as a
> part of this single test this step adds, but we'd want it tested
> somewhere.
>
If we need to check the referent of "HEAD" in "packed-refs", we
could do this in a later test. I could cover this in [PATCH 6/8].
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v2 4/8] packed-backend: add "packed-refs" header consistency check
2025-01-30 18:58 ` Junio C Hamano
@ 2025-01-31 14:23 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-01-31 14:23 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
On Thu, Jan 30, 2025 at 10:58:32AM -0800, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > In "packed-backend.c::create_snapshot", if there is a header (the line
> > which starts with '#'), we will check whether the line starts with "#
> > pack-refs with:". As we are going to implement the header consistency
> > check, we should port this check into "packed_fsck".
> >
> > However, the above check is not enough, this is because "git pack-refs"
> > will always write "PACKED_REFS_HEADER" which is a constant string to the
> > "packed-refs" file. So, we should check the following things for the
> > header.
>
> I haven't done history digging in this area for a while, but we
> should make sure we are not flagging a file that was written in
> ancient version of Git whose repository is still supported.
>
Understood.
> > 1. If the header does not exist, we may report an error to the user
> > because it should exist, but we do allow no header in "packed-refs"
> > file. So, create a new fsck message "packedRefMissingHeader(INFO)" to
> > warn the user and also keep compatibility.
>
> Are we sure "it should exist"? I think the header did not exist
> before "Git v1.5.0". I didn't check with other reimplementations of
> Git (like jgit or libgit2), but as long as our reading side of the
> runtime allows a packed-refs file without the header without
> complaint, I do not think it is a good idea to treat it as a
> report-worthy event from "git fsck".
>
OK, let me improve this in the next version.
> > 2. If the header content does not start with "# pack-refs with:", we
> > should report an error just like what "create_snapshot" does. So,
> > create a new fsck message "badPackedRefHeader(ERROR)" for this.
>
> This I can agree with. If the first line begins with "#" but not
> with that string (with a trailing SP), that is a sign that it may
> not even be a valid packed-refs file, which is a report-worthy
> event.
>
> > 3. If the header content is not the same as the constant string
> > "PACKED_REFS_HEADER", ideally, we should report an error to the user.
>
> NO. THAT IS NOT IDEAL AT ALL.
>
> The header was written like this:
>
> /* perhaps other traits later as well */
> fprintf(cbdata.refs_file, "# pack-refs with: peeled \n");
>
> in the older versions of Git before it was made into a separate
> preprocessor macro and lost the comment (the above excerpt is from
> "git show v1.5.0:builtin-pack-refs.c").
>
> Notice "other traits later" in the comment?
>
> The thing is _designed_ to be extensible. In fact, these days we
> support a few more traits
>
> static const char PACKED_REFS_HEADER[] =
> "# pack-refs with: peeled fully-peeled sorted \n";
>
> (an excerpt from the current refs/packed-backend.c).
>
> Reporting an error when you see something written by an older
> version of Git is far from ideal.
>
Understood, I think we should be consistent with the runtime check.
> > However, we allow other contents as long as the header content starts
> > with "# pack-refs with:". To keep compatibility, create a new fsck
> > message "unknownPackedRefHeader(INFO)" to warn about this. We may
> > tighten this rule in the future.
>
> Whatever we do, what we do with an unknown trait should be in line
> with what the runtime does. If the runtime failed (we do not, but
> this is to illustrate the principle [*]) on a packed-refs file
> without "sorted" trait, noticing that "sorted" is not there and
> flagging as an error is a good thing to do. But if the runtime
> gracefully degrades and sorts the list of refs read from such a
> packed-refs file before continuing, then a packed-refs file that
> lack "sorted" trait is not a report-worthy event.
>
Actually, the runtime won't complain about this. I agree with you here.
> I do not offhand recall if we introduced the concept of mandatory vs
> optional traits in the packed-refs part of the system (like we have
> in the index extension subsystem, where a version of Git that
> encounters an unknown *and* mandatory index extension must refuse to
> touch the repository), but if there is a mandatory trait declared in
> the header that our version of Git does not understand, it is a
> report-worthy event that must be flagged with "git refs verify".
>
I don't think any trait in "packed-refs" is mandatory; I did some
experiments before implementing the code. We should only check
case 2 here.
> > +static int packed_fsck_ref_header(struct fsck_options *o, const char *start, const char *eol)
> > +{
> > + const char *err_fmt = NULL;
> > + int fsck_msg_id = -1;
> > +
> > + if (!starts_with(start, "# pack-refs with:")) {
> > + err_fmt = "'%.*s' does not start with '# pack-refs with:'";
> > + fsck_msg_id = FSCK_MSG_BAD_PACKED_REF_HEADER;
> > + } else if (strncmp(start, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER))) {
> > + err_fmt = "'%.*s' is an unknown packed-refs header";
> > + fsck_msg_id = FSCK_MSG_UNKNOWN_PACKED_REF_HEADER;
> > + }
>
> As I outlined above, this is totally unacceptable.
>
> Inspecting the header is good, but if this code claims to be a
> checker, it should do at least what the runtime does, i.e. parse the
> header to tell what traits the packed-file declares, not just
> assuming that it is a fixed string. And error on unknown trait(s)
> if they are mandatory (if such a concept is implemented in the
> runtime reading side). Informing on an unknown and optional
> trait(s) I can live with, but personally I wouldn't recommend it.
>
Got it, I don't want to report unknown trait(s) either.
> In other words, report loudly if it is an error, but otherwise stay
> silent if we know we tolerate it well.
>
Thanks for this suggestion.
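To make the "parse the traits, do not compare against a fixed string" suggestion concrete, here is a rough sketch. The struct and function names are hypothetical, and the prefix is the one used in the quoted patch; the real parsing in "create_snapshot" may differ in detail. Unknown traits are ignored silently, in line with "stay silent if we know we tolerate it well".

```c
#include <assert.h>
#include <string.h>

#define HEADER_PREFIX "# pack-refs with:"

/* Traits mentioned in this thread; the struct itself is illustrative. */
struct header_traits {
	int peeled;
	int fully_peeled;
	int sorted;
};

/*
 * Parse a NUL-terminated header line (newline already stripped).
 * Returns -1 if the line does not start with the expected prefix,
 * 0 otherwise. Unknown traits are skipped without complaint.
 */
static int parse_header_traits(const char *line, struct header_traits *out)
{
	size_t prefix_len = strlen(HEADER_PREFIX);
	const char *p;

	assert(line && out);
	memset(out, 0, sizeof(*out));
	if (strncmp(line, HEADER_PREFIX, prefix_len))
		return -1;

	p = line + prefix_len;
	while (*p) {
		size_t len;

		p += strspn(p, " ");	/* skip separating spaces */
		len = strcspn(p, " ");	/* length of the next trait */
		if (!len)
			break;
		if (len == 6 && !strncmp(p, "peeled", len))
			out->peeled = 1;
		else if (len == 12 && !strncmp(p, "fully-peeled", len))
			out->fully_peeled = 1;
		else if (len == 6 && !strncmp(p, "sorted", len))
			out->sorted = 1;
		p += len;
	}
	return 0;
}
```

With this shape, a header written by an old Git that only lists "peeled" parses cleanly, and only a line that starts with '#' but lacks the prefix is treated as an error.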
> > +static int packed_fsck_ref_content(struct fsck_options *o,
> > + const char *start, const char *eof)
> > +{
> > + struct strbuf packed_entry = STRBUF_INIT;
> > + int line_number = 1;
>
> We limit ourselves with about 1 billion refs in the packed-refs
> file, which may be plenty,
Let me change this to `size_t`. This would be better.
> but I do not quite understand the use of
> this variable. There is no loop inside this so ...
>
The reason why I define this variable is that I am going to use a loop to
check each entry in the next patch.
> > + const char *eol;
> > + int ret = 0;
> > +
> > + strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
>
> ... this is always line #1, and then
>
> > + ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
> > + if (*start == '#') {
> > + ret |= packed_fsck_ref_header(o, start, eol);
> > +
> > + start = eol + 1;
> > + line_number++;
>
> ... it may be incremented, but upon returning from the function, it
> is lost.
>
> Perhaps you wanted to make it a function-scope static, but then you
> are allowed to read one single packed-refs file during the life of
> your process before you exit, which I am not sure is what you want?
>
Actually, what I want is to use this variable when looping over each ref
entry in the "packed-refs" file.
> > + } else {
> > + struct fsck_ref_report report = { 0 };
> > + report.path = "packed-refs";
> > +
> > + ret |= fsck_report_ref(o, &report,
> > + FSCK_MSG_PACKED_REF_MISSING_HEADER,
> > + "missing header line");
> > + }
> > +
> > + strbuf_release(&packed_entry);
> > + return ret;
> > +}
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v2 7/8] packed-backend: check whether the "packed-refs" is sorted
2025-01-30 19:02 ` Junio C Hamano
@ 2025-01-31 14:35 ` shejialuo
2025-01-31 16:23 ` Junio C Hamano
2025-02-03 8:40 ` Patrick Steinhardt
0 siblings, 2 replies; 168+ messages in thread
From: shejialuo @ 2025-01-31 14:35 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
On Thu, Jan 30, 2025 at 11:02:18AM -0800, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > We will always try to sort the "packed-refs" increasingly by comparing
> > the refname. So, we should add checks to verify whether the "packed-refs"
> > is sorted.
>
> Do this _ONLY_ when the packed-refs file has a header that declares
> "sorted" trait. Insisting on a packed-refs file that does not would
> mean you are stricter than the runtime contract allows.
>
From my perspective, we should check whether it is sorted when the
header has a "sorted" trait. Actually, in the runtime, when calling
`create_snapshot` method, the following would happen:
1. If there is no "sorted" trait, it will sort the "packed-refs".
2. If there is, it won't sort the "packed-refs".
So, we DO allow refs unsorted.
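A small sketch of the conditional check being discussed: the order is only verified when the header declared the "sorted" trait. The function name is hypothetical, and plain `strcmp` stands in for whatever record comparison the packed backend really uses, so treat it as an approximation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the first refname that sorts before its
 * predecessor, or `nr` if no such entry exists. When the header did
 * not declare "sorted", the check is skipped and `nr` is returned.
 */
static size_t first_unsorted(const char **refnames, size_t nr,
			     int declared_sorted)
{
	size_t i;

	assert(refnames || !nr);
	if (!declared_sorted)
		return nr;	/* unsorted files are allowed without the trait */
	for (i = 1; i < nr; i++)
		if (strcmp(refnames[i - 1], refnames[i]) > 0)
			return i;
	return nr;
}
```

A caller would report an fsck message only when the return value is smaller than `nr`.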
Actually, I ran `git show v1.5.0:builtin-pack-refs.c`; in this
version, the refs are not sorted. However, I don't quite understand the
comment from Patrick in the version one about this patch:
> Makes sense. It has been a source of bugs a couple years ago, and it can
> silently make you receive wrong results, so this is quite a sensible
> check to have.
Patrick, could you please help to explain this? I don't know whether we
need to always check whether "packed-refs" is sorted. It seems that we
truly allow unsorted refs. Should we tighten this?
> > +struct fsck_packed_ref_entry {
> > + int line_number;
> > +
> > + struct snapshot_record record;
> > +};
>
> Not a huge deal, as 1 billion is still plenty of a large number, but
> the same comment on the line-number applies here. We might want to
> consistently use ulong for line numbers of files we read from.
Yes, let me improve this.
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v2 8/8] builtin/fsck: add `git refs verify` child process
2025-01-30 19:03 ` Junio C Hamano
@ 2025-01-31 14:37 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-01-31 14:37 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
On Thu, Jan 30, 2025 at 11:03:55AM -0800, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > +static void fsck_refs(struct repository *r)
> > +{
> > + struct child_process refs_verify = CHILD_PROCESS_INIT;
> > + struct progress *progress = NULL;
> > + uint64_t progress_num = 1;
> > +
> > + if (show_progress)
> > + progress = start_progress(r, _("Checking ref database"),
> > + progress_num);
>
> I do not see why we need an extra variable progress_num here. Just
> passing a literal constant 1 should be sufficient. The called
> function has a function prototype to help the compiler promote it to
> the appropriate type.
You are correct, let me improve this in the next version.
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v2 2/8] builtin/refs: get worktrees without reading head info
2025-01-31 13:29 ` shejialuo
@ 2025-01-31 16:16 ` Junio C Hamano
0 siblings, 0 replies; 168+ messages in thread
From: Junio C Hamano @ 2025-01-31 16:16 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> I want to say that we cannot check the content of the "HEAD" itself. If
> the content of "HEAD" is not correct, we cannot detect the current
> directory as a valid git repository. So, there is no need to say "we
> will check the contents of 'HEAD' elsewhere".
Instead you should say "we detected your HEAD is broken" somewhere
in the documentation for this, and then the end-user should get a
message telling them about the broken HEAD in such a case,
though.
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v2 3/8] packed-backend: check whether the "packed-refs" is regular
2025-01-31 13:54 ` shejialuo
@ 2025-01-31 16:20 ` Junio C Hamano
2025-02-01 9:47 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Junio C Hamano @ 2025-01-31 16:20 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> However, on a platform with O_NOFOLLOW, this situation will also happen.
> So, I think we may just use "open_nofollow" now and don't talk about the
> method one at all to avoid confusing readers.
Exactly. That is what you see below ;-)
>> As long as both approaches are equally portable, I do not think it
>> matters which one we pick from correctness point of view, and we can
>> pick the one that is easier to use to implement the feature.
>>
>> On a platform without O_NOFOLLOW, open_nofollow() falls back to the
>> lstat and open, so your "open_nofollow() is better than lstat() and
>> open()" argument does not portably work, though.
>> ...
>> OK. I notice that the previous step did not have any new test
>> associated with it. Perhaps we can corrupt "HEAD" *and* replace
>> packed-refs file with a symbolic link (or do some other damage
>> to the refs) and make sure both breakages are reported?
>
> As I have said in the previous comment, we cannot detect the error if
> "HEAD" itself is corrupted. However, we will check the referent
> later. So, we don't need to do this.
I still think you absolutely need to diagnose and tell the user
about the broken HEAD. With your "don't check HEAD because a
repository with a broken HEAD is not a repository", a check run in
such a place may find everything else in the repository perfectly
fine, but because the user wanted "git refs verify" to tell them
about breakages, you would want to somehow tell them about it.
Either it is missing, malformed, whatever.
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v2 7/8] packed-backend: check whether the "packed-refs" is sorted
2025-01-31 14:35 ` shejialuo
@ 2025-01-31 16:23 ` Junio C Hamano
2025-02-01 9:50 ` shejialuo
2025-02-03 8:40 ` Patrick Steinhardt
1 sibling, 1 reply; 168+ messages in thread
From: Junio C Hamano @ 2025-01-31 16:23 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> On Thu, Jan 30, 2025 at 11:02:18AM -0800, Junio C Hamano wrote:
>> shejialuo <shejialuo@gmail.com> writes:
>>
>> > We will always try to sort the "packed-refs" increasingly by comparing
>> > the refname. So, we should add checks to verify whether the "packed-refs"
>> > is sorted.
>>
>> Do this _ONLY_ when the packed-refs file has a header that declares
>> "sorted" trait. Insisting on a packed-refs file that does not would
>> mean you are stricter than the runtime contract allows.
>>
>
> From my perspective, we should check whether it is sorted when the
> header has a "sorted" trait.
So the three lines you wrote are not accurate, then. That is why I
said that "should add checks" should not be unconditional---we
should not check if the file contents is sorted when "sorted" trait
is not declared.
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v2 3/8] packed-backend: check whether the "packed-refs" is regular
2025-01-31 16:20 ` Junio C Hamano
@ 2025-02-01 9:47 ` shejialuo
2025-02-03 20:15 ` Junio C Hamano
0 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-02-01 9:47 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
On Fri, Jan 31, 2025 at 08:20:36AM -0800, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> >
> > As I have said in the previous comment, we cannot detect the error if
> > "HEAD" itself is corrupted. However, we will check the referent
> > later. So, we don't need to do this.
>
> I still think you absolutely need to diagnose and tell the user
> about the broken HEAD. With your "don't check HEAD because a
> repository with a broken HEAD is not a repository", a check run in
> such a place may find everything else in the repository perfectly
> fine, but because the user wanted "git refs verify" to tell them
> about breakages, you would want to somehow tell them about it.
> Either it is missing, malformed, whatever.
Yes, that's absolutely correct. However, I don't want to do this in
this series. Actually, there is no check for root ref. I will add checks
for root refs later.
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v2 7/8] packed-backend: check whether the "packed-refs" is sorted
2025-01-31 16:23 ` Junio C Hamano
@ 2025-02-01 9:50 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-01 9:50 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
On Fri, Jan 31, 2025 at 08:23:22AM -0800, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > On Thu, Jan 30, 2025 at 11:02:18AM -0800, Junio C Hamano wrote:
> >> shejialuo <shejialuo@gmail.com> writes:
> >>
> >> > We will always try to sort the "packed-refs" increasingly by comparing
> >> > the refname. So, we should add checks to verify whether the "packed-refs"
> >> > is sorted.
> >>
> >> Do this _ONLY_ when the packed-refs file has a header that declares
> >> "sorted" trait. Insisting on a packed-refs file that does not would
> >> mean you are stricter than the runtime contract allows.
> >>
> >
> > From my perspective, we should check whether it is sorted when the
> > header has a "sorted" trait.
>
> So the three lines you wrote are not accurate, then. That is why I
> said that "should add checks" should not be unconditional---we
> should not check if the file contents is sorted when "sorted" trait
> is not declared.
I caused confusion here. Sorry. Let me improve this in the next
version.
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v2 3/8] packed-backend: check whether the "packed-refs" is regular
2025-01-30 4:07 ` [PATCH v2 3/8] packed-backend: check whether the "packed-refs" is regular shejialuo
2025-01-30 18:23 ` Junio C Hamano
@ 2025-02-03 8:40 ` Patrick Steinhardt
1 sibling, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-03 8:40 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Thu, Jan 30, 2025 at 12:07:23PM +0800, shejialuo wrote:
> Although "git-fsck(1)" and "packed-backend.c" will check some
> consistency and correctness of "packed-refs" file, they never check the
> filetype of the "packed-refs". The user should always use "git
> packed-refs" command to create the raw regular "packed-refs" file, so we
It's `git pack-refs`, not `git packed-refs`.
Otherwise I'm not going to comment on the rest of the commit, as Junio
has already sufficiently discussed it with you, and I very much agree
with his assessment that we don't need to discuss whether or not to use
`open_nofollow()` in this depth.
Patrick
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v2 5/8] packed-backend: check whether the refname contains NUL characters
2025-01-30 4:07 ` [PATCH v2 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
@ 2025-02-03 8:40 ` Patrick Steinhardt
2025-02-05 10:09 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-03 8:40 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Thu, Jan 30, 2025 at 12:07:46PM +0800, shejialuo wrote:
> We have already implemented the header consistency check for the raw
> "packed-refs" file. Before we implement the consistency check for each
> ref entry, let's analysis [1] which reports that "git fsck" cannot
> detect some NUL characters.
This paragraph doesn't quite parse. I think it can simply be left out,
as the remainder of the commit message already explains in more than
enough detail what you're doing.
> "packed-backend.c::next_record" will use "check_refname_format" to check
> the consistency of the refname. If it is not OK, the program will die.
> So, we already have the code path and we must miss out something.
>
> We use the following code to get the refname:
>
> strbuf_add(&iter->refname_buf, p, eol - p);
> iter->base.refname = iter->refname_buf.buf;
>
> In the above code, `p` is the start pointer of the refname and `eol` is
> the next newline pointer. We calculate the length of the refname by
> subtracting the two pointers. Then we add the memory range between `p`
> and `eol` to get the refname.
>
> However, if there are some NUL characters in the memory range between `p`
> and `eol`, we will see the refname as a valid ref name as long as the
> memory range between `p` and first occurred NUL character is valid.
>
> In order to catch above corruption, create a new function
> "refname_contains_nul" by searching the first NUL character. If it is
> not at the end of the string, there must be some NUL characters in the
> refname.
>
> Use this function in "next_record" function to die the program if
> "refname_contains_nul" returns true.
Yeah, makes sense to me. NUL bytes are invalid, and nothing good can
come out of it.
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 883189f3a1..870c8e7aaa 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -494,6 +494,22 @@ static void verify_buffer_safe(struct snapshot *snapshot)
> last_line, eof - last_line);
> }
>
> +/*
> + * When parsing the "packed-refs" file, we will parse it line by line.
> + * Because we know the start pointer of the refname and the next
> + * newline pointer, we could calculate the length of the refname by
> + * subtracting the two pointers. However, there is a corner case where
> + * the refname contains corrupted embedded NUL characters. And
> + * `check_refname_format()` will not catch this when the truncated
> + * refname is still a valid refname. To prevent this, we need to check
> + * whether the refname contains the NUL characters.
> + */
> +static int refname_contains_nul(struct strbuf *refname)
> +{
> + const char *pos = memchr(refname->buf, '\0', refname->len + 1);
> + return pos < refname->buf + refname->len;
> +}
This can be simplified to:
return !!memchr(refname->buf, '\0', refname->len);
Ideally, we'd be amending `check_refname_format()` to do the checking
for us. But we can't without a wider refactoring because that function
gets a C string, and C strings are naturally terminated by NUL
characters.
I think that adding a new function for this is a bit over the top
though, as the check is unlikely to be useful in a lot of places and the
logic is rather trivial. So I'd just inline the check into
`next_record()`.
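A standalone sketch of the simplified check, operating on a pointer and an explicit length instead of a strbuf. The function name is hypothetical; the point is only that `memchr` looks at the whole length-delimited range, whereas anything based on C string semantics stops at the first NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if any of the `len` bytes starting at `buf` is a NUL
 * byte, 0 otherwise. The range is length-delimited, so an embedded
 * NUL does not end the scan early.
 */
static int range_contains_nul(const char *buf, size_t len)
{
	assert(buf || !len);
	return len && memchr(buf, '\0', len) != NULL;
}
```

For a range such as "refs/heads/a", a NUL byte, then "garbage", `strlen` would report 12 and a refname check on the C string would see only "refs/heads/a", which is why the truncated name can look valid.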
Patrick
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v2 6/8] packed-backend: add "packed-refs" entry consistency check
2025-01-30 4:07 ` [PATCH v2 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
@ 2025-02-03 8:40 ` Patrick Steinhardt
2025-02-04 4:28 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-03 8:40 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Thu, Jan 30, 2025 at 12:07:58PM +0800, shejialuo wrote:
> "packed-backend.c::next_record" will parse the ref entry to check the
> consistency. This function has already checked the following things:
>
> 1. Parse the main line of the ref entry, if the oid is not correct. It
> will die the program. And then it will check whether the next
> character of the oid is space. Then it will check whether the refname
> is correct.
> 2. If the next line starts with '^', it will continue to parse the oid
> of the peeled oid content and check whether the last character is
> '\n'.
>
> We can iterate each line by using the "packed_fsck_ref_next_line"
> function. Then, create a new fsck message "badPackedRefEntry(ERROR)" to
> report to the user when something is wrong.
>
> Create two new functions "packed_fsck_ref_main_line" and
> "packed_fsck_ref_peeled_line" for case 1 and case 2 respectively. Last,
> update the unit test to exercise the code.
I think this message is going into too much detail about _how_ you are
doing things compared to _what_ you are doing and what the intent is.
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 870c8e7aaa..271c740728 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1819,10 +1819,86 @@ static int packed_fsck_ref_header(struct fsck_options *o, const char *start, con
> return 0;
> }
>
> +static int packed_fsck_ref_peeled_line(struct fsck_options *o,
> + struct ref_store *ref_store,
> + struct strbuf *packed_entry,
> + const char *start, const char *eol)
> +{
> + struct fsck_ref_report report = { 0 };
> + struct object_id peeled;
> + const char *p;
> +
> + report.path = packed_entry->buf;
> +
> + start++;
It's a bit weird that we increment `start` here, as it is very intimate
with how the caller calls us. Might be easier to reason about when the
caller did this for us.
> + if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo)) {
> + return fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_PACKED_REF_ENTRY,
> + "'%.*s' has invalid peeled oid",
> + (int)(eol - start), start);
> + }
All the braces around those single-line return statements can go away.
> @@ -1843,6 +1919,26 @@ static int packed_fsck_ref_content(struct fsck_options *o,
> "missing header line");
> }
>
> + while (start < eof) {
> + strbuf_reset(&packed_entry);
> + strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
> + ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
> + ret |= packed_fsck_ref_main_line(o, ref_store, &packed_entry, &refname, start, eol);
Don't we have to stop in case `next_line()` returns an error?
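To illustrate the question, here is a sketch of a line-walking loop that stops as soon as a line has no terminating newline, instead of continuing to parse past it. The function name and return convention are assumptions for illustration only and do not mirror "packed_fsck_ref_next_line" exactly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count newline-terminated lines in [start, eof). Returns 0 on
 * success, or -1 as soon as a line without a trailing '\n' is found;
 * `*nr_lines` then holds the number of complete lines seen so far.
 */
static int count_lines(const char *start, const char *eof, size_t *nr_lines)
{
	assert(start && eof && start <= eof && nr_lines);
	*nr_lines = 0;
	while (start < eof) {
		const char *eol = memchr(start, '\n', (size_t)(eof - start));

		if (!eol)
			return -1;	/* stop: there is no valid end of line to parse up to */
		(*nr_lines)++;
		start = eol + 1;
	}
	return 0;
}
```

Using `size_t` for the counter also matches the line-number type suggested elsewhere in this thread.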
Patrick
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v2 7/8] packed-backend: check whether the "packed-refs" is sorted
2025-01-30 4:08 ` [PATCH v2 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-01-30 19:02 ` Junio C Hamano
@ 2025-02-03 8:40 ` Patrick Steinhardt
1 sibling, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-03 8:40 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Thu, Jan 30, 2025 at 12:08:10PM +0800, shejialuo wrote:
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 271c740728..b250f987b2 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1768,6 +1774,28 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> return empty_ref_iterator_begin();
> }
>
> +struct fsck_packed_ref_entry {
> + int line_number;
This should rather be a `size_t`, or at least `unsigned`.
> +
> + struct snapshot_record record;
> +};
> +
> +static struct fsck_packed_ref_entry *create_fsck_packed_ref_entry(int line_number,
> + const char *start)
> +{
> + struct fsck_packed_ref_entry *entry = xcalloc(1, sizeof(*entry));
> + entry->line_number = line_number;
> + entry->record.start = start;
> + return entry;
> +}
> +
> +static void free_fsck_packed_ref_entries(struct fsck_packed_ref_entry **entries, int nr)
> +{
> + for (int i = 0; i < nr; i++)
Let's use `size_t` for both `i` and `nr`.
> + free(entries[i]);
> + free(entries);
> +}
> +
> static int packed_fsck_ref_next_line(struct fsck_options *o,
> struct strbuf *packed_entry, const char *start,
> const char *eof, const char **eol)
> @@ -1893,13 +1921,60 @@ static int packed_fsck_ref_main_line(struct fsck_options *o,
> return 0;
> }
>
> +static int packed_fsck_ref_sorted(struct fsck_options *o,
> + struct ref_store *ref_store,
> + struct fsck_packed_ref_entry **entries,
> + int nr)
> +{
> + size_t hexsz = ref_store->repo->hash_algo->hexsz;
> + struct strbuf packed_entry = STRBUF_INIT;
> + struct fsck_ref_report report = { 0 };
> + struct strbuf refname1 = STRBUF_INIT;
> + struct strbuf refname2 = STRBUF_INIT;
> + int ret = 0;
> +
> + for (int i = 1; i < nr; i++) {
Here, as well.
Patrick
* Re: [PATCH v2 7/8] packed-backend: check whether the "packed-refs" is sorted
2025-01-31 14:35 ` shejialuo
2025-01-31 16:23 ` Junio C Hamano
@ 2025-02-03 8:40 ` Patrick Steinhardt
1 sibling, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-03 8:40 UTC (permalink / raw)
To: shejialuo; +Cc: Junio C Hamano, git, Karthik Nayak, Michael Haggerty
On Fri, Jan 31, 2025 at 10:35:51PM +0800, shejialuo wrote:
> On Thu, Jan 30, 2025 at 11:02:18AM -0800, Junio C Hamano wrote:
> > shejialuo <shejialuo@gmail.com> writes:
> > Makes sense. It has been a source of bugs a couple years ago, and it can
> > silently make you receive wrong results, so this is quite a sensible
> > check to have.
>
> Patrick, could you please help to explain this. I don't know whether we
> need to check whether "packed-refs" is sorted always. It seems that we
> truly allow refs unsorted. We need to know whether we should tighten
> this?
The context here is that packed-refs files sometimes claim to be sorted
when in fact they aren't. There are two sources for this that I've
seen in the wild:
- An invalid comparison function. I think I remember that libgit2 at
one point sorted them incorrectly, but I'm not 100% sure anymore where
I've seen this.
- A user manually edits the packed-refs file, but isn't aware of the
sorting.
So we should assert that a packed-refs file is correctly sorted, but
only when the header claims that it should be sorted.
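A minimal sketch of that rule, assuming a simplified in-memory view of
the file. The names `header_claims_sorted` and `first_unsorted` are
hypothetical, not functions from "packed-backend.c", and plain strcmp()
ordering stands in for whatever comparison the backend really uses:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the header line starts with
 * "# pack-refs with:" and lists the "sorted" trait, 0 otherwise.
 */
int header_claims_sorted(const char *header)
{
	static const char prefix[] = "# pack-refs with:";
	const char *p;

	if (strncmp(header, prefix, strlen(prefix)))
		return 0;

	p = header + strlen(prefix);
	while (*p) {
		size_t n;

		p += strspn(p, " \n");	/* skip separators */
		n = strcspn(p, " \n");	/* length of the next trait */
		if (n == 6 && !strncmp(p, "sorted", 6))
			return 1;
		p += n;
	}
	return 0;
}

/*
 * Hypothetical helper: returns the index of the first refname that
 * sorts before its predecessor, or 0 if the list is in order or the
 * header never claimed sortedness (assumption: byte-wise ordering).
 */
size_t first_unsorted(const char *header, const char **refnames, size_t nr)
{
	size_t i;

	if (!header_claims_sorted(header))
		return 0;
	for (i = 1; i < nr; i++)
		if (strcmp(refnames[i - 1], refnames[i]) > 0)
			return i;
	return 0;
}
```

The point of the gate is that a file without the "sorted" trait is
never reported, no matter how its entries are ordered.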
Patrick
* Re: [PATCH v2 8/8] builtin/fsck: add `git refs verify` child process
2025-01-30 4:08 ` [PATCH v2 8/8] builtin/fsck: add `git refs verify` child process shejialuo
2025-01-30 19:03 ` Junio C Hamano
@ 2025-02-03 8:40 ` Patrick Steinhardt
2025-02-04 5:32 ` shejialuo
1 sibling, 1 reply; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-03 8:40 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Thu, Jan 30, 2025 at 12:08:22PM +0800, shejialuo wrote:
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index 7a4dcb0716..9a8613d07f 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -905,6 +905,34 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress)
> return res;
> }
>
> +static void fsck_refs(struct repository *r)
> +{
> + struct child_process refs_verify = CHILD_PROCESS_INIT;
> + struct progress *progress = NULL;
> + uint64_t progress_num = 1;
> +
> + if (show_progress)
> + progress = start_progress(r, _("Checking ref database"),
> + progress_num);
Hm. I don't really think that this progress meter adds anything right
now. It only shows either 0 or 1, so it basically only tells you when
you're done. And that is something that the user can tell without a
progress meter.
> +
> + if (verbose)
> + fprintf_ln(stderr, _("Checking ref database"));
> +
> + child_process_init(&refs_verify);
> + refs_verify.git_cmd = 1;
> + strvec_pushl(&refs_verify.args, "refs", "verify", NULL);
> + if (verbose)
> + strvec_push(&refs_verify.args, "--verbose");
> + if (check_strict)
> + strvec_push(&refs_verify.args, "--strict");
> +
> + if (run_command(&refs_verify))
> + errors_found |= ERROR_REFS;
> +
> + display_progress(progress, 1);
> + stop_progress(&progress);
> +}
> +
> static char const * const fsck_usage[] = {
> N_("git fsck [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]\n"
> " [--[no-]full] [--strict] [--verbose] [--lost-found]\n"
> @@ -970,6 +998,8 @@ int cmd_fsck(int argc,
> git_config(git_fsck_config, &fsck_obj_options);
> prepare_repo_settings(the_repository);
>
> + fsck_refs(the_repository);
I think there needs to be a way to disable this. How about we add an
option `--[no-]references` to do so? I was briefly wondering whether we
also want to have `--only-references`, but if a user wants to do that
they can simply execute `git refs verify` directly.
Patrick
* Re: [PATCH v2 3/8] packed-backend: check whether the "packed-refs" is regular
2025-02-01 9:47 ` shejialuo
@ 2025-02-03 20:15 ` Junio C Hamano
2025-02-04 3:58 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Junio C Hamano @ 2025-02-03 20:15 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> On Fri, Jan 31, 2025 at 08:20:36AM -0800, Junio C Hamano wrote:
>> shejialuo <shejialuo@gmail.com> writes:
>>
>> >
>> > As I have said in the previous comment, we cannot detect the error if
>> > "HEAD" itself is corrupted. However, we will check the referent in the
>> > later. So, we don't need to do this.
>>
>> I still think you absolutely need to diagnose and tell the user
>> about the broken HEAD. With your "don't check HEAD because a
>> repository with a broken HEAD is not a repository", a check run in
>> such a place may find everything else in the repository perfectly
>> fine, but because the user wanted "git refs verify" to tell them
>> about breakages, you would want to somehow tell them about it.
>> Either it is missing, malformed, whatever.
>
> Yes, that's absolutely correct. However, I don't want to do this in
> this series. Actually, there is no check for root ref. I will add checks
> for root refs later.
Another thing I just thought of: what are your plans for repository
discovery when HEAD is iffy? If, in the working tree of our project,
you go to a subdirectory, say "t/", and then corrupt the HEAD, would
"git refs verify" still recognise that ../.git/ is the "repository"
the user is interested in, but it has a broken HEAD?
setup.c:is_git_directory() would say "no", so I am not sure the
discovery would work without changing that, and I am not sure if it
is worth doing (i.e. when the user knows the repository's HEAD is
broken, it is OK to disable discovery and force them to say
GIT_DIR=/this/directory).
* Re: [PATCH v2 3/8] packed-backend: check whether the "packed-refs" is regular
2025-02-03 20:15 ` Junio C Hamano
@ 2025-02-04 3:58 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-04 3:58 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
On Mon, Feb 03, 2025 at 12:15:39PM -0800, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > On Fri, Jan 31, 2025 at 08:20:36AM -0800, Junio C Hamano wrote:
> >> shejialuo <shejialuo@gmail.com> writes:
> >>
> >> >
> >> > As I have said in the previous comment, we cannot detect the error if
> >> > "HEAD" itself is corrupted. However, we will check the referent in the
> >> > later. So, we don't need to do this.
> >>
> >> I still think you absolutely need to diagnose and tell the user
> >> about the broken HEAD. With your "don't check HEAD because a
> >> repository with a broken HEAD is not a repository", a check run in
> >> such a place may find everything else in the repository perfectly
> >> fine, but because the user wanted "git refs verify" to tell them
> >> about breakages, you would want to somehow tell them about it.
> >> Either it is missing, malformed, whatever.
> >
> > Yes, that's absolutely correct. However, I don't want to do this in
> > this series. Actually, there is no check for root ref. I will add checks
> > for root refs later.
>
> Another thing I just thought of is that what is your plans for
> repository discovery when HEAD is iffy. In the working tree of our
> project, you go to a subdirectory, say "t/", and then corrupt the
> HEAD, would "git refs verify" still recognise that ../.git/ is the
> "repository" the user is interested in, but it has a broken HEAD?
>
> setup.c:is_git_directory() would say "no", so I am not sure the
> discovery would work without changing that, and I am not sure if it
> is worth doing (i.e. when the user knows the repository's HEAD is
> broken, it is OK to disable discovery and force them to say
> GIT_DIR=/this/directory).
I have to say I am not so familiar with the "setup.c" code. Thanks for
the direction here, I will dive into it to figure out a solution.
Thanks,
Jialuo
* Re: [PATCH v2 6/8] packed-backend: add "packed-refs" entry consistency check
2025-02-03 8:40 ` Patrick Steinhardt
@ 2025-02-04 4:28 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-04 4:28 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Mon, Feb 03, 2025 at 09:40:25AM +0100, Patrick Steinhardt wrote:
> On Thu, Jan 30, 2025 at 12:07:58PM +0800, shejialuo wrote:
> > "packed-backend.c::next_record" will parse the ref entry to check the
> > consistency. This function has already checked the following things:
> >
> > 1. Parse the main line of the ref entry, if the oid is not correct. It
> > will die the program. And then it will check whether the next
> > character of the oid is space. Then it will check whether the refname
> > is correct.
> > 2. If the next line starts with '^', it will continue to parse the oid
> > of the peeled oid content and check whether the last character is
> > '\n'.
> >
> > We can iterate each line by using the "packed_fsck_ref_next_line"
> > function. Then, create a new fsck message "badPackedRefEntry(ERROR)" to
> > report to the user when something is wrong.
> >
> > Create two new functions "packed_fsck_ref_main_line" and
> > "packed_fsck_ref_peeled_line" for case 1 and case 2 respectively. Last,
> > update the unit test to exercise the code.
>
> I think this message is going into too much detail about _how_ you are
> doing things compared to _what_ you are doing and what the intent is.
>
I think I have caused some confusion here. The reason why I mention what
"next_record" does is that I want to port these two checks. Let me
improve this in the next version. I will highlight more about the
motivation.
> > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > index 870c8e7aaa..271c740728 100644
> > --- a/refs/packed-backend.c
> > +++ b/refs/packed-backend.c
> > @@ -1819,10 +1819,86 @@ static int packed_fsck_ref_header(struct fsck_options *o, const char *start, con
> > return 0;
> > }
> >
> > +static int packed_fsck_ref_peeled_line(struct fsck_options *o,
> > + struct ref_store *ref_store,
> > + struct strbuf *packed_entry,
> > + const char *start, const char *eol)
> > +{
> > + struct fsck_ref_report report = { 0 };
> > + struct object_id peeled;
> > + const char *p;
> > +
> > + report.path = packed_entry->buf;
> > +
> > + start++;
>
> It's a bit weird that we increment `start` here, as it is very intimate
> with how the caller calls us. Might be easier to reason about when the
> caller did this for us.
>
For each ref entry, we have two pointers: `start`, which marks the
start of the line, and `eol`, which marks the end of the line. Let's
see how we call this function:

    if (start < eof && *start == '^') {
    	strbuf_reset(&packed_entry);
    	strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
    	ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
    	ret |= packed_fsck_ref_peeled_line(o, ref_store, &packed_entry,
    					   start, eol);
    	start = eol + 1;
    	line_number++;
    }

The reason why we do this is that we need to skip the '^' character. I
don't do this in the `if` statement because I want to keep the
semantics of the `start` variable unchanged.
I will add a comment here to explain why we need to execute "start++".
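For comparison, a minimal sketch of the alternative raised in the
review, where the caller skips the '^' marker and the helper only ever
sees the oid. Both function names are hypothetical, and the toy oid
check (exactly `hexsz` hex digits) is an assumption standing in for
parse_oid_hex_algop():

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: the span [oid, eol) must be exactly `hexsz`
 * hexadecimal digits. It knows nothing about the '^' marker.
 */
int peeled_oid_ok(const char *oid, const char *eol, size_t hexsz)
{
	size_t i;

	if ((size_t)(eol - oid) != hexsz)
		return 0;
	for (i = 0; i < hexsz; i++)
		if (!isxdigit((unsigned char)oid[i]))
			return 0;
	return 1;
}

/*
 * Hypothetical caller: `start` keeps pointing at the start of the
 * line; the '^' is skipped only in the argument passed down.
 */
int check_peeled_line(const char *start, const char *eol, size_t hexsz)
{
	if (start >= eol || *start != '^')
		return 0;
	return peeled_oid_ok(start + 1, eol, hexsz);
}
```

Either way `start` itself is left untouched in the caller, so the
later "start = eol + 1" keeps its meaning.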
> > + if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo)) {
> > + return fsck_report_ref(o, &report,
> > + FSCK_MSG_BAD_PACKED_REF_ENTRY,
> > + "'%.*s' has invalid peeled oid",
> > + (int)(eol - start), start);
> > + }
>
> All the braces around those single-line return statements can go away.
>
I see. So I misunderstood here. I thought that we should add braces
because we have split this single statement into multiple lines. Let me
update this in the next version.
> > @@ -1843,6 +1919,26 @@ static int packed_fsck_ref_content(struct fsck_options *o,
> > "missing header line");
> > }
> >
> > + while (start < eof) {
> > + strbuf_reset(&packed_entry);
> > + strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
> > + ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
> > + ret |= packed_fsck_ref_main_line(o, ref_store, &packed_entry, &refname, start, eol);
>
> Don't we have to stop in case `next_line()` returns an error?
>
No, we don't have to stop. This is intentional: `next_line()` reports
an error when the last ref entry has no trailing newline, and we still
need to check that entry. I don't think we should ignore this part.
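A minimal sketch of that control flow, assuming that the line finder
falls back to the end of the buffer when no newline is found. All
names and error bits here are hypothetical, and the entry check is a
toy rule (a line must contain a space between oid and refname):

```c
#include <stddef.h>
#include <string.h>

#define ERR_NOT_TERMINATED 1	/* hypothetical error bits */
#define ERR_BAD_ENTRY      2

/* Find the end of the current line; fall back to `eof` if none. */
int find_eol(const char *start, const char *eof, const char **eol)
{
	*eol = memchr(start, '\n', (size_t)(eof - start));
	if (!*eol) {
		*eol = eof;	/* still let the caller parse this line */
		return ERR_NOT_TERMINATED;
	}
	return 0;
}

/* Toy entry check: the line must contain a space. */
int check_entry(const char *start, const char *eol)
{
	return memchr(start, ' ', (size_t)(eol - start)) ? 0 : ERR_BAD_ENTRY;
}

/* Accumulate every problem instead of stopping at the first one. */
int check_lines(const char *buf, size_t len)
{
	const char *start = buf, *eof = buf + len, *eol;
	int ret = 0;

	while (start < eof) {
		ret |= find_eol(start, eof, &eol);
		ret |= check_entry(start, eol);
		start = eol < eof ? eol + 1 : eof;
	}
	return ret;
}
```

With this shape, an unterminated last line that is also malformed
yields both error bits rather than only the first.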
Thanks,
Jialuo
* Re: [PATCH v2 8/8] builtin/fsck: add `git refs verify` child process
2025-02-03 8:40 ` Patrick Steinhardt
@ 2025-02-04 5:32 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-04 5:32 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Mon, Feb 03, 2025 at 09:40:43AM +0100, Patrick Steinhardt wrote:
> On Thu, Jan 30, 2025 at 12:08:22PM +0800, shejialuo wrote:
> > diff --git a/builtin/fsck.c b/builtin/fsck.c
> > index 7a4dcb0716..9a8613d07f 100644
> > --- a/builtin/fsck.c
> > +++ b/builtin/fsck.c
> > @@ -905,6 +905,34 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress)
> > return res;
> > }
> >
> > +static void fsck_refs(struct repository *r)
> > +{
> > + struct child_process refs_verify = CHILD_PROCESS_INIT;
> > + struct progress *progress = NULL;
> > + uint64_t progress_num = 1;
> > +
> > + if (show_progress)
> > + progress = start_progress(r, _("Checking ref database"),
> > + progress_num);
>
> Hm. I don't really think that this progress meter adds anything right
> now. It only shows either 0 or 1, so it basically only tells you when
> you're done. And that is something that the user can tell without a
> progress meter.
>
You are correct about the functionality. Actually, my very first
implementation was what you describe: I simply used the following to
tell the user that we are going to check the ref database.

    fprintf_ln(stderr, _("Checking ref database"));

However, it breaks a test in "t/t1050-large.sh::fsck large blobs". I
cite the shell script below:

    test_expect_success 'fsck large blobs' '
    	git fsck 2>err &&
    	test_must_be_empty err
    '
> > +
> > + if (verbose)
> > + fprintf_ln(stderr, _("Checking ref database"));
> > +
That's the reason why we need to use `verbose` to control the behavior
here. Furthermore, we use either `progress` or `verbose` to print the
message to the user. This is a pattern widely used in "git-fsck(1)".
For example, in "builtin/fsck.c::fsck_object_dir", we have the
following code:

    if (verbose)
    	fprintf_ln(stderr, _("Checking object directory"));
    if (show_progress)
    	progress = start_progress(the_repository,
    				  _("Checking object directories"), 256);

So, that's why I use progress here. We need this to print the
information to the user. I have also tried printing to stdout, like
the following:

    fprintf_ln(stdout, _("Checking ref database"));

It also breaks the test.
> > + child_process_init(&refs_verify);
> > + refs_verify.git_cmd = 1;
> > + strvec_pushl(&refs_verify.args, "refs", "verify", NULL);
> > + if (verbose)
> > + strvec_push(&refs_verify.args, "--verbose");
> > + if (check_strict)
> > + strvec_push(&refs_verify.args, "--strict");
> > +
> > + if (run_command(&refs_verify))
> > + errors_found |= ERROR_REFS;
> > +
> > + display_progress(progress, 1);
> > + stop_progress(&progress);
> > +}
> > +
> > static char const * const fsck_usage[] = {
> > N_("git fsck [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]\n"
> > " [--[no-]full] [--strict] [--verbose] [--lost-found]\n"
> > @@ -970,6 +998,8 @@ int cmd_fsck(int argc,
> > git_config(git_fsck_config, &fsck_obj_options);
> > prepare_repo_settings(the_repository);
> >
> > + fsck_refs(the_repository);
>
> I think there needs to be a way to disable this. How about we add an
> option `--[no-]references` to do so? I was briefly wondering whether we
> also want to have `--only-references`, but if a user wants to do that
> they can simply execute `git refs verify` directly.
>
Good idea, let me improve this in the next version.
Thanks,
Jialuo
> Patrick
* Re: [PATCH v2 5/8] packed-backend: check whether the refname contains NUL characters
2025-02-03 8:40 ` Patrick Steinhardt
@ 2025-02-05 10:09 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-05 10:09 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Mon, Feb 03, 2025 at 09:40:22AM +0100, Patrick Steinhardt wrote:
> On Thu, Jan 30, 2025 at 12:07:46PM +0800, shejialuo wrote:
> > We have already implemented the header consistency check for the raw
> > "packed-refs" file. Before we implement the consistency check for each
> > ref entry, let's analysis [1] which reports that "git fsck" cannot
> > detect some NUL characters.
>
> This paragraph doesn't quite parse. I think it can simply be left out,
> as the remainder of the commit message already explains in more than
> enough detail what you're doing.
>
Let me improve this in the next version.
> > "packed-backend.c::next_record" will use "check_refname_format" to check
> > the consistency of the refname. If it is not OK, the program will die.
> > So, we already have the code path and we must miss out something.
> >
> > We use the following code to get the refname:
> >
> > strbuf_add(&iter->refname_buf, p, eol - p);
> > iter->base.refname = iter->refname_buf.buf
> >
> > In the above code, `p` is the start pointer of the refname and `eol` is
> > the next newline pointer. We calculate the length of the refname by
> > subtracting the two pointers. Then we add the memory range between `p`
> > and `eol` to get the refname.
> >
> > However, if there are some NUL characters in the memory range between `p`
> > and `eol`, we will see the refname as a valid ref name as long as the
> > memory range between `p` and first occurred NUL character is valid.
> >
> > In order to catch above corruption, create a new function
> > "refname_contains_nul" by searching the first NUL character. If it is
> > not at the end of the string, there must be some NUL characters in the
> > refname.
> >
> > Use this function in "next_record" function to die the program if
> > "refname_contains_nul" returns true.
>
> Yeah, makes sense to me. NUL bytes are invalid, and nothing good can
> come out of it.
>
> > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > index 883189f3a1..870c8e7aaa 100644
> > --- a/refs/packed-backend.c
> > +++ b/refs/packed-backend.c
> > @@ -494,6 +494,22 @@ static void verify_buffer_safe(struct snapshot *snapshot)
> > last_line, eof - last_line);
> > }
> >
> > +/*
> > + * When parsing the "packed-refs" file, we will parse it line by line.
> > + * Because we know the start pointer of the refname and the next
> > + * newline pointer, we could calculate the length of the refname by
> > + * subtracting the two pointers. However, there is a corner case where
> > + * the refname contains corrupted embedded NUL characters. And
> > + * `check_refname_format()` will not catch this when the truncated
> > + * refname is still a valid refname. To prevent this, we need to check
> > + * whether the refname contains the NUL characters.
> > + */
> > +static int refname_contains_nul(struct strbuf *refname)
> > +{
> > + const char *pos = memchr(refname->buf, '\0', refname->len + 1);
> > + return pos < refname->buf + refname->len;
> > +}
>
> This can be simplified to:
>
> return !!memchr(refname->buf, '\0', refname->len);
>
This is very nice.
> Ideally, we'd be amending `check_refname_format()` to do the checking
> for us. But we can't without a wider refactoring because that function
> gets a C string, and C strings are naturally terminated by NUL
> characters.
>
Yes, we cannot. Actually, this is a corner case: the NUL character is
special.
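A minimal standalone sketch of why the length has to travel with the
pointer. `span_contains_nul` is a hypothetical name that takes a raw
buffer and length instead of Git's `struct strbuf`:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if any of the `len` bytes starting
 * at `buf` is a NUL byte. A check based on strlen() or any other
 * C-string function would stop at the first NUL and never see the
 * bytes after it, which is why the explicit length is needed.
 */
int span_contains_nul(const char *buf, size_t len)
{
	return !!memchr(buf, '\0', len);
}
```

For a refname read from the range between `p` and `eol`, the length
would be `eol - p`, so an embedded NUL anywhere in that range is
caught even when the truncated prefix looks like a valid refname.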
> I think that adding a new function for this is a bit over the top
> though, as the check is unlikely to be useful in a lot of places and the
> logic is rather trivial. So I'd just inline the check into
> `next_record()`.
>
The reason why I extract this logic into a separate function is that we
will reuse it for the later packed backend consistency checks. We parse
the raw "packed-refs" file in nearly the same way there, so I don't
want to repeat the logic.
I will improve the commit message to explain why we need a function
instead of inlining the check.
Thanks,
Jialuo
> Patrick
* [PATCH v3 0/8] add more ref consistency checks
2025-01-30 4:04 ` [PATCH v2 0/8] add more ref consistency checks shejialuo
` (7 preceding siblings ...)
2025-01-30 4:08 ` [PATCH v2 8/8] builtin/fsck: add `git refs verify` child process shejialuo
@ 2025-02-06 5:56 ` shejialuo
2025-02-06 5:58 ` [PATCH v3 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
` (8 more replies)
8 siblings, 9 replies; 168+ messages in thread
From: shejialuo @ 2025-02-06 5:56 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Hi All:
This new version makes the following changes:
1. [PATCH v3 2/8]: enhance the commit message.
2. [PATCH v3 3/8]: delete some paragraphs in the commit message to make
it clearer.
3. [PATCH v3 4/8]: remove unneeded checks for header and related tests
and update the commit message.
4. [PATCH v3 5/8]: enhance the code suggested by Patrick.
5. [PATCH v3 6/8]: enhance the commit message and add a comment to
explain why we need to execute `start++` for peeled line.
6. [PATCH v3 7/8]: parse the header to find out whether there is a
"sorted" trait. If so, check whether the file is sorted, and update the
test to exercise the check.
7. [PATCH v3 8/8]: use the literal 1 instead of creating a new
variable. Add the option "--[no-]references" to allow the user to
disable checking the ref database. Then, update the related
documentation and commit message.
Thanks,
Jialuo
shejialuo (8):
t0602: use subshell to ensure working directory unchanged
builtin/refs: get worktrees without reading head information
packed-backend: check whether the "packed-refs" is regular file
packed-backend: add "packed-refs" header consistency check
packed-backend: check whether the refname contains NUL characters
packed-backend: add "packed-refs" entry consistency check
packed-backend: check whether the "packed-refs" is sorted
builtin/fsck: add `git refs verify` child process
Documentation/fsck-msgids.txt | 14 +
Documentation/git-fsck.txt | 6 +-
builtin/fsck.c | 33 +-
builtin/refs.c | 2 +-
fsck.h | 4 +
refs/packed-backend.c | 338 +++++++++-
t/t0602-reffiles-fsck.sh | 1111 +++++++++++++++++++--------------
worktree.c | 5 +
worktree.h | 6 +
9 files changed, 1036 insertions(+), 483 deletions(-)
Range-diff against v2:
1: 20889b7b18 = 1: 20889b7b18 t0602: use subshell to ensure working directory unchanged
2: 97688c8700 ! 2: 9d7780e953 builtin/refs: get worktrees without reading head info
@@ Metadata
Author: shejialuo <shejialuo@gmail.com>
## Commit message ##
- builtin/refs: get worktrees without reading head info
+ builtin/refs: get worktrees without reading head information
In "packed-backend.c", there are some functions such as "create_snapshot"
and "next_record" which would check the correctness of the content of
@@ Commit message
Although this behavior has no harm for the program, it will
short-circuit the program. When the users execute "git refs verify" or
- "git fsck", we don't want to simply die the program but rather show the
- warnings or errors as many as possible to info the users. So, we should
- avoid reading the head info.
+ "git fsck", we should avoid reading the head information, which may
+ execute the read operation in packed backend with stricter checks to die
+ the program. Instead, we should continue to check other parts of the
+ "packed-refs" file completely.
Fortunately, in 465a22b338 (worktree: skip reading HEAD when repairing
worktrees, 2023-12-29), we have introduced a function
"get_worktrees_internal" which allows us to get worktrees without
- reading head info.
+ reading head information.
Create a new exposed function "get_worktrees_without_reading_head", then
replace the "get_worktrees" in "builtin/refs" with the new created
3: 122ad3be02 ! 3: 44d26f6440 packed-backend: check whether the "packed-refs" is regular
@@ Metadata
Author: shejialuo <shejialuo@gmail.com>
## Commit message ##
- packed-backend: check whether the "packed-refs" is regular
+ packed-backend: check whether the "packed-refs" is regular file
Although "git-fsck(1)" and "packed-backend.c" will check some
consistency and correctness of "packed-refs" file, they never check the
filetype of the "packed-refs". The user should always use "git
- packed-refs" command to create the raw regular "packed-refs" file, so we
+ pack-refs" command to create the raw regular "packed-refs" file, so we
need to explicitly check this in "git refs verify".
- We could use the following two ways to check whether the "packed-refs"
- is regular:
-
- 1. We could use "lstat" system call to check the file mode.
- 2. We could use "open_nofollow" wrapper to open the raw "packed-refs" file
- If the returned fd value is less than 0, we could check whether the
- "errno" is "ELOOP" to report an error to the user.
-
- It might seems that the method one is much easier than method two.
- However, method one has a significant drawback. When we have checked the
- file mode using "lstat", we will need to read the file content, there is
- a possibility that when finishing reading the file content to the
- memory, the file could be changed into a symlink and we cannot notice.
-
- With method two, we could get the "fd" firstly. Even if the file is
- changed into a symlink, we could still operate the "fd" in the memory
- which is consistent across the checking which avoids race condition.
+ We could use "open_nofollow" wrapper to open the raw "packed-refs" file.
+ If the returned "fd" value is less than 0, we could check whether the
+ "errno" is "ELOOP" to report an error to the user.
Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
the user if "packed-refs" is not a regular file.
4: c3d32993c5 ! 4: a9ab7af16a packed-backend: add "packed-refs" header consistency check
@@ Commit message
pack-refs with:". As we are going to implement the header consistency
check, we should port this check into "packed_fsck".
- However, the above check is not enough, this is because "git pack-refs"
- will always write "PACKED_REFS_HEADER" which is a constant string to the
- "packed-refs" file. So, we should check the following things for the
- header.
+ However, we need to consider other situations and discuss whether we
+ need to add checks.
- 1. If the header does not exist, we may report an error to the user
- because it should exist, but we do allow no header in "packed-refs"
- file. So, create a new fsck message "packedRefMissingHeader(INFO)" to
- warn the user and also keep compatibility.
+ 1. If the header does not exist, we should not report an error to the
+ user. This is because in older Git version, we never write header in
+ the "packed-refs" file. Also, we do allow no header in "packed-refs"
+ in runtime.
2. If the header content does not start with "# packed-ref with:", we
should report an error just like what "create_snapshot" does. So,
create a new fsck message "badPackedRefHeader(ERROR)" for this.
3. If the header content is not the same as the constant string
- "PACKED_REFS_HEADER", ideally, we should report an error to the user.
- However, we allow other contents as long as the header content starts
- with "# packed-ref with:". To keep compatibility, create a new fsck
- message "unknownPackedRefHeader(INFO)" to warn about this. We may
- tighten this rule in the future.
+ "PACKED_REFS_HEADER". This is expected because we make it extensible
+ intentionally. So, there is no need to report.
- In order to achieve above checks, read the "packed-refs" file via
- "strbuf_read". Like what "create_snapshot" and other functions do, we
- could split the line by finding the next newline in the buffer. When we
- cannot find a newline, we could report an error.
+ As we have analyzed, we only need to check the case 2 in the above. In
+ order to do this, read the "packed-refs" file via "strbuf_read". Like
+ what "create_snapshot" and other functions do, we could split the line
+ by finding the next newline in the buffer. When we cannot find a
+ newline, we could report an error.
So, create a function "packed_fsck_ref_next_line" to find the next
newline and if there is no such newline, use
"packedRefEntryNotTerminated(ERROR)" to report an error to the user.
- Then, parse the first line to apply the above three checks. Update the
- test to excise the code.
+ Then, parse the first line to apply the checks. Update the test to
+ exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
@@ Documentation/fsck-msgids.txt
+`packedRefEntryNotTerminated`::
+ (ERROR) The "packed-refs" file contains an entry that is
+ not terminated by a newline.
-+
-+`packedRefMissingHeader`::
-+ (INFO) The "packed-refs" file does not contain the header.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
-@@
- `treeNotSorted`::
- (ERROR) A tree is not properly sorted.
-
-+`unknownPackedRefHeader`::
-+ (INFO) The "packed-refs" header starts with "# pack-refs with:"
-+ but the remaining content is not the same as what `git pack-refs`
-+ would write.
-+
- `unknownType`::
- (ERROR) Found an unknown object type.
-
## fsck.h ##
@@ fsck.h: enum fsck_msg_type {
@@ fsck.h: enum fsck_msg_type {
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
-@@ fsck.h: enum fsck_msg_type {
- FUNC(REF_MISSING_NEWLINE, INFO) \
- FUNC(SYMREF_TARGET_IS_NOT_A_REF, INFO) \
- FUNC(TRAILING_REF_CONTENT, INFO) \
-+ FUNC(UNKNOWN_PACKED_REF_HEADER, INFO) \
-+ FUNC(PACKED_REF_MISSING_HEADER, INFO) \
- /* ignored (elevated when requested) */ \
- FUNC(EXTRA_HEADER_ENTRY, IGNORE)
-
## refs/packed-backend.c ##
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
+ return ret;
+}
+
-+static int packed_fsck_ref_header(struct fsck_options *o, const char *start, const char *eol)
++static int packed_fsck_ref_header(struct fsck_options *o,
++ const char *start, const char *eol)
+{
-+ const char *err_fmt = NULL;
-+ int fsck_msg_id = -1;
-+
+ if (!starts_with(start, "# pack-refs with:")) {
-+ err_fmt = "'%.*s' does not start with '# pack-refs with:'";
-+ fsck_msg_id = FSCK_MSG_BAD_PACKED_REF_HEADER;
-+ } else if (strncmp(start, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER))) {
-+ err_fmt = "'%.*s' is an unknown packed-refs header";
-+ fsck_msg_id = FSCK_MSG_UNKNOWN_PACKED_REF_HEADER;
-+ }
-+
-+ if (err_fmt && fsck_msg_id >= 0) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs.header";
+
-+ return fsck_report_ref(o, &report, fsck_msg_id, err_fmt,
++ return fsck_report_ref(o, &report,
++ FSCK_MSG_BAD_PACKED_REF_HEADER,
++ "'%.*s' does not start with '# pack-refs with:'",
+ (int)(eol - start), start);
-+
+ }
+
+ return 0;
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
+ const char *start, const char *eof)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
-+ int line_number = 1;
++ unsigned long line_number = 1;
+ const char *eol;
+ int ret = 0;
+
-+ strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
++ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
+ if (*start == '#') {
+ ret |= packed_fsck_ref_header(o, start, eol);
+
+ start = eol + 1;
+ line_number++;
-+ } else {
-+ struct fsck_ref_report report = { 0 };
-+ report.path = "packed-refs";
-+
-+ ret |= fsck_report_ref(o, &report,
-+ FSCK_MSG_PACKED_REF_MISSING_HEADER,
-+ "missing header line");
+ }
+
+ strbuf_release(&packed_entry);
@@ t/t0602-reffiles-fsck.sh: test_expect_success SYMLINKS 'the filetype of packed-r
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
-+ printf "$(git rev-parse main) refs/heads/main\n" >.git/packed-refs &&
-+ git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ warning: packed-refs: packedRefMissingHeader: missing header line
-+ EOF
-+ rm .git/packed-refs &&
-+ test_cmp expect err &&
-+
+ for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
-+ "# pack-refs with traits: peeled fully-peeled sorted " \
-+ "# pack-refs with a: peeled fully-peeled"
++ "# pack-refs with traits: peeled fully-peeled sorted " \
++ "# pack-refs with a: peeled fully-peeled"
+ do
+ printf "%s\n" "$bad_header" >.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
@@ t/t0602-reffiles-fsck.sh: test_expect_success SYMLINKS 'the filetype of packed-r
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err || return 1
-+ done &&
-+
-+ for unknown_header in "# pack-refs with: peeled fully-peeled sorted garbage" \
-+ "# pack-refs with: peeled" \
-+ "# pack-refs with: peeled peeled-fully sort"
-+ do
-+ printf "%s\n" "$unknown_header" >.git/packed-refs &&
-+ git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ warning: packed-refs.header: unknownPackedRefHeader: '\''$unknown_header'\'' is an unknown packed-refs header
-+ EOF
-+ rm .git/packed-refs &&
-+ test_cmp expect err || return 1
+ done
+ )
+'
5: c545a61107 ! 5: 9b075434a1 packed-backend: check whether the refname contains NUL characters
@@ Metadata
## Commit message ##
packed-backend: check whether the refname contains NUL characters
- We have already implemented the header consistency check for the raw
- "packed-refs" file. Before we implement the consistency check for each
- ref entry, let's analysis [1] which reports that "git fsck" cannot
- detect some NUL characters.
-
"packed-backend.c::next_record" will use "check_refname_format" to check
the consistency of the refname. If it is not OK, the program will die.
- So, we already have the code path and we must miss out something.
+ However, as reported in [1], we cannot catch some corruption. Since we
+ already have this code path, we must be missing something.
We use the following code to get the refname:
@@ refs/packed-backend.c: static void verify_buffer_safe(struct snapshot *snapshot)
+ */
+static int refname_contains_nul(struct strbuf *refname)
+{
-+ const char *pos = memchr(refname->buf, '\0', refname->len + 1);
-+ return pos < refname->buf + refname->len;
++ return !!memchr(refname->buf, '\0', refname->len);
+}
+
#define SMALL_FILE_SIZE (32*1024)
6: a480e2bf49 ! 6: a976508319 packed-backend: add "packed-refs" entry consistency check
@@ Commit message
"packed-backend.c::next_record" will parse the ref entry to check the
consistency. This function has already checked the following things:
- 1. Parse the main line of the ref entry, if the oid is not correct. It
- will die the program. And then it will check whether the next
- character of the oid is space. Then it will check whether the refname
- is correct.
- 2. If the next line starts with '^', it will continue to parse the oid
- of the peeled oid content and check whether the last character is
- '\n'.
+ 1. Parse the main line of the ref entry to check whether the oid is
+ correct. Then, check whether the character following the oid is a
+ space. Then check the refname.
+ 2. If the next line starts with '^', continue to parse the peeled oid
+ and check whether the last character is '\n'.
- We can iterate each line by using the "packed_fsck_ref_next_line"
- function. Then, create a new fsck message "badPackedRefEntry(ERROR)" to
- report to the user when something is wrong.
-
- Create two new functions "packed_fsck_ref_main_line" and
- "packed_fsck_ref_peeled_line" for case 1 and case 2 respectively. Last,
- update the unit test to exercise the code.
+ As we have decided to implement the ref consistency check for
+ "packed-refs", let's port these two checks and update the test to
+ exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
@@ fsck.h: enum fsck_msg_type {
FUNC(BAD_REF_CONTENT, ERROR) \
## refs/packed-backend.c ##
-@@ refs/packed-backend.c: static int packed_fsck_ref_header(struct fsck_options *o, const char *start, con
+@@ refs/packed-backend.c: static int packed_fsck_ref_header(struct fsck_options *o,
return 0;
}
@@ refs/packed-backend.c: static int packed_fsck_ref_header(struct fsck_options *o,
+
+ report.path = packed_entry->buf;
+
++ /*
++ * Skip the '^' and parse the peeled oid.
++ */
+ start++;
-+ if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo)) {
++ if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo))
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid peeled oid",
+ (int)(eol - start), start);
-+ }
+
-+ if (p != eol) {
++ if (p != eol)
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has trailing garbage after peeled oid '%.*s'",
+ (int)(eol - p), p);
-+ }
+
+ return 0;
+}
@@ refs/packed-backend.c: static int packed_fsck_ref_header(struct fsck_options *o,
+
+ report.path = packed_entry->buf;
+
-+ if (parse_oid_hex_algop(start, &oid, &p, ref_store->repo->hash_algo)) {
++ if (parse_oid_hex_algop(start, &oid, &p, ref_store->repo->hash_algo))
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid oid",
+ (int)(eol - start), start);
-+ }
+
-+ if (p == eol || !isspace(*p)) {
++ if (p == eol || !isspace(*p))
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has no space after oid '%s' but with '%.*s'",
+ oid_to_hex(&oid), (int)(eol - p), p);
-+ }
+
+ p++;
+ strbuf_reset(refname);
+ strbuf_add(refname, p, eol - p);
-+ if (refname_contains_nul(refname)) {
++ if (refname_contains_nul(refname))
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "refname '%s' contains NULL binaries",
+ refname->buf);
-+ }
+
-+ if (check_refname_format(refname->buf, 0)) {
++ if (check_refname_format(refname->buf, 0))
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_NAME,
+ "has bad refname '%s'", refname->buf);
-+ }
+
+ return 0;
+}
@@ refs/packed-backend.c: static int packed_fsck_ref_header(struct fsck_options *o,
{
struct strbuf packed_entry = STRBUF_INIT;
+ struct strbuf refname = STRBUF_INIT;
- int line_number = 1;
+ unsigned long line_number = 1;
const char *eol;
int ret = 0;
@@ refs/packed-backend.c: static int packed_fsck_ref_content(struct fsck_options *o,
- "missing header line");
+ line_number++;
}
+ while (start < eof) {
+ strbuf_reset(&packed_entry);
-+ strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
++ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
+ ret |= packed_fsck_ref_main_line(o, ref_store, &packed_entry, &refname, start, eol);
+ start = eol + 1;
+ line_number++;
+ if (start < eof && *start == '^') {
+ strbuf_reset(&packed_entry);
-+ strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
++ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
+ ret |= packed_fsck_ref_peeled_line(o, ref_store, &packed_entry,
+ start, eol);
7: 199965dfb7 ! 7: 707e3e2151 packed-backend: check whether the "packed-refs" is sorted
@@ Metadata
## Commit message ##
packed-backend: check whether the "packed-refs" is sorted
- We will always try to sort the "packed-refs" increasingly by comparing
- the refname. So, we should add checks to verify whether the "packed-refs"
- is sorted.
+ When there is a "sorted" trait in the header of the "packed-refs" file,
+ it means that the entries are sorted in increasing order of refname.
+ We should add checks to verify whether the "packed-refs" file is
+ actually sorted in this case.
- We already have code to parse the content. Let's create a new structure
- "fsck_packed_ref_entry" to store the state during the parsing process
- for every entry. It may seem that we could just add a new "struct strbuf
- refname" into the "struct fsck_packed_ref_entry" and during the parsing
- process, we could store the refname into this structure and we could
- compare later. However, this is not a good design due to the following
- reasons:
+ Update "packed_fsck_ref_header" to record whether there is a "sorted"
+ trait in the header. Then, create a new structure "fsck_packed_ref_entry"
+ to store the state during the parsing process for every entry. It may
+ seem that we could just add a new "struct strbuf refname" to
+ "struct fsck_packed_ref_entry", store the refname in this structure
+ during parsing, and compare later. However, this is not a good design
+ for the following reasons:
1. Because we need to store the state across the whole checking
lifetime, we would consume a lot of memory if there are many entries
@@ Commit message
## Documentation/fsck-msgids.txt ##
@@
- `packedRefMissingHeader`::
- (INFO) The "packed-refs" file does not contain the header.
+ (ERROR) The "packed-refs" file contains an entry that is
+ not terminated by a newline.
+`packedRefUnsorted`::
+ (ERROR) The "packed-refs" file is not sorted.
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
}
+struct fsck_packed_ref_entry {
-+ int line_number;
++ unsigned long line_number;
+
+ struct snapshot_record record;
+};
+
-+static struct fsck_packed_ref_entry *create_fsck_packed_ref_entry(int line_number,
++static struct fsck_packed_ref_entry *create_fsck_packed_ref_entry(unsigned long line_number,
+ const char *start)
+{
+ struct fsck_packed_ref_entry *entry = xcalloc(1, sizeof(*entry));
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
+ return entry;
+}
+
-+static void free_fsck_packed_ref_entries(struct fsck_packed_ref_entry **entries, int nr)
++static void free_fsck_packed_ref_entries(struct fsck_packed_ref_entry **entries, size_t nr)
+{
-+ for (int i = 0; i < nr; i++)
++ for (size_t i = 0; i < nr; i++)
+ free(entries[i]);
+ free(entries);
+}
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
static int packed_fsck_ref_next_line(struct fsck_options *o,
struct strbuf *packed_entry, const char *start,
const char *eof, const char **eol)
+@@ refs/packed-backend.c: static int packed_fsck_ref_next_line(struct fsck_options *o,
+ }
+
+ static int packed_fsck_ref_header(struct fsck_options *o,
+- const char *start, const char *eol)
++ const char *start, const char *eol,
++ unsigned int *sorted)
+ {
+- if (!starts_with(start, "# pack-refs with:")) {
++ struct string_list traits = STRING_LIST_INIT_NODUP;
++ char *tmp_line;
++ int ret = 0;
++ char *p;
++
++ tmp_line = xmemdupz(start, eol - start);
++ if (!skip_prefix(tmp_line, "# pack-refs with:", (const char **)&p)) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs.header";
+
+- return fsck_report_ref(o, &report,
+- FSCK_MSG_BAD_PACKED_REF_HEADER,
+- "'%.*s' does not start with '# pack-refs with:'",
+- (int)(eol - start), start);
++ ret = fsck_report_ref(o, &report,
++ FSCK_MSG_BAD_PACKED_REF_HEADER,
++ "'%.*s' does not start with '# pack-refs with:'",
++ (int)(eol - start), start);
++ goto cleanup;
+ }
+
+- return 0;
++ string_list_split_in_place(&traits, p, " ", -1);
++ *sorted = unsorted_string_list_has_string(&traits, "sorted");
++
++cleanup:
++ free(tmp_line);
++ string_list_clear(&traits, 0);
++ return ret;
+ }
+
+ static int packed_fsck_ref_peeled_line(struct fsck_options *o,
@@ refs/packed-backend.c: static int packed_fsck_ref_main_line(struct fsck_options *o,
return 0;
}
@@ refs/packed-backend.c: static int packed_fsck_ref_main_line(struct fsck_options
+static int packed_fsck_ref_sorted(struct fsck_options *o,
+ struct ref_store *ref_store,
+ struct fsck_packed_ref_entry **entries,
-+ int nr)
++ size_t nr)
+{
+ size_t hexsz = ref_store->repo->hash_algo->hexsz;
+ struct strbuf packed_entry = STRBUF_INIT;
@@ refs/packed-backend.c: static int packed_fsck_ref_main_line(struct fsck_options
+ struct strbuf refname2 = STRBUF_INIT;
+ int ret = 0;
+
-+ for (int i = 1; i < nr; i++) {
++ for (size_t i = 1; i < nr; i++) {
+ const char *r1 = entries[i - 1]->record.start + hexsz + 1;
+ const char *r2 = entries[i]->record.start + hexsz + 1;
+
@@ refs/packed-backend.c: static int packed_fsck_ref_main_line(struct fsck_options
+ entries[i]->record.len);
+ strbuf_add(&refname2, r2, eol - r2);
+
-+ strbuf_addf(&packed_entry, "packed-refs line %d",
++ strbuf_addf(&packed_entry, "packed-refs line %lu",
+ entries[i - 1]->line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
@@ refs/packed-backend.c: static int packed_fsck_ref_main_line(struct fsck_options
struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_packed_ref_entry **entries;
struct strbuf refname = STRBUF_INIT;
-+ int entry_alloc = 20;
- int line_number = 1;
-+ int entry_nr = 0;
+ unsigned long line_number = 1;
++ unsigned int sorted = 0;
++ size_t entry_alloc = 20;
++ size_t entry_nr = 0;
const char *eol;
int ret = 0;
-@@ refs/packed-backend.c: static int packed_fsck_ref_content(struct fsck_options *o,
- "missing header line");
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
+ if (*start == '#') {
+- ret |= packed_fsck_ref_header(o, start, eol);
++ ret |= packed_fsck_ref_header(o, start, eol, &sorted);
+
+ start = eol + 1;
+ line_number++;
}
+ ALLOC_ARRAY(entries, entry_alloc);
@@ refs/packed-backend.c: static int packed_fsck_ref_content(struct fsck_options *o
+ ALLOC_GROW(entries, entry_nr + 1, entry_alloc);
+ entries[entry_nr++] = entry;
strbuf_reset(&packed_entry);
- strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
@@ refs/packed-backend.c: static int packed_fsck_ref_content(struct fsck_options *o,
start = eol + 1;
@@ refs/packed-backend.c: static int packed_fsck_ref_content(struct fsck_options *o
+ entry->record.len = start - entry->record.start;
}
-+ if (!ret)
++ if (!ret && sorted)
+ ret |= packed_fsck_ref_sorted(o, ref_store, entries, entry_nr);
+
strbuf_release(&packed_entry);
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'packed-refs content should be che
)
'
-+test_expect_success 'packed-ref sorted should be checked' '
++test_expect_success 'packed-ref with sorted trait should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'packed-refs content should be che
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err &&
++
+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
+ printf "%s %s\n" "$tag_1_oid" "$refname3" >>.git/packed-refs &&
+ printf "^%s\n" "$tag_1_peeled_oid" >>.git/packed-refs &&
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'packed-refs content should be che
+ test_cmp expect err
+ )
+'
++
++test_expect_success 'packed-ref without sorted trait should not be checked' '
++ test_when_finished "rm -rf repo" &&
++ git init repo &&
++ (
++ cd repo &&
++ test_commit default &&
++ git branch branch-1 &&
++ git branch branch-2 &&
++ git tag -a annotated-tag-1 -m tag-1 &&
++ branch_1_oid=$(git rev-parse branch-1) &&
++ branch_2_oid=$(git rev-parse branch-2) &&
++ tag_1_oid=$(git rev-parse annotated-tag-1) &&
++ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
++ refname1="refs/heads/main" &&
++ refname2="refs/heads/foo" &&
++ refname3="refs/tags/foo" &&
++ printf "# pack-refs with: peeled fully-peeled \n" >.git/packed-refs &&
++ printf "%s %s\n" "$branch_2_oid" "$refname1" >>.git/packed-refs &&
++ printf "%s %s\n" "$branch_1_oid" "$refname2" >>.git/packed-refs &&
++ git refs verify 2>err &&
++ test_must_be_empty err
++ )
++'
+
test_done
8: 81a2164c04 ! 8: 4f2170aa7c builtin/fsck: add `git refs verify` child process
@@ Commit message
It's hard to know how many loose refs we will check now. We might
improve this later.
- And we run this function in the first execution sequence of
- "git-fsck(1)" because we don't want the existing code of "git-fsck(1)"
- which implicitly checks the consistency of refs to die the program.
+ Then, introduce an option that allows the user to disable checking the
+ ref database consistency. Call this function at the very beginning of
+ "git-fsck(1)", because we don't want the existing code of
+ "git-fsck(1)", which implicitly checks the consistency of refs, to
+ make the program die first.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
+ ## Documentation/git-fsck.txt ##
+@@ Documentation/git-fsck.txt: SYNOPSIS
+ 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
+ [--[no-]full] [--strict] [--verbose] [--lost-found]
+ [--[no-]dangling] [--[no-]progress] [--connectivity-only]
+- [--[no-]name-objects] [<object>...]
++ [--[no-]name-objects] [--[no-]references] [<object>...]
+
+ DESCRIPTION
+ -----------
+@@ Documentation/git-fsck.txt: care about this output and want to speed it up further.
+ progress status even if the standard error stream is not
+ directed to a terminal.
+
++--[no-]references::
++ Control whether to check the references database consistency
++ via 'git refs verify'. See linkgit:git-refs[1] for details.
++
+ CONFIGURATION
+ -------------
+
+
## builtin/fsck.c ##
+@@ builtin/fsck.c: static int verbose;
+ static int show_progress = -1;
+ static int show_dangling = 1;
+ static int name_objects;
++static int check_references = 1;
+ #define ERROR_OBJECT 01
+ #define ERROR_REACHABLE 02
+ #define ERROR_PACK 04
@@ builtin/fsck.c: static int check_pack_rev_indexes(struct repository *r, int show_progress)
return res;
}
@@ builtin/fsck.c: static int check_pack_rev_indexes(struct repository *r, int show
+{
+ struct child_process refs_verify = CHILD_PROCESS_INIT;
+ struct progress *progress = NULL;
-+ uint64_t progress_num = 1;
+
+ if (show_progress)
-+ progress = start_progress(r, _("Checking ref database"),
-+ progress_num);
++ progress = start_progress(r, _("Checking ref database"), 1);
+
+ if (verbose)
+ fprintf_ln(stderr, _("Checking ref database"));
@@ builtin/fsck.c: static int check_pack_rev_indexes(struct repository *r, int show
static char const * const fsck_usage[] = {
N_("git fsck [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]\n"
" [--[no-]full] [--strict] [--verbose] [--lost-found]\n"
+ " [--[no-]dangling] [--[no-]progress] [--connectivity-only]\n"
+- " [--[no-]name-objects] [<object>...]"),
++ " [--[no-]name-objects] [--[no-]references] [<object>...]"),
+ NULL
+ };
+
+@@ builtin/fsck.c: static struct option fsck_opts[] = {
+ N_("write dangling objects in .git/lost-found")),
+ OPT_BOOL(0, "progress", &show_progress, N_("show progress")),
+ OPT_BOOL(0, "name-objects", &name_objects, N_("show verbose names for reachable objects")),
++ OPT_BOOL(0, "references", &check_references, N_("check reference database consistency")),
+ OPT_END(),
+ };
+
@@ builtin/fsck.c: int cmd_fsck(int argc,
git_config(git_fsck_config, &fsck_obj_options);
prepare_repo_settings(the_repository);
-+ fsck_refs(the_repository);
++ if (check_references)
++ fsck_refs(the_repository);
+
if (connectivity_only) {
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
--
2.48.1
* [PATCH v3 1/8] t0602: use subshell to ensure working directory unchanged
2025-02-06 5:56 ` [PATCH v3 0/8] add more ref consistency checks shejialuo
@ 2025-02-06 5:58 ` shejialuo
From: shejialuo @ 2025-02-06 5:58 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Every test executes "cd repo" as one of its first steps, but never
executes "cd .." to restore the working directory. Adding a trailing
"cd .." would not be a good fix either: if any command fails between
"cd repo" and "cd ..", the "cd .." is never reached and the working
directory is not restored.
Let's run the body of each test in a subshell so that the working
directory of the test script itself is never changed.
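As a rough illustration of the reasoning above (not part of the patch), the following standalone sketch shows that a "cd" inside a subshell does not leak into the parent shell, even when a later command in the subshell fails. The scratch directory and the "false" command are placeholders assumed for the demonstration; they stand in for the test repository and for a failing test step.

```shell
#!/bin/sh
# Hypothetical scratch area, used only for this illustration.
scratch=$(mktemp -d) || exit 1
mkdir "$scratch/repo" || exit 1
cd "$scratch" || exit 1
before=$(pwd)

# The subshell enters "repo" and then fails part-way through,
# standing in for a test step that breaks after "cd repo".
(
	cd repo &&
	false &&
	echo "never reached"
)
status=$?

# The "cd" happened only inside the subshell, so the parent
# shell is still in the directory it started from.
after=$(pwd)
echo "subshell exit status: $status"
if test "$before" = "$after"
then
	echo "working directory unchanged"
fi

# Clean up the scratch area.
cd / && rm -rf "$scratch"
```

With a plain "cd repo" and no subshell, the same failure would leave the script inside "repo", which is the situation this patch avoids.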
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
t/t0602-reffiles-fsck.sh | 967 ++++++++++++++++++++-------------------
1 file changed, 494 insertions(+), 473 deletions(-)
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index d4a08b823b..cf7a202d0d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -14,222 +14,229 @@ test_expect_success 'ref name should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
-
- git commit --allow-empty -m initial &&
- git checkout -b default-branch &&
- git tag default-tag &&
- git tag multi_hierarchy/default-tag &&
-
- cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
- git refs verify 2>err &&
- test_must_be_empty err &&
- rm $branch_dir_prefix/@ &&
-
- cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
- git refs verify 2>err &&
- rm $tag_dir_prefix/tag-1.lock &&
- test_must_be_empty err &&
-
- cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/.lock: badRefName: invalid refname format
- EOF
- rm $tag_dir_prefix/.lock &&
- test_cmp expect err &&
-
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/$refname: badRefName: invalid refname format
- EOF
- rm "$branch_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done &&
+ (
+ cd repo &&
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/$refname: badRefName: invalid refname format
- EOF
- rm "$tag_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done &&
+ git commit --allow-empty -m initial &&
+ git checkout -b default-branch &&
+ git tag default-tag &&
+ git tag multi_hierarchy/default-tag &&
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
- EOF
- rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
- test_cmp expect err || return 1
- done &&
-
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- mkdir "$branch_dir_prefix/$refname" &&
- cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+ rm $branch_dir_prefix/@ &&
+
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
+ git refs verify 2>err &&
+ rm $tag_dir_prefix/tag-1.lock &&
+ test_must_be_empty err &&
+
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ error: refs/tags/.lock: badRefName: invalid refname format
EOF
- rm -r "$branch_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done
+ rm $tag_dir_prefix/.lock &&
+ test_cmp expect err &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname: badRefName: invalid refname format
+ EOF
+ rm "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ mkdir "$branch_dir_prefix/$refname" &&
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ EOF
+ rm -r "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success 'ref name check should be adapted into fsck messages' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- git commit --allow-empty -m initial &&
- git checkout -b branch-1 &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- git -c fsck.badRefName=warn refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/.branch-1: badRefName: invalid refname format
- EOF
- rm $branch_dir_prefix/.branch-1 &&
- test_cmp expect err &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- git -c fsck.badRefName=ignore refs verify 2>err &&
- test_must_be_empty err
+ (
+ cd repo &&
+ git commit --allow-empty -m initial &&
+ git checkout -b branch-1 &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=warn refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/.branch-1: badRefName: invalid refname format
+ EOF
+ rm $branch_dir_prefix/.branch-1 &&
+ test_cmp expect err &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=ignore refs verify 2>err &&
+ test_must_be_empty err
+ )
'
test_expect_success 'ref name check should work for multiple worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
-
- cd repo &&
- test_commit initial &&
- git checkout -b branch-1 &&
- test_commit second &&
- git checkout -b branch-2 &&
- test_commit third &&
- git checkout -b branch-3 &&
- git worktree add ./worktree-1 branch-1 &&
- git worktree add ./worktree-2 branch-2 &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
- (
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-3
- ) &&
(
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-3
- ) &&
-
- cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
- cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
- error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err &&
-
- for worktree in "worktree-1" "worktree-2"
- do
+ cd repo &&
+ test_commit initial &&
+ git checkout -b branch-1 &&
+ test_commit second &&
+ git checkout -b branch-2 &&
+ test_commit third &&
+ git checkout -b branch-3 &&
+ git worktree add ./worktree-1 branch-1 &&
+ git worktree add ./worktree-2 branch-2 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
(
- cd $worktree &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
- error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err || return 1
- )
- done
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+
+ cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
+ cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err &&
+
+ for worktree in "worktree-1" "worktree-2"
+ do
+ (
+ cd $worktree &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err || return 1
+ )
+ done
+ )
'
test_expect_success 'regular ref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
- git refs verify 2>err &&
- test_must_be_empty err &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
- for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/branch-bad &&
- test_cmp expect err || return 1
- done &&
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
- for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/a/b/branch-bad &&
- test_cmp expect err || return 1
- done &&
-
- printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $branch_dir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- for trailing_content in " garbage" " more garbage"
- do
- printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
EOF
- rm $branch_dir_prefix/branch-garbage &&
- test_cmp expect err || return 1
- done &&
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+ printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
- '\''
- EOF
- rm $branch_dir_prefix/branch-garbage-special &&
- test_cmp expect err &&
- printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+ '\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err &&
+
+ printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
- garbage'\''
- EOF
- rm $branch_dir_prefix/branch-garbage-special &&
- test_cmp expect err
+ garbage'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err
+ )
'
test_expect_success 'regular ref content should be checked (aggregate)' '
@@ -237,99 +244,103 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- bad_content_1=$(git rev-parse main)x &&
- bad_content_2=xfsazqfxcadas &&
- bad_content_3=Xfsazqfxcadas &&
- printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
- printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
- printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
- printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
- printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
- error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
- error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
- warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ bad_content_1=$(git rev-parse main)x &&
+ bad_content_2=xfsazqfxcadas &&
+ bad_content_3=Xfsazqfxcadas &&
+ printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
+ printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
+ printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
+ error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
+ error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
'
test_expect_success 'textual symref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
+ do
+ printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
- for good_referent in "refs/heads/branch" "HEAD"
- do
- printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
- rm $branch_dir_prefix/branch-good &&
- test_must_be_empty err || return 1
- done &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
- do
- printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
- test_must_fail git refs verify 2>err &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-1 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
EOF
- rm $branch_dir_prefix/branch-bad &&
- test_cmp expect err || return 1
- done &&
-
- printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $branch_dir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-1 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-2 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-3 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-complicated &&
- test_cmp expect err
+ rm $branch_dir_prefix/a/b/branch-trailing-2 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-3 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-complicated &&
+ test_cmp expect err
+ )
'
test_expect_success 'textual symref content should be checked (aggregate)' '
@@ -337,32 +348,34 @@ test_expect_success 'textual symref content should be checked (aggregate)' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
- printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
- printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
- printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
- printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
- printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
- printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
- printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
- warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
'
test_expect_success 'the target of the textual symref should be checked' '
@@ -370,28 +383,30 @@ test_expect_success 'the target of the textual symref should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
- do
- printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
- git refs verify 2>err &&
- rm $branch_dir_prefix/branch-good &&
- test_must_be_empty err || return 1
- done &&
-
- for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
- do
- printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
- EOF
- rm $branch_dir_prefix/branch-bad-1 &&
- test_cmp expect err || return 1
- done
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
+ do
+ printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success SYMLINKS 'symlink symref content should be checked' '
@@ -399,201 +414,207 @@ test_expect_success SYMLINKS 'symlink symref content should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $branch_dir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
- EOF
- rm $branch_dir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
- error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
- EOF
- rm $branch_dir_prefix/branch-symbolic-bad &&
- test_cmp expect err &&
-
- ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
- error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
- EOF
- rm $tag_dir_prefix/tag-symbolic-1 &&
- test_cmp expect err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
+ error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-bad &&
+ test_cmp expect err &&
+
+ ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
+ error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
+ EOF
+ rm $tag_dir_prefix/tag-symbolic-1 &&
+ test_cmp expect err
+ )
'
test_expect_success SYMLINKS 'symlink symref content should be checked (worktree)' '
test_when_finished "rm -rf repo" &&
git init repo &&
- cd repo &&
- test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
- main_worktree_refdir_prefix=.git/refs/heads &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
(
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
- (
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
-
- ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $worktree1_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $worktree2_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $main_worktree_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
- EOF
- rm $worktree1_refdir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- for bad_referent_name in ".tag" "branch "
- do
- ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ main_worktree_refdir_prefix=.git/refs/heads &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree1_refdir_prefix/bad-symbolic &&
+ rm $worktree1_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree1_refdir_prefix/bad-symbolic &&
+ rm $worktree2_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree2_refdir_prefix/bad-symbolic &&
+ rm $main_worktree_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
EOF
- rm $worktree2_refdir_prefix/bad-symbolic &&
- test_cmp expect err || return 1
- done
+ rm $worktree1_refdir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ for bad_referent_name in ".tag" "branch "
+ do
+ ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
- cd repo &&
- test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
(
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
- (
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
- for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
- test_must_fail git refs verify 2>err &&
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ EOF
+ rm $worktree1_refdir_prefix/bad-branch-1 &&
+ test_cmp expect err || return 1
+ done &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ EOF
+ rm $worktree2_refdir_prefix/bad-branch-2 &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
EOF
- rm $worktree1_refdir_prefix/bad-branch-1 &&
- test_cmp expect err || return 1
- done &&
+ rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
- test_must_fail git refs verify 2>err &&
+ printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
EOF
- rm $worktree2_refdir_prefix/bad-branch-2 &&
- test_cmp expect err || return 1
- done &&
-
- printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $worktree1_refdir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
- EOF
- rm $worktree1_refdir_prefix/branch-garbage &&
- test_cmp expect err
+ rm $worktree1_refdir_prefix/branch-garbage &&
+ test_cmp expect err
+ )
'
test_done
--
2.48.1
* [PATCH v3 2/8] builtin/refs: get worktrees without reading head information
2025-02-06 5:56 ` [PATCH v3 0/8] add more ref consistency checks shejialuo
2025-02-06 5:58 ` [PATCH v3 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
@ 2025-02-06 5:58 ` shejialuo
2025-02-06 5:58 ` [PATCH v3 3/8] packed-backend: check whether the "packed-refs" is regular file shejialuo
` (6 subsequent siblings)
8 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-06 5:58 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c", functions such as "create_snapshot" and
"next_record" check the correctness of the content of the "packed-refs"
file. When anything is wrong, the program dies.
This may seem irrelevant to the consistency check, because we are going
to read and parse the raw "packed-refs" file ourselves, without creating
a snapshot or using the ref iterator.
However, "get_worktrees" in "builtin/refs" parses the "HEAD"
information. If the referent of "HEAD" is stored in "packed-refs", we
call "create_snapshot" to parse "packed-refs" and look it up. Regardless
of whether the entry for the referent of "HEAD" is correct,
"create_snapshot" calls "verify_buffer_safe" to check whether the last
line of the file ends with a newline. If not, the program dies.
This behavior is not harmful in itself, but it short-circuits the
check. When users execute "git refs verify" or "git fsck", we should
avoid reading the HEAD information, because doing so may trigger a read
in the packed backend whose stricter checks make the program die.
Instead, we should continue and check the rest of the "packed-refs" file
completely.
Fortunately, in 465a22b338 (worktree: skip reading HEAD when repairing
worktrees, 2023-12-29), we have introduced a function
"get_worktrees_internal" which allows us to get worktrees without
reading head information.
Create a new exposed function "get_worktrees_without_reading_head" and
replace the "get_worktrees" call in "builtin/refs" with the newly
created function.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
builtin/refs.c | 2 +-
worktree.c | 5 +++++
worktree.h | 6 ++++++
3 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/builtin/refs.c b/builtin/refs.c
index a29f195834..55ff5dae11 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -88,7 +88,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix,
git_config(git_fsck_config, &fsck_refs_options);
prepare_repo_settings(the_repository);
- worktrees = get_worktrees();
+ worktrees = get_worktrees_without_reading_head();
for (size_t i = 0; worktrees[i]; i++)
ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
&fsck_refs_options, worktrees[i]);
diff --git a/worktree.c b/worktree.c
index 248bbb39d4..89b7d86cef 100644
--- a/worktree.c
+++ b/worktree.c
@@ -175,6 +175,11 @@ struct worktree **get_worktrees(void)
return get_worktrees_internal(0);
}
+struct worktree **get_worktrees_without_reading_head(void)
+{
+ return get_worktrees_internal(1);
+}
+
const char *get_worktree_git_dir(const struct worktree *wt)
{
if (!wt)
diff --git a/worktree.h b/worktree.h
index 38145df80f..1ba4a161a0 100644
--- a/worktree.h
+++ b/worktree.h
@@ -30,6 +30,12 @@ struct worktree {
*/
struct worktree **get_worktrees(void);
+/*
+ * Like `get_worktrees`, but does not read HEAD. This is useful when checking
+ * the consistency, as reading HEAD may not be necessary.
+ */
+struct worktree **get_worktrees_without_reading_head(void);
+
/*
* Returns 1 if linked worktrees exist, 0 otherwise.
*/
--
2.48.1
* [PATCH v3 3/8] packed-backend: check whether the "packed-refs" is regular file
2025-02-06 5:56 ` [PATCH v3 0/8] add more ref consistency checks shejialuo
2025-02-06 5:58 ` [PATCH v3 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-06 5:58 ` [PATCH v3 2/8] builtin/refs: get worktrees without reading head information shejialuo
@ 2025-02-06 5:58 ` shejialuo
2025-02-06 5:59 ` [PATCH v3 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
` (5 subsequent siblings)
8 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-06 5:58 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Although "git-fsck(1)" and "packed-backend.c" check some aspects of the
consistency and correctness of the "packed-refs" file, they never check
its filetype. Users should always create "packed-refs" with the "git
pack-refs" command, which writes a regular file, so we need to check
this explicitly in "git refs verify".
We can use the "open_nofollow" wrapper to open the raw "packed-refs"
file. If the returned "fd" is less than 0, we check whether "errno" is
"ELOOP" and, if so, report an error to the user.
Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
the user if "packed-refs" is not a regular file.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 39 +++++++++++++++++++++++++++++++++++----
t/t0602-reffiles-fsck.sh | 22 ++++++++++++++++++++++
2 files changed, 57 insertions(+), 4 deletions(-)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index a7b6f74b6e..6401cecd5f 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -4,6 +4,7 @@
#include "../git-compat-util.h"
#include "../config.h"
#include "../dir.h"
+#include "../fsck.h"
#include "../gettext.h"
#include "../hash.h"
#include "../hex.h"
@@ -1748,15 +1749,45 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
-static int packed_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED,
+static int packed_fsck(struct ref_store *ref_store,
+ struct fsck_options *o,
struct worktree *wt)
{
+ struct packed_ref_store *refs = packed_downcast(ref_store,
+ REF_STORE_READ, "fsck");
+ int ret = 0;
+ int fd;
if (!is_main_worktree(wt))
- return 0;
+ goto cleanup;
- return 0;
+ if (o->verbose)
+ fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
+
+ fd = open_nofollow(refs->path, O_RDONLY);
+ if (fd < 0) {
+ /*
+ * If the packed-refs file doesn't exist, there's nothing
+ * to check.
+ */
+ if (errno == ENOENT)
+ goto cleanup;
+
+ if (errno == ELOOP) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_FILETYPE,
+ "not a regular file");
+ goto cleanup;
+ }
+
+ ret = error_errno(_("unable to open %s"), refs->path);
+ goto cleanup;
+ }
+
+cleanup:
+ return ret;
}
struct ref_storage_be refs_be_packed = {
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index cf7a202d0d..42c8d4ca1e 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -617,4 +617,26 @@ test_expect_success 'ref content checks should work with worktrees' '
)
'
+test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git pack-refs --all &&
+
+ mv .git/packed-refs .git/packed-refs-back &&
+ ln -sf packed-refs-back .git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs: badRefFiletype: not a regular file
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v3 4/8] packed-backend: add "packed-refs" header consistency check
2025-02-06 5:56 ` [PATCH v3 0/8] add more ref consistency checks shejialuo
` (2 preceding siblings ...)
2025-02-06 5:58 ` [PATCH v3 3/8] packed-backend: check whether the "packed-refs" is regular file shejialuo
@ 2025-02-06 5:59 ` shejialuo
2025-02-12 9:56 ` Patrick Steinhardt
2025-02-06 5:59 ` [PATCH v3 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
` (4 subsequent siblings)
8 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-02-06 5:59 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c::create_snapshot", if there is a header (a line
which starts with '#'), we check whether the line starts with "#
pack-refs with:". As we are going to implement the header consistency
check, we should port this check into "packed_fsck".
However, we need to consider other situations and discuss whether we
need to add checks.
1. If the header does not exist, we should not report an error to the
user. This is because older Git versions never wrote a header in
the "packed-refs" file. Also, we do allow a "packed-refs" file
without a header at runtime.
2. If the header content does not start with "# pack-refs with:", we
should report an error just like "create_snapshot" does. So,
create a new fsck message "badPackedRefHeader(ERROR)" for this.
3. If the header content is not the same as the constant string
"PACKED_REFS_HEADER", there is no need to report anything. This is
expected because we make the header extensible intentionally.
As analyzed above, we only need to check case 2. In order to do this,
read the "packed-refs" file via "strbuf_read". Like "create_snapshot"
and other functions do, we split the content into lines by finding the
next newline in the buffer. When we cannot find a newline, we report an
error.
So, create a function "packed_fsck_ref_next_line" to find the next
newline and, if there is no such newline, use
"packedRefEntryNotTerminated(ERROR)" to report an error to the user.
Then, parse the first line to apply the checks. Update the test to
exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
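The line splitting and the header prefix check described above can be
sketched as follows. This is only an illustration under stated
assumptions: "find_eol" and "header_is_valid" are hypothetical names,
and the prefix comparison here is bounded by the line length, whereas
the patch calls "starts_with" on the NUL-terminated strbuf content.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the newline ending the line at `start`, or `eof`
 * when the last line is unterminated. `*terminated` tells the two apart.
 */
static const char *find_eol(const char *start, const char *eof,
			    int *terminated)
{
	const char *eol;

	assert(start <= eof);
	eol = memchr(start, '\n', (size_t)(eof - start));
	*terminated = eol != NULL;
	return eol ? eol : eof;
}

/* Return 1 when the line [start, eol) begins with the header prefix. */
static int header_is_valid(const char *start, const char *eol)
{
	static const char prefix[] = "# pack-refs with:";
	size_t len = sizeof(prefix) - 1;

	return (size_t)(eol - start) >= len && !memcmp(start, prefix, len);
}
```

Falling back to `eof` for an unterminated line mirrors the patch, which
reports the missing newline but still parses the remaining bytes.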
Documentation/fsck-msgids.txt | 8 ++++
fsck.h | 2 +
refs/packed-backend.c | 73 +++++++++++++++++++++++++++++++++++
t/t0602-reffiles-fsck.sh | 25 ++++++++++++
4 files changed, 108 insertions(+)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index b14bc44ca4..11906f90fd 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -16,6 +16,10 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefHeader`::
+ (ERROR) The "packed-refs" file contains an invalid
+ header.
+
`badParentSha1`::
(ERROR) A commit object has a bad parent sha1.
@@ -176,6 +180,10 @@
`nullSha1`::
(WARN) Tree contains entries pointing to a null sha1.
+`packedRefEntryNotTerminated`::
+ (ERROR) The "packed-refs" file contains an entry that is
+ not terminated by a newline.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
diff --git a/fsck.h b/fsck.h
index a44c231a5f..67e3c97bc0 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
@@ -53,6 +54,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE, ERROR) \
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
+ FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 6401cecd5f..683cfe78dc 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1749,12 +1749,76 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
+static int packed_fsck_ref_next_line(struct fsck_options *o,
+ struct strbuf *packed_entry, const char *start,
+ const char *eof, const char **eol)
+{
+ int ret = 0;
+
+ *eol = memchr(start, '\n', eof - start);
+ if (!*eol) {
+ struct fsck_ref_report report = { 0 };
+
+ report.path = packed_entry->buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
+ "'%.*s' is not terminated with a newline",
+ (int)(eof - start), start);
+
+ /*
+ * There is no newline but we still want to parse it to the end of
+ * the buffer.
+ */
+ *eol = eof;
+ }
+
+ return ret;
+}
+
+static int packed_fsck_ref_header(struct fsck_options *o,
+ const char *start, const char *eol)
+{
+ if (!starts_with(start, "# pack-refs with:")) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs.header";
+
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_HEADER,
+ "'%.*s' does not start with '# pack-refs with:'",
+ (int)(eol - start), start);
+ }
+
+ return 0;
+}
+
+static int packed_fsck_ref_content(struct fsck_options *o,
+ const char *start, const char *eof)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
+ unsigned long line_number = 1;
+ const char *eol;
+ int ret = 0;
+
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
+ if (*start == '#') {
+ ret |= packed_fsck_ref_header(o, start, eol);
+
+ start = eol + 1;
+ line_number++;
+ }
+
+ strbuf_release(&packed_entry);
+ return ret;
+}
+
static int packed_fsck(struct ref_store *ref_store,
struct fsck_options *o,
struct worktree *wt)
{
struct packed_ref_store *refs = packed_downcast(ref_store,
REF_STORE_READ, "fsck");
+ struct strbuf packed_ref_content = STRBUF_INIT;
int ret = 0;
int fd;
@@ -1786,7 +1850,16 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
+ if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
+ ret = error_errno(_("unable to read %s"), refs->path);
+ goto cleanup;
+ }
+
+ ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ packed_ref_content.buf + packed_ref_content.len);
+
cleanup:
+ strbuf_release(&packed_ref_content);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 42c8d4ca1e..da321f16c6 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -639,4 +639,29 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
)
'
+test_expect_success 'packed-refs header should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
+ for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
+ "# pack-refs with traits: peeled fully-peeled sorted " \
+ "# pack-refs with a: peeled fully-peeled"
+ do
+ printf "%s\n" "$bad_header" >.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs.header: badPackedRefHeader: '\''$bad_header'\'' does not start with '\''# pack-refs with:'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err || return 1
+ done
+ )
+'
+
test_done
--
2.48.1
* [PATCH v3 5/8] packed-backend: check whether the refname contains NUL characters
2025-02-06 5:56 ` [PATCH v3 0/8] add more ref consistency checks shejialuo
` (3 preceding siblings ...)
2025-02-06 5:59 ` [PATCH v3 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
@ 2025-02-06 5:59 ` shejialuo
2025-02-06 5:59 ` [PATCH v3 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
` (3 subsequent siblings)
8 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-06 5:59 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
"packed-backend.c::next_record" uses "check_refname_format" to check
the consistency of the refname. If it is not OK, the program dies.
However, as reported in [1], we fail to catch some corruption. Since the
code path already exists, the existing check must be missing something.
We use the following code to get the refname:
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
In the above code, `p` is the start pointer of the refname and `eol` is
the next newline pointer. We calculate the length of the refname by
subtracting the two pointers. Then we copy the memory range between `p`
and `eol` to get the refname.
However, if there are NUL characters in the memory range between `p`
and `eol`, we treat the refname as valid as long as the memory range
between `p` and the first NUL character is a valid refname.
In order to catch this corruption, create a new function
"refname_contains_nul" that searches for the first NUL character within
the length of the refname. If one is found, the refname contains an
embedded NUL character.
Use this function in "next_record" to die if "refname_contains_nul"
returns true.
[1] https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/
Reported-by: R. Diez <rdiez-temp3@rd10.de>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
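To illustrate why a length-aware search is needed, here is a small
sketch that works on a raw buffer instead of a "struct strbuf". The
helper name "bytes_contain_nul" is hypothetical. A string function such
as strlen() stops at the first NUL and therefore sees only the valid
prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 when the `len` bytes at `buf` contain a NUL byte. */
static int bytes_contain_nul(const char *buf, size_t len)
{
	assert(buf != NULL);
	return memchr(buf, '\0', len) != NULL;
}
```

For the 23 bytes of "refs/heads/main\0garbage", strlen() reports 15 and
a check on the truncated name would pass, while the helper above detects
the embedded NUL.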
refs/packed-backend.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 683cfe78dc..c8bb93bb18 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -494,6 +494,21 @@ static void verify_buffer_safe(struct snapshot *snapshot)
last_line, eof - last_line);
}
+/*
+ * When parsing the "packed-refs" file, we will parse it line by line.
+ * Because we know the start pointer of the refname and the next
+ * newline pointer, we could calculate the length of the refname by
+ * subtracting the two pointers. However, there is a corner case where
+ * the refname contains corrupted embedded NUL characters. And
+ * `check_refname_format()` will not catch this when the truncated
+ * refname is still a valid refname. To prevent this, we need to check
+ * whether the refname contains the NUL characters.
+ */
+static int refname_contains_nul(struct strbuf *refname)
+{
+ return !!memchr(refname->buf, '\0', refname->len);
+}
+
#define SMALL_FILE_SIZE (32*1024)
/*
@@ -895,6 +910,9 @@ static int next_record(struct packed_ref_iterator *iter)
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
+ if (refname_contains_nul(&iter->refname_buf))
+ die("packed refname contains embedded NULL: %s", iter->base.refname);
+
if (check_refname_format(iter->base.refname, REFNAME_ALLOW_ONELEVEL)) {
if (!refname_is_safe(iter->base.refname))
die("packed refname is dangerous: %s",
--
2.48.1
* [PATCH v3 6/8] packed-backend: add "packed-refs" entry consistency check
2025-02-06 5:56 ` [PATCH v3 0/8] add more ref consistency checks shejialuo
` (4 preceding siblings ...)
2025-02-06 5:59 ` [PATCH v3 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
@ 2025-02-06 5:59 ` shejialuo
2025-02-12 9:56 ` Patrick Steinhardt
2025-02-06 5:59 ` [PATCH v3 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
` (2 subsequent siblings)
8 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-02-06 5:59 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
"packed-backend.c::next_record" parses each ref entry and checks its
consistency. This function already checks the following things:
1. Parse the main line of the ref entry and check whether the oid is
valid. Then, check whether the next character is a space. Then
check the refname.
2. If the next line starts with '^', continue to parse the peeled oid
and check whether the last character is '\n'.
As we have decided to implement the ref consistency check for
"packed-refs", let's port these two checks and update the test to
exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 3 ++
fsck.h | 1 +
refs/packed-backend.c | 95 ++++++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 42 ++++++++++++++++
4 files changed, 140 insertions(+), 1 deletion(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 11906f90fd..02a7bf0503 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -16,6 +16,9 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefEntry`::
+ (ERROR) The "packed-refs" file contains an invalid entry.
+
`badPackedRefHeader`::
(ERROR) The "packed-refs" file contains an invalid
header.
diff --git a/fsck.h b/fsck.h
index 67e3c97bc0..14d70f6653 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_ENTRY, ERROR) \
FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index c8bb93bb18..658f6bc7da 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1809,10 +1809,83 @@ static int packed_fsck_ref_header(struct fsck_options *o,
return 0;
}
+static int packed_fsck_ref_peeled_line(struct fsck_options *o,
+ struct ref_store *ref_store,
+ struct strbuf *packed_entry,
+ const char *start, const char *eol)
+{
+ struct fsck_ref_report report = { 0 };
+ struct object_id peeled;
+ const char *p;
+
+ report.path = packed_entry->buf;
+
+ /*
+ * Skip the '^' and parse the peeled oid.
+ */
+ start++;
+ if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo))
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid peeled oid",
+ (int)(eol - start), start);
+
+ if (p != eol)
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has trailing garbage after peeled oid '%.*s'",
+ (int)(eol - p), p);
+
+ return 0;
+}
+
+static int packed_fsck_ref_main_line(struct fsck_options *o,
+ struct ref_store *ref_store,
+ struct strbuf *packed_entry,
+ struct strbuf *refname,
+ const char *start, const char *eol)
+{
+ struct fsck_ref_report report = { 0 };
+ struct object_id oid;
+ const char *p;
+
+ report.path = packed_entry->buf;
+
+ if (parse_oid_hex_algop(start, &oid, &p, ref_store->repo->hash_algo))
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid oid",
+ (int)(eol - start), start);
+
+ if (p == eol || !isspace(*p))
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has no space after oid '%s' but with '%.*s'",
+ oid_to_hex(&oid), (int)(eol - p), p);
+
+ p++;
+ strbuf_reset(refname);
+ strbuf_add(refname, p, eol - p);
+ if (refname_contains_nul(refname))
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "refname '%s' contains NULL binaries",
+ refname->buf);
+
+ if (check_refname_format(refname->buf, 0))
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_NAME,
+ "has bad refname '%s'", refname->buf);
+
+ return 0;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
+ struct ref_store *ref_store,
const char *start, const char *eof)
{
struct strbuf packed_entry = STRBUF_INIT;
+ struct strbuf refname = STRBUF_INIT;
unsigned long line_number = 1;
const char *eol;
int ret = 0;
@@ -1826,6 +1899,26 @@ static int packed_fsck_ref_content(struct fsck_options *o,
line_number++;
}
+ while (start < eof) {
+ strbuf_reset(&packed_entry);
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
+ ret |= packed_fsck_ref_main_line(o, ref_store, &packed_entry, &refname, start, eol);
+ start = eol + 1;
+ line_number++;
+ if (start < eof && *start == '^') {
+ strbuf_reset(&packed_entry);
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
+ ret |= packed_fsck_ref_peeled_line(o, ref_store, &packed_entry,
+ start, eol);
+ start = eol + 1;
+ line_number++;
+ }
+ }
+
+ strbuf_release(&packed_entry);
+ strbuf_release(&refname);
strbuf_release(&packed_entry);
return ret;
}
@@ -1873,7 +1966,7 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
- ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ ret = packed_fsck_ref_content(o, ref_store, packed_ref_content.buf,
packed_ref_content.buf + packed_ref_content.len);
cleanup:
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index da321f16c6..3ab6b5bba5 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -664,4 +664,46 @@ test_expect_success 'packed-refs header should be checked' '
)
'
+test_expect_success 'packed-refs content should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ git tag -a annotated-tag-2 -m tag-2 &&
+
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_2_oid=$(git rev-parse annotated-tag-2) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ tag_2_peeled_oid=$(git rev-parse annotated-tag-2^{}) &&
+ short_oid=$(printf "%s" $tag_1_peeled_oid | cut -c 1-4) &&
+
+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
+ printf "%s\n" "$short_oid refs/heads/branch-1" >>.git/packed-refs &&
+ printf "%sx\n" "$branch_1_oid" >>.git/packed-refs &&
+ printf "%s refs/heads/bad-branch\n" "$branch_2_oid" >>.git/packed-refs &&
+ printf "%s refs/heads/branch.\n" "$branch_2_oid" >>.git/packed-refs &&
+ printf "%s refs/tags/annotated-tag-3\n" "$tag_1_oid" >>.git/packed-refs &&
+ printf "^%s\n" "$short_oid" >>.git/packed-refs &&
+ printf "%s refs/tags/annotated-tag-4.\n" "$tag_2_oid" >>.git/packed-refs &&
+ printf "^%s garbage\n" "$tag_2_peeled_oid" >>.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: badPackedRefEntry: '\''$short_oid refs/heads/branch-1'\'' has invalid oid
+ error: packed-refs line 3: badPackedRefEntry: has no space after oid '\''$branch_1_oid'\'' but with '\''x'\''
+ error: packed-refs line 4: badRefName: has bad refname '\'' refs/heads/bad-branch'\''
+ error: packed-refs line 5: badRefName: has bad refname '\''refs/heads/branch.'\''
+ error: packed-refs line 7: badPackedRefEntry: '\''$short_oid'\'' has invalid peeled oid
+ error: packed-refs line 8: badRefName: has bad refname '\''refs/tags/annotated-tag-4.'\''
+ error: packed-refs line 9: badPackedRefEntry: has trailing garbage after peeled oid '\'' garbage'\''
+ EOF
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v3 7/8] packed-backend: check whether the "packed-refs" is sorted
2025-02-06 5:56 ` [PATCH v3 0/8] add more ref consistency checks shejialuo
` (5 preceding siblings ...)
2025-02-06 5:59 ` [PATCH v3 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
@ 2025-02-06 5:59 ` shejialuo
2025-02-12 9:56 ` Patrick Steinhardt
2025-02-06 6:00 ` [PATCH v3 8/8] builtin/fsck: add `git refs verify` child process shejialuo
2025-02-14 4:50 ` [PATCH v4 0/8] add more ref consistency checks shejialuo
8 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-02-06 5:59 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
When there is a "sorted" trait in the header of the "packed-refs" file,
it means that the entries are sorted in increasing order by refname. We
should add checks to verify whether the "packed-refs" is sorted in this
case.
Update "packed_fsck_ref_header" to detect whether there is a "sorted"
trait in the header. Then, create a new structure "fsck_packed_ref_entry"
to store the state during the parsing process for every entry. It may
seem that we could just add a new "struct strbuf refname" into "struct
fsck_packed_ref_entry", store the refname into this structure during
parsing, and compare later. However, this is not a good design for the
following reasons:
1. Because we need to store the state across the whole checking
lifetime, we would consume a lot of memory if there are many entries
in the "packed-refs" file.
2. Most importantly, we could not reuse the existing compare functions,
which would cause repetition.
So, instead of storing a "struct strbuf", let's use the existing
structure "struct snapshot_record". Thus we can use the existing
function "cmp_packed_ref_records".
However, this function needs an extra parameter for "struct snapshot".
Extract the common part into a new function "cmp_packed_refname" so that
we can reuse it for comparison.
Then, create a new function "packed_fsck_ref_sorted" which uses the new
fsck message "packedRefUnsorted(ERROR)" to report to the user.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
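The sortedness check can be sketched roughly as below. The comparison
loop is modeled on the newline-terminated comparison that the patch
extracts into "cmp_packed_refname", as far as the hunk shows it. The
names "cmp_newline_terminated" and "first_unsorted" are hypothetical,
and treating equal neighbours as sorted is an assumption of this sketch,
not something the commit message states. Every refname passed in must
end with '\n'.

```c
#include <assert.h>
#include <stddef.h>

/* Compare two refnames that each end at the first '\n'. */
static int cmp_newline_terminated(const char *r1, const char *r2)
{
	while (1) {
		if (*r1 == '\n')
			return *r2 == '\n' ? 0 : -1;
		if (*r1 != *r2) {
			if (*r2 == '\n')
				return 1;
			return (unsigned char)*r1 < (unsigned char)*r2 ? -1 : 1;
		}
		r1++;
		r2++;
	}
}

/*
 * Return the index of the first refname that sorts before its
 * predecessor, or `nr` when the whole array is in order.
 */
static size_t first_unsorted(const char **refnames, size_t nr)
{
	assert(refnames != NULL || nr == 0);
	for (size_t i = 1; i < nr; i++)
		if (cmp_newline_terminated(refnames[i - 1], refnames[i]) > 0)
			return i;
	return nr;
}
```

Comparing only neighbouring entries is enough to decide whether the
whole sequence is sorted, so the check needs a single pass and no copy
of the refnames.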
Documentation/fsck-msgids.txt | 3 +
fsck.h | 1 +
refs/packed-backend.c | 131 ++++++++++++++++++++++++++++++----
t/t0602-reffiles-fsck.sh | 63 ++++++++++++++++
4 files changed, 183 insertions(+), 15 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 02a7bf0503..9601fff228 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -187,6 +187,9 @@
(ERROR) The "packed-refs" file contains an entry that is
not terminated by a newline.
+`packedRefUnsorted`::
+ (ERROR) The "packed-refs" file is not sorted.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
diff --git a/fsck.h b/fsck.h
index 14d70f6653..19f3cb2773 100644
--- a/fsck.h
+++ b/fsck.h
@@ -56,6 +56,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
+ FUNC(PACKED_REF_UNSORTED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 658f6bc7da..0fbdc5c3fa 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -300,14 +300,9 @@ struct snapshot_record {
size_t len;
};
-static int cmp_packed_ref_records(const void *v1, const void *v2,
- void *cb_data)
-{
- const struct snapshot *snapshot = cb_data;
- const struct snapshot_record *e1 = v1, *e2 = v2;
- const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
- const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+static int cmp_packed_refname(const char *r1, const char *r2)
+{
while (1) {
if (*r1 == '\n')
return *r2 == '\n' ? 0 : -1;
@@ -322,6 +317,17 @@ static int cmp_packed_ref_records(const void *v1, const void *v2,
}
}
+static int cmp_packed_ref_records(const void *v1, const void *v2,
+ void *cb_data)
+{
+ const struct snapshot *snapshot = cb_data;
+ const struct snapshot_record *e1 = v1, *e2 = v2;
+ const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
+ const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+
+ return cmp_packed_refname(r1, r2);
+}
+
/*
* Compare a snapshot record at `rec` to the specified NUL-terminated
* refname.
@@ -1767,6 +1773,28 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
+struct fsck_packed_ref_entry {
+ unsigned long line_number;
+
+ struct snapshot_record record;
+};
+
+static struct fsck_packed_ref_entry *create_fsck_packed_ref_entry(unsigned long line_number,
+ const char *start)
+{
+ struct fsck_packed_ref_entry *entry = xcalloc(1, sizeof(*entry));
+ entry->line_number = line_number;
+ entry->record.start = start;
+ return entry;
+}
+
+static void free_fsck_packed_ref_entries(struct fsck_packed_ref_entry **entries, size_t nr)
+{
+ for (size_t i = 0; i < nr; i++)
+ free(entries[i]);
+ free(entries);
+}
+
static int packed_fsck_ref_next_line(struct fsck_options *o,
struct strbuf *packed_entry, const char *start,
const char *eof, const char **eol)
@@ -1794,19 +1822,33 @@ static int packed_fsck_ref_next_line(struct fsck_options *o,
}
static int packed_fsck_ref_header(struct fsck_options *o,
- const char *start, const char *eol)
+ const char *start, const char *eol,
+ unsigned int *sorted)
{
- if (!starts_with(start, "# pack-refs with:")) {
+ struct string_list traits = STRING_LIST_INIT_NODUP;
+ char *tmp_line;
+ int ret = 0;
+ char *p;
+
+ tmp_line = xmemdupz(start, eol - start);
+ if (!skip_prefix(tmp_line, "# pack-refs with:", (const char **)&p)) {
struct fsck_ref_report report = { 0 };
report.path = "packed-refs.header";
- return fsck_report_ref(o, &report,
- FSCK_MSG_BAD_PACKED_REF_HEADER,
- "'%.*s' does not start with '# pack-refs with:'",
- (int)(eol - start), start);
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_HEADER,
+ "'%.*s' does not start with '# pack-refs with:'",
+ (int)(eol - start), start);
+ goto cleanup;
}
- return 0;
+ string_list_split_in_place(&traits, p, " ", -1);
+ *sorted = unsorted_string_list_has_string(&traits, "sorted");
+
+cleanup:
+ free(tmp_line);
+ string_list_clear(&traits, 0);
+ return ret;
}
static int packed_fsck_ref_peeled_line(struct fsck_options *o,
@@ -1880,26 +1922,80 @@ static int packed_fsck_ref_main_line(struct fsck_options *o,
return 0;
}
+static int packed_fsck_ref_sorted(struct fsck_options *o,
+ struct ref_store *ref_store,
+ struct fsck_packed_ref_entry **entries,
+ size_t nr)
+{
+ size_t hexsz = ref_store->repo->hash_algo->hexsz;
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct strbuf refname1 = STRBUF_INIT;
+ struct strbuf refname2 = STRBUF_INIT;
+ int ret = 0;
+
+ for (size_t i = 1; i < nr; i++) {
+ const char *r1 = entries[i - 1]->record.start + hexsz + 1;
+ const char *r2 = entries[i]->record.start + hexsz + 1;
+
+ if (cmp_packed_refname(r1, r2) >= 0) {
+ const char *err_fmt =
+ "refname '%s' is not less than next refname '%s'";
+ const char *eol;
+ eol = memchr(entries[i - 1]->record.start, '\n',
+ entries[i - 1]->record.len);
+ strbuf_add(&refname1, r1, eol - r1);
+ eol = memchr(entries[i]->record.start, '\n',
+ entries[i]->record.len);
+ strbuf_add(&refname2, r2, eol - r2);
+
+ strbuf_addf(&packed_entry, "packed-refs line %lu",
+ entries[i - 1]->line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_UNSORTED,
+ err_fmt, refname1.buf, refname2.buf);
+ goto cleanup;
+ }
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ strbuf_release(&refname1);
+ strbuf_release(&refname2);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
struct ref_store *ref_store,
const char *start, const char *eof)
{
struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_packed_ref_entry **entries;
struct strbuf refname = STRBUF_INIT;
unsigned long line_number = 1;
+ unsigned int sorted = 0;
+ size_t entry_alloc = 20;
+ size_t entry_nr = 0;
const char *eol;
int ret = 0;
strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
if (*start == '#') {
- ret |= packed_fsck_ref_header(o, start, eol);
+ ret |= packed_fsck_ref_header(o, start, eol, &sorted);
start = eol + 1;
line_number++;
}
+ ALLOC_ARRAY(entries, entry_alloc);
while (start < eof) {
+ struct fsck_packed_ref_entry *entry
+ = create_fsck_packed_ref_entry(line_number, start);
+
+ ALLOC_GROW(entries, entry_nr + 1, entry_alloc);
+ entries[entry_nr++] = entry;
strbuf_reset(&packed_entry);
strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
@@ -1915,11 +2011,16 @@ static int packed_fsck_ref_content(struct fsck_options *o,
start = eol + 1;
line_number++;
}
+ entry->record.len = start - entry->record.start;
}
+ if (!ret && sorted)
+ ret |= packed_fsck_ref_sorted(o, ref_store, entries, entry_nr);
+
strbuf_release(&packed_entry);
strbuf_release(&refname);
strbuf_release(&packed_entry);
+ free_fsck_packed_ref_entries(entries, entry_nr);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 3ab6b5bba5..adcb5c1bda 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -706,4 +706,67 @@ test_expect_success 'packed-refs content should be checked' '
)
'
+test_expect_success 'packed-ref with sorted trait should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
+ printf "%s %s\n" "$branch_2_oid" "$refname1" >>.git/packed-refs &&
+ printf "%s %s\n" "$branch_1_oid" "$refname2" >>.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: packedRefUnsorted: refname '\''$refname1'\'' is not less than next refname '\''$refname2'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err &&
+
+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
+ printf "%s %s\n" "$tag_1_oid" "$refname3" >>.git/packed-refs &&
+ printf "^%s\n" "$tag_1_peeled_oid" >>.git/packed-refs &&
+ printf "%s %s\n" "$branch_2_oid" "$refname2" >>.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: packedRefUnsorted: refname '\''$refname3'\'' is not less than next refname '\''$refname2'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
+ )
+'
+
+test_expect_success 'packed-ref without sorted trait should not be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
+ printf "# pack-refs with: peeled fully-peeled \n" >.git/packed-refs &&
+ printf "%s %s\n" "$branch_2_oid" "$refname1" >>.git/packed-refs &&
+ printf "%s %s\n" "$branch_1_oid" "$refname2" >>.git/packed-refs &&
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
test_done
--
2.48.1
^ permalink raw reply related [flat|nested] 168+ messages in thread
* [PATCH v3 8/8] builtin/fsck: add `git refs verify` child process
2025-02-06 5:56 ` [PATCH v3 0/8] add more ref consistency checks shejialuo
` (6 preceding siblings ...)
2025-02-06 5:59 ` [PATCH v3 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
@ 2025-02-06 6:00 ` shejialuo
2025-02-12 9:56 ` Patrick Steinhardt
2025-02-14 4:50 ` [PATCH v4 0/8] add more ref consistency checks shejialuo
8 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-02-06 6:00 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
We have now implemented the ref consistency checks for both
"files-backend" and "packed-backend". Although some of the checks are
redundant, they do no harm. So, let's integrate them into the
"git-fsck(1)" command to get feedback from users. Calling "git refs
verify" from "git-fsck(1)" also makes sure that the newly added checks
don't break.
Introduce a new function "fsck_refs" that initializes and runs a child
process to execute the "git refs verify" command. To give the user
feedback, create a progress meter whose total number of tasks is 1,
because it is hard to know up front how many loose refs we will check.
We might improve this later.
Then, introduce an option that allows the user to disable checking the
consistency of the ref database. Run this function first in
"git-fsck(1)", because the existing code in "git-fsck(1)" implicitly
checks the consistency of refs, and we don't want it to die before the
new checks have run.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/git-fsck.txt | 6 +++++-
builtin/fsck.c | 33 ++++++++++++++++++++++++++++++++-
2 files changed, 37 insertions(+), 2 deletions(-)
diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index 5b82e4605c..9bd433028f 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -12,7 +12,7 @@ SYNOPSIS
'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
[--[no-]full] [--strict] [--verbose] [--lost-found]
[--[no-]dangling] [--[no-]progress] [--connectivity-only]
- [--[no-]name-objects] [<object>...]
+ [--[no-]name-objects] [--[no-]references] [<object>...]
DESCRIPTION
-----------
@@ -104,6 +104,10 @@ care about this output and want to speed it up further.
progress status even if the standard error stream is not
directed to a terminal.
+--[no-]references::
+ Control whether to check the references database consistency
+ via 'git refs verify'. See linkgit:git-refs[1] for details.
+
CONFIGURATION
-------------
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 7a4dcb0716..f4f395cfbd 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -50,6 +50,7 @@ static int verbose;
static int show_progress = -1;
static int show_dangling = 1;
static int name_objects;
+static int check_references = 1;
#define ERROR_OBJECT 01
#define ERROR_REACHABLE 02
#define ERROR_PACK 04
@@ -905,11 +906,37 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress)
return res;
}
+static void fsck_refs(struct repository *r)
+{
+ struct child_process refs_verify = CHILD_PROCESS_INIT;
+ struct progress *progress = NULL;
+
+ if (show_progress)
+ progress = start_progress(r, _("Checking ref database"), 1);
+
+ if (verbose)
+ fprintf_ln(stderr, _("Checking ref database"));
+
+ child_process_init(&refs_verify);
+ refs_verify.git_cmd = 1;
+ strvec_pushl(&refs_verify.args, "refs", "verify", NULL);
+ if (verbose)
+ strvec_push(&refs_verify.args, "--verbose");
+ if (check_strict)
+ strvec_push(&refs_verify.args, "--strict");
+
+ if (run_command(&refs_verify))
+ errors_found |= ERROR_REFS;
+
+ display_progress(progress, 1);
+ stop_progress(&progress);
+}
+
static char const * const fsck_usage[] = {
N_("git fsck [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]\n"
" [--[no-]full] [--strict] [--verbose] [--lost-found]\n"
" [--[no-]dangling] [--[no-]progress] [--connectivity-only]\n"
- " [--[no-]name-objects] [<object>...]"),
+ " [--[no-]name-objects] [--[no-]references] [<object>...]"),
NULL
};
@@ -928,6 +955,7 @@ static struct option fsck_opts[] = {
N_("write dangling objects in .git/lost-found")),
OPT_BOOL(0, "progress", &show_progress, N_("show progress")),
OPT_BOOL(0, "name-objects", &name_objects, N_("show verbose names for reachable objects")),
+ OPT_BOOL(0, "references", &check_references, N_("check reference database consistency")),
OPT_END(),
};
@@ -970,6 +998,9 @@ int cmd_fsck(int argc,
git_config(git_fsck_config, &fsck_obj_options);
prepare_repo_settings(the_repository);
+ if (check_references)
+ fsck_refs(the_repository);
+
if (connectivity_only) {
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
for_each_packed_object(the_repository,
--
2.48.1
* Re: [PATCH v3 4/8] packed-backend: add "packed-refs" header consistency check
2025-02-06 5:59 ` [PATCH v3 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
@ 2025-02-12 9:56 ` Patrick Steinhardt
2025-02-12 10:12 ` shejialuo
2025-02-12 17:48 ` Junio C Hamano
0 siblings, 2 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-12 9:56 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Thu, Feb 06, 2025 at 01:59:04PM +0800, shejialuo wrote:
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 6401cecd5f..683cfe78dc 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1749,12 +1749,76 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> +static int packed_fsck_ref_header(struct fsck_options *o,
> + const char *start, const char *eol)
> +{
> + if (!starts_with(start, "# pack-refs with:")) {
> + struct fsck_ref_report report = { 0 };
> + report.path = "packed-refs.header";
> +
> + return fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_PACKED_REF_HEADER,
> + "'%.*s' does not start with '# pack-refs with:'",
> + (int)(eol - start), start);
> + }
> +
> + return 0;
> +}
Okay. We still complain about bad headers, but only if there is a line
starting with "#" and only if the prefix doesn't match. This addresses
Junio's comment that packed-refs files don't have to have a header, and
that they may contain capabilities that we don't understand.
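The three outcomes described above (no header, acceptable header, bad
header) can be sketched as a small classifier. This is only an
illustration under stated assumptions: "classify_header" and the
"header_status" enum are hypothetical names that do not exist in
packed-backend.c, and the patch itself reports through fsck_report_ref()
rather than returning an enum.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result type for this sketch; not part of Git. */
enum header_status { HEADER_ABSENT, HEADER_OK, HEADER_BAD };

/*
 * Classify the first line of a packed-refs buffer [start, eof).
 * A file without a leading '#' simply has no header. A '#' line is
 * only bad when it lacks the "# pack-refs with:" prefix; whatever
 * capabilities follow the prefix are not inspected here.
 */
static enum header_status classify_header(const char *start, const char *eof)
{
	static const char prefix[] = "# pack-refs with:";
	const size_t plen = sizeof(prefix) - 1;

	assert(start <= eof);
	if (start == eof || *start != '#')
		return HEADER_ABSENT;
	if ((size_t)(eof - start) < plen || memcmp(start, prefix, plen))
		return HEADER_BAD;
	return HEADER_OK;
}
```

With this shape, a header carrying an unknown capability is still
classified as HEADER_OK, and a file that starts directly with a ref
entry is HEADER_ABSENT rather than an error.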
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index 42c8d4ca1e..da321f16c6 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -639,4 +639,29 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
> )
> '
>
> +test_expect_success 'packed-refs header should be checked' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + (
> + cd repo &&
> + test_commit default &&
> +
> + git refs verify 2>err &&
> + test_must_be_empty err &&
> +
> + for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
> + "# pack-refs with traits: peeled fully-peeled sorted " \
> + "# pack-refs with a: peeled fully-peeled"
Instead of verifying thrice that we complain about bad header prefixes,
should we maybe replace two of these with instances where we check a
packed-refs file _without_ a header and one with capabilities that we
don't understand?
Patrick
* Re: [PATCH v3 6/8] packed-backend: add "packed-refs" entry consistency check
2025-02-06 5:59 ` [PATCH v3 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
@ 2025-02-12 9:56 ` Patrick Steinhardt
2025-02-12 10:18 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-12 9:56 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Thu, Feb 06, 2025 at 01:59:40PM +0800, shejialuo wrote:
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index c8bb93bb18..658f6bc7da 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1826,6 +1899,26 @@ static int packed_fsck_ref_content(struct fsck_options *o,
> line_number++;
> }
>
> + while (start < eof) {
> + strbuf_reset(&packed_entry);
> + strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
Instead of greedily computing the name of the line, can we pass in the
line number? The motivation is that in a well-formatted packed-refs file
we won't ever need this string at all, so it's wasteful to proactively
compute it for every single line.
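A minimal sketch of that idea, assuming a hypothetical reporting helper:
the checker receives only the line number, and the "packed-refs line N"
string is formatted on the error path alone. Both "report_line_error"
and "check_entry" are made-up names for illustration; the real code
would go through fsck_report_ref() instead of writing into a buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the fsck reporting call. The label is
 * built here, so it only costs anything when an error is found.
 */
static int report_line_error(char *out, size_t outsz,
			     unsigned long line_number, const char *msg)
{
	snprintf(out, outsz, "packed-refs line %lu: %s", line_number, msg);
	return -1;
}

/*
 * Check that the entry starting at `start` is newline-terminated
 * before `eof`. Well-formed lines never touch `err`.
 */
static int check_entry(const char *start, const char *eof,
		       unsigned long line_number, char *err, size_t errsz)
{
	assert(start <= eof);
	if (!memchr(start, '\n', (size_t)(eof - start)))
		return report_line_error(err, errsz, line_number,
					 "entry is not terminated by a newline");
	return 0;
}
```

For a well-formed file the loop then does no string formatting at all,
which is the saving the comment above asks for.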
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index da321f16c6..3ab6b5bba5 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -664,4 +664,46 @@ test_expect_success 'packed-refs header should be checked' '
> )
> '
>
> +test_expect_success 'packed-refs content should be checked' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + (
> + cd repo &&
> + test_commit default &&
> + git branch branch-1 &&
> + git branch branch-2 &&
> + git tag -a annotated-tag-1 -m tag-1 &&
> + git tag -a annotated-tag-2 -m tag-2 &&
> +
> + branch_1_oid=$(git rev-parse branch-1) &&
> + branch_2_oid=$(git rev-parse branch-2) &&
> + tag_1_oid=$(git rev-parse annotated-tag-1) &&
> + tag_2_oid=$(git rev-parse annotated-tag-2) &&
> + tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
> + tag_2_peeled_oid=$(git rev-parse annotated-tag-2^{}) &&
> + short_oid=$(printf "%s" $tag_1_peeled_oid | cut -c 1-4) &&
> +
> + printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
> + printf "%s\n" "$short_oid refs/heads/branch-1" >>.git/packed-refs &&
> + printf "%sx\n" "$branch_1_oid" >>.git/packed-refs &&
> + printf "%s refs/heads/bad-branch\n" "$branch_2_oid" >>.git/packed-refs &&
> + printf "%s refs/heads/branch.\n" "$branch_2_oid" >>.git/packed-refs &&
> + printf "%s refs/tags/annotated-tag-3\n" "$tag_1_oid" >>.git/packed-refs &&
> + printf "^%s\n" "$short_oid" >>.git/packed-refs &&
> + printf "%s refs/tags/annotated-tag-4.\n" "$tag_2_oid" >>.git/packed-refs &&
> + printf "^%s garbage\n" "$tag_2_peeled_oid" >>.git/packed-refs &&
This can be simplified using HERE docs.
cat >.git/packed-refs <<-EOF &&
# pack-refs with: peeled fully-peeled sorted
$short_oid refs/heads/branch-1
${branch_1_oid}x
$branch_2_oid refs/heads/bad-branch
$branch_2_oid refs/heads/branch.
$tag_1_oid refs/tags/annotated-tag-3
^$short_oid
$tag_2_oid refs/tags/annotated-tag-4.
^$tag_2_peeled_oid garbage
EOF
Patrick
* Re: [PATCH v3 8/8] builtin/fsck: add `git refs verify` child process
2025-02-06 6:00 ` [PATCH v3 8/8] builtin/fsck: add `git refs verify` child process shejialuo
@ 2025-02-12 9:56 ` Patrick Steinhardt
2025-02-12 10:21 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-12 9:56 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Thu, Feb 06, 2025 at 02:00:07PM +0800, shejialuo wrote:
> diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
> index 5b82e4605c..9bd433028f 100644
> --- a/Documentation/git-fsck.txt
> +++ b/Documentation/git-fsck.txt
> @@ -12,7 +12,7 @@ SYNOPSIS
> 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
> [--[no-]full] [--strict] [--verbose] [--lost-found]
> [--[no-]dangling] [--[no-]progress] [--connectivity-only]
> - [--[no-]name-objects] [<object>...]
> + [--[no-]name-objects] [--[no-]references] [<object>...]
>
> DESCRIPTION
> -----------
> @@ -104,6 +104,10 @@ care about this output and want to speed it up further.
> progress status even if the standard error stream is not
> directed to a terminal.
>
> +--[no-]references::
> + Control whether to check the references database consistency
> + via 'git refs verify'. See linkgit:git-refs[1] for details.
I think we should note the default, which is to check them.
It would also be nice to have a couple of tests to verify that the flag
does what it is intended to do.
Patrick
* Re: [PATCH v3 7/8] packed-backend: check whether the "packed-refs" is sorted
2025-02-06 5:59 ` [PATCH v3 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
@ 2025-02-12 9:56 ` Patrick Steinhardt
2025-02-12 10:20 ` shejialuo
2025-02-12 10:56 ` shejialuo
0 siblings, 2 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-12 9:56 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Thu, Feb 06, 2025 at 01:59:55PM +0800, shejialuo wrote:
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 658f6bc7da..0fbdc5c3fa 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1767,6 +1773,28 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> return empty_ref_iterator_begin();
> }
>
> +struct fsck_packed_ref_entry {
> + unsigned long line_number;
> +
> + struct snapshot_record record;
> +};
> +
> +static struct fsck_packed_ref_entry *create_fsck_packed_ref_entry(unsigned long line_number,
> + const char *start)
> +{
> + struct fsck_packed_ref_entry *entry = xcalloc(1, sizeof(*entry));
> + entry->line_number = line_number;
> + entry->record.start = start;
> + return entry;
> +}
> +
> +static void free_fsck_packed_ref_entries(struct fsck_packed_ref_entry **entries, size_t nr)
> +{
> + for (size_t i = 0; i < nr; i++)
> + free(entries[i]);
> + free(entries);
> +}
> +
> static int packed_fsck_ref_next_line(struct fsck_options *o,
> struct strbuf *packed_entry, const char *start,
> const char *eof, const char **eol)
> @@ -1794,19 +1822,33 @@ static int packed_fsck_ref_next_line(struct fsck_options *o,
> }
>
> static int packed_fsck_ref_header(struct fsck_options *o,
> - const char *start, const char *eol)
> + const char *start, const char *eol,
> + unsigned int *sorted)
> {
> - if (!starts_with(start, "# pack-refs with:")) {
> + struct string_list traits = STRING_LIST_INIT_NODUP;
> + char *tmp_line;
> + int ret = 0;
> + char *p;
> +
> + tmp_line = xmemdupz(start, eol - start);
> + if (!skip_prefix(tmp_line, "# pack-refs with:", (const char **)&p)) {
> struct fsck_ref_report report = { 0 };
> report.path = "packed-refs.header";
>
> - return fsck_report_ref(o, &report,
> - FSCK_MSG_BAD_PACKED_REF_HEADER,
> - "'%.*s' does not start with '# pack-refs with:'",
> - (int)(eol - start), start);
> + ret = fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_PACKED_REF_HEADER,
> + "'%.*s' does not start with '# pack-refs with:'",
> + (int)(eol - start), start);
> + goto cleanup;
> }
>
> - return 0;
> + string_list_split_in_place(&traits, p, " ", -1);
> + *sorted = unsorted_string_list_has_string(&traits, "sorted");
I think we call them capabilities, not traits.
[snip]
> static int packed_fsck_ref_content(struct fsck_options *o,
> struct ref_store *ref_store,
> const char *start, const char *eof)
> {
> struct strbuf packed_entry = STRBUF_INIT;
> + struct fsck_packed_ref_entry **entries;
> struct strbuf refname = STRBUF_INIT;
> unsigned long line_number = 1;
> + unsigned int sorted = 0;
> + size_t entry_alloc = 20;
> + size_t entry_nr = 0;
> const char *eol;
> int ret = 0;
>
> strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
> ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
> if (*start == '#') {
> - ret |= packed_fsck_ref_header(o, start, eol);
> + ret |= packed_fsck_ref_header(o, start, eol, &sorted);
>
> start = eol + 1;
> line_number++;
> }
>
> + ALLOC_ARRAY(entries, entry_alloc);
> while (start < eof) {
> + struct fsck_packed_ref_entry *entry
> + = create_fsck_packed_ref_entry(line_number, start);
Instead of slurping in all entries and allocating them in an array, can
we instead remember the last one and just compare that the last record
is smaller than the current record?
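A rough sketch of the single-pass variant, under a few assumptions: the
buffer is already known to be well-formed (every line ends in a newline
and every entry has an object ID of "hexsz" characters followed by a
space), and peeled lines starting with "^" carry no refname.
"cmp_refname_nl" mirrors the contract of "cmp_packed_refname" and
"first_unsorted_line" is a hypothetical name; neither is the code from
the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compare two newline-terminated refnames byte by byte. */
static int cmp_refname_nl(const char *r1, const char *r2)
{
	while (1) {
		if (*r1 == '\n')
			return *r2 == '\n' ? 0 : -1;
		if (*r1 != *r2) {
			if (*r2 == '\n')
				return 1;
			return (unsigned char)*r1 < (unsigned char)*r2 ? -1 : 1;
		}
		r1++;
		r2++;
	}
}

/*
 * Walk the buffer once, remembering only the previous refname.
 * Returns the 1-based line number of the first entry that is not
 * strictly less than the next entry, or 0 if the file is sorted.
 */
static unsigned long first_unsorted_line(const char *buf, size_t len,
					 size_t hexsz)
{
	const char *p = buf, *eof = buf + len;
	const char *prev = NULL;
	unsigned long line = 1, prev_line = 0;

	while (p < eof) {
		const char *eol = memchr(p, '\n', (size_t)(eof - p));

		assert(eol); /* sketch assumes newline-terminated lines */
		if (*p != '#' && *p != '^') {
			const char *name = p + hexsz + 1;

			if (prev && cmp_refname_nl(prev, name) >= 0)
				return prev_line;
			prev = name;
			prev_line = line;
		}
		p = eol + 1;
		line++;
	}
	return 0;
}
```

This keeps the memory use constant regardless of how many refs the file
holds, at the cost of reporting only the first out-of-order pair, which
is also what the array-based version in the patch does.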
> @@ -1915,11 +2011,16 @@ static int packed_fsck_ref_content(struct fsck_options *o,
> start = eol + 1;
> line_number++;
> }
> + entry->record.len = start - entry->record.start;
> }
>
> + if (!ret && sorted)
> + ret |= packed_fsck_ref_sorted(o, ref_store, entries, entry_nr);
Okay, we now conditionally check whether the refs are sorted based on
whether or not we found the "sorted" capability.
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index 3ab6b5bba5..adcb5c1bda 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -706,4 +706,67 @@ test_expect_success 'packed-refs content should be checked' '
> )
> '
>
> +test_expect_success 'packed-ref with sorted trait should be checked' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + (
> + cd repo &&
> + test_commit default &&
> + git branch branch-1 &&
> + git branch branch-2 &&
> + git tag -a annotated-tag-1 -m tag-1 &&
> + branch_1_oid=$(git rev-parse branch-1) &&
> + branch_2_oid=$(git rev-parse branch-2) &&
> + tag_1_oid=$(git rev-parse annotated-tag-1) &&
> + tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
> + refname1="refs/heads/main" &&
> + refname2="refs/heads/foo" &&
> + refname3="refs/tags/foo" &&
> + printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
> + printf "%s %s\n" "$branch_2_oid" "$refname1" >>.git/packed-refs &&
> + printf "%s %s\n" "$branch_1_oid" "$refname2" >>.git/packed-refs &&
Same comment here as in the previous patch, this can be simplified with
HERE docs.
> + test_must_fail git refs verify 2>err &&
> + cat >expect <<-EOF &&
> + error: packed-refs line 2: packedRefUnsorted: refname '\''$refname1'\'' is not less than next refname '\''$refname2'\''
> + EOF
> + rm .git/packed-refs &&
> + test_cmp expect err &&
> +
> + printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
> + printf "%s %s\n" "$tag_1_oid" "$refname3" >>.git/packed-refs &&
> + printf "^%s\n" "$tag_1_peeled_oid" >>.git/packed-refs &&
> + printf "%s %s\n" "$branch_2_oid" "$refname2" >>.git/packed-refs &&
> + test_must_fail git refs verify 2>err &&
> + cat >expect <<-EOF &&
> + error: packed-refs line 2: packedRefUnsorted: refname '\''$refname3'\'' is not less than next refname '\''$refname2'\''
> + EOF
> + rm .git/packed-refs &&
> + test_cmp expect err
> + )
> +'
> +
> +test_expect_success 'packed-ref without sorted trait should not be checked' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + (
> + cd repo &&
> + test_commit default &&
> + git branch branch-1 &&
> + git branch branch-2 &&
> + git tag -a annotated-tag-1 -m tag-1 &&
> + branch_1_oid=$(git rev-parse branch-1) &&
> + branch_2_oid=$(git rev-parse branch-2) &&
> + tag_1_oid=$(git rev-parse annotated-tag-1) &&
> + tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
> + refname1="refs/heads/main" &&
> + refname2="refs/heads/foo" &&
> + refname3="refs/tags/foo" &&
> + printf "# pack-refs with: peeled fully-peeled \n" >.git/packed-refs &&
> + printf "%s %s\n" "$branch_2_oid" "$refname1" >>.git/packed-refs &&
> + printf "%s %s\n" "$branch_1_oid" "$refname2" >>.git/packed-refs &&
And here.
Patrick
* Re: [PATCH v3 4/8] packed-backend: add "packed-refs" header consistency check
2025-02-12 9:56 ` Patrick Steinhardt
@ 2025-02-12 10:12 ` shejialuo
2025-02-12 17:48 ` Junio C Hamano
1 sibling, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-12 10:12 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Wed, Feb 12, 2025 at 10:56:43AM +0100, Patrick Steinhardt wrote:
[snip]
> > diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> > index 42c8d4ca1e..da321f16c6 100755
> > --- a/t/t0602-reffiles-fsck.sh
> > +++ b/t/t0602-reffiles-fsck.sh
> > @@ -639,4 +639,29 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
> > )
> > '
> >
> > +test_expect_success 'packed-refs header should be checked' '
> > + test_when_finished "rm -rf repo" &&
> > + git init repo &&
> > + (
> > + cd repo &&
> > + test_commit default &&
> > +
> > + git refs verify 2>err &&
> > + test_must_be_empty err &&
> > +
> > + for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
> > + "# pack-refs with traits: peeled fully-peeled sorted " \
> > + "# pack-refs with a: peeled fully-peeled"
>
> Instead of verifying thrice that we complain about bad header prefixes,
> should we maybe replace two of these with instances where we check a
> packed-refs file _without_ a header and one with capabilities that we
> don't understand?
>
I think we could add tests to verify that we don't complain in the two
cases above: a packed-refs file without a header, and one with
capabilities that we don't understand.
> Patrick
* Re: [PATCH v3 6/8] packed-backend: add "packed-refs" entry consistency check
2025-02-12 9:56 ` Patrick Steinhardt
@ 2025-02-12 10:18 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-12 10:18 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Wed, Feb 12, 2025 at 10:56:50AM +0100, Patrick Steinhardt wrote:
> On Thu, Feb 06, 2025 at 01:59:40PM +0800, shejialuo wrote:
> > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > index c8bb93bb18..658f6bc7da 100644
> > --- a/refs/packed-backend.c
> > +++ b/refs/packed-backend.c
> > @@ -1826,6 +1899,26 @@ static int packed_fsck_ref_content(struct fsck_options *o,
> > line_number++;
> > }
> >
> > + while (start < eof) {
> > + strbuf_reset(&packed_entry);
> > + strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
>
> Instead of greedily computing the name of the line, can we pass in the
> line number? The motivation is that in a well-formatted packed-refs file
> we won't ever need this string at all, so it's wasteful to proactively
> compute it for every single line.
>
I agree with you here, and I already have an idea for how to do this.
Let me improve this in the next version.
> > diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> > index da321f16c6..3ab6b5bba5 100755
> > --- a/t/t0602-reffiles-fsck.sh
> > +++ b/t/t0602-reffiles-fsck.sh
> > @@ -664,4 +664,46 @@ test_expect_success 'packed-refs header should be checked' '
> > )
> > '
> >
> > +test_expect_success 'packed-refs content should be checked' '
> > + test_when_finished "rm -rf repo" &&
> > + git init repo &&
> > + (
> > + cd repo &&
> > + test_commit default &&
> > + git branch branch-1 &&
> > + git branch branch-2 &&
> > + git tag -a annotated-tag-1 -m tag-1 &&
> > + git tag -a annotated-tag-2 -m tag-2 &&
> > +
> > + branch_1_oid=$(git rev-parse branch-1) &&
> > + branch_2_oid=$(git rev-parse branch-2) &&
> > + tag_1_oid=$(git rev-parse annotated-tag-1) &&
> > + tag_2_oid=$(git rev-parse annotated-tag-2) &&
> > + tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
> > + tag_2_peeled_oid=$(git rev-parse annotated-tag-2^{}) &&
> > + short_oid=$(printf "%s" $tag_1_peeled_oid | cut -c 1-4) &&
> > +
> > + printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
> > + printf "%s\n" "$short_oid refs/heads/branch-1" >>.git/packed-refs &&
> > + printf "%sx\n" "$branch_1_oid" >>.git/packed-refs &&
> > + printf "%s refs/heads/bad-branch\n" "$branch_2_oid" >>.git/packed-refs &&
> > + printf "%s refs/heads/branch.\n" "$branch_2_oid" >>.git/packed-refs &&
> > + printf "%s refs/tags/annotated-tag-3\n" "$tag_1_oid" >>.git/packed-refs &&
> > + printf "^%s\n" "$short_oid" >>.git/packed-refs &&
> > + printf "%s refs/tags/annotated-tag-4.\n" "$tag_2_oid" >>.git/packed-refs &&
> > + printf "^%s garbage\n" "$tag_2_peeled_oid" >>.git/packed-refs &&
>
> This can be simplified using HERE docs.
>
> cat >.git/packed-refs <<-EOF
> # pack-refs with: peeled fully-peeled sorted
> $short_oid refs/heads/branch-1
> ${branch_1_oid}x
> $branch_2_oid refs/heads/bad-branch
> $branch_2_oid refs/heads/branch.
> $tag_1_oid refs/tags/annotated-tag-3
> ^$short_oid
> $tag_2_oid refs/tags/annotated-tag-4.
> ^$tag_2_peeled_oid garbage
> EOF
>
Thanks for the suggestion, I will improve this in the next version.
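As a rough sketch of the HERE doc approach (the helper name
`write_packed_refs` and the placeholder oids are assumptions for
illustration, not part of the patch): each line of an unquoted HERE doc
is written as-is apart from variable expansion, so no "\n" escapes are
needed, and `${var}x` keeps the suffix from being read as part of the
variable name.

```shell
# Hypothetical helper: write a small packed-refs-like file to the path in
# $1, using the two object ids given in $2 and $3.
write_packed_refs () {
    file=$1
    oid_1=$2
    oid_2=$3
    # Plain "<<EOF" is used here so the sketch does not depend on tab
    # indentation, which "<<-EOF" would strip.
    cat >"$file" <<EOF
# pack-refs with: peeled fully-peeled sorted
$oid_1 refs/heads/branch-1
${oid_1}x
$oid_2 refs/heads/branch.
EOF
}
```

For example, `write_packed_refs .git/packed-refs "$branch_1_oid"
"$branch_2_oid"` would replace a run of printf calls with one block that
mirrors the file layout.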
> Patrick
* Re: [PATCH v3 7/8] packed-backend: check whether the "packed-refs" is sorted
2025-02-12 9:56 ` Patrick Steinhardt
@ 2025-02-12 10:20 ` shejialuo
2025-02-12 10:42 ` Patrick Steinhardt
2025-02-12 10:56 ` shejialuo
1 sibling, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-02-12 10:20 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Wed, Feb 12, 2025 at 10:56:56AM +0100, Patrick Steinhardt wrote:
> On Thu, Feb 06, 2025 at 01:59:55PM +0800, shejialuo wrote:
> > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > index 658f6bc7da..0fbdc5c3fa 100644
> > --- a/refs/packed-backend.c
> > +++ b/refs/packed-backend.c
> > @@ -1767,6 +1773,28 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> > return empty_ref_iterator_begin();
> > }
> >
> > +struct fsck_packed_ref_entry {
> > + unsigned long line_number;
> > +
> > + struct snapshot_record record;
> > +};
> > +
> > +static struct fsck_packed_ref_entry *create_fsck_packed_ref_entry(unsigned long line_number,
> > + const char *start)
> > +{
> > + struct fsck_packed_ref_entry *entry = xcalloc(1, sizeof(*entry));
> > + entry->line_number = line_number;
> > + entry->record.start = start;
> > + return entry;
> > +}
> > +
> > +static void free_fsck_packed_ref_entries(struct fsck_packed_ref_entry **entries, size_t nr)
> > +{
> > + for (size_t i = 0; i < nr; i++)
> > + free(entries[i]);
> > + free(entries);
> > +}
> > +
> > static int packed_fsck_ref_next_line(struct fsck_options *o,
> > struct strbuf *packed_entry, const char *start,
> > const char *eof, const char **eol)
> > @@ -1794,19 +1822,33 @@ static int packed_fsck_ref_next_line(struct fsck_options *o,
> > }
> >
> > static int packed_fsck_ref_header(struct fsck_options *o,
> > - const char *start, const char *eol)
> > + const char *start, const char *eol,
> > + unsigned int *sorted)
> > {
> > - if (!starts_with(start, "# pack-refs with:")) {
> > + struct string_list traits = STRING_LIST_INIT_NODUP;
> > + char *tmp_line;
> > + int ret = 0;
> > + char *p;
> > +
> > + tmp_line = xmemdupz(start, eol - start);
> > + if (!skip_prefix(tmp_line, "# pack-refs with:", (const char **)&p)) {
> > struct fsck_ref_report report = { 0 };
> > report.path = "packed-refs.header";
> >
> > - return fsck_report_ref(o, &report,
> > - FSCK_MSG_BAD_PACKED_REF_HEADER,
> > - "'%.*s' does not start with '# pack-refs with:'",
> > - (int)(eol - start), start);
> > + ret = fsck_report_ref(o, &report,
> > + FSCK_MSG_BAD_PACKED_REF_HEADER,
> > + "'%.*s' does not start with '# pack-refs with:'",
> > + (int)(eol - start), start);
> > + goto cleanup;
> > }
> >
> > - return 0;
> > + string_list_split_in_place(&traits, p, " ", -1);
> > + *sorted = unsorted_string_list_has_string(&traits, "sorted");
>
> I think we call them capabilities, not traits.
>
Yes, "capabilities" would be more meaningful. But the existing code in
"packed-backend.c" uses "traits", so let us follow the original style
for consistency.
> [snip]
> > static int packed_fsck_ref_content(struct fsck_options *o,
> > struct ref_store *ref_store,
> > const char *start, const char *eof)
> > {
> > struct strbuf packed_entry = STRBUF_INIT;
> > + struct fsck_packed_ref_entry **entries;
> > struct strbuf refname = STRBUF_INIT;
> > unsigned long line_number = 1;
> > + unsigned int sorted = 0;
> > + size_t entry_alloc = 20;
> > + size_t entry_nr = 0;
> > const char *eol;
> > int ret = 0;
> >
> > strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
> > ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
> > if (*start == '#') {
> > - ret |= packed_fsck_ref_header(o, start, eol);
> > + ret |= packed_fsck_ref_header(o, start, eol, &sorted);
> >
> > start = eol + 1;
> > line_number++;
> > }
> >
> > + ALLOC_ARRAY(entries, entry_alloc);
> > while (start < eof) {
> > + struct fsck_packed_ref_entry *entry
> > + = create_fsck_packed_ref_entry(line_number, start);
>
> Instead of slurping in all entries and allocating them in an array, can
> we instead remember the last one and just compare that the last record
> is smaller than the current record?
>
> > @@ -1915,11 +2011,16 @@ static int packed_fsck_ref_content(struct fsck_options *o,
> > start = eol + 1;
> > line_number++;
> > }
> > + entry->record.len = start - entry->record.start;
> > }
> >
> > + if (!ret && sorted)
> > + ret |= packed_fsck_ref_sorted(o, ref_store, entries, entry_nr);
>
> Okay, we now conditionally check whether the refs are sorted based on
> whether or not we found the "sorted" capability.
>
> > diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> > index 3ab6b5bba5..adcb5c1bda 100755
> > --- a/t/t0602-reffiles-fsck.sh
> > +++ b/t/t0602-reffiles-fsck.sh
> > @@ -706,4 +706,67 @@ test_expect_success 'packed-refs content should be checked' '
> > )
> > '
> >
> > +test_expect_success 'packed-ref with sorted trait should be checked' '
> > + test_when_finished "rm -rf repo" &&
> > + git init repo &&
> > + (
> > + cd repo &&
> > + test_commit default &&
> > + git branch branch-1 &&
> > + git branch branch-2 &&
> > + git tag -a annotated-tag-1 -m tag-1 &&
> > + branch_1_oid=$(git rev-parse branch-1) &&
> > + branch_2_oid=$(git rev-parse branch-2) &&
> > + tag_1_oid=$(git rev-parse annotated-tag-1) &&
> > + tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
> > + refname1="refs/heads/main" &&
> > + refname2="refs/heads/foo" &&
> > + refname3="refs/tags/foo" &&
> > + printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
> > + printf "%s %s\n" "$branch_2_oid" "$refname1" >>.git/packed-refs &&
> > + printf "%s %s\n" "$branch_1_oid" "$refname2" >>.git/packed-refs &&
>
> Same comment here as in the previous patch, this can be simplified with
> HERE docs.
>
> > + test_must_fail git refs verify 2>err &&
> > + cat >expect <<-EOF &&
> > + error: packed-refs line 2: packedRefUnsorted: refname '\''$refname1'\'' is not less than next refname '\''$refname2'\''
> > + EOF
> > + rm .git/packed-refs &&
> > + test_cmp expect err &&
> > +
> > + printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
> > + printf "%s %s\n" "$tag_1_oid" "$refname3" >>.git/packed-refs &&
> > + printf "^%s\n" "$tag_1_peeled_oid" >>.git/packed-refs &&
> > + printf "%s %s\n" "$branch_2_oid" "$refname2" >>.git/packed-refs &&
> > + test_must_fail git refs verify 2>err &&
> > + cat >expect <<-EOF &&
> > + error: packed-refs line 2: packedRefUnsorted: refname '\''$refname3'\'' is not less than next refname '\''$refname2'\''
> > + EOF
> > + rm .git/packed-refs &&
> > + test_cmp expect err
> > + )
> > +'
> > +
> > +test_expect_success 'packed-ref without sorted trait should not be checked' '
> > + test_when_finished "rm -rf repo" &&
> > + git init repo &&
> > + (
> > + cd repo &&
> > + test_commit default &&
> > + git branch branch-1 &&
> > + git branch branch-2 &&
> > + git tag -a annotated-tag-1 -m tag-1 &&
> > + branch_1_oid=$(git rev-parse branch-1) &&
> > + branch_2_oid=$(git rev-parse branch-2) &&
> > + tag_1_oid=$(git rev-parse annotated-tag-1) &&
> > + tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
> > + refname1="refs/heads/main" &&
> > + refname2="refs/heads/foo" &&
> > + refname3="refs/tags/foo" &&
> > + printf "# pack-refs with: peeled fully-peeled \n" >.git/packed-refs &&
> > + printf "%s %s\n" "$branch_2_oid" "$refname1" >>.git/packed-refs &&
> > + printf "%s %s\n" "$branch_1_oid" "$refname2" >>.git/packed-refs &&
>
> And here.
>
Thanks, I will improve this in the next version.
> Patrick
* Re: [PATCH v3 8/8] builtin/fsck: add `git refs verify` child process
2025-02-12 9:56 ` Patrick Steinhardt
@ 2025-02-12 10:21 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-12 10:21 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Wed, Feb 12, 2025 at 10:56:53AM +0100, Patrick Steinhardt wrote:
> On Thu, Feb 06, 2025 at 02:00:07PM +0800, shejialuo wrote:
> > diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
> > index 5b82e4605c..9bd433028f 100644
> > --- a/Documentation/git-fsck.txt
> > +++ b/Documentation/git-fsck.txt
> > @@ -12,7 +12,7 @@ SYNOPSIS
> > 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
> > [--[no-]full] [--strict] [--verbose] [--lost-found]
> > [--[no-]dangling] [--[no-]progress] [--connectivity-only]
> > - [--[no-]name-objects] [<object>...]
> > + [--[no-]name-objects] [--[no-]references] [<object>...]
> >
> > DESCRIPTION
> > -----------
> > @@ -104,6 +104,10 @@ care about this output and want to speed it up further.
> > progress status even if the standard error stream is not
> > directed to a terminal.
> >
> > +--[no-]references::
> > + Control whether to check the references database consistency
> > + via 'git refs verify'. See linkgit:git-refs[1] for details.
>
> I think we should note the default, which is to check them.
>
OK, let me improve the documentation in the next version.
> It would also be nice to have a couple of tests to verify that the flag
> does what it is intended to do.
>
Good idea, we could test via trailing contents to do this. Let me
improve this.
> Patrick
* Re: [PATCH v3 7/8] packed-backend: check whether the "packed-refs" is sorted
2025-02-12 10:20 ` shejialuo
@ 2025-02-12 10:42 ` Patrick Steinhardt
0 siblings, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-12 10:42 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Wed, Feb 12, 2025 at 06:20:01PM +0800, shejialuo wrote:
> On Wed, Feb 12, 2025 at 10:56:56AM +0100, Patrick Steinhardt wrote:
> > On Thu, Feb 06, 2025 at 01:59:55PM +0800, shejialuo wrote:
> > > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > > index 658f6bc7da..0fbdc5c3fa 100644
> > > --- a/refs/packed-backend.c
> > > +++ b/refs/packed-backend.c
> > > - return 0;
> > > + string_list_split_in_place(&traits, p, " ", -1);
> > > + *sorted = unsorted_string_list_has_string(&traits, "sorted");
> >
> > I think we call them capabilities, not traits.
> >
>
> Yes, "capabilities" would be more meaningful. But the existing code in
> "packed-backend.c" uses "traits", so let us follow the original style
> for consistency.
Interesting, TIL. But yeah, in that case we should continue to call them
traits.
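For readers following along, the trait lookup under discussion (split
the rest of the header on spaces, then look for the word "sorted") can
be approximated in shell as below. The helper name `header_has_trait`
is made up for this sketch, and it assumes the header prefix has already
been validated elsewhere:

```shell
# Hypothetical helper: succeed if the header line in $1 lists the trait
# named in $2. Header validation is assumed to happen separately.
header_has_trait () {
    header=$1
    trait=$2
    # Drop the fixed prefix; what remains is a space-separated trait list.
    traits=${header#"# pack-refs with:"}
    # Pad with spaces so that only whole words match ("unsorted" must not
    # count as "sorted").
    case " $traits " in
    *" $trait "*) return 0 ;;
    *) return 1 ;;
    esac
}
```

With this, `header_has_trait "$first_line" sorted` decides whether an
ordering check should run at all.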
Patrick
* Re: [PATCH v3 7/8] packed-backend: check whether the "packed-refs" is sorted
2025-02-12 9:56 ` Patrick Steinhardt
2025-02-12 10:20 ` shejialuo
@ 2025-02-12 10:56 ` shejialuo
1 sibling, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-12 10:56 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Wed, Feb 12, 2025 at 10:56:56AM +0100, Patrick Steinhardt wrote:
> > static int packed_fsck_ref_content(struct fsck_options *o,
> > struct ref_store *ref_store,
> > const char *start, const char *eof)
> > {
> > struct strbuf packed_entry = STRBUF_INIT;
> > + struct fsck_packed_ref_entry **entries;
> > struct strbuf refname = STRBUF_INIT;
> > unsigned long line_number = 1;
> > + unsigned int sorted = 0;
> > + size_t entry_alloc = 20;
> > + size_t entry_nr = 0;
> > const char *eol;
> > int ret = 0;
> >
> > strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
> > ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
> > if (*start == '#') {
> > - ret |= packed_fsck_ref_header(o, start, eol);
> > + ret |= packed_fsck_ref_header(o, start, eol, &sorted);
> >
> > start = eol + 1;
> > line_number++;
> > }
> >
> > + ALLOC_ARRAY(entries, entry_alloc);
> > while (start < eof) {
> > + struct fsck_packed_ref_entry *entry
> > + = create_fsck_packed_ref_entry(line_number, start);
>
> Instead of slurping in all entries and allocating them in an array, can
> we instead remember the last one and just compare that the last record
> is smaller than the current record?
>
Sorry, I missed this comment. The approach you describe is indeed the
most efficient way to check whether the "packed-refs" file is sorted.
However, there is a concern. When we check each ref entry, we could
compare its refname with the previous refname. But I don't want to do
this, because I don't want to mix up these two checks. To conclude, we
have the following two independent steps:
1. Check ref entry consistency (oid, refname, format...).
2. Check whether the "packed-refs" file is sorted.
But I do agree with your concern. The reason I record the entries is
that we have already parsed the file, so I thought there was no need to
parse it again. So I record the information needed for the check, and
this definitely introduces a memory burden.
So we have two choices:
1. Keep the design unchanged (space overhead).
2. Parse the file again (time overhead). This way we only need two
allocations.
Having written this out, I think option 2 is better. With many entries,
option 1 would allocate too much memory.
Let me improve this.
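As a sketch of what a separate ordering pass can look like when it only
remembers the previous refname, here is a shell/awk approximation. The
helper name `check_sorted` and the message format are assumptions for
illustration, and this is not the C implementation:

```shell
# Hypothetical helper: read packed-refs content on stdin and fail on the
# first refname that is not strictly greater than the one before it.
# Only the previous refname is kept, never the whole list.
check_sorted () {
    LC_ALL=C awk '
        # Skip the header, peeled ("^oid") lines and malformed lines.
        /^#/ || /^\^/ || NF < 2 { next }
        {
            refname = $2
            # Appending "" forces a string (bytewise under LC_ALL=C)
            # comparison.
            if (seen && (refname "") <= (prev "")) {
                printf "refname %s is not less than next refname %s\n", prev, refname
                exit 1
            }
            prev = refname
            seen = 1
        }
    '
}
```

Running `check_sorted <.git/packed-refs` on a file that declares the
"sorted" trait gives a quick manual cross-check of the same property.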
Thanks,
Jialuo
* Re: [PATCH v3 4/8] packed-backend: add "packed-refs" header consistency check
2025-02-12 9:56 ` Patrick Steinhardt
2025-02-12 10:12 ` shejialuo
@ 2025-02-12 17:48 ` Junio C Hamano
2025-02-14 3:53 ` shejialuo
1 sibling, 1 reply; 168+ messages in thread
From: Junio C Hamano @ 2025-02-12 17:48 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: shejialuo, git, Karthik Nayak, Michael Haggerty
Patrick Steinhardt <ps@pks.im> writes:
> On Thu, Feb 06, 2025 at 01:59:04PM +0800, shejialuo wrote:
>> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
>> index 6401cecd5f..683cfe78dc 100644
>> --- a/refs/packed-backend.c
>> +++ b/refs/packed-backend.c
>> @@ -1749,12 +1749,76 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
>> +static int packed_fsck_ref_header(struct fsck_options *o,
>> + const char *start, const char *eol)
>> +{
>> + if (!starts_with(start, "# pack-refs with:")) {
>> + struct fsck_ref_report report = { 0 };
>> + report.path = "packed-refs.header";
>> +
>> + return fsck_report_ref(o, &report,
>> + FSCK_MSG_BAD_PACKED_REF_HEADER,
>> + "'%.*s' does not start with '# pack-refs with:'",
>> + (int)(eol - start), start);
>> + }
>> +
>> + return 0;
>> +}
>
> Okay. We still complain about bad headers, but only if there is a line
> starting with "#" and only if the prefix doesn't match. This addresses
> Junio's comment that packfiles don't have to have a header, and that
> they may contain capabilities that we don't understand.
We'd want to also ensure that there is a single trailing whitespace
after that colon, which we have always written after "with:", no?
>> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
>> index 42c8d4ca1e..da321f16c6 100755
>> --- a/t/t0602-reffiles-fsck.sh
>> +++ b/t/t0602-reffiles-fsck.sh
>> @@ -639,4 +639,29 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
>> )
>> '
>>
>> +test_expect_success 'packed-refs header should be checked' '
>> + test_when_finished "rm -rf repo" &&
>> + git init repo &&
>> + (
>> + cd repo &&
>> + test_commit default &&
>> +
>> + git refs verify 2>err &&
>> + test_must_be_empty err &&
>> +
>> + for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
>> + "# pack-refs with traits: peeled fully-peeled sorted " \
>> + "# pack-refs with a: peeled fully-peeled"
>
> Instead of verifying thrice that we complain about bad header prefixes,
> should we maybe replace two of these with instances where we check a
> packed-refs file _without_ a header and one with capabilities that we
> don't understand?
Yup. I also notice that refs/packed-backend.c:create_snapshot()
would accept "# pack-refs with:peeled" if I am reading it
correctly, which is an unrelated bug.
Thanks.
* Re: [PATCH v3 4/8] packed-backend: add "packed-refs" header consistency check
2025-02-12 17:48 ` Junio C Hamano
@ 2025-02-14 3:53 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-14 3:53 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, git, Karthik Nayak, Michael Haggerty
On Wed, Feb 12, 2025 at 09:48:09AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > On Thu, Feb 06, 2025 at 01:59:04PM +0800, shejialuo wrote:
> >> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> >> index 6401cecd5f..683cfe78dc 100644
> >> --- a/refs/packed-backend.c
> >> +++ b/refs/packed-backend.c
> >> @@ -1749,12 +1749,76 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> >> +static int packed_fsck_ref_header(struct fsck_options *o,
> >> + const char *start, const char *eol)
> >> +{
> >> + if (!starts_with(start, "# pack-refs with:")) {
> >> + struct fsck_ref_report report = { 0 };
> >> + report.path = "packed-refs.header";
> >> +
> >> + return fsck_report_ref(o, &report,
> >> + FSCK_MSG_BAD_PACKED_REF_HEADER,
> >> + "'%.*s' does not start with '# pack-refs with:'",
> >> + (int)(eol - start), start);
> >> + }
> >> +
> >> + return 0;
> >> +}
> >
> > Okay. We still complain about bad headers, but only if there is a line
> > starting with "#" and only if the prefix doesn't match. This addresses
> > Junio's comment that packfiles don't have to have a header, and that
> > they may contain capabilities that we don't understand.
>
> We'd want to also ensure that there is a single trailing whitespace
> after that colon, which we have always written after "with:", no?
>
As you comment below, I did not add this check because
"create_snapshot" does _not_ check this either.
> >> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> >> index 42c8d4ca1e..da321f16c6 100755
> >> --- a/t/t0602-reffiles-fsck.sh
> >> +++ b/t/t0602-reffiles-fsck.sh
> >> @@ -639,4 +639,29 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
> >> )
> >> '
> >>
> >> +test_expect_success 'packed-refs header should be checked' '
> >> + test_when_finished "rm -rf repo" &&
> >> + git init repo &&
> >> + (
> >> + cd repo &&
> >> + test_commit default &&
> >> +
> >> + git refs verify 2>err &&
> >> + test_must_be_empty err &&
> >> +
> >> + for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
> >> + "# pack-refs with traits: peeled fully-peeled sorted " \
> >> + "# pack-refs with a: peeled fully-peeled"
> >
> > Instead of verifying thrice that we complain about bad header prefixes,
> > should we maybe replace two of these with instances where we check a
> > packed-refs file _without_ a header and one with capabilities that we
> > don't understand?
>
> Yup. I also notice that refs/packed-backend.c:create_snapshot()
> would accept "# pack-refs with:peeled" if I am reading it
> correctly, which is an unrelated bug.
>
Yes, you are correct. Let me fix this in the next version.
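To make the three cases concrete (no header, a header with the expected
prefix including the trailing space, and any other "#" line), here is a
small shell sketch. The helper name `check_header` and its output words
are made up for illustration:

```shell
# Hypothetical helper: classify the first line of a packed-refs file.
check_header () {
    case "$1" in
    "# pack-refs with: "*) echo ok ;;          # prefix plus one space
    "#"*)                  echo bad-header ;;  # a header, but malformed
    *)                     echo no-header ;;   # older files have none
    esac
}
```

Under these assumptions "# pack-refs with:peeled" is classified as
bad-header, while unknown traits after a valid prefix are still ok.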
Thanks,
Jialuo
* [PATCH v4 0/8] add more ref consistency checks
2025-02-06 5:56 ` [PATCH v3 0/8] add more ref consistency checks shejialuo
` (7 preceding siblings ...)
2025-02-06 6:00 ` [PATCH v3 8/8] builtin/fsck: add `git refs verify` child process shejialuo
@ 2025-02-14 4:50 ` shejialuo
2025-02-14 4:51 ` [PATCH v4 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
` (9 more replies)
8 siblings, 10 replies; 168+ messages in thread
From: shejialuo @ 2025-02-14 4:50 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Hi All:
This version makes the following improvements:
1. [PATCH v4 4/8]: update the tests to verify that we don't report any
errors to the user in some cases. Also, as suggested by Junio, make sure
that we check whether there is a trailing space after "# pack-refs
with:".
2. [PATCH v4 6/8]: instead of greedily computing the name of the line,
compute it lazily, only when there is an error. And use HERE docs to
improve the test script.
3. [PATCH v4 7/8]: instead of storing the state of every entry, parse the
file again to check whether it is sorted, which avoids allocating too
much memory. And use HERE docs to improve the test script.
4. [PATCH v4 8/8]: update the documentation to emphasize the default. And
add tests to exercise the code.
shejialuo (8):
t0602: use subshell to ensure working directory unchanged
builtin/refs: get worktrees without reading head information
packed-backend: check whether the "packed-refs" is regular file
packed-backend: add "packed-refs" header consistency check
packed-backend: check whether the refname contains NUL characters
packed-backend: add "packed-refs" entry consistency check
packed-backend: check whether the "packed-refs" is sorted
builtin/fsck: add `git refs verify` child process
Documentation/fsck-msgids.txt | 14 +
Documentation/git-fsck.txt | 7 +-
builtin/fsck.c | 33 +-
builtin/refs.c | 2 +-
fsck.h | 4 +
refs/packed-backend.c | 349 +++++++++-
t/t0602-reffiles-fsck.sh | 1205 ++++++++++++++++++++-------------
worktree.c | 5 +
worktree.h | 6 +
9 files changed, 1140 insertions(+), 485 deletions(-)
Range-diff against v3:
1: 20889b7b18 = 1: 20889b7b18 t0602: use subshell to ensure working directory unchanged
2: 9d7780e953 = 2: 9d7780e953 builtin/refs: get worktrees without reading head information
3: 44d26f6440 = 3: 44d26f6440 packed-backend: check whether the "packed-refs" is regular file
4: a9ab7af16a ! 4: 976c5baba0 packed-backend: add "packed-refs" header consistency check
@@ Commit message
In "packed-backend.c::create_snapshot", if there is a header (the line
which starts with '#'), we will check whether the line starts with "#
- pack-refs with:". As we are going to implement the header consistency
- check, we should port this check into "packed_fsck".
+ pack-refs with:". Before we port this check into "packed_fsck", let's
+ fix "create_snapshot" to check the prefix "# packed-ref with: " instead
+ of "# packed-ref with:" due to that we will always write a single
+ trailing space after the colon.
However, we need to consider other situations and discuss whether we
need to add checks.
@@ Commit message
user. This is because in older Git version, we never write header in
the "packed-refs" file. Also, we do allow no header in "packed-refs"
in runtime.
- 2. If the header content does not start with "# packed-ref with:", we
+ 2. If the header content does not start with "# packed-ref with: ", we
should report an error just like what "create_snapshot" does. So,
create a new fsck message "badPackedRefHeader(ERROR)" for this.
3. If the header content is not the same as the constant string
@@ fsck.h: enum fsck_msg_type {
FUNC(ZERO_PADDED_DATE, ERROR) \
## refs/packed-backend.c ##
+@@ refs/packed-backend.c: static struct snapshot *create_snapshot(struct packed_ref_store *refs)
+
+ tmp = xmemdupz(snapshot->buf, eol - snapshot->buf);
+
+- if (!skip_prefix(tmp, "# pack-refs with:", (const char **)&p))
++ if (!skip_prefix(tmp, "# pack-refs with: ", (const char **)&p))
+ die_invalid_line(refs->path,
+ snapshot->buf,
+ snapshot->eof - snapshot->buf);
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
+static int packed_fsck_ref_next_line(struct fsck_options *o,
-+ struct strbuf *packed_entry, const char *start,
++ unsigned long line_number, const char *start,
+ const char *eof, const char **eol)
+{
+ int ret = 0;
+
+ *eol = memchr(start, '\n', eof - start);
+ if (!*eol) {
++ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+
-+ report.path = packed_entry->buf;
++ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
++ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
+ "'%.*s' is not terminated with a newline",
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
+ * the buffer.
+ */
+ *eol = eof;
++ strbuf_release(&packed_entry);
+ }
+
+ return ret;
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
+static int packed_fsck_ref_header(struct fsck_options *o,
+ const char *start, const char *eol)
+{
-+ if (!starts_with(start, "# pack-refs with:")) {
++ if (!starts_with(start, "# pack-refs with: ")) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs.header";
+
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_HEADER,
-+ "'%.*s' does not start with '# pack-refs with:'",
++ "'%.*s' does not start with '# pack-refs with: '",
+ (int)(eol - start), start);
+ }
+
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
+static int packed_fsck_ref_content(struct fsck_options *o,
+ const char *start, const char *eof)
+{
-+ struct strbuf packed_entry = STRBUF_INIT;
+ unsigned long line_number = 1;
+ const char *eol;
+ int ret = 0;
+
-+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
-+ ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
++ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ if (*start == '#') {
+ ret |= packed_fsck_ref_header(o, start, eol);
+
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
+ line_number++;
+ }
+
-+ strbuf_release(&packed_entry);
+ return ret;
+}
+
@@ t/t0602-reffiles-fsck.sh: test_expect_success SYMLINKS 'the filetype of packed-r
+
+ for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
+ "# pack-refs with traits: peeled fully-peeled sorted " \
-+ "# pack-refs with a: peeled fully-peeled"
++ "# pack-refs with a: peeled fully-peeled" \
++ "# pack-refs with:peeled fully-peeled sorted"
+ do
+ printf "%s\n" "$bad_header" >.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
-+ error: packed-refs.header: badPackedRefHeader: '\''$bad_header'\'' does not start with '\''# pack-refs with:'\''
++ error: packed-refs.header: badPackedRefHeader: '\''$bad_header'\'' does not start with '\''# pack-refs with: '\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err || return 1
+ done
+ )
+'
++
++test_expect_success 'packed-refs missing header should not be reported' '
++ test_when_finished "rm -rf repo" &&
++ git init repo &&
++ (
++ cd repo &&
++ test_commit default &&
++
++ printf "$(git rev-parse HEAD) refs/heads/main\n" >.git/packed-refs &&
++ git refs verify 2>err &&
++ test_must_be_empty err
++ )
++'
++
++test_expect_success 'packed-refs unknown traits should not be reported' '
++ test_when_finished "rm -rf repo" &&
++ git init repo &&
++ (
++ cd repo &&
++ test_commit default &&
++
++ printf "# pack-refs with: peeled fully-peeled sorted foo\n" >.git/packed-refs &&
++ git refs verify 2>err &&
++ test_must_be_empty err
++ )
++'
+
test_done
5: 9b075434a1 = 5: b66f142d7f packed-backend: check whether the refname contains NUL characters
6: a976508319 ! 6: f68028e171 packed-backend: add "packed-refs" entry consistency check
@@ refs/packed-backend.c: static int packed_fsck_ref_header(struct fsck_options *o,
+static int packed_fsck_ref_peeled_line(struct fsck_options *o,
+ struct ref_store *ref_store,
-+ struct strbuf *packed_entry,
++ unsigned long line_number,
+ const char *start, const char *eol)
+{
++ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct object_id peeled;
+ const char *p;
-+
-+ report.path = packed_entry->buf;
++ int ret = 0;
+
+ /*
+ * Skip the '^' and parse the peeled oid.
+ */
+ start++;
-+ if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo))
-+ return fsck_report_ref(o, &report,
-+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
-+ "'%.*s' has invalid peeled oid",
-+ (int)(eol - start), start);
-+
-+ if (p != eol)
-+ return fsck_report_ref(o, &report,
-+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
-+ "has trailing garbage after peeled oid '%.*s'",
-+ (int)(eol - p), p);
-+
-+ return 0;
++ if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo)) {
++ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
++ report.path = packed_entry.buf;
++
++ ret = fsck_report_ref(o, &report,
++ FSCK_MSG_BAD_PACKED_REF_ENTRY,
++ "'%.*s' has invalid peeled oid",
++ (int)(eol - start), start);
++ goto cleanup;
++ }
++
++ if (p != eol) {
++ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
++ report.path = packed_entry.buf;
++
++ ret = fsck_report_ref(o, &report,
++ FSCK_MSG_BAD_PACKED_REF_ENTRY,
++ "has trailing garbage after peeled oid '%.*s'",
++ (int)(eol - p), p);
++ goto cleanup;
++ }
++cleanup:
++ strbuf_release(&packed_entry);
++ return ret;
+}
+
+static int packed_fsck_ref_main_line(struct fsck_options *o,
+ struct ref_store *ref_store,
-+ struct strbuf *packed_entry,
++ unsigned long line_number,
+ struct strbuf *refname,
+ const char *start, const char *eol)
+{
++ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct object_id oid;
+ const char *p;
++ int ret = 0;
+
-+ report.path = packed_entry->buf;
++ if (parse_oid_hex_algop(start, &oid, &p, ref_store->repo->hash_algo)) {
++ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
++ report.path = packed_entry.buf;
+
-+ if (parse_oid_hex_algop(start, &oid, &p, ref_store->repo->hash_algo))
-+ return fsck_report_ref(o, &report,
-+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
-+ "'%.*s' has invalid oid",
-+ (int)(eol - start), start);
++ ret = fsck_report_ref(o, &report,
++ FSCK_MSG_BAD_PACKED_REF_ENTRY,
++ "'%.*s' has invalid oid",
++ (int)(eol - start), start);
++ goto cleanup;
++ }
+
-+ if (p == eol || !isspace(*p))
-+ return fsck_report_ref(o, &report,
-+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
-+ "has no space after oid '%s' but with '%.*s'",
-+ oid_to_hex(&oid), (int)(eol - p), p);
++ if (p == eol || !isspace(*p)) {
++ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
++ report.path = packed_entry.buf;
++
++ ret = fsck_report_ref(o, &report,
++ FSCK_MSG_BAD_PACKED_REF_ENTRY,
++ "has no space after oid '%s' but with '%.*s'",
++ oid_to_hex(&oid), (int)(eol - p), p);
++ goto cleanup;
++ }
+
+ p++;
+ strbuf_reset(refname);
+ strbuf_add(refname, p, eol - p);
-+ if (refname_contains_nul(refname))
-+ return fsck_report_ref(o, &report,
-+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
-+ "refname '%s' contains NULL binaries",
-+ refname->buf);
-+
-+ if (check_refname_format(refname->buf, 0))
-+ return fsck_report_ref(o, &report,
-+ FSCK_MSG_BAD_REF_NAME,
-+ "has bad refname '%s'", refname->buf);
-+
-+ return 0;
++ if (refname_contains_nul(refname)) {
++ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
++ report.path = packed_entry.buf;
++
++ ret = fsck_report_ref(o, &report,
++ FSCK_MSG_BAD_PACKED_REF_ENTRY,
++ "refname '%s' contains NULL binaries",
++ refname->buf);
++ }
++
++ if (check_refname_format(refname->buf, 0)) {
++ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
++ report.path = packed_entry.buf;
++
++ ret = fsck_report_ref(o, &report,
++ FSCK_MSG_BAD_REF_NAME,
++ "has bad refname '%s'", refname->buf);
++ }
++
++cleanup:
++ strbuf_release(&packed_entry);
++ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
+ struct ref_store *ref_store,
const char *start, const char *eof)
{
- struct strbuf packed_entry = STRBUF_INIT;
+ struct strbuf refname = STRBUF_INIT;
unsigned long line_number = 1;
const char *eol;
@@ refs/packed-backend.c: static int packed_fsck_ref_content(struct fsck_options *o
}
+ while (start < eof) {
-+ strbuf_reset(&packed_entry);
-+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
-+ ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
-+ ret |= packed_fsck_ref_main_line(o, ref_store, &packed_entry, &refname, start, eol);
++ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
++ ret |= packed_fsck_ref_main_line(o, ref_store, line_number, &refname, start, eol);
+ start = eol + 1;
+ line_number++;
+ if (start < eof && *start == '^') {
-+ strbuf_reset(&packed_entry);
-+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
-+ ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
-+ ret |= packed_fsck_ref_peeled_line(o, ref_store, &packed_entry,
++ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
++ ret |= packed_fsck_ref_peeled_line(o, ref_store, line_number,
+ start, eol);
+ start = eol + 1;
+ line_number++;
+ }
+ }
+
-+ strbuf_release(&packed_entry);
+ strbuf_release(&refname);
- strbuf_release(&packed_entry);
return ret;
}
+
@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
cleanup:
## t/t0602-reffiles-fsck.sh ##
-@@ t/t0602-reffiles-fsck.sh: test_expect_success 'packed-refs header should be checked' '
+@@ t/t0602-reffiles-fsck.sh: test_expect_success 'packed-refs unknown traits should not be reported' '
)
'
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'packed-refs header should be chec
+ tag_2_peeled_oid=$(git rev-parse annotated-tag-2^{}) &&
+ short_oid=$(printf "%s" $tag_1_peeled_oid | cut -c 1-4) &&
+
-+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
-+ printf "%s\n" "$short_oid refs/heads/branch-1" >>.git/packed-refs &&
-+ printf "%sx\n" "$branch_1_oid" >>.git/packed-refs &&
-+ printf "%s refs/heads/bad-branch\n" "$branch_2_oid" >>.git/packed-refs &&
-+ printf "%s refs/heads/branch.\n" "$branch_2_oid" >>.git/packed-refs &&
-+ printf "%s refs/tags/annotated-tag-3\n" "$tag_1_oid" >>.git/packed-refs &&
-+ printf "^%s\n" "$short_oid" >>.git/packed-refs &&
-+ printf "%s refs/tags/annotated-tag-4.\n" "$tag_2_oid" >>.git/packed-refs &&
-+ printf "^%s garbage\n" "$tag_2_peeled_oid" >>.git/packed-refs &&
++ cat >.git/packed-refs <<-EOF &&
++ # pack-refs with: peeled fully-peeled sorted
++ $short_oid refs/heads/branch-1
++ ${branch_1_oid}x
++ $branch_2_oid refs/heads/bad-branch
++ $branch_2_oid refs/heads/branch.
++ $tag_1_oid refs/tags/annotated-tag-3
++ ^$short_oid
++ $tag_2_oid refs/tags/annotated-tag-4.
++ ^$tag_2_peeled_oid garbage
++ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: badPackedRefEntry: '\''$short_oid refs/heads/branch-1'\'' has invalid oid
7: 707e3e2151 ! 7: 4a7adf293f packed-backend: check whether the "packed-refs" is sorted
@@ Commit message
sorted in this case.
Update the "packed_fsck_ref_header" to know whether there is a "sorted"
- trail in the header. Then, create a new structure "fsck_packed_ref_entry"
- to store the state during the parsing process for every entry. It may
- seem that we could just add a new "struct strbuf refname" into the
- "struct fsck_packed_ref_entry" and during the parsing process, we could
- store the refname into this structure and thus we could compare later.
- However, this is not a good design due to the following reasons:
+ trait in the header. It may seem that we could record all refnames
+ during the parsing process and then compare them later. However, this
+ is not a good design for the following reasons:
1. Because we need to store the state across the whole checking
lifetime, we would consume a lot of memory if there are many entries
in the "packed-refs" file.
- 2. The most important thing is that we cannot reuse the existing compare
- functions which cause repetition.
+ 2. We cannot reuse the existing compare function "cmp_packed_ref_records",
+ which would cause repetition.
- So, instead of storing the "struct strbuf", let's use the existing
- structure "struct snaphost_record". And thus we could use the existing
- function "cmp_packed_ref_records".
+ Because "cmp_packed_ref_records" needs an extra parameter "struct
+ snapshot", extract the common part into a new function
+ "cmp_packed_refname" and reuse that function for the comparison.
- However, this function need an extra parameter for "struct snaphost".
- Extract the common part into a new function "cmp_packed_ref_records" to
- reuse this function to compare.
-
- Then, create a new function "packed_fsck_ref_sorted" to use the new fsck
- message "packedRefUnsorted(ERROR)" to report to the user.
+ Then, create a new function "packed_fsck_ref_sorted" to parse the file
+ again and use the new fsck message "packedRefUnsorted(ERROR)" to report
+ to the user if the file is not sorted.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
@@ refs/packed-backend.c: static int cmp_packed_ref_records(const void *v1, const v
/*
* Compare a snapshot record at `rec` to the specified NUL-terminated
* refname.
-@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
- return empty_ref_iterator_begin();
- }
-
-+struct fsck_packed_ref_entry {
-+ unsigned long line_number;
-+
-+ struct snapshot_record record;
-+};
-+
-+static struct fsck_packed_ref_entry *create_fsck_packed_ref_entry(unsigned long line_number,
-+ const char *start)
-+{
-+ struct fsck_packed_ref_entry *entry = xcalloc(1, sizeof(*entry));
-+ entry->line_number = line_number;
-+ entry->record.start = start;
-+ return entry;
-+}
-+
-+static void free_fsck_packed_ref_entries(struct fsck_packed_ref_entry **entries, size_t nr)
-+{
-+ for (size_t i = 0; i < nr; i++)
-+ free(entries[i]);
-+ free(entries);
-+}
-+
- static int packed_fsck_ref_next_line(struct fsck_options *o,
- struct strbuf *packed_entry, const char *start,
- const char *eof, const char **eol)
@@ refs/packed-backend.c: static int packed_fsck_ref_next_line(struct fsck_options *o,
}
@@ refs/packed-backend.c: static int packed_fsck_ref_next_line(struct fsck_options
+ const char *start, const char *eol,
+ unsigned int *sorted)
{
-- if (!starts_with(start, "# pack-refs with:")) {
+- if (!starts_with(start, "# pack-refs with: ")) {
+ struct string_list traits = STRING_LIST_INIT_NODUP;
+ char *tmp_line;
+ int ret = 0;
+ char *p;
+
+ tmp_line = xmemdupz(start, eol - start);
-+ if (!skip_prefix(tmp_line, "# pack-refs with:", (const char **)&p)) {
++ if (!skip_prefix(tmp_line, "# pack-refs with: ", (const char **)&p)) {
struct fsck_ref_report report = { 0 };
report.path = "packed-refs.header";
- return fsck_report_ref(o, &report,
- FSCK_MSG_BAD_PACKED_REF_HEADER,
-- "'%.*s' does not start with '# pack-refs with:'",
+- "'%.*s' does not start with '# pack-refs with: '",
- (int)(eol - start), start);
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_HEADER,
-+ "'%.*s' does not start with '# pack-refs with:'",
++ "'%.*s' does not start with '# pack-refs with: '",
+ (int)(eol - start), start);
+ goto cleanup;
}
@@ refs/packed-backend.c: static int packed_fsck_ref_next_line(struct fsck_options
static int packed_fsck_ref_peeled_line(struct fsck_options *o,
@@ refs/packed-backend.c: static int packed_fsck_ref_main_line(struct fsck_options *o,
- return 0;
+ return ret;
}
+static int packed_fsck_ref_sorted(struct fsck_options *o,
+ struct ref_store *ref_store,
-+ struct fsck_packed_ref_entry **entries,
-+ size_t nr)
++ const char *start, const char *eof)
+{
+ size_t hexsz = ref_store->repo->hash_algo->hexsz;
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct strbuf refname1 = STRBUF_INIT;
+ struct strbuf refname2 = STRBUF_INIT;
++ unsigned long line_number = 1;
++ const char *former = NULL;
++ const char *current;
++ const char *eol;
+ int ret = 0;
+
-+ for (size_t i = 1; i < nr; i++) {
-+ const char *r1 = entries[i - 1]->record.start + hexsz + 1;
-+ const char *r2 = entries[i]->record.start + hexsz + 1;
++ if (*start == '#') {
++ eol = memchr(start, '\n', eof - start);
++ start = eol + 1;
++ line_number++;
++ }
++
++ for (; start < eof; line_number++, start = eol + 1) {
++ eol = memchr(start, '\n', eof - start);
++
++ if (*start == '^')
++ continue;
++
++ if (!former) {
++ former = start + hexsz + 1;
++ continue;
++ }
+
-+ if (cmp_packed_refname(r1, r2) >= 0) {
++ current = start + hexsz + 1;
++ if (cmp_packed_refname(former, current) >= 0) {
+ const char *err_fmt =
-+ "refname '%s' is not less than next refname '%s'";
-+ const char *eol;
-+ eol = memchr(entries[i - 1]->record.start, '\n',
-+ entries[i - 1]->record.len);
-+ strbuf_add(&refname1, r1, eol - r1);
-+ eol = memchr(entries[i]->record.start, '\n',
-+ entries[i]->record.len);
-+ strbuf_add(&refname2, r2, eol - r2);
++ "refname '%s' is less than previous refname '%s'";
++
++ eol = memchr(former, '\n', eof - former);
++ strbuf_add(&refname1, former, eol - former);
++ eol = memchr(current, '\n', eof - current);
++ strbuf_add(&refname2, current, eol - current);
+
-+ strbuf_addf(&packed_entry, "packed-refs line %lu",
-+ entries[i - 1]->line_number);
++ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_UNSORTED,
-+ err_fmt, refname1.buf, refname2.buf);
++ err_fmt, refname2.buf, refname1.buf);
+ goto cleanup;
+ }
++ former = current;
+ }
+
+cleanup:
@@ refs/packed-backend.c: static int packed_fsck_ref_main_line(struct fsck_options
+
static int packed_fsck_ref_content(struct fsck_options *o,
struct ref_store *ref_store,
++ unsigned int *sorted,
const char *start, const char *eof)
{
- struct strbuf packed_entry = STRBUF_INIT;
-+ struct fsck_packed_ref_entry **entries;
struct strbuf refname = STRBUF_INIT;
- unsigned long line_number = 1;
-+ unsigned int sorted = 0;
-+ size_t entry_alloc = 20;
-+ size_t entry_nr = 0;
- const char *eol;
- int ret = 0;
+@@ refs/packed-backend.c: static int packed_fsck_ref_content(struct fsck_options *o,
- strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
- ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
if (*start == '#') {
- ret |= packed_fsck_ref_header(o, start, eol);
-+ ret |= packed_fsck_ref_header(o, start, eol, &sorted);
++ ret |= packed_fsck_ref_header(o, start, eol, sorted);
start = eol + 1;
line_number++;
- }
+@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
+ struct packed_ref_store *refs = packed_downcast(ref_store,
+ REF_STORE_READ, "fsck");
+ struct strbuf packed_ref_content = STRBUF_INIT;
++ unsigned int sorted = 0;
+ int ret = 0;
+ int fd;
-+ ALLOC_ARRAY(entries, entry_alloc);
- while (start < eof) {
-+ struct fsck_packed_ref_entry *entry
-+ = create_fsck_packed_ref_entry(line_number, start);
-+
-+ ALLOC_GROW(entries, entry_nr + 1, entry_alloc);
-+ entries[entry_nr++] = entry;
- strbuf_reset(&packed_entry);
- strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
- ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
-@@ refs/packed-backend.c: static int packed_fsck_ref_content(struct fsck_options *o,
- start = eol + 1;
- line_number++;
- }
-+ entry->record.len = start - entry->record.start;
+@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
+ goto cleanup;
}
+- ret = packed_fsck_ref_content(o, ref_store, packed_ref_content.buf,
++ ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
+ packed_ref_content.buf + packed_ref_content.len);
+ if (!ret && sorted)
-+ ret |= packed_fsck_ref_sorted(o, ref_store, entries, entry_nr);
-+
- strbuf_release(&packed_entry);
- strbuf_release(&refname);
- strbuf_release(&packed_entry);
-+ free_fsck_packed_ref_entries(entries, entry_nr);
- return ret;
- }
++ ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
++ packed_ref_content.buf + packed_ref_content.len);
+ cleanup:
+ strbuf_release(&packed_ref_content);
## t/t0602-reffiles-fsck.sh ##
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'packed-refs content should be checked' '
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'packed-refs content should be che
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
-+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
-+ printf "%s %s\n" "$branch_2_oid" "$refname1" >>.git/packed-refs &&
-+ printf "%s %s\n" "$branch_1_oid" "$refname2" >>.git/packed-refs &&
++
++ cat >.git/packed-refs <<-EOF &&
++ # pack-refs with: peeled fully-peeled sorted
++ EOF
++ git refs verify 2>err &&
++ rm .git/packed-refs &&
++ test_must_be_empty err &&
++
++ cat >.git/packed-refs <<-EOF &&
++ # pack-refs with: peeled fully-peeled sorted
++ $branch_2_oid $refname1
++ EOF
++ git refs verify 2>err &&
++ rm .git/packed-refs &&
++ test_must_be_empty err &&
++
++ cat >.git/packed-refs <<-EOF &&
++ # pack-refs with: peeled fully-peeled sorted
++ $branch_2_oid $refname1
++ $branch_1_oid $refname2
++ $tag_1_oid $refname3
++ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
-+ error: packed-refs line 2: packedRefUnsorted: refname '\''$refname1'\'' is not less than next refname '\''$refname2'\''
++ error: packed-refs line 3: packedRefUnsorted: refname '\''$refname2'\'' is less than previous refname '\''$refname1'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err &&
+
-+ printf "# pack-refs with: peeled fully-peeled sorted \n" >.git/packed-refs &&
-+ printf "%s %s\n" "$tag_1_oid" "$refname3" >>.git/packed-refs &&
-+ printf "^%s\n" "$tag_1_peeled_oid" >>.git/packed-refs &&
-+ printf "%s %s\n" "$branch_2_oid" "$refname2" >>.git/packed-refs &&
++ cat >.git/packed-refs <<-EOF &&
++ # pack-refs with: peeled fully-peeled sorted
++ $tag_1_oid $refname3
++ ^$tag_1_peeled_oid
++ $branch_2_oid $refname2
++ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
-+ error: packed-refs line 2: packedRefUnsorted: refname '\''$refname3'\'' is not less than next refname '\''$refname2'\''
++ error: packed-refs line 4: packedRefUnsorted: refname '\''$refname2'\'' is less than previous refname '\''$refname3'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'packed-refs content should be che
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
-+ printf "# pack-refs with: peeled fully-peeled \n" >.git/packed-refs &&
-+ printf "%s %s\n" "$branch_2_oid" "$refname1" >>.git/packed-refs &&
-+ printf "%s %s\n" "$branch_1_oid" "$refname2" >>.git/packed-refs &&
++
++ cat >.git/packed-refs <<-EOF &&
++ # pack-refs with: peeled fully-peeled
++ $branch_2_oid $refname1
++ $branch_1_oid $refname2
++ EOF
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
8: 4f2170aa7c ! 8: 2dd3437478 builtin/fsck: add `git refs verify` child process
@@ Commit message
"git-fsck(1)" which would implicitly check the consistency of refs to
die the program.
+ Last, update the test to exercise the code.
+
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
@@ Documentation/git-fsck.txt: care about this output and want to speed it up furth
+--[no-]references::
+ Control whether to check the references database consistency
+ via 'git refs verify'. See linkgit:git-refs[1] for details.
++ The default is to check the references database.
+
CONFIGURATION
-------------
@@ builtin/fsck.c: int cmd_fsck(int argc,
if (connectivity_only) {
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
for_each_packed_object(the_repository,
+
+ ## t/t0602-reffiles-fsck.sh ##
+@@ t/t0602-reffiles-fsck.sh: test_expect_success 'packed-ref without sorted trait should not be checked' '
+ )
+ '
+
++test_expect_success '--[no-]references option should apply to fsck' '
++ test_when_finished "rm -rf repo" &&
++ git init repo &&
++ branch_dir_prefix=.git/refs/heads &&
++ (
++ cd repo &&
++ test_commit default &&
++ for trailing_content in " garbage" " more garbage"
++ do
++ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
++ git fsck 2>err &&
++ cat >expect <<-EOF &&
++ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
++ EOF
++ rm $branch_dir_prefix/branch-garbage &&
++ test_cmp expect err || return 1
++ done &&
++
++ for trailing_content in " garbage" " more garbage"
++ do
++ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
++ git fsck --references 2>err &&
++ cat >expect <<-EOF &&
++ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
++ EOF
++ rm $branch_dir_prefix/branch-garbage &&
++ test_cmp expect err || return 1
++ done &&
++
++ for trailing_content in " garbage" " more garbage"
++ do
++ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
++ git fsck --no-references 2>err &&
++ rm $branch_dir_prefix/branch-garbage &&
++ test_must_be_empty err || return 1
++ done
++ )
++'
++
+ test_done
--
2.48.1
* [PATCH v4 1/8] t0602: use subshell to ensure working directory unchanged
2025-02-14 4:50 ` [PATCH v4 0/8] add more ref consistency checks shejialuo
@ 2025-02-14 4:51 ` shejialuo
2025-02-14 4:52 ` [PATCH v4 2/8] builtin/refs: get worktrees without reading head information shejialuo
` (8 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-14 4:51 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In every test, we execute "cd repo" at the beginning, but we never
execute "cd .." to restore the working directory. Adding a trailing
"cd .." would not be a good idea either: if any command between
"cd repo" and "cd .." fails, the "cd .." is never reached and the
working directory is not restored.
Let's use a subshell to ensure that the current working directory is
always restored to the correct path.
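To illustrate the difference outside the test suite, here is a minimal
standalone sketch. It is not part of the patch: the temporary directory
"demo_dir" and the use of "false" as a stand-in for a failing test step
are assumptions made only for this demonstration.

```shell
#!/bin/sh
# Illustrative only: "demo_dir" and the "repo" directory are made up here.
demo_dir=$(mktemp -d) || exit 1
mkdir "$demo_dir/repo" || exit 1
cd "$demo_dir" || exit 1
before=$(pwd)

# Without a subshell: "false" stands in for a failing test step, so the
# trailing "cd .." is never executed and we stay inside "repo".
cd repo && false && cd ..
after_plain=$(pwd)
cd "$before" || exit 1

# With a subshell: "cd" only changes the directory of the child shell,
# so the parent is unaffected whether the steps succeed or fail.
(
	cd repo &&
	false
) || echo "steps failed inside the subshell"
after_subshell=$(pwd)

test "$after_plain" = "$before/repo" && echo "plain: still in repo"
test "$after_subshell" = "$before" && echo "subshell: directory restored"

cd / && rm -rf "$demo_dir"
```

In the plain variant the shell is left inside "repo" after the failure,
while in the subshell variant the parent shell never leaves its original
directory, which is the property the converted tests rely on.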
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
t/t0602-reffiles-fsck.sh | 967 ++++++++++++++++++++-------------------
1 file changed, 494 insertions(+), 473 deletions(-)
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index d4a08b823b..cf7a202d0d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -14,222 +14,229 @@ test_expect_success 'ref name should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
-
- git commit --allow-empty -m initial &&
- git checkout -b default-branch &&
- git tag default-tag &&
- git tag multi_hierarchy/default-tag &&
-
- cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
- git refs verify 2>err &&
- test_must_be_empty err &&
- rm $branch_dir_prefix/@ &&
-
- cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
- git refs verify 2>err &&
- rm $tag_dir_prefix/tag-1.lock &&
- test_must_be_empty err &&
-
- cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/.lock: badRefName: invalid refname format
- EOF
- rm $tag_dir_prefix/.lock &&
- test_cmp expect err &&
-
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/$refname: badRefName: invalid refname format
- EOF
- rm "$branch_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done &&
+ (
+ cd repo &&
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/$refname: badRefName: invalid refname format
- EOF
- rm "$tag_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done &&
+ git commit --allow-empty -m initial &&
+ git checkout -b default-branch &&
+ git tag default-tag &&
+ git tag multi_hierarchy/default-tag &&
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
- EOF
- rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
- test_cmp expect err || return 1
- done &&
-
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- mkdir "$branch_dir_prefix/$refname" &&
- cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+ rm $branch_dir_prefix/@ &&
+
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
+ git refs verify 2>err &&
+ rm $tag_dir_prefix/tag-1.lock &&
+ test_must_be_empty err &&
+
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ error: refs/tags/.lock: badRefName: invalid refname format
EOF
- rm -r "$branch_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done
+ rm $tag_dir_prefix/.lock &&
+ test_cmp expect err &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname: badRefName: invalid refname format
+ EOF
+ rm "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ mkdir "$branch_dir_prefix/$refname" &&
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ EOF
+ rm -r "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success 'ref name check should be adapted into fsck messages' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- git commit --allow-empty -m initial &&
- git checkout -b branch-1 &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- git -c fsck.badRefName=warn refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/.branch-1: badRefName: invalid refname format
- EOF
- rm $branch_dir_prefix/.branch-1 &&
- test_cmp expect err &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- git -c fsck.badRefName=ignore refs verify 2>err &&
- test_must_be_empty err
+ (
+ cd repo &&
+ git commit --allow-empty -m initial &&
+ git checkout -b branch-1 &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=warn refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/.branch-1: badRefName: invalid refname format
+ EOF
+ rm $branch_dir_prefix/.branch-1 &&
+ test_cmp expect err &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=ignore refs verify 2>err &&
+ test_must_be_empty err
+ )
'
test_expect_success 'ref name check should work for multiple worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
-
- cd repo &&
- test_commit initial &&
- git checkout -b branch-1 &&
- test_commit second &&
- git checkout -b branch-2 &&
- test_commit third &&
- git checkout -b branch-3 &&
- git worktree add ./worktree-1 branch-1 &&
- git worktree add ./worktree-2 branch-2 &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
- (
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-3
- ) &&
(
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-3
- ) &&
-
- cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
- cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
- error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err &&
-
- for worktree in "worktree-1" "worktree-2"
- do
+ cd repo &&
+ test_commit initial &&
+ git checkout -b branch-1 &&
+ test_commit second &&
+ git checkout -b branch-2 &&
+ test_commit third &&
+ git checkout -b branch-3 &&
+ git worktree add ./worktree-1 branch-1 &&
+ git worktree add ./worktree-2 branch-2 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
(
- cd $worktree &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
- error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err || return 1
- )
- done
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+
+ cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
+ cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err &&
+
+ for worktree in "worktree-1" "worktree-2"
+ do
+ (
+ cd $worktree &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err || return 1
+ )
+ done
+ )
'
test_expect_success 'regular ref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
- git refs verify 2>err &&
- test_must_be_empty err &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
- for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/branch-bad &&
- test_cmp expect err || return 1
- done &&
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
- for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/a/b/branch-bad &&
- test_cmp expect err || return 1
- done &&
-
- printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $branch_dir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- for trailing_content in " garbage" " more garbage"
- do
- printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
EOF
- rm $branch_dir_prefix/branch-garbage &&
- test_cmp expect err || return 1
- done &&
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+ printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
- '\''
- EOF
- rm $branch_dir_prefix/branch-garbage-special &&
- test_cmp expect err &&
- printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+ '\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err &&
+
+ printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
- garbage'\''
- EOF
- rm $branch_dir_prefix/branch-garbage-special &&
- test_cmp expect err
+ garbage'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err
+ )
'
test_expect_success 'regular ref content should be checked (aggregate)' '
@@ -237,99 +244,103 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- bad_content_1=$(git rev-parse main)x &&
- bad_content_2=xfsazqfxcadas &&
- bad_content_3=Xfsazqfxcadas &&
- printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
- printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
- printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
- printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
- printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
- error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
- error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
- warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ bad_content_1=$(git rev-parse main)x &&
+ bad_content_2=xfsazqfxcadas &&
+ bad_content_3=Xfsazqfxcadas &&
+ printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
+ printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
+ printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
+ error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
+ error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
'
test_expect_success 'textual symref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
+ do
+ printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
- for good_referent in "refs/heads/branch" "HEAD"
- do
- printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
- rm $branch_dir_prefix/branch-good &&
- test_must_be_empty err || return 1
- done &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
- do
- printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
- test_must_fail git refs verify 2>err &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-1 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
EOF
- rm $branch_dir_prefix/branch-bad &&
- test_cmp expect err || return 1
- done &&
-
- printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $branch_dir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-1 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-2 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-3 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-complicated &&
- test_cmp expect err
+ rm $branch_dir_prefix/a/b/branch-trailing-2 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-3 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-complicated &&
+ test_cmp expect err
+ )
'
test_expect_success 'textual symref content should be checked (aggregate)' '
@@ -337,32 +348,34 @@ test_expect_success 'textual symref content should be checked (aggregate)' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
- printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
- printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
- printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
- printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
- printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
- printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
- printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
- warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
'
test_expect_success 'the target of the textual symref should be checked' '
@@ -370,28 +383,30 @@ test_expect_success 'the target of the textual symref should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
- do
- printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
- git refs verify 2>err &&
- rm $branch_dir_prefix/branch-good &&
- test_must_be_empty err || return 1
- done &&
-
- for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
- do
- printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
- EOF
- rm $branch_dir_prefix/branch-bad-1 &&
- test_cmp expect err || return 1
- done
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
+ do
+ printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success SYMLINKS 'symlink symref content should be checked' '
@@ -399,201 +414,207 @@ test_expect_success SYMLINKS 'symlink symref content should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $branch_dir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
- EOF
- rm $branch_dir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
- error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
- EOF
- rm $branch_dir_prefix/branch-symbolic-bad &&
- test_cmp expect err &&
-
- ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
- error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
- EOF
- rm $tag_dir_prefix/tag-symbolic-1 &&
- test_cmp expect err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
+ error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-bad &&
+ test_cmp expect err &&
+
+ ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
+ error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
+ EOF
+ rm $tag_dir_prefix/tag-symbolic-1 &&
+ test_cmp expect err
+ )
'
test_expect_success SYMLINKS 'symlink symref content should be checked (worktree)' '
test_when_finished "rm -rf repo" &&
git init repo &&
- cd repo &&
- test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
- main_worktree_refdir_prefix=.git/refs/heads &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
(
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
- (
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
-
- ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $worktree1_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $worktree2_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $main_worktree_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
- EOF
- rm $worktree1_refdir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- for bad_referent_name in ".tag" "branch "
- do
- ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ main_worktree_refdir_prefix=.git/refs/heads &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree1_refdir_prefix/bad-symbolic &&
+ rm $worktree1_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree1_refdir_prefix/bad-symbolic &&
+ rm $worktree2_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree2_refdir_prefix/bad-symbolic &&
+ rm $main_worktree_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
EOF
- rm $worktree2_refdir_prefix/bad-symbolic &&
- test_cmp expect err || return 1
- done
+ rm $worktree1_refdir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ for bad_referent_name in ".tag" "branch "
+ do
+ ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
- cd repo &&
- test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
(
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
- (
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
- for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
- test_must_fail git refs verify 2>err &&
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ EOF
+ rm $worktree1_refdir_prefix/bad-branch-1 &&
+ test_cmp expect err || return 1
+ done &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ EOF
+ rm $worktree2_refdir_prefix/bad-branch-2 &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
EOF
- rm $worktree1_refdir_prefix/bad-branch-1 &&
- test_cmp expect err || return 1
- done &&
+ rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
- test_must_fail git refs verify 2>err &&
+ printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
EOF
- rm $worktree2_refdir_prefix/bad-branch-2 &&
- test_cmp expect err || return 1
- done &&
-
- printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $worktree1_refdir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
- EOF
- rm $worktree1_refdir_prefix/branch-garbage &&
- test_cmp expect err
+ rm $worktree1_refdir_prefix/branch-garbage &&
+ test_cmp expect err
+ )
'
test_done
--
2.48.1
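For readers unfamiliar with the idiom this patch applies, the effect of the subshell can be shown in isolation. The following is only a minimal sketch and is not taken from t0602: the scratch directory and the names "demo_root" and "demo_repo" are hypothetical. It illustrates that a "cd" performed inside "( ... )" does not change the working directory of the calling shell, which is why each test body above moves its "cd repo" into a subshell.

```shell
# Hypothetical scratch area; none of these names come from the test suite.
demo_root=$(mktemp -d) &&
mkdir "$demo_root/demo_repo" &&
cd "$demo_root" &&
before=$(pwd) &&
(
	# The directory change is confined to this subshell.
	cd demo_repo &&
	inside=$(pwd) &&
	echo "inside subshell: $inside"
) &&
# Back in the parent shell, the working directory is untouched.
after=$(pwd) &&
echo "after subshell: $after"
```

Without the parentheses, the "cd" would persist and any later step that assumes the original directory (for example a cleanup such as "rm -rf repo") would run from the wrong place.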
* [PATCH v4 2/8] builtin/refs: get worktrees without reading head information
2025-02-14 4:50 ` [PATCH v4 0/8] add more ref consistency checks shejialuo
2025-02-14 4:51 ` [PATCH v4 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
@ 2025-02-14 4:52 ` shejialuo
2025-02-14 9:19 ` Karthik Nayak
2025-02-14 4:52 ` [PATCH v4 3/8] packed-backend: check whether the "packed-refs" is regular file shejialuo
` (7 subsequent siblings)
9 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-02-14 4:52 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c", functions such as "create_snapshot" and
"next_record" check the correctness of the content of the "packed-refs"
file. When anything is wrong, the program dies.
It may seem that this behavior is irrelevant to us, because we are going
to read and parse the raw "packed-refs" file directly to check its
consistency, without creating a snapshot or using the ref iterator.
However, "get_worktrees" in "builtin/refs" parses the "HEAD"
information. If the referent of "HEAD" is stored in "packed-refs", we
call "create_snapshot" to parse "packed-refs" and look it up. Regardless
of whether the entry for the referent of "HEAD" in "packed-refs" is
correct, "create_snapshot" calls "verify_buffer_safe" to check whether
the last line of the file ends with a newline. If not, the program dies.
Although this behavior does no harm in itself, it short-circuits the
check. When users execute "git refs verify" or "git fsck", we should
avoid reading the HEAD information, because doing so may trigger a read
in the packed backend whose stricter checks make the program die.
Instead, we should continue and check the rest of the "packed-refs"
file.
Fortunately, 465a22b338 (worktree: skip reading HEAD when repairing
worktrees, 2023-12-29) introduced the function
"get_worktrees_internal", which allows us to get worktrees without
reading the HEAD information.
Create a new exposed function "get_worktrees_without_reading_head", then
replace "get_worktrees" in "builtin/refs" with the newly created
function.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
builtin/refs.c | 2 +-
worktree.c | 5 +++++
worktree.h | 6 ++++++
3 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/builtin/refs.c b/builtin/refs.c
index a29f195834..55ff5dae11 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -88,7 +88,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix,
git_config(git_fsck_config, &fsck_refs_options);
prepare_repo_settings(the_repository);
- worktrees = get_worktrees();
+ worktrees = get_worktrees_without_reading_head();
for (size_t i = 0; worktrees[i]; i++)
ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
&fsck_refs_options, worktrees[i]);
diff --git a/worktree.c b/worktree.c
index 248bbb39d4..89b7d86cef 100644
--- a/worktree.c
+++ b/worktree.c
@@ -175,6 +175,11 @@ struct worktree **get_worktrees(void)
return get_worktrees_internal(0);
}
+struct worktree **get_worktrees_without_reading_head(void)
+{
+ return get_worktrees_internal(1);
+}
+
const char *get_worktree_git_dir(const struct worktree *wt)
{
if (!wt)
diff --git a/worktree.h b/worktree.h
index 38145df80f..1ba4a161a0 100644
--- a/worktree.h
+++ b/worktree.h
@@ -30,6 +30,12 @@ struct worktree {
*/
struct worktree **get_worktrees(void);
+/*
+ * Like `get_worktrees`, but does not read HEAD. This is useful when checking
+ * the consistency, as reading HEAD may not be necessary.
+ */
+struct worktree **get_worktrees_without_reading_head(void);
+
/*
* Returns 1 if linked worktrees exist, 0 otherwise.
*/
--
2.48.1
* [PATCH v4 3/8] packed-backend: check whether the "packed-refs" is regular file
2025-02-14 4:50 ` [PATCH v4 0/8] add more ref consistency checks shejialuo
2025-02-14 4:51 ` [PATCH v4 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-14 4:52 ` [PATCH v4 2/8] builtin/refs: get worktrees without reading head information shejialuo
@ 2025-02-14 4:52 ` shejialuo
2025-02-14 9:50 ` Karthik Nayak
2025-02-14 4:52 ` [PATCH v4 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
` (6 subsequent siblings)
9 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-02-14 4:52 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Although "git-fsck(1)" and "packed-backend.c" check some of the
consistency and correctness of the "packed-refs" file, they never check
its filetype. The user should always use the "git pack-refs" command to
create the raw regular "packed-refs" file, so we need to explicitly
check this in "git refs verify".
We could use the "open_nofollow" wrapper to open the raw "packed-refs"
file. If the returned "fd" value is less than 0, we could check whether
"errno" is "ELOOP" to report an error to the user.
Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
the user if "packed-refs" is not a regular file.
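The errno handling described above can be sketched outside of Git as follows. This is only an illustration under stated assumptions: it calls POSIX open(2) with O_NOFOLLOW directly instead of Git's "open_nofollow" wrapper, it assumes a platform (such as Linux) that reports ELOOP when the final path component is a symlink, and the enum and the function "probe_packed_refs" are hypothetical names that do not exist in Git.

```c
#define _POSIX_C_SOURCE 200809L /* for O_NOFOLLOW on strict compilers */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result codes, not part of Git. */
enum packed_refs_probe {
	PACKED_REFS_MISSING,     /* ENOENT: nothing to check */
	PACKED_REFS_NOT_REGULAR, /* ELOOP: last path component is a symlink */
	PACKED_REFS_OPEN_ERROR,  /* any other open(2) failure */
	PACKED_REFS_OPENED       /* open(2) succeeded */
};

/*
 * Open `path` without following a symlink in the last component and
 * classify the outcome in the same order as the patch does.
 */
static enum packed_refs_probe probe_packed_refs(const char *path)
{
	int fd = open(path, O_RDONLY | O_NOFOLLOW);

	if (fd < 0) {
		if (errno == ENOENT)
			return PACKED_REFS_MISSING;
		if (errno == ELOOP)
			return PACKED_REFS_NOT_REGULAR;
		return PACKED_REFS_OPEN_ERROR;
	}

	close(fd);
	return PACKED_REFS_OPENED;
}
```

Note that a successful open only shows that the path is not a symlink; the sketch makes no further claim about the file type.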
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 39 +++++++++++++++++++++++++++++++++++----
t/t0602-reffiles-fsck.sh | 22 ++++++++++++++++++++++
2 files changed, 57 insertions(+), 4 deletions(-)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index a7b6f74b6e..6401cecd5f 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -4,6 +4,7 @@
#include "../git-compat-util.h"
#include "../config.h"
#include "../dir.h"
+#include "../fsck.h"
#include "../gettext.h"
#include "../hash.h"
#include "../hex.h"
@@ -1748,15 +1749,45 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
-static int packed_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED,
+static int packed_fsck(struct ref_store *ref_store,
+ struct fsck_options *o,
struct worktree *wt)
{
+ struct packed_ref_store *refs = packed_downcast(ref_store,
+ REF_STORE_READ, "fsck");
+ int ret = 0;
+ int fd;
if (!is_main_worktree(wt))
- return 0;
+ goto cleanup;
- return 0;
+ if (o->verbose)
+ fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
+
+ fd = open_nofollow(refs->path, O_RDONLY);
+ if (fd < 0) {
+ /*
+ * If the packed-refs file doesn't exist, there's nothing
+ * to check.
+ */
+ if (errno == ENOENT)
+ goto cleanup;
+
+ if (errno == ELOOP) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_FILETYPE,
+ "not a regular file");
+ goto cleanup;
+ }
+
+ ret = error_errno(_("unable to open %s"), refs->path);
+ goto cleanup;
+ }
+
+cleanup:
+ return ret;
}
struct ref_storage_be refs_be_packed = {
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index cf7a202d0d..42c8d4ca1e 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -617,4 +617,26 @@ test_expect_success 'ref content checks should work with worktrees' '
)
'
+test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git pack-refs --all &&
+
+ mv .git/packed-refs .git/packed-refs-back &&
+ ln -sf packed-refs-back .git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs: badRefFiletype: not a regular file
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v4 4/8] packed-backend: add "packed-refs" header consistency check
2025-02-14 4:50 ` [PATCH v4 0/8] add more ref consistency checks shejialuo
` (2 preceding siblings ...)
2025-02-14 4:52 ` [PATCH v4 3/8] packed-backend: check whether the "packed-refs" is regular file shejialuo
@ 2025-02-14 4:52 ` shejialuo
2025-02-14 10:30 ` Karthik Nayak
2025-02-14 14:01 ` Junio C Hamano
2025-02-14 4:52 ` [PATCH v4 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
` (5 subsequent siblings)
9 siblings, 2 replies; 168+ messages in thread
From: shejialuo @ 2025-02-14 4:52 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c::create_snapshot", if there is a header (the line
which starts with '#'), we will check whether the line starts with "#
pack-refs with:". Before we port this check into "packed_fsck", let's
fix "create_snapshot" to check the prefix "# pack-refs with: " instead
of "# pack-refs with:" because we always write a single trailing space
after the colon.
However, we need to consider other situations and discuss whether we
need to add checks.
1. If the header does not exist, we should not report an error to the
user. This is because older Git versions never wrote a header in
the "packed-refs" file. Also, we do allow "packed-refs" to have no
header at runtime.
2. If the header content does not start with "# pack-refs with: ", we
should report an error just like what "create_snapshot" does. So,
create a new fsck message "badPackedRefHeader(ERROR)" for this.
3. If the header content is not the same as the constant string
"PACKED_REFS_HEADER", there is no need to report an error. This is
expected because we make the header extensible intentionally.
As analyzed above, we only need to check case 2. In order to do
this, read the "packed-refs" file via "strbuf_read". Like what
"create_snapshot" and other functions do, we could split the lines
by finding the next newline in the buffer. When we cannot find a
newline, we could report an error.
So, create a function "packed_fsck_ref_next_line" to find the next
newline and if there is no such newline, use
"packedRefEntryNotTerminated(ERROR)" to report an error to the user.
Then, parse the first line to apply the checks. Update the test to
exercise the code.
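The line splitting and the three header cases discussed above can be sketched in standalone C as follows. This is a simplified illustration, not the code from the patch: "next_line_end" and "header_is_acceptable" are hypothetical helpers, and they return plain integers instead of going through "fsck_report_ref".

```c
#include <stddef.h>
#include <string.h>

#define HEADER_PREFIX "# pack-refs with: "

/*
 * Find the end of the line starting at `start`. When no newline exists
 * before `eof`, set `*terminated` to 0 and treat `eof` as the end of the
 * line so that the caller can still parse the remaining bytes.
 */
static const char *next_line_end(const char *start, const char *eof,
				 int *terminated)
{
	const char *eol = memchr(start, '\n', (size_t)(eof - start));

	*terminated = eol != NULL;
	return eol ? eol : eof;
}

/*
 * Return 1 when the first line needs no report: either there is no
 * header at all (case 1) or the header starts with the expected prefix,
 * whatever traits follow (case 3). Return 0 for a bad header (case 2).
 */
static int header_is_acceptable(const char *start, const char *eol)
{
	size_t prefix_len = strlen(HEADER_PREFIX);

	if (start == eol || *start != '#')
		return 1;
	return (size_t)(eol - start) >= prefix_len &&
	       !memcmp(start, HEADER_PREFIX, prefix_len);
}
```

Comparing against the prefix with an explicit length keeps the check inside the current line even though the buffer is not NUL-terminated at the newline.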
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 8 ++++
fsck.h | 2 +
refs/packed-backend.c | 75 ++++++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 52 ++++++++++++++++++++++++
4 files changed, 136 insertions(+), 1 deletion(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index b14bc44ca4..11906f90fd 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -16,6 +16,10 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefHeader`::
+ (ERROR) The "packed-refs" file contains an invalid
+ header.
+
`badParentSha1`::
(ERROR) A commit object has a bad parent sha1.
@@ -176,6 +180,10 @@
`nullSha1`::
(WARN) Tree contains entries pointing to a null sha1.
+`packedRefEntryNotTerminated`::
+ (ERROR) The "packed-refs" file contains an entry that is
+ not terminated by a newline.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
diff --git a/fsck.h b/fsck.h
index a44c231a5f..67e3c97bc0 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
@@ -53,6 +54,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE, ERROR) \
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
+ FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 6401cecd5f..ff74ab915e 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -694,7 +694,7 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs)
tmp = xmemdupz(snapshot->buf, eol - snapshot->buf);
- if (!skip_prefix(tmp, "# pack-refs with:", (const char **)&p))
+ if (!skip_prefix(tmp, "# pack-refs with: ", (const char **)&p))
die_invalid_line(refs->path,
snapshot->buf,
snapshot->eof - snapshot->buf);
@@ -1749,12 +1749,76 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
+static int packed_fsck_ref_next_line(struct fsck_options *o,
+ unsigned long line_number, const char *start,
+ const char *eof, const char **eol)
+{
+ int ret = 0;
+
+ *eol = memchr(start, '\n', eof - start);
+ if (!*eol) {
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
+ "'%.*s' is not terminated with a newline",
+ (int)(eof - start), start);
+
+ /*
+ * There is no newline but we still want to parse it to the end of
+ * the buffer.
+ */
+ *eol = eof;
+ strbuf_release(&packed_entry);
+ }
+
+ return ret;
+}
+
+static int packed_fsck_ref_header(struct fsck_options *o,
+ const char *start, const char *eol)
+{
+ if (!starts_with(start, "# pack-refs with: ")) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs.header";
+
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_HEADER,
+ "'%.*s' does not start with '# pack-refs with: '",
+ (int)(eol - start), start);
+ }
+
+ return 0;
+}
+
+static int packed_fsck_ref_content(struct fsck_options *o,
+ const char *start, const char *eof)
+{
+ unsigned long line_number = 1;
+ const char *eol;
+ int ret = 0;
+
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ if (*start == '#') {
+ ret |= packed_fsck_ref_header(o, start, eol);
+
+ start = eol + 1;
+ line_number++;
+ }
+
+ return ret;
+}
+
static int packed_fsck(struct ref_store *ref_store,
struct fsck_options *o,
struct worktree *wt)
{
struct packed_ref_store *refs = packed_downcast(ref_store,
REF_STORE_READ, "fsck");
+ struct strbuf packed_ref_content = STRBUF_INIT;
int ret = 0;
int fd;
@@ -1786,7 +1850,16 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
+ if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
+ ret = error_errno(_("unable to read %s"), refs->path);
+ goto cleanup;
+ }
+
+ ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ packed_ref_content.buf + packed_ref_content.len);
+
cleanup:
+ strbuf_release(&packed_ref_content);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 42c8d4ca1e..30be1982df 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -639,4 +639,56 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
)
'
+test_expect_success 'packed-refs header should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
+ for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
+ "# pack-refs with traits: peeled fully-peeled sorted " \
+ "# pack-refs with a: peeled fully-peeled" \
+ "# pack-refs with:peeled fully-peeled sorted"
+ do
+ printf "%s\n" "$bad_header" >.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs.header: badPackedRefHeader: '\''$bad_header'\'' does not start with '\''# pack-refs with: '\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err || return 1
+ done
+ )
+'
+
+test_expect_success 'packed-refs missing header should not be reported' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ printf "$(git rev-parse HEAD) refs/heads/main\n" >.git/packed-refs &&
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
+test_expect_success 'packed-refs unknown traits should not be reported' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ printf "# pack-refs with: peeled fully-peeled sorted foo\n" >.git/packed-refs &&
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v4 5/8] packed-backend: check whether the refname contains NUL characters
2025-02-14 4:50 ` [PATCH v4 0/8] add more ref consistency checks shejialuo
` (3 preceding siblings ...)
2025-02-14 4:52 ` [PATCH v4 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
@ 2025-02-14 4:52 ` shejialuo
2025-02-14 4:53 ` [PATCH v4 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
` (4 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-14 4:52 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
"packed-backend.c::next_record" will use "check_refname_format" to check
the consistency of the refname. If it is not OK, the program will die.
However, as reported in [1], this check fails to catch some corruption,
so the existing code path must be missing something.
We use the following code to get the refname:
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
In the above code, `p` is the start pointer of the refname and `eol` is
the next newline pointer. We calculate the length of the refname by
subtracting the two pointers. Then we add the memory range between `p`
and `eol` to get the refname.
However, if there are some NUL characters in the memory range between `p`
and `eol`, we will treat the refname as valid as long as the memory range
between `p` and the first NUL character is a valid refname.
In order to catch the above corruption, create a new function
"refname_contains_nul" which searches for the first NUL character within
the length of the refname. If one is found, the refname contains
embedded NUL characters.
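The following standalone sketch illustrates why a string-based check misses the corruption while a length-based search catches it. It uses only the C standard library; "range_contains_nul" is a hypothetical stand-in that takes raw pointers rather than a "struct strbuf".

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if any NUL byte lies in the range [p, eol). Functions that
 * take a C string, such as strlen(), stop at the first NUL and never
 * look at the bytes behind it.
 */
static int range_contains_nul(const char *p, const char *eol)
{
	return memchr(p, '\0', (size_t)(eol - p)) != NULL;
}
```

For a corrupted line whose bytes are "refs/heads/main", a NUL, and then "garbage", a C-string view of the buffer ends after "refs/heads/main", which is a valid refname, while the range-based search reports the embedded NUL.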
Use this function in "next_record" to make the program die if
"refname_contains_nul" returns true.
[1] https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/
Reported-by: R. Diez <rdiez-temp3@rd10.de>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index ff74ab915e..692e315e41 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -494,6 +494,21 @@ static void verify_buffer_safe(struct snapshot *snapshot)
last_line, eof - last_line);
}
+/*
+ * When parsing the "packed-refs" file, we will parse it line by line.
+ * Because we know the start pointer of the refname and the next
+ * newline pointer, we could calculate the length of the refname by
+ * subtracting the two pointers. However, there is a corner case where
+ * the refname contains corrupted embedded NUL characters. And
+ * `check_refname_format()` will not catch this when the truncated
+ * refname is still a valid refname. To prevent this, we need to check
+ * whether the refname contains the NUL characters.
+ */
+static int refname_contains_nul(struct strbuf *refname)
+{
+ return !!memchr(refname->buf, '\0', refname->len);
+}
+
#define SMALL_FILE_SIZE (32*1024)
/*
@@ -895,6 +910,9 @@ static int next_record(struct packed_ref_iterator *iter)
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
+ if (refname_contains_nul(&iter->refname_buf))
+ die("packed refname contains embedded NUL: %s", iter->base.refname);
+
if (check_refname_format(iter->base.refname, REFNAME_ALLOW_ONELEVEL)) {
if (!refname_is_safe(iter->base.refname))
die("packed refname is dangerous: %s",
--
2.48.1
* [PATCH v4 6/8] packed-backend: add "packed-refs" entry consistency check
2025-02-14 4:50 ` [PATCH v4 0/8] add more ref consistency checks shejialuo
` (4 preceding siblings ...)
2025-02-14 4:52 ` [PATCH v4 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
@ 2025-02-14 4:53 ` shejialuo
2025-02-14 4:59 ` [PATCH v4 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
` (3 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-14 4:53 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
"packed-backend.c::next_record" will parse the ref entry to check the
consistency. This function already performs the following checks:
1. Parse the main line of the ref entry to check whether the oid is
valid. Then, check whether the next character is whitespace. Then
check the refname.
2. If the next line starts with '^', it would continue to parse the
peeled oid and check whether the last character is '\n'.
As we have decided to implement the ref consistency check for
"packed-refs", let's port these two checks and update the test to
exercise the code.
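The first two steps of the main-line check (a full-length hex oid followed by whitespace) can be sketched as below. This is a simplified illustration under stated assumptions: it does not call Git's "parse_oid_hex_algop", the caller passes the hex length explicitly (40 for SHA-1), and the enum and "check_main_line" are hypothetical names. The refname validation done by "check_refname_format" is left out.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result codes, not part of Git. */
enum entry_status { ENTRY_OK, ENTRY_BAD_OID, ENTRY_NO_SPACE };

/*
 * Check that the line [start, eol) begins with `hexsz` hex digits and
 * that the character right after them is whitespace.
 */
static enum entry_status check_main_line(const char *start, const char *eol,
					 size_t hexsz)
{
	size_t i;

	if ((size_t)(eol - start) < hexsz)
		return ENTRY_BAD_OID;
	for (i = 0; i < hexsz; i++)
		if (!isxdigit((unsigned char)start[i]))
			return ENTRY_BAD_OID;

	if (start + hexsz == eol || !isspace((unsigned char)start[hexsz]))
		return ENTRY_NO_SPACE;

	return ENTRY_OK;
}
```

A truncated oid fails the first check, while a full oid directly followed by a character such as 'x' fails the second one. These correspond to the "has invalid oid" and "has no space after oid" reports in the patch.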
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 3 +
fsck.h | 1 +
refs/packed-backend.c | 121 +++++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 44 +++++++++++++
4 files changed, 168 insertions(+), 1 deletion(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 11906f90fd..02a7bf0503 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -16,6 +16,9 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefEntry`::
+ (ERROR) The "packed-refs" file contains an invalid entry.
+
`badPackedRefHeader`::
(ERROR) The "packed-refs" file contains an invalid
header.
diff --git a/fsck.h b/fsck.h
index 67e3c97bc0..14d70f6653 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_ENTRY, ERROR) \
FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 692e315e41..5d1dcfec6f 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1812,9 +1812,113 @@ static int packed_fsck_ref_header(struct fsck_options *o,
return 0;
}
+static int packed_fsck_ref_peeled_line(struct fsck_options *o,
+ struct ref_store *ref_store,
+ unsigned long line_number,
+ const char *start, const char *eol)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct object_id peeled;
+ const char *p;
+ int ret = 0;
+
+ /*
+ * Skip the '^' and parse the peeled oid.
+ */
+ start++;
+ if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid peeled oid",
+ (int)(eol - start), start);
+ goto cleanup;
+ }
+
+ if (p != eol) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has trailing garbage after peeled oid '%.*s'",
+ (int)(eol - p), p);
+ goto cleanup;
+ }
+cleanup:
+ strbuf_release(&packed_entry);
+ return ret;
+}
+
+static int packed_fsck_ref_main_line(struct fsck_options *o,
+ struct ref_store *ref_store,
+ unsigned long line_number,
+ struct strbuf *refname,
+ const char *start, const char *eol)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct object_id oid;
+ const char *p;
+ int ret = 0;
+
+ if (parse_oid_hex_algop(start, &oid, &p, ref_store->repo->hash_algo)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid oid",
+ (int)(eol - start), start);
+ goto cleanup;
+ }
+
+ if (p == eol || !isspace(*p)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has no space after oid '%s' but with '%.*s'",
+ oid_to_hex(&oid), (int)(eol - p), p);
+ goto cleanup;
+ }
+
+ p++;
+ strbuf_reset(refname);
+ strbuf_add(refname, p, eol - p);
+ if (refname_contains_nul(refname)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "refname '%s' contains NUL characters",
+ refname->buf);
+ }
+
+ if (check_refname_format(refname->buf, 0)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_NAME,
+ "has bad refname '%s'", refname->buf);
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
+ struct ref_store *ref_store,
const char *start, const char *eof)
{
+ struct strbuf refname = STRBUF_INIT;
unsigned long line_number = 1;
const char *eol;
int ret = 0;
@@ -1827,6 +1931,21 @@ static int packed_fsck_ref_content(struct fsck_options *o,
line_number++;
}
+ while (start < eof) {
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ ret |= packed_fsck_ref_main_line(o, ref_store, line_number, &refname, start, eol);
+ start = eol + 1;
+ line_number++;
+ if (start < eof && *start == '^') {
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ ret |= packed_fsck_ref_peeled_line(o, ref_store, line_number,
+ start, eol);
+ start = eol + 1;
+ line_number++;
+ }
+ }
+
+ strbuf_release(&refname);
return ret;
}
@@ -1873,7 +1992,7 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
- ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ ret = packed_fsck_ref_content(o, ref_store, packed_ref_content.buf,
packed_ref_content.buf + packed_ref_content.len);
cleanup:
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 30be1982df..058a783cb7 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -691,4 +691,48 @@ test_expect_success 'packed-refs unknown traits should not be reported' '
)
'
+test_expect_success 'packed-refs content should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ git tag -a annotated-tag-2 -m tag-2 &&
+
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_2_oid=$(git rev-parse annotated-tag-2) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ tag_2_peeled_oid=$(git rev-parse annotated-tag-2^{}) &&
+ short_oid=$(printf "%s" $tag_1_peeled_oid | cut -c 1-4) &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $short_oid refs/heads/branch-1
+ ${branch_1_oid}x
+ $branch_2_oid refs/heads/bad-branch
+ $branch_2_oid refs/heads/branch.
+ $tag_1_oid refs/tags/annotated-tag-3
+ ^$short_oid
+ $tag_2_oid refs/tags/annotated-tag-4.
+ ^$tag_2_peeled_oid garbage
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: badPackedRefEntry: '\''$short_oid refs/heads/branch-1'\'' has invalid oid
+ error: packed-refs line 3: badPackedRefEntry: has no space after oid '\''$branch_1_oid'\'' but with '\''x'\''
+ error: packed-refs line 4: badRefName: has bad refname '\'' refs/heads/bad-branch'\''
+ error: packed-refs line 5: badRefName: has bad refname '\''refs/heads/branch.'\''
+ error: packed-refs line 7: badPackedRefEntry: '\''$short_oid'\'' has invalid peeled oid
+ error: packed-refs line 8: badRefName: has bad refname '\''refs/tags/annotated-tag-4.'\''
+ error: packed-refs line 9: badPackedRefEntry: has trailing garbage after peeled oid '\'' garbage'\''
+ EOF
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v4 7/8] packed-backend: check whether the "packed-refs" is sorted
2025-02-14 4:50 ` [PATCH v4 0/8] add more ref consistency checks shejialuo
` (5 preceding siblings ...)
2025-02-14 4:53 ` [PATCH v4 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
@ 2025-02-14 4:59 ` shejialuo
2025-02-14 4:59 ` [PATCH v4 8/8] builtin/fsck: add `git refs verify` child process shejialuo
` (2 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-14 4:59 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
When there is a "sorted" trait in the header of the "packed-refs" file,
it means that the entries are sorted in increasing order of refname. We
should add checks to verify whether the "packed-refs" is sorted in this
case.
Update the "packed_fsck_ref_header" to know whether there is a "sorted"
trait in the header. It may seem that we could record all refnames
during the parsing process and then compare later. However, this is not
a good design due to the following reasons:
1. Because we need to store the state across the whole checking
lifetime, we would consume a lot of memory if there are many entries
in the "packed-refs" file.
2. We cannot reuse the existing compare function "cmp_packed_ref_records",
which would lead to duplicated code.
Because "cmp_packed_ref_records" needs an extra parameter "struct
snapshot", extract the common part into a new function
"cmp_packed_refname" so that we can reuse it for the comparison.
Then, create a new function "packed_fsck_ref_sorted" to parse the file
again and use the new fsck message "packedRefUnsorted(ERROR)" to report
to the user if the file is not sorted.
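The second pass can be sketched as a single scan that remembers only the previous refname, so no per-entry state is stored. The sketch below is an illustration under stated assumptions: every line is newline-terminated and well formed (the first pass has already succeeded), the caller passes the hex length of the oid, and "cmp_refname_lines" and "first_unsorted_line" are hypothetical names modelled on the comparison logic shown in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Compare two refnames that are each terminated by a newline. */
static int cmp_refname_lines(const char *r1, const char *r2)
{
	while (1) {
		if (*r1 == '\n')
			return *r2 == '\n' ? 0 : -1;
		if (*r1 != *r2) {
			if (*r2 == '\n')
				return 1;
			return (unsigned char)*r1 < (unsigned char)*r2 ? -1 : 1;
		}
		r1++;
		r2++;
	}
}

/*
 * Return 0 if the entries are strictly increasing by refname, otherwise
 * the 1-based number of the first line that is out of order. A header
 * on line 1 and peeled lines starting with '^' are skipped.
 */
static unsigned long first_unsorted_line(const char *start, const char *eof,
					 size_t hexsz)
{
	unsigned long line = 1;
	const char *former = NULL;

	while (start < eof) {
		const char *eol = memchr(start, '\n', (size_t)(eof - start));

		if (!eol)
			break; /* unterminated input is outside this sketch */

		if (!(line == 1 && *start == '#') && *start != '^') {
			/* The refname follows the oid and one separator. */
			const char *current = start + hexsz + 1;

			if (former && cmp_refname_lines(former, current) >= 0)
				return line;
			former = current;
		}

		start = eol + 1;
		line++;
	}

	return 0;
}
```

Because only a pointer to the previous refname is kept, the memory use stays constant no matter how many entries the file contains.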
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 3 +
fsck.h | 1 +
refs/packed-backend.c | 116 +++++++++++++++++++++++++++++-----
t/t0602-reffiles-fsck.sh | 87 +++++++++++++++++++++++++
4 files changed, 191 insertions(+), 16 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 02a7bf0503..9601fff228 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -187,6 +187,9 @@
(ERROR) The "packed-refs" file contains an entry that is
not terminated by a newline.
+`packedRefUnsorted`::
+ (ERROR) The "packed-refs" file is not sorted.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
diff --git a/fsck.h b/fsck.h
index 14d70f6653..19f3cb2773 100644
--- a/fsck.h
+++ b/fsck.h
@@ -56,6 +56,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
+ FUNC(PACKED_REF_UNSORTED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 5d1dcfec6f..391efced54 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -300,14 +300,9 @@ struct snapshot_record {
size_t len;
};
-static int cmp_packed_ref_records(const void *v1, const void *v2,
- void *cb_data)
-{
- const struct snapshot *snapshot = cb_data;
- const struct snapshot_record *e1 = v1, *e2 = v2;
- const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
- const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+static int cmp_packed_refname(const char *r1, const char *r2)
+{
while (1) {
if (*r1 == '\n')
return *r2 == '\n' ? 0 : -1;
@@ -322,6 +317,17 @@ static int cmp_packed_ref_records(const void *v1, const void *v2,
}
}
+static int cmp_packed_ref_records(const void *v1, const void *v2,
+ void *cb_data)
+{
+ const struct snapshot *snapshot = cb_data;
+ const struct snapshot_record *e1 = v1, *e2 = v2;
+ const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
+ const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+
+ return cmp_packed_refname(r1, r2);
+}
+
/*
* Compare a snapshot record at `rec` to the specified NUL-terminated
* refname.
@@ -1797,19 +1803,33 @@ static int packed_fsck_ref_next_line(struct fsck_options *o,
}
static int packed_fsck_ref_header(struct fsck_options *o,
- const char *start, const char *eol)
+ const char *start, const char *eol,
+ unsigned int *sorted)
{
- if (!starts_with(start, "# pack-refs with: ")) {
+ struct string_list traits = STRING_LIST_INIT_NODUP;
+ char *tmp_line;
+ int ret = 0;
+ char *p;
+
+ tmp_line = xmemdupz(start, eol - start);
+ if (!skip_prefix(tmp_line, "# pack-refs with: ", (const char **)&p)) {
struct fsck_ref_report report = { 0 };
report.path = "packed-refs.header";
- return fsck_report_ref(o, &report,
- FSCK_MSG_BAD_PACKED_REF_HEADER,
- "'%.*s' does not start with '# pack-refs with: '",
- (int)(eol - start), start);
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_HEADER,
+ "'%.*s' does not start with '# pack-refs with: '",
+ (int)(eol - start), start);
+ goto cleanup;
}
- return 0;
+ string_list_split_in_place(&traits, p, " ", -1);
+ *sorted = unsorted_string_list_has_string(&traits, "sorted");
+
+cleanup:
+ free(tmp_line);
+ string_list_clear(&traits, 0);
+ return ret;
}
static int packed_fsck_ref_peeled_line(struct fsck_options *o,
@@ -1914,8 +1934,68 @@ static int packed_fsck_ref_main_line(struct fsck_options *o,
return ret;
}
+static int packed_fsck_ref_sorted(struct fsck_options *o,
+ struct ref_store *ref_store,
+ const char *start, const char *eof)
+{
+ size_t hexsz = ref_store->repo->hash_algo->hexsz;
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct strbuf refname1 = STRBUF_INIT;
+ struct strbuf refname2 = STRBUF_INIT;
+ unsigned long line_number = 1;
+ const char *former = NULL;
+ const char *current;
+ const char *eol;
+ int ret = 0;
+
+ if (*start == '#') {
+ eol = memchr(start, '\n', eof - start);
+ start = eol + 1;
+ line_number++;
+ }
+
+ for (; start < eof; line_number++, start = eol + 1) {
+ eol = memchr(start, '\n', eof - start);
+
+ if (*start == '^')
+ continue;
+
+ if (!former) {
+ former = start + hexsz + 1;
+ continue;
+ }
+
+ current = start + hexsz + 1;
+ if (cmp_packed_refname(former, current) >= 0) {
+ const char *err_fmt =
+ "refname '%s' is less than previous refname '%s'";
+
+ eol = memchr(former, '\n', eof - former);
+ strbuf_add(&refname1, former, eol - former);
+ eol = memchr(current, '\n', eof - current);
+ strbuf_add(&refname2, current, eol - current);
+
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_UNSORTED,
+ err_fmt, refname2.buf, refname1.buf);
+ goto cleanup;
+ }
+ former = current;
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ strbuf_release(&refname1);
+ strbuf_release(&refname2);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
struct ref_store *ref_store,
+ unsigned int *sorted,
const char *start, const char *eof)
{
struct strbuf refname = STRBUF_INIT;
@@ -1925,7 +2005,7 @@ static int packed_fsck_ref_content(struct fsck_options *o,
ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
if (*start == '#') {
- ret |= packed_fsck_ref_header(o, start, eol);
+ ret |= packed_fsck_ref_header(o, start, eol, sorted);
start = eol + 1;
line_number++;
@@ -1956,6 +2036,7 @@ static int packed_fsck(struct ref_store *ref_store,
struct packed_ref_store *refs = packed_downcast(ref_store,
REF_STORE_READ, "fsck");
struct strbuf packed_ref_content = STRBUF_INIT;
+ unsigned int sorted = 0;
int ret = 0;
int fd;
@@ -1992,8 +2073,11 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
- ret = packed_fsck_ref_content(o, ref_store, packed_ref_content.buf,
+ ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
packed_ref_content.buf + packed_ref_content.len);
+ if (!ret && sorted)
+ ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
+ packed_ref_content.buf + packed_ref_content.len);
cleanup:
strbuf_release(&packed_ref_content);
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 058a783cb7..f305428f12 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -735,4 +735,91 @@ test_expect_success 'packed-refs content should be checked' '
)
'
+test_expect_success 'packed-ref with sorted trait should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ EOF
+ git refs verify 2>err &&
+ rm .git/packed-refs &&
+ test_must_be_empty err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $branch_2_oid $refname1
+ EOF
+ git refs verify 2>err &&
+ rm .git/packed-refs &&
+ test_must_be_empty err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $branch_2_oid $refname1
+ $branch_1_oid $refname2
+ $tag_1_oid $refname3
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 3: packedRefUnsorted: refname '\''$refname2'\'' is less than previous refname '\''$refname1'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $tag_1_oid $refname3
+ ^$tag_1_peeled_oid
+ $branch_2_oid $refname2
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 4: packedRefUnsorted: refname '\''$refname2'\'' is less than previous refname '\''$refname3'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
+ )
+'
+
+test_expect_success 'packed-ref without sorted trait should not be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled
+ $branch_2_oid $refname1
+ $branch_1_oid $refname2
+ EOF
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
test_done
--
2.48.1
^ permalink raw reply related [flat|nested] 168+ messages in thread
* [PATCH v4 8/8] builtin/fsck: add `git refs verify` child process
2025-02-14 4:50 ` [PATCH v4 0/8] add more ref consistency checks shejialuo
` (6 preceding siblings ...)
2025-02-14 4:59 ` [PATCH v4 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
@ 2025-02-14 4:59 ` shejialuo
2025-02-14 9:04 ` [PATCH v4 0/8] add more ref consistency checks Karthik Nayak
2025-02-17 15:25 ` [PATCH v5 " shejialuo
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-14 4:59 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
We have now implemented the ref consistency checks for both
"files-backend" and "packed-backend". Although some of these checks are
redundant, they cause no trouble. So, let's integrate them into
the "git-fsck(1)" command to get feedback from users. Calling
"git refs verify" in "git-fsck(1)" also makes sure that the newly
added checks don't break.
Introduce a new function "fsck_refs" that initializes and runs a child
process to execute the "git refs verify" command. To give the user
feedback, create a progress meter whose total is 1, because
it's hard to know how many loose refs we will check. We might
improve this later.
Then, introduce an option that allows the user to disable checking ref
database consistency. Call "fsck_refs" at the very start of
"git-fsck(1)", because we don't want the existing code of
"git-fsck(1)", which implicitly checks the consistency of refs, to
die before the new checks have run.
Last, update the test to exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/git-fsck.txt | 7 ++++++-
builtin/fsck.c | 33 +++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 39 ++++++++++++++++++++++++++++++++++++++
3 files changed, 77 insertions(+), 2 deletions(-)
diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index 5b82e4605c..5e71a29c3b 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -12,7 +12,7 @@ SYNOPSIS
'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
[--[no-]full] [--strict] [--verbose] [--lost-found]
[--[no-]dangling] [--[no-]progress] [--connectivity-only]
- [--[no-]name-objects] [<object>...]
+ [--[no-]name-objects] [--[no-]references] [<object>...]
DESCRIPTION
-----------
@@ -104,6 +104,11 @@ care about this output and want to speed it up further.
progress status even if the standard error stream is not
directed to a terminal.
+--[no-]references::
+ Control whether to check the references database consistency
+ via 'git refs verify'. See linkgit:git-refs[1] for details.
+ The default is to check the references database.
+
CONFIGURATION
-------------
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 7a4dcb0716..f4f395cfbd 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -50,6 +50,7 @@ static int verbose;
static int show_progress = -1;
static int show_dangling = 1;
static int name_objects;
+static int check_references = 1;
#define ERROR_OBJECT 01
#define ERROR_REACHABLE 02
#define ERROR_PACK 04
@@ -905,11 +906,37 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress)
return res;
}
+static void fsck_refs(struct repository *r)
+{
+ struct child_process refs_verify = CHILD_PROCESS_INIT;
+ struct progress *progress = NULL;
+
+ if (show_progress)
+ progress = start_progress(r, _("Checking ref database"), 1);
+
+ if (verbose)
+ fprintf_ln(stderr, _("Checking ref database"));
+
+ child_process_init(&refs_verify);
+ refs_verify.git_cmd = 1;
+ strvec_pushl(&refs_verify.args, "refs", "verify", NULL);
+ if (verbose)
+ strvec_push(&refs_verify.args, "--verbose");
+ if (check_strict)
+ strvec_push(&refs_verify.args, "--strict");
+
+ if (run_command(&refs_verify))
+ errors_found |= ERROR_REFS;
+
+ display_progress(progress, 1);
+ stop_progress(&progress);
+}
+
static char const * const fsck_usage[] = {
N_("git fsck [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]\n"
" [--[no-]full] [--strict] [--verbose] [--lost-found]\n"
" [--[no-]dangling] [--[no-]progress] [--connectivity-only]\n"
- " [--[no-]name-objects] [<object>...]"),
+ " [--[no-]name-objects] [--[no-]references] [<object>...]"),
NULL
};
@@ -928,6 +955,7 @@ static struct option fsck_opts[] = {
N_("write dangling objects in .git/lost-found")),
OPT_BOOL(0, "progress", &show_progress, N_("show progress")),
OPT_BOOL(0, "name-objects", &name_objects, N_("show verbose names for reachable objects")),
+ OPT_BOOL(0, "references", &check_references, N_("check reference database consistency")),
OPT_END(),
};
@@ -970,6 +998,9 @@ int cmd_fsck(int argc,
git_config(git_fsck_config, &fsck_obj_options);
prepare_repo_settings(the_repository);
+ if (check_references)
+ fsck_refs(the_repository);
+
if (connectivity_only) {
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
for_each_packed_object(the_repository,
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index f305428f12..22bd847782 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -822,4 +822,43 @@ test_expect_success 'packed-ref without sorted trait should not be checked' '
)
'
+test_expect_success '--[no-]references option should apply to fsck' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ (
+ cd repo &&
+ test_commit default &&
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck --references 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck --no-references 2>err &&
+ rm $branch_dir_prefix/branch-garbage &&
+ test_must_be_empty err || return 1
+ done
+ )
+'
+
test_done
--
2.48.1
^ permalink raw reply related [flat|nested] 168+ messages in thread
* Re: [PATCH v4 0/8] add more ref consistency checks
2025-02-14 4:50 ` [PATCH v4 0/8] add more ref consistency checks shejialuo
` (7 preceding siblings ...)
2025-02-14 4:59 ` [PATCH v4 8/8] builtin/fsck: add `git refs verify` child process shejialuo
@ 2025-02-14 9:04 ` Karthik Nayak
2025-02-14 12:16 ` shejialuo
2025-02-17 15:25 ` [PATCH v5 " shejialuo
9 siblings, 1 reply; 168+ messages in thread
From: Karthik Nayak @ 2025-02-14 9:04 UTC (permalink / raw)
To: shejialuo, git; +Cc: Patrick Steinhardt, Junio C Hamano, Michael Haggerty
[-- Attachment #1: Type: text/plain, Size: 2169 bytes --]
shejialuo <shejialuo@gmail.com> writes:
> Hi All:
>
> This patch enhances the following things:
>
> 1. [PATCH v4 4/8]: update the tests to verify that we don't report any
> errors to the user in some cases. Also, as suggested by Junio, make sure
> that we check whether there is a trailing space after "# pack-refs
> with:".
> 2. [PATCH v4 6/8]: instead of eagerly computing the name of the line,
> compute it lazily when there is an error. And use here-docs to
> improve the test script.
> 3. [PATCH v4 7/8]: instead of storing the states, we parse the file
> again to check whether the file is sorted to avoid allocating too
> much memory. And use here-docs to improve the test script.
> 4. [PATCH v4 8/8]: update the documentation to emphasize the default. And
> add tests to exercise the code.
>
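For readers joining at v4, item 3 above can be pictured with a small
standalone sketch. It is not the code from the patch:
"first_unsorted_line" and "cmp_refname_lines" are hypothetical names, it
uses plain libc instead of Git's helpers, and it assumes the buffer has
already passed the per-line syntax checks (each entry is
"<hex-oid> SP <refname> LF", with "hexsz" the length of the hex object
ID). The point is only that a second pass needs two pointers into the
buffer rather than a stored list of refnames.

```c
#include <stddef.h>
#include <string.h>

/*
 * Compare two refnames that each run up to the next '\n' (or to eof),
 * byte by byte as unsigned chars. A name that is a proper prefix of
 * the other sorts first.
 */
static int cmp_refname_lines(const char *a, const char *b, const char *eof)
{
	for (;;) {
		int a_end = (a >= eof || *a == '\n');
		int b_end = (b >= eof || *b == '\n');

		if (a_end || b_end)
			return b_end - a_end;
		if (*a != *b)
			return (unsigned char)*a < (unsigned char)*b ? -1 : 1;
		a++;
		b++;
	}
}

/*
 * Return 0 if the refnames appear in strictly increasing order,
 * otherwise the 1-based number of the first line that is out of order.
 * Only a '#' on the first line is treated as a header; '^' lines are
 * peeled entries and carry no refname.
 */
unsigned long first_unsorted_line(const char *buf, size_t len, size_t hexsz)
{
	const char *eof = buf + len;
	const char *start = buf;
	const char *former = NULL;
	unsigned long line_number = 1;

	while (start < eof) {
		const char *eol = memchr(start, '\n', (size_t)(eof - start));
		int is_header = (line_number == 1 && *start == '#');
		int is_peeled = (*start == '^');

		if (!eol)
			eol = eof; /* tolerate a missing final newline */

		if (!is_header && !is_peeled &&
		    (size_t)(eol - start) > hexsz + 1) {
			const char *current = start + hexsz + 1;

			if (former && cmp_refname_lines(former, current, eof) >= 0)
				return line_number;
			former = current;
		}

		if (eol == eof)
			break;
		start = eol + 1;
		line_number++;
	}
	return 0;
}
```

Equal refnames count as unsorted here because the comparison is ">= 0",
which mirrors the condition used in the patch.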
Nit: For someone coming in to review the 4th version directly it would
be really nice to see:
1. Summary of what the patch series is about.
2. Changes built over the last versions.
I know all this information is already spread out over the previous
versions, but would be nice to have it here (in every version rather).
> shejialuo (8):
> t0602: use subshell to ensure working directory unchanged
> builtin/refs: get worktrees without reading head information
> packed-backend: check whether the "packed-refs" is regular file
> packed-backend: add "packed-refs" header consistency check
> packed-backend: check whether the refname contains NUL characters
> packed-backend: add "packed-refs" entry consistency check
> packed-backend: check whether the "packed-refs" is sorted
> builtin/fsck: add `git refs verify` child process
>
> Documentation/fsck-msgids.txt | 14 +
> Documentation/git-fsck.txt | 7 +-
> builtin/fsck.c | 33 +-
> builtin/refs.c | 2 +-
> fsck.h | 4 +
> refs/packed-backend.c | 349 +++++++++-
> t/t0602-reffiles-fsck.sh | 1205 ++++++++++++++++++++-------------
> worktree.c | 5 +
> worktree.h | 6 +
> 9 files changed, 1140 insertions(+), 485 deletions(-)
[snip]
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v4 2/8] builtin/refs: get worktrees without reading head information
2025-02-14 4:52 ` [PATCH v4 2/8] builtin/refs: get worktrees without reading head information shejialuo
@ 2025-02-14 9:19 ` Karthik Nayak
2025-02-14 12:20 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Karthik Nayak @ 2025-02-14 9:19 UTC (permalink / raw)
To: shejialuo, git; +Cc: Patrick Steinhardt, Junio C Hamano, Michael Haggerty
[-- Attachment #1: Type: text/plain, Size: 3999 bytes --]
shejialuo <shejialuo@gmail.com> writes:
> In "packed-backend.c", there are some functions such as "create_snapshot"
> and "next_record" which would check the correctness of the content of
> the "packed-refs" file. When anything is bad, the program will die.
>
> It may seem that we have nothing relevant to the above feature, because we
> are going to read and parse the raw "packed-refs" file without creating
> the snapshot and using the ref iterator to check the consistency.
>
> However, when using "get_worktrees" in "builtin/refs", we would parse
> the "HEAD" information. If the referent of the "HEAD" is inside the
> "packed-refs", we will call "create_snapshot" function to parse the
> "packed-refs" to get the information. No matter whether the entry of
> "HEAD" in "packed-refs" is correct, "create_snapshot" would call
> "verify_buffer_safe" to check whether there is a newline in the last
> line of the file. If not, the program will die.
>
Nit: while the second paragraph above makes sense in the context of what
we're trying to achieve in this patch series, it doesn't make much sense
for this patch in isolation. Perhaps we want to give some more context
around what we're trying to solve in the upcoming patches and
how this hinders that.
> Although this behavior does no harm to the program, it will
> short-circuit the program. When users execute "git refs verify" or
> "git fsck", we should avoid reading the HEAD information, which may
> trigger the read operation in the packed backend, whose stricter checks
> make the program die. Instead, we should continue to check other parts of the
> "packed-refs" file completely.
>
> Fortunately, in 465a22b338 (worktree: skip reading HEAD when repairing
> worktrees, 2023-12-29), we have introduced a function
> "get_worktrees_internal" which allows us to get worktrees without
> reading head information.
>
> Create a new exposed function "get_worktrees_without_reading_head", then
> replace the "get_worktrees" in "builtin/refs" with the new created
> function.
>
> Mentored-by: Patrick Steinhardt <ps@pks.im>
> Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
> builtin/refs.c | 2 +-
> worktree.c | 5 +++++
> worktree.h | 6 ++++++
> 3 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/refs.c b/builtin/refs.c
> index a29f195834..55ff5dae11 100644
> --- a/builtin/refs.c
> +++ b/builtin/refs.c
> @@ -88,7 +88,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix,
> git_config(git_fsck_config, &fsck_refs_options);
> prepare_repo_settings(the_repository);
>
> - worktrees = get_worktrees();
> + worktrees = get_worktrees_without_reading_head();
> for (size_t i = 0; worktrees[i]; i++)
> ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
> &fsck_refs_options, worktrees[i]);
> diff --git a/worktree.c b/worktree.c
> index 248bbb39d4..89b7d86cef 100644
> --- a/worktree.c
> +++ b/worktree.c
> @@ -175,6 +175,11 @@ struct worktree **get_worktrees(void)
> return get_worktrees_internal(0);
> }
>
> +struct worktree **get_worktrees_without_reading_head(void)
> +{
> + return get_worktrees_internal(1);
> +}
> +
> const char *get_worktree_git_dir(const struct worktree *wt)
> {
> if (!wt)
> diff --git a/worktree.h b/worktree.h
> index 38145df80f..1ba4a161a0 100644
> --- a/worktree.h
> +++ b/worktree.h
> @@ -30,6 +30,12 @@ struct worktree {
> */
> struct worktree **get_worktrees(void);
>
> +/*
> + * Like `get_worktrees`, but does not read HEAD. This is useful when checking
> + * the consistency, as reading HEAD may not be necessary.
Checking what consistency? We should be a bit more verbose here. You can
mention that skipping HEAD allows us to get the worktrees without worrying
about failures pertaining to parsing the HEAD ref.
> + */
> +struct worktree **get_worktrees_without_reading_head(void);
> +
> /*
> * Returns 1 if linked worktrees exist, 0 otherwise.
> */
> --
> 2.48.1
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v4 3/8] packed-backend: check whether the "packed-refs" is regular file
2025-02-14 4:52 ` [PATCH v4 3/8] packed-backend: check whether the "packed-refs" is regular file shejialuo
@ 2025-02-14 9:50 ` Karthik Nayak
2025-02-14 12:37 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Karthik Nayak @ 2025-02-14 9:50 UTC (permalink / raw)
To: shejialuo, git; +Cc: Patrick Steinhardt, Junio C Hamano, Michael Haggerty
[-- Attachment #1: Type: text/plain, Size: 4916 bytes --]
shejialuo <shejialuo@gmail.com> writes:
> Although "git-fsck(1)" and "packed-backend.c" will check some
> consistency and correctness of "packed-refs" file, they never check the
Because you say 'some' here, it made me more curious. Could you state
exactly what checks are being done here?
> filetype of the "packed-refs". The user should always use "git
> pack-refs" command to create the raw regular "packed-refs" file, so we
> need to explicitly check this in "git refs verify".
>
Not sure I understand how the start of this last sentence correlates to
the end of it. Is the intention to say that we want to explicitly check
the filetype to ensure that the 'packed-refs' file was only created via
'git pack-refs'? If so, perhaps:
Verify that the 'packed-refs' file has the expected filetype,
confirming it was created by 'git pack-refs'.
> We could use "open_nofollow" wrapper to open the raw "packed-refs" file.
> If the returned "fd" value is less than 0, we could check whether the
> "errno" is "ELOOP" to report an error to the user.
>
> Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
> the user if "packed-refs" is not a regular file.
>
> Mentored-by: Patrick Steinhardt <ps@pks.im>
> Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
> refs/packed-backend.c | 39 +++++++++++++++++++++++++++++++++++----
> t/t0602-reffiles-fsck.sh | 22 ++++++++++++++++++++++
> 2 files changed, 57 insertions(+), 4 deletions(-)
>
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index a7b6f74b6e..6401cecd5f 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -4,6 +4,7 @@
> #include "../git-compat-util.h"
> #include "../config.h"
> #include "../dir.h"
> +#include "../fsck.h"
> #include "../gettext.h"
> #include "../hash.h"
> #include "../hex.h"
> @@ -1748,15 +1749,45 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> return empty_ref_iterator_begin();
> }
>
> -static int packed_fsck(struct ref_store *ref_store UNUSED,
> - struct fsck_options *o UNUSED,
> +static int packed_fsck(struct ref_store *ref_store,
> + struct fsck_options *o,
> struct worktree *wt)
> {
> + struct packed_ref_store *refs = packed_downcast(ref_store,
> + REF_STORE_READ, "fsck");
> + int ret = 0;
> + int fd;
>
> if (!is_main_worktree(wt))
> - return 0;
> + goto cleanup;
>
> - return 0;
> + if (o->verbose)
> + fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
> +
> + fd = open_nofollow(refs->path, O_RDONLY);
> + if (fd < 0) {
> + /*
> + * If the packed-refs file doesn't exist, there's nothing
> + * to check.
> + */
> + if (errno == ENOENT)
> + goto cleanup;
> +
> + if (errno == ELOOP) {
> + struct fsck_ref_report report = { 0 };
> + report.path = "packed-refs";
> + ret = fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_REF_FILETYPE,
> + "not a regular file");
> + goto cleanup;
> + }
> +
> + ret = error_errno(_("unable to open %s"), refs->path);
> + goto cleanup;
The paragraph in the commit message:
    Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
    the user if "packed-refs" is not a regular file.
gave me the impression that any error would be reported via
'fsck_report_ref()', but it seems like we are only reporting for
symbolic links. Why is that being singled out?
> + }
> +
> +cleanup:
> + return ret;
> }
>
> struct ref_storage_be refs_be_packed = {
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index cf7a202d0d..42c8d4ca1e 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -617,4 +617,26 @@ test_expect_success 'ref content checks should work with worktrees' '
> )
> '
>
> +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + (
> + cd repo &&
> + test_commit default &&
> + git branch branch-1 &&
> + git branch branch-2 &&
> + git branch branch-3 &&
> + git pack-refs --all &&
> +
> + mv .git/packed-refs .git/packed-refs-back &&
> + ln -sf packed-refs-bak .git/packed-refs &&
This still doesn't make sense to me. 'packed-refs-bak' doesn't exist, is
the intention to symlink '.git/packed-refs' -> something which doesn't
exist?
In that case why even make the effort to build a packed-refs file, could
we simply do 'ln -sf packed-refs-bak .git/packed-refs' in an empty repo?
If not, then 'packed-refs-bak' is definitely a typo and needs to be made
'packed-refs-back', which would go hand in hand with how we set up the test...
> + test_must_fail git refs verify 2>err &&
> + cat >expect <<-EOF &&
> + error: packed-refs: badRefFiletype: not a regular file
> + EOF
> + rm .git/packed-refs &&
> + test_cmp expect err
> + )
> +'
> +
> test_done
> --
> 2.48.1
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v4 4/8] packed-backend: add "packed-refs" header consistency check
2025-02-14 4:52 ` [PATCH v4 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
@ 2025-02-14 10:30 ` Karthik Nayak
2025-02-14 12:43 ` shejialuo
2025-02-14 14:01 ` Junio C Hamano
1 sibling, 1 reply; 168+ messages in thread
From: Karthik Nayak @ 2025-02-14 10:30 UTC (permalink / raw)
To: shejialuo, git; +Cc: Patrick Steinhardt, Junio C Hamano, Michael Haggerty
[-- Attachment #1: Type: text/plain, Size: 9330 bytes --]
shejialuo <shejialuo@gmail.com> writes:
> In "packed-backend.c::create_snapshot", if there is a header (the line
> which starts with '#'), we will check whether the line starts with "#
> pack-refs with:". Before we port this check into "packed_fsck", let's
> fix "create_snapshot" to check the prefix "# pack-refs with: " instead
> of "# pack-refs with:", because we always write a single
> trailing space after the colon.
>
Okay. So we're extending the check to also include the trailing space.
>
> However, we need to consider other situations and discuss whether we
> need to add checks.
>
> 1. If the header does not exist, we should not report an error to the
> user. This is because older Git versions never wrote a header in
> the "packed-refs" file. Also, we do allow a missing header in "packed-refs"
> at runtime.
Makes sense.
> 2. If the header content does not start with "# pack-refs with: ", we
> should report an error just like what "create_snapshot" does. So,
> create a new fsck message "badPackedRefHeader(ERROR)" for this.
> 3. If the header content is not the same as the constant string
> "PACKED_REFS_HEADER", this is expected because we make it extensible
> intentionally. So, there is no need to report.
Do you think it's worthwhile adding a warning/info here? This would
allow users to re-run 'git pack-refs' to ensure that they have a more
up-to-date version of 'packed-refs'.
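The three header cases can be sketched in a few lines of standalone C.
This is only an illustration under assumptions: "header_has_trait" and
"HEADER_PREFIX" are hypothetical names, and the sketch uses libc rather
than Git's "skip_prefix" or string_list helpers. A line that lacks the
exact prefix (including the trailing space) is rejected, while unknown
traits are simply ignored.

```c
#include <string.h>

#define HEADER_PREFIX "# pack-refs with: "

/*
 * Returns -1 if the line does not start with HEADER_PREFIX,
 * 1 if `trait` is one of the space-separated traits, 0 otherwise.
 * `line` need not be NUL-terminated; `len` excludes the newline.
 */
int header_has_trait(const char *line, size_t len, const char *trait)
{
	size_t prefix_len = strlen(HEADER_PREFIX);
	size_t trait_len = strlen(trait);
	const char *end = line + len;
	const char *p;

	if (len < prefix_len || memcmp(line, HEADER_PREFIX, prefix_len))
		return -1;

	p = line + prefix_len;
	while (p < end) {
		const char *tok_end = memchr(p, ' ', (size_t)(end - p));

		if (!tok_end)
			tok_end = end;
		/* Match whole tokens only, so "unsorted" is not "sorted". */
		if ((size_t)(tok_end - p) == trait_len &&
		    !memcmp(p, trait, trait_len))
			return 1;
		p = tok_end < end ? tok_end + 1 : end;
	}
	return 0;
}
```

A caller interested in case 2 only needs to look for the -1 return; the
0 and 1 results both describe a well-formed header.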
>
> As we have analyzed, we only need to check the case 2 in the above. In
> order to do this, read the "packed-refs" file via "strbuf_read". Like
> what "create_snapshot" and other functions do, we could split the line
> by finding the next newline in the buffer. When we cannot find a
> newline, we could report an error.
>
> So, create a function "packed_fsck_ref_next_line" to find the next
> newline and if there is no such newline, use
> "packedRefEntryNotTerminated(ERROR)" to report an error to the user.
>
> Then, parse the first line to apply the checks. Update the test to
> exercise the code.
>
> Mentored-by: Patrick Steinhardt <ps@pks.im>
> Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
> Documentation/fsck-msgids.txt | 8 ++++
> fsck.h | 2 +
> refs/packed-backend.c | 75 ++++++++++++++++++++++++++++++++++-
> t/t0602-reffiles-fsck.sh | 52 ++++++++++++++++++++++++
> 4 files changed, 136 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
> index b14bc44ca4..11906f90fd 100644
> --- a/Documentation/fsck-msgids.txt
> +++ b/Documentation/fsck-msgids.txt
> @@ -16,6 +16,10 @@
> `badObjectSha1`::
> (ERROR) An object has a bad sha1.
>
> +`badPackedRefHeader`::
> + (ERROR) The "packed-refs" file contains an invalid
> + header.
> +
> `badParentSha1`::
> (ERROR) A commit object has a bad parent sha1.
>
> @@ -176,6 +180,10 @@
> `nullSha1`::
> (WARN) Tree contains entries pointing to a null sha1.
>
> +`packedRefEntryNotTerminated`::
> + (ERROR) The "packed-refs" file contains an entry that is
> + not terminated by a newline.
> +
> `refMissingNewline`::
> (INFO) A loose ref that does not end with newline(LF). As
> valid implementations of Git never created such a loose ref
> diff --git a/fsck.h b/fsck.h
> index a44c231a5f..67e3c97bc0 100644
> --- a/fsck.h
> +++ b/fsck.h
> @@ -30,6 +30,7 @@ enum fsck_msg_type {
> FUNC(BAD_EMAIL, ERROR) \
> FUNC(BAD_NAME, ERROR) \
> FUNC(BAD_OBJECT_SHA1, ERROR) \
> + FUNC(BAD_PACKED_REF_HEADER, ERROR) \
> FUNC(BAD_PARENT_SHA1, ERROR) \
> FUNC(BAD_REF_CONTENT, ERROR) \
> FUNC(BAD_REF_FILETYPE, ERROR) \
> @@ -53,6 +54,7 @@ enum fsck_msg_type {
> FUNC(MISSING_TYPE, ERROR) \
> FUNC(MISSING_TYPE_ENTRY, ERROR) \
> FUNC(MULTIPLE_AUTHORS, ERROR) \
> + FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
> FUNC(TREE_NOT_SORTED, ERROR) \
> FUNC(UNKNOWN_TYPE, ERROR) \
> FUNC(ZERO_PADDED_DATE, ERROR) \
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 6401cecd5f..ff74ab915e 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -694,7 +694,7 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs)
>
> tmp = xmemdupz(snapshot->buf, eol - snapshot->buf);
>
> - if (!skip_prefix(tmp, "# pack-refs with:", (const char **)&p))
> + if (!skip_prefix(tmp, "# pack-refs with: ", (const char **)&p))
> die_invalid_line(refs->path,
> snapshot->buf,
> snapshot->eof - snapshot->buf);
> @@ -1749,12 +1749,76 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> return empty_ref_iterator_begin();
> }
>
> +static int packed_fsck_ref_next_line(struct fsck_options *o,
> + unsigned long line_number, const char *start,
> + const char *eof, const char **eol)
> +{
> + int ret = 0;
> +
> + *eol = memchr(start, '\n', eof - start);
> + if (!*eol) {
> + struct strbuf packed_entry = STRBUF_INIT;
> + struct fsck_ref_report report = { 0 };
> +
> + strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
> + report.path = packed_entry.buf;
> + ret = fsck_report_ref(o, &report,
> + FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
> + "'%.*s' is not terminated with a newline",
> + (int)(eof - start), start);
> +
> + /*
> + * There is no newline but we still want to parse it to the end of
> + * the buffer.
> + */
> + *eol = eof;
> + strbuf_release(&packed_entry);
> + }
> +
> + return ret;
> +}
> +
> +static int packed_fsck_ref_header(struct fsck_options *o,
> + const char *start, const char *eol)
> +{
> + if (!starts_with(start, "# pack-refs with: ")) {
> + struct fsck_ref_report report = { 0 };
> + report.path = "packed-refs.header";
> +
> + return fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_PACKED_REF_HEADER,
> + "'%.*s' does not start with '# pack-refs with: '",
> + (int)(eol - start), start);
> + }
> +
> + return 0;
> +}
> +
> +static int packed_fsck_ref_content(struct fsck_options *o,
> + const char *start, const char *eof)
> +{
> + unsigned long line_number = 1;
> + const char *eol;
> + int ret = 0;
> +
> + ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
> + if (*start == '#') {
> + ret |= packed_fsck_ref_header(o, start, eol);
> +
> + start = eol + 1;
> + line_number++;
Why do we increment `line_number` here? There is no usage beyond this.
> + }
> +
> + return ret;
> +}
> +
> static int packed_fsck(struct ref_store *ref_store,
> struct fsck_options *o,
> struct worktree *wt)
> {
> struct packed_ref_store *refs = packed_downcast(ref_store,
> REF_STORE_READ, "fsck");
> + struct strbuf packed_ref_content = STRBUF_INIT;
> int ret = 0;
> int fd;
>
> @@ -1786,7 +1850,16 @@ static int packed_fsck(struct ref_store *ref_store,
> goto cleanup;
> }
>
> + if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
> + ret = error_errno(_("unable to read %s"), refs->path);
> + goto cleanup;
> + }
> +
So we want to parse the whole ref content to a buffer, wonder if it
makes more sense to use `strbuf_read_line()` here instead. But let's
carry on.
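As a point of comparison for the whole-buffer approach, here is a rough
standalone sketch of walking an in-memory buffer line by line. The
names "find_eol" and "walk_lines" are hypothetical and the code uses
plain libc rather than Git's strbuf API; the patch's
"packed_fsck_ref_next_line" does the same memchr() lookup but also
files an fsck report when the newline is missing.

```c
#include <stddef.h>
#include <string.h>

/*
 * Set *eol to the '\n' that ends the line at `start`, or to `eof` if
 * the last line is unterminated. Returns 0 when a newline was found,
 * -1 otherwise.
 */
int find_eol(const char *start, const char *eof, const char **eol)
{
	*eol = memchr(start, '\n', (size_t)(eof - start));
	if (!*eol) {
		*eol = eof;
		return -1;
	}
	return 0;
}

/*
 * Walk every line of buf[0..len). Returns the number of lines seen and
 * stores in *unterminated the 1-based number of a final line that
 * lacks a newline, or 0 if every line is terminated.
 */
unsigned long walk_lines(const char *buf, size_t len,
			 unsigned long *unterminated)
{
	const char *eof = buf + len;
	const char *start = buf;
	const char *eol;
	unsigned long lines = 0;

	*unterminated = 0;
	while (start < eof) {
		lines++;
		if (find_eol(start, eof, &eol) < 0) {
			*unterminated = lines;
			break;
		}
		start = eol + 1;
	}
	return lines;
}
```

With the whole file in one buffer, a later pass can revisit any line by
pointer, which a streaming line reader would not allow without
re-reading the file.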
> + ret = packed_fsck_ref_content(o, packed_ref_content.buf,
> + packed_ref_content.buf + packed_ref_content.len);
> +
We pass the entire content and the EOF to the function.
> cleanup:
> + strbuf_release(&packed_ref_content);
> return ret;
> }
>
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index 42c8d4ca1e..30be1982df 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -639,4 +639,56 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
> )
> '
>
> +test_expect_success 'packed-refs header should be checked' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + (
> + cd repo &&
> + test_commit default &&
> +
> + git refs verify 2>err &&
> + test_must_be_empty err &&
> +
> + for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
> + "# pack-refs with traits: peeled fully-peeled sorted " \
> + "# pack-refs with a: peeled fully-peeled" \
> + "# pack-refs with:peeled fully-peeled sorted"
> + do
> + printf "%s\n" "$bad_header" >.git/packed-refs &&
> + test_must_fail git refs verify 2>err &&
> + cat >expect <<-EOF &&
> + error: packed-refs.header: badPackedRefHeader: '\''$bad_header'\'' does not start with '\''# pack-refs with: '\''
> + EOF
> + rm .git/packed-refs &&
> + test_cmp expect err || return 1
> + done
> + )
> +'
> +
> +test_expect_success 'packed-refs missing header should not be reported' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + (
> + cd repo &&
> + test_commit default &&
> +
> + printf "$(git rev-parse HEAD) refs/heads/main\n" >.git/packed-refs &&
> + git refs verify 2>err &&
> + test_must_be_empty err
> + )
> +'
> +
> +test_expect_success 'packed-refs unknown traits should not be reported' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + (
> + cd repo &&
> + test_commit default &&
> +
> + printf "# pack-refs with: peeled fully-peeled sorted foo\n" >.git/packed-refs &&
> + git refs verify 2>err &&
> + test_must_be_empty err
> + )
> +'
> +
> test_done
> --
> 2.48.1
^ permalink raw reply [flat|nested] 168+ messages in thread
* Re: [PATCH v4 0/8] add more ref consistency checks
2025-02-14 9:04 ` [PATCH v4 0/8] add more ref consistency checks Karthik Nayak
@ 2025-02-14 12:16 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-14 12:16 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, Patrick Steinhardt, Junio C Hamano, Michael Haggerty
On Fri, Feb 14, 2025 at 01:04:09AM -0800, Karthik Nayak wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > Hi All:
> >
> > This patch enhances the following things:
> >
> > 1. [PATCH v4 4/8]: update the tests to verify that we don't report any
> > errors to the user in some cases. Also, as suggested by Junio, make sure
> > that we check whether there is a trailing space after "# pack-refs
> > with:".
> > 2. [PATCH v4 6/8]: instead of greedily calculating the name of the line,
> > lazily compute it when there are any errors. And use here-docs to
> > improve the test script.
> > 3. [PATCH v4 7/8]: instead of storing the states, we parse the file
> > again to check whether the file is sorted to avoid allocating too
> > much memory. And use here-docs to improve the test script.
> > 4. [PATCH v4 8/8]: update the documentation to emphasize the default. And
> > add tests to exercise the code.
> >
>
> Nit: For someone coming in to review the 4th version directly it would
> be really nice to see:
>
> 1. Summary of what the patch series is about.
> 2. Changes built over the last versions.
>
> I know all this information is already spread out over the previous
> versions, but would be nice to have it here (in every version rather).
>
Thanks for the suggestion. I will do this in my later patches.
* Re: [PATCH v4 2/8] builtin/refs: get worktrees without reading head information
2025-02-14 9:19 ` Karthik Nayak
@ 2025-02-14 12:20 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-14 12:20 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, Patrick Steinhardt, Junio C Hamano, Michael Haggerty
On Fri, Feb 14, 2025 at 01:19:53AM -0800, Karthik Nayak wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > In "packed-backend.c", there are some functions such as "create_snapshot"
> > and "next_record" which would check the correctness of the content of
> > the "packed-refs" file. When anything is bad, the program will die.
> >
> > It may seem that we have nothing relevant to above feature, because we
> > are going to read and parse the raw "packed-refs" file without creating
> > the snapshot and using the ref iterator to check the consistency.
> >
> > However, when using "get_worktrees" in "builtin/refs", we would parse
> > the "HEAD" information. If the referent of the "HEAD" is inside the
> > "packed-refs", we will call "create_snapshot" function to parse the
> > "packed-refs" to get the information. No matter whether the entry of
> > "HEAD" in "packed-refs" is correct, "create_snapshot" would call
> > "verify_buffer_safe" to check whether there is a newline in the last
> > line of the file. If not, the program will die.
> >
>
> Nit: while the second paragraph above makes sense in the context of what
> we're trying to achieve in this patch series. It doesn't make much sense
> for this patch in isolation. Perhaps we want to give some more context
> around what we're trying to solve for in the upcoming patches and hence
> how it hinders that.
>
Indeed, I think we should add such a paragraph. We need to give the
reader context about the motivation.
> > Although this behavior has no harm for the program, it will
> > short-circuit the program. When the users execute "git refs verify" or
> > "git fsck", we should avoid reading the head information, which may
> > execute the read operation in packed backend with stricter checks to die
> > the program. Instead, we should continue to check other parts of the
> > "packed-refs" file completely.
> >
> > Fortunately, in 465a22b338 (worktree: skip reading HEAD when repairing
> > worktrees, 2023-12-29), we have introduced a function
> > "get_worktrees_internal" which allows us to get worktrees without
> > reading head information.
> >
> > Create a new exposed function "get_worktrees_without_reading_head", then
> > replace the "get_worktrees" in "builtin/refs" with the new created
> > function.
> >
> > Mentored-by: Patrick Steinhardt <ps@pks.im>
> > Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> > Signed-off-by: shejialuo <shejialuo@gmail.com>
> > ---
> > builtin/refs.c | 2 +-
> > worktree.c | 5 +++++
> > worktree.h | 6 ++++++
> > 3 files changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/builtin/refs.c b/builtin/refs.c
> > index a29f195834..55ff5dae11 100644
> > --- a/builtin/refs.c
> > +++ b/builtin/refs.c
> > @@ -88,7 +88,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix,
> > git_config(git_fsck_config, &fsck_refs_options);
> > prepare_repo_settings(the_repository);
> >
> > - worktrees = get_worktrees();
> > + worktrees = get_worktrees_without_reading_head();
> > for (size_t i = 0; worktrees[i]; i++)
> > ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
> > &fsck_refs_options, worktrees[i]);
> > diff --git a/worktree.c b/worktree.c
> > index 248bbb39d4..89b7d86cef 100644
> > --- a/worktree.c
> > +++ b/worktree.c
> > @@ -175,6 +175,11 @@ struct worktree **get_worktrees(void)
> > return get_worktrees_internal(0);
> > }
> >
> > +struct worktree **get_worktrees_without_reading_head(void)
> > +{
> > + return get_worktrees_internal(1);
> > +}
> > +
> > const char *get_worktree_git_dir(const struct worktree *wt)
> > {
> > if (!wt)
> > diff --git a/worktree.h b/worktree.h
> > index 38145df80f..1ba4a161a0 100644
> > --- a/worktree.h
> > +++ b/worktree.h
> > @@ -30,6 +30,12 @@ struct worktree {
> > */
> > struct worktree **get_worktrees(void);
> >
> > +/*
> > + * Like `get_worktrees`, but does not read HEAD. This is useful when checking
> > + * the consistency, as reading HEAD may not be necessary.
>
> Checking what consistency? We should be a bit more verbose here. You can
> mention that skipping HEAD allows to get the worktree without worrying
> about failures pertaining to parsing the HEAD ref.
>
Good idea, I will improve this in the next version.
> > + */
> > +struct worktree **get_worktrees_without_reading_head(void);
> > +
> > /*
> > * Returns 1 if linked worktrees exist, 0 otherwise.
> > */
> > --
> > 2.48.1
* Re: [PATCH v4 3/8] packed-backend: check whether the "packed-refs" is regular file
2025-02-14 9:50 ` Karthik Nayak
@ 2025-02-14 12:37 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-14 12:37 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, Patrick Steinhardt, Junio C Hamano, Michael Haggerty
On Fri, Feb 14, 2025 at 01:50:26AM -0800, Karthik Nayak wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > Although "git-fsck(1)" and "packed-backend.c" will check some
> > consistency and correctness of "packed-refs" file, they never check the
>
> Because you say 'some' here, it made me more curious. Could you state
> exactly what checks are being done here?
>
Well, I don't think we need to elaborate on this right now, for the
following two reasons:
1. We will explain this in the later patches.
2. Here I just want to emphasize that it does not check the filetype.
> > filetype of the "packed-refs". The user should always use "git
> > pack-refs" command to create the raw regular "packed-refs" file, so we
> > need to explicitly check this in "git refs verify".
> >
>
> Not sure I understand how the start of this last sentence correlates to
> the end of it. Is the intention to say that we want to explicitly check
> the filetype to ensure that the 'packed-refs' file was only created via
> 'git pack-refs'? If so, perhaps:
>
> Verify that the 'packed-refs' file has the expected filetype,
> confirming it was created by 'git pack-refs'.
>
Thanks for the suggestion, I will improve this in the next version.
> > We could use "open_nofollow" wrapper to open the raw "packed-refs" file.
> > If the returned "fd" value is less than 0, we could check whether the
> > "errno" is "ELOOP" to report an error to the user.
> >
> > Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
> > the user if "packed-refs" is not a regular file.
> >
> > Mentored-by: Patrick Steinhardt <ps@pks.im>
> > Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> > Signed-off-by: shejialuo <shejialuo@gmail.com>
> > ---
> > refs/packed-backend.c | 39 +++++++++++++++++++++++++++++++++++----
> > t/t0602-reffiles-fsck.sh | 22 ++++++++++++++++++++++
> > 2 files changed, 57 insertions(+), 4 deletions(-)
> >
> > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > index a7b6f74b6e..6401cecd5f 100644
> > --- a/refs/packed-backend.c
> > +++ b/refs/packed-backend.c
> > @@ -4,6 +4,7 @@
> > #include "../git-compat-util.h"
> > #include "../config.h"
> > #include "../dir.h"
> > +#include "../fsck.h"
> > #include "../gettext.h"
> > #include "../hash.h"
> > #include "../hex.h"
> > @@ -1748,15 +1749,45 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> > return empty_ref_iterator_begin();
> > }
> >
> > -static int packed_fsck(struct ref_store *ref_store UNUSED,
> > - struct fsck_options *o UNUSED,
> > +static int packed_fsck(struct ref_store *ref_store,
> > + struct fsck_options *o,
> > struct worktree *wt)
> > {
> > + struct packed_ref_store *refs = packed_downcast(ref_store,
> > + REF_STORE_READ, "fsck");
> > + int ret = 0;
> > + int fd;
> >
> > if (!is_main_worktree(wt))
> > - return 0;
> > + goto cleanup;
> >
> > - return 0;
> > + if (o->verbose)
> > + fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
> > +
> > + fd = open_nofollow(refs->path, O_RDONLY);
> > + if (fd < 0) {
> > + /*
> > + * If the packed-refs file doesn't exist, there's nothing
> > + * to check.
> > + */
> > + if (errno == ENOENT)
> > + goto cleanup;
> > +
> > + if (errno == ELOOP) {
> > + struct fsck_ref_report report = { 0 };
> > + report.path = "packed-refs";
> > + ret = fsck_report_ref(o, &report,
> > + FSCK_MSG_BAD_REF_FILETYPE,
> > + "not a regular file");
> > + goto cleanup;
> > + }
> > +
> > + ret = error_errno(_("unable to open %s"), refs->path);
> > + goto cleanup;
>
> The paragraph in the commit message:
>
> Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
> the user if "packed-refs" is not a regular file.
>
> Gave me the indication that any error would be reported via
> 'fsck_report_ref()', but it seems like we are only reporting for
> symbolic links. Why is that being singled out?
>
IIRC, Patrick told me in the first version that if I first stat the
file and then use `strbuf_read_file` to read the content, there is a
corner case where the file could be converted into a symlink between
the `stat` and the read.
So I use `open_nofollow` to avoid this situation. (Actually, this cannot
be fully avoided, because on Windows we would first stat the file and
then open it, as there is no "O_NOFOLLOW" flag on Windows.)
I will find a solution for this in the next version.
> > + }
> > +
> > +cleanup:
> > + return ret;
> > }
> >
> > struct ref_storage_be refs_be_packed = {
> > diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> > index cf7a202d0d..42c8d4ca1e 100755
> > --- a/t/t0602-reffiles-fsck.sh
> > +++ b/t/t0602-reffiles-fsck.sh
> > @@ -617,4 +617,26 @@ test_expect_success 'ref content checks should work with worktrees' '
> > )
> > '
> >
> > +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
> > + test_when_finished "rm -rf repo" &&
> > + git init repo &&
> > + (
> > + cd repo &&
> > + test_commit default &&
> > + git branch branch-1 &&
> > + git branch branch-2 &&
> > + git branch branch-3 &&
> > + git pack-refs --all &&
> > +
> > + mv .git/packed-refs .git/packed-refs-back &&
> > + ln -sf packed-refs-bak .git/packed-refs &&
>
> This still doesn't make sense to me. 'packed-refs-bak' doesn't exist, is
> the intention to symlink '.git/packed-refs' -> something which doesn't
> exist?
>
> In that case why even make the effort to build a packed-refs file, could
> we simply do 'ln -sf packed-refs-bak .git/packed-refs' in an empty repo?
>
You are correct, this is not my intention. If "packed-refs" is a
symlink that points to a file which we can successfully parse, current
Git won't complain. So my motivation here is to imitate this situation.
> If not, then 'packed-refs-bak' is definitely a typo and needs to be made
> 'packed-refs-back' which would go in hand with how we setup the test...
>
Thanks for noticing this problem. I mistyped "packed-refs-back" as
"packed-refs-bak".
Jialuo
* Re: [PATCH v4 4/8] packed-backend: add "packed-refs" header consistency check
2025-02-14 10:30 ` Karthik Nayak
@ 2025-02-14 12:43 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-14 12:43 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, Patrick Steinhardt, Junio C Hamano, Michael Haggerty
On Fri, Feb 14, 2025 at 02:30:45AM -0800, Karthik Nayak wrote:
> shejialuo <shejialuo@gmail.com> writes:
[snip]
> > 2. If the header content does not start with "# pack-refs with: ", we
> > should report an error just like what "create_snapshot" does. So,
> > create a new fsck message "badPackedRefHeader(ERROR)" for this.
> > 3. If the header content is not the same as the constant string
> > "PACKED_REFS_HEADER". This is expected because we make it extensible
> > intentionally. So, there is no need to report.
>
> Do you think it's worthwhile adding a warning/info here? This would
> allow users to re-run 'git pack-refs' to ensure that they have a more
> up-to date version of 'packed-refs'.
>
I somewhat agree with you here, but Junio is worried about
compatibility. See [1] for that discussion:
[1] https://lore.kernel.org/git/xmqq1pwkdt7r.fsf@gitster.g/
[snip]
> > +static int packed_fsck_ref_content(struct fsck_options *o,
> > + const char *start, const char *eof)
> > +{
> > + unsigned long line_number = 1;
> > + const char *eol;
> > + int ret = 0;
> > +
> > + ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
> > + if (*start == '#') {
> > + ret |= packed_fsck_ref_header(o, start, eol);
> > +
> > + start = eol + 1;
> > + line_number++;
>
> Why do we increment `line_number` here? There is no usage beyond this.
>
We will use this variable when iterating over the following lines (the
ref entries). It will be used in the next patch.
> > + }
> > +
> > + return ret;
> > +}
> > +
> > static int packed_fsck(struct ref_store *ref_store,
> > struct fsck_options *o,
> > struct worktree *wt)
> > {
> > struct packed_ref_store *refs = packed_downcast(ref_store,
> > REF_STORE_READ, "fsck");
> > + struct strbuf packed_ref_content = STRBUF_INIT;
> > int ret = 0;
> > int fd;
> >
> > @@ -1786,7 +1850,16 @@ static int packed_fsck(struct ref_store *ref_store,
> > goto cleanup;
> > }
> >
> > + if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
> > + ret = error_errno(_("unable to read %s"), refs->path);
> > + goto cleanup;
> > + }
> > +
>
> So we want to parse the whole ref content to a buffer, wonder if it
> makes more sense to use `strbuf_read_line()` here instead. But let's
> carry on.
>
We could use `strbuf_read_line`, but I don't want to do this. My check
logic is the same as the parse logic ("create_snapshot" and "next_record"),
and I want to keep the two nearly the same. Maybe one day we can
refactor the code so that parsing and checking share the same code, but
for now this is difficult.
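For illustration only, the buffer-based line splitting described above
could look roughly like the following. This is a minimal standalone
sketch, not the code from the patch: `next_line` and `walk_lines` are
hypothetical names, and it assumes the whole file has already been read
into memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the patch's packed_fsck_ref_next_line():
 * find the end of the line that begins at `start` inside [start, eof).
 * Returns 0 and sets *eol to the '\n' when the line is terminated,
 * or returns -1 and sets *eol to `eof` when the newline is missing.
 */
static int next_line(const char *start, const char *eof, const char **eol)
{
	assert(start <= eof);
	*eol = memchr(start, '\n', (size_t)(eof - start));
	if (!*eol) {
		*eol = eof;
		return -1;
	}
	return 0;
}

/*
 * Walk every line of the buffer. Returns the number of lines seen and
 * stores in *unterminated the 1-based number of a final line that lacks
 * a newline (0 when every line is terminated).
 */
static unsigned long walk_lines(const char *buf, size_t len,
				unsigned long *unterminated)
{
	const char *start = buf, *eof = buf + len, *eol;
	unsigned long line_number = 0;

	*unterminated = 0;
	while (start < eof) {
		line_number++;
		if (next_line(start, eof, &eol) < 0) {
			*unterminated = line_number;
			break;
		}
		start = eol + 1;
	}
	return line_number;
}
```

Keeping the line counter next to the pointer walk is what lets a later
check name the offending line without storing any per-line state.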
Thanks,
Jialuo
* Re: [PATCH v4 4/8] packed-backend: add "packed-refs" header consistency check
2025-02-14 4:52 ` [PATCH v4 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
2025-02-14 10:30 ` Karthik Nayak
@ 2025-02-14 14:01 ` Junio C Hamano
1 sibling, 0 replies; 168+ messages in thread
From: Junio C Hamano @ 2025-02-14 14:01 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> In "packed-backend.c::create_snapshot", if there is a header (the line
> which starts with '#'), we will check whether the line starts with "#
> pack-refs with:". Before we port this check into "packed_fsck", let's
> fix "create_snapshot" to check the prefix "# pack-refs with: " instead
> of "# pack-refs with:", because we always write a single
> trailing space after the colon.
A more important reason to be more strict is not "we will always
write", but "we HAVE ALWAYS written", I think.
> However, we need to consider other situations and discuss whether we
> need to add checks.
>
> 1. If the header does not exist, we should not report an error to the
> user. This is because in older Git version, we never write header in
> the "packed-refs" file. Also, we do allow no header in "packed-refs"
> in runtime.
Yes.
> 2. If the header content does not start with "# pack-refs with: ", we
> should report an error just like what "create_snapshot" does. So,
> create a new fsck message "badPackedRefHeader(ERROR)" for this.
OK.
> 3. If the header content is not the same as the constant string
> "PACKED_REFS_HEADER". This is expected because we make it extensible
> intentionally. So, there is no need to report.
Nor there is any need to check for literal equality with the
constant string. We may want to split the traits that are recorded
on the "with:" line and see if there are ones that we do not
recognise if only for curiosity, but because create_snapshot(), which
is the only run-time consumer of this information, only uses the
ones it recognises while ignoring everything else, presence of an
unknown trait is not an error- or even warning-worthy event. Unless
we are curious and want to emit "info" level message, there is not
much point in checking the remainder of the header.
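As a rough illustration of the two points above, the strict prefix check
and the tolerance for unknown traits could be sketched as follows. This
is a standalone sketch under assumptions, not the code in
"packed-backend.c": `header_has_prefix` and `count_unknown_traits` are
made-up names, and the known-trait list is simply the three traits that
appear in the test headers of this series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The prefix, including the single trailing space after the colon. */
static const char header_prefix[] = "# pack-refs with: ";

/*
 * Hypothetical check (not the patch's packed_fsck_ref_header()):
 * returns 1 when the line [start, eol) begins with the prefix above,
 * 0 otherwise.
 */
static int header_has_prefix(const char *start, const char *eol)
{
	size_t prefix_len = sizeof(header_prefix) - 1;

	assert(start <= eol);
	return (size_t)(eol - start) >= prefix_len &&
	       !memcmp(start, header_prefix, prefix_len);
}

/*
 * Count the space-separated traits that are not in the known list.
 * Unknown traits are only counted, never treated as errors. Returns -1
 * when the line does not carry the expected prefix at all.
 */
static int count_unknown_traits(const char *start, const char *eol)
{
	static const char *known[] = { "peeled", "fully-peeled", "sorted" };
	const char *p;
	int unknown = 0;

	if (!header_has_prefix(start, eol))
		return -1;

	p = start + sizeof(header_prefix) - 1;
	while (p < eol) {
		const char *end;
		size_t len, i;
		int found = 0;

		if (*p == ' ') {
			p++;
			continue;
		}
		end = memchr(p, ' ', (size_t)(eol - p));
		if (!end)
			end = eol;
		len = (size_t)(end - p);
		for (i = 0; i < sizeof(known) / sizeof(known[0]); i++)
			if (strlen(known[i]) == len && !memcmp(p, known[i], len))
				found = 1;
		if (!found)
			unknown++;
		p = end;
	}
	return unknown;
}
```

A caller that only wants the error case would stop after
`header_has_prefix`; the trait count is useful only if an "info" level
message is ever wanted.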
* Re: [PATCH 04/10] packed-backend: add "packed-refs" header consistency check
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-17 14:23 ` shejialuo
@ 2025-02-17 13:16 ` shejialuo
1 sibling, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-17 13:16 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Thu, Jan 16, 2025 at 02:57:37PM +0100, Patrick Steinhardt wrote:
[snip]
> > @@ -1779,7 +1867,24 @@ static int packed_fsck(struct ref_store *ref_store,
> > goto cleanup;
> > }
> >
> > + if (strbuf_read_file(&packed_ref_content, refs->path, 0) < 0) {
> > + /*
> > + * Although we have checked that the file exists, there is a possibility
> > + * that it has been removed between the lstat() and the read attempt by
> > + * another process. In that case, we should not report an error.
> > + */
> > + if (errno == ENOENT)
> > + goto cleanup;
>
> Unlikely, but good to guard us against that condition regardless. It's
> still not entirely race-free though because the file could meanwhile
> have changed into a symlink, and we wouldn't notice now. We could fix
> that by using open(O_NOFOLLOW), fstat the returned file descriptor and
> then use `strbuf_read()` to slurp in the file.
>
I have looked back at the original discussion. I will follow this
advice, which avoids the race.
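A minimal sketch of the open-then-fstat sequence suggested above, using
plain POSIX calls rather than Git's `open_nofollow` and `strbuf_read`
wrappers. The function and enum names are hypothetical. It assumes a
platform where open(2) with O_NOFOLLOW fails with ELOOP on a symlink (as
Linux does); some systems report a different errno.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

enum open_result {
	OPEN_OK = 0,
	OPEN_MISSING,		/* ENOENT: nothing to check */
	OPEN_NOT_REGULAR,	/* symlink or other non-regular file */
	OPEN_ERROR		/* any other failure, see errno */
};

/*
 * Open `path` without following a final symlink, then verify the type
 * on the descriptor itself. Because fstat() inspects the file that was
 * actually opened, a rename or replacement after the open cannot change
 * the answer. On OPEN_OK the caller owns *fd_out and must close it.
 */
static enum open_result open_regular_nofollow(const char *path, int *fd_out)
{
	struct stat st;
	int fd;

	assert(path && fd_out);
	fd = open(path, O_RDONLY | O_NOFOLLOW);
	if (fd < 0) {
		if (errno == ENOENT)
			return OPEN_MISSING;
		if (errno == ELOOP)
			return OPEN_NOT_REGULAR;
		return OPEN_ERROR;
	}
	if (fstat(fd, &st) < 0) {
		close(fd);
		return OPEN_ERROR;
	}
	if (!S_ISREG(st.st_mode)) {
		close(fd);
		return OPEN_NOT_REGULAR;
	}
	*fd_out = fd;
	return OPEN_OK;
}
```

The content would then be read from the returned descriptor, so the
type check and the read refer to the same file.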
Thanks,
Jialuo
* [PATCH v5 0/8] add more ref consistency checks
2025-02-14 4:50 ` [PATCH v4 0/8] add more ref consistency checks shejialuo
` (8 preceding siblings ...)
2025-02-14 9:04 ` [PATCH v4 0/8] add more ref consistency checks Karthik Nayak
@ 2025-02-17 15:25 ` shejialuo
2025-02-17 15:27 ` [PATCH v5 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
` (9 more replies)
9 siblings, 10 replies; 168+ messages in thread
From: shejialuo @ 2025-02-17 15:25 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Hi All:
This version makes the following changes:
1. [PATCH v5 2/8]: enhance the comment as suggested by Karthik.
2. [PATCH v5 3/8]: use lstat to check whether the filetype of
"packed-refs" is a regular file instead of using `open_nofollow`
to check. Also enhance the commit message as suggested by Karthik.
3. [PATCH v5 4/8]: move "open_nofollow" from the original [PATCH v4 3/8] to
this patch.
Also, I rebased because all *.txt files have been renamed to *.adoc.
I don't know whether this is a real conflict, but I decided to rebase
to make Junio's life easier.
Thanks,
Jialuo
---
This series mainly does the following things:
1. Fix subshell issues
2. Add ref checks for packed-backend.
1. Check whether the filetype of "packed-refs" is correct.
2. Check whether the syntax of "packed-refs" is correct by using the
rules from "packed-backend.c::create_snapshot" and
"packed-backend.c::next_record".
3. Check whether the pointed object exists and whether the
"packed-refs" file is sorted.
3. Call "git refs verify" for "git-fsck(1)".
shejialuo (8):
t0602: use subshell to ensure working directory unchanged
builtin/refs: get worktrees without reading head information
packed-backend: check whether the "packed-refs" is regular file
packed-backend: add "packed-refs" header consistency check
packed-backend: check whether the refname contains NUL characters
packed-backend: add "packed-refs" entry consistency check
packed-backend: check whether the "packed-refs" is sorted
builtin/fsck: add `git refs verify` child process
Documentation/fsck-msgids.adoc | 14 +
Documentation/git-fsck.adoc | 7 +-
builtin/fsck.c | 33 +-
builtin/refs.c | 2 +-
fsck.h | 4 +
refs/packed-backend.c | 369 +++++++++-
t/t0602-reffiles-fsck.sh | 1205 +++++++++++++++++++-------------
worktree.c | 5 +
worktree.h | 7 +
9 files changed, 1161 insertions(+), 485 deletions(-)
Range-diff against v4:
1: 20889b7b18 = 1: b3952d80a2 t0602: use subshell to ensure working directory unchanged
2: 9d7780e953 ! 2: 3695586f58 builtin/refs: get worktrees without reading head information
@@ worktree.h: struct worktree {
struct worktree **get_worktrees(void);
+/*
-+ * Like `get_worktrees`, but does not read HEAD. This is useful when checking
-+ * the consistency, as reading HEAD may not be necessary.
++ * Like `get_worktrees`, but does not read HEAD. Skip reading HEAD allows to
++ * get the worktree without worrying about failures pertaining to parsing
++ * the HEAD ref. This is useful when we want to check the ref db consistency.
+ */
+struct worktree **get_worktrees_without_reading_head(void);
+
3: 44d26f6440 ! 3: cbaae00e8b packed-backend: check whether the "packed-refs" is regular file
@@ Commit message
Although "git-fsck(1)" and "packed-backend.c" will check some
consistency and correctness of "packed-refs" file, they never check the
- filetype of the "packed-refs". The user should always use "git
- pack-refs" command to create the raw regular "packed-refs" file, so we
- need to explicitly check this in "git refs verify".
+ filetype of the "packed-refs". Let's verify that the "packed-refs" has
+ the expected filetype, confirming it is created by "git pack-refs"
+ command.
- We could use "open_nofollow" wrapper to open the raw "packed-refs" file.
- If the returned "fd" value is less than 0, we could check whether the
- "errno" is "ELOOP" to report an error to the user.
+ Use "lstat" to check the file mode. If we cannot check the file status
+ due to there is no such file this is OK because there is a possibility
+ that there is no "packed-refs" in the repo.
Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
the user if "packed-refs" is not a regular file.
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
{
+ struct packed_ref_store *refs = packed_downcast(ref_store,
+ REF_STORE_READ, "fsck");
++ struct stat st;
+ int ret = 0;
-+ int fd;
if (!is_main_worktree(wt))
- return 0;
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
+ if (o->verbose)
+ fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
+
-+ fd = open_nofollow(refs->path, O_RDONLY);
-+ if (fd < 0) {
++ if (lstat(refs->path, &st) < 0) {
+ /*
+ * If the packed-refs file doesn't exist, there's nothing
+ * to check.
+ */
+ if (errno == ENOENT)
+ goto cleanup;
++ ret = error_errno(_("unable to stat %s"), refs->path);
++ goto cleanup;
++ }
+
-+ if (errno == ELOOP) {
-+ struct fsck_ref_report report = { 0 };
-+ report.path = "packed-refs";
-+ ret = fsck_report_ref(o, &report,
-+ FSCK_MSG_BAD_REF_FILETYPE,
-+ "not a regular file");
-+ goto cleanup;
-+ }
-+
-+ ret = error_errno(_("unable to open %s"), refs->path);
++ if (!S_ISREG(st.st_mode)) {
++ struct fsck_ref_report report = { 0 };
++ report.path = "packed-refs";
++ ret = fsck_report_ref(o, &report,
++ FSCK_MSG_BAD_REF_FILETYPE,
++ "not a regular file");
+ goto cleanup;
+ }
+
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'ref content checks should work wi
+ git pack-refs --all &&
+
+ mv .git/packed-refs .git/packed-refs-back &&
-+ ln -sf packed-refs-bak .git/packed-refs &&
++ ln -sf packed-refs-back .git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs: badRefFiletype: not a regular file
4: 976c5baba0 ! 4: b9ce8734ac packed-backend: add "packed-refs" header consistency check
@@ Commit message
create a new fsck message "badPackedRefHeader(ERROR)" for this.
3. If the header content is not the same as the constant string
"PACKED_REFS_HEADER". This is expected because we make it extensible
- intentionally. So, there is no need to report.
+ intentionally and runtime "create_snapshot" won't complain about
+ unknown traits. In order to align with the runtime behavior. There is
+ no need to report.
As we have analyzed, we only need to check the case 2 in the above. In
- order to do this, read the "packed-refs" file via "strbuf_read". Like
+ order to do this, use "open_nofollow" function to get the file
+ descriptor and then read the "packed-refs" file via "strbuf_read". Like
what "create_snapshot" and other functions do, we could split the line
by finding the next newline in the buffer. When we cannot find a
newline, we could report an error.
@@ Commit message
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
- ## Documentation/fsck-msgids.txt ##
+ ## Documentation/fsck-msgids.adoc ##
@@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
struct packed_ref_store *refs = packed_downcast(ref_store,
REF_STORE_READ, "fsck");
+ struct strbuf packed_ref_content = STRBUF_INIT;
+ struct stat st;
++ int fd;
int ret = 0;
- int fd;
+ if (!is_main_worktree(wt))
@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
++ /*
++ * There is a chance that "packed-refs" file is removed or converted to
++ * a symlink after filetype check and before open. So we need to avoid
++ * this race condition by opening the file.
++ */
++ fd = open_nofollow(refs->path, O_RDONLY);
++ if (fd < 0) {
++ if (errno == ENOENT)
++ goto cleanup;
++
++ if (errno == ELOOP) {
++ struct fsck_ref_report report = { 0 };
++ report.path = "packed-refs";
++ ret = fsck_report_ref(o, &report,
++ FSCK_MSG_BAD_REF_FILETYPE,
++ "not a regular file");
++ goto cleanup;
++ }
++ }
++
+ if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
+ ret = error_errno(_("unable to read %s"), refs->path);
+ goto cleanup;
5: b66f142d7f = 5: 9f638b3adf packed-backend: check whether the refname contains NUL characters
6: f68028e171 ! 6: 2c5395bdd0 packed-backend: add "packed-refs" entry consistency check
@@ Commit message
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
- ## Documentation/fsck-msgids.txt ##
+ ## Documentation/fsck-msgids.adoc ##
@@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
@@ refs/packed-backend.c: static int packed_fsck_ref_header(struct fsck_options *o,
+ (int)(eol - p), p);
+ goto cleanup;
+ }
++
+cleanup:
+ strbuf_release(&packed_entry);
+ return ret;
7: 4a7adf293f ! 7: 648404c60d packed-backend: check whether the "packed-refs" is sorted
@@ Commit message
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
- ## Documentation/fsck-msgids.txt ##
+ ## Documentation/fsck-msgids.adoc ##
@@
(ERROR) The "packed-refs" file contains an entry that is
not terminated by a newline.
@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
REF_STORE_READ, "fsck");
struct strbuf packed_ref_content = STRBUF_INIT;
+ unsigned int sorted = 0;
+ struct stat st;
+- int fd;
int ret = 0;
- int fd;
++ int fd;
+ if (!is_main_worktree(wt))
+ goto cleanup;
@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
8: 2dd3437478 ! 8: 4dbbacf44b builtin/fsck: add `git refs verify` child process
@@ Commit message
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
- ## Documentation/git-fsck.txt ##
-@@ Documentation/git-fsck.txt: SYNOPSIS
+ ## Documentation/git-fsck.adoc ##
+@@ Documentation/git-fsck.adoc: SYNOPSIS
'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
[--[no-]full] [--strict] [--verbose] [--lost-found]
[--[no-]dangling] [--[no-]progress] [--connectivity-only]
@@ Documentation/git-fsck.txt: SYNOPSIS
DESCRIPTION
-----------
-@@ Documentation/git-fsck.txt: care about this output and want to speed it up further.
+@@ Documentation/git-fsck.adoc: care about this output and want to speed it up further.
progress status even if the standard error stream is not
directed to a terminal.
--
2.48.1
* [PATCH v5 1/8] t0602: use subshell to ensure working directory unchanged
2025-02-17 15:25 ` [PATCH v5 " shejialuo
@ 2025-02-17 15:27 ` shejialuo
2025-02-17 15:27 ` [PATCH v5 2/8] builtin/refs: get worktrees without reading head information shejialuo
` (8 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-17 15:27 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In every test, we execute "cd repo" at the beginning but never execute
"cd .." to restore the working directory. Simply adding "cd .." at the
end of each test is not a good idea either: if any command fails between
"cd repo" and "cd ..", the "cd .." is never reached and the working
directory is not restored.
Let's use a subshell to ensure that the current working directory is
always restored to the correct path.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
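As an illustration only (not part of the patch), here is a minimal
standalone sketch of why the subshell helps. The scratch directory and
the variable names "scratch", "before", "after" and "status" are
hypothetical and do not appear in t0602. A failing command inside the
subshell aborts the "&&" chain, yet the caller's working directory is
unaffected because "cd" only changed the directory of the subshell.

```shell
#!/bin/sh
# Hypothetical scratch area; these names are not taken from t0602.
scratch=$(mktemp -d)
mkdir "$scratch/repo"
cd "$scratch"
before=$(pwd)

# The parentheses start a subshell with its own working directory.
# "false" simulates a failing test step, so the echo is never run.
(
	cd repo &&
	false &&
	echo "not reached"
)
status=$?
after=$(pwd)

# The subshell reports the failure through its exit status, but the
# "cd repo" inside it did not leak into the parent shell.
echo "subshell exit status: $status"
if test "$after" = "$before"
then
	echo "working directory unchanged"
fi

# Leave the scratch directory before removing it.
cd /
rm -rf "$scratch"
```

Without the parentheses, the shell would stay inside "repo" after the
failure, and whatever runs next would start from the wrong directory.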
t/t0602-reffiles-fsck.sh | 967 ++++++++++++++++++++-------------------
1 file changed, 494 insertions(+), 473 deletions(-)
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index d4a08b823b..cf7a202d0d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -14,222 +14,229 @@ test_expect_success 'ref name should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
-
- git commit --allow-empty -m initial &&
- git checkout -b default-branch &&
- git tag default-tag &&
- git tag multi_hierarchy/default-tag &&
-
- cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
- git refs verify 2>err &&
- test_must_be_empty err &&
- rm $branch_dir_prefix/@ &&
-
- cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
- git refs verify 2>err &&
- rm $tag_dir_prefix/tag-1.lock &&
- test_must_be_empty err &&
-
- cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/.lock: badRefName: invalid refname format
- EOF
- rm $tag_dir_prefix/.lock &&
- test_cmp expect err &&
-
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/$refname: badRefName: invalid refname format
- EOF
- rm "$branch_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done &&
+ (
+ cd repo &&
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/$refname: badRefName: invalid refname format
- EOF
- rm "$tag_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done &&
+ git commit --allow-empty -m initial &&
+ git checkout -b default-branch &&
+ git tag default-tag &&
+ git tag multi_hierarchy/default-tag &&
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
- EOF
- rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
- test_cmp expect err || return 1
- done &&
-
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- mkdir "$branch_dir_prefix/$refname" &&
- cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+ rm $branch_dir_prefix/@ &&
+
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
+ git refs verify 2>err &&
+ rm $tag_dir_prefix/tag-1.lock &&
+ test_must_be_empty err &&
+
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ error: refs/tags/.lock: badRefName: invalid refname format
EOF
- rm -r "$branch_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done
+ rm $tag_dir_prefix/.lock &&
+ test_cmp expect err &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname: badRefName: invalid refname format
+ EOF
+ rm "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ mkdir "$branch_dir_prefix/$refname" &&
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ EOF
+ rm -r "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success 'ref name check should be adapted into fsck messages' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- git commit --allow-empty -m initial &&
- git checkout -b branch-1 &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- git -c fsck.badRefName=warn refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/.branch-1: badRefName: invalid refname format
- EOF
- rm $branch_dir_prefix/.branch-1 &&
- test_cmp expect err &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- git -c fsck.badRefName=ignore refs verify 2>err &&
- test_must_be_empty err
+ (
+ cd repo &&
+ git commit --allow-empty -m initial &&
+ git checkout -b branch-1 &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=warn refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/.branch-1: badRefName: invalid refname format
+ EOF
+ rm $branch_dir_prefix/.branch-1 &&
+ test_cmp expect err &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=ignore refs verify 2>err &&
+ test_must_be_empty err
+ )
'
test_expect_success 'ref name check should work for multiple worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
-
- cd repo &&
- test_commit initial &&
- git checkout -b branch-1 &&
- test_commit second &&
- git checkout -b branch-2 &&
- test_commit third &&
- git checkout -b branch-3 &&
- git worktree add ./worktree-1 branch-1 &&
- git worktree add ./worktree-2 branch-2 &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
- (
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-3
- ) &&
(
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-3
- ) &&
-
- cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
- cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
- error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err &&
-
- for worktree in "worktree-1" "worktree-2"
- do
+ cd repo &&
+ test_commit initial &&
+ git checkout -b branch-1 &&
+ test_commit second &&
+ git checkout -b branch-2 &&
+ test_commit third &&
+ git checkout -b branch-3 &&
+ git worktree add ./worktree-1 branch-1 &&
+ git worktree add ./worktree-2 branch-2 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
(
- cd $worktree &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
- error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err || return 1
- )
- done
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+
+ cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
+ cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err &&
+
+ for worktree in "worktree-1" "worktree-2"
+ do
+ (
+ cd $worktree &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err || return 1
+ )
+ done
+ )
'
test_expect_success 'regular ref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
- git refs verify 2>err &&
- test_must_be_empty err &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
- for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/branch-bad &&
- test_cmp expect err || return 1
- done &&
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
- for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/a/b/branch-bad &&
- test_cmp expect err || return 1
- done &&
-
- printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $branch_dir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- for trailing_content in " garbage" " more garbage"
- do
- printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
EOF
- rm $branch_dir_prefix/branch-garbage &&
- test_cmp expect err || return 1
- done &&
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+ printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
- '\''
- EOF
- rm $branch_dir_prefix/branch-garbage-special &&
- test_cmp expect err &&
- printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+ '\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err &&
+
+ printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
- garbage'\''
- EOF
- rm $branch_dir_prefix/branch-garbage-special &&
- test_cmp expect err
+ garbage'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err
+ )
'
test_expect_success 'regular ref content should be checked (aggregate)' '
@@ -237,99 +244,103 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- bad_content_1=$(git rev-parse main)x &&
- bad_content_2=xfsazqfxcadas &&
- bad_content_3=Xfsazqfxcadas &&
- printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
- printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
- printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
- printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
- printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
- error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
- error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
- warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ bad_content_1=$(git rev-parse main)x &&
+ bad_content_2=xfsazqfxcadas &&
+ bad_content_3=Xfsazqfxcadas &&
+ printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
+ printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
+ printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
+ error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
+ error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
'
test_expect_success 'textual symref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
+ do
+ printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
- for good_referent in "refs/heads/branch" "HEAD"
- do
- printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
- rm $branch_dir_prefix/branch-good &&
- test_must_be_empty err || return 1
- done &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
- do
- printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
- test_must_fail git refs verify 2>err &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-1 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
EOF
- rm $branch_dir_prefix/branch-bad &&
- test_cmp expect err || return 1
- done &&
-
- printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $branch_dir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-1 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-2 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-3 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-complicated &&
- test_cmp expect err
+ rm $branch_dir_prefix/a/b/branch-trailing-2 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-3 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-complicated &&
+ test_cmp expect err
+ )
'
test_expect_success 'textual symref content should be checked (aggregate)' '
@@ -337,32 +348,34 @@ test_expect_success 'textual symref content should be checked (aggregate)' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
- printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
- printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
- printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
- printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
- printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
- printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
- printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
- warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
'
test_expect_success 'the target of the textual symref should be checked' '
@@ -370,28 +383,30 @@ test_expect_success 'the target of the textual symref should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
- do
- printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
- git refs verify 2>err &&
- rm $branch_dir_prefix/branch-good &&
- test_must_be_empty err || return 1
- done &&
-
- for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
- do
- printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
- EOF
- rm $branch_dir_prefix/branch-bad-1 &&
- test_cmp expect err || return 1
- done
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
+ do
+ printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success SYMLINKS 'symlink symref content should be checked' '
@@ -399,201 +414,207 @@ test_expect_success SYMLINKS 'symlink symref content should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $branch_dir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
- EOF
- rm $branch_dir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
- error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
- EOF
- rm $branch_dir_prefix/branch-symbolic-bad &&
- test_cmp expect err &&
-
- ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
- error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
- EOF
- rm $tag_dir_prefix/tag-symbolic-1 &&
- test_cmp expect err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
+ error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-bad &&
+ test_cmp expect err &&
+
+ ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
+ error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
+ EOF
+ rm $tag_dir_prefix/tag-symbolic-1 &&
+ test_cmp expect err
+ )
'
test_expect_success SYMLINKS 'symlink symref content should be checked (worktree)' '
test_when_finished "rm -rf repo" &&
git init repo &&
- cd repo &&
- test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
- main_worktree_refdir_prefix=.git/refs/heads &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
(
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
- (
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
-
- ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $worktree1_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $worktree2_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $main_worktree_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
- EOF
- rm $worktree1_refdir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- for bad_referent_name in ".tag" "branch "
- do
- ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ main_worktree_refdir_prefix=.git/refs/heads &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree1_refdir_prefix/bad-symbolic &&
+ rm $worktree1_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree1_refdir_prefix/bad-symbolic &&
+ rm $worktree2_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree2_refdir_prefix/bad-symbolic &&
+ rm $main_worktree_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
EOF
- rm $worktree2_refdir_prefix/bad-symbolic &&
- test_cmp expect err || return 1
- done
+ rm $worktree1_refdir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ for bad_referent_name in ".tag" "branch "
+ do
+ ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
- cd repo &&
- test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
(
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
- (
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
- for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
- test_must_fail git refs verify 2>err &&
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ EOF
+ rm $worktree1_refdir_prefix/bad-branch-1 &&
+ test_cmp expect err || return 1
+ done &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ EOF
+ rm $worktree2_refdir_prefix/bad-branch-2 &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
EOF
- rm $worktree1_refdir_prefix/bad-branch-1 &&
- test_cmp expect err || return 1
- done &&
+ rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
- test_must_fail git refs verify 2>err &&
+ printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
EOF
- rm $worktree2_refdir_prefix/bad-branch-2 &&
- test_cmp expect err || return 1
- done &&
-
- printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $worktree1_refdir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
- EOF
- rm $worktree1_refdir_prefix/branch-garbage &&
- test_cmp expect err
+ rm $worktree1_refdir_prefix/branch-garbage &&
+ test_cmp expect err
+ )
'
test_done
--
2.48.1
* [PATCH v5 2/8] builtin/refs: get worktrees without reading head information
2025-02-17 15:25 ` [PATCH v5 " shejialuo
2025-02-17 15:27 ` [PATCH v5 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
@ 2025-02-17 15:27 ` shejialuo
2025-02-25 8:26 ` Patrick Steinhardt
2025-02-17 15:27 ` [PATCH v5 3/8] packed-backend: check whether the "packed-refs" is regular file shejialuo
` (7 subsequent siblings)
9 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-02-17 15:27 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c", functions such as "create_snapshot" and
"next_record" check the correctness of the content of the "packed-refs"
file. When anything is wrong, the program dies.
This may seem irrelevant to the fsck work, because we are going to read
and parse the raw "packed-refs" file ourselves, without creating a
snapshot or using the ref iterator, to check its consistency.
However, "get_worktrees" in "builtin/refs" parses the "HEAD"
information. If the referent of "HEAD" is stored in "packed-refs", we
call "create_snapshot" to parse the "packed-refs" file and look it up.
No matter whether the entry for "HEAD" in "packed-refs" is correct,
"create_snapshot" calls "verify_buffer_safe" to check whether the last
line of the file ends with a newline. If not, the program dies.
Although this behavior is harmless in itself, it short-circuits the
check. When users execute "git refs verify" or "git fsck", we should
avoid reading the HEAD information, because doing so may trigger a read
in the packed backend whose stricter checks make the program die.
Instead, we should go on to check the rest of the "packed-refs" file
completely.
Fortunately, 465a22b338 (worktree: skip reading HEAD when repairing
worktrees, 2023-12-29) introduced the function
"get_worktrees_internal", which allows us to get worktrees without
reading the HEAD information.
Create a new exposed function "get_worktrees_without_reading_head", then
replace "get_worktrees" in "builtin/refs" with the newly created
function.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
builtin/refs.c | 2 +-
worktree.c | 5 +++++
worktree.h | 7 +++++++
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/builtin/refs.c b/builtin/refs.c
index a29f195834..55ff5dae11 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -88,7 +88,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix,
git_config(git_fsck_config, &fsck_refs_options);
prepare_repo_settings(the_repository);
- worktrees = get_worktrees();
+ worktrees = get_worktrees_without_reading_head();
for (size_t i = 0; worktrees[i]; i++)
ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
&fsck_refs_options, worktrees[i]);
diff --git a/worktree.c b/worktree.c
index d4a68c9c23..d23482a746 100644
--- a/worktree.c
+++ b/worktree.c
@@ -198,6 +198,11 @@ struct worktree **get_worktrees(void)
return get_worktrees_internal(0);
}
+struct worktree **get_worktrees_without_reading_head(void)
+{
+ return get_worktrees_internal(1);
+}
+
const char *get_worktree_git_dir(const struct worktree *wt)
{
if (!wt)
diff --git a/worktree.h b/worktree.h
index 38145df80f..f7003a9c12 100644
--- a/worktree.h
+++ b/worktree.h
@@ -30,6 +30,13 @@ struct worktree {
*/
struct worktree **get_worktrees(void);
+/*
+ * Like `get_worktrees`, but does not read HEAD. Skipping HEAD allows us to
+ * get the worktrees without worrying about failures pertaining to parsing
+ * the HEAD ref. This is useful when we want to check the ref db consistency.
+ */
+struct worktree **get_worktrees_without_reading_head(void);
+
/*
* Returns 1 if linked worktrees exist, 0 otherwise.
*/
--
2.48.1
* [PATCH v5 3/8] packed-backend: check whether the "packed-refs" is regular file
2025-02-17 15:25 ` [PATCH v5 " shejialuo
2025-02-17 15:27 ` [PATCH v5 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-17 15:27 ` [PATCH v5 2/8] builtin/refs: get worktrees without reading head information shejialuo
@ 2025-02-17 15:27 ` shejialuo
2025-02-25 8:27 ` Patrick Steinhardt
2025-02-17 15:27 ` [PATCH v5 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
` (6 subsequent siblings)
9 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-02-17 15:27 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Although "git-fsck(1)" and "packed-backend.c" check some aspects of the
consistency and correctness of the "packed-refs" file, they never check
its filetype. Let's verify that "packed-refs" has the expected
filetype, i.e. the regular file that the "git pack-refs" command
creates.
Use "lstat" to check the file mode. If we cannot stat the file because
it does not exist, this is OK: a repo may legitimately have no
"packed-refs" file.
Reuse the "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the
error to the user if "packed-refs" is not a regular file.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
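For readers following along, the filetype check described in the commit
message above can be sketched outside of Git roughly as below. This is
only an illustration under stated assumptions, not the code in this
patch: "check_regular_file" and the "FILETYPE_*" result codes are
hypothetical names, and the real code reports through "fsck_report_ref"
instead of returning a code.

```c
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical result codes for this sketch; not names from the patch. */
enum filetype_check {
	FILETYPE_OK = 0,      /* exists and is a regular file */
	FILETYPE_MISSING = 1, /* no such file: nothing to check */
	FILETYPE_BAD = 2,     /* exists but is not a regular file */
	FILETYPE_ERROR = -1   /* lstat() failed for another reason */
};

static enum filetype_check check_regular_file(const char *path)
{
	struct stat st;

	/* lstat() does not follow symlinks, so a symlink shows up as one. */
	if (lstat(path, &st) < 0)
		return errno == ENOENT ? FILETYPE_MISSING : FILETYPE_ERROR;

	return S_ISREG(st.st_mode) ? FILETYPE_OK : FILETYPE_BAD;
}
```

A missing file is deliberately kept apart from a bad filetype, which
mirrors the "ENOENT is OK" reasoning in the commit message.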
refs/packed-backend.c | 37 +++++++++++++++++++++++++++++++++----
t/t0602-reffiles-fsck.sh | 22 ++++++++++++++++++++++
2 files changed, 55 insertions(+), 4 deletions(-)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index a7b6f74b6e..8140a31d07 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -4,6 +4,7 @@
#include "../git-compat-util.h"
#include "../config.h"
#include "../dir.h"
+#include "../fsck.h"
#include "../gettext.h"
#include "../hash.h"
#include "../hex.h"
@@ -1748,15 +1749,43 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
-static int packed_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED,
+static int packed_fsck(struct ref_store *ref_store,
+ struct fsck_options *o,
struct worktree *wt)
{
+ struct packed_ref_store *refs = packed_downcast(ref_store,
+ REF_STORE_READ, "fsck");
+ struct stat st;
+ int ret = 0;
if (!is_main_worktree(wt))
- return 0;
+ goto cleanup;
- return 0;
+ if (o->verbose)
+ fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
+
+ if (lstat(refs->path, &st) < 0) {
+ /*
+ * If the packed-refs file doesn't exist, there's nothing
+ * to check.
+ */
+ if (errno == ENOENT)
+ goto cleanup;
+ ret = error_errno(_("unable to stat %s"), refs->path);
+ goto cleanup;
+ }
+
+ if (!S_ISREG(st.st_mode)) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_FILETYPE,
+ "not a regular file");
+ goto cleanup;
+ }
+
+cleanup:
+ return ret;
}
struct ref_storage_be refs_be_packed = {
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index cf7a202d0d..e65ca341cd 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -617,4 +617,26 @@ test_expect_success 'ref content checks should work with worktrees' '
)
'
+test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git pack-refs --all &&
+
+ mv .git/packed-refs .git/packed-refs-back &&
+ ln -sf packed-refs-back .git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs: badRefFiletype: not a regular file
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v5 4/8] packed-backend: add "packed-refs" header consistency check
2025-02-17 15:25 ` [PATCH v5 " shejialuo
` (2 preceding siblings ...)
2025-02-17 15:27 ` [PATCH v5 3/8] packed-backend: check whether the "packed-refs" is regular file shejialuo
@ 2025-02-17 15:27 ` shejialuo
2025-02-25 8:27 ` Patrick Steinhardt
2025-02-17 15:27 ` [PATCH v5 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
` (5 subsequent siblings)
9 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-02-17 15:27 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c::create_snapshot", if there is a header (the line
which starts with '#'), we check whether the line starts with "#
pack-refs with:". Before we port this check into "packed_fsck", let's
fix "create_snapshot" to check the prefix "# pack-refs with: " instead
of "# pack-refs with:", because we always write a single space after
the colon.
However, we need to consider other situations and discuss whether we
need to add checks.
1. If the header does not exist, we should not report an error to the
user. This is because older Git versions never wrote a header in
the "packed-refs" file. Also, we do allow a missing header in
"packed-refs" at runtime.
2. If the header content does not start with "# pack-refs with: ", we
should report an error just like "create_snapshot" does. So,
create a new fsck message "badPackedRefHeader(ERROR)" for this.
3. If the header content is not the same as the constant string
"PACKED_REFS_HEADER", this is expected because we make it extensible
intentionally and "create_snapshot" won't complain about unknown
traits at runtime. To align with the runtime behavior, there is no
need to report.
As analyzed above, we only need to check case 2. In order to do this,
use the "open_nofollow" function to get the file descriptor and then
read the "packed-refs" file via "strbuf_read". Like "create_snapshot"
and other functions, we split lines by finding the next newline in the
buffer. When we cannot find a newline, we report an error.
So, create a function "packed_fsck_ref_next_line" to find the next
newline and, if there is no such newline, use
"packedRefEntryNotTerminated(ERROR)" to report an error to the user.
Then, parse the first line to apply the checks. Update the test to
exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
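The line splitting and the header rule from the commit message above
can be sketched in isolation as below. This is a simplified
illustration, not the patch itself: "find_line_end", "header_is_valid"
and "HEADER_PREFIX" are hypothetical names, and the sketch compares
against the line length explicitly instead of relying on a
NUL-terminated buffer as "starts_with" does.

```c
#include <stddef.h>
#include <string.h>

#define HEADER_PREFIX "# pack-refs with: "

/*
 * Return the end of the line that begins at `start`: the newline if
 * there is one before `eof`, otherwise `eof` itself. `*terminated`
 * records which of the two happened.
 */
static const char *find_line_end(const char *start, const char *eof,
				 int *terminated)
{
	const char *eol = memchr(start, '\n', (size_t)(eof - start));

	*terminated = eol != NULL;
	return eol ? eol : eof;
}

/*
 * Return 1 if the line [start, eol) is acceptable as a first line:
 * either it is not a header at all (case 1), or it is a header that
 * starts with the expected prefix. Unknown traits after the prefix
 * are accepted (case 3). Return 0 for a malformed header (case 2).
 */
static int header_is_valid(const char *start, const char *eol)
{
	size_t prefix_len = strlen(HEADER_PREFIX);
	size_t line_len = (size_t)(eol - start);

	if (line_len == 0 || *start != '#')
		return 1;
	return line_len >= prefix_len &&
	       !memcmp(start, HEADER_PREFIX, prefix_len);
}
```

An unterminated last line still yields a usable `eol`, so a caller can
report the missing newline and keep checking the line content.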
Documentation/fsck-msgids.adoc | 8 +++
fsck.h | 2 +
refs/packed-backend.c | 96 +++++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 52 ++++++++++++++++++
4 files changed, 157 insertions(+), 1 deletion(-)
diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index b14bc44ca4..11906f90fd 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -16,6 +16,10 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefHeader`::
+ (ERROR) The "packed-refs" file contains an invalid
+ header.
+
`badParentSha1`::
(ERROR) A commit object has a bad parent sha1.
@@ -176,6 +180,10 @@
`nullSha1`::
(WARN) Tree contains entries pointing to a null sha1.
+`packedRefEntryNotTerminated`::
+ (ERROR) The "packed-refs" file contains an entry that is
+ not terminated by a newline.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
diff --git a/fsck.h b/fsck.h
index a44c231a5f..67e3c97bc0 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
@@ -53,6 +54,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE, ERROR) \
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
+ FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 8140a31d07..09eb3886c3 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -694,7 +694,7 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs)
tmp = xmemdupz(snapshot->buf, eol - snapshot->buf);
- if (!skip_prefix(tmp, "# pack-refs with:", (const char **)&p))
+ if (!skip_prefix(tmp, "# pack-refs with: ", (const char **)&p))
die_invalid_line(refs->path,
snapshot->buf,
snapshot->eof - snapshot->buf);
@@ -1749,13 +1749,78 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
+static int packed_fsck_ref_next_line(struct fsck_options *o,
+ unsigned long line_number, const char *start,
+ const char *eof, const char **eol)
+{
+ int ret = 0;
+
+ *eol = memchr(start, '\n', eof - start);
+ if (!*eol) {
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
+ "'%.*s' is not terminated with a newline",
+ (int)(eof - start), start);
+
+ /*
+ * There is no newline but we still want to parse it to the end of
+ * the buffer.
+ */
+ *eol = eof;
+ strbuf_release(&packed_entry);
+ }
+
+ return ret;
+}
+
+static int packed_fsck_ref_header(struct fsck_options *o,
+ const char *start, const char *eol)
+{
+ if (!starts_with(start, "# pack-refs with: ")) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs.header";
+
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_HEADER,
+ "'%.*s' does not start with '# pack-refs with: '",
+ (int)(eol - start), start);
+ }
+
+ return 0;
+}
+
+static int packed_fsck_ref_content(struct fsck_options *o,
+ const char *start, const char *eof)
+{
+ unsigned long line_number = 1;
+ const char *eol;
+ int ret = 0;
+
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ if (*start == '#') {
+ ret |= packed_fsck_ref_header(o, start, eol);
+
+ start = eol + 1;
+ line_number++;
+ }
+
+ return ret;
+}
+
static int packed_fsck(struct ref_store *ref_store,
struct fsck_options *o,
struct worktree *wt)
{
struct packed_ref_store *refs = packed_downcast(ref_store,
REF_STORE_READ, "fsck");
+ struct strbuf packed_ref_content = STRBUF_INIT;
struct stat st;
+ int fd;
int ret = 0;
if (!is_main_worktree(wt))
@@ -1784,7 +1849,36 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
+ /*
+ * There is a chance that "packed-refs" file is removed or converted to
+ * a symlink after filetype check and before open. So we need to avoid
+ * this race condition by opening the file.
+ */
+ fd = open_nofollow(refs->path, O_RDONLY);
+ if (fd < 0) {
+ if (errno == ENOENT)
+ goto cleanup;
+
+ if (errno == ELOOP) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_FILETYPE,
+ "not a regular file");
+ goto cleanup;
+ }
+ }
+
+ if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
+ ret = error_errno(_("unable to read %s"), refs->path);
+ goto cleanup;
+ }
+
+ ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ packed_ref_content.buf + packed_ref_content.len);
+
cleanup:
+ strbuf_release(&packed_ref_content);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index e65ca341cd..e055c36e74 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -639,4 +639,56 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
)
'
+test_expect_success 'packed-refs header should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
+ for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
+ "# pack-refs with traits: peeled fully-peeled sorted " \
+ "# pack-refs with a: peeled fully-peeled" \
+ "# pack-refs with:peeled fully-peeled sorted"
+ do
+ printf "%s\n" "$bad_header" >.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs.header: badPackedRefHeader: '\''$bad_header'\'' does not start with '\''# pack-refs with: '\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err || return 1
+ done
+ )
+'
+
+test_expect_success 'packed-refs missing header should not be reported' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ printf "$(git rev-parse HEAD) refs/heads/main\n" >.git/packed-refs &&
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
+test_expect_success 'packed-refs unknown traits should not be reported' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ printf "# pack-refs with: peeled fully-peeled sorted foo\n" >.git/packed-refs &&
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v5 5/8] packed-backend: check whether the refname contains NUL characters
2025-02-17 15:25 ` [PATCH v5 " shejialuo
` (3 preceding siblings ...)
2025-02-17 15:27 ` [PATCH v5 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
@ 2025-02-17 15:27 ` shejialuo
2025-02-17 15:28 ` [PATCH v5 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
` (4 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-17 15:27 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
"packed-backend.c::next_record" uses "check_refname_format" to check
the consistency of the refname. If it is not OK, the program dies.
However, as reported in [1], some corruption still goes undetected, so
this existing code path must be missing something.
We use the following code to get the refname:
    strbuf_add(&iter->refname_buf, p, eol - p);
    iter->base.refname = iter->refname_buf.buf;
In the above code, `p` is the start pointer of the refname and `eol` is
the pointer to the next newline. We calculate the length of the refname
by subtracting the two pointers. Then we copy the memory range between
`p` and `eol` to get the refname.
However, if there are NUL characters in the memory range between `p`
and `eol`, we treat the refname as valid as long as the part between
`p` and the first NUL character is a valid refname.
In order to catch this corruption, create a new function
"refname_contains_nul" which searches for a NUL character within the
length of the refname. If one is found, the refname contains an
embedded NUL character.
Use this function in "next_record" to make the program die if
"refname_contains_nul" returns true.
[1] https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/
Reported-by: R. Diez <rdiez-temp3@rd10.de>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
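To make the failure mode from the commit message above concrete, the
standalone sketch below shows how a length-based copy and a
NUL-terminated view of the same bytes can disagree. The helper names
"range_contains_nul" and "visible_length" are hypothetical and only
serve this illustration; the patch itself works on a "struct strbuf".

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the `len` bytes starting at `buf` contain a NUL byte. */
static int range_contains_nul(const char *buf, size_t len)
{
	return memchr(buf, '\0', len) != NULL;
}

/*
 * Return how many of the `len` bytes a NUL-terminated string function
 * would see, i.e. the length up to the first NUL (or `len` if none).
 */
static size_t visible_length(const char *buf, size_t len)
{
	const char *nul = memchr(buf, '\0', len);

	return nul ? (size_t)(nul - buf) : len;
}
```

For the bytes "refs/heads/main", a NUL, then "garbage", the range is 23
bytes long but only the first 15 are visible to a check that stops at
the NUL, which is why the truncated name can pass as a valid refname.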
refs/packed-backend.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 09eb3886c3..5edd2136bb 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -494,6 +494,21 @@ static void verify_buffer_safe(struct snapshot *snapshot)
last_line, eof - last_line);
}
+/*
+ * When parsing the "packed-refs" file, we will parse it line by line.
+ * Because we know the start pointer of the refname and the next
+ * newline pointer, we could calculate the length of the refname by
+ * subtracting the two pointers. However, there is a corner case where
+ * the refname contains corrupted embedded NUL characters. And
+ * `check_refname_format()` will not catch this when the truncated
+ * refname is still a valid refname. To prevent this, we need to check
+ * whether the refname contains the NUL characters.
+ */
+static int refname_contains_nul(struct strbuf *refname)
+{
+ return !!memchr(refname->buf, '\0', refname->len);
+}
+
#define SMALL_FILE_SIZE (32*1024)
/*
@@ -895,6 +910,9 @@ static int next_record(struct packed_ref_iterator *iter)
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
+ if (refname_contains_nul(&iter->refname_buf))
+ die("packed refname contains embedded NULL: %s", iter->base.refname);
+
if (check_refname_format(iter->base.refname, REFNAME_ALLOW_ONELEVEL)) {
if (!refname_is_safe(iter->base.refname))
die("packed refname is dangerous: %s",
--
2.48.1
* [PATCH v5 6/8] packed-backend: add "packed-refs" entry consistency check
2025-02-17 15:25 ` [PATCH v5 " shejialuo
` (4 preceding siblings ...)
2025-02-17 15:27 ` [PATCH v5 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
@ 2025-02-17 15:28 ` shejialuo
2025-02-17 15:28 ` [PATCH v5 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
` (3 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-17 15:28 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
"packed-backend.c::next_record" will parse the ref entry to check the
consistency. This function already checks the following things:
1. Parse the main line of the ref entry and check whether the oid is
valid. Then, check whether the next character is a space. Then
check the refname.
2. If the next line starts with '^', continue to parse the peeled oid
and check whether the character right after it is '\n'.
As we have decided to implement the ref consistency check for
"packed-refs", let's port these two checks and update the test to
exercise the code.
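The two checks can be illustrated outside of Git with a minimal,
self-contained sketch. It is not the code from this patch: the names
"check_main_line", "check_peeled_line" and "enum entry_status" are
hypothetical, the oid is validated only as a run of "hexsz" hex digits
(a stand-in for "parse_oid_hex_algop"), and the refname format check
done by "check_refname_format" is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch; Git reports fsck messages instead. */
enum entry_status {
	ENTRY_OK = 0,
	ENTRY_BAD_OID,          /* line does not start with hexsz hex digits */
	ENTRY_NO_SPACE,         /* oid is not followed by whitespace */
	ENTRY_EMBEDDED_NUL,     /* refname contains a NUL byte */
	ENTRY_TRAILING_GARBAGE  /* peeled line has bytes after the oid */
};

/* Return 1 if [p, eol) starts with at least hexsz hex digits. */
static int has_hex_oid(const char *p, const char *eol, size_t hexsz)
{
	size_t i;

	if ((size_t)(eol - p) < hexsz)
		return 0;
	for (i = 0; i < hexsz; i++)
		if (!isxdigit((unsigned char)p[i]))
			return 0;
	return 1;
}

/* Check a main line "<oid> <refname>"; eol points at the '\n' (or end). */
static enum entry_status check_main_line(const char *start, const char *eol,
					 size_t hexsz)
{
	const char *p;

	assert(start && eol && start <= eol);
	if (!has_hex_oid(start, eol, hexsz))
		return ENTRY_BAD_OID;
	p = start + hexsz;
	if (p == eol || !isspace((unsigned char)*p))
		return ENTRY_NO_SPACE;
	p++;
	/* The refname length comes from pointers, so a NUL may hide inside. */
	if (memchr(p, '\0', (size_t)(eol - p)))
		return ENTRY_EMBEDDED_NUL;
	return ENTRY_OK;
}

/* Check a peeled line "^<oid>"; nothing may follow the oid. */
static enum entry_status check_peeled_line(const char *start, const char *eol,
					   size_t hexsz)
{
	assert(start && eol && start < eol && *start == '^');
	start++; /* skip the '^' */
	if (!has_hex_oid(start, eol, hexsz))
		return ENTRY_BAD_OID;
	if (start + hexsz != eol)
		return ENTRY_TRAILING_GARBAGE;
	return ENTRY_OK;
}
```

Both helpers take the line as a [start, eol) pointer pair, as the fsck
functions in this patch do, so the buffer never has to be
NUL-terminated per line.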
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.adoc | 3 +
fsck.h | 1 +
refs/packed-backend.c | 122 ++++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 44 ++++++++++++
4 files changed, 169 insertions(+), 1 deletion(-)
diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index 11906f90fd..02a7bf0503 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -16,6 +16,9 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefEntry`::
+ (ERROR) The "packed-refs" file contains an invalid entry.
+
`badPackedRefHeader`::
(ERROR) The "packed-refs" file contains an invalid
header.
diff --git a/fsck.h b/fsck.h
index 67e3c97bc0..14d70f6653 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_ENTRY, ERROR) \
FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 5edd2136bb..c7138aefff 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1812,9 +1812,114 @@ static int packed_fsck_ref_header(struct fsck_options *o,
return 0;
}
+static int packed_fsck_ref_peeled_line(struct fsck_options *o,
+ struct ref_store *ref_store,
+ unsigned long line_number,
+ const char *start, const char *eol)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct object_id peeled;
+ const char *p;
+ int ret = 0;
+
+ /*
+ * Skip the '^' and parse the peeled oid.
+ */
+ start++;
+ if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid peeled oid",
+ (int)(eol - start), start);
+ goto cleanup;
+ }
+
+ if (p != eol) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has trailing garbage after peeled oid '%.*s'",
+ (int)(eol - p), p);
+ goto cleanup;
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ return ret;
+}
+
+static int packed_fsck_ref_main_line(struct fsck_options *o,
+ struct ref_store *ref_store,
+ unsigned long line_number,
+ struct strbuf *refname,
+ const char *start, const char *eol)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct object_id oid;
+ const char *p;
+ int ret = 0;
+
+ if (parse_oid_hex_algop(start, &oid, &p, ref_store->repo->hash_algo)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid oid",
+ (int)(eol - start), start);
+ goto cleanup;
+ }
+
+ if (p == eol || !isspace(*p)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has no space after oid '%s' but with '%.*s'",
+ oid_to_hex(&oid), (int)(eol - p), p);
+ goto cleanup;
+ }
+
+ p++;
+ strbuf_reset(refname);
+ strbuf_add(refname, p, eol - p);
+ if (refname_contains_nul(refname)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "refname '%s' contains NULL binaries",
+ refname->buf);
+ }
+
+ if (check_refname_format(refname->buf, 0)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_NAME,
+ "has bad refname '%s'", refname->buf);
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
+ struct ref_store *ref_store,
const char *start, const char *eof)
{
+ struct strbuf refname = STRBUF_INIT;
unsigned long line_number = 1;
const char *eol;
int ret = 0;
@@ -1827,6 +1932,21 @@ static int packed_fsck_ref_content(struct fsck_options *o,
line_number++;
}
+ while (start < eof) {
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ ret |= packed_fsck_ref_main_line(o, ref_store, line_number, &refname, start, eol);
+ start = eol + 1;
+ line_number++;
+ if (start < eof && *start == '^') {
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ ret |= packed_fsck_ref_peeled_line(o, ref_store, line_number,
+ start, eol);
+ start = eol + 1;
+ line_number++;
+ }
+ }
+
+ strbuf_release(&refname);
return ret;
}
@@ -1892,7 +2012,7 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
- ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ ret = packed_fsck_ref_content(o, ref_store, packed_ref_content.buf,
packed_ref_content.buf + packed_ref_content.len);
cleanup:
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index e055c36e74..7421cc1e7f 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -691,4 +691,48 @@ test_expect_success 'packed-refs unknown traits should not be reported' '
)
'
+test_expect_success 'packed-refs content should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ git tag -a annotated-tag-2 -m tag-2 &&
+
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_2_oid=$(git rev-parse annotated-tag-2) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ tag_2_peeled_oid=$(git rev-parse annotated-tag-2^{}) &&
+ short_oid=$(printf "%s" $tag_1_peeled_oid | cut -c 1-4) &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $short_oid refs/heads/branch-1
+ ${branch_1_oid}x
+ $branch_2_oid refs/heads/bad-branch
+ $branch_2_oid refs/heads/branch.
+ $tag_1_oid refs/tags/annotated-tag-3
+ ^$short_oid
+ $tag_2_oid refs/tags/annotated-tag-4.
+ ^$tag_2_peeled_oid garbage
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: badPackedRefEntry: '\''$short_oid refs/heads/branch-1'\'' has invalid oid
+ error: packed-refs line 3: badPackedRefEntry: has no space after oid '\''$branch_1_oid'\'' but with '\''x'\''
+ error: packed-refs line 4: badRefName: has bad refname '\'' refs/heads/bad-branch'\''
+ error: packed-refs line 5: badRefName: has bad refname '\''refs/heads/branch.'\''
+ error: packed-refs line 7: badPackedRefEntry: '\''$short_oid'\'' has invalid peeled oid
+ error: packed-refs line 8: badRefName: has bad refname '\''refs/tags/annotated-tag-4.'\''
+ error: packed-refs line 9: badPackedRefEntry: has trailing garbage after peeled oid '\'' garbage'\''
+ EOF
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v5 7/8] packed-backend: check whether the "packed-refs" is sorted
2025-02-17 15:25 ` [PATCH v5 " shejialuo
` (5 preceding siblings ...)
2025-02-17 15:28 ` [PATCH v5 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
@ 2025-02-17 15:28 ` shejialuo
2025-02-17 15:28 ` [PATCH v5 8/8] builtin/fsck: add `git refs verify` child process shejialuo
` (2 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-17 15:28 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
When there is a "sorted" trait in the header of the "packed-refs" file,
it means that the entries are sorted in increasing order by refname. We
should add checks to verify whether the "packed-refs" file is sorted in
this case.
Update "packed_fsck_ref_header" to record whether there is a "sorted"
trait in the header. It may seem that we could record all refnames
during the parsing process and then compare them later. However, this is
not a good design for the following reasons:
1. Because we need to store the state across the whole checking
lifetime, we would consume a lot of memory if there are many entries
in the "packed-refs" file.
2. We cannot reuse the existing compare function "cmp_packed_ref_records",
which would cause repetition.
Because "cmp_packed_ref_records" needs an extra parameter "struct
snapshot", extract the common part into a new function
"cmp_packed_refname" so that we can reuse it for the comparison.
Then, create a new function "packed_fsck_ref_sorted" to parse the file
again and use the new fsck message "packedRefUnsorted(ERROR)" to report
to the user if the file is not sorted.
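The single-pass idea, which keeps only a pointer to the previous
refname instead of storing every refname, can be sketched as a
standalone program. This is an illustration under assumptions, not the
patch itself: "first_unsorted_line" and "cmp_newline_terminated" are
hypothetical names, every line is assumed to end with '\n', and, unlike
the patch, any line starting with '#' is skipped rather than only the
first one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compare two refnames that are terminated by '\n' rather than NUL. */
static int cmp_newline_terminated(const char *r1, const char *r2)
{
	while (1) {
		if (*r1 == '\n')
			return *r2 == '\n' ? 0 : -1;
		if (*r1 != *r2) {
			if (*r2 == '\n')
				return 1;
			return (unsigned char)*r1 < (unsigned char)*r2 ? -1 : 1;
		}
		r1++;
		r2++;
	}
}

/*
 * Return 0 if the refnames are in strictly increasing order, otherwise
 * the 1-based line number of the first entry that is out of order.
 * hexsz is the length of the hex oid that precedes each refname.
 */
static unsigned long first_unsorted_line(const char *buf, size_t len,
					 size_t hexsz)
{
	const char *start = buf, *eof = buf + len;
	const char *former = NULL; /* refname of the previous main line */
	unsigned long line_number = 1;

	assert(buf);
	while (start < eof) {
		const char *eol = memchr(start, '\n', (size_t)(eof - start));

		if (!eol)
			break; /* unterminated last line: out of scope here */
		/* Header and peeled lines carry no refname; skip short lines too. */
		if (*start != '#' && *start != '^' &&
		    (size_t)(eol - start) > hexsz) {
			const char *current = start + hexsz + 1;

			if (former && cmp_newline_terminated(former, current) >= 0)
				return line_number;
			former = current;
		}
		start = eol + 1;
		line_number++;
	}
	return 0;
}
```

Because the comparison uses ">= 0", two entries with the same refname
are also reported, which matches the strict ordering that the check in
this patch enforces.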
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.adoc | 3 +
fsck.h | 1 +
refs/packed-backend.c | 118 ++++++++++++++++++++++++++++-----
t/t0602-reffiles-fsck.sh | 87 ++++++++++++++++++++++++
4 files changed, 192 insertions(+), 17 deletions(-)
diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index 02a7bf0503..9601fff228 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -187,6 +187,9 @@
(ERROR) The "packed-refs" file contains an entry that is
not terminated by a newline.
+`packedRefUnsorted`::
+ (ERROR) The "packed-refs" file is not sorted.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
diff --git a/fsck.h b/fsck.h
index 14d70f6653..19f3cb2773 100644
--- a/fsck.h
+++ b/fsck.h
@@ -56,6 +56,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
+ FUNC(PACKED_REF_UNSORTED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index c7138aefff..ae04d8ae80 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -300,14 +300,9 @@ struct snapshot_record {
size_t len;
};
-static int cmp_packed_ref_records(const void *v1, const void *v2,
- void *cb_data)
-{
- const struct snapshot *snapshot = cb_data;
- const struct snapshot_record *e1 = v1, *e2 = v2;
- const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
- const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+static int cmp_packed_refname(const char *r1, const char *r2)
+{
while (1) {
if (*r1 == '\n')
return *r2 == '\n' ? 0 : -1;
@@ -322,6 +317,17 @@ static int cmp_packed_ref_records(const void *v1, const void *v2,
}
}
+static int cmp_packed_ref_records(const void *v1, const void *v2,
+ void *cb_data)
+{
+ const struct snapshot *snapshot = cb_data;
+ const struct snapshot_record *e1 = v1, *e2 = v2;
+ const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
+ const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+
+ return cmp_packed_refname(r1, r2);
+}
+
/*
* Compare a snapshot record at `rec` to the specified NUL-terminated
* refname.
@@ -1797,19 +1803,33 @@ static int packed_fsck_ref_next_line(struct fsck_options *o,
}
static int packed_fsck_ref_header(struct fsck_options *o,
- const char *start, const char *eol)
+ const char *start, const char *eol,
+ unsigned int *sorted)
{
- if (!starts_with(start, "# pack-refs with: ")) {
+ struct string_list traits = STRING_LIST_INIT_NODUP;
+ char *tmp_line;
+ int ret = 0;
+ char *p;
+
+ tmp_line = xmemdupz(start, eol - start);
+ if (!skip_prefix(tmp_line, "# pack-refs with: ", (const char **)&p)) {
struct fsck_ref_report report = { 0 };
report.path = "packed-refs.header";
- return fsck_report_ref(o, &report,
- FSCK_MSG_BAD_PACKED_REF_HEADER,
- "'%.*s' does not start with '# pack-refs with: '",
- (int)(eol - start), start);
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_HEADER,
+ "'%.*s' does not start with '# pack-refs with: '",
+ (int)(eol - start), start);
+ goto cleanup;
}
- return 0;
+ string_list_split_in_place(&traits, p, " ", -1);
+ *sorted = unsorted_string_list_has_string(&traits, "sorted");
+
+cleanup:
+ free(tmp_line);
+ string_list_clear(&traits, 0);
+ return ret;
}
static int packed_fsck_ref_peeled_line(struct fsck_options *o,
@@ -1915,8 +1935,68 @@ static int packed_fsck_ref_main_line(struct fsck_options *o,
return ret;
}
+static int packed_fsck_ref_sorted(struct fsck_options *o,
+ struct ref_store *ref_store,
+ const char *start, const char *eof)
+{
+ size_t hexsz = ref_store->repo->hash_algo->hexsz;
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct strbuf refname1 = STRBUF_INIT;
+ struct strbuf refname2 = STRBUF_INIT;
+ unsigned long line_number = 1;
+ const char *former = NULL;
+ const char *current;
+ const char *eol;
+ int ret = 0;
+
+ if (*start == '#') {
+ eol = memchr(start, '\n', eof - start);
+ start = eol + 1;
+ line_number++;
+ }
+
+ for (; start < eof; line_number++, start = eol + 1) {
+ eol = memchr(start, '\n', eof - start);
+
+ if (*start == '^')
+ continue;
+
+ if (!former) {
+ former = start + hexsz + 1;
+ continue;
+ }
+
+ current = start + hexsz + 1;
+ if (cmp_packed_refname(former, current) >= 0) {
+ const char *err_fmt =
+ "refname '%s' is less than previous refname '%s'";
+
+ eol = memchr(former, '\n', eof - former);
+ strbuf_add(&refname1, former, eol - former);
+ eol = memchr(current, '\n', eof - current);
+ strbuf_add(&refname2, current, eol - current);
+
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_UNSORTED,
+ err_fmt, refname2.buf, refname1.buf);
+ goto cleanup;
+ }
+ former = current;
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ strbuf_release(&refname1);
+ strbuf_release(&refname2);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
struct ref_store *ref_store,
+ unsigned int *sorted,
const char *start, const char *eof)
{
struct strbuf refname = STRBUF_INIT;
@@ -1926,7 +2006,7 @@ static int packed_fsck_ref_content(struct fsck_options *o,
ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
if (*start == '#') {
- ret |= packed_fsck_ref_header(o, start, eol);
+ ret |= packed_fsck_ref_header(o, start, eol, sorted);
start = eol + 1;
line_number++;
@@ -1957,9 +2037,10 @@ static int packed_fsck(struct ref_store *ref_store,
struct packed_ref_store *refs = packed_downcast(ref_store,
REF_STORE_READ, "fsck");
struct strbuf packed_ref_content = STRBUF_INIT;
+ unsigned int sorted = 0;
struct stat st;
- int fd;
int ret = 0;
+ int fd;
if (!is_main_worktree(wt))
goto cleanup;
@@ -2012,8 +2093,11 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
- ret = packed_fsck_ref_content(o, ref_store, packed_ref_content.buf,
+ ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
packed_ref_content.buf + packed_ref_content.len);
+ if (!ret && sorted)
+ ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
+ packed_ref_content.buf + packed_ref_content.len);
cleanup:
strbuf_release(&packed_ref_content);
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 7421cc1e7f..28dc8dcddc 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -735,4 +735,91 @@ test_expect_success 'packed-refs content should be checked' '
)
'
+test_expect_success 'packed-ref with sorted trait should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ EOF
+ git refs verify 2>err &&
+ rm .git/packed-refs &&
+ test_must_be_empty err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $branch_2_oid $refname1
+ EOF
+ git refs verify 2>err &&
+ rm .git/packed-refs &&
+ test_must_be_empty err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $branch_2_oid $refname1
+ $branch_1_oid $refname2
+ $tag_1_oid $refname3
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 3: packedRefUnsorted: refname '\''$refname2'\'' is less than previous refname '\''$refname1'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $tag_1_oid $refname3
+ ^$tag_1_peeled_oid
+ $branch_2_oid $refname2
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 4: packedRefUnsorted: refname '\''$refname2'\'' is less than previous refname '\''$refname3'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
+ )
+'
+
+test_expect_success 'packed-ref without sorted trait should not be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled
+ $branch_2_oid $refname1
+ $branch_1_oid $refname2
+ EOF
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v5 8/8] builtin/fsck: add `git refs verify` child process
2025-02-17 15:25 ` [PATCH v5 " shejialuo
` (6 preceding siblings ...)
2025-02-17 15:28 ` [PATCH v5 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
@ 2025-02-17 15:28 ` shejialuo
2025-02-25 8:27 ` [PATCH v5 0/8] add more ref consistency checks Patrick Steinhardt
2025-02-25 13:19 ` [PATCH v6 0/9] " shejialuo
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-17 15:28 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
By now, we have implemented the ref consistency checks for both
"files-backend" and "packed-backend". Although some of these checks are
redundant with what "git-fsck(1)" already does, this won't cause
trouble. So, let's integrate them into the "git-fsck(1)" command to get
feedback from the users. Also, by calling "git refs verify" in
"git-fsck(1)", we make sure that the newly added checks don't break.
Introduce a new function "fsck_refs" that initializes and runs a child
process to execute the "git refs verify" command. To give the user
feedback, create a progress meter whose total is 1, because it's hard
to know how many loose refs we will check. We might improve this later.
Then, introduce the "--[no-]references" option to allow the user to
disable checking ref database consistency. Call this function at the
very beginning of "git-fsck(1)", because we don't want the existing
code of "git-fsck(1)", which implicitly checks the consistency of refs,
to die before the ref checks have run.
Last, update the test to exercise the code.
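The child-process pattern described above can be sketched with plain
POSIX calls. This is only an illustration: Git itself uses its own
"run_command" API as shown in the diff below, whereas the sketch uses
fork(2), execvp(3) and waitpid(2), and the helper names
"build_refs_verify_argv" and "run_refs_verify" are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 8

/*
 * Fill args with a NULL-terminated "git refs verify" command line and
 * return the number of arguments, not counting the terminator.
 */
static size_t build_refs_verify_argv(const char *args[MAX_ARGS],
				     int verbose, int strict)
{
	size_t n = 0;

	assert(args);
	args[n++] = "git";
	args[n++] = "refs";
	args[n++] = "verify";
	if (verbose)
		args[n++] = "--verbose";
	if (strict)
		args[n++] = "--strict";
	args[n] = NULL;
	return n;
}

/*
 * Run the command in a child process and return its exit code, or -1
 * if the child could not be started or did not exit normally.
 */
int run_refs_verify(int verbose, int strict)
{
	const char *args[MAX_ARGS];
	int status;
	pid_t pid;

	build_refs_verify_argv(args, verbose, strict);
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		execvp(args[0], (char *const *)args);
		_exit(127); /* exec failed */
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would treat any non-zero return value as "ref errors found",
in the same way that "fsck_refs" sets ERROR_REFS when the child fails.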
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/git-fsck.adoc | 7 ++++++-
builtin/fsck.c | 33 ++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 39 +++++++++++++++++++++++++++++++++++++
3 files changed, 77 insertions(+), 2 deletions(-)
diff --git a/Documentation/git-fsck.adoc b/Documentation/git-fsck.adoc
index 8f32800a83..11203ba925 100644
--- a/Documentation/git-fsck.adoc
+++ b/Documentation/git-fsck.adoc
@@ -12,7 +12,7 @@ SYNOPSIS
'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
[--[no-]full] [--strict] [--verbose] [--lost-found]
[--[no-]dangling] [--[no-]progress] [--connectivity-only]
- [--[no-]name-objects] [<object>...]
+ [--[no-]name-objects] [--[no-]references] [<object>...]
DESCRIPTION
-----------
@@ -104,6 +104,11 @@ care about this output and want to speed it up further.
progress status even if the standard error stream is not
directed to a terminal.
+--[no-]references::
+ Control whether to check the references database consistency
+ via 'git refs verify'. See linkgit:git-refs[1] for details.
+ The default is to check the references database.
+
CONFIGURATION
-------------
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 7a4dcb0716..f4f395cfbd 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -50,6 +50,7 @@ static int verbose;
static int show_progress = -1;
static int show_dangling = 1;
static int name_objects;
+static int check_references = 1;
#define ERROR_OBJECT 01
#define ERROR_REACHABLE 02
#define ERROR_PACK 04
@@ -905,11 +906,37 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress)
return res;
}
+static void fsck_refs(struct repository *r)
+{
+ struct child_process refs_verify = CHILD_PROCESS_INIT;
+ struct progress *progress = NULL;
+
+ if (show_progress)
+ progress = start_progress(r, _("Checking ref database"), 1);
+
+ if (verbose)
+ fprintf_ln(stderr, _("Checking ref database"));
+
+ child_process_init(&refs_verify);
+ refs_verify.git_cmd = 1;
+ strvec_pushl(&refs_verify.args, "refs", "verify", NULL);
+ if (verbose)
+ strvec_push(&refs_verify.args, "--verbose");
+ if (check_strict)
+ strvec_push(&refs_verify.args, "--strict");
+
+ if (run_command(&refs_verify))
+ errors_found |= ERROR_REFS;
+
+ display_progress(progress, 1);
+ stop_progress(&progress);
+}
+
static char const * const fsck_usage[] = {
N_("git fsck [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]\n"
" [--[no-]full] [--strict] [--verbose] [--lost-found]\n"
" [--[no-]dangling] [--[no-]progress] [--connectivity-only]\n"
- " [--[no-]name-objects] [<object>...]"),
+ " [--[no-]name-objects] [--[no-]references] [<object>...]"),
NULL
};
@@ -928,6 +955,7 @@ static struct option fsck_opts[] = {
N_("write dangling objects in .git/lost-found")),
OPT_BOOL(0, "progress", &show_progress, N_("show progress")),
OPT_BOOL(0, "name-objects", &name_objects, N_("show verbose names for reachable objects")),
+ OPT_BOOL(0, "references", &check_references, N_("check reference database consistency")),
OPT_END(),
};
@@ -970,6 +998,9 @@ int cmd_fsck(int argc,
git_config(git_fsck_config, &fsck_obj_options);
prepare_repo_settings(the_repository);
+ if (check_references)
+ fsck_refs(the_repository);
+
if (connectivity_only) {
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
for_each_packed_object(the_repository,
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 28dc8dcddc..42e8a84739 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -822,4 +822,43 @@ test_expect_success 'packed-ref without sorted trait should not be checked' '
)
'
+test_expect_success '--[no-]references option should apply to fsck' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ (
+ cd repo &&
+ test_commit default &&
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck --references 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck --no-references 2>err &&
+ rm $branch_dir_prefix/branch-garbage &&
+ test_must_be_empty err || return 1
+ done
+ )
+'
+
test_done
--
2.48.1
* Re: [PATCH v5 2/8] builtin/refs: get worktrees without reading head information
2025-02-17 15:27 ` [PATCH v5 2/8] builtin/refs: get worktrees without reading head information shejialuo
@ 2025-02-25 8:26 ` Patrick Steinhardt
0 siblings, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-25 8:26 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Mon, Feb 17, 2025 at 11:27:34PM +0800, shejialuo wrote:
> diff --git a/worktree.h b/worktree.h
> index 38145df80f..f7003a9c12 100644
> --- a/worktree.h
> +++ b/worktree.h
> @@ -30,6 +30,13 @@ struct worktree {
> */
> struct worktree **get_worktrees(void);
>
> +/*
> + * Like `get_worktrees`, but does not read HEAD. Skip reading HEAD allows to
> + * get the worktree without worrying about failures pertaining to parsing
> + * the HEAD ref. This is useful when we want to check the ref db consistency.
Nit, not worth a reroll: this is highly specific to what you're doing.
How about: "This is useful in contexts where it is assumed that the
refdb may not be in a consistent state." That would also include cases
like e.g. `repair_worktrees()`.
Patrick
* Re: [PATCH v5 3/8] packed-backend: check whether the "packed-refs" is regular file
2025-02-17 15:27 ` [PATCH v5 3/8] packed-backend: check whether the "packed-refs" is regular file shejialuo
@ 2025-02-25 8:27 ` Patrick Steinhardt
0 siblings, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-25 8:27 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Mon, Feb 17, 2025 at 11:27:42PM +0800, shejialuo wrote:
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index a7b6f74b6e..8140a31d07 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1748,15 +1749,43 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> return empty_ref_iterator_begin();
> }
>
> -static int packed_fsck(struct ref_store *ref_store UNUSED,
> - struct fsck_options *o UNUSED,
> +static int packed_fsck(struct ref_store *ref_store,
> + struct fsck_options *o,
> struct worktree *wt)
> {
> + struct packed_ref_store *refs = packed_downcast(ref_store,
> + REF_STORE_READ, "fsck");
> + struct stat st;
> + int ret = 0;
>
> if (!is_main_worktree(wt))
> - return 0;
> + goto cleanup;
>
> - return 0;
> + if (o->verbose)
> + fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
> +
> + if (lstat(refs->path, &st) < 0) {
> + /*
> + * If the packed-refs file doesn't exist, there's nothing
> + * to check.
> + */
> + if (errno == ENOENT)
> + goto cleanup;
> + ret = error_errno(_("unable to stat %s"), refs->path);
Nit: We should quote the file name: "unable to stat '%s'".
Patrick
* Re: [PATCH v5 4/8] packed-backend: add "packed-refs" header consistency check
2025-02-17 15:27 ` [PATCH v5 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
@ 2025-02-25 8:27 ` Patrick Steinhardt
2025-02-25 12:34 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-25 8:27 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Mon, Feb 17, 2025 at 11:27:50PM +0800, shejialuo wrote:
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 8140a31d07..09eb3886c3 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -694,7 +694,7 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs)
>
> tmp = xmemdupz(snapshot->buf, eol - snapshot->buf);
>
> - if (!skip_prefix(tmp, "# pack-refs with:", (const char **)&p))
> + if (!skip_prefix(tmp, "# pack-refs with: ", (const char **)&p))
> die_invalid_line(refs->path,
> snapshot->buf,
> snapshot->eof - snapshot->buf);
I know that Junio pointed out that we should check for a trailing space
after the colon. But do we really feel comfortable to tighten the check
like this now? If there was any broken writer of the format that does
not include the whitespace we'd now be unable to parse their output.
I scanned through a couple of third-party clients:
- libgit2 is fine and always writes the space. It also expects the
whitespace to exist.
- JGit does not expect the header to have a trailing space, but
expects the "peeled" capability to have a leading space, which is
mostly equivalent because that capability is typically the first one
we write. It always writes the space.
- gitoxide expects the space to exist and writes it.
- go-git doesn't even seem to care about the header? Dunno, maybe I
was just not able to locate the relevant code.
So yes, we should be fine, and the fact that other implementations
expect the space to exist indicates that being more thorough here is a
good thing. It might be a good idea though to split out this change into
a separate commit and then provide more reasoning _why_ it is fine,
including the above info about alternate implementations.
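
For reference, the stricter header handling discussed here can be
sketched as a small standalone program. It is a hedged illustration,
not Git's code: "header_has_strict_prefix" and "header_has_trait" are
hypothetical helpers that require the trailing space after the colon
and then look for a trait as a whole space-separated word.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HEADER_PREFIX "# pack-refs with: "

/* Return 1 if [start, eol) begins with the prefix, trailing space included. */
static int header_has_strict_prefix(const char *start, const char *eol)
{
	size_t n = strlen(HEADER_PREFIX);

	assert(start && eol && start <= eol);
	return (size_t)(eol - start) >= n && !memcmp(start, HEADER_PREFIX, n);
}

/* Return 1 if trait appears as a whole word in the list after the prefix. */
static int header_has_trait(const char *start, const char *eol,
			    const char *trait)
{
	size_t tlen = strlen(trait);
	const char *p;

	if (!header_has_strict_prefix(start, eol))
		return 0;
	p = start + strlen(HEADER_PREFIX);
	while (p < eol) {
		const char *end = memchr(p, ' ', (size_t)(eol - p));

		if (!end)
			end = eol; /* last trait on the line */
		if ((size_t)(end - p) == tlen && !memcmp(p, trait, tlen))
			return 1;
		p = end < eol ? end + 1 : eol;
	}
	return 0;
}
```

With this shape, a header written as "# pack-refs with:peeled" is
rejected outright, which is exactly the behavior change whose
compatibility is being weighed above.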
Patrick
* Re: [PATCH v5 0/8] add more ref consistency checks
2025-02-17 15:25 ` [PATCH v5 " shejialuo
` (7 preceding siblings ...)
2025-02-17 15:28 ` [PATCH v5 8/8] builtin/fsck: add `git refs verify` child process shejialuo
@ 2025-02-25 8:27 ` Patrick Steinhardt
2025-02-25 13:19 ` [PATCH v6 0/9] " shejialuo
9 siblings, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-25 8:27 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Mon, Feb 17, 2025 at 11:25:25PM +0800, shejialuo wrote:
> Hi All:
>
> This changes enhances the following things:
>
> 1. [PATCH v5 2/8]: enhance the comment suggested by Karthik.
> 2. [PATCH v5 3/8]: use lstat to check whether the filetype of
> "packed-ref" is a regular file instead of using `open_nofollow`
> to check. And also enhance the commit message suggested by Karthik.
> 3. [PATCH v5 4/8]: move "open_nofollow" in original [PATCH v4 3/8] to
> this.
>
> Also, I rebase due to the conflict that all *.txt files have been
> renamed to *.adoc. However, I don't know whether this is a real
> conflict. But I decide to rebase to make the life of Junio easy.
I've got a couple of small nits, but overall I think this series should
be almost ready. Thanks!
Patrick
* Re: [PATCH v5 4/8] packed-backend: add "packed-refs" header consistency check
2025-02-25 8:27 ` Patrick Steinhardt
@ 2025-02-25 12:34 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-25 12:34 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Tue, Feb 25, 2025 at 09:27:03AM +0100, Patrick Steinhardt wrote:
> On Mon, Feb 17, 2025 at 11:27:50PM +0800, shejialuo wrote:
> > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > index 8140a31d07..09eb3886c3 100644
> > --- a/refs/packed-backend.c
> > +++ b/refs/packed-backend.c
> > @@ -694,7 +694,7 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs)
> >
> > tmp = xmemdupz(snapshot->buf, eol - snapshot->buf);
> >
> > - if (!skip_prefix(tmp, "# pack-refs with:", (const char **)&p))
> > + if (!skip_prefix(tmp, "# pack-refs with: ", (const char **)&p))
> > die_invalid_line(refs->path,
> > snapshot->buf,
> > snapshot->eof - snapshot->buf);
>
> I know that Junio pointed out that we should check for a trailing space
> after the colon. But do we really feel comfortable to tighten the check
> like this now? If there was any broken writer of the format that does
> not include the whitespace we'd now be unable to parse their output.
>
> I scanned through a couple of third-party clients:
>
> - libgit2 is fine and always writes the space. It also expects the
> whitespace to exist.
>
> - JGit does not expect the header to have a trailing space, but
> expects the "peeled" capability to have a leading space, which is
> mostly equivalent because that capability is typically the first one
> we write. It always writes the space.
>
> - gitoxide expects the space to exist and writes it.
>
> - go-git doesn't even seem to care about the header? Dunno, maybe I
> was just not able to locate the relevant code.
I have searched the code. go-git implements "git pack-refs" in
`PackRefs`, and it never writes a header to the "packed-refs" file.
Thanks for this wonderful suggestion.
>
> So yes, we should be fine, and the fact that other implementations
> expect the space to exist indicates that being more thorough here is a
> good thing. It might be a good idea though to split out this change into
> a separate commit and then provide more reasoning _why_ it is fine,
> including the above info about alternate implementations.
>
Yes, I agree that we should split out this change. Let me do this.
> Patrick
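To make the difference between the old and the tightened prefix concrete,
here is a minimal shell sketch. It is not the C code in "packed-backend.c";
the helper name `classify_header` and the sample file contents are made up
for illustration, and it only looks at the first line of a file.

```shell
#!/bin/sh
# Hypothetical helper: classify the first line of a "packed-refs"-like file.
#   strict -> starts with "# pack-refs with: " (colon plus one space)
#   loose  -> starts with "# pack-refs with:" but lacks the space
#   bad    -> some other line starting with "#"
#   none   -> no header line at all
classify_header () {
	first_line=
	IFS= read -r first_line <"$1" || :
	case "$first_line" in
	"# pack-refs with: "*) echo strict ;;
	"# pack-refs with:"*) echo loose ;;
	"#"*) echo bad ;;
	*) echo none ;;
	esac
}

tmp=$(mktemp -d)
# Sample inputs; the object ID below is a placeholder, not a real object.
printf '# pack-refs with: peeled fully-peeled sorted \n' >"$tmp/strict"
printf '# pack-refs with:peeled\n' >"$tmp/loose"
printf '# something else\n' >"$tmp/bad"
printf '%s refs/heads/main\n' 0123456789012345678901234567890123456789 >"$tmp/none"

strict_result=$(classify_header "$tmp/strict")
loose_result=$(classify_header "$tmp/loose")
bad_result=$(classify_header "$tmp/bad")
none_result=$(classify_header "$tmp/none")
rm -rf "$tmp"
```

In terms of this sketch, the old `skip_prefix` call accepted both the
"strict" and the "loose" case, while the tightened one accepts only
"strict".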
^ permalink raw reply [flat|nested] 168+ messages in thread
* [PATCH v6 0/9] add more ref consistency checks
2025-02-17 15:25 ` [PATCH v5 " shejialuo
` (8 preceding siblings ...)
2025-02-25 8:27 ` [PATCH v5 0/8] add more ref consistency checks Patrick Steinhardt
@ 2025-02-25 13:19 ` shejialuo
2025-02-25 13:21 ` [PATCH v6 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
` (9 more replies)
9 siblings, 10 replies; 168+ messages in thread
From: shejialuo @ 2025-02-25 13:19 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Hi All:
This version enhances the following things (changed in v6):
1. [PATCH v6 2/9]: enhance the comment.
2. [PATCH v6 3/9]: use '' to quote the file in the print message.
3. [PATCH v6 4/9]: a new commit with a message explaining why we can
tighten the rule.
Thanks,
Jialuo
---
This series mainly does the following things:
1. Fix subshell issues
2. Add ref checks for packed-backend.
1. Check whether the filetype of "packed-refs" is correct.
2. Check whether the syntax of "packed-refs" is correct by using the
rules from "packed-backend.c::create_snapshot" and
"packed-backend.c::next_record".
3. Check whether the pointed object exists and whether the
"packed-refs" file is sorted.
3. Call "git refs verify" for "git-fsck(1)".
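As a rough illustration of the "sorted" check listed above, the following
shell sketch tests whether the refnames in a "packed-refs"-like file are in
bytewise order. It is only an approximation under stated assumptions: the
helper name `refs_are_sorted` and the sample data are hypothetical, it
assumes each entry line has the form "<oid> <refname>", and it does not
check syntax or object existence the way the series does in C.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the refnames in a "packed-refs"-like
# file are in bytewise order. Header ("#") and peeled ("^") lines are skipped.
refs_are_sorted () {
	awk '!/^[#^]/ { print $2 }' "$1" | LC_ALL=C sort -c 2>/dev/null
}

tmp=$(mktemp -d)
# Placeholder object ID, not a real object.
oid=0123456789012345678901234567890123456789
printf '%s\n' \
	"# pack-refs with: peeled fully-peeled sorted " \
	"$oid refs/heads/a" \
	"$oid refs/heads/b" \
	"$oid refs/tags/v1" \
	"^$oid" >"$tmp/sorted"
printf '%s\n' \
	"$oid refs/heads/b" \
	"$oid refs/heads/a" >"$tmp/unsorted"

if refs_are_sorted "$tmp/sorted"
then
	sorted_result=ok
else
	sorted_result=unsorted
fi
if refs_are_sorted "$tmp/unsorted"
then
	unsorted_result=ok
else
	unsorted_result=unsorted
fi
rm -rf "$tmp"
```

`LC_ALL=C` is used so that `sort -c` compares bytes rather than applying
locale collation.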
shejialuo (9):
t0602: use subshell to ensure working directory unchanged
builtin/refs: get worktrees without reading head information
packed-backend: check whether the "packed-refs" is regular file
packed-backend: check if header starts with "# pack-refs with: "
packed-backend: add "packed-refs" header consistency check
packed-backend: check whether the refname contains NUL characters
packed-backend: add "packed-refs" entry consistency check
packed-backend: check whether the "packed-refs" is sorted
builtin/fsck: add `git refs verify` child process
Documentation/fsck-msgids.adoc | 14 +
Documentation/git-fsck.adoc | 7 +-
builtin/fsck.c | 33 +-
builtin/refs.c | 2 +-
fsck.h | 4 +
refs/packed-backend.c | 369 +++++++++-
t/t0602-reffiles-fsck.sh | 1205 +++++++++++++++++++-------------
worktree.c | 5 +
worktree.h | 8 +
9 files changed, 1162 insertions(+), 485 deletions(-)
Range-diff against v5:
1: b3952d80a2 = 1: b3952d80a2 t0602: use subshell to ensure working directory unchanged
2: 3695586f58 ! 2: fa5ce20bb7 builtin/refs: get worktrees without reading head information
@@ worktree.h: struct worktree {
+/*
+ * Like `get_worktrees`, but does not read HEAD. Skip reading HEAD allows to
+ * get the worktree without worrying about failures pertaining to parsing
-+ * the HEAD ref. This is useful when we want to check the ref db consistency.
++ * the HEAD ref. This is useful in contexts where it is assumed that the
++ * refdb may not be in a consistent state.
+ */
+struct worktree **get_worktrees_without_reading_head(void);
+
3: cbaae00e8b ! 3: 787645a700 packed-backend: check whether the "packed-refs" is regular file
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
+ */
+ if (errno == ENOENT)
+ goto cleanup;
-+ ret = error_errno(_("unable to stat %s"), refs->path);
++ ret = error_errno(_("unable to stat '%s'"), refs->path);
+ goto cleanup;
+ }
+
-: ---------- > 4: f097e0f093 packed-backend: check if header starts with "# pack-refs with: "
4: b9ce8734ac ! 5: a589a38b68 packed-backend: add "packed-refs" header consistency check
@@ Commit message
In "packed-backend.c::create_snapshot", if there is a header (the line
which starts with '#'), we will check whether the line starts with "#
- pack-refs with:". Before we port this check into "packed_fsck", let's
- fix "create_snapshot" to check the prefix "# packed-ref with: " instead
- of "# packed-ref with:" due to that we will always write a single
- trailing space after the colon.
-
- However, we need to consider other situations and discuss whether we
- need to add checks.
+ pack-refs with: ". However, we need to consider other situations and
+ discuss whether we need to add checks.
1. If the header does not exist, we should not report an error to the
user. This is because in older Git version, we never write header in
@@ fsck.h: enum fsck_msg_type {
FUNC(ZERO_PADDED_DATE, ERROR) \
## refs/packed-backend.c ##
-@@ refs/packed-backend.c: static struct snapshot *create_snapshot(struct packed_ref_store *refs)
-
- tmp = xmemdupz(snapshot->buf, eol - snapshot->buf);
-
-- if (!skip_prefix(tmp, "# pack-refs with:", (const char **)&p))
-+ if (!skip_prefix(tmp, "# pack-refs with: ", (const char **)&p))
- die_invalid_line(refs->path,
- snapshot->buf,
- snapshot->eof - snapshot->buf);
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
5: 9f638b3adf = 6: 7255c2b597 packed-backend: check whether the refname contains NUL characters
6: 2c5395bdd0 = 7: 7794a2ebfd packed-backend: add "packed-refs" entry consistency check
7: 648404c60d = 8: 2a9138b14d packed-backend: check whether the "packed-refs" is sorted
8: 4dbbacf44b = 9: ccde32491f builtin/fsck: add `git refs verify` child process
--
2.48.1
^ permalink raw reply [flat|nested] 168+ messages in thread
* [PATCH v6 1/9] t0602: use subshell to ensure working directory unchanged
2025-02-25 13:19 ` [PATCH v6 0/9] " shejialuo
@ 2025-02-25 13:21 ` shejialuo
2025-02-25 13:21 ` [PATCH v6 2/9] builtin/refs: get worktrees without reading head information shejialuo
` (8 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-25 13:21 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
For every test, we execute the command "cd repo" at the beginning, but
we never execute the command "cd .." to restore the working directory.
However, adding "cd .." at the end is not a good idea either: if any
command fails between "cd repo" and "cd ..", the "cd .." will never be
reached and we cannot correctly restore the working directory.
Let's use a subshell to ensure that the current working directory is
always restored to the correct path.
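The effect can be shown with a small standalone sketch, outside of the test
framework. The function name `failing_body` and the temporary directory are
made up for illustration; the point is only that a "cd" inside "( ... )"
does not leak into the caller, even when a later command in the chain fails.

```shell
#!/bin/sh
# Hypothetical stand-in for a test body whose middle step fails.
failing_body () {
	(
		cd "$1" &&
		# Simulated failure: nothing after this line runs.
		false &&
		echo "not reached"
	)
}

start_dir=$(pwd)
work_dir=$(mktemp -d)

if failing_body "$work_dir"
then
	body_status=0
else
	body_status=1
fi
# The "cd" happened in a child shell, so the caller never moved.
end_dir=$(pwd)
rmdir "$work_dir"
```

Without the parentheses, the failing chain would leave the caller inside
"$work_dir", which is the situation the commit message describes.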
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
t/t0602-reffiles-fsck.sh | 967 ++++++++++++++++++++-------------------
1 file changed, 494 insertions(+), 473 deletions(-)
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index d4a08b823b..cf7a202d0d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -14,222 +14,229 @@ test_expect_success 'ref name should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
-
- git commit --allow-empty -m initial &&
- git checkout -b default-branch &&
- git tag default-tag &&
- git tag multi_hierarchy/default-tag &&
-
- cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
- git refs verify 2>err &&
- test_must_be_empty err &&
- rm $branch_dir_prefix/@ &&
-
- cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
- git refs verify 2>err &&
- rm $tag_dir_prefix/tag-1.lock &&
- test_must_be_empty err &&
-
- cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/.lock: badRefName: invalid refname format
- EOF
- rm $tag_dir_prefix/.lock &&
- test_cmp expect err &&
-
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/$refname: badRefName: invalid refname format
- EOF
- rm "$branch_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done &&
+ (
+ cd repo &&
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/$refname: badRefName: invalid refname format
- EOF
- rm "$tag_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done &&
+ git commit --allow-empty -m initial &&
+ git checkout -b default-branch &&
+ git tag default-tag &&
+ git tag multi_hierarchy/default-tag &&
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
- EOF
- rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
- test_cmp expect err || return 1
- done &&
-
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- mkdir "$branch_dir_prefix/$refname" &&
- cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+ rm $branch_dir_prefix/@ &&
+
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
+ git refs verify 2>err &&
+ rm $tag_dir_prefix/tag-1.lock &&
+ test_must_be_empty err &&
+
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ error: refs/tags/.lock: badRefName: invalid refname format
EOF
- rm -r "$branch_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done
+ rm $tag_dir_prefix/.lock &&
+ test_cmp expect err &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname: badRefName: invalid refname format
+ EOF
+ rm "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ mkdir "$branch_dir_prefix/$refname" &&
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ EOF
+ rm -r "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success 'ref name check should be adapted into fsck messages' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- git commit --allow-empty -m initial &&
- git checkout -b branch-1 &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- git -c fsck.badRefName=warn refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/.branch-1: badRefName: invalid refname format
- EOF
- rm $branch_dir_prefix/.branch-1 &&
- test_cmp expect err &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- git -c fsck.badRefName=ignore refs verify 2>err &&
- test_must_be_empty err
+ (
+ cd repo &&
+ git commit --allow-empty -m initial &&
+ git checkout -b branch-1 &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=warn refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/.branch-1: badRefName: invalid refname format
+ EOF
+ rm $branch_dir_prefix/.branch-1 &&
+ test_cmp expect err &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=ignore refs verify 2>err &&
+ test_must_be_empty err
+ )
'
test_expect_success 'ref name check should work for multiple worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
-
- cd repo &&
- test_commit initial &&
- git checkout -b branch-1 &&
- test_commit second &&
- git checkout -b branch-2 &&
- test_commit third &&
- git checkout -b branch-3 &&
- git worktree add ./worktree-1 branch-1 &&
- git worktree add ./worktree-2 branch-2 &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
- (
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-3
- ) &&
(
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-3
- ) &&
-
- cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
- cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
- error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err &&
-
- for worktree in "worktree-1" "worktree-2"
- do
+ cd repo &&
+ test_commit initial &&
+ git checkout -b branch-1 &&
+ test_commit second &&
+ git checkout -b branch-2 &&
+ test_commit third &&
+ git checkout -b branch-3 &&
+ git worktree add ./worktree-1 branch-1 &&
+ git worktree add ./worktree-2 branch-2 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
(
- cd $worktree &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
- error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err || return 1
- )
- done
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+
+ cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
+ cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err &&
+
+ for worktree in "worktree-1" "worktree-2"
+ do
+ (
+ cd $worktree &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err || return 1
+ )
+ done
+ )
'
test_expect_success 'regular ref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
- git refs verify 2>err &&
- test_must_be_empty err &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
- for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/branch-bad &&
- test_cmp expect err || return 1
- done &&
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
- for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/a/b/branch-bad &&
- test_cmp expect err || return 1
- done &&
-
- printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $branch_dir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- for trailing_content in " garbage" " more garbage"
- do
- printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
EOF
- rm $branch_dir_prefix/branch-garbage &&
- test_cmp expect err || return 1
- done &&
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+ printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
- '\''
- EOF
- rm $branch_dir_prefix/branch-garbage-special &&
- test_cmp expect err &&
- printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+ '\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err &&
+
+ printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
- garbage'\''
- EOF
- rm $branch_dir_prefix/branch-garbage-special &&
- test_cmp expect err
+ garbage'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err
+ )
'
test_expect_success 'regular ref content should be checked (aggregate)' '
@@ -237,99 +244,103 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- bad_content_1=$(git rev-parse main)x &&
- bad_content_2=xfsazqfxcadas &&
- bad_content_3=Xfsazqfxcadas &&
- printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
- printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
- printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
- printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
- printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
- error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
- error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
- warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ bad_content_1=$(git rev-parse main)x &&
+ bad_content_2=xfsazqfxcadas &&
+ bad_content_3=Xfsazqfxcadas &&
+ printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
+ printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
+ printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
+ error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
+ error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
'
test_expect_success 'textual symref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
+ do
+ printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
- for good_referent in "refs/heads/branch" "HEAD"
- do
- printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
- rm $branch_dir_prefix/branch-good &&
- test_must_be_empty err || return 1
- done &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
- do
- printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
- test_must_fail git refs verify 2>err &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-1 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
EOF
- rm $branch_dir_prefix/branch-bad &&
- test_cmp expect err || return 1
- done &&
-
- printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $branch_dir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-1 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-2 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-3 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-complicated &&
- test_cmp expect err
+ rm $branch_dir_prefix/a/b/branch-trailing-2 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-3 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-complicated &&
+ test_cmp expect err
+ )
'
test_expect_success 'textual symref content should be checked (aggregate)' '
@@ -337,32 +348,34 @@ test_expect_success 'textual symref content should be checked (aggregate)' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
- printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
- printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
- printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
- printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
- printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
- printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
- printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
- warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
'
test_expect_success 'the target of the textual symref should be checked' '
@@ -370,28 +383,30 @@ test_expect_success 'the target of the textual symref should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
- do
- printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
- git refs verify 2>err &&
- rm $branch_dir_prefix/branch-good &&
- test_must_be_empty err || return 1
- done &&
-
- for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
- do
- printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
- EOF
- rm $branch_dir_prefix/branch-bad-1 &&
- test_cmp expect err || return 1
- done
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
+ do
+ printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success SYMLINKS 'symlink symref content should be checked' '
@@ -399,201 +414,207 @@ test_expect_success SYMLINKS 'symlink symref content should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $branch_dir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
- EOF
- rm $branch_dir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
- error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
- EOF
- rm $branch_dir_prefix/branch-symbolic-bad &&
- test_cmp expect err &&
-
- ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
- error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
- EOF
- rm $tag_dir_prefix/tag-symbolic-1 &&
- test_cmp expect err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
+ error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-bad &&
+ test_cmp expect err &&
+
+ ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
+ error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
+ EOF
+ rm $tag_dir_prefix/tag-symbolic-1 &&
+ test_cmp expect err
+ )
'
test_expect_success SYMLINKS 'symlink symref content should be checked (worktree)' '
test_when_finished "rm -rf repo" &&
git init repo &&
- cd repo &&
- test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
- main_worktree_refdir_prefix=.git/refs/heads &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
(
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
- (
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
-
- ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $worktree1_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $worktree2_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $main_worktree_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
- EOF
- rm $worktree1_refdir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- for bad_referent_name in ".tag" "branch "
- do
- ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ main_worktree_refdir_prefix=.git/refs/heads &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree1_refdir_prefix/bad-symbolic &&
+ rm $worktree1_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree1_refdir_prefix/bad-symbolic &&
+ rm $worktree2_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree2_refdir_prefix/bad-symbolic &&
+ rm $main_worktree_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
EOF
- rm $worktree2_refdir_prefix/bad-symbolic &&
- test_cmp expect err || return 1
- done
+ rm $worktree1_refdir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ for bad_referent_name in ".tag" "branch "
+ do
+ ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
- cd repo &&
- test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
(
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
- (
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
- for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
- test_must_fail git refs verify 2>err &&
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ EOF
+ rm $worktree1_refdir_prefix/bad-branch-1 &&
+ test_cmp expect err || return 1
+ done &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ EOF
+ rm $worktree2_refdir_prefix/bad-branch-2 &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
EOF
- rm $worktree1_refdir_prefix/bad-branch-1 &&
- test_cmp expect err || return 1
- done &&
+ rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
- test_must_fail git refs verify 2>err &&
+ printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
EOF
- rm $worktree2_refdir_prefix/bad-branch-2 &&
- test_cmp expect err || return 1
- done &&
-
- printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $worktree1_refdir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
- EOF
- rm $worktree1_refdir_prefix/branch-garbage &&
- test_cmp expect err
+ rm $worktree1_refdir_prefix/branch-garbage &&
+ test_cmp expect err
+ )
'
test_done
--
2.48.1
* [PATCH v6 2/9] builtin/refs: get worktrees without reading head information
2025-02-25 13:19 ` [PATCH v6 0/9] " shejialuo
2025-02-25 13:21 ` [PATCH v6 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
@ 2025-02-25 13:21 ` shejialuo
2025-02-25 13:21 ` [PATCH v6 3/9] packed-backend: check whether the "packed-refs" is regular file shejialuo
` (7 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-25 13:21 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c", functions such as "create_snapshot" and
"next_record" check the correctness of the content of the "packed-refs"
file. When anything is wrong, the program dies.
It may seem that this is irrelevant to the new checks, because we are
going to read and parse the raw "packed-refs" file without creating the
snapshot or using the ref iterator to check the consistency.
However, "get_worktrees" in "builtin/refs" parses the "HEAD"
information. If the referent of "HEAD" is inside "packed-refs", we call
"create_snapshot" to parse "packed-refs" and get the information. No
matter whether the entry of "HEAD" in "packed-refs" is correct,
"create_snapshot" calls "verify_buffer_safe" to check whether the last
line of the file ends with a newline. If not, the program dies.
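
As a rough illustration only (this is not part of the patch), the
newline condition mentioned above could be sketched in standalone C.
The helper name "last_line_is_terminated" is hypothetical, and the
sketch covers only the trailing-newline condition, not everything Git's
real "verify_buffer_safe" may check.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not Git's verify_buffer_safe(): return 1 when
 * the buffer is empty or its last byte is a newline, 0 otherwise.
 */
static int last_line_is_terminated(const char *buf, size_t len)
{
	assert(buf || !len);

	/* An empty file has no last line that could be unterminated. */
	if (!len)
		return 1;
	return buf[len - 1] == '\n';
}
```

A runtime reader that dies when such a condition is false cannot go on
to report the remaining problems in the file, which is what a
consistency checker wants to do.
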
Although this behavior does no harm, it short-circuits the program.
When users execute "git refs verify" or "git fsck", we should avoid
reading the HEAD information, because doing so may trigger a read in the
packed backend, whose stricter checks would make the program die.
Instead, we should continue and check the rest of the "packed-refs"
file completely.
Fortunately, 465a22b338 (worktree: skip reading HEAD when repairing
worktrees, 2023-12-29) introduced the function "get_worktrees_internal",
which allows us to get worktrees without reading the HEAD information.
Create a new exposed function "get_worktrees_without_reading_head" and
use it in place of "get_worktrees" in "builtin/refs".
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
builtin/refs.c | 2 +-
worktree.c | 5 +++++
worktree.h | 8 ++++++++
3 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/builtin/refs.c b/builtin/refs.c
index a29f195834..55ff5dae11 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -88,7 +88,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix,
git_config(git_fsck_config, &fsck_refs_options);
prepare_repo_settings(the_repository);
- worktrees = get_worktrees();
+ worktrees = get_worktrees_without_reading_head();
for (size_t i = 0; worktrees[i]; i++)
ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
&fsck_refs_options, worktrees[i]);
diff --git a/worktree.c b/worktree.c
index d4a68c9c23..d23482a746 100644
--- a/worktree.c
+++ b/worktree.c
@@ -198,6 +198,11 @@ struct worktree **get_worktrees(void)
return get_worktrees_internal(0);
}
+struct worktree **get_worktrees_without_reading_head(void)
+{
+ return get_worktrees_internal(1);
+}
+
const char *get_worktree_git_dir(const struct worktree *wt)
{
if (!wt)
diff --git a/worktree.h b/worktree.h
index 38145df80f..a305c7e2c7 100644
--- a/worktree.h
+++ b/worktree.h
@@ -30,6 +30,14 @@ struct worktree {
*/
struct worktree **get_worktrees(void);
+/*
+ * Like `get_worktrees`, but does not read HEAD. Skipping the HEAD read lets
+ * callers get the worktrees without worrying about failures pertaining to
+ * parsing the HEAD ref. This is useful in contexts where it is assumed that
+ * the refdb may not be in a consistent state.
+ */
+struct worktree **get_worktrees_without_reading_head(void);
+
/*
* Returns 1 if linked worktrees exist, 0 otherwise.
*/
--
2.48.1
* [PATCH v6 3/9] packed-backend: check whether the "packed-refs" is regular file
2025-02-25 13:19 ` [PATCH v6 0/9] " shejialuo
2025-02-25 13:21 ` [PATCH v6 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-25 13:21 ` [PATCH v6 2/9] builtin/refs: get worktrees without reading head information shejialuo
@ 2025-02-25 13:21 ` shejialuo
2025-02-25 17:44 ` Junio C Hamano
2025-02-25 13:21 ` [PATCH v6 4/9] packed-backend: check if header starts with "# pack-refs with: " shejialuo
` (6 subsequent siblings)
9 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-02-25 13:21 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Although "git-fsck(1)" and "packed-backend.c" check some aspects of the
consistency and correctness of the "packed-refs" file, they never check
its filetype. Let's verify that "packed-refs" has the expected
filetype, confirming that it was created by the "git pack-refs" command.
Use "lstat" to check the file mode. If we cannot stat the file because
it does not exist, this is OK: a repository may legitimately have no
"packed-refs" file.
Reuse the "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report an error
to the user if "packed-refs" is not a regular file.
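
For readers who want the decision logic in isolation, here is a minimal
standalone sketch (not the patch itself). The function name
"check_packed_refs_filetype" and its return codes are assumptions made
for illustration; the real code reports through "fsck_report_ref".

```c
#define _POSIX_C_SOURCE 200809L /* for lstat() under strict -std modes */

#include <assert.h>
#include <errno.h>
#include <sys/stat.h>

/*
 * Hypothetical stand-in for the filetype check. Returns 0 when the
 * path is missing or is a regular file, 1 when it exists but is not a
 * regular file, and -1 when lstat() fails for any other reason.
 */
static int check_packed_refs_filetype(const char *path)
{
	struct stat st;

	assert(path);

	/* lstat() does not follow symlinks, so a symlink is seen as such. */
	if (lstat(path, &st) < 0)
		return errno == ENOENT ? 0 : -1;
	return S_ISREG(st.st_mode) ? 0 : 1;
}
```

Using "lstat" rather than "stat" is what lets a symbolic link named
"packed-refs" be flagged even when it points at a regular file, which
is the situation the new test sets up.
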
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 37 +++++++++++++++++++++++++++++++++----
t/t0602-reffiles-fsck.sh | 22 ++++++++++++++++++++++
2 files changed, 55 insertions(+), 4 deletions(-)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index a7b6f74b6e..6c118119a0 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -4,6 +4,7 @@
#include "../git-compat-util.h"
#include "../config.h"
#include "../dir.h"
+#include "../fsck.h"
#include "../gettext.h"
#include "../hash.h"
#include "../hex.h"
@@ -1748,15 +1749,43 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
-static int packed_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED,
+static int packed_fsck(struct ref_store *ref_store,
+ struct fsck_options *o,
struct worktree *wt)
{
+ struct packed_ref_store *refs = packed_downcast(ref_store,
+ REF_STORE_READ, "fsck");
+ struct stat st;
+ int ret = 0;
if (!is_main_worktree(wt))
- return 0;
+ goto cleanup;
- return 0;
+ if (o->verbose)
+ fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
+
+ if (lstat(refs->path, &st) < 0) {
+ /*
+ * If the packed-refs file doesn't exist, there's nothing
+ * to check.
+ */
+ if (errno == ENOENT)
+ goto cleanup;
+ ret = error_errno(_("unable to stat '%s'"), refs->path);
+ goto cleanup;
+ }
+
+ if (!S_ISREG(st.st_mode)) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_FILETYPE,
+ "not a regular file");
+ goto cleanup;
+ }
+
+cleanup:
+ return ret;
}
struct ref_storage_be refs_be_packed = {
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index cf7a202d0d..e65ca341cd 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -617,4 +617,26 @@ test_expect_success 'ref content checks should work with worktrees' '
)
'
+test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git pack-refs --all &&
+
+ mv .git/packed-refs .git/packed-refs-back &&
+ ln -sf packed-refs-back .git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs: badRefFiletype: not a regular file
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v6 4/9] packed-backend: check if header starts with "# pack-refs with: "
2025-02-25 13:19 ` [PATCH v6 0/9] " shejialuo
` (2 preceding siblings ...)
2025-02-25 13:21 ` [PATCH v6 3/9] packed-backend: check whether the "packed-refs" is regular file shejialuo
@ 2025-02-25 13:21 ` shejialuo
2025-02-26 8:08 ` Patrick Steinhardt
2025-02-25 13:21 ` [PATCH v6 5/9] packed-backend: add "packed-refs" header consistency check shejialuo
` (5 subsequent siblings)
9 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-02-25 13:21 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
We always write a space after "# pack-refs with:". However, when
creating the packed-refs snapshot, we only check whether the header
starts with "# pack-refs with:". Before tightening the rule, we need to
make sure that doing so would not break compatibility. The following is
how some third-party libraries handle the header of the "packed-refs"
file.
1. libgit2 is fine and always writes the space. It also expects the
whitespace to exist.
2. JGit does not expect the header to have a trailing space, but expects
the "peeled" capability to have a leading space, which is mostly
equivalent because that capability is typically the first one we
write. It always writes the space.
3. gitoxide expects the space to exist and writes it.
4. go-git doesn't create the header by default.
So, we are safe to tighten the rule by checking whether the header
starts with "# pack-refs with: ".
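
To make the difference between the two rules concrete, here is a small
standalone sketch. "skip_prefix_sketch", "header_ok_loose" and
"header_ok_strict" are hypothetical names; the first only imitates the
behavior of Git's own "skip_prefix" and is not its implementation.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical re-implementation for illustration: if str starts with
 * prefix, store a pointer just past the prefix in *out and return 1;
 * otherwise return 0 and leave *out untouched.
 */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len;

	assert(str && prefix && out);
	len = strlen(prefix);
	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/* The rule before this patch: no space required after the colon. */
static int header_ok_loose(const char *line)
{
	const char *rest;
	return skip_prefix_sketch(line, "# pack-refs with:", &rest);
}

/* The rule after this patch: the trailing space is mandatory. */
static int header_ok_strict(const char *line)
{
	const char *rest;
	return skip_prefix_sketch(line, "# pack-refs with: ", &rest);
}
```

A header such as "# pack-refs with:peeled" passes the loose rule but
fails the strict one, while a header written with the space passes
both.
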
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 6c118119a0..9dabb5e556 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -694,7 +694,7 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs)
tmp = xmemdupz(snapshot->buf, eol - snapshot->buf);
- if (!skip_prefix(tmp, "# pack-refs with:", (const char **)&p))
+ if (!skip_prefix(tmp, "# pack-refs with: ", (const char **)&p))
die_invalid_line(refs->path,
snapshot->buf,
snapshot->eof - snapshot->buf);
--
2.48.1
* [PATCH v6 5/9] packed-backend: add "packed-refs" header consistency check
2025-02-25 13:19 ` [PATCH v6 0/9] " shejialuo
` (3 preceding siblings ...)
2025-02-25 13:21 ` [PATCH v6 4/9] packed-backend: check if header starts with "# pack-refs with: " shejialuo
@ 2025-02-25 13:21 ` shejialuo
2025-02-25 13:21 ` [PATCH v6 6/9] packed-backend: check whether the refname contains NUL characters shejialuo
` (4 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-25 13:21 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c::create_snapshot", if there is a header (a line
which starts with '#'), we check whether the line starts with
"# pack-refs with: ". However, we need to consider other situations and
discuss whether we need to add checks.
1. If the header does not exist, we should not report an error to the
user. This is because older Git versions never wrote a header in
the "packed-refs" file. Also, we do allow "packed-refs" without a
header at runtime.
2. If the header content does not start with "# pack-refs with: ", we
should report an error just like "create_snapshot" does. So,
create a new fsck message "badPackedRefHeader(ERROR)" for this.
3. If the header content is not the same as the constant string
"PACKED_REFS_HEADER", this is expected because we make it extensible
intentionally and the runtime "create_snapshot" won't complain about
unknown traits. In order to align with the runtime behavior, there is
no need to report anything.
As analyzed above, we only need to check case 2. In order to do this,
use the "open_nofollow" function to get the file descriptor and then
read the "packed-refs" file via "strbuf_read". Like "create_snapshot"
and other functions do, we can split the buffer into lines by finding
the next newline. When we cannot find a newline, we report an error.
So, create a function "packed_fsck_ref_next_line" to find the next
newline and, if there is no such newline, use
"packedRefEntryNotTerminated(ERROR)" to report an error to the user.
Then, parse the first line to apply the checks. Update the test to
exercise the code.
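
The line splitting and the three header cases can be sketched in
standalone C as follows. This is only an illustration under stated
assumptions: "next_line_end", "classify_header" and the "header_status"
enum are hypothetical names, and the real functions in the diff report
through "fsck_report_ref" instead of returning a status.

```c
#include <assert.h>
#include <string.h>

enum header_status {
	HEADER_ABSENT, /* case 1: no header line, nothing to report */
	HEADER_OK,     /* case 3: expected prefix, traits are not inspected */
	HEADER_BAD     /* case 2: starts with '#' but has the wrong prefix */
};

/*
 * Return the end of the line beginning at start. When no newline is
 * found before eof, set *terminated to 0 and return eof so the caller
 * can still parse the final, unterminated line.
 */
static const char *next_line_end(const char *start, const char *eof,
				 int *terminated)
{
	const char *eol;

	assert(start && eof && start <= eof && terminated);
	eol = memchr(start, '\n', eof - start);
	*terminated = eol != NULL;
	return eol ? eol : eof;
}

/* Classify the first line, given as the half-open range [start, eol). */
static enum header_status classify_header(const char *start,
					   const char *eol)
{
	static const char prefix[] = "# pack-refs with: ";
	size_t len = eol - start;
	size_t prefix_len = sizeof(prefix) - 1;

	if (!len || *start != '#')
		return HEADER_ABSENT;
	if (len < prefix_len || memcmp(start, prefix, prefix_len))
		return HEADER_BAD;
	return HEADER_OK;
}
```

Returning the end of the buffer for an unterminated line mirrors the
choice described above: the problem is reported, yet parsing can
continue to the end of the file.
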
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.adoc | 8 +++
fsck.h | 2 +
refs/packed-backend.c | 94 ++++++++++++++++++++++++++++++++++
t/t0602-reffiles-fsck.sh | 52 +++++++++++++++++++
4 files changed, 156 insertions(+)
diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index b14bc44ca4..11906f90fd 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -16,6 +16,10 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefHeader`::
+ (ERROR) The "packed-refs" file contains an invalid
+ header.
+
`badParentSha1`::
(ERROR) A commit object has a bad parent sha1.
@@ -176,6 +180,10 @@
`nullSha1`::
(WARN) Tree contains entries pointing to a null sha1.
+`packedRefEntryNotTerminated`::
+ (ERROR) The "packed-refs" file contains an entry that is
+ not terminated by a newline.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
diff --git a/fsck.h b/fsck.h
index a44c231a5f..67e3c97bc0 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
@@ -53,6 +54,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE, ERROR) \
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
+ FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 9dabb5e556..4891c86a5a 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1749,13 +1749,78 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
+static int packed_fsck_ref_next_line(struct fsck_options *o,
+ unsigned long line_number, const char *start,
+ const char *eof, const char **eol)
+{
+ int ret = 0;
+
+ *eol = memchr(start, '\n', eof - start);
+ if (!*eol) {
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
+ "'%.*s' is not terminated with a newline",
+ (int)(eof - start), start);
+
+ /*
+ * There is no newline but we still want to parse it to the end of
+ * the buffer.
+ */
+ *eol = eof;
+ strbuf_release(&packed_entry);
+ }
+
+ return ret;
+}
+
+static int packed_fsck_ref_header(struct fsck_options *o,
+ const char *start, const char *eol)
+{
+ if (!starts_with(start, "# pack-refs with: ")) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs.header";
+
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_HEADER,
+ "'%.*s' does not start with '# pack-refs with: '",
+ (int)(eol - start), start);
+ }
+
+ return 0;
+}
+
+static int packed_fsck_ref_content(struct fsck_options *o,
+ const char *start, const char *eof)
+{
+ unsigned long line_number = 1;
+ const char *eol;
+ int ret = 0;
+
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ if (*start == '#') {
+ ret |= packed_fsck_ref_header(o, start, eol);
+
+ start = eol + 1;
+ line_number++;
+ }
+
+ return ret;
+}
+
static int packed_fsck(struct ref_store *ref_store,
struct fsck_options *o,
struct worktree *wt)
{
struct packed_ref_store *refs = packed_downcast(ref_store,
REF_STORE_READ, "fsck");
+ struct strbuf packed_ref_content = STRBUF_INIT;
struct stat st;
+ int fd;
int ret = 0;
if (!is_main_worktree(wt))
@@ -1784,7 +1849,36 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
+ /*
+ * There is a chance that "packed-refs" file is removed or converted to
+ * a symlink after filetype check and before open. So we need to avoid
+ * this race condition by opening the file.
+ */
+ fd = open_nofollow(refs->path, O_RDONLY);
+ if (fd < 0) {
+ if (errno == ENOENT)
+ goto cleanup;
+
+ if (errno == ELOOP) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_FILETYPE,
+ "not a regular file");
+ goto cleanup;
+ }
+ }
+
+ if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
+ ret = error_errno(_("unable to read %s"), refs->path);
+ goto cleanup;
+ }
+
+ ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ packed_ref_content.buf + packed_ref_content.len);
+
cleanup:
+ strbuf_release(&packed_ref_content);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index e65ca341cd..e055c36e74 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -639,4 +639,56 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
)
'
+test_expect_success 'packed-refs header should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
+ for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
+ "# pack-refs with traits: peeled fully-peeled sorted " \
+ "# pack-refs with a: peeled fully-peeled" \
+ "# pack-refs with:peeled fully-peeled sorted"
+ do
+ printf "%s\n" "$bad_header" >.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs.header: badPackedRefHeader: '\''$bad_header'\'' does not start with '\''# pack-refs with: '\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err || return 1
+ done
+ )
+'
+
+test_expect_success 'packed-refs missing header should not be reported' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ printf "$(git rev-parse HEAD) refs/heads/main\n" >.git/packed-refs &&
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
+test_expect_success 'packed-refs unknown traits should not be reported' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ printf "# pack-refs with: peeled fully-peeled sorted foo\n" >.git/packed-refs &&
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
test_done
--
2.48.1
^ permalink raw reply related [flat|nested] 168+ messages in thread
* [PATCH v6 6/9] packed-backend: check whether the refname contains NUL characters
2025-02-25 13:19 ` [PATCH v6 0/9] " shejialuo
` (4 preceding siblings ...)
2025-02-25 13:21 ` [PATCH v6 5/9] packed-backend: add "packed-refs" header consistency check shejialuo
@ 2025-02-25 13:21 ` shejialuo
2025-02-25 13:22 ` [PATCH v6 7/9] packed-backend: add "packed-refs" entry consistency check shejialuo
` (3 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-25 13:21 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
"packed-backend.c::next_record" will use "check_refname_format" to check
the consistency of the refname. If it is not OK, the program will die.
However, as reported in [1], we fail to catch some corruption. Since we
already have this code path, we must be missing something.
We use the following code to get the refname:
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
In the above code, `p` is the start pointer of the refname and `eol` is
the next newline pointer. We calculate the length of the refname by
subtracting the two pointers. Then we add the memory range between `p`
and `eol` to get the refname.
However, if there are NUL characters in the memory range between `p`
and `eol`, we will treat the refname as valid as long as the memory
range between `p` and the first NUL character is a valid refname.
In order to catch the above corruption, create a new function
"refname_contains_nul" which searches for the first NUL character within
the recorded length of the refname. If one is found, the refname
contains embedded NUL characters.
Use this function in "next_record" and die if "refname_contains_nul"
returns true.
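
To make the failure mode concrete, here is a small standalone sketch.
"span_contains_nul" is a hypothetical stand-in that takes a plain
pointer and length instead of a "struct strbuf"; it is only meant to
show why a length-based search sees what a C-string function misses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the helper described above: report
 * whether the `len` bytes at `name` contain a NUL byte. The length
 * comes from pointer subtraction (eol - p), not from strlen(), so
 * bytes after an embedded NUL are still inspected.
 */
static int span_contains_nul(const char *name, size_t len)
{
	assert(name);
	return memchr(name, '\0', len) != NULL;
}
```

For the bytes "refs/heads/main\0garbage", strlen() stops at the embedded
NUL and reports the same length as for "refs/heads/main", which is why a
check that only looks at the C string accepts the truncated name.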
[1] https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/
Reported-by: R. Diez <rdiez-temp3@rd10.de>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 4891c86a5a..a74ee57776 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -494,6 +494,21 @@ static void verify_buffer_safe(struct snapshot *snapshot)
last_line, eof - last_line);
}
+/*
+ * When parsing the "packed-refs" file, we will parse it line by line.
+ * Because we know the start pointer of the refname and the next
+ * newline pointer, we could calculate the length of the refname by
+ * subtracting the two pointers. However, there is a corner case where
+ * the refname contains corrupted embedded NUL characters. And
+ * `check_refname_format()` will not catch this when the truncated
+ * refname is still a valid refname. To prevent this, we need to check
+ * whether the refname contains the NUL characters.
+ */
+static int refname_contains_nul(struct strbuf *refname)
+{
+ return !!memchr(refname->buf, '\0', refname->len);
+}
+
#define SMALL_FILE_SIZE (32*1024)
/*
@@ -895,6 +910,9 @@ static int next_record(struct packed_ref_iterator *iter)
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
+ if (refname_contains_nul(&iter->refname_buf))
+ die("packed refname contains embedded NULL: %s", iter->base.refname);
+
if (check_refname_format(iter->base.refname, REFNAME_ALLOW_ONELEVEL)) {
if (!refname_is_safe(iter->base.refname))
die("packed refname is dangerous: %s",
--
2.48.1
^ permalink raw reply related [flat|nested] 168+ messages in thread
* [PATCH v6 7/9] packed-backend: add "packed-refs" entry consistency check
2025-02-25 13:19 ` [PATCH v6 0/9] " shejialuo
` (5 preceding siblings ...)
2025-02-25 13:21 ` [PATCH v6 6/9] packed-backend: check whether the refname contains NUL characters shejialuo
@ 2025-02-25 13:22 ` shejialuo
2025-02-25 13:22 ` [PATCH v6 8/9] packed-backend: check whether the "packed-refs" is sorted shejialuo
` (2 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-25 13:22 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
"packed-backend.c::next_record" will parse the ref entry to check the
consistency. This function has already checked the following things:
1. Parse the main line of the ref entry and check whether the oid is
valid. Then, check whether the next character is a space. Then
check the refname.
2. If the next line starts with '^', continue to parse the peeled oid
and check whether the last character is '\n'.
As we have decided to implement the ref consistency check for "packed-refs",
let's port these two checks and update the test to exercise the code.
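
The two checks can be sketched in isolation as follows. This is a
simplified illustration under stated assumptions: the oid length is
fixed at 40 hex digits (SHA-1), the helper names and the
"entry_status" enum are hypothetical, and refname validation as well as
fsck reporting are left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define HEXSZ 40 /* assumption: SHA-1; SHA-256 repositories use 64 */

enum entry_status {
	ENTRY_OK,
	ENTRY_BAD_OID,
	ENTRY_NO_SPACE,
	ENTRY_TRAILING_GARBAGE,
};

/* Does [p, eol) start with HEXSZ hexadecimal digits? */
static int is_hex_oid(const char *p, const char *eol)
{
	size_t i;

	if ((size_t)(eol - p) < HEXSZ)
		return 0;
	for (i = 0; i < HEXSZ; i++)
		if (!isxdigit((unsigned char)p[i]))
			return 0;
	return 1;
}

/* Main line: "<oid> <refname>"; the refname itself is not checked here. */
static enum entry_status check_main_line(const char *start, const char *eol)
{
	assert(start <= eol);

	if (!is_hex_oid(start, eol))
		return ENTRY_BAD_OID;
	start += HEXSZ;
	if (start == eol || !isspace((unsigned char)*start))
		return ENTRY_NO_SPACE;
	return ENTRY_OK;
}

/* Peeled line: "^<oid>" with nothing between the oid and the newline. */
static enum entry_status check_peeled_line(const char *start, const char *eol)
{
	assert(start < eol && *start == '^');

	start++; /* skip the '^' */
	if (!is_hex_oid(start, eol))
		return ENTRY_BAD_OID;
	if (start + HEXSZ != eol)
		return ENTRY_TRAILING_GARBAGE;
	return ENTRY_OK;
}
```

In both functions `eol` points at the newline that ends the line (or at
the end of the buffer for an unterminated last line).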
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.adoc | 3 +
fsck.h | 1 +
refs/packed-backend.c | 122 ++++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 44 ++++++++++++
4 files changed, 169 insertions(+), 1 deletion(-)
diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index 11906f90fd..02a7bf0503 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -16,6 +16,9 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefEntry`::
+ (ERROR) The "packed-refs" file contains an invalid entry.
+
`badPackedRefHeader`::
(ERROR) The "packed-refs" file contains an invalid
header.
diff --git a/fsck.h b/fsck.h
index 67e3c97bc0..14d70f6653 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_ENTRY, ERROR) \
FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index a74ee57776..dd3f7ab255 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1812,9 +1812,114 @@ static int packed_fsck_ref_header(struct fsck_options *o,
return 0;
}
+static int packed_fsck_ref_peeled_line(struct fsck_options *o,
+ struct ref_store *ref_store,
+ unsigned long line_number,
+ const char *start, const char *eol)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct object_id peeled;
+ const char *p;
+ int ret = 0;
+
+ /*
+ * Skip the '^' and parse the peeled oid.
+ */
+ start++;
+ if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid peeled oid",
+ (int)(eol - start), start);
+ goto cleanup;
+ }
+
+ if (p != eol) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has trailing garbage after peeled oid '%.*s'",
+ (int)(eol - p), p);
+ goto cleanup;
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ return ret;
+}
+
+static int packed_fsck_ref_main_line(struct fsck_options *o,
+ struct ref_store *ref_store,
+ unsigned long line_number,
+ struct strbuf *refname,
+ const char *start, const char *eol)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct object_id oid;
+ const char *p;
+ int ret = 0;
+
+ if (parse_oid_hex_algop(start, &oid, &p, ref_store->repo->hash_algo)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid oid",
+ (int)(eol - start), start);
+ goto cleanup;
+ }
+
+ if (p == eol || !isspace(*p)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has no space after oid '%s' but with '%.*s'",
+ oid_to_hex(&oid), (int)(eol - p), p);
+ goto cleanup;
+ }
+
+ p++;
+ strbuf_reset(refname);
+ strbuf_add(refname, p, eol - p);
+ if (refname_contains_nul(refname)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "refname '%s' contains NULL binaries",
+ refname->buf);
+ }
+
+ if (check_refname_format(refname->buf, 0)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_NAME,
+ "has bad refname '%s'", refname->buf);
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
+ struct ref_store *ref_store,
const char *start, const char *eof)
{
+ struct strbuf refname = STRBUF_INIT;
unsigned long line_number = 1;
const char *eol;
int ret = 0;
@@ -1827,6 +1932,21 @@ static int packed_fsck_ref_content(struct fsck_options *o,
line_number++;
}
+ while (start < eof) {
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ ret |= packed_fsck_ref_main_line(o, ref_store, line_number, &refname, start, eol);
+ start = eol + 1;
+ line_number++;
+ if (start < eof && *start == '^') {
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ ret |= packed_fsck_ref_peeled_line(o, ref_store, line_number,
+ start, eol);
+ start = eol + 1;
+ line_number++;
+ }
+ }
+
+ strbuf_release(&refname);
return ret;
}
@@ -1892,7 +2012,7 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
- ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ ret = packed_fsck_ref_content(o, ref_store, packed_ref_content.buf,
packed_ref_content.buf + packed_ref_content.len);
cleanup:
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index e055c36e74..7421cc1e7f 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -691,4 +691,48 @@ test_expect_success 'packed-refs unknown traits should not be reported' '
)
'
+test_expect_success 'packed-refs content should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ git tag -a annotated-tag-2 -m tag-2 &&
+
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_2_oid=$(git rev-parse annotated-tag-2) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ tag_2_peeled_oid=$(git rev-parse annotated-tag-2^{}) &&
+ short_oid=$(printf "%s" $tag_1_peeled_oid | cut -c 1-4) &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $short_oid refs/heads/branch-1
+ ${branch_1_oid}x
+ $branch_2_oid refs/heads/bad-branch
+ $branch_2_oid refs/heads/branch.
+ $tag_1_oid refs/tags/annotated-tag-3
+ ^$short_oid
+ $tag_2_oid refs/tags/annotated-tag-4.
+ ^$tag_2_peeled_oid garbage
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: badPackedRefEntry: '\''$short_oid refs/heads/branch-1'\'' has invalid oid
+ error: packed-refs line 3: badPackedRefEntry: has no space after oid '\''$branch_1_oid'\'' but with '\''x'\''
+ error: packed-refs line 4: badRefName: has bad refname '\'' refs/heads/bad-branch'\''
+ error: packed-refs line 5: badRefName: has bad refname '\''refs/heads/branch.'\''
+ error: packed-refs line 7: badPackedRefEntry: '\''$short_oid'\'' has invalid peeled oid
+ error: packed-refs line 8: badRefName: has bad refname '\''refs/tags/annotated-tag-4.'\''
+ error: packed-refs line 9: badPackedRefEntry: has trailing garbage after peeled oid '\'' garbage'\''
+ EOF
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
^ permalink raw reply related [flat|nested] 168+ messages in thread
* [PATCH v6 8/9] packed-backend: check whether the "packed-refs" is sorted
2025-02-25 13:19 ` [PATCH v6 0/9] " shejialuo
` (6 preceding siblings ...)
2025-02-25 13:22 ` [PATCH v6 7/9] packed-backend: add "packed-refs" entry consistency check shejialuo
@ 2025-02-25 13:22 ` shejialuo
2025-02-25 13:22 ` [PATCH v6 9/9] builtin/fsck: add `git refs verify` child process shejialuo
2025-02-26 13:48 ` [PATCH v7 0/9] add more ref consistency checks shejialuo
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-25 13:22 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
When there is a "sorted" trait in the header of the "packed-refs" file,
it means that each entry is sorted increasingly by comparing the
refname. We should add checks to verify whether the "packed-refs" is
sorted in this case.
Update the "packed_fsck_ref_header" to know whether there is a "sorted"
trait in the header. It may seem that we could record all refnames
during the parsing process and then compare later. However, this is not
a good design due to the following reasons:
1. Because we need to store the state across the whole checking
lifetime, we would consume a lot of memory if there are many entries
in the "packed-refs" file.
2. We cannot reuse the existing compare function "cmp_packed_ref_records"
which would cause repetition.
Because "cmp_packed_ref_records" needs an extra parameter "struct
snapshot", extract the common part into a new function
"cmp_packed_refname" to reuse this function to compare.
Then, create a new function "packed_fsck_ref_sorted" to parse the file
again and use the new fsck message "packedRefUnsorted(ERROR)" to report
to the user if the file is not sorted.
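
The second pass only needs to remember the previous refname, which is
what keeps memory use constant. A minimal standalone sketch of that idea
follows. "first_unsorted_line" is a hypothetical name, and the sketch
assumes the content checks have already passed: every line ends with a
newline and every entry line starts with `hexsz` hex digits and a space.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compare two refnames that are each terminated by '\n', bytewise. */
static int cmp_newline_terminated(const char *r1, const char *r2)
{
	while (1) {
		if (*r1 == '\n')
			return *r2 == '\n' ? 0 : -1;
		if (*r1 != *r2) {
			if (*r2 == '\n')
				return 1;
			return (unsigned char)*r1 < (unsigned char)*r2 ? -1 : 1;
		}
		r1++;
		r2++;
	}
}

/*
 * Hypothetical helper: return the 1-based line number of the first
 * entry whose refname is not strictly greater than the previous one,
 * or 0 when the buffer [start, eof) is sorted. A '#' header on the
 * first line and '^' peeled lines are skipped.
 */
static unsigned long first_unsorted_line(const char *start, const char *eof,
					 size_t hexsz)
{
	const char *former = NULL;
	unsigned long line = 1;

	assert(start <= eof);

	while (start < eof) {
		const char *eol = memchr(start, '\n', (size_t)(eof - start));
		int is_header = line == 1 && *start == '#';

		assert(eol); /* precondition: every line is terminated */

		if (!is_header && *start != '^') {
			/* The refname follows the oid and one space. */
			const char *current = start + hexsz + 1;

			if (former && cmp_newline_terminated(former, current) >= 0)
				return line;
			former = current;
		}

		start = eol + 1;
		line++;
	}

	return 0;
}
```

Only two pointers into the already loaded buffer are kept, so nothing is
copied unless an error message has to be built.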
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.adoc | 3 +
fsck.h | 1 +
refs/packed-backend.c | 118 ++++++++++++++++++++++++++++-----
t/t0602-reffiles-fsck.sh | 87 ++++++++++++++++++++++++
4 files changed, 192 insertions(+), 17 deletions(-)
diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index 02a7bf0503..9601fff228 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -187,6 +187,9 @@
(ERROR) The "packed-refs" file contains an entry that is
not terminated by a newline.
+`packedRefUnsorted`::
+ (ERROR) The "packed-refs" file is not sorted.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
diff --git a/fsck.h b/fsck.h
index 14d70f6653..19f3cb2773 100644
--- a/fsck.h
+++ b/fsck.h
@@ -56,6 +56,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
+ FUNC(PACKED_REF_UNSORTED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index dd3f7ab255..75f28e283a 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -300,14 +300,9 @@ struct snapshot_record {
size_t len;
};
-static int cmp_packed_ref_records(const void *v1, const void *v2,
- void *cb_data)
-{
- const struct snapshot *snapshot = cb_data;
- const struct snapshot_record *e1 = v1, *e2 = v2;
- const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
- const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+static int cmp_packed_refname(const char *r1, const char *r2)
+{
while (1) {
if (*r1 == '\n')
return *r2 == '\n' ? 0 : -1;
@@ -322,6 +317,17 @@ static int cmp_packed_ref_records(const void *v1, const void *v2,
}
}
+static int cmp_packed_ref_records(const void *v1, const void *v2,
+ void *cb_data)
+{
+ const struct snapshot *snapshot = cb_data;
+ const struct snapshot_record *e1 = v1, *e2 = v2;
+ const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
+ const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+
+ return cmp_packed_refname(r1, r2);
+}
+
/*
* Compare a snapshot record at `rec` to the specified NUL-terminated
* refname.
@@ -1797,19 +1803,33 @@ static int packed_fsck_ref_next_line(struct fsck_options *o,
}
static int packed_fsck_ref_header(struct fsck_options *o,
- const char *start, const char *eol)
+ const char *start, const char *eol,
+ unsigned int *sorted)
{
- if (!starts_with(start, "# pack-refs with: ")) {
+ struct string_list traits = STRING_LIST_INIT_NODUP;
+ char *tmp_line;
+ int ret = 0;
+ char *p;
+
+ tmp_line = xmemdupz(start, eol - start);
+ if (!skip_prefix(tmp_line, "# pack-refs with: ", (const char **)&p)) {
struct fsck_ref_report report = { 0 };
report.path = "packed-refs.header";
- return fsck_report_ref(o, &report,
- FSCK_MSG_BAD_PACKED_REF_HEADER,
- "'%.*s' does not start with '# pack-refs with: '",
- (int)(eol - start), start);
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_HEADER,
+ "'%.*s' does not start with '# pack-refs with: '",
+ (int)(eol - start), start);
+ goto cleanup;
}
- return 0;
+ string_list_split_in_place(&traits, p, " ", -1);
+ *sorted = unsorted_string_list_has_string(&traits, "sorted");
+
+cleanup:
+ free(tmp_line);
+ string_list_clear(&traits, 0);
+ return ret;
}
static int packed_fsck_ref_peeled_line(struct fsck_options *o,
@@ -1915,8 +1935,68 @@ static int packed_fsck_ref_main_line(struct fsck_options *o,
return ret;
}
+static int packed_fsck_ref_sorted(struct fsck_options *o,
+ struct ref_store *ref_store,
+ const char *start, const char *eof)
+{
+ size_t hexsz = ref_store->repo->hash_algo->hexsz;
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct strbuf refname1 = STRBUF_INIT;
+ struct strbuf refname2 = STRBUF_INIT;
+ unsigned long line_number = 1;
+ const char *former = NULL;
+ const char *current;
+ const char *eol;
+ int ret = 0;
+
+ if (*start == '#') {
+ eol = memchr(start, '\n', eof - start);
+ start = eol + 1;
+ line_number++;
+ }
+
+ for (; start < eof; line_number++, start = eol + 1) {
+ eol = memchr(start, '\n', eof - start);
+
+ if (*start == '^')
+ continue;
+
+ if (!former) {
+ former = start + hexsz + 1;
+ continue;
+ }
+
+ current = start + hexsz + 1;
+ if (cmp_packed_refname(former, current) >= 0) {
+ const char *err_fmt =
+ "refname '%s' is less than previous refname '%s'";
+
+ eol = memchr(former, '\n', eof - former);
+ strbuf_add(&refname1, former, eol - former);
+ eol = memchr(current, '\n', eof - current);
+ strbuf_add(&refname2, current, eol - current);
+
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_UNSORTED,
+ err_fmt, refname2.buf, refname1.buf);
+ goto cleanup;
+ }
+ former = current;
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ strbuf_release(&refname1);
+ strbuf_release(&refname2);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
struct ref_store *ref_store,
+ unsigned int *sorted,
const char *start, const char *eof)
{
struct strbuf refname = STRBUF_INIT;
@@ -1926,7 +2006,7 @@ static int packed_fsck_ref_content(struct fsck_options *o,
ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
if (*start == '#') {
- ret |= packed_fsck_ref_header(o, start, eol);
+ ret |= packed_fsck_ref_header(o, start, eol, sorted);
start = eol + 1;
line_number++;
@@ -1957,9 +2037,10 @@ static int packed_fsck(struct ref_store *ref_store,
struct packed_ref_store *refs = packed_downcast(ref_store,
REF_STORE_READ, "fsck");
struct strbuf packed_ref_content = STRBUF_INIT;
+ unsigned int sorted = 0;
struct stat st;
- int fd;
int ret = 0;
+ int fd;
if (!is_main_worktree(wt))
goto cleanup;
@@ -2012,8 +2093,11 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
- ret = packed_fsck_ref_content(o, ref_store, packed_ref_content.buf,
+ ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
packed_ref_content.buf + packed_ref_content.len);
+ if (!ret && sorted)
+ ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
+ packed_ref_content.buf + packed_ref_content.len);
cleanup:
strbuf_release(&packed_ref_content);
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 7421cc1e7f..28dc8dcddc 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -735,4 +735,91 @@ test_expect_success 'packed-refs content should be checked' '
)
'
+test_expect_success 'packed-ref with sorted trait should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ EOF
+ git refs verify 2>err &&
+ rm .git/packed-refs &&
+ test_must_be_empty err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $branch_2_oid $refname1
+ EOF
+ git refs verify 2>err &&
+ rm .git/packed-refs &&
+ test_must_be_empty err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $branch_2_oid $refname1
+ $branch_1_oid $refname2
+ $tag_1_oid $refname3
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 3: packedRefUnsorted: refname '\''$refname2'\'' is less than previous refname '\''$refname1'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $tag_1_oid $refname3
+ ^$tag_1_peeled_oid
+ $branch_2_oid $refname2
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 4: packedRefUnsorted: refname '\''$refname2'\'' is less than previous refname '\''$refname3'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
+ )
+'
+
+test_expect_success 'packed-ref without sorted trait should not be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled
+ $branch_2_oid $refname1
+ $branch_1_oid $refname2
+ EOF
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
test_done
--
2.48.1
^ permalink raw reply related [flat|nested] 168+ messages in thread
* [PATCH v6 9/9] builtin/fsck: add `git refs verify` child process
2025-02-25 13:19 ` [PATCH v6 0/9] " shejialuo
` (7 preceding siblings ...)
2025-02-25 13:22 ` [PATCH v6 8/9] packed-backend: check whether the "packed-refs" is sorted shejialuo
@ 2025-02-25 13:22 ` shejialuo
2025-02-26 13:48 ` [PATCH v7 0/9] add more ref consistency checks shejialuo
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-25 13:22 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
By now, we have implemented the ref consistency checks for both
"files-backend" and "packed-backend". Although we would check some
redundant things, it won't cause trouble. So, let's integrate it into
the "git-fsck(1)" command to get feedback from the users. And also by
calling "git refs verify" in "git-fsck(1)", we make sure that the newly
added checks don't break.
Introduce a new function "fsck_refs" that initializes and runs a child
process to execute the "git refs verify" command. In order to give the
user feedback, create a progress meter whose total is 1.
It's hard to know how many loose refs we will check now. We might
improve this later.
Then, introduce the option to allow the user to disable checking ref
database consistency. Call this function first in "git-fsck(1)" because
we don't want the existing code of "git-fsck(1)", which implicitly
checks the consistency of refs, to die before these checks run.
Last, update the test to exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/git-fsck.adoc | 7 ++++++-
builtin/fsck.c | 33 ++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 39 +++++++++++++++++++++++++++++++++++++
3 files changed, 77 insertions(+), 2 deletions(-)
diff --git a/Documentation/git-fsck.adoc b/Documentation/git-fsck.adoc
index 8f32800a83..11203ba925 100644
--- a/Documentation/git-fsck.adoc
+++ b/Documentation/git-fsck.adoc
@@ -12,7 +12,7 @@ SYNOPSIS
'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
[--[no-]full] [--strict] [--verbose] [--lost-found]
[--[no-]dangling] [--[no-]progress] [--connectivity-only]
- [--[no-]name-objects] [<object>...]
+ [--[no-]name-objects] [--[no-]references] [<object>...]
DESCRIPTION
-----------
@@ -104,6 +104,11 @@ care about this output and want to speed it up further.
progress status even if the standard error stream is not
directed to a terminal.
+--[no-]references::
+ Control whether to check the references database consistency
+ via 'git refs verify'. See linkgit:git-refs[1] for details.
+ The default is to check the references database.
+
CONFIGURATION
-------------
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 7a4dcb0716..f4f395cfbd 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -50,6 +50,7 @@ static int verbose;
static int show_progress = -1;
static int show_dangling = 1;
static int name_objects;
+static int check_references = 1;
#define ERROR_OBJECT 01
#define ERROR_REACHABLE 02
#define ERROR_PACK 04
@@ -905,11 +906,37 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress)
return res;
}
+static void fsck_refs(struct repository *r)
+{
+ struct child_process refs_verify = CHILD_PROCESS_INIT;
+ struct progress *progress = NULL;
+
+ if (show_progress)
+ progress = start_progress(r, _("Checking ref database"), 1);
+
+ if (verbose)
+ fprintf_ln(stderr, _("Checking ref database"));
+
+ child_process_init(&refs_verify);
+ refs_verify.git_cmd = 1;
+ strvec_pushl(&refs_verify.args, "refs", "verify", NULL);
+ if (verbose)
+ strvec_push(&refs_verify.args, "--verbose");
+ if (check_strict)
+ strvec_push(&refs_verify.args, "--strict");
+
+ if (run_command(&refs_verify))
+ errors_found |= ERROR_REFS;
+
+ display_progress(progress, 1);
+ stop_progress(&progress);
+}
+
static char const * const fsck_usage[] = {
N_("git fsck [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]\n"
" [--[no-]full] [--strict] [--verbose] [--lost-found]\n"
" [--[no-]dangling] [--[no-]progress] [--connectivity-only]\n"
- " [--[no-]name-objects] [<object>...]"),
+ " [--[no-]name-objects] [--[no-]references] [<object>...]"),
NULL
};
@@ -928,6 +955,7 @@ static struct option fsck_opts[] = {
N_("write dangling objects in .git/lost-found")),
OPT_BOOL(0, "progress", &show_progress, N_("show progress")),
OPT_BOOL(0, "name-objects", &name_objects, N_("show verbose names for reachable objects")),
+ OPT_BOOL(0, "references", &check_references, N_("check reference database consistency")),
OPT_END(),
};
@@ -970,6 +998,9 @@ int cmd_fsck(int argc,
git_config(git_fsck_config, &fsck_obj_options);
prepare_repo_settings(the_repository);
+ if (check_references)
+ fsck_refs(the_repository);
+
if (connectivity_only) {
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
for_each_packed_object(the_repository,
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 28dc8dcddc..42e8a84739 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -822,4 +822,43 @@ test_expect_success 'packed-ref without sorted trait should not be checked' '
)
'
+test_expect_success '--[no-]references option should apply to fsck' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ (
+ cd repo &&
+ test_commit default &&
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck --references 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck --no-references 2>err &&
+ rm $branch_dir_prefix/branch-garbage &&
+ test_must_be_empty err || return 1
+ done
+ )
+'
+
test_done
--
2.48.1
* Re: [PATCH v6 3/9] packed-backend: check whether the "packed-refs" is regular file
2025-02-25 13:21 ` [PATCH v6 3/9] packed-backend: check whether the "packed-refs" is regular file shejialuo
@ 2025-02-25 17:44 ` Junio C Hamano
2025-02-26 12:05 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Junio C Hamano @ 2025-02-25 17:44 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> Although "git-fsck(1)" and "packed-backend.c" will check some
> consistency and correctness of "packed-refs" file, they never check the
> filetype of the "packed-refs". Let's verify that the "packed-refs" has
> the expected filetype, confirming it is created by "git pack-refs"
> command.
>
> Use "lstat" to check the file mode. If we cannot check the file status
> due to there is no such file this is OK because there is a possibility
> that there is no "packed-refs" in the repo.
Can this be done _after_ the open_nofollow() check you had in the
previous round noticed a problem? Even though we are trying to
notice and find problems in the given repository, it is generally
a good idea to optimize for the more common case (i.e. the file is a
regular one and not a symbolic link or directory or anything funny).
Something along the lines of
fd = open_nofollow(...);
if (fd < 0) {
lstat() to inspect the details
} else if (fstat(fd, &st) < 0) {
... cannot tell what we opened ...
} else if (!S_ISREG(st.st_mode)) {
... we opened something funny ...
} else {
... the thing is a regular file as expected ...
}
perhaps?
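The decision tree above can be approximated in shell to see which
outcomes it distinguishes. This is only a sketch under stated
assumptions: "classify_packed_refs" is a hypothetical helper, the output
strings are made up, and "test" performs separate path lookups, so it
does not give the race-free behaviour that opening the file once and
calling fstat() on the descriptor would give in C.

```shell
#!/bin/sh
# Sketch only: classify a path the way the suggested C code would,
# handling the common case (a plain regular file) first.

# Hypothetical helper; prints one of: regular, symlink, missing, other.
classify_packed_refs () {
	path=$1
	if test -f "$path" && ! test -h "$path"
	then
		echo "regular"		# common case, nothing more to inspect
	elif test -h "$path"
	then
		echo "symlink"		# what ELOOP would signal after open
	elif ! test -e "$path"
	then
		echo "missing"		# no "packed-refs" at all is acceptable
	else
		echo "other"		# directory, FIFO, or anything funny
	fi
}

tmp=$(mktemp -d) || exit 1
: >"$tmp/regular"
ln -s regular "$tmp/link"
mkdir "$tmp/dir"

for name in regular link dir missing
do
	printf '%s: %s\n' "$name" "$(classify_packed_refs "$tmp/$name")"
done

rm -rf "$tmp"
```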
* Re: [PATCH v6 4/9] packed-backend: check if header starts with "# pack-refs with: "
2025-02-25 13:21 ` [PATCH v6 4/9] packed-backend: check if header starts with "# pack-refs with: " shejialuo
@ 2025-02-26 8:08 ` Patrick Steinhardt
2025-02-26 12:28 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-26 8:08 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Tue, Feb 25, 2025 at 09:21:41PM +0800, shejialuo wrote:
> We always write a space after "# pack-refs with:". However, when
> creating the packed-ref snapshot, we only check whether the header
> starts with "# pack-refs with:". However, we need to make sure that we
> would not break compatibility by tightening the rule. The following is
> how some third-party libraries handle the header of "packed-ref" file.
>
> 1. libgit2 is fine and always writes the space. It also expects the
> whitespace to exist.
> 2. JGit does not expect th header to have a trailing space, but expects
> the "peeled" capability to have a leading space, which is mostly
> equivalent because that capability is typically the first one we
> write. It always writes the space.
> 3. gitoxide expects the space t exist and writes it.
> 4. go-git doesn't create the header by default.
>
> So, we are safe to tighten the rule by checking whether the header
> starts with "# pack-refs with: ".
The commit message nicely describes why it's safe to do the change, but
it doesn't describe why it's something we _want_ to do.
Ideally, we'd be able to argue with a technical spec of the format, but
unless I'm mistaken such a document does not exist. The next-best thing
is to do what everyone can agree on, and that seems to be to both write
and expect a space after the colon. By not following consensus that
exists in other libraries we're being more loose.
So if we for example started to stop writing the space due to a bug,
we'd still continue to parse the header alright and thus not notice the
problem, but now we have broken other implementations. That may be a
good enough justification for the change itself.
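To make the difference between the loose and the tight rule concrete,
here is a small shell sketch. "check_header" is a hypothetical helper
and the result strings are invented for illustration; the real check
lives in C in refs/packed-backend.c.

```shell
#!/bin/sh
# Sketch only: compare the loose prefix ("# pack-refs with:") with the
# tightened one ("# pack-refs with: ", including the trailing space).

# Hypothetical helper; prints how a header line would be judged.
check_header () {
	case "$1" in
	"# pack-refs with: "*)
		echo "ok" ;;
	"# pack-refs with:"*)
		# Accepted by the loose rule, rejected by the tight one.
		echo "missing space after colon" ;;
	*)
		echo "not a pack-refs header" ;;
	esac
}

check_header "# pack-refs with: peeled fully-peeled sorted"
check_header "# pack-refs with:peeled fully-peeled sorted"
check_header "# some other comment"
```

A writer that dropped the space by mistake would land in the second
branch, which is exactly the case the loose rule cannot detect.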
Patrick
* Re: [PATCH v6 3/9] packed-backend: check whether the "packed-refs" is regular file
2025-02-25 17:44 ` Junio C Hamano
@ 2025-02-26 12:05 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-26 12:05 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
On Tue, Feb 25, 2025 at 09:44:12AM -0800, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > Although "git-fsck(1)" and "packed-backend.c" will check some
> > consistency and correctness of "packed-refs" file, they never check the
> > filetype of the "packed-refs". Let's verify that the "packed-refs" has
> > the expected filetype, confirming it is created by "git pack-refs"
> > command.
> >
> > Use "lstat" to check the file mode. If we cannot check the file status
> > due to there is no such file this is OK because there is a possibility
> > that there is no "packed-refs" in the repo.
>
> Can this be done _after_ the open_nofollow() check you had in the
> previous round noticed a problem? Even though we are trying to
> notice and find problems in the given repository, it is generally
> a good idea to optimize for the more common case (i.e. the file is a
> regular one and not a symbolic link or directory or anything funny).
> Something along the lines of
>
> fd = open_nofollow(...);
> if (fd < 0) {
> lstat() to inspect the details
> } else if (fstat(fd, &st) < 0) {
> ... cannot tell what we opened ...
> } else if (!S_ISREG(st.st_mode)) {
> ... we opened something funny ...
> } else {
> ... the thing is a regular file as expected ...
> }
>
Good idea. Doing it this way makes the code cleaner. I will improve
this in the next version.
> perhaps?
* Re: [PATCH v6 4/9] packed-backend: check if header starts with "# pack-refs with: "
2025-02-26 8:08 ` Patrick Steinhardt
@ 2025-02-26 12:28 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-26 12:28 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano, Michael Haggerty
On Wed, Feb 26, 2025 at 09:08:32AM +0100, Patrick Steinhardt wrote:
> On Tue, Feb 25, 2025 at 09:21:41PM +0800, shejialuo wrote:
> > We always write a space after "# pack-refs with:". However, when
> > creating the packed-ref snapshot, we only check whether the header
> > starts with "# pack-refs with:". However, we need to make sure that we
> > would not break compatibility by tightening the rule. The following is
> > how some third-party libraries handle the header of "packed-ref" file.
> >
> > 1. libgit2 is fine and always writes the space. It also expects the
> > whitespace to exist.
> > 2. JGit does not expect th header to have a trailing space, but expects
> > the "peeled" capability to have a leading space, which is mostly
> > equivalent because that capability is typically the first one we
> > write. It always writes the space.
> > 3. gitoxide expects the space t exist and writes it.
> > 4. go-git doesn't create the header by default.
> >
> > So, we are safe to tighten the rule by checking whether the header
> > starts with "# pack-refs with: ".
>
> The commit message nicely describes why it's safe to do the change, but
> it doesn't describe why it's something we _want_ to do.
>
Yes, as you say below, we don't have a document describing the header
format. It is an internal implementation detail of Git.
> Ideally, we'd be able to argue with a technical spec of the format, but
> unless I'm mistaken such a document does not exist. The next-best thing
> is to do what everyone can agree on, and that seems to be to both write
> and expect a space after the colon. By not following consensus that
> exists in other libraries we're being more loose.
>
> So if we for example started to stop writing the space due to a bug,
> we'd still continue to parse the header alright and thus not notice the
> problem, but now we have broken other implementations. That may be a
> good enough justification for the change itself.
>
Thanks, I will improve the commit message in the next version.
> Patrick
* [PATCH v7 0/9] add more ref consistency checks
2025-02-25 13:19 ` [PATCH v6 0/9] " shejialuo
` (8 preceding siblings ...)
2025-02-25 13:22 ` [PATCH v6 9/9] builtin/fsck: add `git refs verify` child process shejialuo
@ 2025-02-26 13:48 ` shejialuo
2025-02-26 13:49 ` [PATCH v7 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
` (9 more replies)
9 siblings, 10 replies; 168+ messages in thread
From: shejialuo @ 2025-02-26 13:48 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Hi All:
This version makes the following changes:
1. [PATCH v7 3/9]: use "open_nofollow" with "fstat" to check whether the
file is regular, and update the test to improve coverage.
2. [PATCH v7 4/9]: improve the commit message as suggested by Patrick.
Thanks,
Jialuo
---
This series mainly does the following things:
1. Fix subshell issues
2. Add ref checks for packed-backend.
1. Check whether the filetype of "packed-refs" is correct.
2. Check whether the syntax of "packed-refs" is correct by using the
rules from "packed-backend.c::create_snapshot" and
"packed-backend.c::next_record".
3. Check whether the pointed object exists and whether the
"packed-refs" file is sorted.
3. Call "git refs verify" for "git-fsck(1)".
shejialuo (9):
t0602: use subshell to ensure working directory unchanged
builtin/refs: get worktrees without reading head information
packed-backend: check whether the "packed-refs" is regular file
packed-backend: check if header starts with "# pack-refs with: "
packed-backend: add "packed-refs" header consistency check
packed-backend: check whether the refname contains NUL characters
packed-backend: add "packed-refs" entry consistency check
packed-backend: check whether the "packed-refs" is sorted
builtin/fsck: add `git refs verify` child process
Documentation/fsck-msgids.adoc | 14 +
Documentation/git-fsck.adoc | 7 +-
builtin/fsck.c | 33 +-
builtin/refs.c | 2 +-
fsck.h | 4 +
refs/packed-backend.c | 361 +++++++++-
t/t0602-reffiles-fsck.sh | 1209 +++++++++++++++++++-------------
worktree.c | 5 +
worktree.h | 8 +
9 files changed, 1161 insertions(+), 482 deletions(-)
Range-diff against v6:
1: b3952d80a2 = 1: b3952d80a2 t0602: use subshell to ensure working directory unchanged
2: fa5ce20bb7 = 2: fa5ce20bb7 builtin/refs: get worktrees without reading head information
3: 787645a700 ! 3: 861583f417 packed-backend: check whether the "packed-refs" is regular file
@@ Commit message
the expected filetype, confirming it is created by "git pack-refs"
command.
- Use "lstat" to check the file mode. If we cannot check the file status
- due to there is no such file this is OK because there is a possibility
- that there is no "packed-refs" in the repo.
+ We could use "open_nofollow" wrapper to open the raw "packed-refs" file.
+ If the returned "fd" value is less than 0, we could check whether the
+ "errno" is "ELOOP" to report an error to the user. And then we use
+ "fstat" to check whether the "packed-refs" file is a regular file.
Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
the user if "packed-refs" is not a regular file.
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
+ REF_STORE_READ, "fsck");
+ struct stat st;
+ int ret = 0;
++ int fd;
if (!is_main_worktree(wt))
-- return 0;
-+ goto cleanup;
+ return 0;
- return 0;
+ if (o->verbose)
+ fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
+
-+ if (lstat(refs->path, &st) < 0) {
++ fd = open_nofollow(refs->path, O_RDONLY);
++ if (fd < 0) {
+ /*
+ * If the packed-refs file doesn't exist, there's nothing
+ * to check.
+ */
+ if (errno == ENOENT)
+ goto cleanup;
++
++ if (errno == ELOOP) {
++ struct fsck_ref_report report = { 0 };
++ report.path = "packed-refs";
++ ret = fsck_report_ref(o, &report,
++ FSCK_MSG_BAD_REF_FILETYPE,
++ "not a regular file but a symlink");
++ goto cleanup;
++ }
++
++ ret = error_errno(_("unable to open '%s'"), refs->path);
++ goto cleanup;
++ } else if (fstat(fd, &st) < 0) {
+ ret = error_errno(_("unable to stat '%s'"), refs->path);
+ goto cleanup;
-+ }
-+
-+ if (!S_ISREG(st.st_mode)) {
++ } else if (!S_ISREG(st.st_mode)) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+ ret = fsck_report_ref(o, &report,
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
+ }
+
+cleanup:
++ if (fd >= 0)
++ close(fd);
+ return ret;
}
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'ref content checks should work wi
+ ln -sf packed-refs-back .git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
-+ error: packed-refs: badRefFiletype: not a regular file
++ error: packed-refs: badRefFiletype: not a regular file but a symlink
+ EOF
+ rm .git/packed-refs &&
++ test_cmp expect err &&
++
++ mkdir .git/packed-refs &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ error: packed-refs: badRefFiletype: not a regular file
++ EOF
++ rm -r .git/packed-refs &&
+ test_cmp expect err
+ )
+'
4: f097e0f093 ! 4: 5f54cb05c3 packed-backend: check if header starts with "# pack-refs with: "
@@ Metadata
## Commit message ##
packed-backend: check if header starts with "# pack-refs with: "
- We always write a space after "# pack-refs with:". However, when
- creating the packed-ref snapshot, we only check whether the header
- starts with "# pack-refs with:". However, we need to make sure that we
- would not break compatibility by tightening the rule. The following is
- how some third-party libraries handle the header of "packed-ref" file.
+ We always write a space after "# pack-refs with:" but we don't align
+ with this rule in the "create_snapshot" method where we would check
+ whether header starts with "# pack-refs with:". It might seem that we
+ should undoubtedly tighten this rule, however, we don't have any
+ technical documentation about this and there is a possibility that we
+ would break the compatibility for other third-party libraries.
+
+ By investigating influential third-party libraries, we could conclude
+ how these libraries handle the header of "packed-refs" file:
1. libgit2 is fine and always writes the space. It also expects the
whitespace to exist.
@@ Commit message
3. gitoxide expects the space t exist and writes it.
4. go-git doesn't create the header by default.
- So, we are safe to tighten the rule by checking whether the header
- starts with "# pack-refs with: ".
+ As many third-party libraries expect a single space after "# pack-refs
+ with:", if we forget to write the space after the colon,
+ "create_snapshot" won't catch this. And we would break other
+ re-implementations. So, we'd better tighten the rule by checking whether
+ the header starts with "# pack-refs with: ".
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
5: a589a38b68 ! 5: 7d7dc899ad packed-backend: add "packed-refs" header consistency check
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
REF_STORE_READ, "fsck");
+ struct strbuf packed_ref_content = STRBUF_INIT;
struct stat st;
-+ int fd;
int ret = 0;
-
- if (!is_main_worktree(wt))
+ int fd;
@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
-+ /*
-+ * There is a chance that "packed-refs" file is removed or converted to
-+ * a symlink after filetype check and before open. So we need to avoid
-+ * this race condition by opening the file.
-+ */
-+ fd = open_nofollow(refs->path, O_RDONLY);
-+ if (fd < 0) {
-+ if (errno == ENOENT)
-+ goto cleanup;
-+
-+ if (errno == ELOOP) {
-+ struct fsck_ref_report report = { 0 };
-+ report.path = "packed-refs";
-+ ret = fsck_report_ref(o, &report,
-+ FSCK_MSG_BAD_REF_FILETYPE,
-+ "not a regular file");
-+ goto cleanup;
-+ }
-+ }
-+
+ if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
-+ ret = error_errno(_("unable to read %s"), refs->path);
++ ret = error_errno(_("unable to read '%s'"), refs->path);
+ goto cleanup;
+ }
+
@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
+ packed_ref_content.buf + packed_ref_content.len);
+
cleanup:
+ if (fd >= 0)
+ close(fd);
+ strbuf_release(&packed_ref_content);
return ret;
}
6: 7255c2b597 = 6: 571479d3e7 packed-backend: check whether the refname contains NUL characters
7: 7794a2ebfd = 7: e498a57286 packed-backend: add "packed-refs" entry consistency check
8: 2a9138b14d ! 8: 3638cb118d packed-backend: check whether the "packed-refs" is sorted
@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
struct strbuf packed_ref_content = STRBUF_INIT;
+ unsigned int sorted = 0;
struct stat st;
-- int fd;
int ret = 0;
-+ int fd;
-
- if (!is_main_worktree(wt))
- goto cleanup;
+ int fd;
@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
+ packed_ref_content.buf + packed_ref_content.len);
cleanup:
- strbuf_release(&packed_ref_content);
+ if (fd >= 0)
## t/t0602-reffiles-fsck.sh ##
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'packed-refs content should be checked' '
9: ccde32491f = 9: 5d87e76d28 builtin/fsck: add `git refs verify` child process
--
2.48.1
* [PATCH v7 1/9] t0602: use subshell to ensure working directory unchanged
2025-02-26 13:48 ` [PATCH v7 0/9] add more ref consistency checks shejialuo
@ 2025-02-26 13:49 ` shejialuo
2025-02-26 13:49 ` [PATCH v7 2/9] builtin/refs: get worktrees without reading head information shejialuo
` (8 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-26 13:49 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Every test runs "cd repo" at its start, but none of them runs "cd .." to
restore the working directory. Simply adding "cd .." at the end is not a
good idea either: if any command fails between "cd repo" and "cd ..",
the "cd .." is never reached and the working directory is not restored.
Let's use a subshell to ensure that the current working directory is
restored to the correct path.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
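A minimal standalone sketch of the behaviour the commit message above
relies on. The directory names are placeholders created in a temporary
location; nothing here is taken from the test script itself.

```shell
#!/bin/sh
# Sketch only: a "cd" inside a subshell never leaks into the caller,
# even when a later command in the subshell fails.

workdir=$(mktemp -d) &&
mkdir "$workdir/repo" &&
cd "$workdir" || exit 1

before=$(pwd)

# The subshell changes directory and then fails on purpose.
(
	cd repo &&
	false &&
	echo "not reached"
)
status=$?

after=$(pwd)

echo "subshell exit status: $status"
echo "before: $before"
echo "after:  $after"

# Clean up the temporary directory.
cd / && rm -rf "$workdir"
```

The failure is still visible to the caller through the exit status of
the subshell, so a test using this pattern fails as it should while the
working directory of the enclosing script stays untouched.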
t/t0602-reffiles-fsck.sh | 967 ++++++++++++++++++++-------------------
1 file changed, 494 insertions(+), 473 deletions(-)
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index d4a08b823b..cf7a202d0d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -14,222 +14,229 @@ test_expect_success 'ref name should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
-
- git commit --allow-empty -m initial &&
- git checkout -b default-branch &&
- git tag default-tag &&
- git tag multi_hierarchy/default-tag &&
-
- cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
- git refs verify 2>err &&
- test_must_be_empty err &&
- rm $branch_dir_prefix/@ &&
-
- cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
- git refs verify 2>err &&
- rm $tag_dir_prefix/tag-1.lock &&
- test_must_be_empty err &&
-
- cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/.lock: badRefName: invalid refname format
- EOF
- rm $tag_dir_prefix/.lock &&
- test_cmp expect err &&
-
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/$refname: badRefName: invalid refname format
- EOF
- rm "$branch_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done &&
+ (
+ cd repo &&
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/$refname: badRefName: invalid refname format
- EOF
- rm "$tag_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done &&
+ git commit --allow-empty -m initial &&
+ git checkout -b default-branch &&
+ git tag default-tag &&
+ git tag multi_hierarchy/default-tag &&
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
- EOF
- rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
- test_cmp expect err || return 1
- done &&
-
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- mkdir "$branch_dir_prefix/$refname" &&
- cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+ rm $branch_dir_prefix/@ &&
+
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
+ git refs verify 2>err &&
+ rm $tag_dir_prefix/tag-1.lock &&
+ test_must_be_empty err &&
+
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ error: refs/tags/.lock: badRefName: invalid refname format
EOF
- rm -r "$branch_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done
+ rm $tag_dir_prefix/.lock &&
+ test_cmp expect err &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname: badRefName: invalid refname format
+ EOF
+ rm "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ mkdir "$branch_dir_prefix/$refname" &&
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ EOF
+ rm -r "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success 'ref name check should be adapted into fsck messages' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- git commit --allow-empty -m initial &&
- git checkout -b branch-1 &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- git -c fsck.badRefName=warn refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/.branch-1: badRefName: invalid refname format
- EOF
- rm $branch_dir_prefix/.branch-1 &&
- test_cmp expect err &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- git -c fsck.badRefName=ignore refs verify 2>err &&
- test_must_be_empty err
+ (
+ cd repo &&
+ git commit --allow-empty -m initial &&
+ git checkout -b branch-1 &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=warn refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/.branch-1: badRefName: invalid refname format
+ EOF
+ rm $branch_dir_prefix/.branch-1 &&
+ test_cmp expect err &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=ignore refs verify 2>err &&
+ test_must_be_empty err
+ )
'
test_expect_success 'ref name check should work for multiple worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
-
- cd repo &&
- test_commit initial &&
- git checkout -b branch-1 &&
- test_commit second &&
- git checkout -b branch-2 &&
- test_commit third &&
- git checkout -b branch-3 &&
- git worktree add ./worktree-1 branch-1 &&
- git worktree add ./worktree-2 branch-2 &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
- (
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-3
- ) &&
(
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-3
- ) &&
-
- cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
- cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
- error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err &&
-
- for worktree in "worktree-1" "worktree-2"
- do
+ cd repo &&
+ test_commit initial &&
+ git checkout -b branch-1 &&
+ test_commit second &&
+ git checkout -b branch-2 &&
+ test_commit third &&
+ git checkout -b branch-3 &&
+ git worktree add ./worktree-1 branch-1 &&
+ git worktree add ./worktree-2 branch-2 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
(
- cd $worktree &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
- error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err || return 1
- )
- done
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+
+ cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
+ cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err &&
+
+ for worktree in "worktree-1" "worktree-2"
+ do
+ (
+ cd $worktree &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err || return 1
+ )
+ done
+ )
'
test_expect_success 'regular ref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
- git refs verify 2>err &&
- test_must_be_empty err &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
- for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/branch-bad &&
- test_cmp expect err || return 1
- done &&
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
- for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/a/b/branch-bad &&
- test_cmp expect err || return 1
- done &&
-
- printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $branch_dir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- for trailing_content in " garbage" " more garbage"
- do
- printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
EOF
- rm $branch_dir_prefix/branch-garbage &&
- test_cmp expect err || return 1
- done &&
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+ printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
- '\''
- EOF
- rm $branch_dir_prefix/branch-garbage-special &&
- test_cmp expect err &&
- printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+ '\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err &&
+
+ printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
- garbage'\''
- EOF
- rm $branch_dir_prefix/branch-garbage-special &&
- test_cmp expect err
+ garbage'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err
+ )
'
test_expect_success 'regular ref content should be checked (aggregate)' '
@@ -237,99 +244,103 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- bad_content_1=$(git rev-parse main)x &&
- bad_content_2=xfsazqfxcadas &&
- bad_content_3=Xfsazqfxcadas &&
- printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
- printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
- printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
- printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
- printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
- error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
- error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
- warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ bad_content_1=$(git rev-parse main)x &&
+ bad_content_2=xfsazqfxcadas &&
+ bad_content_3=Xfsazqfxcadas &&
+ printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
+ printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
+ printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
+ error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
+ error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
'
test_expect_success 'textual symref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
+ do
+ printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
- for good_referent in "refs/heads/branch" "HEAD"
- do
- printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
- rm $branch_dir_prefix/branch-good &&
- test_must_be_empty err || return 1
- done &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
- do
- printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
- test_must_fail git refs verify 2>err &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-1 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
EOF
- rm $branch_dir_prefix/branch-bad &&
- test_cmp expect err || return 1
- done &&
-
- printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $branch_dir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-1 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-2 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-3 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-complicated &&
- test_cmp expect err
+ rm $branch_dir_prefix/a/b/branch-trailing-2 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-3 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-complicated &&
+ test_cmp expect err
+ )
'
test_expect_success 'textual symref content should be checked (aggregate)' '
@@ -337,32 +348,34 @@ test_expect_success 'textual symref content should be checked (aggregate)' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
- printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
- printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
- printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
- printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
- printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
- printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
- printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
- warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
'
test_expect_success 'the target of the textual symref should be checked' '
@@ -370,28 +383,30 @@ test_expect_success 'the target of the textual symref should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
- do
- printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
- git refs verify 2>err &&
- rm $branch_dir_prefix/branch-good &&
- test_must_be_empty err || return 1
- done &&
-
- for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
- do
- printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
- EOF
- rm $branch_dir_prefix/branch-bad-1 &&
- test_cmp expect err || return 1
- done
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
+ do
+ printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success SYMLINKS 'symlink symref content should be checked' '
@@ -399,201 +414,207 @@ test_expect_success SYMLINKS 'symlink symref content should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $branch_dir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
- EOF
- rm $branch_dir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
- error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
- EOF
- rm $branch_dir_prefix/branch-symbolic-bad &&
- test_cmp expect err &&
-
- ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
- error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
- EOF
- rm $tag_dir_prefix/tag-symbolic-1 &&
- test_cmp expect err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
+ error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-bad &&
+ test_cmp expect err &&
+
+ ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
+ error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
+ EOF
+ rm $tag_dir_prefix/tag-symbolic-1 &&
+ test_cmp expect err
+ )
'
test_expect_success SYMLINKS 'symlink symref content should be checked (worktree)' '
test_when_finished "rm -rf repo" &&
git init repo &&
- cd repo &&
- test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
- main_worktree_refdir_prefix=.git/refs/heads &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
(
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
- (
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
-
- ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $worktree1_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $worktree2_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $main_worktree_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
- EOF
- rm $worktree1_refdir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- for bad_referent_name in ".tag" "branch "
- do
- ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ main_worktree_refdir_prefix=.git/refs/heads &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree1_refdir_prefix/bad-symbolic &&
+ rm $worktree1_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree1_refdir_prefix/bad-symbolic &&
+ rm $worktree2_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree2_refdir_prefix/bad-symbolic &&
+ rm $main_worktree_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
EOF
- rm $worktree2_refdir_prefix/bad-symbolic &&
- test_cmp expect err || return 1
- done
+ rm $worktree1_refdir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ for bad_referent_name in ".tag" "branch "
+ do
+ ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
- cd repo &&
- test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
(
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
- (
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
- for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
- test_must_fail git refs verify 2>err &&
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ EOF
+ rm $worktree1_refdir_prefix/bad-branch-1 &&
+ test_cmp expect err || return 1
+ done &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ EOF
+ rm $worktree2_refdir_prefix/bad-branch-2 &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
EOF
- rm $worktree1_refdir_prefix/bad-branch-1 &&
- test_cmp expect err || return 1
- done &&
+ rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
- test_must_fail git refs verify 2>err &&
+ printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
EOF
- rm $worktree2_refdir_prefix/bad-branch-2 &&
- test_cmp expect err || return 1
- done &&
-
- printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $worktree1_refdir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
- EOF
- rm $worktree1_refdir_prefix/branch-garbage &&
- test_cmp expect err
+ rm $worktree1_refdir_prefix/branch-garbage &&
+ test_cmp expect err
+ )
'
test_done
--
2.48.1
* [PATCH v7 2/9] builtin/refs: get worktrees without reading head information
2025-02-26 13:48 ` [PATCH v7 0/9] add more ref consistency checks shejialuo
2025-02-26 13:49 ` [PATCH v7 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
@ 2025-02-26 13:49 ` shejialuo
2025-02-26 13:49 ` [PATCH v7 3/9] packed-backend: check whether the "packed-refs" is regular file shejialuo
` (7 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-26 13:49 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c", functions such as "create_snapshot" and
"next_record" check the correctness of the content of the
"packed-refs" file. When anything is wrong, the program dies.
It may seem that this behavior is irrelevant to us, because we are
going to read and parse the raw "packed-refs" file ourselves, without
creating a snapshot or using the ref iterator to check consistency.
However, when using "get_worktrees" in "builtin/refs", we parse the
"HEAD" information. If the referent of "HEAD" is stored in
"packed-refs", we call "create_snapshot" to parse the "packed-refs"
file to get that information. Regardless of whether the entry for the
"HEAD" referent in "packed-refs" is correct, "create_snapshot" calls
"verify_buffer_safe" to check whether the last line of the file ends
with a newline. If not, the program dies.
Although this behavior does no harm, it short-circuits the program.
When users execute "git refs verify" or "git fsck", we should avoid
reading the HEAD information, because doing so may trigger a read in
the packed backend whose stricter checks make the program die.
Instead, we should continue and check the rest of the "packed-refs"
file completely.
Fortunately, 465a22b338 (worktree: skip reading HEAD when repairing
worktrees, 2023-12-29) introduced the function
"get_worktrees_internal", which allows us to get worktrees without
reading HEAD information.
Expose a new function "get_worktrees_without_reading_head" and use it
in place of "get_worktrees" in "builtin/refs".
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
builtin/refs.c | 2 +-
worktree.c | 5 +++++
worktree.h | 8 ++++++++
3 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/builtin/refs.c b/builtin/refs.c
index a29f195834..55ff5dae11 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -88,7 +88,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix,
git_config(git_fsck_config, &fsck_refs_options);
prepare_repo_settings(the_repository);
- worktrees = get_worktrees();
+ worktrees = get_worktrees_without_reading_head();
for (size_t i = 0; worktrees[i]; i++)
ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
&fsck_refs_options, worktrees[i]);
diff --git a/worktree.c b/worktree.c
index d4a68c9c23..d23482a746 100644
--- a/worktree.c
+++ b/worktree.c
@@ -198,6 +198,11 @@ struct worktree **get_worktrees(void)
return get_worktrees_internal(0);
}
+struct worktree **get_worktrees_without_reading_head(void)
+{
+ return get_worktrees_internal(1);
+}
+
const char *get_worktree_git_dir(const struct worktree *wt)
{
if (!wt)
diff --git a/worktree.h b/worktree.h
index 38145df80f..a305c7e2c7 100644
--- a/worktree.h
+++ b/worktree.h
@@ -30,6 +30,14 @@ struct worktree {
*/
struct worktree **get_worktrees(void);
+/*
+ * Like `get_worktrees`, but does not read HEAD. Skipping HEAD lets callers
+ * get the worktrees without worrying about failures pertaining to parsing
+ * the HEAD ref. This is useful in contexts where it is assumed that the
+ * refdb may not be in a consistent state.
+ */
+struct worktree **get_worktrees_without_reading_head(void);
+
/*
* Returns 1 if linked worktrees exist, 0 otherwise.
*/
--
2.48.1
* [PATCH v7 3/9] packed-backend: check whether the "packed-refs" is regular file
2025-02-26 13:48 ` [PATCH v7 0/9] add more ref consistency checks shejialuo
2025-02-26 13:49 ` [PATCH v7 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-26 13:49 ` [PATCH v7 2/9] builtin/refs: get worktrees without reading head information shejialuo
@ 2025-02-26 13:49 ` shejialuo
2025-02-26 18:36 ` Junio C Hamano
2025-02-26 13:50 ` [PATCH v7 4/9] packed-backend: check if header starts with "# pack-refs with: " shejialuo
` (6 subsequent siblings)
9 siblings, 1 reply; 168+ messages in thread
From: shejialuo @ 2025-02-26 13:49 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Although "git-fsck(1)" and "packed-backend.c" check some aspects of the
consistency and correctness of the "packed-refs" file, they never check
its filetype. Let's verify that "packed-refs" has the expected
filetype, i.e. that it is a regular file as written by the
"git pack-refs" command.
Use the "open_nofollow" wrapper to open the raw "packed-refs" file. If
the returned "fd" is less than 0 and "errno" is "ELOOP", the file is a
symlink, so report an error to the user. Otherwise, use "fstat" to
check whether "packed-refs" is a regular file.
Reuse the "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the
error to the user if "packed-refs" is not a regular file.
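For readers who want to try the filetype check outside of Git, here is
a minimal standalone sketch. It assumes plain open(2) with O_NOFOLLOW
as a stand-in for Git's "open_nofollow" wrapper, and the "file_kind"
codes and "classify_file" name are made up for this illustration; they
are not part of the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical result codes, used only in this sketch. */
enum file_kind {
	KIND_REGULAR,
	KIND_MISSING,
	KIND_SYMLINK,
	KIND_NOT_REGULAR,
	KIND_IO_ERROR
};

/*
 * Classify `path` without following a symlink in its last component.
 * Plain open(2) with O_NOFOLLOW stands in for Git's "open_nofollow".
 */
static enum file_kind classify_file(const char *path)
{
	struct stat st;
	enum file_kind kind;
	int fd = open(path, O_RDONLY | O_NOFOLLOW);

	if (fd < 0) {
		if (errno == ENOENT)
			return KIND_MISSING;	/* nothing to check */
		if (errno == ELOOP)
			return KIND_SYMLINK;	/* last component is a symlink */
		return KIND_IO_ERROR;
	}

	if (fstat(fd, &st) < 0)
		kind = KIND_IO_ERROR;
	else if (!S_ISREG(st.st_mode))
		kind = KIND_NOT_REGULAR;	/* e.g. a directory */
	else
		kind = KIND_REGULAR;

	close(fd);
	return kind;
}
```

Checking "errno" before calling "fstat" matters here: with O_NOFOLLOW a
symlink never yields a usable descriptor, so the symlink case can only
be detected from the failed open.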
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 50 +++++++++++++++++++++++++++++++++++++---
t/t0602-reffiles-fsck.sh | 30 ++++++++++++++++++++++++
2 files changed, 77 insertions(+), 3 deletions(-)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index a7b6f74b6e..f69a0598c7 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -4,6 +4,7 @@
#include "../git-compat-util.h"
#include "../config.h"
#include "../dir.h"
+#include "../fsck.h"
#include "../gettext.h"
#include "../hash.h"
#include "../hex.h"
@@ -1748,15 +1749,58 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
-static int packed_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED,
+static int packed_fsck(struct ref_store *ref_store,
+ struct fsck_options *o,
struct worktree *wt)
{
+ struct packed_ref_store *refs = packed_downcast(ref_store,
+ REF_STORE_READ, "fsck");
+ struct stat st;
+ int ret = 0;
+ int fd;
if (!is_main_worktree(wt))
return 0;
- return 0;
+ if (o->verbose)
+ fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
+
+ fd = open_nofollow(refs->path, O_RDONLY);
+ if (fd < 0) {
+ /*
+ * If the packed-refs file doesn't exist, there's nothing
+ * to check.
+ */
+ if (errno == ENOENT)
+ goto cleanup;
+
+ if (errno == ELOOP) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_FILETYPE,
+ "not a regular file but a symlink");
+ goto cleanup;
+ }
+
+ ret = error_errno(_("unable to open '%s'"), refs->path);
+ goto cleanup;
+ } else if (fstat(fd, &st) < 0) {
+ ret = error_errno(_("unable to stat '%s'"), refs->path);
+ goto cleanup;
+ } else if (!S_ISREG(st.st_mode)) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_FILETYPE,
+ "not a regular file");
+ goto cleanup;
+ }
+
+cleanup:
+ if (fd >= 0)
+ close(fd);
+ return ret;
}
struct ref_storage_be refs_be_packed = {
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index cf7a202d0d..68b7d4999e 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -617,4 +617,34 @@ test_expect_success 'ref content checks should work with worktrees' '
)
'
+test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git pack-refs --all &&
+
+ mv .git/packed-refs .git/packed-refs-back &&
+ ln -sf packed-refs-back .git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs: badRefFiletype: not a regular file but a symlink
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err &&
+
+ mkdir .git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs: badRefFiletype: not a regular file
+ EOF
+ rm -r .git/packed-refs &&
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v7 4/9] packed-backend: check if header starts with "# pack-refs with: "
2025-02-26 13:48 ` [PATCH v7 0/9] add more ref consistency checks shejialuo
` (2 preceding siblings ...)
2025-02-26 13:49 ` [PATCH v7 3/9] packed-backend: check whether the "packed-refs" is regular file shejialuo
@ 2025-02-26 13:50 ` shejialuo
2025-02-26 13:50 ` [PATCH v7 5/9] packed-backend: add "packed-refs" header consistency check shejialuo
` (5 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-26 13:50 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
We always write a space after "# pack-refs with:", but the
"create_snapshot" function does not enforce this: it only checks
whether the header starts with "# pack-refs with:". It might seem that
we should simply tighten this rule. However, we have no technical
documentation about this, and tightening it could break compatibility
with third-party libraries.
Investigating influential third-party libraries shows how they handle
the header of the "packed-refs" file:
1. libgit2 is fine and always writes the space. It also expects the
whitespace to exist.
2. JGit does not expect the header to have a trailing space, but expects
the "peeled" capability to have a leading space, which is mostly
equivalent because that capability is typically the first one we
write. It always writes the space.
3. gitoxide expects the space to exist and writes it.
4. go-git doesn't create the header by default.
As many third-party libraries expect a single space after "# pack-refs
with:", if we ever forgot to write the space after the colon,
"create_snapshot" would not catch it, and we would break other
re-implementations. So, we'd better tighten the rule by checking whether
the header starts with "# pack-refs with: ".
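The tightened rule can be illustrated with a small standalone sketch.
It uses "strncmp" rather than Git's "skip_prefix", and the function
name "has_strict_header_prefix" is hypothetical.

```c
#include <string.h>

/*
 * Return 1 if `line` starts with the header prefix including the
 * space after the colon, 0 otherwise. A sketch of the tightened rule;
 * Git itself uses "skip_prefix" for this.
 */
static int has_strict_header_prefix(const char *line)
{
	static const char prefix[] = "# pack-refs with: ";

	return !strncmp(line, prefix, sizeof(prefix) - 1);
}
```

With this rule, a header such as "# pack-refs with:peeled" is rejected,
while the form that Git writes is still accepted.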
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index f69a0598c7..3dd3fec459 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -694,7 +694,7 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs)
tmp = xmemdupz(snapshot->buf, eol - snapshot->buf);
- if (!skip_prefix(tmp, "# pack-refs with:", (const char **)&p))
+ if (!skip_prefix(tmp, "# pack-refs with: ", (const char **)&p))
die_invalid_line(refs->path,
snapshot->buf,
snapshot->eof - snapshot->buf);
--
2.48.1
* [PATCH v7 5/9] packed-backend: add "packed-refs" header consistency check
2025-02-26 13:48 ` [PATCH v7 0/9] add more ref consistency checks shejialuo
` (3 preceding siblings ...)
2025-02-26 13:50 ` [PATCH v7 4/9] packed-backend: check if header starts with "# pack-refs with: " shejialuo
@ 2025-02-26 13:50 ` shejialuo
2025-02-26 13:50 ` [PATCH v7 6/9] packed-backend: check whether the refname contains NUL characters shejialuo
` (4 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-26 13:50 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c::create_snapshot", if there is a header (a line
that starts with '#'), we check whether the line starts with
"# pack-refs with: ". However, we need to consider other situations and
decide whether they need checks.
1. If the header does not exist, we should not report an error to the
user. This is because older Git versions never wrote a header in
the "packed-refs" file. Also, we do allow a missing header in
"packed-refs" at runtime.
2. If the header content does not start with "# pack-refs with: ", we
should report an error just like "create_snapshot" does. So,
create a new fsck message "badPackedRefHeader(ERROR)" for this.
3. If the header content is not the same as the constant string
"PACKED_REFS_HEADER", this is expected, because we make the header
extensible intentionally and "create_snapshot" won't complain about
unknown traits at runtime. To align with the runtime behavior, there
is no need to report anything.
As analyzed above, we only need to check case 2. To do this, use the
"open_nofollow" function to get the file descriptor and then read the
"packed-refs" file via "strbuf_read". Like "create_snapshot" and other
functions do, we split lines by finding the next newline in the
buffer. When we cannot find a newline, we report an error.
So, create a function "packed_fsck_ref_next_line" to find the next
newline and, if there is no such newline, use
"packedRefEntryNotTerminated(ERROR)" to report an error to the user.
Then, parse the first line to apply the checks. Update the test to
exercise the code.
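The line-splitting step described above can be sketched without any
Git internals. The name "next_line" and its return convention are
assumptions made for this illustration; the real
"packed_fsck_ref_next_line" additionally reports the problem through
the fsck machinery.

```c
#include <stddef.h>
#include <string.h>

/*
 * Find the end of the line that begins at `start` inside [start, eof).
 * On return, *eol points at the newline, or at `eof` when the last
 * line is unterminated. Returns 1 for an unterminated line, else 0.
 */
static int next_line(const char *start, const char *eof, const char **eol)
{
	*eol = memchr(start, '\n', (size_t)(eof - start));
	if (!*eol) {
		*eol = eof;
		return 1;
	}
	return 0;
}
```

Setting "*eol" to "eof" in the unterminated case lets the caller keep
parsing the final line instead of stopping at the first problem.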
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.adoc | 8 ++++
fsck.h | 2 +
refs/packed-backend.c | 73 ++++++++++++++++++++++++++++++++++
t/t0602-reffiles-fsck.sh | 52 ++++++++++++++++++++++++
4 files changed, 135 insertions(+)
diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index b14bc44ca4..11906f90fd 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -16,6 +16,10 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefHeader`::
+ (ERROR) The "packed-refs" file contains an invalid
+ header.
+
`badParentSha1`::
(ERROR) A commit object has a bad parent sha1.
@@ -176,6 +180,10 @@
`nullSha1`::
(WARN) Tree contains entries pointing to a null sha1.
+`packedRefEntryNotTerminated`::
+ (ERROR) The "packed-refs" file contains an entry that is
+ not terminated by a newline.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
diff --git a/fsck.h b/fsck.h
index a44c231a5f..67e3c97bc0 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
@@ -53,6 +54,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE, ERROR) \
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
+ FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 3dd3fec459..b00fca6501 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1749,12 +1749,76 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
+static int packed_fsck_ref_next_line(struct fsck_options *o,
+ unsigned long line_number, const char *start,
+ const char *eof, const char **eol)
+{
+ int ret = 0;
+
+ *eol = memchr(start, '\n', eof - start);
+ if (!*eol) {
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
+ "'%.*s' is not terminated with a newline",
+ (int)(eof - start), start);
+
+ /*
+ * There is no newline but we still want to parse it to the end of
+ * the buffer.
+ */
+ *eol = eof;
+ strbuf_release(&packed_entry);
+ }
+
+ return ret;
+}
+
+static int packed_fsck_ref_header(struct fsck_options *o,
+ const char *start, const char *eol)
+{
+ if (!starts_with(start, "# pack-refs with: ")) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs.header";
+
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_HEADER,
+ "'%.*s' does not start with '# pack-refs with: '",
+ (int)(eol - start), start);
+ }
+
+ return 0;
+}
+
+static int packed_fsck_ref_content(struct fsck_options *o,
+ const char *start, const char *eof)
+{
+ unsigned long line_number = 1;
+ const char *eol;
+ int ret = 0;
+
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ if (*start == '#') {
+ ret |= packed_fsck_ref_header(o, start, eol);
+
+ start = eol + 1;
+ line_number++;
+ }
+
+ return ret;
+}
+
static int packed_fsck(struct ref_store *ref_store,
struct fsck_options *o,
struct worktree *wt)
{
struct packed_ref_store *refs = packed_downcast(ref_store,
REF_STORE_READ, "fsck");
+ struct strbuf packed_ref_content = STRBUF_INIT;
struct stat st;
int ret = 0;
int fd;
@@ -1797,9 +1861,18 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
+ if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
+ ret = error_errno(_("unable to read '%s'"), refs->path);
+ goto cleanup;
+ }
+
+ ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ packed_ref_content.buf + packed_ref_content.len);
+
cleanup:
if (fd >= 0)
close(fd);
+ strbuf_release(&packed_ref_content);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 68b7d4999e..74d876984d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -647,4 +647,56 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
)
'
+test_expect_success 'packed-refs header should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
+ for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
+ "# pack-refs with traits: peeled fully-peeled sorted " \
+ "# pack-refs with a: peeled fully-peeled" \
+ "# pack-refs with:peeled fully-peeled sorted"
+ do
+ printf "%s\n" "$bad_header" >.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs.header: badPackedRefHeader: '\''$bad_header'\'' does not start with '\''# pack-refs with: '\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err || return 1
+ done
+ )
+'
+
+test_expect_success 'packed-refs missing header should not be reported' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ printf "$(git rev-parse HEAD) refs/heads/main\n" >.git/packed-refs &&
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
+test_expect_success 'packed-refs unknown traits should not be reported' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ printf "# pack-refs with: peeled fully-peeled sorted foo\n" >.git/packed-refs &&
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v7 6/9] packed-backend: check whether the refname contains NUL characters
2025-02-26 13:48 ` [PATCH v7 0/9] add more ref consistency checks shejialuo
` (4 preceding siblings ...)
2025-02-26 13:50 ` [PATCH v7 5/9] packed-backend: add "packed-refs" header consistency check shejialuo
@ 2025-02-26 13:50 ` shejialuo
2025-02-26 13:50 ` [PATCH v7 7/9] packed-backend: add "packed-refs" entry consistency check shejialuo
` (3 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-26 13:50 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
"packed-backend.c::next_record" will use "check_refname_format" to check
the consistency of the refname. If the refname is invalid, the program
dies. However, as reported in [1], some corruption is not caught even
though this code path exists, so we must be missing something.
We use the following code to get the refname:
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
In the above code, `p` is the start pointer of the refname and `eol` is
the next newline pointer. We calculate the length of the refname by
subtracting the two pointers. Then we add the memory range between `p`
and `eol` to get the refname.
However, if there are NUL characters in the memory range between `p`
and `eol`, we treat the refname as valid as long as the memory range
between `p` and the first NUL character is a valid refname.
To catch this corruption, create a new function
"refname_contains_nul" that searches for the first NUL character. If
one is found within the length of the refname, the refname contains
embedded NUL characters.
Use this function in "next_record" to die if
"refname_contains_nul" returns true.
[1] https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/
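To see why a string-based check misses this corruption, consider the
following standalone sketch. The helper "contains_nul" is a
hypothetical equivalent of "refname_contains_nul" that takes a plain
buffer and length instead of a "struct strbuf".

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the first `len` bytes of `buf` contain a NUL byte. */
static int contains_nul(const char *buf, size_t len)
{
	return !!memchr(buf, '\0', len);
}
```

For a buffer holding "refs/heads/main", a NUL byte and then "garbage",
"strlen" stops at the NUL and sees only the valid-looking
"refs/heads/main", whereas "contains_nul" called with the full length
between `p` and `eol` reports the embedded NUL.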
Reported-by: R. Diez <rdiez-temp3@rd10.de>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index b00fca6501..6e7d08c565 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -494,6 +494,21 @@ static void verify_buffer_safe(struct snapshot *snapshot)
last_line, eof - last_line);
}
+/*
+ * When parsing the "packed-refs" file, we will parse it line by line.
+ * Because we know the start pointer of the refname and the next
+ * newline pointer, we could calculate the length of the refname by
+ * subtracting the two pointers. However, there is a corner case where
+ * the refname contains corrupted embedded NUL characters. And
+ * `check_refname_format()` will not catch this when the truncated
+ * refname is still a valid refname. To prevent this, we need to check
+ * whether the refname contains the NUL characters.
+ */
+static int refname_contains_nul(struct strbuf *refname)
+{
+ return !!memchr(refname->buf, '\0', refname->len);
+}
+
#define SMALL_FILE_SIZE (32*1024)
/*
@@ -895,6 +910,9 @@ static int next_record(struct packed_ref_iterator *iter)
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
+ if (refname_contains_nul(&iter->refname_buf))
+ die("packed refname contains embedded NULL: %s", iter->base.refname);
+
if (check_refname_format(iter->base.refname, REFNAME_ALLOW_ONELEVEL)) {
if (!refname_is_safe(iter->base.refname))
die("packed refname is dangerous: %s",
--
2.48.1
* [PATCH v7 7/9] packed-backend: add "packed-refs" entry consistency check
2025-02-26 13:48 ` [PATCH v7 0/9] add more ref consistency checks shejialuo
` (5 preceding siblings ...)
2025-02-26 13:50 ` [PATCH v7 6/9] packed-backend: check whether the refname contains NUL characters shejialuo
@ 2025-02-26 13:50 ` shejialuo
2025-02-26 13:50 ` [PATCH v7 8/9] packed-backend: check whether the "packed-refs" is sorted shejialuo
` (2 subsequent siblings)
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-26 13:50 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
"packed-backend.c::next_record" will parse the ref entry to check the
consistency. This function already checks the following things:
1. Parse the main line of the ref entry and check whether the oid is
valid. Then, check whether the next character is a space. Then
check the refname.
2. If the next line starts with '^', continue to parse the
peeled oid and check whether the last character is '\n'.
As we have decided to implement the ref consistency check for
"packed-refs", let's port these two checks and update the test to
exercise the code.
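The main-line check can be sketched in isolation as follows. This is a
simplification under stated assumptions: the oid is validated only as
a run of `hexsz` hex digits instead of going through
"parse_oid_hex_algop", the refname itself is not validated, and the
"entry_status" codes and "check_main_line" name are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result codes, used only in this sketch. */
enum entry_status {
	ENTRY_OK,
	ENTRY_BAD_OID,
	ENTRY_NO_SPACE
};

/*
 * Check one "<oid> <refname>" line spanning [start, eol). `hexsz` is
 * the number of hex digits in an oid (40 for SHA-1). On success,
 * *refname points at the first byte of the refname.
 */
static enum entry_status check_main_line(const char *start, const char *eol,
					 size_t hexsz, const char **refname)
{
	const char *p = start;
	size_t i;

	/* The line must be long enough to hold a full oid. */
	if ((size_t)(eol - start) < hexsz)
		return ENTRY_BAD_OID;
	for (i = 0; i < hexsz; i++, p++)
		if (!isxdigit((unsigned char)*p))
			return ENTRY_BAD_OID;

	/* The oid must be followed by a whitespace separator. */
	if (p == eol || !isspace((unsigned char)*p))
		return ENTRY_NO_SPACE;

	*refname = p + 1;
	return ENTRY_OK;
}
```

A peeled line can be handled the same way after skipping the leading
'^', with the extra requirement that the oid ends exactly at `eol`.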
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.adoc | 3 +
fsck.h | 1 +
refs/packed-backend.c | 122 ++++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 44 ++++++++++++
4 files changed, 169 insertions(+), 1 deletion(-)
diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index 11906f90fd..02a7bf0503 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -16,6 +16,9 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefEntry`::
+ (ERROR) The "packed-refs" file contains an invalid entry.
+
`badPackedRefHeader`::
(ERROR) The "packed-refs" file contains an invalid
header.
diff --git a/fsck.h b/fsck.h
index 67e3c97bc0..14d70f6653 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_ENTRY, ERROR) \
FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 6e7d08c565..8c410fca77 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1812,9 +1812,114 @@ static int packed_fsck_ref_header(struct fsck_options *o,
return 0;
}
+static int packed_fsck_ref_peeled_line(struct fsck_options *o,
+ struct ref_store *ref_store,
+ unsigned long line_number,
+ const char *start, const char *eol)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct object_id peeled;
+ const char *p;
+ int ret = 0;
+
+ /*
+ * Skip the '^' and parse the peeled oid.
+ */
+ start++;
+ if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid peeled oid",
+ (int)(eol - start), start);
+ goto cleanup;
+ }
+
+ if (p != eol) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has trailing garbage after peeled oid '%.*s'",
+ (int)(eol - p), p);
+ goto cleanup;
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ return ret;
+}
+
+static int packed_fsck_ref_main_line(struct fsck_options *o,
+ struct ref_store *ref_store,
+ unsigned long line_number,
+ struct strbuf *refname,
+ const char *start, const char *eol)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct object_id oid;
+ const char *p;
+ int ret = 0;
+
+ if (parse_oid_hex_algop(start, &oid, &p, ref_store->repo->hash_algo)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid oid",
+ (int)(eol - start), start);
+ goto cleanup;
+ }
+
+ if (p == eol || !isspace(*p)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has no space after oid '%s' but with '%.*s'",
+ oid_to_hex(&oid), (int)(eol - p), p);
+ goto cleanup;
+ }
+
+ p++;
+ strbuf_reset(refname);
+ strbuf_add(refname, p, eol - p);
+ if (refname_contains_nul(refname)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "refname '%s' contains NULL binaries",
+ refname->buf);
+ }
+
+ if (check_refname_format(refname->buf, 0)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_NAME,
+ "has bad refname '%s'", refname->buf);
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
+ struct ref_store *ref_store,
const char *start, const char *eof)
{
+ struct strbuf refname = STRBUF_INIT;
unsigned long line_number = 1;
const char *eol;
int ret = 0;
@@ -1827,6 +1932,21 @@ static int packed_fsck_ref_content(struct fsck_options *o,
line_number++;
}
+ while (start < eof) {
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ ret |= packed_fsck_ref_main_line(o, ref_store, line_number, &refname, start, eol);
+ start = eol + 1;
+ line_number++;
+ if (start < eof && *start == '^') {
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ ret |= packed_fsck_ref_peeled_line(o, ref_store, line_number,
+ start, eol);
+ start = eol + 1;
+ line_number++;
+ }
+ }
+
+ strbuf_release(&refname);
return ret;
}
@@ -1884,7 +2004,7 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
- ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ ret = packed_fsck_ref_content(o, ref_store, packed_ref_content.buf,
packed_ref_content.buf + packed_ref_content.len);
cleanup:
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 74d876984d..a88c792ce1 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -699,4 +699,48 @@ test_expect_success 'packed-refs unknown traits should not be reported' '
)
'
+test_expect_success 'packed-refs content should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ git tag -a annotated-tag-2 -m tag-2 &&
+
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_2_oid=$(git rev-parse annotated-tag-2) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ tag_2_peeled_oid=$(git rev-parse annotated-tag-2^{}) &&
+ short_oid=$(printf "%s" $tag_1_peeled_oid | cut -c 1-4) &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $short_oid refs/heads/branch-1
+ ${branch_1_oid}x
+ $branch_2_oid refs/heads/bad-branch
+ $branch_2_oid refs/heads/branch.
+ $tag_1_oid refs/tags/annotated-tag-3
+ ^$short_oid
+ $tag_2_oid refs/tags/annotated-tag-4.
+ ^$tag_2_peeled_oid garbage
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: badPackedRefEntry: '\''$short_oid refs/heads/branch-1'\'' has invalid oid
+ error: packed-refs line 3: badPackedRefEntry: has no space after oid '\''$branch_1_oid'\'' but with '\''x'\''
+ error: packed-refs line 4: badRefName: has bad refname '\'' refs/heads/bad-branch'\''
+ error: packed-refs line 5: badRefName: has bad refname '\''refs/heads/branch.'\''
+ error: packed-refs line 7: badPackedRefEntry: '\''$short_oid'\'' has invalid peeled oid
+ error: packed-refs line 8: badRefName: has bad refname '\''refs/tags/annotated-tag-4.'\''
+ error: packed-refs line 9: badPackedRefEntry: has trailing garbage after peeled oid '\'' garbage'\''
+ EOF
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v7 8/9] packed-backend: check whether the "packed-refs" is sorted
2025-02-26 13:48 ` [PATCH v7 0/9] add more ref consistency checks shejialuo
` (6 preceding siblings ...)
2025-02-26 13:50 ` [PATCH v7 7/9] packed-backend: add "packed-refs" entry consistency check shejialuo
@ 2025-02-26 13:50 ` shejialuo
2025-02-26 13:51 ` [PATCH v7 9/9] builtin/fsck: add `git refs verify` child process shejialuo
2025-02-27 16:03 ` [PATCH v8 0/9] add more ref consistency checks shejialuo
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-26 13:50 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
When there is a "sorted" trait in the header of the "packed-refs" file,
it means that each entry is sorted increasingly by comparing the
refname. We should add checks to verify whether the "packed-refs" is
sorted in this case.
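As a rough illustration of the header rule described above, detecting the "sorted" trait could be sketched as follows. This is only a sketch under stated assumptions: the function name "header_has_trait" is hypothetical, the line length excludes the trailing newline, and traits are assumed to be separated by single spaces. The patch itself uses Git's string_list helpers instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): return 1 if the header line
 * "# pack-refs with: <trait> <trait> ..." lists `trait` as a whole
 * token, 0 otherwise. `len` excludes the trailing newline.
 */
static int header_has_trait(const char *line, size_t len, const char *trait)
{
	static const char prefix[] = "# pack-refs with: ";
	size_t plen = sizeof(prefix) - 1;
	size_t tlen = strlen(trait);
	const char *end = line + len;
	const char *p;

	assert(tlen > 0);

	/* A line without the expected prefix carries no traits. */
	if (len < plen || memcmp(line, prefix, plen))
		return 0;

	p = line + plen;
	while (p < end) {
		const char *sp = memchr(p, ' ', (size_t)(end - p));
		const char *tok_end = sp ? sp : end;

		/* Compare whole tokens only, so "unsorted" is not "sorted". */
		if ((size_t)(tok_end - p) == tlen && !memcmp(p, trait, tlen))
			return 1;
		p = sp ? sp + 1 : end;
	}
	return 0;
}
```

Matching whole tokens rather than substrings is the design point here: a substring search would wrongly accept a trait whose name merely contains "sorted".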
Update "packed_fsck_ref_header" to record whether there is a "sorted"
trait in the header. It may seem that we could record all refnames
during the parsing process and then compare them later. However, this is
not a good design for the following reasons:
1. Because we need to store the state across the whole checking
lifetime, we would consume a lot of memory if there are many entries
in the "packed-refs" file.
2. We cannot reuse the existing compare function "cmp_packed_ref_records",
which causes repetition.
Because "cmp_packed_ref_records" needs an extra parameter "struct
snapshot", extract the common part into a new function
"cmp_packed_refname" so that it can be reused for the comparison.
Then, create a new function "packed_fsck_ref_sorted" to parse the file
again and use the new fsck message "packedRefUnsorted(ERROR)" to report
to the user if the file is not sorted.
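The single-pass approach described above can be sketched in isolation. This is a simplified illustration, not the code from the diff below: "cmp_refname_lines" and "first_unsorted_line" are hypothetical names, "hexsz" is passed in directly instead of being taken from the repository's hash algorithm, and the buffer is assumed to have passed the earlier syntax checks (every line ends with a newline and every entry is at least "hexsz + 1" bytes long). Unlike the patch, the sketch skips any line that starts with '#', not just the first one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compare two newline-terminated refnames byte by byte. */
static int cmp_refname_lines(const char *r1, const char *r2)
{
	while (1) {
		if (*r1 == '\n')
			return *r2 == '\n' ? 0 : -1;
		if (*r1 != *r2) {
			if (*r2 == '\n')
				return 1;
			return (unsigned char)*r1 < (unsigned char)*r2 ? -1 : 1;
		}
		r1++;
		r2++;
	}
}

/*
 * Return 0 if the refnames are strictly increasing, otherwise the
 * 1-based line number of the first entry that is out of order.
 * Only the previous refname is remembered, so no per-entry state
 * has to be stored while scanning.
 */
static unsigned long first_unsorted_line(const char *buf, size_t len,
					 size_t hexsz)
{
	const char *p = buf, *eof = buf + len;
	const char *former = NULL;
	unsigned long line = 1;

	while (p < eof) {
		const char *eol = memchr(p, '\n', (size_t)(eof - p));

		assert(eol); /* assumed: earlier checks rejected unterminated lines */

		/* Header and peeled lines carry no refname. */
		if (*p != '#' && *p != '^') {
			const char *current = p + hexsz + 1;

			if (former && cmp_refname_lines(former, current) >= 0)
				return line;
			former = current;
		}
		p = eol + 1;
		line++;
	}
	return 0;
}
```

Because only adjacent entries are compared, memory use stays constant regardless of how many entries the file contains, which is the concern raised in point 1 above.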
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.adoc | 3 +
fsck.h | 1 +
refs/packed-backend.c | 116 ++++++++++++++++++++++++++++-----
t/t0602-reffiles-fsck.sh | 87 +++++++++++++++++++++++++
4 files changed, 191 insertions(+), 16 deletions(-)
diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index 02a7bf0503..9601fff228 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -187,6 +187,9 @@
(ERROR) The "packed-refs" file contains an entry that is
not terminated by a newline.
+`packedRefUnsorted`::
+ (ERROR) The "packed-refs" file is not sorted.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
diff --git a/fsck.h b/fsck.h
index 14d70f6653..19f3cb2773 100644
--- a/fsck.h
+++ b/fsck.h
@@ -56,6 +56,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
+ FUNC(PACKED_REF_UNSORTED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 8c410fca77..a1710d7c2a 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -300,14 +300,9 @@ struct snapshot_record {
size_t len;
};
-static int cmp_packed_ref_records(const void *v1, const void *v2,
- void *cb_data)
-{
- const struct snapshot *snapshot = cb_data;
- const struct snapshot_record *e1 = v1, *e2 = v2;
- const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
- const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+static int cmp_packed_refname(const char *r1, const char *r2)
+{
while (1) {
if (*r1 == '\n')
return *r2 == '\n' ? 0 : -1;
@@ -322,6 +317,17 @@ static int cmp_packed_ref_records(const void *v1, const void *v2,
}
}
+static int cmp_packed_ref_records(const void *v1, const void *v2,
+ void *cb_data)
+{
+ const struct snapshot *snapshot = cb_data;
+ const struct snapshot_record *e1 = v1, *e2 = v2;
+ const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
+ const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+
+ return cmp_packed_refname(r1, r2);
+}
+
/*
* Compare a snapshot record at `rec` to the specified NUL-terminated
* refname.
@@ -1797,19 +1803,33 @@ static int packed_fsck_ref_next_line(struct fsck_options *o,
}
static int packed_fsck_ref_header(struct fsck_options *o,
- const char *start, const char *eol)
+ const char *start, const char *eol,
+ unsigned int *sorted)
{
- if (!starts_with(start, "# pack-refs with: ")) {
+ struct string_list traits = STRING_LIST_INIT_NODUP;
+ char *tmp_line;
+ int ret = 0;
+ char *p;
+
+ tmp_line = xmemdupz(start, eol - start);
+ if (!skip_prefix(tmp_line, "# pack-refs with: ", (const char **)&p)) {
struct fsck_ref_report report = { 0 };
report.path = "packed-refs.header";
- return fsck_report_ref(o, &report,
- FSCK_MSG_BAD_PACKED_REF_HEADER,
- "'%.*s' does not start with '# pack-refs with: '",
- (int)(eol - start), start);
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_HEADER,
+ "'%.*s' does not start with '# pack-refs with: '",
+ (int)(eol - start), start);
+ goto cleanup;
}
- return 0;
+ string_list_split_in_place(&traits, p, " ", -1);
+ *sorted = unsorted_string_list_has_string(&traits, "sorted");
+
+cleanup:
+ free(tmp_line);
+ string_list_clear(&traits, 0);
+ return ret;
}
static int packed_fsck_ref_peeled_line(struct fsck_options *o,
@@ -1915,8 +1935,68 @@ static int packed_fsck_ref_main_line(struct fsck_options *o,
return ret;
}
+static int packed_fsck_ref_sorted(struct fsck_options *o,
+ struct ref_store *ref_store,
+ const char *start, const char *eof)
+{
+ size_t hexsz = ref_store->repo->hash_algo->hexsz;
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct strbuf refname1 = STRBUF_INIT;
+ struct strbuf refname2 = STRBUF_INIT;
+ unsigned long line_number = 1;
+ const char *former = NULL;
+ const char *current;
+ const char *eol;
+ int ret = 0;
+
+ if (*start == '#') {
+ eol = memchr(start, '\n', eof - start);
+ start = eol + 1;
+ line_number++;
+ }
+
+ for (; start < eof; line_number++, start = eol + 1) {
+ eol = memchr(start, '\n', eof - start);
+
+ if (*start == '^')
+ continue;
+
+ if (!former) {
+ former = start + hexsz + 1;
+ continue;
+ }
+
+ current = start + hexsz + 1;
+ if (cmp_packed_refname(former, current) >= 0) {
+ const char *err_fmt =
+ "refname '%s' is less than previous refname '%s'";
+
+ eol = memchr(former, '\n', eof - former);
+ strbuf_add(&refname1, former, eol - former);
+ eol = memchr(current, '\n', eof - current);
+ strbuf_add(&refname2, current, eol - current);
+
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_UNSORTED,
+ err_fmt, refname2.buf, refname1.buf);
+ goto cleanup;
+ }
+ former = current;
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ strbuf_release(&refname1);
+ strbuf_release(&refname2);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
struct ref_store *ref_store,
+ unsigned int *sorted,
const char *start, const char *eof)
{
struct strbuf refname = STRBUF_INIT;
@@ -1926,7 +2006,7 @@ static int packed_fsck_ref_content(struct fsck_options *o,
ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
if (*start == '#') {
- ret |= packed_fsck_ref_header(o, start, eol);
+ ret |= packed_fsck_ref_header(o, start, eol, sorted);
start = eol + 1;
line_number++;
@@ -1957,6 +2037,7 @@ static int packed_fsck(struct ref_store *ref_store,
struct packed_ref_store *refs = packed_downcast(ref_store,
REF_STORE_READ, "fsck");
struct strbuf packed_ref_content = STRBUF_INIT;
+ unsigned int sorted = 0;
struct stat st;
int ret = 0;
int fd;
@@ -2004,8 +2085,11 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
- ret = packed_fsck_ref_content(o, ref_store, packed_ref_content.buf,
+ ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
packed_ref_content.buf + packed_ref_content.len);
+ if (!ret && sorted)
+ ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
+ packed_ref_content.buf + packed_ref_content.len);
cleanup:
if (fd >= 0)
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index a88c792ce1..767e2bd4a0 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -743,4 +743,91 @@ test_expect_success 'packed-refs content should be checked' '
)
'
+test_expect_success 'packed-ref with sorted trait should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ EOF
+ git refs verify 2>err &&
+ rm .git/packed-refs &&
+ test_must_be_empty err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $branch_2_oid $refname1
+ EOF
+ git refs verify 2>err &&
+ rm .git/packed-refs &&
+ test_must_be_empty err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $branch_2_oid $refname1
+ $branch_1_oid $refname2
+ $tag_1_oid $refname3
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 3: packedRefUnsorted: refname '\''$refname2'\'' is less than previous refname '\''$refname1'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $tag_1_oid $refname3
+ ^$tag_1_peeled_oid
+ $branch_2_oid $refname2
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 4: packedRefUnsorted: refname '\''$refname2'\'' is less than previous refname '\''$refname3'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
+ )
+'
+
+test_expect_success 'packed-ref without sorted trait should not be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled
+ $branch_2_oid $refname1
+ $branch_1_oid $refname2
+ EOF
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v7 9/9] builtin/fsck: add `git refs verify` child process
2025-02-26 13:48 ` [PATCH v7 0/9] add more ref consistency checks shejialuo
` (7 preceding siblings ...)
2025-02-26 13:50 ` [PATCH v7 8/9] packed-backend: check whether the "packed-refs" is sorted shejialuo
@ 2025-02-26 13:51 ` shejialuo
2025-02-27 16:03 ` [PATCH v8 0/9] add more ref consistency checks shejialuo
9 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-26 13:51 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
By now, we have implemented the ref consistency checks for both
"files-backend" and "packed-backend". Although we would check some
redundant things, it won't cause trouble. So, let's integrate them into
the "git-fsck(1)" command to get feedback from the users. By calling
"git refs verify" in "git-fsck(1)", we also make sure that the newly
added checks don't break.
Introduce a new function "fsck_refs" that initializes and runs a child
process to execute the "git refs verify" command. To give the user
feedback, create a progress meter whose total is 1. It's hard to know
how many loose refs we will check now. We might improve this later.
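The general shape of "run a child process and fold its exit status into an error bitmask" can be sketched with plain POSIX calls. This is only an illustration under stated assumptions: the patch uses Git's internal run-command API rather than fork/exec, and "run_child", "check_refs" and "SKETCH_ERROR_REFS" (including its value) are hypothetical names made up for this sketch.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical error bit; the value is arbitrary for this sketch. */
#define SKETCH_ERROR_REFS 0100

/*
 * Run argv[0] with the given arguments and wait for it.
 * Returns the child's exit code, or -1 if it could not be run
 * or did not exit normally.
 */
static int run_child(char *const argv[])
{
	pid_t pid;
	int status;

	assert(argv && argv[0]);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127); /* exec failed */
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Record a failure of the child in the caller's error bitmask. */
static unsigned int check_refs(char *const argv[], unsigned int errors_found)
{
	if (run_child(argv))
		errors_found |= SKETCH_ERROR_REFS;
	return errors_found;
}
```

A caller would pass an argument vector such as { "git", "refs", "verify", NULL }. Any non-zero result, including a failure to start the child, is treated as a ref error in this sketch.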
Then, introduce an option that allows the user to disable checking the
consistency of the ref database. Run this check at the very beginning of
"git-fsck(1)", because we don't want the existing code of "git-fsck(1)",
which implicitly checks the consistency of refs, to make the program die
before these checks run.
Last, update the test to exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/git-fsck.adoc | 7 ++++++-
builtin/fsck.c | 33 ++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 39 +++++++++++++++++++++++++++++++++++++
3 files changed, 77 insertions(+), 2 deletions(-)
diff --git a/Documentation/git-fsck.adoc b/Documentation/git-fsck.adoc
index 8f32800a83..11203ba925 100644
--- a/Documentation/git-fsck.adoc
+++ b/Documentation/git-fsck.adoc
@@ -12,7 +12,7 @@ SYNOPSIS
'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
[--[no-]full] [--strict] [--verbose] [--lost-found]
[--[no-]dangling] [--[no-]progress] [--connectivity-only]
- [--[no-]name-objects] [<object>...]
+ [--[no-]name-objects] [--[no-]references] [<object>...]
DESCRIPTION
-----------
@@ -104,6 +104,11 @@ care about this output and want to speed it up further.
progress status even if the standard error stream is not
directed to a terminal.
+--[no-]references::
+ Control whether to check the references database consistency
+ via 'git refs verify'. See linkgit:git-refs[1] for details.
+ The default is to check the references database.
+
CONFIGURATION
-------------
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 7a4dcb0716..f4f395cfbd 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -50,6 +50,7 @@ static int verbose;
static int show_progress = -1;
static int show_dangling = 1;
static int name_objects;
+static int check_references = 1;
#define ERROR_OBJECT 01
#define ERROR_REACHABLE 02
#define ERROR_PACK 04
@@ -905,11 +906,37 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress)
return res;
}
+static void fsck_refs(struct repository *r)
+{
+ struct child_process refs_verify = CHILD_PROCESS_INIT;
+ struct progress *progress = NULL;
+
+ if (show_progress)
+ progress = start_progress(r, _("Checking ref database"), 1);
+
+ if (verbose)
+ fprintf_ln(stderr, _("Checking ref database"));
+
+ child_process_init(&refs_verify);
+ refs_verify.git_cmd = 1;
+ strvec_pushl(&refs_verify.args, "refs", "verify", NULL);
+ if (verbose)
+ strvec_push(&refs_verify.args, "--verbose");
+ if (check_strict)
+ strvec_push(&refs_verify.args, "--strict");
+
+ if (run_command(&refs_verify))
+ errors_found |= ERROR_REFS;
+
+ display_progress(progress, 1);
+ stop_progress(&progress);
+}
+
static char const * const fsck_usage[] = {
N_("git fsck [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]\n"
" [--[no-]full] [--strict] [--verbose] [--lost-found]\n"
" [--[no-]dangling] [--[no-]progress] [--connectivity-only]\n"
- " [--[no-]name-objects] [<object>...]"),
+ " [--[no-]name-objects] [--[no-]references] [<object>...]"),
NULL
};
@@ -928,6 +955,7 @@ static struct option fsck_opts[] = {
N_("write dangling objects in .git/lost-found")),
OPT_BOOL(0, "progress", &show_progress, N_("show progress")),
OPT_BOOL(0, "name-objects", &name_objects, N_("show verbose names for reachable objects")),
+ OPT_BOOL(0, "references", &check_references, N_("check reference database consistency")),
OPT_END(),
};
@@ -970,6 +998,9 @@ int cmd_fsck(int argc,
git_config(git_fsck_config, &fsck_obj_options);
prepare_repo_settings(the_repository);
+ if (check_references)
+ fsck_refs(the_repository);
+
if (connectivity_only) {
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
for_each_packed_object(the_repository,
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 767e2bd4a0..9d1dc2144c 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -830,4 +830,43 @@ test_expect_success 'packed-ref without sorted trait should not be checked' '
)
'
+test_expect_success '--[no-]references option should apply to fsck' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ (
+ cd repo &&
+ test_commit default &&
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck --references 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck --no-references 2>err &&
+ rm $branch_dir_prefix/branch-garbage &&
+ test_must_be_empty err || return 1
+ done
+ )
+'
+
test_done
--
2.48.1
* Re: [PATCH v7 3/9] packed-backend: check whether the "packed-refs" is regular file
2025-02-26 13:49 ` [PATCH v7 3/9] packed-backend: check whether the "packed-refs" is regular file shejialuo
@ 2025-02-26 18:36 ` Junio C Hamano
2025-02-27 0:57 ` shejialuo
0 siblings, 1 reply; 168+ messages in thread
From: Junio C Hamano @ 2025-02-26 18:36 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> +static int packed_fsck(struct ref_store *ref_store,
> + struct fsck_options *o,
> struct worktree *wt)
> {
> + struct packed_ref_store *refs = packed_downcast(ref_store,
> + REF_STORE_READ, "fsck");
> + struct stat st;
> + int ret = 0;
> + int fd;
>
> if (!is_main_worktree(wt))
> return 0;
I do not think it is worth a reroll only to improve this one, but
for future reference, initializing "fd = -1" and jumping to cleanup
here instead of "return 0" would future-proof the code better. This
is especially so, given that in a few patches later, we would add a
strbuf that is initialized before this "we do not do anything
outside the primary worktree" short-cut, and many "goto cleanup"s we
see in this patch below would jump to cleanup to strbuf_release() on
that initialized but unused strbuf. Jumping there with negative fd
to cleanup that already avoids close(fd) for negative fd would be
like jumping there with initialized but unused strbuf. Having a
single exit point ("cleanup:" label) would help future evolution of
the code, by making it easier to add more resource-acquiring code to
this function in the future.
> - return 0;
> + if (o->verbose)
> + fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
> +
> + fd = open_nofollow(refs->path, O_RDONLY);
> + if (fd < 0) {
> + /*
> + * If the packed-refs file doesn't exist, there's nothing
> + * to check.
> + */
> + if (errno == ENOENT)
> + goto cleanup;
> +
> + if (errno == ELOOP) {
> + struct fsck_ref_report report = { 0 };
> + report.path = "packed-refs";
> + ret = fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_REF_FILETYPE,
> + "not a regular file but a symlink");
> + goto cleanup;
> + }
> +
> + ret = error_errno(_("unable to open '%s'"), refs->path);
> + goto cleanup;
> + } else if (fstat(fd, &st) < 0) {
> + ret = error_errno(_("unable to stat '%s'"), refs->path);
> + goto cleanup;
> + } else if (!S_ISREG(st.st_mode)) {
> + struct fsck_ref_report report = { 0 };
> + report.path = "packed-refs";
> + ret = fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_REF_FILETYPE,
> + "not a regular file");
> + goto cleanup;
> + }
> +
> +cleanup:
> + if (fd >= 0)
> + close(fd);
> + return ret;
> }
* Re: [PATCH v7 3/9] packed-backend: check whether the "packed-refs" is regular file
2025-02-26 18:36 ` Junio C Hamano
@ 2025-02-27 0:57 ` shejialuo
2025-02-27 14:10 ` Patrick Steinhardt
2025-02-27 16:57 ` Junio C Hamano
0 siblings, 2 replies; 168+ messages in thread
From: shejialuo @ 2025-02-27 0:57 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
On Wed, Feb 26, 2025 at 10:36:29AM -0800, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > +static int packed_fsck(struct ref_store *ref_store,
> > + struct fsck_options *o,
> > struct worktree *wt)
> > {
> > + struct packed_ref_store *refs = packed_downcast(ref_store,
> > + REF_STORE_READ, "fsck");
> > + struct stat st;
> > + int ret = 0;
> > + int fd;
> >
> > if (!is_main_worktree(wt))
> > return 0;
>
> I do not think it is worth a reroll only to improve this one, but
> for future reference, initializing "fd = -1" and jumping to cleanup
> here instead of "return 0" would future-proof the code better. This
> is especially so, given that in a few patches later, we would add a
> strbuf that is initialized before this "we do not do anything
> outside the primary worktree" short-cut, and many "goto cleanup"s we
> see in this patch below would jump to cleanup to strbuf_release() on
> that initialized but unused strbuf. Jumping there with negative fd
> to cleanup that already avoids close(fd) for negative fd would be
> like jumping there with initialized but unused strbuf. Having a
> single exit point ("cleanup:" label) would help future evolution of
> the code, by making it easier to add more resource-acquiring code to
> this function in the future.
>
You are right. Actually, I just want to avoid assigning the `fd` to -1.
However, I didn't realize that I would initialize the strbuf later.
After waking up, I have suddenly realized this problem.
If other reviewers don't have any comments for this new version, I will
send out a reroll. We have already iterated many times, if we could make
it better, why not?
Thanks,
Jialuo
* Re: [PATCH v7 3/9] packed-backend: check whether the "packed-refs" is regular file
2025-02-27 0:57 ` shejialuo
@ 2025-02-27 14:10 ` Patrick Steinhardt
2025-02-27 16:57 ` Junio C Hamano
1 sibling, 0 replies; 168+ messages in thread
From: Patrick Steinhardt @ 2025-02-27 14:10 UTC (permalink / raw)
To: shejialuo; +Cc: Junio C Hamano, git, Karthik Nayak, Michael Haggerty
On Thu, Feb 27, 2025 at 08:57:01AM +0800, shejialuo wrote:
> On Wed, Feb 26, 2025 at 10:36:29AM -0800, Junio C Hamano wrote:
> > shejialuo <shejialuo@gmail.com> writes:
> >
> > > +static int packed_fsck(struct ref_store *ref_store,
> > > + struct fsck_options *o,
> > > struct worktree *wt)
> > > {
> > > + struct packed_ref_store *refs = packed_downcast(ref_store,
> > > + REF_STORE_READ, "fsck");
> > > + struct stat st;
> > > + int ret = 0;
> > > + int fd;
> > >
> > > if (!is_main_worktree(wt))
> > > return 0;
> >
> > I do not think it is worth a reroll only to improve this one, but
> > for future reference, initializing "fd = -1" and jumping to cleanup
> > here instead of "return 0" would future-proof the code better. This
> > is especially so, given that in a few patches later, we would add a
> > strbuf that is initialized before this "we do not do anything
> > outside the primary worktree" short-cut, and many "goto cleanup"s we
> > see in this patch below would jump to cleanup to strbuf_release() on
> > that initialized but unused strbuf. Jumping there with negative fd
> > to cleanup that already avoids close(fd) for negative fd would be
> > like jumping there with initialized but unused strbuf. Having a
> > single exit point ("cleanup:" label) would help future evolution of
> > the code, by making it easier to add more resource-acquiring code to
> > this function in the future.
> >
>
> You are right. Actually, I just want to avoid assigning the `fd` to -1.
> However, I didn't realize that I would initialize the strbuf later.
> After waking up, I have suddenly realized this problem.
>
> If other reviewers don't have any comments for this new version, I will
> send out a reroll. We have already iterated many times, if we could make
> it better, why not?
I don't have anything else to add to this version, thanks!
Patrick
* [PATCH v8 0/9] add more ref consistency checks
2025-02-26 13:48 ` [PATCH v7 0/9] add more ref consistency checks shejialuo
` (8 preceding siblings ...)
2025-02-26 13:51 ` [PATCH v7 9/9] builtin/fsck: add `git refs verify` child process shejialuo
@ 2025-02-27 16:03 ` shejialuo
2025-02-27 16:05 ` [PATCH v8 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
` (8 more replies)
9 siblings, 9 replies; 168+ messages in thread
From: shejialuo @ 2025-02-27 16:03 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Hi All:
This version makes the following change:
1. [PATCH v8 3/9]: initialize fd = -1 for two purposes:
1. We can use a unified `goto cleanup`, so the function has a single
exit point.
2. We should not close the fd when we cannot open the file.
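The two purposes above can be illustrated outside of Git with a small sketch. Everything here is hypothetical and simplified: "check_regular_file" and its "skip" parameter (standing in for the "not the main worktree" short-cut) are made-up names, and "scratch" stands in for a resource added by a later patch, such as the strbuf mentioned in the review.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical example: return 0 if `path` is missing or is a regular
 * file, -1 otherwise. Every exit goes through the "cleanup" label.
 */
static int check_regular_file(const char *path, int skip)
{
	struct stat st;
	char *scratch = NULL; /* initialized, possibly never used */
	int ret = 0;
	int fd = -1;          /* "not opened" sentinel */

	assert(path);

	/* Early short-cut: still leaves through the single exit point. */
	if (skip)
		goto cleanup;

	fd = open(path, O_RDONLY);
	if (fd < 0) {
		/* A missing file means there is nothing to check. */
		if (errno != ENOENT)
			ret = -1;
		goto cleanup;
	}

	if (fstat(fd, &st) < 0 || !S_ISREG(st.st_mode)) {
		ret = -1;
		goto cleanup;
	}

	scratch = malloc(64);
	if (!scratch)
		ret = -1;

cleanup:
	/* Safe on every path: fd is closed only if it was opened. */
	if (fd >= 0)
		close(fd);
	free(scratch);
	return ret;
}
```

With the sentinel in place, adding another resource only requires one more release call at the label, and none of the early exits need to change.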
I hope that this version will be the final one. Thanks to every
reviewer for their effort.
Thanks,
Jialuo
---
This series mainly does the following things:
1. Fix subshell issues
2. Add ref checks for packed-backend.
1. Check whether the filetype of "packed-refs" is correct.
2. Check whether the syntax of "packed-refs" is correct by using the
rules from "packed-backend.c::create_snapshot" and
"packed-backend.c::next_record".
3. Check whether the pointed object exists and whether the
"packed-refs" file is sorted.
3. Call "git refs verify" for "git-fsck(1)".
shejialuo (9):
t0602: use subshell to ensure working directory unchanged
builtin/refs: get worktrees without reading head information
packed-backend: check whether the "packed-refs" is regular file
packed-backend: check if header starts with "# pack-refs with: "
packed-backend: add "packed-refs" header consistency check
packed-backend: check whether the refname contains NUL characters
packed-backend: add "packed-refs" entry consistency check
packed-backend: check whether the "packed-refs" is sorted
builtin/fsck: add `git refs verify` child process
Documentation/fsck-msgids.adoc | 14 +
Documentation/git-fsck.adoc | 7 +-
builtin/fsck.c | 33 +-
builtin/refs.c | 2 +-
fsck.h | 4 +
refs/packed-backend.c | 363 +++++++++-
t/t0602-reffiles-fsck.sh | 1209 +++++++++++++++++++-------------
worktree.c | 5 +
worktree.h | 8 +
9 files changed, 1162 insertions(+), 483 deletions(-)
Range-diff against v7:
1: b3952d80a2 = 1: b3952d80a2 t0602: use subshell to ensure working directory unchanged
2: fa5ce20bb7 = 2: fa5ce20bb7 builtin/refs: get worktrees without reading head information
3: 861583f417 ! 3: b3686a9695 packed-backend: check whether the "packed-refs" is regular file
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
+ REF_STORE_READ, "fsck");
+ struct stat st;
+ int ret = 0;
-+ int fd;
++ int fd = -1;
if (!is_main_worktree(wt))
- return 0;
+- return 0;
++ goto cleanup;
- return 0;
+ if (o->verbose)
4: 5f54cb05c3 = 4: 2638d5043f packed-backend: check if header starts with "# pack-refs with: "
5: 7d7dc899ad ! 5: 13e34de350 packed-backend: add "packed-refs" header consistency check
@@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
+ struct strbuf packed_ref_content = STRBUF_INIT;
struct stat st;
int ret = 0;
- int fd;
+ int fd = -1;
@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
6: 571479d3e7 = 6: 0632a1d5e2 packed-backend: check whether the refname contains NUL characters
7: e498a57286 = 7: 4618da3199 packed-backend: add "packed-refs" entry consistency check
8: 3638cb118d ! 8: 355e43d251 packed-backend: check whether the "packed-refs" is sorted
@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
+ unsigned int sorted = 0;
struct stat st;
int ret = 0;
- int fd;
+ int fd = -1;
@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
9: 5d87e76d28 = 9: 57dac06151 builtin/fsck: add `git refs verify` child process
--
2.48.1
* [PATCH v8 1/9] t0602: use subshell to ensure working directory unchanged
2025-02-27 16:03 ` [PATCH v8 0/9] add more ref consistency checks shejialuo
@ 2025-02-27 16:05 ` shejialuo
2025-02-27 16:06 ` [PATCH v8 2/9] builtin/refs: get worktrees without reading head information shejialuo
` (7 subsequent siblings)
8 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-27 16:05 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In every test, we execute the command "cd repo" at the start but never
execute the command "cd .." to restore the working directory. However,
simply adding "cd .." is not a good idea either, because if any command
fails between "cd repo" and "cd ..", the "cd .." will never be reached
and we cannot correctly restore the working directory.
Let's use a subshell to ensure that the current working directory is
restored to the correct path.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
t/t0602-reffiles-fsck.sh | 967 ++++++++++++++++++++-------------------
1 file changed, 494 insertions(+), 473 deletions(-)
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index d4a08b823b..cf7a202d0d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -14,222 +14,229 @@ test_expect_success 'ref name should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
-
- git commit --allow-empty -m initial &&
- git checkout -b default-branch &&
- git tag default-tag &&
- git tag multi_hierarchy/default-tag &&
-
- cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
- git refs verify 2>err &&
- test_must_be_empty err &&
- rm $branch_dir_prefix/@ &&
-
- cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
- git refs verify 2>err &&
- rm $tag_dir_prefix/tag-1.lock &&
- test_must_be_empty err &&
-
- cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/.lock: badRefName: invalid refname format
- EOF
- rm $tag_dir_prefix/.lock &&
- test_cmp expect err &&
-
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/$refname: badRefName: invalid refname format
- EOF
- rm "$branch_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done &&
+ (
+ cd repo &&
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/$refname: badRefName: invalid refname format
- EOF
- rm "$tag_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done &&
+ git commit --allow-empty -m initial &&
+ git checkout -b default-branch &&
+ git tag default-tag &&
+ git tag multi_hierarchy/default-tag &&
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
- EOF
- rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
- test_cmp expect err || return 1
- done &&
-
- for refname in ".refname-starts-with-dot" "~refname-has-stride"
- do
- mkdir "$branch_dir_prefix/$refname" &&
- cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+ rm $branch_dir_prefix/@ &&
+
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
+ git refs verify 2>err &&
+ rm $tag_dir_prefix/tag-1.lock &&
+ test_must_be_empty err &&
+
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ error: refs/tags/.lock: badRefName: invalid refname format
EOF
- rm -r "$branch_dir_prefix/$refname" &&
- test_cmp expect err || return 1
- done
+ rm $tag_dir_prefix/.lock &&
+ test_cmp expect err &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname: badRefName: invalid refname format
+ EOF
+ rm "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ mkdir "$branch_dir_prefix/$refname" &&
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ EOF
+ rm -r "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success 'ref name check should be adapted into fsck messages' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- git commit --allow-empty -m initial &&
- git checkout -b branch-1 &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- git -c fsck.badRefName=warn refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/.branch-1: badRefName: invalid refname format
- EOF
- rm $branch_dir_prefix/.branch-1 &&
- test_cmp expect err &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- git -c fsck.badRefName=ignore refs verify 2>err &&
- test_must_be_empty err
+ (
+ cd repo &&
+ git commit --allow-empty -m initial &&
+ git checkout -b branch-1 &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=warn refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/.branch-1: badRefName: invalid refname format
+ EOF
+ rm $branch_dir_prefix/.branch-1 &&
+ test_cmp expect err &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=ignore refs verify 2>err &&
+ test_must_be_empty err
+ )
'
test_expect_success 'ref name check should work for multiple worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
-
- cd repo &&
- test_commit initial &&
- git checkout -b branch-1 &&
- test_commit second &&
- git checkout -b branch-2 &&
- test_commit third &&
- git checkout -b branch-3 &&
- git worktree add ./worktree-1 branch-1 &&
- git worktree add ./worktree-2 branch-2 &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
- (
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-3
- ) &&
(
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-3
- ) &&
-
- cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
- cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
- error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err &&
-
- for worktree in "worktree-1" "worktree-2"
- do
+ cd repo &&
+ test_commit initial &&
+ git checkout -b branch-1 &&
+ test_commit second &&
+ git checkout -b branch-2 &&
+ test_commit third &&
+ git checkout -b branch-3 &&
+ git worktree add ./worktree-1 branch-1 &&
+ git worktree add ./worktree-2 branch-2 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
(
- cd $worktree &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
- error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err || return 1
- )
- done
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+
+ cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
+ cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err &&
+
+ for worktree in "worktree-1" "worktree-2"
+ do
+ (
+ cd $worktree &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err || return 1
+ )
+ done
+ )
'
test_expect_success 'regular ref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
- git refs verify 2>err &&
- test_must_be_empty err &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
- for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/branch-bad &&
- test_cmp expect err || return 1
- done &&
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
- for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/a/b/branch-bad &&
- test_cmp expect err || return 1
- done &&
-
- printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $branch_dir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- for trailing_content in " garbage" " more garbage"
- do
- printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
EOF
- rm $branch_dir_prefix/branch-garbage &&
- test_cmp expect err || return 1
- done &&
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+ printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
- '\''
- EOF
- rm $branch_dir_prefix/branch-garbage-special &&
- test_cmp expect err &&
- printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+ '\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err &&
+
+ printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
- garbage'\''
- EOF
- rm $branch_dir_prefix/branch-garbage-special &&
- test_cmp expect err
+ garbage'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err
+ )
'
test_expect_success 'regular ref content should be checked (aggregate)' '
@@ -237,99 +244,103 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- bad_content_1=$(git rev-parse main)x &&
- bad_content_2=xfsazqfxcadas &&
- bad_content_3=Xfsazqfxcadas &&
- printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
- printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
- printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
- printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
- printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
- error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
- error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
- warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ bad_content_1=$(git rev-parse main)x &&
+ bad_content_2=xfsazqfxcadas &&
+ bad_content_3=Xfsazqfxcadas &&
+ printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
+ printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
+ printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
+ error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
+ error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
'
test_expect_success 'textual symref content should be checked (individual)' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
+ do
+ printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
- for good_referent in "refs/heads/branch" "HEAD"
- do
- printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
- rm $branch_dir_prefix/branch-good &&
- test_must_be_empty err || return 1
- done &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
- do
- printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
- test_must_fail git refs verify 2>err &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-1 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
EOF
- rm $branch_dir_prefix/branch-bad &&
- test_cmp expect err || return 1
- done &&
-
- printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $branch_dir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-1 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-2 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-trailing-3 &&
- test_cmp expect err &&
-
- printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
- EOF
- rm $branch_dir_prefix/a/b/branch-complicated &&
- test_cmp expect err
+ rm $branch_dir_prefix/a/b/branch-trailing-2 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-3 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-complicated &&
+ test_cmp expect err
+ )
'
test_expect_success 'textual symref content should be checked (aggregate)' '
@@ -337,32 +348,34 @@ test_expect_success 'textual symref content should be checked (aggregate)' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
- printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
- printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
- printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
- printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
- printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
- printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
- printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
-
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
- warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
- warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
- EOF
- sort err >sorted_err &&
- test_cmp expect sorted_err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
'
test_expect_success 'the target of the textual symref should be checked' '
@@ -370,28 +383,30 @@ test_expect_success 'the target of the textual symref should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
- do
- printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
- git refs verify 2>err &&
- rm $branch_dir_prefix/branch-good &&
- test_must_be_empty err || return 1
- done &&
-
- for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
- do
- printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
- EOF
- rm $branch_dir_prefix/branch-bad-1 &&
- test_cmp expect err || return 1
- done
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
+ do
+ printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success SYMLINKS 'symlink symref content should be checked' '
@@ -399,201 +414,207 @@ test_expect_success SYMLINKS 'symlink symref content should be checked' '
git init repo &&
branch_dir_prefix=.git/refs/heads &&
tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $branch_dir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
- EOF
- rm $branch_dir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
- error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
- EOF
- rm $branch_dir_prefix/branch-symbolic-bad &&
- test_cmp expect err &&
-
- ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
- error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
- EOF
- rm $tag_dir_prefix/tag-symbolic-1 &&
- test_cmp expect err
+ (
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
+ error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-bad &&
+ test_cmp expect err &&
+
+ ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
+ error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
+ EOF
+ rm $tag_dir_prefix/tag-symbolic-1 &&
+ test_cmp expect err
+ )
'
test_expect_success SYMLINKS 'symlink symref content should be checked (worktree)' '
test_when_finished "rm -rf repo" &&
git init repo &&
- cd repo &&
- test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
- main_worktree_refdir_prefix=.git/refs/heads &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
(
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
- (
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
-
- ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $worktree1_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $worktree2_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $main_worktree_refdir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
- EOF
- rm $worktree1_refdir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- for bad_referent_name in ".tag" "branch "
- do
- ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ main_worktree_refdir_prefix=.git/refs/heads &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree1_refdir_prefix/bad-symbolic &&
+ rm $worktree1_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree1_refdir_prefix/bad-symbolic &&
+ rm $worktree2_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
EOF
- rm $worktree2_refdir_prefix/bad-symbolic &&
+ rm $main_worktree_refdir_prefix/branch-symbolic-good &&
test_cmp expect err &&
- ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
- test_must_fail git refs verify 2>err &&
+ ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
- error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
EOF
- rm $worktree2_refdir_prefix/bad-symbolic &&
- test_cmp expect err || return 1
- done
+ rm $worktree1_refdir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ for bad_referent_name in ".tag" "branch "
+ do
+ ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err || return 1
+ done
+ )
'
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
- cd repo &&
- test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
- worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
- worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
-
(
- cd worktree-1 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
- (
- cd worktree-2 &&
- git update-ref refs/worktree/branch-4 refs/heads/branch-1
- ) &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
- for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
- test_must_fail git refs verify 2>err &&
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ EOF
+ rm $worktree1_refdir_prefix/bad-branch-1 &&
+ test_cmp expect err || return 1
+ done &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ EOF
+ rm $worktree2_refdir_prefix/bad-branch-2 &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
EOF
- rm $worktree1_refdir_prefix/bad-branch-1 &&
- test_cmp expect err || return 1
- done &&
+ rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err &&
- for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
- do
- printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
- test_must_fail git refs verify 2>err &&
+ printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
+ git refs verify 2>err &&
cat >expect <<-EOF &&
- error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
EOF
- rm $worktree2_refdir_prefix/bad-branch-2 &&
- test_cmp expect err || return 1
- done &&
-
- printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
- EOF
- rm $worktree1_refdir_prefix/branch-no-newline &&
- test_cmp expect err &&
-
- printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
- EOF
- rm $worktree1_refdir_prefix/branch-garbage &&
- test_cmp expect err
+ rm $worktree1_refdir_prefix/branch-garbage &&
+ test_cmp expect err
+ )
'
test_done
--
2.48.1
* [PATCH v8 2/9] builtin/refs: get worktrees without reading head information
2025-02-27 16:03 ` [PATCH v8 0/9] add more ref consistency checks shejialuo
2025-02-27 16:05 ` [PATCH v8 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
@ 2025-02-27 16:06 ` shejialuo
2025-02-27 16:06 ` [PATCH v8 3/9] packed-backend: check whether the "packed-refs" is regular file shejialuo
` (6 subsequent siblings)
8 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-27 16:06 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c", functions such as "create_snapshot" and
"next_record" check the correctness of the content of the "packed-refs"
file. When anything is bad, the program dies.
It may seem that these functions are irrelevant here, because we are
going to read and parse the raw "packed-refs" file ourselves, without
creating a snapshot or using the ref iterator to check consistency.
However, "get_worktrees" in "builtin/refs" parses the "HEAD"
information. If the referent of "HEAD" is stored in "packed-refs", we
call "create_snapshot" to parse "packed-refs" and look it up. Regardless
of whether the entry for the "HEAD" referent is correct,
"create_snapshot" calls "verify_buffer_safe" to check whether the last
line of the file ends with a newline. If not, the program dies.
Although this behavior is not harmful in itself, it short-circuits the
check. When users run "git refs verify" or "git fsck", we should avoid
reading the head information, because doing so may trigger a read in the
packed backend whose stricter checks make the program die. Instead, we
should continue and check the rest of the "packed-refs" file completely.
Fortunately, in 465a22b338 (worktree: skip reading HEAD when repairing
worktrees, 2023-12-29), we have introduced a function
"get_worktrees_internal" which allows us to get worktrees without
reading head information.
Create a new exposed function "get_worktrees_without_reading_head", then
replace the "get_worktrees" call in "builtin/refs" with the newly
created function.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
builtin/refs.c | 2 +-
worktree.c | 5 +++++
worktree.h | 8 ++++++++
3 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/builtin/refs.c b/builtin/refs.c
index a29f195834..55ff5dae11 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -88,7 +88,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix,
git_config(git_fsck_config, &fsck_refs_options);
prepare_repo_settings(the_repository);
- worktrees = get_worktrees();
+ worktrees = get_worktrees_without_reading_head();
for (size_t i = 0; worktrees[i]; i++)
ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
&fsck_refs_options, worktrees[i]);
diff --git a/worktree.c b/worktree.c
index d4a68c9c23..d23482a746 100644
--- a/worktree.c
+++ b/worktree.c
@@ -198,6 +198,11 @@ struct worktree **get_worktrees(void)
return get_worktrees_internal(0);
}
+struct worktree **get_worktrees_without_reading_head(void)
+{
+ return get_worktrees_internal(1);
+}
+
const char *get_worktree_git_dir(const struct worktree *wt)
{
if (!wt)
diff --git a/worktree.h b/worktree.h
index 38145df80f..a305c7e2c7 100644
--- a/worktree.h
+++ b/worktree.h
@@ -30,6 +30,14 @@ struct worktree {
*/
struct worktree **get_worktrees(void);
+/*
+ * Like `get_worktrees`, but does not read HEAD. Skip reading HEAD allows to
+ * get the worktree without worrying about failures pertaining to parsing
+ * the HEAD ref. This is useful in contexts where it is assumed that the
+ * refdb may not be in a consistent state.
+ */
+struct worktree **get_worktrees_without_reading_head(void);
+
/*
* Returns 1 if linked worktrees exist, 0 otherwise.
*/
--
2.48.1
* [PATCH v8 3/9] packed-backend: check whether the "packed-refs" is regular file
2025-02-27 16:03 ` [PATCH v8 0/9] add more ref consistency checks shejialuo
2025-02-27 16:05 ` [PATCH v8 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-27 16:06 ` [PATCH v8 2/9] builtin/refs: get worktrees without reading head information shejialuo
@ 2025-02-27 16:06 ` shejialuo
2025-02-27 16:06 ` [PATCH v8 4/9] packed-backend: check if header starts with "# pack-refs with: " shejialuo
` (5 subsequent siblings)
8 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-27 16:06 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
Although "git-fsck(1)" and "packed-backend.c" will check some
consistency and correctness of "packed-refs" file, they never check the
filetype of the "packed-refs". Let's verify that the "packed-refs" has
the expected filetype, confirming it is created by "git pack-refs"
command.
We could use "open_nofollow" wrapper to open the raw "packed-refs" file.
If the returned "fd" value is less than 0, we could check whether the
"errno" is "ELOOP" to report an error to the user. And then we use
"fstat" to check whether the "packed-refs" file is a regular file.
Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
the user if "packed-refs" is not a regular file.
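
The file type check described above can be sketched outside of Git with
plain POSIX calls. This is only an illustration: "classify_file" and the
"file_kind" enum are hypothetical names, and it assumes a platform where
"open" with "O_NOFOLLOW" fails with "ELOOP" on a symlink (as on Linux).
The patch itself uses Git's "open_nofollow" wrapper and "fsck_report_ref".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical result codes, not part of Git. */
enum file_kind {
	KIND_REGULAR,
	KIND_MISSING,
	KIND_SYMLINK,
	KIND_OTHER,
	KIND_IO_ERROR
};

/* Classify a path without following a symlink in its last component. */
static enum file_kind classify_file(const char *path)
{
	enum file_kind kind;
	struct stat st;
	int fd;

	assert(path);

	fd = open(path, O_RDONLY | O_NOFOLLOW);
	if (fd < 0) {
		if (errno == ENOENT)
			return KIND_MISSING;	/* nothing to check */
		if (errno == ELOOP)
			return KIND_SYMLINK;	/* refused to follow a symlink */
		return KIND_IO_ERROR;
	}

	/* The file is open; use fstat() on the descriptor to avoid races. */
	if (fstat(fd, &st) < 0)
		kind = KIND_IO_ERROR;
	else if (S_ISREG(st.st_mode))
		kind = KIND_REGULAR;
	else
		kind = KIND_OTHER;	/* directory, device, fifo, ... */

	close(fd);
	return kind;
}
```

A caller would treat "KIND_MISSING" as success and report
"KIND_SYMLINK" and "KIND_OTHER" as a bad file type, which mirrors the
two error messages exercised by the test in this patch.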
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 52 ++++++++++++++++++++++++++++++++++++----
t/t0602-reffiles-fsck.sh | 30 +++++++++++++++++++++++
2 files changed, 78 insertions(+), 4 deletions(-)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index a7b6f74b6e..1fba804a2a 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -4,6 +4,7 @@
#include "../git-compat-util.h"
#include "../config.h"
#include "../dir.h"
+#include "../fsck.h"
#include "../gettext.h"
#include "../hash.h"
#include "../hex.h"
@@ -1748,15 +1749,58 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
-static int packed_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED,
+static int packed_fsck(struct ref_store *ref_store,
+ struct fsck_options *o,
struct worktree *wt)
{
+ struct packed_ref_store *refs = packed_downcast(ref_store,
+ REF_STORE_READ, "fsck");
+ struct stat st;
+ int ret = 0;
+ int fd = -1;
if (!is_main_worktree(wt))
- return 0;
+ goto cleanup;
- return 0;
+ if (o->verbose)
+ fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
+
+ fd = open_nofollow(refs->path, O_RDONLY);
+ if (fd < 0) {
+ /*
+ * If the packed-refs file doesn't exist, there's nothing
+ * to check.
+ */
+ if (errno == ENOENT)
+ goto cleanup;
+
+ if (errno == ELOOP) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_FILETYPE,
+ "not a regular file but a symlink");
+ goto cleanup;
+ }
+
+ ret = error_errno(_("unable to open '%s'"), refs->path);
+ goto cleanup;
+ } else if (fstat(fd, &st) < 0) {
+ ret = error_errno(_("unable to stat '%s'"), refs->path);
+ goto cleanup;
+ } else if (!S_ISREG(st.st_mode)) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs";
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_FILETYPE,
+ "not a regular file");
+ goto cleanup;
+ }
+
+cleanup:
+ if (fd >= 0)
+ close(fd);
+ return ret;
}
struct ref_storage_be refs_be_packed = {
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index cf7a202d0d..68b7d4999e 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -617,4 +617,34 @@ test_expect_success 'ref content checks should work with worktrees' '
)
'
+test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git pack-refs --all &&
+
+ mv .git/packed-refs .git/packed-refs-back &&
+ ln -sf packed-refs-back .git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs: badRefFiletype: not a regular file but a symlink
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err &&
+
+ mkdir .git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs: badRefFiletype: not a regular file
+ EOF
+ rm -r .git/packed-refs &&
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v8 4/9] packed-backend: check if header starts with "# pack-refs with: "
2025-02-27 16:03 ` [PATCH v8 0/9] add more ref consistency checks shejialuo
` (2 preceding siblings ...)
2025-02-27 16:06 ` [PATCH v8 3/9] packed-backend: check whether the "packed-refs" is regular file shejialuo
@ 2025-02-27 16:06 ` shejialuo
2025-02-27 16:06 ` [PATCH v8 5/9] packed-backend: add "packed-refs" header consistency check shejialuo
` (4 subsequent siblings)
8 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-27 16:06 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
We always write a space after "# pack-refs with:", but "create_snapshot"
does not enforce this: it only checks whether the header starts with
"# pack-refs with:". Tightening the check may seem like the obvious
thing to do. However, we have no technical documentation for the format,
so a stricter check could break compatibility with third-party
implementations.
By investigating influential third-party libraries, we could conclude
how these libraries handle the header of "packed-refs" file:
1. libgit2 is fine and always writes the space. It also expects the
whitespace to exist.
2. JGit does not expect the header to have a trailing space, but expects
the "peeled" capability to have a leading space, which is mostly
equivalent because that capability is typically the first one we
write. It always writes the space.
3. gitoxide expects the space to exist and writes it.
4. go-git doesn't create the header by default.
As many third-party libraries expect a single space after "# pack-refs
with:", if we forget to write the space after the colon,
"create_snapshot" won't catch this. And we would break other
re-implementations. So, we'd better tighten the rule by checking whether
the header starts with "# pack-refs with: ".
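
The difference between the old and the tightened rule can be shown with
a small standalone sketch. The two helper names below are hypothetical
and the code uses "strncmp" rather than Git's "skip_prefix"; it only
illustrates which headers each prefix accepts.

```c
#include <assert.h>
#include <string.h>

/* Old rule: anything after the colon is accepted. */
static int header_has_loose_prefix(const char *line)
{
	static const char prefix[] = "# pack-refs with:";

	assert(line);
	return strncmp(line, prefix, sizeof(prefix) - 1) == 0;
}

/* Tightened rule: the colon must be followed by a single space. */
static int header_has_strict_prefix(const char *line)
{
	static const char prefix[] = "# pack-refs with: ";

	assert(line);
	return strncmp(line, prefix, sizeof(prefix) - 1) == 0;
}
```

A header such as "# pack-refs with:peeled" passes the loose check but
fails the strict one, which is the case this patch starts to reject.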
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 1fba804a2a..eaa8746f3e 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -694,7 +694,7 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs)
tmp = xmemdupz(snapshot->buf, eol - snapshot->buf);
- if (!skip_prefix(tmp, "# pack-refs with:", (const char **)&p))
+ if (!skip_prefix(tmp, "# pack-refs with: ", (const char **)&p))
die_invalid_line(refs->path,
snapshot->buf,
snapshot->eof - snapshot->buf);
--
2.48.1
* [PATCH v8 5/9] packed-backend: add "packed-refs" header consistency check
2025-02-27 16:03 ` [PATCH v8 0/9] add more ref consistency checks shejialuo
` (3 preceding siblings ...)
2025-02-27 16:06 ` [PATCH v8 4/9] packed-backend: check if header starts with "# pack-refs with: " shejialuo
@ 2025-02-27 16:06 ` shejialuo
2025-02-27 16:07 ` [PATCH v8 6/9] packed-backend: check whether the refname contains NUL characters shejialuo
` (3 subsequent siblings)
8 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-27 16:06 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
In "packed-backend.c::create_snapshot", if there is a header (the line
which starts with '#'), we will check whether the line starts with "#
pack-refs with: ". However, we need to consider other situations and
discuss whether we need to add checks.
1. If the header does not exist, we should not report an error to the
user. Older Git versions never wrote a header in the "packed-refs"
file, and the runtime still accepts a "packed-refs" file without
one.
2. If the header content does not start with "# pack-refs with: ", we
should report an error just like "create_snapshot" does. So,
create a new fsck message "badPackedRefHeader(ERROR)" for this.
3. If the header content is not the same as the constant string
"PACKED_REFS_HEADER", we should not report anything. This is
expected because the header is intentionally extensible and the
runtime "create_snapshot" does not complain about unknown traits.
To align with the runtime behavior, there is no need to report.
Based on this analysis, only case 2 above needs a check. To do this, use
the "open_nofollow" function to get the file descriptor and then read
the "packed-refs" file via "strbuf_read". Like "create_snapshot" and
other functions, we split the buffer into lines by finding the next
newline. When we cannot find a newline, we report an error.
So, create a function "packed_fsck_ref_next_line" to find the next
newline and if there is no such newline, use
"packedRefEntryNotTerminated(ERROR)" to report an error to the user.
Then, parse the first line to apply the checks. Update the test to
exercise the code.
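
The line-splitting idea can be sketched as follows. The names
"find_line_end" and "count_lines" are hypothetical and the sketch does
no fsck reporting; it only shows how "memchr" locates the end of each
line and how a missing final newline is detected while still letting the
caller parse up to the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Set *eol to the newline that ends the line starting at `start`.
 * Return 1 if the line is newline-terminated. Otherwise point *eol at
 * `eof` so the caller can still parse the rest, and return 0.
 */
static int find_line_end(const char *start, const char *eof,
			 const char **eol)
{
	assert(start && eof && eol && start <= eof);

	*eol = memchr(start, '\n', (size_t)(eof - start));
	if (*eol)
		return 1;

	*eol = eof;
	return 0;
}

/* Count lines and note whether the last one lacks a newline. */
static size_t count_lines(const char *buf, size_t len,
			  int *last_unterminated)
{
	const char *start = buf;
	const char *eof = buf + len;
	const char *eol;
	size_t lines = 0;

	*last_unterminated = 0;
	while (start < eof) {
		int terminated = find_line_end(start, eof, &eol);

		lines++;
		if (!terminated) {
			*last_unterminated = 1;
			break;
		}
		start = eol + 1;
	}
	return lines;
}
```

In the patch, the unterminated case is where
"packedRefEntryNotTerminated" would be reported.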
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.adoc | 8 ++++
fsck.h | 2 +
refs/packed-backend.c | 73 ++++++++++++++++++++++++++++++++++
t/t0602-reffiles-fsck.sh | 52 ++++++++++++++++++++++++
4 files changed, 135 insertions(+)
diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index b14bc44ca4..11906f90fd 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -16,6 +16,10 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefHeader`::
+ (ERROR) The "packed-refs" file contains an invalid
+ header.
+
`badParentSha1`::
(ERROR) A commit object has a bad parent sha1.
@@ -176,6 +180,10 @@
`nullSha1`::
(WARN) Tree contains entries pointing to a null sha1.
+`packedRefEntryNotTerminated`::
+ (ERROR) The "packed-refs" file contains an entry that is
+ not terminated by a newline.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
diff --git a/fsck.h b/fsck.h
index a44c231a5f..67e3c97bc0 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
@@ -53,6 +54,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE, ERROR) \
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
+ FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index eaa8746f3e..07154bccae 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1749,12 +1749,76 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
return empty_ref_iterator_begin();
}
+static int packed_fsck_ref_next_line(struct fsck_options *o,
+ unsigned long line_number, const char *start,
+ const char *eof, const char **eol)
+{
+ int ret = 0;
+
+ *eol = memchr(start, '\n', eof - start);
+ if (!*eol) {
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
+ "'%.*s' is not terminated with a newline",
+ (int)(eof - start), start);
+
+ /*
+ * There is no newline but we still want to parse it to the end of
+ * the buffer.
+ */
+ *eol = eof;
+ strbuf_release(&packed_entry);
+ }
+
+ return ret;
+}
+
+static int packed_fsck_ref_header(struct fsck_options *o,
+ const char *start, const char *eol)
+{
+ if (!starts_with(start, "# pack-refs with: ")) {
+ struct fsck_ref_report report = { 0 };
+ report.path = "packed-refs.header";
+
+ return fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_HEADER,
+ "'%.*s' does not start with '# pack-refs with: '",
+ (int)(eol - start), start);
+ }
+
+ return 0;
+}
+
+static int packed_fsck_ref_content(struct fsck_options *o,
+ const char *start, const char *eof)
+{
+ unsigned long line_number = 1;
+ const char *eol;
+ int ret = 0;
+
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ if (*start == '#') {
+ ret |= packed_fsck_ref_header(o, start, eol);
+
+ start = eol + 1;
+ line_number++;
+ }
+
+ return ret;
+}
+
static int packed_fsck(struct ref_store *ref_store,
struct fsck_options *o,
struct worktree *wt)
{
struct packed_ref_store *refs = packed_downcast(ref_store,
REF_STORE_READ, "fsck");
+ struct strbuf packed_ref_content = STRBUF_INIT;
struct stat st;
int ret = 0;
int fd = -1;
@@ -1797,9 +1861,18 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
+ if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
+ ret = error_errno(_("unable to read '%s'"), refs->path);
+ goto cleanup;
+ }
+
+ ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ packed_ref_content.buf + packed_ref_content.len);
+
cleanup:
if (fd >= 0)
close(fd);
+ strbuf_release(&packed_ref_content);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 68b7d4999e..74d876984d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -647,4 +647,56 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
)
'
+test_expect_success 'packed-refs header should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
+ for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
+ "# pack-refs with traits: peeled fully-peeled sorted " \
+ "# pack-refs with a: peeled fully-peeled" \
+ "# pack-refs with:peeled fully-peeled sorted"
+ do
+ printf "%s\n" "$bad_header" >.git/packed-refs &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs.header: badPackedRefHeader: '\''$bad_header'\'' does not start with '\''# pack-refs with: '\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err || return 1
+ done
+ )
+'
+
+test_expect_success 'packed-refs missing header should not be reported' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ printf "$(git rev-parse HEAD) refs/heads/main\n" >.git/packed-refs &&
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
+test_expect_success 'packed-refs unknown traits should not be reported' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ printf "# pack-refs with: peeled fully-peeled sorted foo\n" >.git/packed-refs &&
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v8 6/9] packed-backend: check whether the refname contains NUL characters
2025-02-27 16:03 ` [PATCH v8 0/9] add more ref consistency checks shejialuo
` (4 preceding siblings ...)
2025-02-27 16:06 ` [PATCH v8 5/9] packed-backend: add "packed-refs" header consistency check shejialuo
@ 2025-02-27 16:07 ` shejialuo
2025-02-27 16:07 ` [PATCH v8 7/9] packed-backend: add "packed-refs" entry consistency check shejialuo
` (2 subsequent siblings)
8 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-27 16:07 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
"packed-backend.c::next_record" will use "check_refname_format" to check
the consistency of the refname. If it is not OK, the program will die.
However, as reported in [1], some corruption goes undetected. The code
path already exists, so the check must be missing something.
We use the following code to get the refname:
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
In the above code, `p` is the start pointer of the refname and `eol` is
the next newline pointer. We calculate the length of the refname by
subtracting the two pointers. Then we add the memory range between `p`
and `eol` to get the refname.
However, if there are NUL characters in the memory range between `p`
and `eol`, we treat the refname as valid as long as the memory range
between `p` and the first NUL character is a valid refname.
To catch this corruption, create a new function "refname_contains_nul"
that searches for a NUL character within the length of the refname. If
one is found, the refname contains an embedded NUL character.
Use this function in "next_record" to make the program die if
"refname_contains_nul" returns true.
[1] https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/
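
To illustrate why the embedded NUL slips through, here is a minimal
standalone sketch. "range_contains_nul" is a hypothetical helper that
works on a pointer and a length rather than on a "struct strbuf". Any
check that treats the buffer as a C string only sees the part before
the first NUL, while "memchr" over the full length sees the corruption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the `len` bytes starting at `buf` contain a NUL byte. */
static int range_contains_nul(const char *buf, size_t len)
{
	assert(buf);
	return memchr(buf, '\0', len) != NULL;
}

/* Length that a C-string based validator would actually look at. */
static size_t visible_length(const char *buf)
{
	assert(buf);
	return strlen(buf);
}
```

For the bytes "refs/heads/main\0garbage", "visible_length" stops at the
NUL and reports the length of "refs/heads/main", so a string-based
format check passes even though the recorded range is longer.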
Reported-by: R. Diez <rdiez-temp3@rd10.de>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/packed-backend.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 07154bccae..9a90c52f70 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -494,6 +494,21 @@ static void verify_buffer_safe(struct snapshot *snapshot)
last_line, eof - last_line);
}
+/*
+ * When parsing the "packed-refs" file, we will parse it line by line.
+ * Because we know the start pointer of the refname and the next
+ * newline pointer, we could calculate the length of the refname by
+ * subtracting the two pointers. However, there is a corner case where
+ * the refname contains corrupted embedded NUL characters. And
+ * `check_refname_format()` will not catch this when the truncated
+ * refname is still a valid refname. To prevent this, we need to check
+ * whether the refname contains the NUL characters.
+ */
+static int refname_contains_nul(struct strbuf *refname)
+{
+ return !!memchr(refname->buf, '\0', refname->len);
+}
+
#define SMALL_FILE_SIZE (32*1024)
/*
@@ -895,6 +910,9 @@ static int next_record(struct packed_ref_iterator *iter)
strbuf_add(&iter->refname_buf, p, eol - p);
iter->base.refname = iter->refname_buf.buf;
+ if (refname_contains_nul(&iter->refname_buf))
+ die("packed refname contains embedded NULL: %s", iter->base.refname);
+
if (check_refname_format(iter->base.refname, REFNAME_ALLOW_ONELEVEL)) {
if (!refname_is_safe(iter->base.refname))
die("packed refname is dangerous: %s",
--
2.48.1
* [PATCH v8 7/9] packed-backend: add "packed-refs" entry consistency check
2025-02-27 16:03 ` [PATCH v8 0/9] add more ref consistency checks shejialuo
` (5 preceding siblings ...)
2025-02-27 16:07 ` [PATCH v8 6/9] packed-backend: check whether the refname contains NUL characters shejialuo
@ 2025-02-27 16:07 ` shejialuo
2025-02-27 16:07 ` [PATCH v8 8/9] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-02-27 16:07 ` [PATCH v8 9/9] builtin/fsck: add `git refs verify` child process shejialuo
8 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-27 16:07 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
"packed-backend.c::next_record" will parse the ref entry to check the
consistency. This function has already checked the following things:
1. Parse the main line of the ref entry to check whether the oid is
valid. Then check whether the next character is a whitespace. Then
check the refname.
2. If the next line starts with '^', it would continue to parse the
peeled oid and check whether the last character is '\n'.
As we decide to implement the ref consistency check for "packed-refs",
let's port these two checks and update the test to exercise the code.
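
The two checks can be sketched in a simplified, standalone form. All
names below are hypothetical, the sketch assumes 40-character SHA-1 hex
object IDs, and it leaves out the refname validation that Git performs
with "check_refname_format". The patch uses "parse_oid_hex_algop" and
reports problems through "fsck_report_ref" instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define TOY_HEXSZ 40	/* assumption: SHA-1 object IDs only */

enum toy_entry_status {
	TOY_OK,
	TOY_BAD_OID,
	TOY_NO_SPACE,
	TOY_EMPTY_REFNAME,
	TOY_TRAILING_GARBAGE
};

/* Return the position after a full hex oid, or NULL if it is invalid. */
static const char *toy_skip_hex_oid(const char *p, const char *eol)
{
	if (eol - p < TOY_HEXSZ)
		return NULL;
	for (int i = 0; i < TOY_HEXSZ; i++)
		if (!isxdigit((unsigned char)p[i]))
			return NULL;
	return p + TOY_HEXSZ;
}

/* Main line: "<oid> <refname>", with `eol` pointing at the newline. */
static enum toy_entry_status toy_check_main_line(const char *start,
						 const char *eol)
{
	const char *p = toy_skip_hex_oid(start, eol);

	if (!p)
		return TOY_BAD_OID;
	if (p == eol || !isspace((unsigned char)*p))
		return TOY_NO_SPACE;
	p++;
	if (p == eol)
		return TOY_EMPTY_REFNAME;
	return TOY_OK;
}

/* Peeled line: "^<oid>" and nothing else before the newline. */
static enum toy_entry_status toy_check_peeled_line(const char *start,
						   const char *eol)
{
	const char *p;

	assert(start < eol && *start == '^');

	p = toy_skip_hex_oid(start + 1, eol);
	if (!p)
		return TOY_BAD_OID;
	if (p != eol)
		return TOY_TRAILING_GARBAGE;
	return TOY_OK;
}
```

Each non-OK status corresponds to a situation that the real check would
report as "badPackedRefEntry" with a more specific message.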
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.adoc | 3 +
fsck.h | 1 +
refs/packed-backend.c | 122 ++++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 44 ++++++++++++
4 files changed, 169 insertions(+), 1 deletion(-)
diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index 11906f90fd..02a7bf0503 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -16,6 +16,9 @@
`badObjectSha1`::
(ERROR) An object has a bad sha1.
+`badPackedRefEntry`::
+ (ERROR) The "packed-refs" file contains an invalid entry.
+
`badPackedRefHeader`::
(ERROR) The "packed-refs" file contains an invalid
header.
diff --git a/fsck.h b/fsck.h
index 67e3c97bc0..14d70f6653 100644
--- a/fsck.h
+++ b/fsck.h
@@ -30,6 +30,7 @@ enum fsck_msg_type {
FUNC(BAD_EMAIL, ERROR) \
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
+ FUNC(BAD_PACKED_REF_ENTRY, ERROR) \
FUNC(BAD_PACKED_REF_HEADER, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
FUNC(BAD_REF_CONTENT, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 9a90c52f70..ef20300fd3 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1812,9 +1812,114 @@ static int packed_fsck_ref_header(struct fsck_options *o,
return 0;
}
+static int packed_fsck_ref_peeled_line(struct fsck_options *o,
+ struct ref_store *ref_store,
+ unsigned long line_number,
+ const char *start, const char *eol)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct object_id peeled;
+ const char *p;
+ int ret = 0;
+
+ /*
+ * Skip the '^' and parse the peeled oid.
+ */
+ start++;
+ if (parse_oid_hex_algop(start, &peeled, &p, ref_store->repo->hash_algo)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid peeled oid",
+ (int)(eol - start), start);
+ goto cleanup;
+ }
+
+ if (p != eol) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has trailing garbage after peeled oid '%.*s'",
+ (int)(eol - p), p);
+ goto cleanup;
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ return ret;
+}
+
+static int packed_fsck_ref_main_line(struct fsck_options *o,
+ struct ref_store *ref_store,
+ unsigned long line_number,
+ struct strbuf *refname,
+ const char *start, const char *eol)
+{
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct object_id oid;
+ const char *p;
+ int ret = 0;
+
+ if (parse_oid_hex_algop(start, &oid, &p, ref_store->repo->hash_algo)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "'%.*s' has invalid oid",
+ (int)(eol - start), start);
+ goto cleanup;
+ }
+
+ if (p == eol || !isspace(*p)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "has no space after oid '%s' but with '%.*s'",
+ oid_to_hex(&oid), (int)(eol - p), p);
+ goto cleanup;
+ }
+
+ p++;
+ strbuf_reset(refname);
+ strbuf_add(refname, p, eol - p);
+ if (refname_contains_nul(refname)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_ENTRY,
+ "refname '%s' contains NULL binaries",
+ refname->buf);
+ }
+
+ if (check_refname_format(refname->buf, 0)) {
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_NAME,
+ "has bad refname '%s'", refname->buf);
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
+ struct ref_store *ref_store,
const char *start, const char *eof)
{
+ struct strbuf refname = STRBUF_INIT;
unsigned long line_number = 1;
const char *eol;
int ret = 0;
@@ -1827,6 +1932,21 @@ static int packed_fsck_ref_content(struct fsck_options *o,
line_number++;
}
+ while (start < eof) {
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ ret |= packed_fsck_ref_main_line(o, ref_store, line_number, &refname, start, eol);
+ start = eol + 1;
+ line_number++;
+ if (start < eof && *start == '^') {
+ ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
+ ret |= packed_fsck_ref_peeled_line(o, ref_store, line_number,
+ start, eol);
+ start = eol + 1;
+ line_number++;
+ }
+ }
+
+ strbuf_release(&refname);
return ret;
}
@@ -1884,7 +2004,7 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
- ret = packed_fsck_ref_content(o, packed_ref_content.buf,
+ ret = packed_fsck_ref_content(o, ref_store, packed_ref_content.buf,
packed_ref_content.buf + packed_ref_content.len);
cleanup:
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 74d876984d..a88c792ce1 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -699,4 +699,48 @@ test_expect_success 'packed-refs unknown traits should not be reported' '
)
'
+test_expect_success 'packed-refs content should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ git tag -a annotated-tag-2 -m tag-2 &&
+
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_2_oid=$(git rev-parse annotated-tag-2) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ tag_2_peeled_oid=$(git rev-parse annotated-tag-2^{}) &&
+ short_oid=$(printf "%s" $tag_1_peeled_oid | cut -c 1-4) &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $short_oid refs/heads/branch-1
+ ${branch_1_oid}x
+ $branch_2_oid refs/heads/bad-branch
+ $branch_2_oid refs/heads/branch.
+ $tag_1_oid refs/tags/annotated-tag-3
+ ^$short_oid
+ $tag_2_oid refs/tags/annotated-tag-4.
+ ^$tag_2_peeled_oid garbage
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 2: badPackedRefEntry: '\''$short_oid refs/heads/branch-1'\'' has invalid oid
+ error: packed-refs line 3: badPackedRefEntry: has no space after oid '\''$branch_1_oid'\'' but with '\''x'\''
+ error: packed-refs line 4: badRefName: has bad refname '\'' refs/heads/bad-branch'\''
+ error: packed-refs line 5: badRefName: has bad refname '\''refs/heads/branch.'\''
+ error: packed-refs line 7: badPackedRefEntry: '\''$short_oid'\'' has invalid peeled oid
+ error: packed-refs line 8: badRefName: has bad refname '\''refs/tags/annotated-tag-4.'\''
+ error: packed-refs line 9: badPackedRefEntry: has trailing garbage after peeled oid '\'' garbage'\''
+ EOF
+ test_cmp expect err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v8 8/9] packed-backend: check whether the "packed-refs" is sorted
2025-02-27 16:03 ` [PATCH v8 0/9] add more ref consistency checks shejialuo
` (6 preceding siblings ...)
2025-02-27 16:07 ` [PATCH v8 7/9] packed-backend: add "packed-refs" entry consistency check shejialuo
@ 2025-02-27 16:07 ` shejialuo
2025-02-27 16:07 ` [PATCH v8 9/9] builtin/fsck: add `git refs verify` child process shejialuo
8 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-27 16:07 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
When there is a "sorted" trait in the header of the "packed-refs" file,
it means that the entries are sorted in increasing order by refname.
We should add checks to verify whether the "packed-refs" is
sorted in this case.
Update "packed_fsck_ref_header" to record whether there is a "sorted"
trait in the header. It may seem that we could record all refnames
during the parsing process and then compare them later. However, this
is not a good design for the following reasons:
1. Because we need to store the state across the whole checking
lifetime, we would consume a lot of memory if there are many entries
in the "packed-refs" file.
2. We cannot reuse the existing compare function "cmp_packed_ref_records",
which would cause repetition.
Because "cmp_packed_ref_records" needs an extra parameter "struct
snapshot", extract the common part into a new function
"cmp_packed_refname" so that it can be reused for the comparison.
Then, create a new function "packed_fsck_ref_sorted" to parse the file
again and use the new fsck message "packedRefUnsorted(ERROR)" to report
to the user if the file is not sorted.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.adoc | 3 +
fsck.h | 1 +
refs/packed-backend.c | 116 ++++++++++++++++++++++++++++-----
t/t0602-reffiles-fsck.sh | 87 +++++++++++++++++++++++++
4 files changed, 191 insertions(+), 16 deletions(-)
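The idea in the commit message (detect the "sorted" trait, then compare
each refname only with its predecessor so that no list of refnames has
to be kept) can be sketched in standalone C. This is an illustration
under assumptions, not the code in the patch: "header_has_trait",
"cmp_refname_line" and "first_unsorted_line" are hypothetical names,
every line of the buffer is assumed to end with '\n', and oids are
assumed to be 40 hex characters.

```c
#include <stddef.h>
#include <string.h>

/* Assumption for this sketch: SHA-1 oids, i.e. 40 hex characters. */
#define HEXSZ 40

/* Return 1 if the space-separated line [start, eol) contains `trait`. */
static int header_has_trait(const char *start, const char *eol,
			    const char *trait)
{
	size_t tlen = strlen(trait);
	const char *p = start;

	while (p < eol) {
		const char *end = memchr(p, ' ', eol - p);

		if (!end)
			end = eol;
		if ((size_t)(end - p) == tlen && !memcmp(p, trait, tlen))
			return 1;
		p = end + 1;
	}
	return 0;
}

/* Compare two '\n'-terminated refnames bytewise; a shorter prefix sorts first. */
static int cmp_refname_line(const char *r1, const char *r2)
{
	while (*r1 == *r2 && *r1 != '\n') {
		r1++;
		r2++;
	}
	if (*r1 == *r2) /* both reached '\n' */
		return 0;
	if (*r1 == '\n')
		return -1;
	if (*r2 == '\n')
		return 1;
	return (unsigned char)*r1 < (unsigned char)*r2 ? -1 : 1;
}

/*
 * Return the 1-based number of the first line whose refname is not
 * strictly greater than the previous refname, or 0 if the buffer is
 * sorted. Header ('#') and peeled ('^') lines are skipped.
 */
static unsigned long first_unsorted_line(const char *start, const char *eof)
{
	const char *former = NULL;
	unsigned long line = 1;

	while (start < eof) {
		const char *eol = memchr(start, '\n', eof - start);

		if (!eol)
			break; /* unterminated last line: out of scope here */
		if (*start != '#' && *start != '^' &&
		    (size_t)(eol - start) > HEXSZ + 1) {
			const char *current = start + HEXSZ + 1;

			if (former && cmp_refname_line(former, current) >= 0)
				return line;
			former = current;
		}
		start = eol + 1;
		line++;
	}
	return 0;
}
```

Only one pointer to the previous refname is carried across iterations,
which is the memory argument made in reason 1 of the commit message.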
diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index 02a7bf0503..9601fff228 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -187,6 +187,9 @@
(ERROR) The "packed-refs" file contains an entry that is
not terminated by a newline.
+`packedRefUnsorted`::
+ (ERROR) The "packed-refs" file is not sorted.
+
`refMissingNewline`::
(INFO) A loose ref that does not end with newline(LF). As
valid implementations of Git never created such a loose ref
diff --git a/fsck.h b/fsck.h
index 14d70f6653..19f3cb2773 100644
--- a/fsck.h
+++ b/fsck.h
@@ -56,6 +56,7 @@ enum fsck_msg_type {
FUNC(MISSING_TYPE_ENTRY, ERROR) \
FUNC(MULTIPLE_AUTHORS, ERROR) \
FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
+ FUNC(PACKED_REF_UNSORTED, ERROR) \
FUNC(TREE_NOT_SORTED, ERROR) \
FUNC(UNKNOWN_TYPE, ERROR) \
FUNC(ZERO_PADDED_DATE, ERROR) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index ef20300fd3..813e5020e4 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -300,14 +300,9 @@ struct snapshot_record {
size_t len;
};
-static int cmp_packed_ref_records(const void *v1, const void *v2,
- void *cb_data)
-{
- const struct snapshot *snapshot = cb_data;
- const struct snapshot_record *e1 = v1, *e2 = v2;
- const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
- const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+static int cmp_packed_refname(const char *r1, const char *r2)
+{
while (1) {
if (*r1 == '\n')
return *r2 == '\n' ? 0 : -1;
@@ -322,6 +317,17 @@ static int cmp_packed_ref_records(const void *v1, const void *v2,
}
}
+static int cmp_packed_ref_records(const void *v1, const void *v2,
+ void *cb_data)
+{
+ const struct snapshot *snapshot = cb_data;
+ const struct snapshot_record *e1 = v1, *e2 = v2;
+ const char *r1 = e1->start + snapshot_hexsz(snapshot) + 1;
+ const char *r2 = e2->start + snapshot_hexsz(snapshot) + 1;
+
+ return cmp_packed_refname(r1, r2);
+}
+
/*
* Compare a snapshot record at `rec` to the specified NUL-terminated
* refname.
@@ -1797,19 +1803,33 @@ static int packed_fsck_ref_next_line(struct fsck_options *o,
}
static int packed_fsck_ref_header(struct fsck_options *o,
- const char *start, const char *eol)
+ const char *start, const char *eol,
+ unsigned int *sorted)
{
- if (!starts_with(start, "# pack-refs with: ")) {
+ struct string_list traits = STRING_LIST_INIT_NODUP;
+ char *tmp_line;
+ int ret = 0;
+ char *p;
+
+ tmp_line = xmemdupz(start, eol - start);
+ if (!skip_prefix(tmp_line, "# pack-refs with: ", (const char **)&p)) {
struct fsck_ref_report report = { 0 };
report.path = "packed-refs.header";
- return fsck_report_ref(o, &report,
- FSCK_MSG_BAD_PACKED_REF_HEADER,
- "'%.*s' does not start with '# pack-refs with: '",
- (int)(eol - start), start);
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_PACKED_REF_HEADER,
+ "'%.*s' does not start with '# pack-refs with: '",
+ (int)(eol - start), start);
+ goto cleanup;
}
- return 0;
+ string_list_split_in_place(&traits, p, " ", -1);
+ *sorted = unsorted_string_list_has_string(&traits, "sorted");
+
+cleanup:
+ free(tmp_line);
+ string_list_clear(&traits, 0);
+ return ret;
}
static int packed_fsck_ref_peeled_line(struct fsck_options *o,
@@ -1915,8 +1935,68 @@ static int packed_fsck_ref_main_line(struct fsck_options *o,
return ret;
}
+static int packed_fsck_ref_sorted(struct fsck_options *o,
+ struct ref_store *ref_store,
+ const char *start, const char *eof)
+{
+ size_t hexsz = ref_store->repo->hash_algo->hexsz;
+ struct strbuf packed_entry = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ struct strbuf refname1 = STRBUF_INIT;
+ struct strbuf refname2 = STRBUF_INIT;
+ unsigned long line_number = 1;
+ const char *former = NULL;
+ const char *current;
+ const char *eol;
+ int ret = 0;
+
+ if (*start == '#') {
+ eol = memchr(start, '\n', eof - start);
+ start = eol + 1;
+ line_number++;
+ }
+
+ for (; start < eof; line_number++, start = eol + 1) {
+ eol = memchr(start, '\n', eof - start);
+
+ if (*start == '^')
+ continue;
+
+ if (!former) {
+ former = start + hexsz + 1;
+ continue;
+ }
+
+ current = start + hexsz + 1;
+ if (cmp_packed_refname(former, current) >= 0) {
+ const char *err_fmt =
+ "refname '%s' is less than previous refname '%s'";
+
+ eol = memchr(former, '\n', eof - former);
+ strbuf_add(&refname1, former, eol - former);
+ eol = memchr(current, '\n', eof - current);
+ strbuf_add(&refname2, current, eol - current);
+
+ strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
+ report.path = packed_entry.buf;
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_PACKED_REF_UNSORTED,
+ err_fmt, refname2.buf, refname1.buf);
+ goto cleanup;
+ }
+ former = current;
+ }
+
+cleanup:
+ strbuf_release(&packed_entry);
+ strbuf_release(&refname1);
+ strbuf_release(&refname2);
+ return ret;
+}
+
static int packed_fsck_ref_content(struct fsck_options *o,
struct ref_store *ref_store,
+ unsigned int *sorted,
const char *start, const char *eof)
{
struct strbuf refname = STRBUF_INIT;
@@ -1926,7 +2006,7 @@ static int packed_fsck_ref_content(struct fsck_options *o,
ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
if (*start == '#') {
- ret |= packed_fsck_ref_header(o, start, eol);
+ ret |= packed_fsck_ref_header(o, start, eol, sorted);
start = eol + 1;
line_number++;
@@ -1957,6 +2037,7 @@ static int packed_fsck(struct ref_store *ref_store,
struct packed_ref_store *refs = packed_downcast(ref_store,
REF_STORE_READ, "fsck");
struct strbuf packed_ref_content = STRBUF_INIT;
+ unsigned int sorted = 0;
struct stat st;
int ret = 0;
int fd = -1;
@@ -2004,8 +2085,11 @@ static int packed_fsck(struct ref_store *ref_store,
goto cleanup;
}
- ret = packed_fsck_ref_content(o, ref_store, packed_ref_content.buf,
+ ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
packed_ref_content.buf + packed_ref_content.len);
+ if (!ret && sorted)
+ ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
+ packed_ref_content.buf + packed_ref_content.len);
cleanup:
if (fd >= 0)
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index a88c792ce1..767e2bd4a0 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -743,4 +743,91 @@ test_expect_success 'packed-refs content should be checked' '
)
'
+test_expect_success 'packed-ref with sorted trait should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ EOF
+ git refs verify 2>err &&
+ rm .git/packed-refs &&
+ test_must_be_empty err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $branch_2_oid $refname1
+ EOF
+ git refs verify 2>err &&
+ rm .git/packed-refs &&
+ test_must_be_empty err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $branch_2_oid $refname1
+ $branch_1_oid $refname2
+ $tag_1_oid $refname3
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 3: packedRefUnsorted: refname '\''$refname2'\'' is less than previous refname '\''$refname1'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled sorted
+ $tag_1_oid $refname3
+ ^$tag_1_peeled_oid
+ $branch_2_oid $refname2
+ EOF
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: packed-refs line 4: packedRefUnsorted: refname '\''$refname2'\'' is less than previous refname '\''$refname3'\''
+ EOF
+ rm .git/packed-refs &&
+ test_cmp expect err
+ )
+'
+
+test_expect_success 'packed-ref without sorted trait should not be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git tag -a annotated-tag-1 -m tag-1 &&
+ branch_1_oid=$(git rev-parse branch-1) &&
+ branch_2_oid=$(git rev-parse branch-2) &&
+ tag_1_oid=$(git rev-parse annotated-tag-1) &&
+ tag_1_peeled_oid=$(git rev-parse annotated-tag-1^{}) &&
+ refname1="refs/heads/main" &&
+ refname2="refs/heads/foo" &&
+ refname3="refs/tags/foo" &&
+
+ cat >.git/packed-refs <<-EOF &&
+ # pack-refs with: peeled fully-peeled
+ $branch_2_oid $refname1
+ $branch_1_oid $refname2
+ EOF
+ git refs verify 2>err &&
+ test_must_be_empty err
+ )
+'
+
test_done
--
2.48.1
* [PATCH v8 9/9] builtin/fsck: add `git refs verify` child process
2025-02-27 16:03 ` [PATCH v8 0/9] add more ref consistency checks shejialuo
` (7 preceding siblings ...)
2025-02-27 16:07 ` [PATCH v8 8/9] packed-backend: check whether the "packed-refs" is sorted shejialuo
@ 2025-02-27 16:07 ` shejialuo
8 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-27 16:07 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano,
Michael Haggerty
So far, we have implemented the ref consistency checks for both
"files-backend" and "packed-backend". Although we would check some
redundant things, it won't cause trouble. So, let's integrate it into
the "git-fsck(1)" command to get feedback from the users. And also by
calling "git refs verify" in "git-fsck(1)", we make sure that the newly
added checks don't break.
Introduce a new function "fsck_refs" that initializes and runs a child
process to execute the "git refs verify" command. To give the user
feedback, create a progress meter whose total is 1, because it's hard
to know how many loose refs we will check now. We might
improve this later.
Then, introduce an option to allow the user to disable checking ref
database consistency. Call this function at the very beginning of
"git-fsck(1)", because we don't want the existing code of
"git-fsck(1)", which implicitly checks the consistency of refs, to
die before the new checks have run.
Last, update the test to exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/git-fsck.adoc | 7 ++++++-
builtin/fsck.c | 33 ++++++++++++++++++++++++++++++-
t/t0602-reffiles-fsck.sh | 39 +++++++++++++++++++++++++++++++++++++
3 files changed, 77 insertions(+), 2 deletions(-)
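The child-process invocation described in the commit message can be
illustrated outside of Git's run-command API with plain POSIX calls.
This is a sketch under assumptions: "build_refs_verify_argv" and
"run_refs_verify" are hypothetical helpers; only the "git refs verify"
command and its "--verbose" and "--strict" options come from the patch.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fill argv (capacity of at least 6 entries) with the command line and
 * terminate it with NULL. Returns the number of arguments written.
 */
static size_t build_refs_verify_argv(const char **argv, int verbose, int strict)
{
	size_t n = 0;

	argv[n++] = "git";
	argv[n++] = "refs";
	argv[n++] = "verify";
	if (verbose)
		argv[n++] = "--verbose";
	if (strict)
		argv[n++] = "--strict";
	argv[n] = NULL;
	return n;
}

/* Run the command; return 0 only if the child exited with status 0. */
static int run_refs_verify(int verbose, int strict)
{
	const char *argv[6];
	int status;
	pid_t pid;

	build_refs_verify_argv(argv, verbose, strict);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		execvp(argv[0], (char *const *)argv);
		_exit(127); /* only reached if exec failed */
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? 0 : -1;
}
```

A caller would treat a non-zero return the way "fsck_refs" treats a
failing "run_command": record the error and continue with the remaining
checks instead of aborting.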
diff --git a/Documentation/git-fsck.adoc b/Documentation/git-fsck.adoc
index 8f32800a83..11203ba925 100644
--- a/Documentation/git-fsck.adoc
+++ b/Documentation/git-fsck.adoc
@@ -12,7 +12,7 @@ SYNOPSIS
'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
[--[no-]full] [--strict] [--verbose] [--lost-found]
[--[no-]dangling] [--[no-]progress] [--connectivity-only]
- [--[no-]name-objects] [<object>...]
+ [--[no-]name-objects] [--[no-]references] [<object>...]
DESCRIPTION
-----------
@@ -104,6 +104,11 @@ care about this output and want to speed it up further.
progress status even if the standard error stream is not
directed to a terminal.
+--[no-]references::
+ Control whether to check the references database consistency
+ via 'git refs verify'. See linkgit:git-refs[1] for details.
+ The default is to check the references database.
+
CONFIGURATION
-------------
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 7a4dcb0716..f4f395cfbd 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -50,6 +50,7 @@ static int verbose;
static int show_progress = -1;
static int show_dangling = 1;
static int name_objects;
+static int check_references = 1;
#define ERROR_OBJECT 01
#define ERROR_REACHABLE 02
#define ERROR_PACK 04
@@ -905,11 +906,37 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress)
return res;
}
+static void fsck_refs(struct repository *r)
+{
+ struct child_process refs_verify = CHILD_PROCESS_INIT;
+ struct progress *progress = NULL;
+
+ if (show_progress)
+ progress = start_progress(r, _("Checking ref database"), 1);
+
+ if (verbose)
+ fprintf_ln(stderr, _("Checking ref database"));
+
+ child_process_init(&refs_verify);
+ refs_verify.git_cmd = 1;
+ strvec_pushl(&refs_verify.args, "refs", "verify", NULL);
+ if (verbose)
+ strvec_push(&refs_verify.args, "--verbose");
+ if (check_strict)
+ strvec_push(&refs_verify.args, "--strict");
+
+ if (run_command(&refs_verify))
+ errors_found |= ERROR_REFS;
+
+ display_progress(progress, 1);
+ stop_progress(&progress);
+}
+
static char const * const fsck_usage[] = {
N_("git fsck [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]\n"
" [--[no-]full] [--strict] [--verbose] [--lost-found]\n"
" [--[no-]dangling] [--[no-]progress] [--connectivity-only]\n"
- " [--[no-]name-objects] [<object>...]"),
+ " [--[no-]name-objects] [--[no-]references] [<object>...]"),
NULL
};
@@ -928,6 +955,7 @@ static struct option fsck_opts[] = {
N_("write dangling objects in .git/lost-found")),
OPT_BOOL(0, "progress", &show_progress, N_("show progress")),
OPT_BOOL(0, "name-objects", &name_objects, N_("show verbose names for reachable objects")),
+ OPT_BOOL(0, "references", &check_references, N_("check reference database consistency")),
OPT_END(),
};
@@ -970,6 +998,9 @@ int cmd_fsck(int argc,
git_config(git_fsck_config, &fsck_obj_options);
prepare_repo_settings(the_repository);
+ if (check_references)
+ fsck_refs(the_repository);
+
if (connectivity_only) {
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
for_each_packed_object(the_repository,
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 767e2bd4a0..9d1dc2144c 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -830,4 +830,43 @@ test_expect_success 'packed-ref without sorted trait should not be checked' '
)
'
+test_expect_success '--[no-]references option should apply to fsck' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ (
+ cd repo &&
+ test_commit default &&
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck --references 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse HEAD)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git fsck --no-references 2>err &&
+ rm $branch_dir_prefix/branch-garbage &&
+ test_must_be_empty err || return 1
+ done
+ )
+'
+
test_done
--
2.48.1
* Re: [PATCH v7 3/9] packed-backend: check whether the "packed-refs" is regular file
2025-02-27 0:57 ` shejialuo
2025-02-27 14:10 ` Patrick Steinhardt
@ 2025-02-27 16:57 ` Junio C Hamano
2025-02-28 5:02 ` shejialuo
1 sibling, 1 reply; 168+ messages in thread
From: Junio C Hamano @ 2025-02-27 16:57 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
shejialuo <shejialuo@gmail.com> writes:
> You are right. Actually, I just want to avoid assigning the `fd` to -1.
Why not?
Between leaving it uninitialized and explicitly initializing it to
signal that it is invalid, the only difference is that you can
programmatically check if fd is invalid and refrain from calling
close(fd), for example, with the latter, while with the former you
cannot.
> However, I didn't realize that I would initialize the strbuf later.
> After waking up, I have suddenly realized this problem.
Given that initialized-but-never-used strbuf does not hold any
acquired resources, the current code at the end of the series is
still OK. So there is technically nothing to fix. I'll take a
reroll if you later send one, but as I said, I do not think it is
necessary to reroll only to add fd=-1 initialization.
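As an aside for readers, the pattern under discussion can be shown with
a small standalone sketch. It is not code from the series, and
"file_size_or_error" is a hypothetical function. Because "fd" starts at
-1, the shared cleanup path can tell whether there is anything to close,
even when it is reached before open() was called.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return the size of the file at `path`, or -1 on any error. */
static long file_size_or_error(const char *path)
{
	struct stat st;
	long ret = -1;
	int fd = -1; /* -1 marks "nothing opened yet" */

	if (!path || !*path)
		goto cleanup; /* fd is still -1 here, so close() is skipped */

	fd = open(path, O_RDONLY);
	if (fd < 0)
		goto cleanup;

	if (fstat(fd, &st) < 0)
		goto cleanup;
	ret = (long)st.st_size;

cleanup:
	if (fd >= 0)
		close(fd);
	return ret;
}
```

With an uninitialized "fd", the first "goto cleanup" would reach the
check with an indeterminate value, so the function could not safely
decide whether to call close().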
* Re: [PATCH v7 3/9] packed-backend: check whether the "packed-refs" is regular file
2025-02-27 16:57 ` Junio C Hamano
@ 2025-02-28 5:02 ` shejialuo
0 siblings, 0 replies; 168+ messages in thread
From: shejialuo @ 2025-02-28 5:02 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Patrick Steinhardt, Karthik Nayak, Michael Haggerty
On Thu, Feb 27, 2025 at 08:57:11AM -0800, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > You are right. Actually, I just want to avoid assigning the `fd` to -1.
>
> Why not?
>
> Between leaving it uninitialized and explicitly initializing it to
> signal that it is invalid, the only difference is that you can
> programmatically check if fd is invalid and refrain from calling
> close(fd), for example, with the latter, while with the former you
> cannot.
>
Yes, that's correct.
> > However, I didn't realize that I would initialize the strbuf later.
> > After waking up, I have suddenly realized this problem.
>
> Given that initialized-but-never-used strbuf does not hold any
> acquired resources, the current code at the end of the series is
> still OK. So there is technically nothing to fix. I'll take a
> reroll if you later send one, but as I said, I do not think it is
> necessary to reroll only to add fd=-1 initialization.
Yes, as you have said, there is nothing wrong for now. And as Patrick has
no further comments, I have sent out a reroll to make the code better.
Thanks,
Jialuo
Thread overview: 168+ messages
2025-01-05 13:46 [PATCH 00/10] add more ref consistency checks shejialuo
2025-01-05 13:49 ` [PATCH 01/10] files-backend: add object check for regular ref shejialuo
2025-01-07 14:17 ` Karthik Nayak
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-17 13:40 ` shejialuo
2025-01-24 7:54 ` Patrick Steinhardt
2025-01-05 13:49 ` [PATCH 02/10] builtin/refs.h: get worktrees without reading head info shejialuo
2025-01-07 14:57 ` Karthik Nayak
2025-01-07 16:34 ` shejialuo
2025-01-08 8:40 ` Karthik Nayak
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-05 13:49 ` [PATCH 03/10] packed-backend: check whether the "packed-refs" is regular shejialuo
2025-01-07 16:33 ` Karthik Nayak
2025-01-17 14:00 ` shejialuo
2025-01-17 22:01 ` Eric Sunshine
2025-01-18 3:05 ` shejialuo
2025-01-19 8:03 ` Karthik Nayak
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-05 13:49 ` [PATCH 04/10] packed-backend: add "packed-refs" header consistency check shejialuo
2025-01-08 0:54 ` shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-17 14:23 ` shejialuo
2025-01-24 7:51 ` Patrick Steinhardt
2025-02-17 13:16 ` shejialuo
2025-01-05 13:49 ` [PATCH 05/10] packed-backend: check whether the refname contains NULL binaries shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-17 14:33 ` shejialuo
2025-01-05 13:49 ` [PATCH 06/10] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-17 14:35 ` shejialuo
2025-01-05 13:50 ` [PATCH 07/10] packed-backend: create "fsck_packed_ref_entry" to store parsing info shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-05 13:50 ` [PATCH 08/10] packed-backend: add check for object consistency shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-05 13:50 ` [PATCH 09/10] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-01-16 13:57 ` Patrick Steinhardt
2025-01-05 13:50 ` [PATCH 10/10] builtin/fsck: add `git refs verify` child process shejialuo
2025-01-06 22:16 ` Junio C Hamano
2025-01-07 12:00 ` shejialuo
2025-01-07 15:52 ` Junio C Hamano
2025-01-30 4:04 ` [PATCH v2 0/8] add more ref consistency checks shejialuo
2025-01-30 4:06 ` [PATCH v2 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
2025-01-30 17:53 ` Junio C Hamano
2025-01-30 4:07 ` [PATCH v2 2/8] builtin/refs: get worktrees without reading head info shejialuo
2025-01-30 18:04 ` Junio C Hamano
2025-01-31 13:29 ` shejialuo
2025-01-31 16:16 ` Junio C Hamano
2025-01-30 4:07 ` [PATCH v2 3/8] packed-backend: check whether the "packed-refs" is regular shejialuo
2025-01-30 18:23 ` Junio C Hamano
2025-01-31 13:54 ` shejialuo
2025-01-31 16:20 ` Junio C Hamano
2025-02-01 9:47 ` shejialuo
2025-02-03 20:15 ` Junio C Hamano
2025-02-04 3:58 ` shejialuo
2025-02-03 8:40 ` Patrick Steinhardt
2025-01-30 4:07 ` [PATCH v2 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
2025-01-30 18:58 ` Junio C Hamano
2025-01-31 14:23 ` shejialuo
2025-01-30 4:07 ` [PATCH v2 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
2025-02-03 8:40 ` Patrick Steinhardt
2025-02-05 10:09 ` shejialuo
2025-01-30 4:07 ` [PATCH v2 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-02-03 8:40 ` Patrick Steinhardt
2025-02-04 4:28 ` shejialuo
2025-01-30 4:08 ` [PATCH v2 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-01-30 19:02 ` Junio C Hamano
2025-01-31 14:35 ` shejialuo
2025-01-31 16:23 ` Junio C Hamano
2025-02-01 9:50 ` shejialuo
2025-02-03 8:40 ` Patrick Steinhardt
2025-02-03 8:40 ` Patrick Steinhardt
2025-01-30 4:08 ` [PATCH v2 8/8] builtin/fsck: add `git refs verify` child process shejialuo
2025-01-30 19:03 ` Junio C Hamano
2025-01-31 14:37 ` shejialuo
2025-02-03 8:40 ` Patrick Steinhardt
2025-02-04 5:32 ` shejialuo
2025-02-06 5:56 ` [PATCH v3 0/8] add more ref consistency checks shejialuo
2025-02-06 5:58 ` [PATCH v3 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-06 5:58 ` [PATCH v3 2/8] builtin/refs: get worktrees without reading head information shejialuo
2025-02-06 5:58 ` [PATCH v3 3/8] packed-backend: check whether the "packed-refs" is regular file shejialuo
2025-02-06 5:59 ` [PATCH v3 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
2025-02-12 9:56 ` Patrick Steinhardt
2025-02-12 10:12 ` shejialuo
2025-02-12 17:48 ` Junio C Hamano
2025-02-14 3:53 ` shejialuo
2025-02-06 5:59 ` [PATCH v3 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
2025-02-06 5:59 ` [PATCH v3 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-02-12 9:56 ` Patrick Steinhardt
2025-02-12 10:18 ` shejialuo
2025-02-06 5:59 ` [PATCH v3 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-02-12 9:56 ` Patrick Steinhardt
2025-02-12 10:20 ` shejialuo
2025-02-12 10:42 ` Patrick Steinhardt
2025-02-12 10:56 ` shejialuo
2025-02-06 6:00 ` [PATCH v3 8/8] builtin/fsck: add `git refs verify` child process shejialuo
2025-02-12 9:56 ` Patrick Steinhardt
2025-02-12 10:21 ` shejialuo
2025-02-14 4:50 ` [PATCH v4 0/8] add more ref consistency checks shejialuo
2025-02-14 4:51 ` [PATCH v4 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-14 4:52 ` [PATCH v4 2/8] builtin/refs: get worktrees without reading head information shejialuo
2025-02-14 9:19 ` Karthik Nayak
2025-02-14 12:20 ` shejialuo
2025-02-14 4:52 ` [PATCH v4 3/8] packed-backend: check whether the "packed-refs" is regular file shejialuo
2025-02-14 9:50 ` Karthik Nayak
2025-02-14 12:37 ` shejialuo
2025-02-14 4:52 ` [PATCH v4 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
2025-02-14 10:30 ` Karthik Nayak
2025-02-14 12:43 ` shejialuo
2025-02-14 14:01 ` Junio C Hamano
2025-02-14 4:52 ` [PATCH v4 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
2025-02-14 4:53 ` [PATCH v4 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-02-14 4:59 ` [PATCH v4 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-02-14 4:59 ` [PATCH v4 8/8] builtin/fsck: add `git refs verify` child process shejialuo
2025-02-14 9:04 ` [PATCH v4 0/8] add more ref consistency checks Karthik Nayak
2025-02-14 12:16 ` shejialuo
2025-02-17 15:25 ` [PATCH v5 " shejialuo
2025-02-17 15:27 ` [PATCH v5 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-17 15:27 ` [PATCH v5 2/8] builtin/refs: get worktrees without reading head information shejialuo
2025-02-25 8:26 ` Patrick Steinhardt
2025-02-17 15:27 ` [PATCH v5 3/8] packed-backend: check whether the "packed-refs" is regular file shejialuo
2025-02-25 8:27 ` Patrick Steinhardt
2025-02-17 15:27 ` [PATCH v5 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
2025-02-25 8:27 ` Patrick Steinhardt
2025-02-25 12:34 ` shejialuo
2025-02-17 15:27 ` [PATCH v5 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
2025-02-17 15:28 ` [PATCH v5 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-02-17 15:28 ` [PATCH v5 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-02-17 15:28 ` [PATCH v5 8/8] builtin/fsck: add `git refs verify` child process shejialuo
2025-02-25 8:27 ` [PATCH v5 0/8] add more ref consistency checks Patrick Steinhardt
2025-02-25 13:19 ` [PATCH v6 0/9] " shejialuo
2025-02-25 13:21 ` [PATCH v6 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-25 13:21 ` [PATCH v6 2/9] builtin/refs: get worktrees without reading head information shejialuo
2025-02-25 13:21 ` [PATCH v6 3/9] packed-backend: check whether the "packed-refs" is regular file shejialuo
2025-02-25 17:44 ` Junio C Hamano
2025-02-26 12:05 ` shejialuo
2025-02-25 13:21 ` [PATCH v6 4/9] packed-backend: check if header starts with "# pack-refs with: " shejialuo
2025-02-26 8:08 ` Patrick Steinhardt
2025-02-26 12:28 ` shejialuo
2025-02-25 13:21 ` [PATCH v6 5/9] packed-backend: add "packed-refs" header consistency check shejialuo
2025-02-25 13:21 ` [PATCH v6 6/9] packed-backend: check whether the refname contains NUL characters shejialuo
2025-02-25 13:22 ` [PATCH v6 7/9] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-02-25 13:22 ` [PATCH v6 8/9] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-02-25 13:22 ` [PATCH v6 9/9] builtin/fsck: add `git refs verify` child process shejialuo
2025-02-26 13:48 ` [PATCH v7 0/9] add more ref consistency checks shejialuo
2025-02-26 13:49 ` [PATCH v7 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-26 13:49 ` [PATCH v7 2/9] builtin/refs: get worktrees without reading head information shejialuo
2025-02-26 13:49 ` [PATCH v7 3/9] packed-backend: check whether the "packed-refs" is regular file shejialuo
2025-02-26 18:36 ` Junio C Hamano
2025-02-27 0:57 ` shejialuo
2025-02-27 14:10 ` Patrick Steinhardt
2025-02-27 16:57 ` Junio C Hamano
2025-02-28 5:02 ` shejialuo
2025-02-26 13:50 ` [PATCH v7 4/9] packed-backend: check if header starts with "# pack-refs with: " shejialuo
2025-02-26 13:50 ` [PATCH v7 5/9] packed-backend: add "packed-refs" header consistency check shejialuo
2025-02-26 13:50 ` [PATCH v7 6/9] packed-backend: check whether the refname contains NUL characters shejialuo
2025-02-26 13:50 ` [PATCH v7 7/9] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-02-26 13:50 ` [PATCH v7 8/9] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-02-26 13:51 ` [PATCH v7 9/9] builtin/fsck: add `git refs verify` child process shejialuo
2025-02-27 16:03 ` [PATCH v8 0/9] add more ref consistency checks shejialuo
2025-02-27 16:05 ` [PATCH v8 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-27 16:06 ` [PATCH v8 2/9] builtin/refs: get worktrees without reading head information shejialuo
2025-02-27 16:06 ` [PATCH v8 3/9] packed-backend: check whether the "packed-refs" is regular file shejialuo
2025-02-27 16:06 ` [PATCH v8 4/9] packed-backend: check if header starts with "# pack-refs with: " shejialuo
2025-02-27 16:06 ` [PATCH v8 5/9] packed-backend: add "packed-refs" header consistency check shejialuo
2025-02-27 16:07 ` [PATCH v8 6/9] packed-backend: check whether the refname contains NUL characters shejialuo
2025-02-27 16:07 ` [PATCH v8 7/9] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-02-27 16:07 ` [PATCH v8 8/9] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-02-27 16:07 ` [PATCH v8 9/9] builtin/fsck: add `git refs verify` child process shejialuo