* [PATCH v5 1/9] ref: initialize "fsck_ref_report" with zero
2024-09-29 7:13 ` [PATCH v5 0/9] " shejialuo
@ 2024-09-29 7:15 ` shejialuo
2024-10-08 7:29 ` Karthik Nayak
2024-09-29 7:15 ` [PATCH v5 2/9] builtin/refs: support multiple worktrees check for refs shejialuo
` (10 subsequent siblings)
11 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-09-29 7:15 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
In "fsck.c::fsck_refs_error_function", we need to tell whether "oid" and
"referent" are NULL. So, we must always initialize these members to
NULL instead of leaving them pointing anywhere when creating a new
"fsck_ref_report" structure.

The original code explicitly initializes the "path" member in the
"struct fsck_ref_report" to NULL (which implicitly 0-initializes other
members in the struct). It is more customary to use "{ 0 }" to express
that we are 0-initializing everything. In order to align with the
codebase, initialize "fsck_ref_report" with zero.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/files-backend.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0824c0b8a9..03d2503276 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3520,7 +3520,7 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
goto cleanup;
if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
- struct fsck_ref_report report = { .path = NULL };
+ struct fsck_ref_report report = { 0 };
strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
report.path = sb.buf;
--
2.46.2
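
The commit message above relies on a C rule: members not named in an initializer are zero-initialized, so "{ .path = NULL }" and "{ 0 }" produce the same object. A minimal sketch of that rule follows. The struct "demo_report" and its helpers are hypothetical stand-ins, not the real "struct fsck_ref_report", whose layout may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for "struct fsck_ref_report"; not the real layout. */
struct demo_report {
	const char *path;
	const void *oid;
	const char *referent;
};

/* Returns 1 when every pointer member is NULL. */
static int demo_report_is_zeroed(const struct demo_report *r)
{
	return !r->path && !r->oid && !r->referent;
}

/* Checks that both initializer spellings leave all members NULL. */
static void demo_initializers(void)
{
	struct demo_report designated = { .path = NULL };
	struct demo_report zeroed = { 0 };

	assert(demo_report_is_zeroed(&designated));
	assert(demo_report_is_zeroed(&zeroed));
}
```

Calling demo_initializers() from any main() exercises both forms. The choice between them is about what the reader sees: "{ 0 }" states the intent to clear everything without singling out one member.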
* [PATCH v5 2/9] builtin/refs: support multiple worktrees check for refs
2024-09-29 7:13 ` [PATCH v5 0/9] " shejialuo
2024-09-29 7:15 ` [PATCH v5 1/9] ref: initialize "fsck_ref_report" with zero shejialuo
@ 2024-09-29 7:15 ` shejialuo
2024-10-07 6:58 ` Patrick Steinhardt
2024-09-29 7:15 ` [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
` (9 subsequent siblings)
11 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-09-29 7:15 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We have already set up the infrastructure to check the consistency for
refs, but we do not support multiple worktrees. As we decide to add more
checks for ref content, we need to set up support for multiple
worktrees. Use "get_worktrees" and "get_worktree_ref_store" to check
refs under the worktrees.
Because we should only check "packed-refs" once, let's call the fsck
function for packed-backend only when in the main worktree. In order to
know which directory we check, we should print this information by
default instead of requiring "--verbose".

It's not suitable to print this information to stderr. So, change
to stdout.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
builtin/refs.c | 11 ++++++--
refs/files-backend.c | 18 ++++++++----
t/t0602-reffiles-fsck.sh | 59 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 81 insertions(+), 7 deletions(-)
diff --git a/builtin/refs.c b/builtin/refs.c
index 24978a7b7b..3c492ea922 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -5,6 +5,7 @@
#include "parse-options.h"
#include "refs.h"
#include "strbuf.h"
+#include "worktree.h"
#define REFS_MIGRATE_USAGE \
N_("git refs migrate --ref-format=<format> [--dry-run]")
@@ -66,6 +67,7 @@ static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
{
struct fsck_options fsck_refs_options = FSCK_REFS_OPTIONS_DEFAULT;
+ struct worktree **worktrees, **p;
const char * const verify_usage[] = {
REFS_VERIFY_USAGE,
NULL,
@@ -75,7 +77,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
OPT_BOOL(0, "strict", &fsck_refs_options.strict, N_("enable strict checking")),
OPT_END(),
};
- int ret;
+ int ret = 0;
argc = parse_options(argc, argv, prefix, options, verify_usage, 0);
if (argc)
@@ -84,9 +86,14 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
git_config(git_fsck_config, &fsck_refs_options);
prepare_repo_settings(the_repository);
- ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options);
+ worktrees = get_worktrees();
+ for (p = worktrees; *p; p++) {
+ struct worktree *wt = *p;
+ ret += refs_fsck(get_worktree_ref_store(wt), &fsck_refs_options);
+ }
fsck_options_clear(&fsck_refs_options);
+ free_worktrees(worktrees);
return ret;
}
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 03d2503276..57318b4c4e 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3558,7 +3558,7 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
} else if (S_ISREG(iter->st.st_mode) ||
S_ISLNK(iter->st.st_mode)) {
if (o->verbose)
- fprintf_ln(stderr, "Checking %s/%s",
+ fprintf_ln(stdout, "Checking %s/%s",
refs_check_dir, iter->relative_path);
for (size_t i = 0; fsck_refs_fn[i]; i++) {
if (fsck_refs_fn[i](ref_store, o, refs_check_dir, iter))
@@ -3589,8 +3589,8 @@ static int files_fsck_refs(struct ref_store *ref_store,
NULL,
};
- if (o->verbose)
- fprintf_ln(stderr, _("Checking references consistency"));
+ fprintf_ln(stdout, _("Checking references consistency in %s"),
+ ref_store->gitdir);
return files_fsck_refs_dir(ref_store, o, "refs", fsck_refs_fn);
}
@@ -3600,8 +3600,16 @@ static int files_fsck(struct ref_store *ref_store,
struct files_ref_store *refs =
files_downcast(ref_store, REF_STORE_READ, "fsck");
- return files_fsck_refs(ref_store, o) |
- refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
+ int ret = files_fsck_refs(ref_store, o);
+
+ /*
+ * packed-refs should only be checked once because it is shared
+ * between all worktrees.
+ */
+ if (!strcmp(ref_store->gitdir, ref_store->repo->gitdir))
+ ret += refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
+
+ return ret;
}
struct ref_storage_be refs_be_files = {
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 71a4d1a5ae..4c6cd6f7d0 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -89,4 +89,63 @@ test_expect_success 'ref name check should be adapted into fsck messages' '
test_must_be_empty err
'
+test_expect_success 'ref name check should work for multiple worktrees' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+
+ cd repo &&
+ test_commit initial &&
+ git checkout -b branch-1 &&
+ test_commit second &&
+ git checkout -b branch-2 &&
+ test_commit third &&
+ git checkout -b branch-3 &&
+ git worktree add ./worktree-1 branch-1 &&
+ git worktree add ./worktree-2 branch-2 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+
+ cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/.branch-2 &&
+ cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/@ &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/worktree/.branch-2: badRefName: invalid refname format
+ error: refs/worktree/@: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err &&
+
+ (
+ cd worktree-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/worktree/.branch-2: badRefName: invalid refname format
+ error: refs/worktree/@: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ ) &&
+
+ (
+ cd worktree-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/worktree/.branch-2: badRefName: invalid refname format
+ error: refs/worktree/@: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
+'
+
test_done
--
2.46.2
* Re: [PATCH v5 2/9] builtin/refs: support multiple worktrees check for refs
2024-09-29 7:15 ` [PATCH v5 2/9] builtin/refs: support multiple worktrees check for refs shejialuo
@ 2024-10-07 6:58 ` Patrick Steinhardt
2024-10-07 8:42 ` shejialuo
0 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-10-07 6:58 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Sun, Sep 29, 2024 at 03:15:26PM +0800, shejialuo wrote:
> We have already set up the infrastructure to check the consistency for
> refs, but we do not support multiple worktrees. As we decide to add more
> checks for ref content, we need to set up support for multiple
> worktrees. Use "get_worktrees" and "get_worktree_ref_store" to check
> refs under the worktrees.
Makes sense.
> Because we should only check "packed-refs" once, let's call the fsck
> function for packed-backend only when in the main worktree. In order to
> know which directory we check, we should print this information by
> default instead of requiring "--verbose".
This change should likely be evicted into its own commit with a bit more
explanation.
> It's not suitable to print this information to stderr. So, change
> to stdout.
This one, too. Why exactly is it not suitable to print to stderr?
> Mentored-by: Patrick Steinhardt <ps@pks.im>
> Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
> builtin/refs.c | 11 ++++++--
> refs/files-backend.c | 18 ++++++++----
> t/t0602-reffiles-fsck.sh | 59 ++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 81 insertions(+), 7 deletions(-)
>
> diff --git a/builtin/refs.c b/builtin/refs.c
> index 24978a7b7b..3c492ea922 100644
> --- a/builtin/refs.c
> +++ b/builtin/refs.c
> @@ -5,6 +5,7 @@
> #include "parse-options.h"
> #include "refs.h"
> #include "strbuf.h"
> +#include "worktree.h"
>
> #define REFS_MIGRATE_USAGE \
> N_("git refs migrate --ref-format=<format> [--dry-run]")
> @@ -66,6 +67,7 @@ static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
> static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
> {
> struct fsck_options fsck_refs_options = FSCK_REFS_OPTIONS_DEFAULT;
> + struct worktree **worktrees, **p;
> const char * const verify_usage[] = {
> REFS_VERIFY_USAGE,
> NULL,
> @@ -75,7 +77,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
> OPT_BOOL(0, "strict", &fsck_refs_options.strict, N_("enable strict checking")),
> OPT_END(),
> };
> - int ret;
> + int ret = 0;
>
> argc = parse_options(argc, argv, prefix, options, verify_usage, 0);
> if (argc)
> @@ -84,9 +86,14 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
> git_config(git_fsck_config, &fsck_refs_options);
> prepare_repo_settings(the_repository);
>
> - ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options);
> + worktrees = get_worktrees();
> + for (p = worktrees; *p; p++) {
> + struct worktree *wt = *p;
> + ret += refs_fsck(get_worktree_ref_store(wt), &fsck_refs_options);
> + }
I think it is more customary to say `ret |=` instead of `ret +=`.
Otherwise we could at least in theory wrap around and even land at `ret
== 0`, even though this is quite unlikely.
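
To illustrate why OR-ing return codes is the safer habit, here is a minimal sketch. It does not reproduce the wrap-around case mentioned above. It shows a simpler, related failure mode: cancellation, under the assumption (not stated in the thread) that checks could return both negative and positive codes. All names are hypothetical.

```c
#include <assert.h>

/* Hypothetical check results: one negative and one positive failure code. */
static const int demo_results[] = { -1, 1 };

/* Accumulate with "+=": failure codes of opposite sign can cancel out. */
static int demo_combine_add(const int *results, int n)
{
	int ret = 0;

	for (int i = 0; i < n; i++)
		ret += results[i];
	return ret;
}

/* Accumulate with "|=": the result is non-zero if any input is non-zero. */
static int demo_combine_or(const int *results, int n)
{
	int ret = 0;

	for (int i = 0; i < n; i++)
		ret |= results[i];
	return ret;
}
```

With these inputs the sum is zero, which a caller would read as success, while the OR-ed value stays non-zero.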
> fsck_options_clear(&fsck_refs_options);
> + free_worktrees(worktrees);
> return ret;
> }
>
[snip]
> @@ -3600,8 +3600,16 @@ static int files_fsck(struct ref_store *ref_store,
> struct files_ref_store *refs =
> files_downcast(ref_store, REF_STORE_READ, "fsck");
>
> - return files_fsck_refs(ref_store, o) |
> - refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
> + int ret = files_fsck_refs(ref_store, o);
> +
> + /*
> + * packed-refs should only be checked once because it is shared
> + * between all worktrees.
> + */
> + if (!strcmp(ref_store->gitdir, ref_store->repo->gitdir))
> + ret += refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
> +
> + return ret;
> }
>
> struct ref_storage_be refs_be_files = {
What is the current behaviour? Is it that we verify the packed-refs file
multiple times, or rather that we call `packed_ref_store->be->fsck()`
many times even though we know it won't do anything except for the
main worktree?
If it is the former I very much agree that we should make this
conditional. If it's the latter I'm more in the camp of letting it be
such that if worktrees were to ever gain support for "packed-refs" we
wouldn't have to change anything.
In any case, as proposed I think it would make sense to evict this into
a standalone commit such that these details can be explained in the
commit message.
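
The condition under discussion can be sketched in isolation. The following is a simplified model, assuming each store records its own directory and the repository's common directory as plain strings. "demo_store" and both helpers are hypothetical and are not git's real structures.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a per-worktree ref store. */
struct demo_store {
	const char *gitdir;      /* directory of this worktree's ref store */
	const char *repo_gitdir; /* common directory of the repository */
};

/* A store belongs to the main worktree when both directories match. */
static int demo_is_main_store(const struct demo_store *s)
{
	return !strcmp(s->gitdir, s->repo_gitdir);
}

/* Counts how often a check of a shared file would run across all stores. */
static int demo_count_shared_checks(const struct demo_store *stores, int n)
{
	int count = 0;

	for (int i = 0; i < n; i++)
		if (demo_is_main_store(&stores[i]))
			count++;
	return count;
}
```

For one main worktree and any number of linked worktrees, the shared check runs once. Whether the guard belongs in the caller or inside the packed backend is exactly the open question above; the sketch only models the guard itself.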
Patrick
* Re: [PATCH v5 2/9] builtin/refs: support multiple worktrees check for refs
2024-10-07 6:58 ` Patrick Steinhardt
@ 2024-10-07 8:42 ` shejialuo
2024-10-07 9:16 ` Patrick Steinhardt
0 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-10-07 8:42 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 07, 2024 at 08:58:30AM +0200, Patrick Steinhardt wrote:
> On Sun, Sep 29, 2024 at 03:15:26PM +0800, shejialuo wrote:
> > We have already set up the infrastructure to check the consistency for
> > refs, but we do not support multiple worktrees. As we decide to add more
> > checks for ref content, we need to set up support for multiple
> > worktrees. Use "get_worktrees" and "get_worktree_ref_store" to check
> > refs under the worktrees.
>
> Makes sense.
>
> > Because we should only check "packed-refs" once, let's call the fsck
> > function for packed-backend only when in the main worktree. In order to
> > know which directory we check, we should print this information by
> > default instead of requiring "--verbose".
>
> This change should likely be evicted into its own commit with a bit more
> explanation.
>
> > It's not suitable to print this information to stderr. So, change
> > to stdout.
>
> This one, too. Why exactly is it not suitable to print to stderr?
>
I am sorry for the confusion. We should not print which directory we
are checking to stderr, because I think this would fill the test
scripts with unrelated output when using "git refs verify 2>err".

The reason for printing the directory at all is that, when checking the
consistency of refs in multiple worktrees, ref names can repeat. For
example, worktree A has its own ref called "test" under
".git/worktrees/A/refs/worktree/test" and worktree B has its own ref,
also called "test", under ".git/worktrees/B/refs/worktree/test".

However, the refname would be printed as "refs/worktree/test" in both
cases, which leaves the user unsure which "refs/worktree/test" was
checked. So, we should print this information like:

Checking references consistency in .git
...
Checking references consistency in .git/worktrees/A
...
Checking references consistency in .git/worktrees/B

However, when writing this, I feel that printing ".git" is a bad choice.
It will make the user think that everything is checked here. This should
be improved in the next version.
> > @@ -75,7 +77,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
> > OPT_BOOL(0, "strict", &fsck_refs_options.strict, N_("enable strict checking")),
> > OPT_END(),
> > };
> > - int ret;
> > + int ret = 0;
> >
> > argc = parse_options(argc, argv, prefix, options, verify_usage, 0);
> > if (argc)
> > @@ -84,9 +86,14 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
> > git_config(git_fsck_config, &fsck_refs_options);
> > prepare_repo_settings(the_repository);
> >
> > - ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options);
> > + worktrees = get_worktrees();
> > + for (p = worktrees; *p; p++) {
> > + struct worktree *wt = *p;
> > + ret += refs_fsck(get_worktree_ref_store(wt), &fsck_refs_options);
> > + }
>
> I think it is more customary to say `ret |=` instead of `ret +=`.
> Otherwise we could at least in theory wrap around and even land at `ret
> == 0`, even though this is quite unlikely.
>
I agree here. I will improve this in the next version.
[snip]
> > @@ -3600,8 +3600,16 @@ static int files_fsck(struct ref_store *ref_store,
> > struct files_ref_store *refs =
> > files_downcast(ref_store, REF_STORE_READ, "fsck");
> >
> > - return files_fsck_refs(ref_store, o) |
> > - refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
> > + int ret = files_fsck_refs(ref_store, o);
> > +
> > + /*
> > + * packed-refs should only be checked once because it is shared
> > + * between all worktrees.
> > + */
> > + if (!strcmp(ref_store->gitdir, ref_store->repo->gitdir))
> > + ret += refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
> > +
> > + return ret;
> > }
> >
> > struct ref_storage_be refs_be_files = {
>
> What is the current behaviour? Is it that we verify the packed-refs file
> multiple times, or rather that we call `packed_ref_store->be->fsck()`
> many times even though we know it won't do anything except for the
> main worktree?
>
That's a good question. I think the second is the current behaviour: we
call `packed_ref_store->be->fsck()` many times. I understand what you
mean here; we can just put the check into the
`packed_ref_store->be->fsck()` function.
> If it is the former I very much agree that we should make this
> conditional. If it's the latter I'm more in the camp of letting it be
> such that if worktrees were to ever gain support for "packed-refs" we
> wouldn't have to change anything.
>
I agree.
> In any case, as proposed I think it would make sense to evict this into
> a standalone commit such that these details can be explained in the
> commit message.
>
Yes, the current commit message lacks details.
> Patrick
Thanks,
Jialuo
* Re: [PATCH v5 2/9] builtin/refs: support multiple worktrees check for refs
2024-10-07 8:42 ` shejialuo
@ 2024-10-07 9:16 ` Patrick Steinhardt
2024-10-07 12:06 ` shejialuo
0 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-10-07 9:16 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 07, 2024 at 04:42:21PM +0800, shejialuo wrote:
> On Mon, Oct 07, 2024 at 08:58:30AM +0200, Patrick Steinhardt wrote:
> > On Sun, Sep 29, 2024 at 03:15:26PM +0800, shejialuo wrote:
> > > We have already set up the infrastructure to check the consistency for
> > > refs, but we do not support multiple worktrees. As we decide to add more
> > > checks for ref content, we need to set up support for multiple
> > > worktrees. Use "get_worktrees" and "get_worktree_ref_store" to check
> > > refs under the worktrees.
> >
> > Makes sense.
> >
> > > Because we should only check "packed-refs" once, let's call the fsck
> > > function for packed-backend only when in the main worktree. In order to
> > > know which directory we check, we should print this information by
> > > default instead of requiring "--verbose".
> >
> > This change should likely be evicted into its own commit with a bit more
> > explanation.
> >
> > > It's not suitable to print this information to stderr. So, change
> > > to stdout.
> >
> > This one, too. Why exactly is it not suitable to print to stderr?
> >
>
> I am sorry for the confusion. We should not print which directory we
> are checking to stderr, because I think this would fill the test
> scripts with unrelated output when using "git refs verify 2>err".
>
> The reason for printing the directory at all is that, when checking the
> consistency of refs in multiple worktrees, ref names can repeat. For
> example, worktree A has its own ref called "test" under
> ".git/worktrees/A/refs/worktree/test" and worktree B has its own ref,
> also called "test", under ".git/worktrees/B/refs/worktree/test".
>
> However, the refname would be printed as "refs/worktree/test" in both
> cases, which leaves the user unsure which "refs/worktree/test" was
> checked. So, we should print this information like:
>
> Checking references consistency in .git
> ...
> Checking references consistency in .git/worktrees/A
> ...
> Checking references consistency in .git/worktrees/B
>
> However, when writing this, I feel that printing ".git" is a bad choice.
> It will make the user think that everything is checked here. This should
> be improved in the next version.
But wouldn't it be the better solution if we printed the fully-qualified
reference name "worktrees/worktree/refs/worktree/test" instead? That
would remove the need to say which directory we're currently verifying
in the first place.
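
A minimal sketch of what building such a qualified name could look like, assuming the form "worktrees/<id>/<refname>" for linked worktrees and the bare refname for the main worktree. The helper "demo_qualify_refname" is hypothetical and is not an existing git function.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: prefix a per-worktree refname with
 * "worktrees/<id>/". A NULL id stands for the main worktree, whose
 * refname is copied unchanged.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
static int demo_qualify_refname(char *out, size_t outlen,
				const char *worktree_id, const char *refname)
{
	int n;

	if (worktree_id)
		n = snprintf(out, outlen, "worktrees/%s/%s",
			     worktree_id, refname);
	else
		n = snprintf(out, outlen, "%s", refname);

	/* snprintf() reports the length it needed; detect truncation. */
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With names like these, two refs that are both called "refs/worktree/test" stay distinguishable in error output without a separate "Checking ..." header per directory.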
Patrick
* Re: [PATCH v5 2/9] builtin/refs: support multiple worktrees check for refs
2024-10-07 9:16 ` Patrick Steinhardt
@ 2024-10-07 12:06 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-07 12:06 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 07, 2024 at 11:16:19AM +0200, Patrick Steinhardt wrote:
[snip]
> > However, the refname would be printed as "refs/worktree/test" in both
> > cases, which leaves the user unsure which "refs/worktree/test" was
> > checked. So, we should print this information like:
> >
> > Checking references consistency in .git
> > ...
> > Checking references consistency in .git/worktrees/A
> > ...
> > Checking references consistency in .git/worktrees/B
> >
> > However, when writing this, I feel that printing ".git" is a bad choice.
> > It will make the user think that everything is checked here. This should
> > be improved in the next version.
>
> But wouldn't it be the better solution if we printed the fully-qualified
> reference name "worktrees/worktree/refs/worktree/test" instead? That
> would remove the need to say which directory we're currently verifying
> in the first place.
>
Good idea. I will use this way in the next version.
> Patrick
Thanks
* [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend
2024-09-29 7:13 ` [PATCH v5 0/9] " shejialuo
2024-09-29 7:15 ` [PATCH v5 1/9] ref: initialize "fsck_ref_report" with zero shejialuo
2024-09-29 7:15 ` [PATCH v5 2/9] builtin/refs: support multiple worktrees check for refs shejialuo
@ 2024-09-29 7:15 ` shejialuo
2024-10-07 6:58 ` Patrick Steinhardt
2024-10-08 7:43 ` Karthik Nayak
2024-09-29 7:16 ` [PATCH v5 4/9] ref: add more strict checks for regular refs shejialuo
` (8 subsequent siblings)
11 siblings, 2 replies; 209+ messages in thread
From: shejialuo @ 2024-09-29 7:15 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
"git-fsck(1)" has some consistency checks for regular refs. As we want
to align the checks "git refs verify" performs with them (and eventually
call the unified code that checks refs from both), port the logic
"git-fsck" has to "git refs verify".
"git-fsck(1)" will report an error when the ref content is invalid.
Following this, add a similar check to "git refs verify". Then add a new
fsck error message "badRefContent(ERROR)" to represent that a ref has
invalid content.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 3 ++
fsck.h | 1 +
refs/files-backend.c | 45 ++++++++++++++++++++++++
t/t0602-reffiles-fsck.sh | 66 +++++++++++++++++++++++++++++++++++
4 files changed, 115 insertions(+)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 68a2801f15..22c385ea22 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -19,6 +19,9 @@
`badParentSha1`::
(ERROR) A commit object has a bad parent sha1.
+`badRefContent`::
+ (ERROR) A ref has bad content.
+
`badRefFiletype`::
(ERROR) A ref has a bad file type.
diff --git a/fsck.h b/fsck.h
index 500b4c04d2..0d99a87911 100644
--- a/fsck.h
+++ b/fsck.h
@@ -31,6 +31,7 @@ enum fsck_msg_type {
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
+ FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
FUNC(BAD_REF_NAME, ERROR) \
FUNC(BAD_TIMEZONE, ERROR) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 57318b4c4e..35b3fa983e 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3504,6 +3504,50 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
const char *refs_check_dir,
struct dir_iterator *iter);
+static int files_fsck_refs_content(struct ref_store *ref_store,
+ struct fsck_options *o,
+ const char *refs_check_dir,
+ struct dir_iterator *iter)
+{
+ struct strbuf ref_content = STRBUF_INIT;
+ struct strbuf referent = STRBUF_INIT;
+ struct strbuf refname = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ unsigned int type = 0;
+ int failure_errno = 0;
+ struct object_id oid;
+ int ret = 0;
+
+ strbuf_addf(&refname, "%s/%s", refs_check_dir, iter->relative_path);
+ report.path = refname.buf;
+
+ if (S_ISLNK(iter->st.st_mode))
+ goto cleanup;
+
+ if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_CONTENT,
+ "cannot read ref file");
+ goto cleanup;
+ }
+
+ if (parse_loose_ref_contents(ref_store->repo->hash_algo,
+ ref_content.buf, &oid, &referent,
+ &type, &failure_errno)) {
+ strbuf_rtrim(&ref_content);
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_CONTENT,
+ "%s", ref_content.buf);
+ goto cleanup;
+ }
+
+cleanup:
+ strbuf_release(&refname);
+ strbuf_release(&ref_content);
+ strbuf_release(&referent);
+ return ret;
+}
+
static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
struct fsck_options *o,
const char *refs_check_dir,
@@ -3586,6 +3630,7 @@ static int files_fsck_refs(struct ref_store *ref_store,
{
files_fsck_refs_fn fsck_refs_fn[]= {
files_fsck_refs_name,
+ files_fsck_refs_content,
NULL,
};
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 4c6cd6f7d0..628f9bcc46 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -148,4 +148,70 @@ test_expect_success 'ref name check should work for multiple worktrees' '
)
'
+test_expect_success 'regular ref content should be checked (individual)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
+ bad_content=$(git rev-parse main)x &&
+ printf "%s" $bad_content >$tag_dir_prefix/tag-bad-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/tag-bad-1: badRefContent: $bad_content
+ EOF
+ rm $tag_dir_prefix/tag-bad-1 &&
+ test_cmp expect err &&
+
+ bad_content=xfsazqfxcadas &&
+ printf "%s" $bad_content >$tag_dir_prefix/tag-bad-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/tag-bad-2: badRefContent: $bad_content
+ EOF
+ rm $tag_dir_prefix/tag-bad-2 &&
+ test_cmp expect err &&
+
+ bad_content=Xfsazqfxcadas &&
+ printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err
+'
+
+test_expect_success 'regular ref content should be checked (aggregate)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ bad_content_1=$(git rev-parse main)x &&
+ bad_content_2=xfsazqfxcadas &&
+ bad_content_3=Xfsazqfxcadas &&
+ printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
+ printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
+ printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
+ error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
+ error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+'
+
test_done
--
2.46.2
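
The kind of content the new tests reject can be modelled with a much smaller check than the real parser. The sketch below is a simplified approximation and not git's parse_loose_ref_contents(). It assumes SHA-1 sized object IDs (40 hex digits), accepts anything starting with "ref:" without validating the target, and uses hypothetical names throughout.

```c
#include <ctype.h>
#include <string.h>

#define DEMO_HEXSZ 40 /* assumption: SHA-1 sized object IDs */

/*
 * Simplified sketch: accept content that starts with "ref:" (the target
 * is not validated here), or exactly DEMO_HEXSZ hex digits followed by
 * either the end of the string or a whitespace character.
 * Returns 0 when the content looks valid, -1 otherwise.
 */
static int demo_check_ref_content(const char *buf)
{
	size_t i;

	if (!strncmp(buf, "ref:", 4))
		return 0;

	/* A short string stops at its NUL, which is not a hex digit. */
	for (i = 0; i < DEMO_HEXSZ; i++)
		if (!isxdigit((unsigned char)buf[i]))
			return -1;

	if (buf[DEMO_HEXSZ] && !isspace((unsigned char)buf[DEMO_HEXSZ]))
		return -1;
	return 0;
}
```

Under these assumptions, an object ID with a trailing "x" and the string "xfsazqfxcadas" are both rejected, mirroring the three bad inputs used in the tests above.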
* Re: [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend
2024-09-29 7:15 ` [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
@ 2024-10-07 6:58 ` Patrick Steinhardt
2024-10-07 8:42 ` shejialuo
2024-10-08 7:43 ` Karthik Nayak
1 sibling, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-10-07 6:58 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Sun, Sep 29, 2024 at 03:15:46PM +0800, shejialuo wrote:
> "git-fsck(1)" has some consistency checks for regular refs. As we want
> to align the checks "git refs verify" performs with them (and eventually
> call the unified code that checks refs from both), port the logic
> "git-fsck" has to "git refs verify".
What's missing here is the actual intent of this commit, namely why we
want to align the checks. I assume that this prepares us for calling
`git refs verify` as part of git-fsck(1), but readers not familiar with
the larger picture may be left wondering.
> "git-fsck(1)" will report an error when the ref content is invalid.
> Following this, add a similar check to "git refs verify". Then add a new
> fsck error message "badRefContent(ERROR)" to represent that a ref has
> invalid content.
It would help readers to know where the code is that you're porting over
to `git refs verify` so that one can double check that the port is done
faithfully to the original.
Patrick
* Re: [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend
2024-10-07 6:58 ` Patrick Steinhardt
@ 2024-10-07 8:42 ` shejialuo
2024-10-07 9:18 ` Patrick Steinhardt
0 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-10-07 8:42 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 07, 2024 at 08:58:34AM +0200, Patrick Steinhardt wrote:
> On Sun, Sep 29, 2024 at 03:15:46PM +0800, shejialuo wrote:
> > "git-fsck(1)" has some consistency checks for regular refs. As we want
> > to align the checks "git refs verify" performs with them (and eventually
> > call the unified code that checks refs from both), port the logic
> > "git-fsck" has to "git refs verify".
>
> What's missing here is the actual intent of this commit, namely why we
> want to align the checks. I assume that this prepares us for calling
> `git refs verify` as part of git-fsck(1), but readers not familiar with
> the larger picture may be left wondering.
>
Indeed, I will improve this in the next version.
> > "git-fsck(1)" will report an error when the ref content is invalid.
> > Following this, add a similar check to "git refs verify". Then add a new
> > fsck error message "badRefContent(ERROR)" to represent that a ref has
> > invalid content.
>
> It would help readers to know where the code is that you're porting over
> to `git refs verify` so that one can double check that the port is done
> faithfully to the original.
>
I am a little confused here. There is a lot of code in "git-fsck(1)"
that checks ref consistency. How could I accurately express this
in the commit message?
> Patrick
Thanks,
Jialuo
* Re: [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend
2024-10-07 8:42 ` shejialuo
@ 2024-10-07 9:18 ` Patrick Steinhardt
2024-10-07 12:08 ` shejialuo
0 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-10-07 9:18 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 07, 2024 at 04:42:44PM +0800, shejialuo wrote:
> On Mon, Oct 07, 2024 at 08:58:34AM +0200, Patrick Steinhardt wrote:
> > On Sun, Sep 29, 2024 at 03:15:46PM +0800, shejialuo wrote:
> > > "git-fsck(1)" will report an error when the ref content is invalid.
> > > Following this, add a similar check to "git refs verify". Then add a new
> > > fsck error message "badRefContent(ERROR)" to represent that a ref has an
> > > invalid content.
> >
> > It would help readers to know where the code is that you're porting over
> > to `git refs verify` so that one can double check that the port is done
> > faithfully to the original.
> >
>
> I am a little confused here. There is a lot of code in "git-fsck(1)"
> that checks ref consistency. How could I accurately express this info
> in the commit message?
Well, you say you ported over a specific consistency check from
git-fsck(1) to `git refs verify` in the commit message. So I assume that
it should match a specific check in git-fsck(1), shouldn't it?
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend
2024-10-07 9:18 ` Patrick Steinhardt
@ 2024-10-07 12:08 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-07 12:08 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 07, 2024 at 11:18:24AM +0200, Patrick Steinhardt wrote:
> On Mon, Oct 07, 2024 at 04:42:44PM +0800, shejialuo wrote:
> > On Mon, Oct 07, 2024 at 08:58:34AM +0200, Patrick Steinhardt wrote:
> > > On Sun, Sep 29, 2024 at 03:15:46PM +0800, shejialuo wrote:
> > > > "git-fsck(1)" will report an error when the ref content is invalid.
> > > > Following this, add a similar check to "git refs verify". Then add a new
> > > > fsck error message "badRefContent(ERROR)" to represent that a ref has an
> > > > invalid content.
> > >
> > > It would help readers to know where the code is that you're porting over
> > > to `git refs verify` so that one can double check that the port is done
> > > faithfully to the original.
> > >
> >
> > I am a little confused here. There is a lot of code in "git-fsck(1)"
> > that checks ref consistency. How could I accurately express this info
> > in the commit message?
>
> Well, you say you ported over a specific consistency check from
> git-fsck(1) to `git refs verify` in the commit message. So I assume that
> it should match a specific check in git-fsck(1), shouldn't it?
>
I understand what you mean now. I will improve the commit message in the
next version.
> Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend
2024-09-29 7:15 ` [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
2024-10-07 6:58 ` Patrick Steinhardt
@ 2024-10-08 7:43 ` Karthik Nayak
2024-10-08 12:24 ` shejialuo
1 sibling, 1 reply; 209+ messages in thread
From: Karthik Nayak @ 2024-10-08 7:43 UTC (permalink / raw)
To: shejialuo, git; +Cc: Patrick Steinhardt, Junio C Hamano
shejialuo <shejialuo@gmail.com> writes:
[snip]
> + if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
> + ret = fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_REF_CONTENT,
> + "cannot read ref file");
> + goto cleanup;
> + }
> +
Shouldn't we use `die_errno` here instead? I mean, this is not really a
bad ref content issue. If we don't want to die here, it would still
probably be nice to get the actual issue using `strerror` instead and
use that instead of the generic message we have here.
[snip]
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend
2024-10-08 7:43 ` Karthik Nayak
@ 2024-10-08 12:24 ` shejialuo
2024-10-08 17:44 ` Junio C Hamano
0 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-10-08 12:24 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, Patrick Steinhardt, Junio C Hamano
On Tue, Oct 08, 2024 at 12:43:20AM -0700, Karthik Nayak wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> [snip]
>
> > + if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
> > + ret = fsck_report_ref(o, &report,
> > + FSCK_MSG_BAD_REF_CONTENT,
> > + "cannot read ref file");
> > + goto cleanup;
> > + }
> > +
>
> Shouldn't we use `die_errno` here instead? I mean, this is not really a
> bad ref content issue. If we don't want to die here, it would still
> probably be nice to get the actual issue using `strerror` instead and
> use that instead of the generic message we have here.
>
Well, I think I need to dive into the "open" system call here. Actually,
we have two opinions now. Junio thought that we should use
"fsck_report_ref" to report. Karthik, Patrick and I thought that we
should report using "*errno" because this is a general error.
Let me investigate which situations make the "open" system call fail,
so that we can fully justify whichever choice we make.
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend
2024-10-08 12:24 ` shejialuo
@ 2024-10-08 17:44 ` Junio C Hamano
2024-10-09 8:05 ` Patrick Steinhardt
2024-10-09 11:55 ` shejialuo
0 siblings, 2 replies; 209+ messages in thread
From: Junio C Hamano @ 2024-10-08 17:44 UTC (permalink / raw)
To: shejialuo; +Cc: Karthik Nayak, git, Patrick Steinhardt
shejialuo <shejialuo@gmail.com> writes:
> On Tue, Oct 08, 2024 at 12:43:20AM -0700, Karthik Nayak wrote:
>> shejialuo <shejialuo@gmail.com> writes:
>>
>> [snip]
>>
>> > + if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
>> > + ret = fsck_report_ref(o, &report,
>> > + FSCK_MSG_BAD_REF_CONTENT,
>> > + "cannot read ref file");
>> > + goto cleanup;
>> > + }
>> > +
>>
>> Shouldn't we use `die_errno` here instead? I mean, this is not really a
>> bad ref content issue. If we don't want to die here, it would still
>> probably be nice to get the actual issue using `strerror` instead and
>> use that instead of the generic message we have here.
>>
>
> Well, I think I need to dive into the "open" system call here. Actually,
> we have two opinions now. Junio thought that we should use
> "fsck_report_ref" to report. Karthik, Patrick and I thought that we
> should report using "*errno" because this is a general error.
What do you mean by "a general error"? It is true that we failed to
read a ref file, so even if it is an I/O error, I'd think it is OK
to report it as an error while reading one particular ref.
Giving more information is a separate issue. If fsck_report_ref()
can be extended to take something like

    "cannot read ref file '%s': (%s)", iter->path.buf, strerror(errno)

that would give the user necessary information.
And I agree with half-of what Karthik said, i.e., we do not want to
die here if this is meant to run as a part of "git fsck".
I may have said this before, but quite frankly, the API into the
fsck_report_ref() function is misdesigned. If the single constant
string "cannot read ref file" cannot give more information than
FSCK_MSG_BAD_REF_CONTENT, the parameter has no value.
The fsck.c:report() function, which is the main function to report
fsck's findings before fsck_report_ref() was introduced, did not
have such a problem, as it allowed "const char *fmt, ..." at the
end. Is it too late to fix the fsck_report_ref()?
Thanks.
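
The idea above (report the failure against the one ref and keep going,
rather than dying) can be sketched roughly as follows. This is a
minimal, standalone illustration and not Git's code: report_ref(),
check_ref_readable() and the last_report buffer are hypothetical
stand-ins, and only the message format is taken from this mail.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sink for the most recent report; not part of Git. */
static char last_report[256];

/*
 * Hypothetical printf-style reporter. Formats
 * "<refname>: <msg_id>: <detail>" into last_report and returns 1
 * to signal that a problem was found.
 */
static int report_ref(const char *refname, const char *msg_id,
		      const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(refname && msg_id && fmt);
	n = snprintf(last_report, sizeof(last_report), "%s: %s: ",
		     refname, msg_id);
	if (n < 0 || (size_t)n >= sizeof(last_report))
		return 1;

	va_start(ap, fmt);
	vsnprintf(last_report + n, sizeof(last_report) - n, fmt, ap);
	va_end(ap);
	return 1;
}

/*
 * Try to open a ref file. On failure, record the path and the
 * errno text instead of dying, so the caller can move on to the
 * next ref.
 */
static int check_ref_readable(const char *refname, const char *path)
{
	FILE *fp = fopen(path, "r");

	if (!fp)
		return report_ref(refname, "badRefContent",
				  "cannot read ref file '%s': (%s)",
				  path, strerror(errno));
	fclose(fp);
	return 0;
}
```

A caller that loops over refs could add up the return values and carry
on, so one unreadable file would not stop the remaining checks.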
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend
2024-10-08 17:44 ` Junio C Hamano
@ 2024-10-09 8:05 ` Patrick Steinhardt
2024-10-09 11:59 ` shejialuo
2024-10-09 11:55 ` shejialuo
1 sibling, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-10-09 8:05 UTC (permalink / raw)
To: Junio C Hamano; +Cc: shejialuo, Karthik Nayak, git
On Tue, Oct 08, 2024 at 10:44:53AM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > On Tue, Oct 08, 2024 at 12:43:20AM -0700, Karthik Nayak wrote:
> >> shejialuo <shejialuo@gmail.com> writes:
> >>
> >> [snip]
> >>
> >> > + if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
> >> > + ret = fsck_report_ref(o, &report,
> >> > + FSCK_MSG_BAD_REF_CONTENT,
> >> > + "cannot read ref file");
> >> > + goto cleanup;
> >> > + }
> >> > +
> >>
> >> Shouldn't we use `die_errno` here instead? I mean, this is not really a
> >> bad ref content issue. If we don't want to die here, it would still
> >> probably be nice to get the actual issue using `strerror` instead and
> >> use that instead of the generic message we have here.
> >>
> >
> > Well, I think I need to dive into the "open" system call here. Actually,
> > we have two opinions now. Junio thought that we should use
> > "fsck_report_ref" to report. Karthik, Patrick and I thought that we
> > should report using "*errno" because this is a general error.
>
> What do you mean by "a general error"? It is true that we failed to
> read a ref file, so even if it is an I/O error, I'd think it is OK
> to report it as an error while reading one particular ref.
>
> Giving more information is a separate issue. If fsck_report_ref()
> can be extended to take something like
>
> "cannot read ref file '%s': (%s)", iter->path.buf, strerror(errno)
>
> that would give the user necessary information.
Yeah, this is also in line with what I proposed elsewhere, where we have
been discussing the 1:1 mapping between error codes and error messages.
If the error messages were dynamic they'd be a whole lot more useful
overall and could provide more context.
> And I agree with half-of what Karthik said, i.e., we do not want to
> die here if this is meant to run as a part of "git fsck".
>
> I may have said this before, but quite frankly, the API into the
> fsck_report_ref() function is misdesigned. If the single constant
> string "cannot read ref file" cannot give more information than
> FSCK_MSG_BAD_REF_CONTENT, the parameter has no value.
True in the current form, yeah. If `fsck_report_ref()` learned to take a
vararg argument and treat its first argument as a string format it would
be justified though, as the message is now dynamic and can contain more
context around the specific failure that cannot be provided statically
via the 1:1 mapping between error code and message.
> The fsck.c:report() function, which is the main function to report
> fsck's findings before fsck_report_ref() was introduced, did not
> have such a problem, as it allowed "const char *fmt, ..." at the
> end. Is it too late to fix the fsck_report_ref()?
I don't think so, I think we should be able to refactor the code rather
easily to do so.
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend
2024-10-09 8:05 ` Patrick Steinhardt
@ 2024-10-09 11:59 ` shejialuo
2024-10-10 6:52 ` Patrick Steinhardt
0 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-10-09 11:59 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Junio C Hamano, Karthik Nayak, git
On Wed, Oct 09, 2024 at 10:05:19AM +0200, Patrick Steinhardt wrote:
[snip]
> > I may have said this before, but quite frankly, the API into the
> > fsck_report_ref() function is misdesigned. If the single constant
> > string "cannot read ref file" cannot give more information than
> > FSCK_MSG_BAD_REF_CONTENT, the parameter has no value.
>
> True in the current form, yeah. If `fsck_report_ref()` learned to take a
> vararg argument and treat its first argument as a string format it would
> be justified though, as the message is now dynamic and can contain more
> context around the specific failure that cannot be provided statically
> via the 1:1 mapping between error code and message.
>
It does not need to learn this. Currently, `fsck_report_ref` can already
do this, just like "fsck.c::report". I have explained this when replying
to Junio.
> > The fsck.c:report() function, which is the main function to report
> > fsck's findings before fsck_report_ref() was introduced, did not
> > have such a problem, as it allowed "const char *fmt, ..." at the
> > end. Is it too late to fix the fsck_report_ref()?
>
> I don't think so, I think we should be able to refactor the code rather
> easily to do so.
>
It's not hard to refactor the code. But this is not the problem. I am a
little confused here, because "fsck_report_ref" already takes
"const char *fmt, ..." at the end.
> Patrick
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend
2024-10-09 11:59 ` shejialuo
@ 2024-10-10 6:52 ` Patrick Steinhardt
2024-10-10 16:00 ` Junio C Hamano
0 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-10-10 6:52 UTC (permalink / raw)
To: shejialuo; +Cc: Junio C Hamano, Karthik Nayak, git
On Wed, Oct 09, 2024 at 07:59:20PM +0800, shejialuo wrote:
> On Wed, Oct 09, 2024 at 10:05:19AM +0200, Patrick Steinhardt wrote:
> > > The fsck.c:report() function, which is the main function to report
> > > fsck's findings before fsck_report_ref() was introduced, did not
> > > have such a problem, as it allowed "const char *fmt, ..." at the
> > > end. Is it too late to fix the fsck_report_ref()?
> >
> > I don't think so, I think we should be able to refactor the code rather
> > easily to do so.
> >
>
> It's not hard to refactor the code. But this is not the problem. I am a
> little confused here, because "fsck_report_ref" already takes
> "const char *fmt, ..." at the end.
Ah, I didn't double check, but was operating on what I understood from
this thread. In that case I think that the current interface is okay.
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend
2024-10-10 6:52 ` Patrick Steinhardt
@ 2024-10-10 16:00 ` Junio C Hamano
0 siblings, 0 replies; 209+ messages in thread
From: Junio C Hamano @ 2024-10-10 16:00 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: shejialuo, Karthik Nayak, git
Patrick Steinhardt <ps@pks.im> writes:
> On Wed, Oct 09, 2024 at 07:59:20PM +0800, shejialuo wrote:
>> On Wed, Oct 09, 2024 at 10:05:19AM +0200, Patrick Steinhardt wrote:
>> > > The fsck.c:report() function, which is the main function to report
>> > > fsck's findings before fsck_report_ref() was introduced, did not
>> > > have such a problem, as it allowed "const char *fmt, ..." at the
>> > > end. Is it too late to fix the fsck_report_ref()?
>> >
>> > I don't think so, I think we should be able to refactor the code rather
>> > easily to do so.
>> >
>>
>> It's not hard to refactor the code. But this is not the problem. I am a
>> little confused here, because "fsck_report_ref" already takes
>> "const char *fmt, ..." at the end.
>
> Ah, I didn't double check, but was operating on what I understood from
> this thread. In that case I think that the current interface is okay.
I didn't, either. So there is an obvious way out for the "why aren't we
telling the errno to users" issue? That's good.
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend
2024-10-08 17:44 ` Junio C Hamano
2024-10-09 8:05 ` Patrick Steinhardt
@ 2024-10-09 11:55 ` shejialuo
1 sibling, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-09 11:55 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Karthik Nayak, git, Patrick Steinhardt
On Tue, Oct 08, 2024 at 10:44:53AM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > On Tue, Oct 08, 2024 at 12:43:20AM -0700, Karthik Nayak wrote:
> >> shejialuo <shejialuo@gmail.com> writes:
> >>
> >> [snip]
> >>
> >> > + if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
> >> > + ret = fsck_report_ref(o, &report,
> >> > + FSCK_MSG_BAD_REF_CONTENT,
> >> > + "cannot read ref file");
> >> > + goto cleanup;
> >> > + }
> >> > +
> >>
> >> Shouldn't we use `die_errno` here instead? I mean, this is not really a
> >> bad ref content issue. If we don't want to die here, it would still
> >> probably be nice to get the actual issue using `strerror` instead and
> >> use that instead of the generic message we have here.
> >>
> >
> > Well, I think I need to dive into the "open" system call here. Actually,
> > we have two opinions now. Junio thought that we should use
> > "fsck_report_ref" to report. Karthik, Patrick and I thought that we
> > should report using "*errno" because this is a general error.
>
> What do you mean by "a general error"? It is true that we failed to
> read a ref file, so even if it is an I/O error, I'd think it is OK
> to report it as an error while reading one particular ref.
Makes sense.
> Giving more information is a separate issue. If fsck_report_ref()
> can be extended to take something like
>
> "cannot read ref file '%s': (%s)", iter->path.buf, strerror(errno)
>
> that would give the user necessary information.
Currently, `fsck_report_ref` can already do this. I think I used the
`fsck_report_ref` function badly in this case.
> And I agree with half-of what Karthik said, i.e., we do not want to
> die here if this is meant to run as a part of "git fsck".
Yes, we should not die here. Instead, we need to continue checking the
other refs.
> I may have said this before, but quite frankly, the API into the
> fsck_report_ref() function is misdesigned. If the single constant
> string "cannot read ref file" cannot give more information than
> FSCK_MSG_BAD_REF_CONTENT, the parameter has no value.
>
> The fsck.c:report() function, which is the main function to report
> fsck's findings before fsck_report_ref() was introduced, did not
> have such a problem, as it allowed "const char *fmt, ..." at the
> end. Is it too late to fix the fsck_report_ref()?
I agree that if the FSCK message ID already explains the error well,
there is no need for us to provide an extra message. But I want to say
that `fsck_report_ref` is not misdesigned here. It is just the same as
the "fsck.c::report" function, which has "const char *fmt, ..." at the
end, as the following shows:

    int fsck_report_ref(struct fsck_options *options,
                        struct fsck_ref_report *report,
                        enum fsck_msg_id msg_id,
                        const char *fmt, ...)

And I do think the "fsck.c::report" function has the same problem.
Let me give you some examples from "fsck.c":

    report(options, tree_oid, OBJ_TREE,
           FSCK_MSG_BAD_FILEMODE,
           "contains bad file modes");
    report(options, tree_oid, OBJ_TREE,
           FSCK_MSG_DUPLICATE_ENTRIES,
           "contains duplicate file entries");
    ...

So, I want to say there is no difference between "fsck_report_ref" and
"fsck.c::report". When I refactored the code during GSoC, the main
concern was to reuse the original "fsck.c::report" code instead of
duplicating it.

As a result, I extracted a new function, "fsck_vreport" (derived from
the original "fsck.c::report" function), which is called by both
"fsck_report_ref" and "fsck.c::report":

    static int fsck_vreport(struct fsck_options *options,
                            void *fsck_report,
                            enum fsck_msg_id msg_id,
                            const char *fmt, va_list ap)

From my perspective, if we decide to refactor, we should allow callers
to write the following:

    fsck_report_ref(..., FSCK_MSG_BAD_REF_CONTENT, NULL);
    report(..., FSCK_MSG_DUPLICATE_ENTRIES, NULL);

So, we should check whether `fmt` is NULL in the `fsck_vreport`
function, so that when the FSCK message ID alone is good enough to
explain what happened, callers need not pass any message.
> Thanks.
Thanks,
Jialuo
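
The shape described above (one va_list core shared by two variadic
wrappers, with a NULL `fmt` meaning "the message ID says it all") could
look roughly like the sketch below. Everything in it is a hypothetical
stand-in: the enum, the report_buf buffer, and the vreport(),
report_ref() and report_object() names are simplified for illustration
and are not Git's actual fsck code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical message IDs; Git's real list lives in fsck.h. */
enum msg_id { MSG_BAD_REF_CONTENT, MSG_DUPLICATE_ENTRIES };

/* Hypothetical sink for the most recent report. */
static char report_buf[256];

static const char *msg_id_name(enum msg_id id)
{
	switch (id) {
	case MSG_BAD_REF_CONTENT:
		return "badRefContent";
	case MSG_DUPLICATE_ENTRIES:
		return "duplicateEntries";
	}
	return "unknown";
}

/*
 * Shared core. Writes "<subject>: <id>" and appends ": <detail>"
 * only when fmt is non-NULL.
 */
static int vreport(const char *subject, enum msg_id id,
		   const char *fmt, va_list ap)
{
	size_t len = sizeof(report_buf);
	int n;

	assert(subject);
	n = snprintf(report_buf, len, "%s: %s", subject, msg_id_name(id));
	if (n < 0 || (size_t)n >= len || !fmt)
		return 1;
	n += snprintf(report_buf + n, len - n, ": ");
	if ((size_t)n < len)
		vsnprintf(report_buf + n, len - n, fmt, ap);
	return 1;
}

/* Wrapper for ref findings; forwards its varargs to the core. */
static int report_ref(const char *refname, enum msg_id id,
		      const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = vreport(refname, id, fmt, ap);
	va_end(ap);
	return ret;
}

/* Wrapper for object findings; same core, different subject. */
static int report_object(const char *oid_hex, enum msg_id id,
			 const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = vreport(oid_hex, id, fmt, ap);
	va_end(ap);
	return ret;
}
```

With this shape, a call that passes NULL produces only the subject and
the ID, while a call that passes a format string can add per-failure
details such as a path or an errno text.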
^ permalink raw reply [flat|nested] 209+ messages in thread
* [PATCH v5 4/9] ref: add more strict checks for regular refs
2024-09-29 7:13 ` [PATCH v5 0/9] " shejialuo
` (2 preceding siblings ...)
2024-09-29 7:15 ` [PATCH v5 3/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
@ 2024-09-29 7:16 ` shejialuo
2024-10-07 6:58 ` Patrick Steinhardt
2024-09-29 7:16 ` [PATCH v5 5/9] ref: add basic symref content check for files backend shejialuo
` (7 subsequent siblings)
11 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-09-29 7:16 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We already use the "parse_loose_ref_contents" function to check
whether the ref content is valid in the files backend. However,
"parse_loose_ref_contents" allows the ref's content to end with
garbage or without a newline.
Even though we never create such loose refs ourselves, we have accepted
such loose refs. So, it is entirely possible that some third-party tools
may rely on such loose refs being valid. We should not report an fsck
error for now. Instead, we should notify the users about such
"curiously formatted" loose refs so that adequate care is taken before
we decide to tighten the rules in the future.
Reporting an fsck warning is not suitable either. We don't yet want the
"--strict" flag that controls this bit to end up generating errors for
such weirdly-formatted reference contents, as we first want to assess
whether this retroactive tightening will cause issues for any tools out
there. It may cause compatibility issues which may break the
repository. So we add the "unofficialFormattedRef(INFO)" fsck message
to represent the situation where the ref content is not in the format
that Git itself writes, and to notify the users that it may become an
error in the future.
It might appear that we can't provide the user with any warnings by
using FSCK_INFO. However, in "fsck.c::fsck_vreport", we will convert
FSCK_INFO to FSCK_WARN and we can still warn the user about these
situations when using "git refs verify" without introducing
compatibility issues.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 8 +++++
fsck.h | 1 +
refs.c | 2 +-
refs/files-backend.c | 26 +++++++++++++--
refs/refs-internal.h | 2 +-
t/t0602-reffiles-fsck.sh | 59 +++++++++++++++++++++++++++++++++++
6 files changed, 93 insertions(+), 5 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 22c385ea22..e310b5bce9 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -179,6 +179,14 @@
`unknownType`::
(ERROR) Found an unknown object type.
+`unofficialFormattedRef`::
+ (INFO) The content of a loose ref file is not in the official
+ format such as not having a LF at the end or having trailing
+ garbage. As valid implementations of Git never created such a
+ loose ref file, it may become an error in the future. Report
+ to the git@vger.kernel.org mailing list if you see this error,
+ as we need to know what tools created such a file.
+
`unterminatedHeader`::
(FATAL) Missing end-of-line in the object header.
diff --git a/fsck.h b/fsck.h
index 0d99a87911..7420add5c0 100644
--- a/fsck.h
+++ b/fsck.h
@@ -85,6 +85,7 @@ enum fsck_msg_type {
FUNC(MAILMAP_SYMLINK, INFO) \
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
+ FUNC(UNOFFICIAL_FORMATTED_REF, INFO) \
/* ignored (elevated when requested) */ \
FUNC(EXTRA_HEADER_ENTRY, IGNORE)
diff --git a/refs.c b/refs.c
index 5f729ed412..6ba1bb1aa1 100644
--- a/refs.c
+++ b/refs.c
@@ -1788,7 +1788,7 @@ static int refs_read_special_head(struct ref_store *ref_store,
}
result = parse_loose_ref_contents(ref_store->repo->hash_algo, content.buf,
- oid, referent, type, failure_errno);
+ oid, referent, type, NULL, failure_errno);
done:
strbuf_release(&full_path);
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 35b3fa983e..b2a790c884 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -568,7 +568,7 @@ static int read_ref_internal(struct ref_store *ref_store, const char *refname,
buf = sb_contents.buf;
ret = parse_loose_ref_contents(ref_store->repo->hash_algo, buf,
- oid, referent, type, &myerr);
+ oid, referent, type, NULL, &myerr);
out:
if (ret && !myerr)
@@ -605,7 +605,7 @@ static int files_read_symbolic_ref(struct ref_store *ref_store, const char *refn
int parse_loose_ref_contents(const struct git_hash_algo *algop,
const char *buf, struct object_id *oid,
struct strbuf *referent, unsigned int *type,
- int *failure_errno)
+ const char **trailing, int *failure_errno)
{
const char *p;
if (skip_prefix(buf, "ref:", &buf)) {
@@ -627,6 +627,10 @@ int parse_loose_ref_contents(const struct git_hash_algo *algop,
*failure_errno = EINVAL;
return -1;
}
+
+ if (trailing)
+ *trailing = p;
+
return 0;
}
@@ -3513,6 +3517,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
struct strbuf referent = STRBUF_INIT;
struct strbuf refname = STRBUF_INIT;
struct fsck_ref_report report = { 0 };
+ const char *trailing = NULL;
unsigned int type = 0;
int failure_errno = 0;
struct object_id oid;
@@ -3533,7 +3538,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
if (parse_loose_ref_contents(ref_store->repo->hash_algo,
ref_content.buf, &oid, &referent,
- &type, &failure_errno)) {
+ &type, &trailing, &failure_errno)) {
strbuf_rtrim(&ref_content);
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_CONTENT,
@@ -3541,6 +3546,21 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
goto cleanup;
}
+ if (!(type & REF_ISSYMREF)) {
+ if (!*trailing) {
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_UNOFFICIAL_FORMATTED_REF,
+ "misses LF at the end");
+ goto cleanup;
+ }
+ if (*trailing != '\n' || *(trailing + 1)) {
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_UNOFFICIAL_FORMATTED_REF,
+ "has trailing garbage: '%s'", trailing);
+ goto cleanup;
+ }
+ }
+
cleanup:
strbuf_release(&refname);
strbuf_release(&ref_content);
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 2313c830d8..73b05f971b 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -715,7 +715,7 @@ struct ref_store {
int parse_loose_ref_contents(const struct git_hash_algo *algop,
const char *buf, struct object_id *oid,
struct strbuf *referent, unsigned int *type,
- int *failure_errno);
+ const char **trailing, int *failure_errno);
/*
* Fill in the generic part of refs and add it to our collection of
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 628f9bcc46..2f5c4a1926 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -185,6 +185,61 @@ test_expect_success 'regular ref content should be checked (individual)' '
error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
EOF
rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err &&
+
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: unofficialFormattedRef: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: unofficialFormattedRef: has trailing garbage: '\'' garbage'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err &&
+
+ printf "%s\n\n\n" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-garbage-1: unofficialFormattedRef: has trailing garbage: '\''
+
+
+ '\''
+ EOF
+ rm $tag_dir_prefix/tag-garbage-1 &&
+ test_cmp expect err &&
+
+ printf "%s\n\n\n garbage" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-2 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-garbage-2: unofficialFormattedRef: has trailing garbage: '\''
+
+
+ garbage'\''
+ EOF
+ rm $tag_dir_prefix/tag-garbage-2 &&
+ test_cmp expect err &&
+
+ printf "%s garbage\na" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-3 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-garbage-3: unofficialFormattedRef: has trailing garbage: '\'' garbage
+ a'\''
+ EOF
+ rm $tag_dir_prefix/tag-garbage-3 &&
+ test_cmp expect err &&
+
+ printf "%s garbage" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-4 &&
+ test_must_fail git -c fsck.unofficialFormattedRef=error refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/tag-garbage-4: unofficialFormattedRef: has trailing garbage: '\'' garbage'\''
+ EOF
+ rm $tag_dir_prefix/tag-garbage-4 &&
test_cmp expect err
'
@@ -203,12 +258,16 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ warning: refs/heads/branch-garbage: unofficialFormattedRef: has trailing garbage: '\'' garbage'\''
+ warning: refs/heads/branch-no-newline: unofficialFormattedRef: misses LF at the end
EOF
sort err >sorted_err &&
test_cmp expect sorted_err
--
2.46.2
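
The trailing-content rule added to "files_fsck_refs_content" above can
be looked at in isolation. The sketch below is a standalone
illustration, assuming `trailing` points just past the hex object ID in
a NUL-terminated buffer, as the patch arranges. The enum and the
classify_trailing() name are hypothetical and not part of the patch.

```c
#include <assert.h>

/* Hypothetical classification of what follows the object ID. */
enum trailing_kind {
	TRAILING_OK,         /* exactly one LF, then end of content */
	TRAILING_MISSING_LF, /* content ends right after the object ID */
	TRAILING_GARBAGE     /* anything else */
};

/*
 * Mirrors the two conditions in the patch: an empty remainder means
 * the LF is missing, and a remainder other than a single LF counts
 * as trailing garbage.
 */
static enum trailing_kind classify_trailing(const char *trailing)
{
	assert(trailing);
	if (!*trailing)
		return TRAILING_MISSING_LF;
	if (*trailing != '\n' || *(trailing + 1))
		return TRAILING_GARBAGE;
	return TRAILING_OK;
}
```

The inputs exercised by the new tests in t0602 fall out as expected
under this rule: an empty remainder is a missing LF, while " garbage",
"\n\n\n" and " garbage\na" are all trailing garbage.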
^ permalink raw reply related [flat|nested] 209+ messages in thread
* Re: [PATCH v5 4/9] ref: add more strict checks for regular refs
2024-09-29 7:16 ` [PATCH v5 4/9] ref: add more strict checks for regular refs shejialuo
@ 2024-10-07 6:58 ` Patrick Steinhardt
2024-10-07 8:44 ` shejialuo
0 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-10-07 6:58 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Sun, Sep 29, 2024 at 03:16:00PM +0800, shejialuo wrote:
> diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
> index 22c385ea22..e310b5bce9 100644
> --- a/Documentation/fsck-msgids.txt
> +++ b/Documentation/fsck-msgids.txt
> @@ -179,6 +179,14 @@
> `unknownType`::
> (ERROR) Found an unknown object type.
>
> +`unofficialFormattedRef`::
> + (INFO) The content of a loose ref file is not in the official
> + format such as not having a LF at the end or having trailing
> + garbage. As valid implementations of Git never created such a
> + loose ref file, it may become an error in the future. Report
> + to the git@vger.kernel.org mailing list if you see this error,
> + as we need to know what tools created such a file.
> +
I find "unofficial" to be a tad weird. Do we rather want to say
something like "badRefTrailingGarbage"?
> @@ -3541,6 +3546,21 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
> goto cleanup;
> }
>
> + if (!(type & REF_ISSYMREF)) {
> + if (!*trailing) {
> + ret = fsck_report_ref(o, &report,
> + FSCK_MSG_UNOFFICIAL_FORMATTED_REF,
> + "misses LF at the end");
> + goto cleanup;
> + }
> + if (*trailing != '\n' || *(trailing + 1)) {
> + ret = fsck_report_ref(o, &report,
> + FSCK_MSG_UNOFFICIAL_FORMATTED_REF,
> + "has trailing garbage: '%s'", trailing);
> + goto cleanup;
> + }
> + }
> +
I think we should discern these two error cases and provide different
message IDs.
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 4/9] ref: add more strict checks for regular refs
2024-10-07 6:58 ` Patrick Steinhardt
@ 2024-10-07 8:44 ` shejialuo
2024-10-07 9:25 ` Patrick Steinhardt
0 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-10-07 8:44 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 07, 2024 at 08:58:37AM +0200, Patrick Steinhardt wrote:
> On Sun, Sep 29, 2024 at 03:16:00PM +0800, shejialuo wrote:
> > diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
> > index 22c385ea22..e310b5bce9 100644
> > --- a/Documentation/fsck-msgids.txt
> > +++ b/Documentation/fsck-msgids.txt
> > @@ -179,6 +179,14 @@
> > `unknownType`::
> > (ERROR) Found an unknown object type.
> >
> > +`unofficialFormattedRef`::
> > + (INFO) The content of a loose ref file is not in the official
> > + format such as not having a LF at the end or having trailing
> > + garbage. As valid implementations of Git never created such a
> > + loose ref file, it may become an error in the future. Report
> > + to the git@vger.kernel.org mailing list if you see this error,
> > + as we need to know what tools created such a file.
> > +
>
> I find "unofficial" to be a tad weird. Do we rather want to say
> something like "badRefTrailingGarbage"?
>
I will answer this together with the question below.
> > @@ -3541,6 +3546,21 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
> > goto cleanup;
> > }
> >
> > + if (!(type & REF_ISSYMREF)) {
> > + if (!*trailing) {
> > + ret = fsck_report_ref(o, &report,
> > + FSCK_MSG_UNOFFICIAL_FORMATTED_REF,
> > + "misses LF at the end");
> > + goto cleanup;
> > + }
> > + if (*trailing != '\n' || *(trailing + 1)) {
> > + ret = fsck_report_ref(o, &report,
> > + FSCK_MSG_UNOFFICIAL_FORMATTED_REF,
> > + "has trailing garbage: '%s'", trailing);
> > + goto cleanup;
> > + }
> > + }
> > +
>
> I think we should discern these two error cases and provide different
> message IDs.
>
Actually, in the previous versions, I mapped one message ID to one
error case. But in v4, Junio asked the following question:
Not limited to this patch, but isn't fsck_report_ref() misdesigned,
or is it just they are used poorly in these patches? In these two
callsites, the message string parameter does not give any more
information than what the FSCK_MSG_* enum gives.
That is what I meant by "misdesigned"---if one message enum always
corresponds to one human-readable message, there is not much point
in forcing callers to supply both, is there?
In my opinion, we should have only one message ID here, covering both
trailing garbage and a missing trailing newline. When writing the code,
I chose the name "unofficialFormattedRef" for the following reasons:
1. If we use two message IDs here, then for each message ID we need to
tell the user "please report this to the Git mailing list".
2. If we later decide to make this an error, we could just classify
both cases under the "badRefContent" message category.
3. The semantics are correct here: these are truly curiously formatted
refs, and the message eventually tells the user what is curious.
So, I think we should not always map one message ID to one error case.
> Patrick
Thanks,
Jialuo
* Re: [PATCH v5 4/9] ref: add more strict checks for regular refs
2024-10-07 8:44 ` shejialuo
@ 2024-10-07 9:25 ` Patrick Steinhardt
2024-10-07 12:19 ` shejialuo
0 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-10-07 9:25 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 07, 2024 at 04:44:16PM +0800, shejialuo wrote:
> On Mon, Oct 07, 2024 at 08:58:37AM +0200, Patrick Steinhardt wrote:
> > On Sun, Sep 29, 2024 at 03:16:00PM +0800, shejialuo wrote:
> > > diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
> > > index 22c385ea22..e310b5bce9 100644
> > > --- a/Documentation/fsck-msgids.txt
> > > +++ b/Documentation/fsck-msgids.txt
> > > @@ -3541,6 +3546,21 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
> > > goto cleanup;
> > > }
> > >
> > > + if (!(type & REF_ISSYMREF)) {
> > > + if (!*trailing) {
> > > + ret = fsck_report_ref(o, &report,
> > > + FSCK_MSG_UNOFFICIAL_FORMATTED_REF,
> > > + "misses LF at the end");
> > > + goto cleanup;
> > > + }
> > > + if (*trailing != '\n' || *(trailing + 1)) {
> > > + ret = fsck_report_ref(o, &report,
> > > + FSCK_MSG_UNOFFICIAL_FORMATTED_REF,
> > > + "has trailing garbage: '%s'", trailing);
> > > + goto cleanup;
> > > + }
> > > + }
> > > +
> >
> > I think we should discern these two error cases and provide different
> > message IDs.
> >
>
> Actually, in the previous versions, I have mapped one message id to one
> error case. But, in the v4, Junio asked a question
>
> Not limited to this patch, but isn't fsck_report_ref() misdesigned,
> or is it just they are used poorly in these patches? In these two
> callsites, the message string parameter does not give any more
> information than what the FSCK_MSG_* enum gives.
>
> That is what I meant by "misdesigned"---if one message enum always
> corresponds to one human-readable message, there is not much point
> in forcing callers to supply both, is there?
>
> In my opinion, we should have only one case here for trailing garbage
> and not end with a newline. When writing the code, I chose the name
> "unofficialFormattedRef" for the following reason:
>
> 1. If we use two message ids here, for every message id, we need write
> to info the user "please report this to git mailing list".
>
> 2. If we decide to make this as an error. We could just classify them
> into "badRefContent" message category.
>
> 3. The semantic is correct here, they are truly curious formatted
> refs, and eventually we will give the info to the user what is
> curious.
>
> So, I think we should not always map one message to one error case.
From my point of view the error codes should be the single source of
truth, as this is what a user can use to disable specific checks. So if
one code maps to multiple messages, the user has the problem that they
can only disable all of those messages at once.
I don't disagree with what Junio is saying. It is somewhat redundant
that the caller has to pass both a code and a message in the current
form; it should be sufficient for them to pass the code, and the
message can then e.g. be extracted from a central array that maps codes
to messages.
But you can also make the reverse argument: messages can be dynamic, so
that the caller may include additional details around why specifically
the check failed. The code and message would still be 1:1, but we may
include additional details like that to guide the user.
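To make the "central array" idea concrete, here is a minimal standalone
sketch. It is not Git's fsck code: the enum, the table, the ID names and
"demo_format_report" are all hypothetical and only illustrate one code
per error case, a default message looked up from the code, and an
optional dynamic detail supplied by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical message IDs: one ID per distinct error case. */
enum demo_msg_id {
	DEMO_MSG_REF_MISSING_LF,
	DEMO_MSG_REF_TRAILING_GARBAGE,
	DEMO_MSG_MAX
};

/* Central table: the only place that maps an ID to its name and text. */
static const struct {
	const char *name;
	const char *text;
} demo_msg_table[DEMO_MSG_MAX] = {
	[DEMO_MSG_REF_MISSING_LF] = {
		"refMissingLf", "misses LF at the end"
	},
	[DEMO_MSG_REF_TRAILING_GARBAGE] = {
		"refTrailingGarbage", "has trailing garbage"
	},
};

/*
 * Format a report for "id" into "out". "detail" may be NULL, in which
 * case only the default text from the table is used. "ignored_mask" has
 * bit (1 << id) set for every ID the user disabled. Returns 0 and writes
 * an empty string when the ID is ignored, otherwise the snprintf() result.
 */
static int demo_format_report(char *out, size_t size, enum demo_msg_id id,
			      unsigned ignored_mask, const char *detail)
{
	assert(out && size > 0 && id < DEMO_MSG_MAX);

	if (ignored_mask & (1u << id)) {
		out[0] = '\0';
		return 0;
	}
	if (detail)
		return snprintf(out, size, "%s: %s: '%s'",
				demo_msg_table[id].name,
				demo_msg_table[id].text, detail);
	return snprintf(out, size, "%s: %s",
			demo_msg_table[id].name, demo_msg_table[id].text);
}
```

Because each case has its own ID in this sketch, setting a single bit in
"ignored_mask" silences exactly one kind of report and leaves the other
untouched.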
Patrick
* Re: [PATCH v5 4/9] ref: add more strict checks for regular refs
2024-10-07 9:25 ` Patrick Steinhardt
@ 2024-10-07 12:19 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-07 12:19 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 07, 2024 at 11:25:17AM +0200, Patrick Steinhardt wrote:
[snip]
> >
> > Actually, in the previous versions, I have mapped one message id to one
> > error case. But, in the v4, Junio asked a question
> >
> > Not limited to this patch, but isn't fsck_report_ref() misdesigned,
> > or is it just they are used poorly in these patches? In these two
> > callsites, the message string parameter does not give any more
> > information than what the FSCK_MSG_* enum gives.
> >
> > That is what I meant by "misdesigned"---if one message enum always
> > corresponds to one human-readable message, there is not much point
> > in forcing callers to supply both, is there?
> >
> > In my opinion, we should have only one case here for trailing garbage
> > and not end with a newline. When writing the code, I chose the name
> > "unofficialFormattedRef" for the following reason:
> >
> > 1. If we use two message ids here, for every message id, we need write
> > to info the user "please report this to git mailing list".
> >
> > 2. If we decide to make this as an error. We could just classify them
> > into "badRefContent" message category.
> >
> > 3. The semantic is correct here, they are truly curious formatted
> > refs, and eventually we will give the info to the user what is
> > curious.
> >
> > So, I think we should not always map one message to one error case.
>
> From my point of view the error codes should be the single source of
> truth, as this is what a user can use to disable specific checks. So if
> one code maps to multiple messages they have the problem that they can
> only disable all of those messages.
>
Thanks for the reminder; I had totally forgotten this. I have changed my
mind: we should use a one-to-one mapping here. As you said, if we do
not, we give the user a bad experience.
> I don't disagree with what Junio is saying. It is somewhat duplicate
> that the user has to pass both a code and a message in the current
> form-- it should be sufficient for them to pass the code, and the
> message can then e.g. be extracted from a central array that maps codes
> to messages.
>
> But you can also make the reverse argument: messages can be dynamic, so
> that the caller may include additional details around why specfically
> the check failed. The code and message would still be 1:1, but we may
> include additional details like that to guide the user.
>
Yes, I will refactor "fsck_report" to allow the caller to pass a NULL
message if the fsck message ID is clear enough to indicate the error
case.
So, more things to do here.
> Patrick
Thanks,
Jialuo
* [PATCH v5 5/9] ref: add basic symref content check for files backend
2024-09-29 7:13 ` [PATCH v5 0/9] " shejialuo
` (3 preceding siblings ...)
2024-09-29 7:16 ` [PATCH v5 4/9] ref: add more strict checks for regular refs shejialuo
@ 2024-09-29 7:16 ` shejialuo
2024-10-08 7:58 ` Karthik Nayak
2024-09-29 7:16 ` [PATCH v5 6/9] ref: add escape check for the referent of symref shejialuo
` (6 subsequent siblings)
11 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-09-29 7:16 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We have code that checks regular ref contents, but we do not yet check
the contents of symbolic refs. By using "parse_loose_ref_content" for
symbolic refs, we will get the information of the "referent".
We do not need to check the "referent" by opening the file. This is
because if "referent" exists in the file system, we will eventually
check its correctness by inspecting every file in the "refs" directory.
If the "referent" does not exist in the filesystem, this is OK as it is
seen as the dangling symref.
So we just need to check the "referent" string content. A regular could
be accepted as a textual symref if it begins with "ref:", followed by
zero or more whitespaces, followed by the full refname, followed only by
whitespace characters. However, we always write a single SP after "ref:"
and a single LF after the refname. It may seem that we should report a
fsck error message when the "referent" does not follow the above rules,
but we should not be so aggressive, because third-party
reimplementations of Git may have taken advantage of the looser syntax.
To be more specific, we accept the following "referent":
1. "ref: refs/heads/master "
2. "ref: refs/heads/master \n \n"
3. "ref: refs/heads/master\n\n"
When introducing the regular ref content checks, we created a new fsck
message "unofficialFormattedRef" which represents exactly the above
situation. So we will reuse this fsck message to inform the user about
these situations.
But we do not allow any other trailing garbage. The following are bad
symref contents, which will be reported as fsck errors by "git-fsck(1)".
1. "ref: refs/heads/master garbage\n"
2. "ref: refs/heads/master \n\n\n garbage "
And we introduce a new "badReferent(ERROR)" fsck message to report the
above errors by using "refs.c::check_refname_format". But we cannot just
pass the "referent" to this function, because the "referent" might
contain whitespace, which would cause "check_refname_format" to fail.
In order to add checks, we will do the following things:
1. Record the untrimmed length "orig_len" and untrimmed last byte
"orig_last_byte".
2. Use "strbuf_rtrim" to trim trailing whitespace and newlines so that
"check_refname_format" does not fail because of them.
3. Use "orig_len" and "orig_last_byte" to check whether the "referent"
misses '\n' at the end or it has trailing whitespaces or newlines.
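The three steps above can be sketched as a standalone function. This is
only an illustration under stated assumptions, not the patch itself: it
works on a plain C string instead of Git's "strbuf", and the name
"demo_classify_referent_tail" and the two flag macros are made up for
this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags describing what is unusual about the tail. */
#define DEMO_MISSES_LF    (1u << 0)
#define DEMO_HAS_TRAILING (1u << 1)

/*
 * Inspect the bytes that follow the refname in "raw" (the symref content
 * after the "ref:" prefix) and return a combination of the flags above.
 */
static unsigned demo_classify_referent_tail(const char *raw)
{
	size_t orig_len, len;
	unsigned flags = 0;

	assert(raw != NULL);

	/* Step 1: remember the untrimmed length (and thus the last byte). */
	orig_len = strlen(raw);
	if (!orig_len)
		return DEMO_MISSES_LF; /* nothing to inspect, no LF either */

	/* Step 2: compute the length after trimming trailing whitespace. */
	len = orig_len;
	while (len && isspace((unsigned char)raw[len - 1]))
		len--;

	/* Step 3: compare against the untrimmed state. */
	if (raw[orig_len - 1] != '\n')
		flags |= DEMO_MISSES_LF;
	if (len != orig_len && len != orig_len - 1)
		flags |= DEMO_HAS_TRAILING; /* more than one byte trimmed */

	return flags;
}
```

A caller would pass the trimmed prefix of "raw" (its first "len" bytes)
to the refname check, and turn each returned flag into its own report.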
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 3 ++
fsck.h | 1 +
refs/files-backend.c | 40 +++++++++++++++
t/t0602-reffiles-fsck.sh | 97 +++++++++++++++++++++++++++++++++++
4 files changed, 141 insertions(+)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index e310b5bce9..e0e4519334 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -28,6 +28,9 @@
`badRefName`::
(ERROR) A ref has an invalid format.
+`badReferent`::
+ (ERROR) The referent of a ref is invalid.
+
`badTagName`::
(INFO) A tag has an invalid format.
diff --git a/fsck.h b/fsck.h
index 7420add5c0..979d75cb53 100644
--- a/fsck.h
+++ b/fsck.h
@@ -34,6 +34,7 @@ enum fsck_msg_type {
FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
FUNC(BAD_REF_NAME, ERROR) \
+ FUNC(BAD_REFERENT, ERROR) \
FUNC(BAD_TIMEZONE, ERROR) \
FUNC(BAD_TREE, ERROR) \
FUNC(BAD_TREE_SHA1, ERROR) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b2a790c884..57ac466b64 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3508,6 +3508,43 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
const char *refs_check_dir,
struct dir_iterator *iter);
+static int files_fsck_symref_target(struct fsck_options *o,
+ struct fsck_ref_report *report,
+ struct strbuf *referent)
+{
+ char orig_last_byte;
+ size_t orig_len;
+ int ret = 0;
+
+ orig_len = referent->len;
+ orig_last_byte = referent->buf[orig_len - 1];
+ strbuf_rtrim(referent);
+
+ if (check_refname_format(referent->buf, 0)) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_BAD_REFERENT,
+ "points to invalid refname '%s'", referent->buf);
+ goto out;
+ }
+
+
+ if (referent->len == orig_len ||
+ (referent->len < orig_len && orig_last_byte != '\n')) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_UNOFFICIAL_FORMATTED_REF,
+ "misses LF at the end");
+ }
+
+ if (referent->len != orig_len && referent->len != orig_len - 1) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_UNOFFICIAL_FORMATTED_REF,
+ "has trailing whitespaces or newlines");
+ }
+
+out:
+ return ret;
+}
+
static int files_fsck_refs_content(struct ref_store *ref_store,
struct fsck_options *o,
const char *refs_check_dir,
@@ -3559,6 +3596,9 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
"has trailing garbage: '%s'", trailing);
goto cleanup;
}
+ } else {
+ ret = files_fsck_symref_target(o, &report, &referent);
+ goto cleanup;
}
cleanup:
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 2f5c4a1926..718f6abb71 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -273,4 +273,101 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
test_cmp expect sorted_err
'
+test_expect_success 'textual symref content should be checked (individual)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err &&
+
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline-1: unofficialFormattedRef: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline-1 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-1: unofficialFormattedRef: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: unofficialFormattedRef: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-1 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-2: unofficialFormattedRef: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-2 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-3: unofficialFormattedRef: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-3 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-complicated: unofficialFormattedRef: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: unofficialFormattedRef: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-complicated &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad-1: badReferent: points to invalid refname '\''refs/heads/.branch'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err
+'
+
+test_expect_success 'textual symref content should be checked (aggregate)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad-1: badReferent: points to invalid refname '\''refs/heads/.branch'\''
+ warning: refs/heads/a/b/branch-complicated: unofficialFormattedRef: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-complicated: unofficialFormattedRef: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: unofficialFormattedRef: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-1: unofficialFormattedRef: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-2: unofficialFormattedRef: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-3: unofficialFormattedRef: has trailing whitespaces or newlines
+ warning: refs/heads/branch-no-newline-1: unofficialFormattedRef: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+'
+
test_done
--
2.46.2
* Re: [PATCH v5 5/9] ref: add basic symref content check for files backend
2024-09-29 7:16 ` [PATCH v5 5/9] ref: add basic symref content check for files backend shejialuo
@ 2024-10-08 7:58 ` Karthik Nayak
2024-10-08 12:18 ` shejialuo
0 siblings, 1 reply; 209+ messages in thread
From: Karthik Nayak @ 2024-10-08 7:58 UTC (permalink / raw)
To: shejialuo, git; +Cc: Patrick Steinhardt, Junio C Hamano
shejialuo <shejialuo@gmail.com> writes:
> We have code that checks regular ref contents, but we do not yet check
> the contents of symbolic refs. By using "parse_loose_ref_content" for
> symbolic refs, we will get the information of the "referent".
>
> We do not need to check the "referent" by opening the file. This is
> because if "referent" exists in the file system, we will eventually
> check its correctness by inspecting every file in the "refs" directory.
> If the "referent" does not exist in the filesystem, this is OK as it is
> seen as the dangling symref.
>
> So we just need to check the "referent" string content. A regular could
seems like we're missing the noun here, a regular what?
> be accepted as a textual symref if it begins with "ref:", followed by
> zero or more whitespaces, followed by the full refname, followed only by
> whitespace characters. However, we always write a single SP after "ref:"
> and a single LF after the refname. It may seem that we should report a
> fsck error message when the "referent" does not apply above rules and we
> should not be so aggressive because third-party reimplementations of Git
> may have taken advantage of the looser syntax. Put it more specific, we
> accept the following "referent":
>
> 1. "ref: refs/heads/master "
> 2. "ref: refs/heads/master \n \n"
> 3. "ref: refs/heads/master\n\n"
>
> When introducing the regular ref content checks, we created a new fsck
> message "unofficialFormattedRef" which exactly represents above
> situation. So we will reuse this fsck message to write checks to info
> the user about these situations.
>
In addition to what Patrick said on the previous commit, it would be
nice to separate these issues with different message IDs.
> But we do not allow any other trailing garbage. The followings are bad
> symref contents which will be reported as fsck error by "git-fsck(1)".
>
> 1. "ref: refs/heads/master garbage\n"
> 2. "ref: refs/heads/master \n\n\n garbage "
>
> And we introduce a new "badReferent(ERROR)" fsck message to report above
> errors by using "ref.c::check_refname_format". But we cannot just pass
> the "referent" to this function because the "referent" might contain
> some whitespaces which will cause "check_refname_format" failing.
>
It would be nice if you could elaborate here, or rather restructure to
say something like..
Since 'check_refname_format' doesn't work with whitespaces, we use
the trimmed version of 'referent' with the function.
[snip]
* Re: [PATCH v5 5/9] ref: add basic symref content check for files backend
2024-10-08 7:58 ` Karthik Nayak
@ 2024-10-08 12:18 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-08 12:18 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, Patrick Steinhardt, Junio C Hamano
On Tue, Oct 08, 2024 at 12:58:16AM -0700, Karthik Nayak wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > We have code that checks regular ref contents, but we do not yet check
> > the contents of symbolic refs. By using "parse_loose_ref_content" for
> > symbolic refs, we will get the information of the "referent".
> >
> > We do not need to check the "referent" by opening the file. This is
> > because if "referent" exists in the file system, we will eventually
> > check its correctness by inspecting every file in the "refs" directory.
> > If the "referent" does not exist in the filesystem, this is OK as it is
> > seen as the dangling symref.
> >
> > So we just need to check the "referent" string content. A regular could
>
> seems like we're missing the noun here, a regular what?
>
It should be "a regular ref". I copied the original commit message and
may have carelessly typed "daw" in vim, deleting the "ref". Thanks.
* [PATCH v5 6/9] ref: add escape check for the referent of symref
2024-09-29 7:13 ` [PATCH v5 0/9] " shejialuo
` (4 preceding siblings ...)
2024-09-29 7:16 ` [PATCH v5 5/9] ref: add basic symref content check for files backend shejialuo
@ 2024-09-29 7:16 ` shejialuo
2024-10-07 6:58 ` Patrick Steinhardt
2024-09-29 7:17 ` [PATCH v5 7/9] ref: enhance escape situation for worktrees shejialuo
` (5 subsequent siblings)
11 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-09-29 7:16 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
Ideally, we want users to use "git symbolic-ref" to create symrefs
instead of writing raw contents into the filesystem. However, "git
symbolic-ref" is strict with the refname but not strict with the
referent. For example, we can make the "referent" point to
"$(gitdir)/logs/aaa" and manually write content into that file, and we
can still successfully parse this symref by using "git rev-parse".
$ git init repo && cd repo && git commit --allow-empty -mx
$ git symbolic-ref refs/heads/test logs/aaa
$ echo $(git rev-parse HEAD) > .git/logs/aaa
$ git rev-parse test
We may need to add some restrictions on the "referent" parameter when
using "git symbolic-ref" to create symrefs, because ideally all
non-pseudo-refs should be located under the "refs" directory, and we may
tighten this in the future.
In order to tell the user that we may tighten the "escape" situation,
create a new fsck message "escapeReferent" to notify the user that this
may become an error in the future.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 8 ++++++++
fsck.h | 1 +
refs/files-backend.c | 7 +++++++
t/t0602-reffiles-fsck.sh | 18 ++++++++++++++++++
4 files changed, 34 insertions(+)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index e0e4519334..223974057d 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -52,6 +52,14 @@
`emptyName`::
(WARN) A path contains an empty name.
+`escapeReferent`::
+ (INFO) The referent of a symref is outside the "ref" directory.
+ Although we allow create a symref pointing to the referent which
+ is outside the "ref" by using `git symbolic-ref`, we may tighten
+ the rule in the future. Report to the git@vger.kernel.org
+ mailing list if you see this error, as we need to know what tools
+ created such a file.
+
`extraHeaderEntry`::
(IGNORE) Extra headers found after `tagger`.
diff --git a/fsck.h b/fsck.h
index 979d75cb53..5ecee0fda5 100644
--- a/fsck.h
+++ b/fsck.h
@@ -80,6 +80,7 @@ enum fsck_msg_type {
FUNC(LARGE_PATHNAME, WARN) \
/* infos (reported as warnings, but ignored by default) */ \
FUNC(BAD_FILEMODE, INFO) \
+ FUNC(ESCAPE_REFERENT, INFO) \
FUNC(GITMODULES_PARSE, INFO) \
FUNC(GITIGNORE_SYMLINK, INFO) \
FUNC(GITATTRIBUTES_SYMLINK, INFO) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 57ac466b64..bd215c8d08 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3520,6 +3520,13 @@ static int files_fsck_symref_target(struct fsck_options *o,
orig_last_byte = referent->buf[orig_len - 1];
strbuf_rtrim(referent);
+ if (!starts_with(referent->buf, "refs/")) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_ESCAPE_REFERENT,
+ "referent '%s' is outside of refs/",
+ referent->buf);
+ }
+
if (check_refname_format(referent->buf, 0)) {
ret = fsck_report_ref(o, report,
FSCK_MSG_BAD_REFERENT,
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 718f6abb71..585f562245 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -370,4 +370,22 @@ test_expect_success 'textual symref content should be checked (aggregate)' '
test_cmp expect sorted_err
'
+test_expect_success 'textual symref should be checked whether it is escaped' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs-back/heads/main\n" >$branch_dir_prefix/branch-bad-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-bad-1: escapeReferent: referent '\''refs-back/heads/main'\'' is outside of refs/
+ EOF
+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err
+'
+
test_done
--
2.46.2
* Re: [PATCH v5 6/9] ref: add escape check for the referent of symref
2024-09-29 7:16 ` [PATCH v5 6/9] ref: add escape check for the referent of symref shejialuo
@ 2024-10-07 6:58 ` Patrick Steinhardt
2024-10-07 8:44 ` shejialuo
0 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-10-07 6:58 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Sun, Sep 29, 2024 at 03:16:21PM +0800, shejialuo wrote:
> Ideally, we want to the users use "git symbolic-ref" to create symrefs
> instead of writing raw contents into the filesystem. However, "git
> symbolic-ref" is strict with the refname but not strict with the
> referent. For example, we can make the "referent" located at the
> "$(gitdir)/logs/aaa" and manually write the content into this where we
> can still successfully parse this symref by using "git rev-parse".
>
> $ git init repo && cd repo && git commit --allow-empty -mx
> $ git symbolic-ref refs/heads/test logs/aaa
> $ echo $(git rev-parse HEAD) > .git/logs/aaa
> $ git rev-parse test
Oh, curious. This should definitely be tightened in git-symbolic-ref(1)
itself. The target should either be a root ref or something starting
with "refs/". Anyway, that is of course outside of the scope of this
patch series.
> We may need to add some restrictions for "referent" parameter when using
> "git symbolic-ref" to create symrefs because ideally all the
> nonpeudo-refs should be located under the "refs" directory and we may
> tighten this in the future.
Agreed.
> In order to tell the user we may tighten the "escape" situation, create
> a new fsck message "escapeReferent" to notify the user that this may
> become an error in the future.
>
> Mentored-by: Patrick Steinhardt <ps@pks.im>
> Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
> Documentation/fsck-msgids.txt | 8 ++++++++
> fsck.h | 1 +
> refs/files-backend.c | 7 +++++++
> t/t0602-reffiles-fsck.sh | 18 ++++++++++++++++++
> 4 files changed, 34 insertions(+)
>
> diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
> index e0e4519334..223974057d 100644
> --- a/Documentation/fsck-msgids.txt
> +++ b/Documentation/fsck-msgids.txt
> @@ -52,6 +52,14 @@
> `emptyName`::
> (WARN) A path contains an empty name.
>
> +`escapeReferent`::
> + (INFO) The referent of a symref is outside the "ref" directory.
Proposal: 'The referent of a symbolic reference points neither to a root
reference nor to a reference starting with "refs/".'
I'd also rename this to e.g. "symrefTargetIsNotAReference" or something
like that, because it's not really about whether or not the referent is
"escaping". It's a bit of a mouthful, but I don't really have a better
name. So feel free to pick something different that describes the error
better.
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 57ac466b64..bd215c8d08 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -3520,6 +3520,13 @@ static int files_fsck_symref_target(struct fsck_options *o,
> orig_last_byte = referent->buf[orig_len - 1];
> strbuf_rtrim(referent);
>
> + if (!starts_with(referent->buf, "refs/")) {
> + ret = fsck_report_ref(o, report,
> + FSCK_MSG_ESCAPE_REFERENT,
> + "referent '%s' is outside of refs/",
> + referent->buf);
> + }
> +
> if (check_refname_format(referent->buf, 0)) {
> ret = fsck_report_ref(o, report,
> FSCK_MSG_BAD_REFERENT,
This check is invalid, because referents can also point to root refs. So
you should probably also add a call to `is_root_ref()` here.
We also have `is_pseudo_ref()`, and one might be tempted to also allow
that. But pseudo refs aren't proper refs, so I'd argue that a symref
pointing to a pseudo ref is invalid, too.
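To make the suggestion concrete, here is a minimal, self-contained sketch of the intended condition. The `toy_*` helpers are hypothetical stand-ins written for illustration only; they are not Git's `is_root_ref()` or `is_pseudo_ref()`, whose real rules may be stricter. The sketch assumes that a root ref is a name made only of uppercase letters, '-' and '_', and that FETCH_HEAD and MERGE_HEAD are the pseudo refs to exclude.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in: treat FETCH_HEAD and MERGE_HEAD as pseudo refs. */
int toy_is_pseudo_ref(const char *refname)
{
	return !strcmp(refname, "FETCH_HEAD") || !strcmp(refname, "MERGE_HEAD");
}

/*
 * Hypothetical stand-in: a root ref is a non-empty name consisting only of
 * uppercase letters, '-' and '_', and it must not be a pseudo ref.
 */
int toy_is_root_ref(const char *refname)
{
	const char *c;

	if (!*refname)
		return 0;
	for (c = refname; *c; c++)
		if (!isupper((unsigned char)*c) && *c != '-' && *c != '_')
			return 0;
	return !toy_is_pseudo_ref(refname);
}

/* A symref target is acceptable if it is under "refs/" or is a root ref. */
int toy_symref_target_is_ref(const char *referent)
{
	return !strncmp(referent, "refs/", 5) || toy_is_root_ref(referent);
}
```

With these assumptions, "refs/heads/main" and "HEAD" are accepted, while "logs/aaa" and "FETCH_HEAD" would trigger the report.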
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 6/9] ref: add escape check for the referent of symref
2024-10-07 6:58 ` Patrick Steinhardt
@ 2024-10-07 8:44 ` shejialuo
2024-10-07 9:26 ` Patrick Steinhardt
0 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-10-07 8:44 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 07, 2024 at 08:58:55AM +0200, Patrick Steinhardt wrote:
> On Sun, Sep 29, 2024 at 03:16:21PM +0800, shejialuo wrote:
> > > Ideally, we want users to use "git symbolic-ref" to create symrefs
> > > instead of writing raw contents into the filesystem. However, "git
> > > symbolic-ref" is strict with the refname but not strict with the
> > > referent. For example, we can point the "referent" at
> > > "$(gitdir)/logs/aaa" and manually write an object ID into that file,
> > > and "git rev-parse" will still successfully resolve the symref.
> >
> > $ git init repo && cd repo && git commit --allow-empty -mx
> > $ git symbolic-ref refs/heads/test logs/aaa
> > $ echo $(git rev-parse HEAD) > .git/logs/aaa
> > $ git rev-parse test
>
> Oh, curious. This should definitely be tightened in git-symbolic-ref(1)
> itself. The target should either be a root ref or something starting
> with "refs/". Anyway, that is of course outside of the scope of this
> patch series.
>
I was curious about this too when I experimented while writing the code.
Junio had told me this could happen, so I dug into it. However, the
behavior is not reasonable. If we want to tighten the rule, we also need
to make "git symbolic-ref" align with it. That's another question,
though.
[snip]
> > diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
> > index e0e4519334..223974057d 100644
> > --- a/Documentation/fsck-msgids.txt
> > +++ b/Documentation/fsck-msgids.txt
> > @@ -52,6 +52,14 @@
> > `emptyName`::
> > (WARN) A path contains an empty name.
> >
> > +`escapeReferent`::
> > + (INFO) The referent of a symref is outside the "ref" directory.
>
> Proposal: 'The referent of a symbolic reference points neither to a root
> reference nor to a reference starting with "refs/".'
>
That's much better.
> I'd also rename this to e.g. "symrefTargetIsNotAReference" or something
> like that, because it's not really about whether or not the referent is
> "escaping". It's a bit of a mouthful, but I don't really have a better
> name. So feel free to pick something different that describes the error
> better.
>
I think "symrefTargetIsNotAReference" is a little too long. If we decide
to convert this to an error later, why not just fold it into the
"badReferent" fsck message?
So, I do not think we need to rename it. As I have said before, we don't
need to map each error case to its own fsck message ID.
> > diff --git a/refs/files-backend.c b/refs/files-backend.c
> > index 57ac466b64..bd215c8d08 100644
> > --- a/refs/files-backend.c
> > +++ b/refs/files-backend.c
> > @@ -3520,6 +3520,13 @@ static int files_fsck_symref_target(struct fsck_options *o,
> > orig_last_byte = referent->buf[orig_len - 1];
> > strbuf_rtrim(referent);
> >
> > + if (!starts_with(referent->buf, "refs/")) {
> > + ret = fsck_report_ref(o, report,
> > + FSCK_MSG_ESCAPE_REFERENT,
> > + "referent '%s' is outside of refs/",
> > + referent->buf);
> > + }
> > +
> > if (check_refname_format(referent->buf, 0)) {
> > ret = fsck_report_ref(o, report,
> > FSCK_MSG_BAD_REFERENT,
>
> This check is invalid, because referents can also point to root refs. So
> you should probably also add a call to `is_root_ref()` here.
>
Thanks, I overlooked this case.
> We also have `is_pseudo_ref()`, and one might be tempted to also allow
> that. But pseudo refs aren't proper refs, so I'd argue that a symref
> pointing to a pseudo ref is invalid, too.
>
I agree.
> Patrick
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 6/9] ref: add escape check for the referent of symref
2024-10-07 8:44 ` shejialuo
@ 2024-10-07 9:26 ` Patrick Steinhardt
0 siblings, 0 replies; 209+ messages in thread
From: Patrick Steinhardt @ 2024-10-07 9:26 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 07, 2024 at 04:44:44PM +0800, shejialuo wrote:
> On Mon, Oct 07, 2024 at 08:58:55AM +0200, Patrick Steinhardt wrote:
> > On Sun, Sep 29, 2024 at 03:16:21PM +0800, shejialuo wrote:
> > I'd also rename this to e.g. "symrefTargetIsNotAReference" or something
> > like that, because it's not really about whether or not the referent is
> > "escaping". It's a bit of a mouthful, but I don't really have a better
> > name. So feel free to pick something different that describes the error
> > better.
> >
>
> I think "symrefTargetIsNotAReference" is a little too long. If we decide
> to convert this to an error later, why not just fold it into the
> "badReferent" fsck message?
>
> So, I do not think we need to rename it. As I have said before, we don't
> need to map each error case to its own fsck message ID.
Mostly because I disagree with this here. I think there should be a 1:1
mapping, and "badReferent" is too generic to provide that.
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* [PATCH v5 7/9] ref: enhance escape situation for worktrees
2024-09-29 7:13 ` [PATCH v5 0/9] " shejialuo
` (5 preceding siblings ...)
2024-09-29 7:16 ` [PATCH v5 6/9] ref: add escape check for the referent of symref shejialuo
@ 2024-09-29 7:17 ` shejialuo
2024-10-07 6:58 ` Patrick Steinhardt
2024-09-29 7:17 ` [PATCH v5 8/9] t0602: add ref content checks " shejialuo
` (4 subsequent siblings)
11 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-09-29 7:17 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We do allow users to use "git symbolic-ref" to create symrefs, from the
primary worktree or from a linked worktree, that point into one of the
linked worktrees.
We should not report an escape to the user in this situation. So,
enhance the "files_fsck_symref_target" function to check whether the
"referent" starts with "worktrees/", to make sure that we won't warn the
user when a symref points to a "referent" in a linked worktree.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/files-backend.c | 5 +++--
t/t0602-reffiles-fsck.sh | 34 +++++++++++++++++++++++++++++++++-
2 files changed, 36 insertions(+), 3 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index bd215c8d08..1182bca108 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3520,10 +3520,11 @@ static int files_fsck_symref_target(struct fsck_options *o,
orig_last_byte = referent->buf[orig_len - 1];
strbuf_rtrim(referent);
- if (!starts_with(referent->buf, "refs/")) {
+ if (!starts_with(referent->buf, "refs/") &&
+ !starts_with(referent->buf, "worktrees/")) {
ret = fsck_report_ref(o, report,
FSCK_MSG_ESCAPE_REFERENT,
- "referent '%s' is outside of refs/",
+ "referent '%s' is outside of refs/ or worktrees/",
referent->buf);
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 585f562245..936448f780 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -382,10 +382,42 @@ test_expect_success 'textual symref should be checked whether it is escaped' '
printf "ref: refs-back/heads/main\n" >$branch_dir_prefix/branch-bad-1 &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/branch-bad-1: escapeReferent: referent '\''refs-back/heads/main'\'' is outside of refs/
+ warning: refs/heads/branch-bad-1: escapeReferent: referent '\''refs-back/heads/main'\'' is outside of refs/ or worktrees/
EOF
rm $branch_dir_prefix/branch-bad-1 &&
test_cmp expect err
'
+test_expect_success 'textual symref escape check should work with worktrees' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+
+ (
+ cd worktree-1 &&
+ git branch refs/worktree/w1-branch &&
+ git symbolic-ref refs/worktree/branch-4 refs/heads/branch-1 &&
+ git symbolic-ref refs/worktree/branch-5 worktrees/worktree-2/refs/worktree/w2-branch
+ ) &&
+ (
+ cd worktree-2 &&
+ git branch refs/worktree/w2-branch &&
+ git symbolic-ref refs/worktree/branch-4 refs/heads/branch-1 &&
+ git symbolic-ref refs/worktree/branch-5 worktrees/worktree-1/refs/worktree/w1-branch
+ ) &&
+
+
+ git symbolic-ref refs/heads/branch-5 worktrees/worktree-1/refs/worktree/w1-branch &&
+ git symbolic-ref refs/heads/branch-6 worktrees/worktree-2/refs/worktree/w2-branch &&
+
+ git refs verify 2>err &&
+ test_must_be_empty err
+'
+
test_done
--
2.46.2
^ permalink raw reply related [flat|nested] 209+ messages in thread
* Re: [PATCH v5 7/9] ref: enhance escape situation for worktrees
2024-09-29 7:17 ` [PATCH v5 7/9] ref: enhance escape situation for worktrees shejialuo
@ 2024-10-07 6:58 ` Patrick Steinhardt
2024-10-07 8:45 ` shejialuo
0 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-10-07 6:58 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Sun, Sep 29, 2024 at 03:17:01PM +0800, shejialuo wrote:
> We do allow users to use "git symbolic-ref" to create symrefs, from the
> primary worktree or from a linked worktree, that point into one of the
> linked worktrees.
>
> We should not report an escape to the user in this situation. So,
> enhance the "files_fsck_symref_target" function to check whether the
> "referent" starts with "worktrees/", to make sure that we won't warn the
> user when a symref points to a "referent" in a linked worktree.
Shouldn't this commit be squashed into the former one, as it immediately
fixes an edge case that was introduced with the parent commit?
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 7/9] ref: enhance escape situation for worktrees
2024-10-07 6:58 ` Patrick Steinhardt
@ 2024-10-07 8:45 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-07 8:45 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 07, 2024 at 08:58:40AM +0200, Patrick Steinhardt wrote:
> On Sun, Sep 29, 2024 at 03:17:01PM +0800, shejialuo wrote:
> > We do allow users to use "git symbolic-ref" to create symrefs, from the
> > primary worktree or from a linked worktree, that point into one of the
> > linked worktrees.
> >
> > We should not report an escape to the user in this situation. So,
> > enhance the "files_fsck_symref_target" function to check whether the
> > "referent" starts with "worktrees/", to make sure that we won't warn the
> > user when a symref points to a "referent" in a linked worktree.
>
> Shouldn't this commit be squashed into the former one, as it immediately
> fixes an edge case that was introduced with the parent commit?
>
I partially agree here. I don't think this is an edge case that was
introduced with the parent commit. The reason why I use a new commit
here is that I want to emphasize the behavior.
This is because Junio asked me in v4 about "escapeReferent":
    I am not sure starting this as ERROR is wise. Users and third-party
    tools make creative uses of the system and I cannot offhand think of
    an argument why it should be forbidden to create a symbolic link to
    our own HEAD or to some worktree-specific ref in another worktree.
Actually, I had never thought we could do this. So, that was my
intention. But I do agree that this commit is closely related to the
parent commit.
I will improve this in the next version.
> Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* [PATCH v5 8/9] t0602: add ref content checks for worktrees
2024-09-29 7:13 ` [PATCH v5 0/9] " shejialuo
` (6 preceding siblings ...)
2024-09-29 7:17 ` [PATCH v5 7/9] ref: enhance escape situation for worktrees shejialuo
@ 2024-09-29 7:17 ` shejialuo
2024-10-07 6:58 ` Patrick Steinhardt
2024-09-29 7:17 ` [PATCH v5 9/9] ref: add symlink ref content check for files backend shejialuo
` (3 subsequent siblings)
11 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-09-29 7:17 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We have already added content tests, but we don't have tests when there
are worktrees in the repository. Add a new test to test all the
functionalities we have added for worktrees.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
t/t0602-reffiles-fsck.sh | 66 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 936448f780..97bbcd3f13 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -420,4 +420,70 @@ test_expect_success 'textual symref escape check should work with worktrees' '
test_must_be_empty err
'
+test_expect_success 'all textual symref checks should work with worktrees' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ bad_content_1=$(git rev-parse HEAD)x &&
+ bad_content_2=xfsazqfxcadas &&
+ bad_content_3=Xfsazqfxcadas &&
+
+ printf "%s" $bad_content_1 >$worktree1_refdir_prefix/bad-branch-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/worktree/bad-branch-1: badRefContent: $bad_content_1
+ EOF
+ rm $worktree1_refdir_prefix/bad-branch-1 &&
+ test_cmp expect err &&
+
+ printf "%s" $bad_content_2 >$worktree2_refdir_prefix/bad-branch-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/worktree/bad-branch-2: badRefContent: $bad_content_2
+ EOF
+ rm $worktree2_refdir_prefix/bad-branch-2 &&
+ test_cmp expect err &&
+
+ printf "%s" $bad_content_3 >$worktree1_refdir_prefix/bad-branch-3 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/worktree/bad-branch-3: badRefContent: $bad_content_3
+ EOF
+ rm $worktree1_refdir_prefix/bad-branch-3 &&
+ test_cmp expect err &&
+
+ printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/worktree/branch-no-newline: unofficialFormattedRef: misses LF at the end
+ EOF
+ rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
+ printf "%s garbage" "$(git rev-parse HEAD)" >$worktree2_refdir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/worktree/branch-garbage: unofficialFormattedRef: has trailing garbage: '\'' garbage'\''
+ EOF
+ rm $worktree2_refdir_prefix/branch-garbage
+'
+
test_done
--
2.46.2
^ permalink raw reply related [flat|nested] 209+ messages in thread
* Re: [PATCH v5 8/9] t0602: add ref content checks for worktrees
2024-09-29 7:17 ` [PATCH v5 8/9] t0602: add ref content checks " shejialuo
@ 2024-10-07 6:58 ` Patrick Steinhardt
2024-10-07 8:45 ` shejialuo
0 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-10-07 6:58 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Sun, Sep 29, 2024 at 03:17:18PM +0800, shejialuo wrote:
> We have already added content tests, but we don't have tests when there
> are worktrees in the repository. Add a new test to test all the
> functionalities we have added for worktrees.
I'd squash this commit into the one where you introduced checks for
worktrees. Or if this exercises errors that you have added in subsequent
commits I'd squash it into the respective commit that introduces those
checks.
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 8/9] t0602: add ref content checks for worktrees
2024-10-07 6:58 ` Patrick Steinhardt
@ 2024-10-07 8:45 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-07 8:45 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 07, 2024 at 08:58:43AM +0200, Patrick Steinhardt wrote:
> On Sun, Sep 29, 2024 at 03:17:18PM +0800, shejialuo wrote:
> > We have already added content tests, but we don't have tests when there
> > are worktrees in the repository. Add a new test to test all the
> > functionalities we have added for worktrees.
>
> I'd squash this commit into the one where you introduced checks for
> worktrees. Or if this exercises errors that you have added in subsequent
> commits I'd squash it into the respective commit that introduces those
> checks.
>
Yes, make sense. I will improve this in the next version.
> Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* [PATCH v5 9/9] ref: add symlink ref content check for files backend
2024-09-29 7:13 ` [PATCH v5 0/9] " shejialuo
` (7 preceding siblings ...)
2024-09-29 7:17 ` [PATCH v5 8/9] t0602: add ref content checks " shejialuo
@ 2024-09-29 7:17 ` shejialuo
2024-10-07 6:58 ` Patrick Steinhardt
2024-09-30 18:57 ` [PATCH v5 0/9] add " Junio C Hamano
` (2 subsequent siblings)
11 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-09-29 7:17 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We have already introduced "files_fsck_symref_target". We should reuse
this function to handle the symrefs which use legacy symbolic links. We
should not check the trailing garbage for symbolic refs. Add a new
parameter "symbolic_link" to disable some checks which should only be
executed for textual symrefs.
We first use "strbuf_add_real_path" to resolve the symlink and get the
absolute path "ref_content" which the symlink ref points to. Then we
compute the absolute path "abs_gitdir" of the "gitdir" and strip
"abs_gitdir" from the front of "ref_content" to extract the relative path
"referent". If "ref_content" is outside of "gitdir", we just use
"ref_content" as the "referent". Thus, we can reuse the
"files_fsck_symref_target" function to seamlessly check the symlink
refs.
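The prefix-stripping step described above can be sketched in isolation as follows. This is only an illustration under stated assumptions: `toy_symlink_referent` is a hypothetical helper that uses plain C strings instead of "strbuf", and it assumes both paths are already absolute and normalized and that "abs_gitdir" ends with a directory separator.

```c
#include <string.h>

/*
 * Hypothetical helper mirroring the skip-prefix step: if "ref_content"
 * lives under "abs_gitdir", return the path relative to the gitdir;
 * otherwise return "ref_content" unchanged. "abs_gitdir" must end with
 * '/', so that "/repo/.git/" does not match "/repo/.github/...".
 */
const char *toy_symlink_referent(const char *abs_gitdir,
				 const char *ref_content)
{
	size_t len = strlen(abs_gitdir);

	if (!strncmp(ref_content, abs_gitdir, len))
		return ref_content + len;
	return ref_content;
}
```

The trailing separator is the reason the patch appends '/' to "abs_gitdir" before comparing: without it, a sibling directory whose name merely starts with the gitdir name would be treated as being inside the gitdir.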
Because we are considering deprecating the writing of symbolic links, we
first need to assess whether symbolic links may still be in use. So, add
a new fsck message "symlinkRef(INFO)" to make the user aware of this.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 6 +++++
fsck.h | 1 +
refs/files-backend.c | 43 ++++++++++++++++++++++++++++-----
t/t0602-reffiles-fsck.sh | 45 +++++++++++++++++++++++++++++++++++
4 files changed, 89 insertions(+), 6 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 223974057d..ffe9d6a2f6 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -184,6 +184,12 @@
`nullSha1`::
(WARN) Tree contains entries pointing to a null sha1.
+`symlinkRef`::
+ (INFO) A symbolic link is used as a symref. Report to the
+ git@vger.kernel.org mailing list if you see this error, as we
+ are assessing the feasibility of dropping support for
+ creating symlinks as symrefs.
+
`treeNotSorted`::
(ERROR) A tree is not properly sorted.
diff --git a/fsck.h b/fsck.h
index 5ecee0fda5..f1da5c8a77 100644
--- a/fsck.h
+++ b/fsck.h
@@ -87,6 +87,7 @@ enum fsck_msg_type {
FUNC(MAILMAP_SYMLINK, INFO) \
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
+ FUNC(SYMLINK_REF, INFO) \
FUNC(UNOFFICIAL_FORMATTED_REF, INFO) \
/* ignored (elevated when requested) */ \
FUNC(EXTRA_HEADER_ENTRY, IGNORE)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 1182bca108..5a5327a146 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1,6 +1,7 @@
#define USE_THE_REPOSITORY_VARIABLE
#include "../git-compat-util.h"
+#include "../abspath.h"
#include "../config.h"
#include "../copy.h"
#include "../environment.h"
@@ -3510,15 +3511,18 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
static int files_fsck_symref_target(struct fsck_options *o,
struct fsck_ref_report *report,
- struct strbuf *referent)
+ struct strbuf *referent,
+ unsigned int symbolic_link)
{
char orig_last_byte;
size_t orig_len;
int ret = 0;
- orig_len = referent->len;
- orig_last_byte = referent->buf[orig_len - 1];
- strbuf_rtrim(referent);
+ if (!symbolic_link) {
+ orig_len = referent->len;
+ orig_last_byte = referent->buf[orig_len - 1];
+ strbuf_rtrim(referent);
+ }
if (!starts_with(referent->buf, "refs/") &&
!starts_with(referent->buf, "worktrees/")) {
@@ -3535,6 +3539,9 @@ static int files_fsck_symref_target(struct fsck_options *o,
goto out;
}
+ if (symbolic_link)
+ goto out;
+
if (referent->len == orig_len ||
(referent->len < orig_len && orig_last_byte != '\n')) {
@@ -3559,6 +3566,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
struct dir_iterator *iter)
{
struct strbuf ref_content = STRBUF_INIT;
+ struct strbuf abs_gitdir = STRBUF_INIT;
struct strbuf referent = STRBUF_INIT;
struct strbuf refname = STRBUF_INIT;
struct fsck_ref_report report = { 0 };
@@ -3571,8 +3579,30 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
strbuf_addf(&refname, "%s/%s", refs_check_dir, iter->relative_path);
report.path = refname.buf;
- if (S_ISLNK(iter->st.st_mode))
+ if (S_ISLNK(iter->st.st_mode)) {
+ const char* relative_referent_path = NULL;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_SYMLINK_REF,
+ "use deprecated symbolic link for symref");
+
+ strbuf_add_absolute_path(&abs_gitdir, ref_store->gitdir);
+ strbuf_normalize_path(&abs_gitdir);
+ if (!is_dir_sep(abs_gitdir.buf[abs_gitdir.len - 1]))
+ strbuf_addch(&abs_gitdir, '/');
+
+ strbuf_add_real_path(&ref_content, iter->path.buf);
+ skip_prefix(ref_content.buf, abs_gitdir.buf,
+ &relative_referent_path);
+
+ if (relative_referent_path)
+ strbuf_addstr(&referent, relative_referent_path);
+ else
+ strbuf_addbuf(&referent, &ref_content);
+
+ ret += files_fsck_symref_target(o, &report, &referent, 1);
goto cleanup;
+ }
if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
ret = fsck_report_ref(o, &report,
@@ -3605,7 +3635,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
goto cleanup;
}
} else {
- ret = files_fsck_symref_target(o, &report, &referent);
+ ret = files_fsck_symref_target(o, &report, &referent, 0);
goto cleanup;
}
@@ -3613,6 +3643,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
strbuf_release(&refname);
strbuf_release(&ref_content);
strbuf_release(&referent);
+ strbuf_release(&abs_gitdir);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 97bbcd3f13..be4c064b3c 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -486,4 +486,49 @@ test_expect_success 'all textual symref checks should work with worktrees' '
rm $worktree2_refdir_prefix/branch-garbage
'
+test_expect_success SYMLINKS 'symlink symref content should be checked (individual)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: refs/heads/branch-symbolic: escapeReferent: referent '\''logs/branch-escape'\'' is outside of refs/ or worktrees/
+ EOF
+ rm $branch_dir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"branch space" $branch_dir_prefix/branch-symbolic-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
+ error: refs/heads/branch-symbolic-bad: badReferent: points to invalid refname '\''refs/heads/branch space'\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-bad &&
+ test_cmp expect err &&
+
+ ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
+ error: refs/tags/tag-symbolic-1: badReferent: points to invalid refname '\''refs/tags/.tag'\''
+ EOF
+ rm $tag_dir_prefix/tag-symbolic-1 &&
+ test_cmp expect err
+'
+
test_done
--
2.46.2
^ permalink raw reply related [flat|nested] 209+ messages in thread
* Re: [PATCH v5 9/9] ref: add symlink ref content check for files backend
2024-09-29 7:17 ` [PATCH v5 9/9] ref: add symlink ref content check for files backend shejialuo
@ 2024-10-07 6:58 ` Patrick Steinhardt
2024-10-07 8:45 ` shejialuo
0 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-10-07 6:58 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Sun, Sep 29, 2024 at 03:17:36PM +0800, shejialuo wrote:
> We have already introduced "files_fsck_symref_target". We should reuse
> this function to handle the symrefs which use legacy symbolic links. We
> should not check the trailing garbage for symbolic refs. Add a new
> parameter "symbolic_link" to disable some checks which should only be
> executed for textual symrefs.
You're getting into implementation details before noting what the actual
problem is. So I'd recommend first describing the problem at a higher
level, and then note that we can reuse parts of preexisting infra to
address the issue.
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 9/9] ref: add symlink ref content check for files backend
2024-10-07 6:58 ` Patrick Steinhardt
@ 2024-10-07 8:45 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-07 8:45 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 07, 2024 at 08:58:50AM +0200, Patrick Steinhardt wrote:
> On Sun, Sep 29, 2024 at 03:17:36PM +0800, shejialuo wrote:
> > We have already introduced "files_fsck_symref_target". We should reuse
> > this function to handle the symrefs which use legacy symbolic links. We
> > should not check the trailing garbage for symbolic refs. Add a new
> > parameter "symbolic_link" to disable some checks which should only be
> > executed for textual symrefs.
>
> You're getting into implementation details before noting what the actual
> problem is. So I'd recommend first describing the problem at a higher
> level, and then note that we can reuse parts of preexisting infra to
> address the issue.
>
Thanks, I will improve this in the next version.
> Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 0/9] add ref content check for files backend
2024-09-29 7:13 ` [PATCH v5 0/9] " shejialuo
` (8 preceding siblings ...)
2024-09-29 7:17 ` [PATCH v5 9/9] ref: add symlink ref content check for files backend shejialuo
@ 2024-09-30 18:57 ` Junio C Hamano
2024-10-01 3:40 ` shejialuo
2024-10-07 12:49 ` shejialuo
2024-10-21 13:32 ` [PATCH v6 " shejialuo
11 siblings, 1 reply; 209+ messages in thread
From: Junio C Hamano @ 2024-09-30 18:57 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak
shejialuo <shejialuo@gmail.com> writes:
> I had not synced with upstream for a long time. For this series, I
> synced with the latest upstream and regenerated the patches. It is based on
>
> 3857aae53f (Git 2.47-rc0, 2024-09-25)
Does this help to reduce conflicts when merging the topic to say
'next' or 'seen'? If so, such a rebase and noting it in the cover
letter message, like you just did, is very much appreciated.
If not, please don't ;-).
> And I don't think range-diff is useful, it is messy for the reviewers.
> Actually, there are not so many logic changes in this new version.
OK, so this needs a fresh full review. Thanks.
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 0/9] add ref content check for files backend
2024-09-30 18:57 ` [PATCH v5 0/9] add " Junio C Hamano
@ 2024-10-01 3:40 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-01 3:40 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Patrick Steinhardt, Karthik Nayak
On Mon, Sep 30, 2024 at 11:57:19AM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > I had not synced with upstream for a long time. For this series, I
> > synced with the latest upstream and regenerated the patches. It is based on
> >
> > 3857aae53f (Git 2.47-rc0, 2024-09-25)
>
> Does this help to reduce conflicts when merging the topic to say
> 'next' or 'seen'? If so, such a rebase and noting it in the cover
> letter message, like you just did, is very much appreciated.
>
> If not, please don't ;-).
>
Actually, I am sure that there are no conflicts after squashing the
following two patches:
<xmqqle0gzdyh.fsf_-_@gitster.g>
<xmqqbk1cz69c.fsf@gitster.g>
The reason why I synced with upstream is that the build system (such
as the warnings about unused parameters) and the CI have both changed.
I will remember this.
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v5 0/9] add ref content check for files backend
2024-09-29 7:13 ` [PATCH v5 0/9] " shejialuo
` (9 preceding siblings ...)
2024-09-30 18:57 ` [PATCH v5 0/9] add " Junio C Hamano
@ 2024-10-07 12:49 ` shejialuo
2024-10-21 13:32 ` [PATCH v6 " shejialuo
11 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-07 12:49 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
From the discussion with Patrick in v5 and Junio in v4, I conclude the
following things:
1. "fsck_ref_report" should not be refactored to accept `NULL`. There
is only one situation where a message adds little (the content of a ref
does not end with a newline). In the other situations, the message part
is useful, such as:
refs/heads/garbage-branch: trailingRefContent: ' garbage'.
refs/heads/escape: escapeReferent: referent 'xxx' is outside.
For some messages, the fsck message ID alone would be enough, but we can
still provide a message. It does no harm anyway.
2. The mapping from fsck message ID to error case should be one-to-one.
This is especially important because the user can set the fsck error
levels. If we map multiple error cases to one message ID, we give the
user a bad experience. We should avoid this.
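The second point can be illustrated with a small sketch. Everything named `toy_*` is hypothetical and only models the idea of a per-message severity table; it is not Git's fsck implementation. The two message IDs and their default levels are taken from this series ("badReferent" as an error, "escapeReferent" as info).

```c
#include <stddef.h>
#include <string.h>

enum toy_severity { TOY_IGNORE, TOY_INFO, TOY_WARN, TOY_ERROR };

struct toy_fsck_msg {
	const char *id;
	enum toy_severity severity;
};

/* One entry per error case, so each can be tuned independently. */
struct toy_fsck_msg toy_msgs[] = {
	{ "badReferent", TOY_ERROR },
	{ "escapeReferent", TOY_INFO },
};

/* Override the severity of one message ID; return -1 if it is unknown. */
int toy_set_severity(const char *id, enum toy_severity severity)
{
	size_t i;

	for (i = 0; i < sizeof(toy_msgs) / sizeof(toy_msgs[0]); i++) {
		if (!strcmp(toy_msgs[i].id, id)) {
			toy_msgs[i].severity = severity;
			return 0;
		}
	}
	return -1;
}

/* Look up the current severity; unknown IDs are reported as ignored. */
enum toy_severity toy_get_severity(const char *id)
{
	size_t i;

	for (i = 0; i < sizeof(toy_msgs) / sizeof(toy_msgs[0]); i++)
		if (!strcmp(toy_msgs[i].id, id))
			return toy_msgs[i].severity;
	return TOY_IGNORE;
}
```

If the "escape" case were folded into "badReferent", a user could no longer raise or lower one of the two cases without also changing the other, which is the bad experience described above.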
I will wait for more comments to ensure the next version will be better.
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 209+ messages in thread
* [PATCH v6 0/9] add ref content check for files backend
2024-09-29 7:13 ` [PATCH v5 0/9] " shejialuo
` (10 preceding siblings ...)
2024-10-07 12:49 ` shejialuo
@ 2024-10-21 13:32 ` shejialuo
2024-10-21 13:34 ` [PATCH v6 1/9] ref: initialize "fsck_ref_report" with zero shejialuo
` (11 more replies)
11 siblings, 12 replies; 209+ messages in thread
From: shejialuo @ 2024-10-21 13:32 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
Hi All:
This new version updates the following things.
First, the new things. [PATCH v6 2/9] and [PATCH v6 3/9] fix a bug that
I introduced when implementing the refname checks with the following
code:
if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
ret = fsck_report(...);
}
Because only the basename is checked, the code wrongly reports an error
for "refs/heads/@". I fix this issue in two commits.
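A toy sketch of the failure mode, not the real implementation. It
models only one rule, namely that a refname consisting of the single
character "@" is rejected, which is my understanding of how
check_refname_format() treats that name. The functions
"toy_check_refname" and "toy_basename" are hypothetical helpers made up
for this illustration.

```c
#include <assert.h>
#include <string.h>

/*
 * Toy stand-in for a single rule of refname validation: a name that
 * is exactly "@" is rejected. Returns 0 if accepted, -1 if rejected.
 */
int toy_check_refname(const char *refname)
{
	assert(refname != NULL);
	return strcmp(refname, "@") ? 0 : -1;
}

/* Returns a pointer to the last path component of "path". */
const char *toy_basename(const char *path)
{
	const char *slash;

	assert(path != NULL);
	slash = strrchr(path, '/');
	return slash ? slash + 1 : path;
}
```

Checking toy_basename("refs/heads/@") hands the bare "@" to the rule
and is rejected, while checking the full name "refs/heads/@" passes,
which is why the check has to see the full refname.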
The differences against the previous version are:
1. Split [PATCH v5 8/9] into every related commit.
2. In [PATCH v6 4/9], print the full name of the worktree ref to avoid
ambiguity.
3. Use a one-to-one mapping for fsck messages.
4. Enhance the commit message and the usage of "fsck_report_ref" to
provide more useful information.
5. Rename "escapeReferent" to "symrefTargetIsNotARef". I agree that
"escape" is not the right word. However, I could not find a more
elegant name, so I follow the advice from Patrick.
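For item 2, a minimal sketch of how a reported name can be made
unambiguous across worktrees. The function "build_report_name" and the
convention that a NULL "worktree_id" means the main worktree are
assumptions of this sketch only; the series itself builds the name with
a strbuf and is_main_worktree().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the name to report for a ref. "worktree_id" is NULL for the
 * main worktree (an assumption of this sketch). Returns 0 on success
 * and -1 if "out" is too small to hold the name.
 */
int build_report_name(char *out, size_t outlen, const char *worktree_id,
		      const char *check_dir, const char *relative_path)
{
	int n;

	assert(out && check_dir && relative_path);
	if (worktree_id)
		n = snprintf(out, outlen, "worktrees/%s/%s/%s",
			     worktree_id, check_dir, relative_path);
	else
		n = snprintf(out, outlen, "%s/%s", check_dir, relative_path);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Without the prefix, two linked worktrees that each contain
"refs/worktree/branch-4" would produce identical report lines, and the
reader could not tell which worktree holds the broken ref.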
I provide the interdiff below, which should be helpful for reviewers.
Thanks,
Jialuo
shejialuo (9):
ref: initialize "fsck_ref_report" with zero
ref: check the full refname instead of basename
ref: initialize target name outside of check functions
ref: support multiple worktrees check for refs
ref: port git-fsck(1) regular refs check for files backend
ref: add more strict checks for regular refs
ref: add basic symref content check for files backend
ref: check whether the target of the symref is a ref
ref: add symlink ref content check for files backend
Documentation/fsck-msgids.txt | 35 +++
builtin/refs.c | 12 +-
fsck.h | 6 +
refs.c | 7 +-
refs.h | 3 +-
refs/debug.c | 5 +-
refs/files-backend.c | 187 ++++++++++++--
refs/packed-backend.c | 8 +-
refs/refs-internal.h | 5 +-
refs/reftable-backend.c | 3 +-
t/t0602-reffiles-fsck.sh | 457 +++++++++++++++++++++++++++++++++-
11 files changed, 693 insertions(+), 35 deletions(-)
Interdiff against v5:
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index ffe9d6a2f6..b14bc44ca4 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -28,8 +28,8 @@
`badRefName`::
(ERROR) A ref has an invalid format.
-`badReferent`::
- (ERROR) The referent of a ref is invalid.
+`badReferentName`::
+ (ERROR) The referent name of a symref is invalid.
`badTagName`::
(INFO) A tag has an invalid format.
@@ -52,14 +52,6 @@
`emptyName`::
(WARN) A path contains an empty name.
-`escapeReferent`::
- (INFO) The referent of a symref is outside the "ref" directory.
- Although we allow create a symref pointing to the referent which
- is outside the "ref" by using `git symbolic-ref`, we may tighten
- the rule in the future. Report to the git@vger.kernel.org
- mailing list if you see this error, as we need to know what tools
- created such a file.
-
`extraHeaderEntry`::
(IGNORE) Extra headers found after `tagger`.
@@ -184,11 +176,34 @@
`nullSha1`::
(WARN) Tree contains entries pointing to a null sha1.
+`refMissingNewline`::
+ (INFO) A loose ref that does not end with newline(LF). As
+ valid implementations of Git never created such a loose ref
+ file, it may become an error in the future. Report to the
+ git@vger.kernel.org mailing list if you see this error, as
+ we need to know what tools created such a file.
+
`symlinkRef`::
- (INFO) A symbolic link is used as a symref. Report to the
+ (INFO) A symbolic link is used as a symref. Report to the
git@vger.kernel.org mailing list if you see this error, as we
are assessing the feasibility of dropping the support to drop
- creating symblinks as symrefs.
+ creating symbolic links as symrefs.
+
+`symrefTargetIsNotARef`::
+ (INFO) The target of a symbolic reference points neither to
+ a root reference nor to a reference starting with "refs/".
+ Although we allow create a symref pointing to the referent which
+ is outside the "ref" by using `git symbolic-ref`, we may tighten
+ the rule in the future. Report to the git@vger.kernel.org
+ mailing list if you see this error, as we need to know what tools
+ created such a file.
+
+`trailingRefContent`::
+ (INFO) A loose ref has trailing content. As valid implementations
+ of Git never created such a loose ref file, it may become an
+ error in the future. Report to the git@vger.kernel.org mailing
+ list if you see this error, as we need to know what tools
+ created such a file.
`treeNotSorted`::
(ERROR) A tree is not properly sorted.
@@ -196,14 +211,6 @@
`unknownType`::
(ERROR) Found an unknown object type.
-`unofficialFormattedRef`::
- (INFO) The content of a loose ref file is not in the official
- format such as not having a LF at the end or having trailing
- garbage. As valid implementations of Git never created such a
- loose ref file, it may become an error in the future. Report
- to the git@vger.kernel.org mailing list if you see this error,
- as we need to know what tools created such a file.
-
`unterminatedHeader`::
(FATAL) Missing end-of-line in the object header.
diff --git a/builtin/refs.c b/builtin/refs.c
index 3c492ea922..886c4ceae3 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -89,9 +89,10 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
worktrees = get_worktrees();
for (p = worktrees; *p; p++) {
struct worktree *wt = *p;
- ret += refs_fsck(get_worktree_ref_store(wt), &fsck_refs_options);
+ ret |= refs_fsck(get_worktree_ref_store(wt), &fsck_refs_options, wt);
}
+
fsck_options_clear(&fsck_refs_options);
free_worktrees(worktrees);
return ret;
diff --git a/fsck.h b/fsck.h
index f1da5c8a77..a44c231a5f 100644
--- a/fsck.h
+++ b/fsck.h
@@ -34,7 +34,7 @@ enum fsck_msg_type {
FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
FUNC(BAD_REF_NAME, ERROR) \
- FUNC(BAD_REFERENT, ERROR) \
+ FUNC(BAD_REFERENT_NAME, ERROR) \
FUNC(BAD_TIMEZONE, ERROR) \
FUNC(BAD_TREE, ERROR) \
FUNC(BAD_TREE_SHA1, ERROR) \
@@ -80,7 +80,6 @@ enum fsck_msg_type {
FUNC(LARGE_PATHNAME, WARN) \
/* infos (reported as warnings, but ignored by default) */ \
FUNC(BAD_FILEMODE, INFO) \
- FUNC(ESCAPE_REFERENT, INFO) \
FUNC(GITMODULES_PARSE, INFO) \
FUNC(GITIGNORE_SYMLINK, INFO) \
FUNC(GITATTRIBUTES_SYMLINK, INFO) \
@@ -88,7 +87,9 @@ enum fsck_msg_type {
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
FUNC(SYMLINK_REF, INFO) \
- FUNC(UNOFFICIAL_FORMATTED_REF, INFO) \
+ FUNC(REF_MISSING_NEWLINE, INFO) \
+ FUNC(SYMREF_TARGET_IS_NOT_A_REF, INFO) \
+ FUNC(TRAILING_REF_CONTENT, INFO) \
/* ignored (elevated when requested) */ \
FUNC(EXTRA_HEADER_ENTRY, IGNORE)
diff --git a/refs.c b/refs.c
index 6ba1bb1aa1..f88b32a633 100644
--- a/refs.c
+++ b/refs.c
@@ -318,9 +318,10 @@ int check_refname_format(const char *refname, int flags)
return check_or_sanitize_refname(refname, flags, NULL);
}
-int refs_fsck(struct ref_store *refs, struct fsck_options *o)
+int refs_fsck(struct ref_store *refs, struct fsck_options *o,
+ struct worktree *wt)
{
- return refs->be->fsck(refs, o);
+ return refs->be->fsck(refs, o, wt);
}
void sanitize_refname_component(const char *refname, struct strbuf *out)
diff --git a/refs.h b/refs.h
index 108dfc93b3..341d43239c 100644
--- a/refs.h
+++ b/refs.h
@@ -549,7 +549,8 @@ int check_refname_format(const char *refname, int flags);
* reflogs are consistent, and non-zero otherwise. The errors will be
* written to stderr.
*/
-int refs_fsck(struct ref_store *refs, struct fsck_options *o);
+int refs_fsck(struct ref_store *refs, struct fsck_options *o,
+ struct worktree *wt);
/*
* Apply the rules from check_refname_format, but mutate the result until it
diff --git a/refs/debug.c b/refs/debug.c
index 45e2e784a0..72e80ddd6d 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -420,10 +420,11 @@ static int debug_reflog_expire(struct ref_store *ref_store, const char *refname,
}
static int debug_fsck(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
struct debug_ref_store *drefs = (struct debug_ref_store *)ref_store;
- int res = drefs->refs->be->fsck(drefs->refs, o);
+ int res = drefs->refs->be->fsck(drefs->refs, o, wt);
trace_printf_key(&trace_refs, "fsck: %d\n", res);
return res;
}
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 5a5327a146..180f8e28b7 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -24,6 +24,7 @@
#include "../dir.h"
#include "../chdir-notify.h"
#include "../setup.h"
+#include "../worktree.h"
#include "../wrapper.h"
#include "../write-or-die.h"
#include "../revision.h"
@@ -3506,7 +3507,7 @@ static int files_ref_store_remove_on_disk(struct ref_store *ref_store,
*/
typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
struct fsck_options *o,
- const char *refs_check_dir,
+ const char *target_name,
struct dir_iterator *iter);
static int files_fsck_symref_target(struct fsck_options *o,
@@ -3514,27 +3515,29 @@ static int files_fsck_symref_target(struct fsck_options *o,
struct strbuf *referent,
unsigned int symbolic_link)
{
+ int is_referent_root;
char orig_last_byte;
size_t orig_len;
int ret = 0;
- if (!symbolic_link) {
- orig_len = referent->len;
- orig_last_byte = referent->buf[orig_len - 1];
+ orig_len = referent->len;
+ orig_last_byte = referent->buf[orig_len - 1];
+ if (!symbolic_link)
strbuf_rtrim(referent);
- }
- if (!starts_with(referent->buf, "refs/") &&
+ is_referent_root = is_root_ref(referent->buf);
+ if (!is_referent_root &&
+ !starts_with(referent->buf, "refs/") &&
!starts_with(referent->buf, "worktrees/")) {
ret = fsck_report_ref(o, report,
- FSCK_MSG_ESCAPE_REFERENT,
- "referent '%s' is outside of refs/ or worktrees/",
- referent->buf);
+ FSCK_MSG_SYMREF_TARGET_IS_NOT_A_REF,
+ "points to non-ref target '%s'", referent->buf);
+
}
- if (check_refname_format(referent->buf, 0)) {
+ if (!is_referent_root && check_refname_format(referent->buf, 0)) {
ret = fsck_report_ref(o, report,
- FSCK_MSG_BAD_REFERENT,
+ FSCK_MSG_BAD_REFERENT_NAME,
"points to invalid refname '%s'", referent->buf);
goto out;
}
@@ -3542,17 +3545,16 @@ static int files_fsck_symref_target(struct fsck_options *o,
if (symbolic_link)
goto out;
-
if (referent->len == orig_len ||
(referent->len < orig_len && orig_last_byte != '\n')) {
ret = fsck_report_ref(o, report,
- FSCK_MSG_UNOFFICIAL_FORMATTED_REF,
+ FSCK_MSG_REF_MISSING_NEWLINE,
"misses LF at the end");
}
if (referent->len != orig_len && referent->len != orig_len - 1) {
ret = fsck_report_ref(o, report,
- FSCK_MSG_UNOFFICIAL_FORMATTED_REF,
+ FSCK_MSG_TRAILING_REF_CONTENT,
"has trailing whitespaces or newlines");
}
@@ -3562,13 +3564,12 @@ static int files_fsck_symref_target(struct fsck_options *o,
static int files_fsck_refs_content(struct ref_store *ref_store,
struct fsck_options *o,
- const char *refs_check_dir,
+ const char *target_name,
struct dir_iterator *iter)
{
struct strbuf ref_content = STRBUF_INIT;
struct strbuf abs_gitdir = STRBUF_INIT;
struct strbuf referent = STRBUF_INIT;
- struct strbuf refname = STRBUF_INIT;
struct fsck_ref_report report = { 0 };
const char *trailing = NULL;
unsigned int type = 0;
@@ -3576,8 +3577,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
struct object_id oid;
int ret = 0;
- strbuf_addf(&refname, "%s/%s", refs_check_dir, iter->relative_path);
- report.path = refname.buf;
+ report.path = target_name;
if (S_ISLNK(iter->st.st_mode)) {
const char* relative_referent_path = NULL;
@@ -3600,14 +3600,15 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
else
strbuf_addbuf(&referent, &ref_content);
- ret += files_fsck_symref_target(o, &report, &referent, 1);
+ ret |= files_fsck_symref_target(o, &report, &referent, 1);
goto cleanup;
}
if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_CONTENT,
- "cannot read ref file");
+ "cannot read ref file '%s': (%s)",
+ iter->path.buf, strerror(errno));
goto cleanup;
}
@@ -3624,13 +3625,13 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
if (!(type & REF_ISSYMREF)) {
if (!*trailing) {
ret = fsck_report_ref(o, &report,
- FSCK_MSG_UNOFFICIAL_FORMATTED_REF,
+ FSCK_MSG_REF_MISSING_NEWLINE,
"misses LF at the end");
goto cleanup;
}
if (*trailing != '\n' || *(trailing + 1)) {
ret = fsck_report_ref(o, &report,
- FSCK_MSG_UNOFFICIAL_FORMATTED_REF,
+ FSCK_MSG_TRAILING_REF_CONTENT,
"has trailing garbage: '%s'", trailing);
goto cleanup;
}
@@ -3640,7 +3641,6 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
}
cleanup:
- strbuf_release(&refname);
strbuf_release(&ref_content);
strbuf_release(&referent);
strbuf_release(&abs_gitdir);
@@ -3649,7 +3649,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
struct fsck_options *o,
- const char *refs_check_dir,
+ const char *target_name,
struct dir_iterator *iter)
{
struct strbuf sb = STRBUF_INIT;
@@ -3662,11 +3662,10 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
if (iter->basename[0] != '.' && ends_with(iter->basename, ".lock"))
goto cleanup;
- if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
+ if (check_refname_format(target_name, 0)) {
struct fsck_ref_report report = { 0 };
- strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
- report.path = sb.buf;
+ report.path = target_name;
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_NAME,
"invalid refname format");
@@ -3680,8 +3679,10 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
static int files_fsck_refs_dir(struct ref_store *ref_store,
struct fsck_options *o,
const char *refs_check_dir,
+ struct worktree *wt,
files_fsck_refs_fn *fsck_refs_fn)
{
+ struct strbuf target_name = STRBUF_INIT;
struct strbuf sb = STRBUF_INIT;
struct dir_iterator *iter;
int iter_status;
@@ -3700,11 +3701,18 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
continue;
} else if (S_ISREG(iter->st.st_mode) ||
S_ISLNK(iter->st.st_mode)) {
+ strbuf_reset(&target_name);
+
+ if (!is_main_worktree(wt))
+ strbuf_addf(&target_name, "worktrees/%s/", wt->id);
+ strbuf_addf(&target_name, "%s/%s", refs_check_dir,
+ iter->relative_path);
+
if (o->verbose)
- fprintf_ln(stdout, "Checking %s/%s",
- refs_check_dir, iter->relative_path);
+ fprintf_ln(stderr, "Checking %s", target_name.buf);
+
for (size_t i = 0; fsck_refs_fn[i]; i++) {
- if (fsck_refs_fn[i](ref_store, o, refs_check_dir, iter))
+ if (fsck_refs_fn[i](ref_store, o, target_name.buf, iter))
ret = -1;
}
} else {
@@ -3721,11 +3729,13 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
out:
strbuf_release(&sb);
+ strbuf_release(&target_name);
return ret;
}
static int files_fsck_refs(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
files_fsck_refs_fn fsck_refs_fn[]= {
files_fsck_refs_name,
@@ -3733,27 +3743,20 @@ static int files_fsck_refs(struct ref_store *ref_store,
NULL,
};
- fprintf_ln(stdout, _("Checking references consistency in %s"),
- ref_store->gitdir);
- return files_fsck_refs_dir(ref_store, o, "refs", fsck_refs_fn);
+ if (o->verbose)
+ fprintf_ln(stderr, _("Checking references consistency"));
+ return files_fsck_refs_dir(ref_store, o, "refs", wt, fsck_refs_fn);
}
static int files_fsck(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
struct files_ref_store *refs =
files_downcast(ref_store, REF_STORE_READ, "fsck");
- int ret = files_fsck_refs(ref_store, o);
-
- /*
- * packed-refs should only be checked once because it is shared
- * between all worktrees.
- */
- if (!strcmp(ref_store->gitdir, ref_store->repo->gitdir))
- ret += refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
-
- return ret;
+ return files_fsck_refs(ref_store, o, wt) |
+ refs->packed_ref_store->be->fsck(refs->packed_ref_store, o, wt);
}
struct ref_storage_be refs_be_files = {
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 07c57fd541..46dcaec654 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -13,6 +13,7 @@
#include "../lockfile.h"
#include "../chdir-notify.h"
#include "../statinfo.h"
+#include "../worktree.h"
#include "../wrapper.h"
#include "../write-or-die.h"
#include "../trace2.h"
@@ -1754,8 +1755,13 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
}
static int packed_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED)
+ struct fsck_options *o UNUSED,
+ struct worktree *wt)
{
+
+ if (!is_main_worktree(wt))
+ return 0;
+
return 0;
}
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 73b05f971b..125f1fe735 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -653,7 +653,8 @@ typedef int read_symbolic_ref_fn(struct ref_store *ref_store, const char *refnam
struct strbuf *referent);
typedef int fsck_fn(struct ref_store *ref_store,
- struct fsck_options *o);
+ struct fsck_options *o,
+ struct worktree *wt);
struct ref_storage_be {
const char *name;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index f5f957e6de..b6a63c1015 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -2443,7 +2443,8 @@ static int reftable_be_reflog_expire(struct ref_store *ref_store,
}
static int reftable_be_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED)
+ struct fsck_options *o UNUSED,
+ struct worktree *wt UNUSED)
{
return 0;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index be4c064b3c..aee7e04b82 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -25,6 +25,13 @@ test_expect_success 'ref name should be checked' '
git tag tag-2 &&
git tag multi_hierarchy/tag-2 &&
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ EOF
+ test_must_be_empty err &&
+ rm $branch_dir_prefix/@ &&
+
cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
@@ -33,20 +40,20 @@ test_expect_success 'ref name should be checked' '
rm $branch_dir_prefix/.branch-1 &&
test_cmp expect err &&
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/'\'' branch-1'\'' &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/@: badRefName: invalid refname format
+ error: refs/heads/ branch-1: badRefName: invalid refname format
EOF
- rm $branch_dir_prefix/@ &&
+ rm $branch_dir_prefix/'\'' branch-1'\'' &&
test_cmp expect err &&
- cp $tag_dir_prefix/multi_hierarchy/tag-2 $tag_dir_prefix/multi_hierarchy/@ &&
+ cp $tag_dir_prefix/multi_hierarchy/tag-2 $tag_dir_prefix/multi_hierarchy/'\''~tag-2'\'' &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/tags/multi_hierarchy/@: badRefName: invalid refname format
+ error: refs/tags/multi_hierarchy/~tag-2: badRefName: invalid refname format
EOF
- rm $tag_dir_prefix/multi_hierarchy/@ &&
+ rm $tag_dir_prefix/multi_hierarchy/'\''~tag-2'\'' &&
test_cmp expect err &&
cp $tag_dir_prefix/tag-1 $tag_dir_prefix/tag-1.lock &&
@@ -60,6 +67,15 @@ test_expect_success 'ref name should be checked' '
error: refs/tags/.lock: badRefName: invalid refname format
EOF
rm $tag_dir_prefix/.lock &&
+ test_cmp expect err &&
+
+ mkdir $tag_dir_prefix/'\''~new-feature'\'' &&
+ cp $tag_dir_prefix/tag-1 $tag_dir_prefix/'\''~new-feature'\''/tag-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/~new-feature/tag-1: badRefName: invalid refname format
+ EOF
+ rm -rf $tag_dir_prefix/'\''~new-feature'\'' &&
test_cmp expect err
'
@@ -84,7 +100,7 @@ test_expect_success 'ref name check should be adapted into fsck messages' '
rm $branch_dir_prefix/.branch-1 &&
test_cmp expect err &&
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/'\''~branch-1'\'' &&
git -c fsck.badRefName=ignore refs verify 2>err &&
test_must_be_empty err
'
@@ -114,13 +130,13 @@ test_expect_success 'ref name check should work for multiple worktrees' '
git update-ref refs/worktree/branch-4 refs/heads/branch-3
) &&
- cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/.branch-2 &&
- cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/@ &&
+ cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
+ cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/worktree/.branch-2: badRefName: invalid refname format
- error: refs/worktree/@: badRefName: invalid refname format
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
EOF
sort err >sorted_err &&
test_cmp expect sorted_err &&
@@ -129,8 +145,8 @@ test_expect_success 'ref name check should work for multiple worktrees' '
cd worktree-1 &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/worktree/.branch-2: badRefName: invalid refname format
- error: refs/worktree/@: badRefName: invalid refname format
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
EOF
sort err >sorted_err &&
test_cmp expect sorted_err
@@ -140,8 +156,8 @@ test_expect_success 'ref name check should work for multiple worktrees' '
cd worktree-2 &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/worktree/.branch-2: badRefName: invalid refname format
- error: refs/worktree/@: badRefName: invalid refname format
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
EOF
sort err >sorted_err &&
test_cmp expect sorted_err
@@ -190,7 +206,7 @@ test_expect_success 'regular ref content should be checked (individual)' '
printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline: unofficialFormattedRef: misses LF at the end
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
EOF
rm $branch_dir_prefix/branch-no-newline &&
test_cmp expect err &&
@@ -198,7 +214,7 @@ test_expect_success 'regular ref content should be checked (individual)' '
printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/branch-garbage: unofficialFormattedRef: has trailing garbage: '\'' garbage'\''
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
EOF
rm $branch_dir_prefix/branch-garbage &&
test_cmp expect err &&
@@ -206,7 +222,7 @@ test_expect_success 'regular ref content should be checked (individual)' '
printf "%s\n\n\n" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-1 &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/tags/tag-garbage-1: unofficialFormattedRef: has trailing garbage: '\''
+ warning: refs/tags/tag-garbage-1: trailingRefContent: has trailing garbage: '\''
'\''
@@ -217,7 +233,7 @@ test_expect_success 'regular ref content should be checked (individual)' '
printf "%s\n\n\n garbage" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-2 &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/tags/tag-garbage-2: unofficialFormattedRef: has trailing garbage: '\''
+ warning: refs/tags/tag-garbage-2: trailingRefContent: has trailing garbage: '\''
garbage'\''
@@ -228,16 +244,16 @@ test_expect_success 'regular ref content should be checked (individual)' '
printf "%s garbage\na" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-3 &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/tags/tag-garbage-3: unofficialFormattedRef: has trailing garbage: '\'' garbage
+ warning: refs/tags/tag-garbage-3: trailingRefContent: has trailing garbage: '\'' garbage
a'\''
EOF
rm $tag_dir_prefix/tag-garbage-3 &&
test_cmp expect err &&
printf "%s garbage" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-4 &&
- test_must_fail git -c fsck.unofficialFormattedRef=error refs verify 2>err &&
+ test_must_fail git -c fsck.trailingRefContent=error refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/tags/tag-garbage-4: unofficialFormattedRef: has trailing garbage: '\'' garbage'\''
+ error: refs/tags/tag-garbage-4: trailingRefContent: has trailing garbage: '\'' garbage'\''
EOF
rm $tag_dir_prefix/tag-garbage-4 &&
test_cmp expect err
@@ -266,8 +282,8 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
- warning: refs/heads/branch-garbage: unofficialFormattedRef: has trailing garbage: '\'' garbage'\''
- warning: refs/heads/branch-no-newline: unofficialFormattedRef: misses LF at the end
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
EOF
sort err >sorted_err &&
test_cmp expect sorted_err
@@ -287,10 +303,15 @@ test_expect_success 'textual symref content should be checked (individual)' '
rm $branch_dir_prefix/branch-good &&
test_must_be_empty err &&
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-head &&
+ test_must_be_empty err &&
+
printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/branch-no-newline-1: unofficialFormattedRef: misses LF at the end
+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
EOF
rm $branch_dir_prefix/branch-no-newline-1 &&
test_cmp expect err &&
@@ -298,8 +319,8 @@ test_expect_success 'textual symref content should be checked (individual)' '
printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-1: unofficialFormattedRef: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: unofficialFormattedRef: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
EOF
rm $branch_dir_prefix/a/b/branch-trailing-1 &&
test_cmp expect err &&
@@ -307,7 +328,7 @@ test_expect_success 'textual symref content should be checked (individual)' '
printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-2: unofficialFormattedRef: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
EOF
rm $branch_dir_prefix/a/b/branch-trailing-2 &&
test_cmp expect err &&
@@ -315,7 +336,7 @@ test_expect_success 'textual symref content should be checked (individual)' '
printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-trailing-3: unofficialFormattedRef: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
EOF
rm $branch_dir_prefix/a/b/branch-trailing-3 &&
test_cmp expect err &&
@@ -323,8 +344,8 @@ test_expect_success 'textual symref content should be checked (individual)' '
printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/a/b/branch-complicated: unofficialFormattedRef: misses LF at the end
- warning: refs/heads/a/b/branch-complicated: unofficialFormattedRef: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
EOF
rm $branch_dir_prefix/a/b/branch-complicated &&
test_cmp expect err &&
@@ -332,7 +353,7 @@ test_expect_success 'textual symref content should be checked (individual)' '
printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/branch-bad-1: badReferent: points to invalid refname '\''refs/heads/.branch'\''
+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
EOF
rm $branch_dir_prefix/branch-bad-1 &&
test_cmp expect err
@@ -348,6 +369,7 @@ test_expect_success 'textual symref content should be checked (aggregate)' '
mkdir -p "$branch_dir_prefix/a/b" &&
printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
@@ -357,20 +379,20 @@ test_expect_success 'textual symref content should be checked (aggregate)' '
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/branch-bad-1: badReferent: points to invalid refname '\''refs/heads/.branch'\''
- warning: refs/heads/a/b/branch-complicated: unofficialFormattedRef: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-complicated: unofficialFormattedRef: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-1: unofficialFormattedRef: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-1: unofficialFormattedRef: misses LF at the end
- warning: refs/heads/a/b/branch-trailing-2: unofficialFormattedRef: has trailing whitespaces or newlines
- warning: refs/heads/a/b/branch-trailing-3: unofficialFormattedRef: has trailing whitespaces or newlines
- warning: refs/heads/branch-no-newline-1: unofficialFormattedRef: misses LF at the end
+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
EOF
sort err >sorted_err &&
test_cmp expect sorted_err
'
-test_expect_success 'textual symref should be checked whether it is escaped' '
+test_expect_success 'the target of the textual symref should be checked' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
@@ -379,48 +401,71 @@ test_expect_success 'textual symref should be checked whether it is escaped' '
test_commit default &&
mkdir -p "$branch_dir_prefix/a/b" &&
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err &&
+
+ printf "ref: refs/foo\n" >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err &&
+
printf "ref: refs-back/heads/main\n" >$branch_dir_prefix/branch-bad-1 &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/branch-bad-1: escapeReferent: referent '\''refs-back/heads/main'\'' is outside of refs/ or worktrees/
+ warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''refs-back/heads/main'\''
EOF
rm $branch_dir_prefix/branch-bad-1 &&
test_cmp expect err
'
-test_expect_success 'textual symref escape check should work with worktrees' '
+test_expect_success SYMLINKS 'symlink symref content should be checked' '
test_when_finished "rm -rf repo" &&
git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
cd repo &&
test_commit default &&
- git branch branch-1 &&
- git branch branch-2 &&
- git branch branch-3 &&
- git worktree add ./worktree-1 branch-2 &&
- git worktree add ./worktree-2 branch-3 &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
- (
- cd worktree-1 &&
- git branch refs/worktree/w1-branch &&
- git symbolic-ref refs/worktree/branch-4 refs/heads/branch-1 &&
- git symbolic-ref refs/worktree/branch-5 worktrees/worktree-2/refs/worktree/w2-branch
- ) &&
- (
- cd worktree-2 &&
- git branch refs/worktree/w2-branch &&
- git symbolic-ref refs/worktree/branch-4 refs/heads/branch-1 &&
- git symbolic-ref refs/worktree/branch-5 worktrees/worktree-1/refs/worktree/w1-branch
- ) &&
+ ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+ ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic &&
+ test_cmp expect err &&
- git symbolic-ref refs/heads/branch-5 worktrees/worktree-1/refs/worktree/w1-branch &&
- git symbolic-ref refs/heads/branch-6 worktrees/worktree-2/refs/worktree/w2-branch &&
+ ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
+ error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-bad &&
+ test_cmp expect err &&
- git refs verify 2>err &&
- test_must_be_empty err
+ ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
+ error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
+ EOF
+ rm $tag_dir_prefix/tag-symbolic-1 &&
+ test_cmp expect err
'
-test_expect_success 'all textual symref checks should work with worktrees' '
+test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
cd repo &&
@@ -449,7 +494,7 @@ test_expect_success 'all textual symref checks should work with worktrees' '
printf "%s" $bad_content_1 >$worktree1_refdir_prefix/bad-branch-1 &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/worktree/bad-branch-1: badRefContent: $bad_content_1
+ error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content_1
EOF
rm $worktree1_refdir_prefix/bad-branch-1 &&
test_cmp expect err &&
@@ -457,7 +502,7 @@ test_expect_success 'all textual symref checks should work with worktrees' '
printf "%s" $bad_content_2 >$worktree2_refdir_prefix/bad-branch-2 &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/worktree/bad-branch-2: badRefContent: $bad_content_2
+ error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content_2
EOF
rm $worktree2_refdir_prefix/bad-branch-2 &&
test_cmp expect err &&
@@ -465,7 +510,7 @@ test_expect_success 'all textual symref checks should work with worktrees' '
printf "%s" $bad_content_3 >$worktree1_refdir_prefix/bad-branch-3 &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/worktree/bad-branch-3: badRefContent: $bad_content_3
+ error: worktrees/worktree-1/refs/worktree/bad-branch-3: badRefContent: $bad_content_3
EOF
rm $worktree1_refdir_prefix/bad-branch-3 &&
test_cmp expect err &&
@@ -473,61 +518,17 @@ test_expect_success 'all textual symref checks should work with worktrees' '
printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/worktree/branch-no-newline: unofficialFormattedRef: misses LF at the end
+ warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
EOF
rm $worktree1_refdir_prefix/branch-no-newline &&
test_cmp expect err &&
- printf "%s garbage" "$(git rev-parse HEAD)" >$worktree2_refdir_prefix/branch-garbage &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/worktree/branch-garbage: unofficialFormattedRef: has trailing garbage: '\'' garbage'\''
- EOF
- rm $worktree2_refdir_prefix/branch-garbage
-'
-
-test_expect_success SYMLINKS 'symlink symref content should be checked (individual)' '
- test_when_finished "rm -rf repo" &&
- git init repo &&
- branch_dir_prefix=.git/refs/heads &&
- tag_dir_prefix=.git/refs/tags &&
- cd repo &&
- test_commit default &&
- mkdir -p "$branch_dir_prefix/a/b" &&
-
- ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
- git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
- EOF
- rm $branch_dir_prefix/branch-symbolic-good &&
- test_cmp expect err &&
-
- ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
+ printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
git refs verify 2>err &&
cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
- warning: refs/heads/branch-symbolic: escapeReferent: referent '\''logs/branch-escape'\'' is outside of refs/ or worktrees/
+ warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
EOF
- rm $branch_dir_prefix/branch-symbolic &&
- test_cmp expect err &&
-
- ln -sf ./"branch space" $branch_dir_prefix/branch-symbolic-bad &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
- error: refs/heads/branch-symbolic-bad: badReferent: points to invalid refname '\''refs/heads/branch space'\''
- EOF
- rm $branch_dir_prefix/branch-symbolic-bad &&
- test_cmp expect err &&
-
- ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
- error: refs/tags/tag-symbolic-1: badReferent: points to invalid refname '\''refs/tags/.tag'\''
- EOF
- rm $tag_dir_prefix/tag-symbolic-1 &&
+ rm $worktree1_refdir_prefix/branch-garbage &&
test_cmp expect err
'
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* [PATCH v6 1/9] ref: initialize "fsck_ref_report" with zero
2024-10-21 13:32 ` [PATCH v6 " shejialuo
@ 2024-10-21 13:34 ` shejialuo
2024-10-21 13:34 ` [PATCH v6 2/9] ref: check the full refname instead of basename shejialuo
` (10 subsequent siblings)
11 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-21 13:34 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
In "fsck.c::fsck_refs_error_function", we need to tell whether "oid" and
"referent" are NULL. So, we need to always initialize these members to
NULL instead of letting them point anywhere when creating a new
"fsck_ref_report" structure.
The original code explicitly initializes the "path" member in the
"struct fsck_ref_report" to NULL (which implicitly 0-initializes other
members in the struct). It is more customary to use "{ 0 }" to express
that we are 0-initializing everything. In order to align with the
codebase, initialize "fsck_ref_report" with zero.
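
As a minimal sketch of the point above: in C, members that a designated
initializer does not name are initialized as if they had static storage
duration, so both spellings leave the remaining pointers NULL. The struct
and function names below are hypothetical stand-ins, not the real
definitions from "fsck.h".

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for "struct fsck_ref_report"; the real struct
 * may have different members.
 */
struct demo_ref_report {
	const char *path;
	const void *oid;
	const char *referent;
};

/*
 * Designated initializer: only "path" is named, the members that are
 * not named are implicitly zero-initialized (NULL for pointers).
 */
static struct demo_ref_report make_report_designated(void)
{
	struct demo_ref_report report = { .path = NULL };
	return report;
}

/*
 * "{ 0 }" initializes the first member with 0 and the rest implicitly,
 * which states the intent "zero everything" more directly.
 */
static struct demo_ref_report make_report_zero(void)
{
	struct demo_ref_report report = { 0 };
	return report;
}
```

Both helpers return a report whose "path", "oid" and "referent" are all
NULL; the second form only makes that intent explicit.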
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/files-backend.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0824c0b8a9..03d2503276 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3520,7 +3520,7 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
goto cleanup;
if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
- struct fsck_ref_report report = { .path = NULL };
+ struct fsck_ref_report report = { 0 };
strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
report.path = sb.buf;
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* [PATCH v6 2/9] ref: check the full refname instead of basename
2024-10-21 13:32 ` [PATCH v6 " shejialuo
2024-10-21 13:34 ` [PATCH v6 1/9] ref: initialize "fsck_ref_report" with zero shejialuo
@ 2024-10-21 13:34 ` shejialuo
2024-10-21 15:38 ` karthik nayak
2024-11-05 7:11 ` Patrick Steinhardt
2024-10-21 13:34 ` [PATCH v6 3/9] ref: initialize target name outside of check functions shejialuo
` (9 subsequent siblings)
11 siblings, 2 replies; 209+ messages in thread
From: shejialuo @ 2024-10-21 13:34 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
In "files-backend.c::files_fsck_refs_name", we validate the refname
format by using "check_refname_format" to check the basename of the
iterator with "REFNAME_ALLOW_ONELEVEL" flag.
However, this is a bad implementation. Although we don't allow a
single "@" in the ".git" directory, we do allow "refs/heads/@". So, by
checking the one-level refname "@", we wrongly report an error when
there is a "refs/heads/@" ref.
Because we check only the one-level refname, we also cannot check the
other parts of the full refname. So we miss the following
errors:
"refs/heads/ new-feature/test"
"refs/heads/~new-feature/test"
In order to fix the above problem, enhance "files_fsck_refs_name" to use
the full name for "check_refname_format". Then, replace the tests which
are related to "@" and add tests to exercise the above situations.
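
The difference between checking the basename and checking the full name
can be sketched with a toy checker. This is NOT "check_refname_format";
"toy_refname_ok" and "toy_basename" are hypothetical helpers that only
model the few rules discussed here (a name that is exactly "@" is
rejected, as is any component that is empty, starts with "." or contains
a space or "~").

```c
#include <string.h>

/*
 * Toy validity check, not git's check_refname_format(): returns 1 when
 * the name passes the simplified rules and 0 otherwise.
 */
static int toy_refname_ok(const char *name)
{
	const char *p = name;

	/* A name that consists only of "@" is rejected. */
	if (!strcmp(name, "@"))
		return 0;

	for (;;) {
		/* Length of the current '/'-separated component. */
		size_t len = strcspn(p, "/");

		if (!len || p[0] == '.')
			return 0;
		if (memchr(p, ' ', len) || memchr(p, '~', len))
			return 0;
		if (!p[len])
			return 1; /* that was the last component */
		p += len + 1; /* skip the '/' */
	}
}

/* Return the part after the last '/', or the whole string. */
static const char *toy_basename(const char *path)
{
	const char *slash = strrchr(path, '/');
	return slash ? slash + 1 : path;
}
```

With these helpers, checking only the basename rejects "refs/heads/@"
(a false error) and accepts "refs/heads/~new-feature/test" (a missed
error), while checking the full name gives the intended result for both.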
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/files-backend.c | 4 ++--
t/t0602-reffiles-fsck.sh | 30 +++++++++++++++++++++++-------
2 files changed, 25 insertions(+), 9 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 03d2503276..f246c92684 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3519,10 +3519,10 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
if (iter->basename[0] != '.' && ends_with(iter->basename, ".lock"))
goto cleanup;
- if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
+ strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
+ if (check_refname_format(sb.buf, 0)) {
struct fsck_ref_report report = { 0 };
- strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
report.path = sb.buf;
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_NAME,
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 71a4d1a5ae..0aee377439 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -25,6 +25,13 @@ test_expect_success 'ref name should be checked' '
git tag tag-2 &&
git tag multi_hierarchy/tag-2 &&
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ EOF
+ test_must_be_empty err &&
+ rm $branch_dir_prefix/@ &&
+
cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
@@ -33,20 +40,20 @@ test_expect_success 'ref name should be checked' '
rm $branch_dir_prefix/.branch-1 &&
test_cmp expect err &&
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/'\'' branch-1'\'' &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/heads/@: badRefName: invalid refname format
+ error: refs/heads/ branch-1: badRefName: invalid refname format
EOF
- rm $branch_dir_prefix/@ &&
+ rm $branch_dir_prefix/'\'' branch-1'\'' &&
test_cmp expect err &&
- cp $tag_dir_prefix/multi_hierarchy/tag-2 $tag_dir_prefix/multi_hierarchy/@ &&
+ cp $tag_dir_prefix/multi_hierarchy/tag-2 $tag_dir_prefix/multi_hierarchy/'\''~tag-2'\'' &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: refs/tags/multi_hierarchy/@: badRefName: invalid refname format
+ error: refs/tags/multi_hierarchy/~tag-2: badRefName: invalid refname format
EOF
- rm $tag_dir_prefix/multi_hierarchy/@ &&
+ rm $tag_dir_prefix/multi_hierarchy/'\''~tag-2'\'' &&
test_cmp expect err &&
cp $tag_dir_prefix/tag-1 $tag_dir_prefix/tag-1.lock &&
@@ -60,6 +67,15 @@ test_expect_success 'ref name should be checked' '
error: refs/tags/.lock: badRefName: invalid refname format
EOF
rm $tag_dir_prefix/.lock &&
+ test_cmp expect err &&
+
+ mkdir $tag_dir_prefix/'\''~new-feature'\'' &&
+ cp $tag_dir_prefix/tag-1 $tag_dir_prefix/'\''~new-feature'\''/tag-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/~new-feature/tag-1: badRefName: invalid refname format
+ EOF
+ rm -rf $tag_dir_prefix/'\''~new-feature'\'' &&
test_cmp expect err
'
@@ -84,7 +100,7 @@ test_expect_success 'ref name check should be adapted into fsck messages' '
rm $branch_dir_prefix/.branch-1 &&
test_cmp expect err &&
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/'\''~branch-1'\'' &&
git -c fsck.badRefName=ignore refs verify 2>err &&
test_must_be_empty err
'
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* Re: [PATCH v6 2/9] ref: check the full refname instead of basename
2024-10-21 13:34 ` [PATCH v6 2/9] ref: check the full refname instead of basename shejialuo
@ 2024-10-21 15:38 ` karthik nayak
2024-10-22 11:42 ` shejialuo
2024-11-05 7:11 ` Patrick Steinhardt
1 sibling, 1 reply; 209+ messages in thread
From: karthik nayak @ 2024-10-21 15:38 UTC (permalink / raw)
To: shejialuo, git; +Cc: Patrick Steinhardt, Junio C Hamano
shejialuo <shejialuo@gmail.com> writes:
[snip]
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index 71a4d1a5ae..0aee377439 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -25,6 +25,13 @@ test_expect_success 'ref name should be checked' '
> git tag tag-2 &&
> git tag multi_hierarchy/tag-2 &&
>
> + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
> + git refs verify 2>err &&
> + cat >expect <<-EOF &&
> + EOF
> + test_must_be_empty err &&
> + rm $branch_dir_prefix/@ &&
> +
> cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
> test_must_fail git refs verify 2>err &&
> cat >expect <<-EOF &&
> @@ -33,20 +40,20 @@ test_expect_success 'ref name should be checked' '
> rm $branch_dir_prefix/.branch-1 &&
> test_cmp expect err &&
>
> - cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
> + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/'\'' branch-1'\'' &&
Nit: Here and below we could use ${SQ} instead.
[snip]
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v6 2/9] ref: check the full refname instead of basename
2024-10-21 15:38 ` karthik nayak
@ 2024-10-22 11:42 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-22 11:42 UTC (permalink / raw)
To: karthik nayak; +Cc: git, Patrick Steinhardt, Junio C Hamano
On Mon, Oct 21, 2024 at 10:38:02AM -0500, karthik nayak wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> [snip]
>
> > diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> > index 71a4d1a5ae..0aee377439 100755
> > --- a/t/t0602-reffiles-fsck.sh
> > +++ b/t/t0602-reffiles-fsck.sh
> > @@ -25,6 +25,13 @@ test_expect_success 'ref name should be checked' '
> > git tag tag-2 &&
> > git tag multi_hierarchy/tag-2 &&
> >
> > + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
> > + git refs verify 2>err &&
> > + cat >expect <<-EOF &&
> > + EOF
> > + test_must_be_empty err &&
> > + rm $branch_dir_prefix/@ &&
> > +
> > cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
> > test_must_fail git refs verify 2>err &&
> > cat >expect <<-EOF &&
> > @@ -33,20 +40,20 @@ test_expect_success 'ref name should be checked' '
> > rm $branch_dir_prefix/.branch-1 &&
> > test_cmp expect err &&
> >
> > - cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
> > + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/'\'' branch-1'\'' &&
>
> Nit: Here and below we could use ${SQ} instead.
>
I agree.
> [snip]
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v6 2/9] ref: check the full refname instead of basename
2024-10-21 13:34 ` [PATCH v6 2/9] ref: check the full refname instead of basename shejialuo
2024-10-21 15:38 ` karthik nayak
@ 2024-11-05 7:11 ` Patrick Steinhardt
2024-11-06 12:37 ` shejialuo
1 sibling, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-11-05 7:11 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 21, 2024 at 09:34:22PM +0800, shejialuo wrote:
> In "files-backend.c::files_fsck_refs_name", we validate the refname
> format by using "check_refname_format" to check the basename of the
> iterator with "REFNAME_ALLOW_ONELEVEL" flag.
>
> However, this is a bad implementation. Although we don't allow a
> single "@" in the ".git" directory, we do allow "refs/heads/@". So, by
> checking the one-level refname "@", we wrongly report an error when
> there is a "refs/heads/@" ref.
>
> Because we check only the one-level refname, we also cannot check the
> other parts of the full refname. So we miss the following
> errors:
>
> "refs/heads/ new-feature/test"
> "refs/heads/~new-feature/test"
>
> In order to fix the above problem, enhance "files_fsck_refs_name" to use
> the full name for "check_refname_format". Then, replace the tests which
> are related to "@" and add tests to exercise the above situations.
Okay, makes sense.
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 03d2503276..f246c92684 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -3519,10 +3519,10 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
> if (iter->basename[0] != '.' && ends_with(iter->basename, ".lock"))
> goto cleanup;
>
> - if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
> + strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
> + if (check_refname_format(sb.buf, 0)) {
> struct fsck_ref_report report = { 0 };
>
> - strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
> report.path = sb.buf;
> ret = fsck_report_ref(o, &report,
> FSCK_MSG_BAD_REF_NAME,
So this only works right now because we never check root refs in the
first place? Maybe that is worth a comment.
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index 71a4d1a5ae..0aee377439 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -25,6 +25,13 @@ test_expect_success 'ref name should be checked' '
> git tag tag-2 &&
> git tag multi_hierarchy/tag-2 &&
>
> + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
> + git refs verify 2>err &&
> + cat >expect <<-EOF &&
> + EOF
> + test_must_be_empty err &&
> + rm $branch_dir_prefix/@ &&
`expect` isn't used here as you use `test_must_be_empty`.
> cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
> test_must_fail git refs verify 2>err &&
> cat >expect <<-EOF &&
> @@ -33,20 +40,20 @@ test_expect_success 'ref name should be checked' '
> rm $branch_dir_prefix/.branch-1 &&
> test_cmp expect err &&
>
> - cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
> + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/'\'' branch-1'\'' &&
> test_must_fail git refs verify 2>err &&
> cat >expect <<-EOF &&
> - error: refs/heads/@: badRefName: invalid refname format
> + error: refs/heads/ branch-1: badRefName: invalid refname format
> EOF
> - rm $branch_dir_prefix/@ &&
> + rm $branch_dir_prefix/'\'' branch-1'\'' &&
> test_cmp expect err &&
Okay, we now allow `refs/heads/@`, but still don't allow other bad
formatting like spaces in the refname.
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v6 2/9] ref: check the full refname instead of basename
2024-11-05 7:11 ` Patrick Steinhardt
@ 2024-11-06 12:37 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-06 12:37 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Tue, Nov 05, 2024 at 08:11:42AM +0100, Patrick Steinhardt wrote:
[snip]
> > diff --git a/refs/files-backend.c b/refs/files-backend.c
> > index 03d2503276..f246c92684 100644
> > --- a/refs/files-backend.c
> > +++ b/refs/files-backend.c
> > @@ -3519,10 +3519,10 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
> > if (iter->basename[0] != '.' && ends_with(iter->basename, ".lock"))
> > goto cleanup;
> >
> > - if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
> > + strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
> > + if (check_refname_format(sb.buf, 0)) {
> > struct fsck_ref_report report = { 0 };
> >
> > - strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
> > report.path = sb.buf;
> > ret = fsck_report_ref(o, &report,
> > FSCK_MSG_BAD_REF_NAME,
>
> So this only works right now because we never check root refs in the
> first place? Maybe that is worth a comment.
>
Yes, I agree. I will improve this in the next version.
> > diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> > index 71a4d1a5ae..0aee377439 100755
> > --- a/t/t0602-reffiles-fsck.sh
> > +++ b/t/t0602-reffiles-fsck.sh
> > @@ -25,6 +25,13 @@ test_expect_success 'ref name should be checked' '
> > git tag tag-2 &&
> > git tag multi_hierarchy/tag-2 &&
> >
> > + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
> > + git refs verify 2>err &&
> > + cat >expect <<-EOF &&
> > + EOF
> > + test_must_be_empty err &&
> > + rm $branch_dir_prefix/@ &&
>
> `expect` isn't used here as you use `test_must_be_empty`.
>
Thanks, I will improve this in the next version.
> > cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
> > test_must_fail git refs verify 2>err &&
> > cat >expect <<-EOF &&
> > @@ -33,20 +40,20 @@ test_expect_success 'ref name should be checked' '
> > rm $branch_dir_prefix/.branch-1 &&
> > test_cmp expect err &&
> >
> > - cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
> > + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/'\'' branch-1'\'' &&
> > test_must_fail git refs verify 2>err &&
> > cat >expect <<-EOF &&
> > - error: refs/heads/@: badRefName: invalid refname format
> > + error: refs/heads/ branch-1: badRefName: invalid refname format
> > EOF
> > - rm $branch_dir_prefix/@ &&
> > + rm $branch_dir_prefix/'\'' branch-1'\'' &&
> > test_cmp expect err &&
>
> Okay, we now allow `refs/heads/@`, but still don't allow other bad
> formatting like spaces in the refname.
>
Yes, this is a mistake. Junio has pointed this out for this patch, and I
have realized it.
https://lore.kernel.org/git/xmqqjzei1mtb.fsf@gitster.g/
> Patrick
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 209+ messages in thread
* [PATCH v6 3/9] ref: initialize target name outside of check functions
2024-10-21 13:32 ` [PATCH v6 " shejialuo
2024-10-21 13:34 ` [PATCH v6 1/9] ref: initialize "fsck_ref_report" with zero shejialuo
2024-10-21 13:34 ` [PATCH v6 2/9] ref: check the full refname instead of basename shejialuo
@ 2024-10-21 13:34 ` shejialuo
2024-10-21 15:49 ` karthik nayak
2024-11-05 7:11 ` Patrick Steinhardt
2024-10-21 13:34 ` [PATCH v6 4/9] ref: support multiple worktrees check for refs shejialuo
` (8 subsequent siblings)
11 siblings, 2 replies; 209+ messages in thread
From: shejialuo @ 2024-10-21 13:34 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We pass "refs_check_dir" to the "files_fsck_refs_name" function which
allows it to create the checked ref name later. However, when we
introduce a new check function, we have to re-calculate the target name.
It's bad for us to repeat this calculation. Instead, we should calculate
it only once and pass the target name to the check functions.
In order to avoid the repeated calculation, rename "refs_check_dir" to
"target_name". And in "files_fsck_refs_dir", create a new strbuf
"target_name", thus whenever we handle a new target, calculate the
name and call the check functions one by one.
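
The idea can be sketched as follows: build the name once per entry, then
hand the same string to every check in a NULL-terminated array. This is
only an illustration under simplifying assumptions; "demo_check_fn",
the two checks and "demo_run_checks" are hypothetical, a fixed buffer
replaces the strbuf, and the real "files_fsck_refs_fn" takes more
parameters.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified check-function type. */
typedef int (*demo_check_fn)(const char *target_name);

/* Illustrative check: the last component must not start with '.'. */
static int demo_check_not_hidden(const char *target_name)
{
	const char *slash = strrchr(target_name, '/');
	const char *base = slash ? slash + 1 : target_name;

	return base[0] == '.' ? -1 : 0;
}

/* Illustrative check: the name must not contain a space. */
static int demo_check_no_space(const char *target_name)
{
	return strchr(target_name, ' ') ? -1 : 0;
}

/*
 * Build "<dir>/<relative_path>" once, then pass it to every check.
 * Returns 0 when all checks pass and -1 otherwise.
 */
static int demo_run_checks(const char *dir, const char *relative_path,
			   demo_check_fn *checks)
{
	char target_name[4096];
	int ret = 0;
	int n = snprintf(target_name, sizeof(target_name), "%s/%s",
			 dir, relative_path);

	if (n < 0 || (size_t)n >= sizeof(target_name))
		return -1; /* name did not fit; a failure in this sketch */

	for (size_t i = 0; checks[i]; i++)
		if (checks[i](target_name))
			ret = -1;
	return ret;
}
```

A caller would pass an array such as
"{ demo_check_not_hidden, demo_check_no_space, NULL }"; adding a new
check then means adding one array entry, without formatting the name
again.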
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/files-backend.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index f246c92684..fbfcd1115c 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3501,12 +3501,12 @@ static int files_ref_store_remove_on_disk(struct ref_store *ref_store,
*/
typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
struct fsck_options *o,
- const char *refs_check_dir,
+ const char *target_name,
struct dir_iterator *iter);
static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
struct fsck_options *o,
- const char *refs_check_dir,
+ const char *target_name,
struct dir_iterator *iter)
{
struct strbuf sb = STRBUF_INIT;
@@ -3519,11 +3519,10 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
if (iter->basename[0] != '.' && ends_with(iter->basename, ".lock"))
goto cleanup;
- strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
- if (check_refname_format(sb.buf, 0)) {
+ if (check_refname_format(target_name, 0)) {
struct fsck_ref_report report = { 0 };
- report.path = sb.buf;
+ report.path = target_name;
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_NAME,
"invalid refname format");
@@ -3539,6 +3538,7 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
const char *refs_check_dir,
files_fsck_refs_fn *fsck_refs_fn)
{
+ struct strbuf target_name = STRBUF_INIT;
struct strbuf sb = STRBUF_INIT;
struct dir_iterator *iter;
int iter_status;
@@ -3557,11 +3557,15 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
continue;
} else if (S_ISREG(iter->st.st_mode) ||
S_ISLNK(iter->st.st_mode)) {
+ strbuf_reset(&target_name);
+ strbuf_addf(&target_name, "%s/%s", refs_check_dir,
+ iter->relative_path);
+
if (o->verbose)
- fprintf_ln(stderr, "Checking %s/%s",
- refs_check_dir, iter->relative_path);
+ fprintf_ln(stderr, "Checking %s", target_name.buf);
+
for (size_t i = 0; fsck_refs_fn[i]; i++) {
- if (fsck_refs_fn[i](ref_store, o, refs_check_dir, iter))
+ if (fsck_refs_fn[i](ref_store, o, target_name.buf, iter))
ret = -1;
}
} else {
@@ -3578,6 +3582,7 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
out:
strbuf_release(&sb);
+ strbuf_release(&target_name);
return ret;
}
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* Re: [PATCH v6 3/9] ref: initialize target name outside of check functions
2024-10-21 13:34 ` [PATCH v6 3/9] ref: initialize target name outside of check functions shejialuo
@ 2024-10-21 15:49 ` karthik nayak
2024-11-05 7:11 ` Patrick Steinhardt
1 sibling, 0 replies; 209+ messages in thread
From: karthik nayak @ 2024-10-21 15:49 UTC (permalink / raw)
To: shejialuo, git; +Cc: Patrick Steinhardt, Junio C Hamano
shejialuo <shejialuo@gmail.com> writes:
> We pass "refs_check_dir" to the "files_fsck_refs_name" function which
> allows it to create the checked ref name later. However, when we
> introduce a new check function, we have to re-calculate the target name.
> It's bad for us to repeat this calculation. Instead, we should calculate
> it only once and pass the target name to the check functions.
>
> In order to avoid the repeated calculation, rename "refs_check_dir" to
> "target_name". And in "files_fsck_refs_dir", create a new strbuf
Nit: Why `target_name` and not simply `target`?
> "target_name", thus whenever we handle a new target, calculate the
> name and call the check functions one by one.
>
> Mentored-by: Patrick Steinhardt <ps@pks.im>
> Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
> refs/files-backend.c | 21 +++++++++++++--------
> 1 file changed, 13 insertions(+), 8 deletions(-)
>
[snip]
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v6 3/9] ref: initialize target name outside of check functions
2024-10-21 13:34 ` [PATCH v6 3/9] ref: initialize target name outside of check functions shejialuo
2024-10-21 15:49 ` karthik nayak
@ 2024-11-05 7:11 ` Patrick Steinhardt
2024-11-06 12:32 ` shejialuo
1 sibling, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-11-05 7:11 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 21, 2024 at 09:34:31PM +0800, shejialuo wrote:
> We pass "refs_check_dir" to the "files_fsck_refs_name" function which
> allows it to create the checked ref name later. However, when we
> introduce a new check function, we have to re-calculate the target name.
> It's bad for us to repeat this calculation. Instead, we should calculate
> it only once and pass the target name to the check functions.
It would be nice to clarify what exactly is bad about it. Does it create
extra memory churn? Or is this about not duplicating logic?
> In order to avoid the repeated calculation, rename "refs_check_dir" to
> "target_name". And in "files_fsck_refs_dir", create a new strbuf
> "target_name", thus whenever we handle a new target, calculate the
> name and call the check functions one by one.
"target_name" is somewhat of a weird name. I'd expect that this is
either the path to the reference, in which case I'd call this "path", or
the name of the reference that is to be checked, in which case I'd call
this "refname".
> @@ -3539,6 +3538,7 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
> const char *refs_check_dir,
> files_fsck_refs_fn *fsck_refs_fn)
> {
> + struct strbuf target_name = STRBUF_INIT;
> struct strbuf sb = STRBUF_INIT;
> struct dir_iterator *iter;
> int iter_status;
> @@ -3557,11 +3557,15 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
> continue;
> } else if (S_ISREG(iter->st.st_mode) ||
> S_ISLNK(iter->st.st_mode)) {
> + strbuf_reset(&target_name);
> + strbuf_addf(&target_name, "%s/%s", refs_check_dir,
> + iter->relative_path);
> +
> if (o->verbose)
> - fprintf_ln(stderr, "Checking %s/%s",
> - refs_check_dir, iter->relative_path);
> + fprintf_ln(stderr, "Checking %s", target_name.buf);
> +
> for (size_t i = 0; fsck_refs_fn[i]; i++) {
> - if (fsck_refs_fn[i](ref_store, o, refs_check_dir, iter))
> + if (fsck_refs_fn[i](ref_store, o, target_name.buf, iter))
> ret = -1;
> }
> } else {
The change itself does make sense though. We indeed avoid reallocating
the array for every single ref, which is a worthwhile change.
I was wondering whether we could reuse `sb` here, but we do use it at
the end of the function to potentially print an error message.
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v6 3/9] ref: initialize target name outside of check functions
2024-11-05 7:11 ` Patrick Steinhardt
@ 2024-11-06 12:32 ` shejialuo
2024-11-06 13:14 ` Patrick Steinhardt
0 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-11-06 12:32 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Tue, Nov 05, 2024 at 08:11:46AM +0100, Patrick Steinhardt wrote:
> On Mon, Oct 21, 2024 at 09:34:31PM +0800, shejialuo wrote:
> > We passes "refs_check_dir" to the "files_fsck_refs_name" function which
> > allows it to create the checked ref name later. However, when we
> > introduce a new check function, we have to re-calculate the target name.
> > It's bad for us to do repeat calculation. Instead, we should calculate
> > it only once and pass the target name to the check functions.
>
> It would be nice to clarify what exactly is bad about it. Does it create
> extra memory churn? Or is this about not duplicating logic?
>
Thanks, I will improve this in the next version.
> > In order not to do repeat calculation, rename "refs_check_dir" to
> > "target_name". And in "files_fsck_refs_dir", create a new strbuf
> > "target_name", thus whenever we handle a new target, calculate the
> > name and call the check functions one by one.
>
> "target_name" is somewhat of a weird name. I'd expect that this is
> either the path to the reference, in which case I'd call this "path", or
> the name of the reference that is to be checked, in which case I'd call
> this "refname".
>
I found it quite hard to name this variable when I wrote the code.
"refname" is not suitable because we may check reflogs later by calling
the "files_fsck_refs_dir" function.
So, we should use "path" here.
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v6 3/9] ref: initialize target name outside of check functions
2024-11-06 12:32 ` shejialuo
@ 2024-11-06 13:14 ` Patrick Steinhardt
0 siblings, 0 replies; 209+ messages in thread
From: Patrick Steinhardt @ 2024-11-06 13:14 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Wed, Nov 06, 2024 at 08:32:19PM +0800, shejialuo wrote:
> On Tue, Nov 05, 2024 at 08:11:46AM +0100, Patrick Steinhardt wrote:
> > On Mon, Oct 21, 2024 at 09:34:31PM +0800, shejialuo wrote:
> > > In order not to do repeat calculation, rename "refs_check_dir" to
> > > "target_name". And in "files_fsck_refs_dir", create a new strbuf
> > > "target_name", thus whenever we handle a new target, calculate the
> > > name and call the check functions one by one.
> >
> > "target_name" is somewhat of a weird name. I'd expect that this is
> > either the path to the reference, in which case I'd call this "path", or
> > the name of the reference that is to be checked, in which case I'd call
> > this "refname".
> >
>
> I felt quite hard to name this variable when I wrote the code. "refname"
> is not suitable due to we may check the reflog later by calling
> "files_fsck_refs_dir" function.
I anticipate that we'll likely have separate infra for checking reflogs
as they are both stored in a different directory and because their
format is completely different compared to normal refs. So there isn't
really too much of a point to plan ahead for sharing logic here, I'd
think, and thus "refname" might be a better fit. If that changes in the
future we can still refactor the code.
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* [PATCH v6 4/9] ref: support multiple worktrees check for refs
2024-10-21 13:32 ` [PATCH v6 " shejialuo
` (2 preceding siblings ...)
2024-10-21 13:34 ` [PATCH v6 3/9] ref: initialize target name outside of check functions shejialuo
@ 2024-10-21 13:34 ` shejialuo
2024-10-21 15:56 ` karthik nayak
2024-11-05 7:11 ` Patrick Steinhardt
2024-10-21 13:34 ` [PATCH v6 5/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
` (7 subsequent siblings)
11 siblings, 2 replies; 209+ messages in thread
From: shejialuo @ 2024-10-21 13:34 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We have already set up the infrastructure to check the consistency for
refs, but we do not support multiple worktrees. As we decide to add more
checks for ref content, we need to set up support for multiple
worktrees.
Because each worktree has its own specific refs, instead of just showing
the users "refs/worktree/foo", we need to display the full name such as
"worktrees/<id>/refs/worktree/foo". So we should know the id of the
worktree to get the full name. Add a new parameter "struct worktree *"
for "refs-internal.h::fsck_fn". Then change the related functions to
follow this new interface.
The "packed-refs" file only exists in the main worktree, so we should
only check "packed-refs" in the main worktree. Use the "is_main_worktree"
function to skip checking "packed-refs" in the "packed_fsck" function.
Then, enhance the "files-backend.c::files_fsck_refs_dir" function to add
the "worktrees/<id>/" prefix when we are not in the main worktree.
Last, add a new test to check the refname when there are multiple
worktrees to exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
builtin/refs.c | 12 ++++++--
refs.c | 5 ++--
refs.h | 3 +-
refs/debug.c | 5 ++--
refs/files-backend.c | 17 ++++++++----
refs/packed-backend.c | 8 +++++-
refs/refs-internal.h | 3 +-
refs/reftable-backend.c | 3 +-
t/t0602-reffiles-fsck.sh | 59 ++++++++++++++++++++++++++++++++++++++++
9 files changed, 100 insertions(+), 15 deletions(-)
diff --git a/builtin/refs.c b/builtin/refs.c
index 24978a7b7b..886c4ceae3 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -5,6 +5,7 @@
#include "parse-options.h"
#include "refs.h"
#include "strbuf.h"
+#include "worktree.h"
#define REFS_MIGRATE_USAGE \
N_("git refs migrate --ref-format=<format> [--dry-run]")
@@ -66,6 +67,7 @@ static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
{
struct fsck_options fsck_refs_options = FSCK_REFS_OPTIONS_DEFAULT;
+ struct worktree **worktrees, **p;
const char * const verify_usage[] = {
REFS_VERIFY_USAGE,
NULL,
@@ -75,7 +77,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
OPT_BOOL(0, "strict", &fsck_refs_options.strict, N_("enable strict checking")),
OPT_END(),
};
- int ret;
+ int ret = 0;
argc = parse_options(argc, argv, prefix, options, verify_usage, 0);
if (argc)
@@ -84,9 +86,15 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
git_config(git_fsck_config, &fsck_refs_options);
prepare_repo_settings(the_repository);
- ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options);
+ worktrees = get_worktrees();
+ for (p = worktrees; *p; p++) {
+ struct worktree *wt = *p;
+ ret |= refs_fsck(get_worktree_ref_store(wt), &fsck_refs_options, wt);
+ }
+
fsck_options_clear(&fsck_refs_options);
+ free_worktrees(worktrees);
return ret;
}
diff --git a/refs.c b/refs.c
index 5f729ed412..395a17273c 100644
--- a/refs.c
+++ b/refs.c
@@ -318,9 +318,10 @@ int check_refname_format(const char *refname, int flags)
return check_or_sanitize_refname(refname, flags, NULL);
}
-int refs_fsck(struct ref_store *refs, struct fsck_options *o)
+int refs_fsck(struct ref_store *refs, struct fsck_options *o,
+ struct worktree *wt)
{
- return refs->be->fsck(refs, o);
+ return refs->be->fsck(refs, o, wt);
}
void sanitize_refname_component(const char *refname, struct strbuf *out)
diff --git a/refs.h b/refs.h
index 108dfc93b3..341d43239c 100644
--- a/refs.h
+++ b/refs.h
@@ -549,7 +549,8 @@ int check_refname_format(const char *refname, int flags);
* reflogs are consistent, and non-zero otherwise. The errors will be
* written to stderr.
*/
-int refs_fsck(struct ref_store *refs, struct fsck_options *o);
+int refs_fsck(struct ref_store *refs, struct fsck_options *o,
+ struct worktree *wt);
/*
* Apply the rules from check_refname_format, but mutate the result until it
diff --git a/refs/debug.c b/refs/debug.c
index 45e2e784a0..72e80ddd6d 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -420,10 +420,11 @@ static int debug_reflog_expire(struct ref_store *ref_store, const char *refname,
}
static int debug_fsck(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
struct debug_ref_store *drefs = (struct debug_ref_store *)ref_store;
- int res = drefs->refs->be->fsck(drefs->refs, o);
+ int res = drefs->refs->be->fsck(drefs->refs, o, wt);
trace_printf_key(&trace_refs, "fsck: %d\n", res);
return res;
}
diff --git a/refs/files-backend.c b/refs/files-backend.c
index fbfcd1115c..24ad73faba 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -23,6 +23,7 @@
#include "../dir.h"
#include "../chdir-notify.h"
#include "../setup.h"
+#include "../worktree.h"
#include "../wrapper.h"
#include "../write-or-die.h"
#include "../revision.h"
@@ -3536,6 +3537,7 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
static int files_fsck_refs_dir(struct ref_store *ref_store,
struct fsck_options *o,
const char *refs_check_dir,
+ struct worktree *wt,
files_fsck_refs_fn *fsck_refs_fn)
{
struct strbuf target_name = STRBUF_INIT;
@@ -3558,6 +3560,9 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
} else if (S_ISREG(iter->st.st_mode) ||
S_ISLNK(iter->st.st_mode)) {
strbuf_reset(&target_name);
+
+ if (!is_main_worktree(wt))
+ strbuf_addf(&target_name, "worktrees/%s/", wt->id);
strbuf_addf(&target_name, "%s/%s", refs_check_dir,
iter->relative_path);
@@ -3587,7 +3592,8 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
}
static int files_fsck_refs(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
files_fsck_refs_fn fsck_refs_fn[]= {
files_fsck_refs_name,
@@ -3596,17 +3602,18 @@ static int files_fsck_refs(struct ref_store *ref_store,
if (o->verbose)
fprintf_ln(stderr, _("Checking references consistency"));
- return files_fsck_refs_dir(ref_store, o, "refs", fsck_refs_fn);
+ return files_fsck_refs_dir(ref_store, o, "refs", wt, fsck_refs_fn);
}
static int files_fsck(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
struct files_ref_store *refs =
files_downcast(ref_store, REF_STORE_READ, "fsck");
- return files_fsck_refs(ref_store, o) |
- refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
+ return files_fsck_refs(ref_store, o, wt) |
+ refs->packed_ref_store->be->fsck(refs->packed_ref_store, o, wt);
}
struct ref_storage_be refs_be_files = {
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 07c57fd541..46dcaec654 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -13,6 +13,7 @@
#include "../lockfile.h"
#include "../chdir-notify.h"
#include "../statinfo.h"
+#include "../worktree.h"
#include "../wrapper.h"
#include "../write-or-die.h"
#include "../trace2.h"
@@ -1754,8 +1755,13 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
}
static int packed_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED)
+ struct fsck_options *o UNUSED,
+ struct worktree *wt)
{
+
+ if (!is_main_worktree(wt))
+ return 0;
+
return 0;
}
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 2313c830d8..037d7991cd 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -653,7 +653,8 @@ typedef int read_symbolic_ref_fn(struct ref_store *ref_store, const char *refnam
struct strbuf *referent);
typedef int fsck_fn(struct ref_store *ref_store,
- struct fsck_options *o);
+ struct fsck_options *o,
+ struct worktree *wt);
struct ref_storage_be {
const char *name;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index f5f957e6de..b6a63c1015 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -2443,7 +2443,8 @@ static int reftable_be_reflog_expire(struct ref_store *ref_store,
}
static int reftable_be_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED)
+ struct fsck_options *o UNUSED,
+ struct worktree *wt UNUSED)
{
return 0;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 0aee377439..6eb1385c50 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -105,4 +105,63 @@ test_expect_success 'ref name check should be adapted into fsck messages' '
test_must_be_empty err
'
+test_expect_success 'ref name check should work for multiple worktrees' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+
+ cd repo &&
+ test_commit initial &&
+ git checkout -b branch-1 &&
+ test_commit second &&
+ git checkout -b branch-2 &&
+ test_commit third &&
+ git checkout -b branch-3 &&
+ git worktree add ./worktree-1 branch-1 &&
+ git worktree add ./worktree-2 branch-2 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+
+ cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
+ cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err &&
+
+ (
+ cd worktree-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ ) &&
+
+ (
+ cd worktree-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+ )
+'
+
test_done
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* Re: [PATCH v6 4/9] ref: support multiple worktrees check for refs
2024-10-21 13:34 ` [PATCH v6 4/9] ref: support multiple worktrees check for refs shejialuo
@ 2024-10-21 15:56 ` karthik nayak
2024-10-22 11:44 ` shejialuo
2024-11-05 7:11 ` Patrick Steinhardt
1 sibling, 1 reply; 209+ messages in thread
From: karthik nayak @ 2024-10-21 15:56 UTC (permalink / raw)
To: shejialuo, git; +Cc: Patrick Steinhardt, Junio C Hamano
shejialuo <shejialuo@gmail.com> writes:
[snip]
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index 0aee377439..6eb1385c50 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -105,4 +105,63 @@ test_expect_success 'ref name check should be adapted into fsck messages' '
> test_must_be_empty err
> '
>
> +test_expect_success 'ref name check should work for multiple worktrees' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> +
> + cd repo &&
> + test_commit initial &&
> + git checkout -b branch-1 &&
> + test_commit second &&
> + git checkout -b branch-2 &&
> + test_commit third &&
> + git checkout -b branch-3 &&
> + git worktree add ./worktree-1 branch-1 &&
> + git worktree add ./worktree-2 branch-2 &&
> + worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
> + worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
> +
> + (
> + cd worktree-1 &&
> + git update-ref refs/worktree/branch-4 refs/heads/branch-3
> + ) &&
> + (
> + cd worktree-2 &&
> + git update-ref refs/worktree/branch-4 refs/heads/branch-3
> + ) &&
> +
> + cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
> + cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
> +
> + test_must_fail git refs verify 2>err &&
> + cat >expect <<-EOF &&
> + error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
> + error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
> + EOF
> + sort err >sorted_err &&
> + test_cmp expect sorted_err &&
> +
> + (
> + cd worktree-1 &&
> + test_must_fail git refs verify 2>err &&
> + cat >expect <<-EOF &&
> + error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
> + error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
> + EOF
> + sort err >sorted_err &&
> + test_cmp expect sorted_err
> + ) &&
> +
> + (
> + cd worktree-2 &&
> + test_must_fail git refs verify 2>err &&
> + cat >expect <<-EOF &&
> + error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
> + error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
> + EOF
> + sort err >sorted_err &&
> + test_cmp expect sorted_err
> + )
These last three checks are the same; couldn't we use a loop?
for dir in "." "worktree-1" "worktree-2"
do
...
done
> +'
> +
> test_done
> --
> 2.47.0
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v6 4/9] ref: support multiple worktrees check for refs
2024-10-21 15:56 ` karthik nayak
@ 2024-10-22 11:44 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-22 11:44 UTC (permalink / raw)
To: karthik nayak; +Cc: git, Patrick Steinhardt, Junio C Hamano
On Mon, Oct 21, 2024 at 10:56:30AM -0500, karthik nayak wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> [snip]
>
> > diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> > index 0aee377439..6eb1385c50 100755
> > --- a/t/t0602-reffiles-fsck.sh
> > +++ b/t/t0602-reffiles-fsck.sh
> > @@ -105,4 +105,63 @@ test_expect_success 'ref name check should be adapted into fsck messages' '
> > test_must_be_empty err
> > '
> >
> > +test_expect_success 'ref name check should work for multiple worktrees' '
> > + test_when_finished "rm -rf repo" &&
> > + git init repo &&
> > +
> > + cd repo &&
> > + test_commit initial &&
> > + git checkout -b branch-1 &&
> > + test_commit second &&
> > + git checkout -b branch-2 &&
> > + test_commit third &&
> > + git checkout -b branch-3 &&
> > + git worktree add ./worktree-1 branch-1 &&
> > + git worktree add ./worktree-2 branch-2 &&
> > + worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
> > + worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
> > +
> > + (
> > + cd worktree-1 &&
> > + git update-ref refs/worktree/branch-4 refs/heads/branch-3
> > + ) &&
> > + (
> > + cd worktree-2 &&
> > + git update-ref refs/worktree/branch-4 refs/heads/branch-3
> > + ) &&
> > +
> > + cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
> > + cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
> > +
> > + test_must_fail git refs verify 2>err &&
> > + cat >expect <<-EOF &&
> > + error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
> > + error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
> > + EOF
> > + sort err >sorted_err &&
> > + test_cmp expect sorted_err &&
> > +
> > + (
> > + cd worktree-1 &&
> > + test_must_fail git refs verify 2>err &&
> > + cat >expect <<-EOF &&
> > + error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
> > + error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
> > + EOF
> > + sort err >sorted_err &&
> > + test_cmp expect sorted_err
> > + ) &&
> > +
> > + (
> > + cd worktree-2 &&
> > + test_must_fail git refs verify 2>err &&
> > + cat >expect <<-EOF &&
> > + error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
> > + error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
> > + EOF
> > + sort err >sorted_err &&
> > + test_cmp expect sorted_err
> > + )
>
> These last three loops are the same, couldn't we loop?
>
> for dir in "." "worktree-1" "worktree-2"
> do
> ...
> done
>
Actually, I guess all the tests could be written that way. I will
refactor them in the next version to make the tests cleaner.
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v6 4/9] ref: support multiple worktrees check for refs
2024-10-21 13:34 ` [PATCH v6 4/9] ref: support multiple worktrees check for refs shejialuo
2024-10-21 15:56 ` karthik nayak
@ 2024-11-05 7:11 ` Patrick Steinhardt
2024-11-05 12:52 ` shejialuo
1 sibling, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-11-05 7:11 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 21, 2024 at 09:34:40PM +0800, shejialuo wrote:
> We have already set up the infrastructure to check the consistency for
> refs, but we do not support multiple worktrees. As we decide to add more
> checks for ref content, we need to set up support for multiple
> worktrees.
I don't quite follow that logic: the fact that we perform more checks
for the ref content doesn't necessarily mean that we also have to check
worktree refs. We rather want to do that so that we get feature parity
with git-fsck(1) eventually, don't we?
> @@ -66,6 +67,7 @@ static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
> static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
> {
> struct fsck_options fsck_refs_options = FSCK_REFS_OPTIONS_DEFAULT;
> + struct worktree **worktrees, **p;
> const char * const verify_usage[] = {
> REFS_VERIFY_USAGE,
> NULL,
Instead of declaring the `**p` variable we can instead...
> @@ -84,9 +86,15 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
> git_config(git_fsck_config, &fsck_refs_options);
> prepare_repo_settings(the_repository);
>
> - ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options);
> + worktrees = get_worktrees();
> + for (p = worktrees; *p; p++) {
> + struct worktree *wt = *p;
> + ret |= refs_fsck(get_worktree_ref_store(wt), &fsck_refs_options, wt);
> + }
> +
... refactor this loop like this:
for (size_t i = 0; worktrees[i]; i++)
ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
&fsck_refs_options, worktrees[i]);
I was briefly wondering whether we also get worktrees in case the repo
is bare, as we don't actually have a proper worktree there. But the
answer seems to be "yes".
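For illustration, the index-based form of the loop can be tried outside
of git. The sketch below is self-contained and assumes nothing about
git's internals: `struct wt_stub`, `check_one()` and `check_all()` are
made-up names standing in for `struct worktree`, `refs_fsck()` and the
loop in `cmd_refs_verify()`. It only shows walking a NULL-terminated
array and OR-ing the per-entry results so that a single failure is not
lost.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up stand-in for "struct worktree"; only what the sketch needs. */
struct wt_stub {
	const char *id;
	int has_bad_ref;
};

/* Stand-in for refs_fsck(): 0 when consistent, non-zero otherwise. */
static int check_one(const struct wt_stub *wt)
{
	assert(wt);
	return wt->has_bad_ref ? -1 : 0;
}

/* Walk a NULL-terminated array and accumulate every result. */
static int check_all(struct wt_stub **worktrees)
{
	int ret = 0;

	for (size_t i = 0; worktrees[i]; i++)
		ret |= check_one(worktrees[i]);
	return ret;
}
```

Because the results are OR-ed together, every entry is still visited
after the first failure, so all problems get reported in one run.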
> @@ -3558,6 +3560,9 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
> } else if (S_ISREG(iter->st.st_mode) ||
> S_ISLNK(iter->st.st_mode)) {
> strbuf_reset(&target_name);
> +
> + if (!is_main_worktree(wt))
> + strbuf_addf(&target_name, "worktrees/%s/", wt->id);
> strbuf_addf(&target_name, "%s/%s", refs_check_dir,
> iter->relative_path);
>
Hm. Isn't it somewhat duplicate to pass both the prepended target name
_and_ the worktree to the callback? I imagine that we'd have to
eventually strip the worktree prefix to find the correct ref, unless we
end up using the main ref store to look up the ref.
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 07c57fd541..46dcaec654 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -13,6 +13,7 @@
> #include "../lockfile.h"
> #include "../chdir-notify.h"
> #include "../statinfo.h"
> +#include "../worktree.h"
> #include "../wrapper.h"
> #include "../write-or-die.h"
> #include "../trace2.h"
> @@ -1754,8 +1755,13 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> }
>
> static int packed_fsck(struct ref_store *ref_store UNUSED,
> - struct fsck_options *o UNUSED)
> + struct fsck_options *o UNUSED,
> + struct worktree *wt)
> {
> +
> + if (!is_main_worktree(wt))
> + return 0;
> +
> return 0;
> }
It's somewhat funny to have this condition here, but it does make sense
overall as worktrees never have packed refs in the first place.
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v6 4/9] ref: support multiple worktrees check for refs
2024-11-05 7:11 ` Patrick Steinhardt
@ 2024-11-05 12:52 ` shejialuo
2024-11-06 6:34 ` Patrick Steinhardt
0 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-11-05 12:52 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Tue, Nov 05, 2024 at 08:11:49AM +0100, Patrick Steinhardt wrote:
> On Mon, Oct 21, 2024 at 09:34:40PM +0800, shejialuo wrote:
> > We have already set up the infrastructure to check the consistency for
> > refs, but we do not support multiple worktrees. As we decide to add more
> > checks for ref content, we need to set up support for multiple
> > worktrees.
>
> I don't quite follow that logic: the fact that we perform more checks
> for the ref content doesn't necessarily mean that we also have to check
> worktree refs. We rather want to do that so that we get feature parity
> with git-fsck(1) eventually, don't we?
>
Yes, I agree. I now recall why I wrote this message. In the very early
implementation, I didn't consider the worktree situation for the
"escape", and I thought I should add support for worktrees because of
that. So, this reasoning was a mistake.
[snip]
> > @@ -3558,6 +3560,9 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
> > } else if (S_ISREG(iter->st.st_mode) ||
> > S_ISLNK(iter->st.st_mode)) {
> > strbuf_reset(&target_name);
> > +
> > + if (!is_main_worktree(wt))
> > + strbuf_addf(&target_name, "worktrees/%s/", wt->id);
> > strbuf_addf(&target_name, "%s/%s", refs_check_dir,
> > iter->relative_path);
> >
>
> Hm. Isn't it somewhat duplicate to pass both the prepended target name
> _and_ the worktree to the callback? I imagine that we'd have to
> eventually strip the worktree prefix to find the correct ref, unless we
> end up using the main ref store to look up the ref.
>
Actually, the worktree is not passed to the callbacks; the
`fsck_refs_fn` functions never see the worktree `wt`. I use `wt` only
because we need to print the _full_ path to the user when an error
happens, to disambiguate the case where worktree A and worktree B both
have a ref named "refs/worktree/foo".
I agree that the worktree prefix must not be part of the path used to
find the ref in the file system. That path is built by the following
statement:
strbuf_addf(&sb, "%s/%s", ref_store->gitdir, refs_check_dir);
For a worktree, `ref_store->gitdir` will automatically be
`.git/worktrees/<id>`.
In v5, I didn't print the full path, so we didn't even need the
parameter `wt`. However, now we want to print the following info:
worktrees/<id>/refs/worktree/a
So `wt` is there only because we need the `worktrees/<id>` information.
We could also derive this information from "ref_store->gitdir" and
"ref_store->repo->gitdir", but that is cumbersome and a bad idea. So I
changed the prototype of "fsck_fn" to add a new parameter
"struct worktree *".
> > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > index 07c57fd541..46dcaec654 100644
> > --- a/refs/packed-backend.c
> > +++ b/refs/packed-backend.c
> > @@ -13,6 +13,7 @@
> > #include "../lockfile.h"
> > #include "../chdir-notify.h"
> > #include "../statinfo.h"
> > +#include "../worktree.h"
> > #include "../wrapper.h"
> > #include "../write-or-die.h"
> > #include "../trace2.h"
> > @@ -1754,8 +1755,13 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> > }
> >
> > static int packed_fsck(struct ref_store *ref_store UNUSED,
> > - struct fsck_options *o UNUSED)
> > + struct fsck_options *o UNUSED,
> > + struct worktree *wt)
> > {
> > +
> > + if (!is_main_worktree(wt))
> > + return 0;
> > +
> > return 0;
> > }
>
> It's somewhat funny to have this condition here, but it does make sense
> overall as worktrees never have packed refs in the first place.
>
Yes, there is no packed-refs file in a linked worktree. And we need to
avoid checking the same packed-refs file multiple times.
> Patrick
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v6 4/9] ref: support multiple worktrees check for refs
2024-11-05 12:52 ` shejialuo
@ 2024-11-06 6:34 ` Patrick Steinhardt
2024-11-06 12:20 ` shejialuo
0 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-11-06 6:34 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Tue, Nov 05, 2024 at 08:52:19PM +0800, shejialuo wrote:
> On Tue, Nov 05, 2024 at 08:11:49AM +0100, Patrick Steinhardt wrote:
> > On Mon, Oct 21, 2024 at 09:34:40PM +0800, shejialuo wrote:
> > > @@ -3558,6 +3560,9 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
> > > } else if (S_ISREG(iter->st.st_mode) ||
> > > S_ISLNK(iter->st.st_mode)) {
> > > strbuf_reset(&target_name);
> > > +
> > > + if (!is_main_worktree(wt))
> > > + strbuf_addf(&target_name, "worktrees/%s/", wt->id);
> > > strbuf_addf(&target_name, "%s/%s", refs_check_dir,
> > > iter->relative_path);
> > >
> >
> > Hm. Isn't it somewhat duplicate to pass both the prepended target name
> > _and_ the worktree to the callback? I imagine that we'd have to
> > eventually strip the worktree prefix to find the correct ref, unless we
> > end up using the main ref store to look up the ref.
> >
>
> Actually, the worktree won't be passed to the callback. The
> `fsck_refs_fn` function will never use worktree `wt`. The reason why I
> use `wt` is that we need to print _full_ path information to the user
> when error happens for the situation where worktree A and worktree B has
> the same ref name "refs/worktree/foo".
>
> I agree that we will strip the worktree prefix to find the correct ref
> in the file system. This is done by the following statement:
>
> strbuf_addf(&sb, "%s/%s", ref_store->gitdir, refs_check_dir);
>
> For worktree, `ref_store->gitdir` will automatically be
> `.git/worktrees/<id>`.
>
> In the v5, I didn't print the full path and we even didn't need the
> parameter `wt`. However, if we want to print the following info:
>
> worktrees/<id>/refs/worktree/a
>
> So, just because we need the `worktrees/<id>` information. Actually, we
> could also get the information by using "ref_store->gitdir" and
> "ref_store->repo->gitdir". However, this is cumbersome and it's a bad
> idea. So I change the prototype of "fsck_fn" to add a new parameter
> "struct worktree *".
In practice you can also derive that full refname from the worktree
itself, as the ID is stored in `struct worktree::id`. Would that maybe
be a better solution?
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v6 4/9] ref: support multiple worktrees check for refs
2024-11-06 6:34 ` Patrick Steinhardt
@ 2024-11-06 12:20 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-06 12:20 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Wed, Nov 06, 2024 at 07:34:08AM +0100, Patrick Steinhardt wrote:
[snip]
>
> In practice you can also derive that full refname from the worktree
> itself, as the ID is stored in `struct worktree::id`. Would that maybe
> be a better solution?
>
I think we are on the same page. This is exactly what I have done in
this patch.
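As a rough illustration of that naming rule, here is a self-contained
sketch. It is not the patch's code: `format_report_name()` is a
hypothetical helper, and passing a NULL `worktree_id` for the main
worktree is an assumption of this sketch (the patch itself calls
`is_main_worktree(wt)` and reads `wt->id`).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the name shown in fsck messages.
 * A NULL worktree_id means "main worktree" (an assumption of this sketch).
 * Returns the number of characters the full name needs, like snprintf().
 */
static int format_report_name(char *out, size_t size, const char *worktree_id,
			      const char *dir, const char *rel)
{
	assert(out && size > 0 && dir && rel);

	if (worktree_id)
		return snprintf(out, size, "worktrees/%s/%s/%s",
				worktree_id, dir, rel);
	return snprintf(out, size, "%s/%s", dir, rel);
}
```

With "refs" and "heads/main" and no worktree id, the helper produces
"refs/heads/main"; with the id "worktree-1", "refs" and "worktree/foo",
it produces "worktrees/worktree-1/refs/worktree/foo", which has the same
shape as the messages expected in the t0602 test.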
> Patrick
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 209+ messages in thread
* [PATCH v6 5/9] ref: port git-fsck(1) regular refs check for files backend
2024-10-21 13:32 ` [PATCH v6 " shejialuo
` (3 preceding siblings ...)
2024-10-21 13:34 ` [PATCH v6 4/9] ref: support multiple worktrees check for refs shejialuo
@ 2024-10-21 13:34 ` shejialuo
2024-11-05 7:11 ` Patrick Steinhardt
2024-10-21 13:34 ` [PATCH v6 6/9] ref: add more strict checks for regular refs shejialuo
` (6 subsequent siblings)
11 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-10-21 13:34 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
"git-fsck(1)" implicitly checks the ref content by passing the
callback "fsck_handle_ref" to the "refs.c::refs_for_each_rawref".
Then, it will check whether the ref content (eventually "oid")
is valid. If not, it will report the following error to the user.
error: refs/heads/main: invalid sha1 pointer 0000...
It will also wrongly report the above error when there are dangling
symrefs in the repository. This does not align with the behavior of
the "git symbolic-ref" command, which allows users to create dangling
symrefs.
As we have already introduced the "git refs verify" command, we'd better
check the ref content explicitly in the "git refs verify" command so that
later we could remove these checks from "git-fsck(1)" and launch a
subprocess to call "git refs verify" in "git-fsck(1)" to make
"git-fsck(1)" cleaner.
Following what "git-fsck(1)" does, add a similar check to "git refs
verify". Then add a new fsck error message "badRefContent(ERROR)" to
represent that a ref has invalid content.
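As a rough, standalone illustration of what "invalid content" means for a regular ref, the sketch below checks that a buffer starts with a full-length hexadecimal object ID followed by NUL or whitespace. It is not the patch's implementation (the patch relies on "parse_loose_ref_contents"); `ref_content_looks_valid` and its `hexsz` parameter are hypothetical names used only here.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of Git: returns 1 when "content" starts
 * with exactly "hexsz" hexadecimal digits followed by NUL or whitespace,
 * and 0 otherwise. The loop stops at the first non-hex byte, so a short
 * buffer is never read past its terminating NUL.
 */
static int ref_content_looks_valid(const char *content, size_t hexsz)
{
	size_t i;

	for (i = 0; i < hexsz; i++)
		if (!isxdigit((unsigned char)content[i]))
			return 0;
	return content[hexsz] == '\0' ||
	       isspace((unsigned char)content[hexsz]);
}
```

Under this sketch, an object ID with an extra "x" appended and a string such as "xfsazqfxcadas" are both rejected, which is the behavior the new tests expect from "badRefContent".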
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 3 +
fsck.h | 1 +
refs/files-backend.c | 43 +++++++++++++
t/t0602-reffiles-fsck.sh | 117 ++++++++++++++++++++++++++++++++++
4 files changed, 164 insertions(+)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 68a2801f15..22c385ea22 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -19,6 +19,9 @@
`badParentSha1`::
(ERROR) A commit object has a bad parent sha1.
+`badRefContent`::
+ (ERROR) A ref has bad content.
+
`badRefFiletype`::
(ERROR) A ref has a bad file type.
diff --git a/fsck.h b/fsck.h
index 500b4c04d2..0d99a87911 100644
--- a/fsck.h
+++ b/fsck.h
@@ -31,6 +31,7 @@ enum fsck_msg_type {
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
+ FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
FUNC(BAD_REF_NAME, ERROR) \
FUNC(BAD_TIMEZONE, ERROR) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 24ad73faba..2861980bdd 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3505,6 +3505,48 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
const char *target_name,
struct dir_iterator *iter);
+static int files_fsck_refs_content(struct ref_store *ref_store,
+ struct fsck_options *o,
+ const char *target_name,
+ struct dir_iterator *iter)
+{
+ struct strbuf ref_content = STRBUF_INIT;
+ struct strbuf referent = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ unsigned int type = 0;
+ int failure_errno = 0;
+ struct object_id oid;
+ int ret = 0;
+
+ report.path = target_name;
+
+ if (S_ISLNK(iter->st.st_mode))
+ goto cleanup;
+
+ if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_CONTENT,
+ "cannot read ref file '%s': (%s)",
+ iter->path.buf, strerror(errno));
+ goto cleanup;
+ }
+
+ if (parse_loose_ref_contents(ref_store->repo->hash_algo,
+ ref_content.buf, &oid, &referent,
+ &type, &failure_errno)) {
+ strbuf_rtrim(&ref_content);
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_CONTENT,
+ "%s", ref_content.buf);
+ goto cleanup;
+ }
+
+cleanup:
+ strbuf_release(&ref_content);
+ strbuf_release(&referent);
+ return ret;
+}
+
static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
struct fsck_options *o,
const char *target_name,
@@ -3597,6 +3639,7 @@ static int files_fsck_refs(struct ref_store *ref_store,
{
files_fsck_refs_fn fsck_refs_fn[]= {
files_fsck_refs_name,
+ files_fsck_refs_content,
NULL,
};
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 6eb1385c50..29bdd3fc01 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -164,4 +164,121 @@ test_expect_success 'ref name check should work for multiple worktrees' '
)
'
+test_expect_success 'regular ref content should be checked (individual)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
+ bad_content=$(git rev-parse main)x &&
+ printf "%s" $bad_content >$tag_dir_prefix/tag-bad-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/tag-bad-1: badRefContent: $bad_content
+ EOF
+ rm $tag_dir_prefix/tag-bad-1 &&
+ test_cmp expect err &&
+
+ bad_content=xfsazqfxcadas &&
+ printf "%s" $bad_content >$tag_dir_prefix/tag-bad-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/tag-bad-2: badRefContent: $bad_content
+ EOF
+ rm $tag_dir_prefix/tag-bad-2 &&
+ test_cmp expect err &&
+
+ bad_content=Xfsazqfxcadas &&
+ printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err
+'
+
+test_expect_success 'regular ref content should be checked (aggregate)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ bad_content_1=$(git rev-parse main)x &&
+ bad_content_2=xfsazqfxcadas &&
+ bad_content_3=Xfsazqfxcadas &&
+ printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
+ printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
+ printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
+ error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
+ error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+'
+
+test_expect_success 'ref content checks should work with worktrees' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ bad_content_1=$(git rev-parse HEAD)x &&
+ bad_content_2=xfsazqfxcadas &&
+ bad_content_3=Xfsazqfxcadas &&
+
+ printf "%s" $bad_content_1 >$worktree1_refdir_prefix/bad-branch-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content_1
+ EOF
+ rm $worktree1_refdir_prefix/bad-branch-1 &&
+ test_cmp expect err &&
+
+ printf "%s" $bad_content_2 >$worktree2_refdir_prefix/bad-branch-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content_2
+ EOF
+ rm $worktree2_refdir_prefix/bad-branch-2 &&
+ test_cmp expect err &&
+
+ printf "%s" $bad_content_3 >$worktree1_refdir_prefix/bad-branch-3 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/bad-branch-3: badRefContent: $bad_content_3
+ EOF
+ rm $worktree1_refdir_prefix/bad-branch-3 &&
+ test_cmp expect err
+'
+
test_done
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* Re: [PATCH v6 5/9] ref: port git-fsck(1) regular refs check for files backend
2024-10-21 13:34 ` [PATCH v6 5/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
@ 2024-11-05 7:11 ` Patrick Steinhardt
0 siblings, 0 replies; 209+ messages in thread
From: Patrick Steinhardt @ 2024-11-05 7:11 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Mon, Oct 21, 2024 at 09:34:47PM +0800, shejialuo wrote:
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 24ad73faba..2861980bdd 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -3505,6 +3505,48 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
> const char *target_name,
> struct dir_iterator *iter);
>
> +static int files_fsck_refs_content(struct ref_store *ref_store,
> + struct fsck_options *o,
> + const char *target_name,
> + struct dir_iterator *iter)
> +{
> + struct strbuf ref_content = STRBUF_INIT;
> + struct strbuf referent = STRBUF_INIT;
> + struct fsck_ref_report report = { 0 };
> + unsigned int type = 0;
> + int failure_errno = 0;
> + struct object_id oid;
> + int ret = 0;
> +
> + report.path = target_name;
> +
> + if (S_ISLNK(iter->st.st_mode))
> + goto cleanup;
> +
> + if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
> + ret = fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_REF_CONTENT,
> + "cannot read ref file '%s': (%s)",
> + iter->path.buf, strerror(errno));
> + goto cleanup;
> + }
Let's drop the parentheses around `(%s)`; we don't print them in
`warning_errno()` or `die_errno()`, either.
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* [PATCH v6 6/9] ref: add more strict checks for regular refs
2024-10-21 13:32 ` [PATCH v6 " shejialuo
` (4 preceding siblings ...)
2024-10-21 13:34 ` [PATCH v6 5/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
@ 2024-10-21 13:34 ` shejialuo
2024-10-21 13:35 ` [PATCH v6 7/9] ref: add basic symref content check for files backend shejialuo
` (5 subsequent siblings)
11 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-21 13:34 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We have already used "parse_loose_ref_contents" function to check
whether the ref content is valid in files backend. However, by
using "parse_loose_ref_contents", we allow the ref's content to end with
garbage or without a newline.
Even though we never create such loose refs ourselves, we have accepted
such loose refs. So, it is entirely possible that some third-party tools
may rely on such loose refs being valid. We should not report an error
fsck message at present. Instead, we should notify the users about such
"curiously formatted" loose refs so that adequate care is taken before
we decide to tighten the rules in the future.
It is not suitable to report a warning fsck message to the user, either.
We don't yet want the "--strict" flag that controls this bit to end up
generating errors for such weirdly-formatted reference contents, as we
first want to assess whether this retroactive tightening will cause
issues for any tools out there. It may cause compatibility issues which
may break the repository. So, we add the following two fsck infos to
represent the situation where the ref content ends without a newline or
has trailing garbage:
1. refMissingNewline(INFO): A loose ref that does not end with
newline(LF).
2. trailingRefContent(INFO): A loose ref has trailing content.
It might appear that we can't provide the user with any warnings by
using FSCK_INFO. However, in "fsck.c::fsck_vreport", we will convert
FSCK_INFO to FSCK_WARN and we can still warn the user about these
situations when using "git refs verify" without introducing
compatibility issues.
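The two conditions can be sketched in isolation as follows. The branch logic mirrors the checks added to "files_fsck_refs_content" in this patch, but the enum and the function name `classify_trailing` are hypothetical and exist only for this illustration; "trailing" stands for the pointer to the first byte after the parsed object ID.

```c
/* Hypothetical result type for this sketch, not part of Git. */
enum trailing_kind {
	TRAILING_OK,
	TRAILING_MISSING_NEWLINE,
	TRAILING_GARBAGE
};

/*
 * "trailing" points just past the object ID in the ref content.
 * Nothing after the ID means the LF is missing; anything other than
 * exactly one LF counts as trailing garbage.
 */
static enum trailing_kind classify_trailing(const char *trailing)
{
	if (!*trailing)
		return TRAILING_MISSING_NEWLINE;
	if (*trailing != '\n' || *(trailing + 1))
		return TRAILING_GARBAGE;
	return TRAILING_OK;
}
```

Note that a run of several newlines is classified as trailing garbage, not as a missing newline, which is what the "tag-garbage-1" test case exercises.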
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 14 ++++++++
fsck.h | 2 ++
refs.c | 2 +-
refs/files-backend.c | 26 ++++++++++++--
refs/refs-internal.h | 2 +-
t/t0602-reffiles-fsck.sh | 67 +++++++++++++++++++++++++++++++++++
6 files changed, 108 insertions(+), 5 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 22c385ea22..6db0eaa84a 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -173,6 +173,20 @@
`nullSha1`::
(WARN) Tree contains entries pointing to a null sha1.
+`refMissingNewline`::
+ (INFO) A loose ref that does not end with newline(LF). As
+ valid implementations of Git never created such a loose ref
+ file, it may become an error in the future. Report to the
+ git@vger.kernel.org mailing list if you see this error, as
+ we need to know what tools created such a file.
+
+`trailingRefContent`::
+ (INFO) A loose ref has trailing content. As valid implementations
+ of Git never created such a loose ref file, it may become an
+ error in the future. Report to the git@vger.kernel.org mailing
+ list if you see this error, as we need to know what tools
+ created such a file.
+
`treeNotSorted`::
(ERROR) A tree is not properly sorted.
diff --git a/fsck.h b/fsck.h
index 0d99a87911..b85072df57 100644
--- a/fsck.h
+++ b/fsck.h
@@ -85,6 +85,8 @@ enum fsck_msg_type {
FUNC(MAILMAP_SYMLINK, INFO) \
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
+ FUNC(REF_MISSING_NEWLINE, INFO) \
+ FUNC(TRAILING_REF_CONTENT, INFO) \
/* ignored (elevated when requested) */ \
FUNC(EXTRA_HEADER_ENTRY, IGNORE)
diff --git a/refs.c b/refs.c
index 395a17273c..f88b32a633 100644
--- a/refs.c
+++ b/refs.c
@@ -1789,7 +1789,7 @@ static int refs_read_special_head(struct ref_store *ref_store,
}
result = parse_loose_ref_contents(ref_store->repo->hash_algo, content.buf,
- oid, referent, type, failure_errno);
+ oid, referent, type, NULL, failure_errno);
done:
strbuf_release(&full_path);
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 2861980bdd..b1fba92e5f 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -569,7 +569,7 @@ static int read_ref_internal(struct ref_store *ref_store, const char *refname,
buf = sb_contents.buf;
ret = parse_loose_ref_contents(ref_store->repo->hash_algo, buf,
- oid, referent, type, &myerr);
+ oid, referent, type, NULL, &myerr);
out:
if (ret && !myerr)
@@ -606,7 +606,7 @@ static int files_read_symbolic_ref(struct ref_store *ref_store, const char *refn
int parse_loose_ref_contents(const struct git_hash_algo *algop,
const char *buf, struct object_id *oid,
struct strbuf *referent, unsigned int *type,
- int *failure_errno)
+ const char **trailing, int *failure_errno)
{
const char *p;
if (skip_prefix(buf, "ref:", &buf)) {
@@ -628,6 +628,10 @@ int parse_loose_ref_contents(const struct git_hash_algo *algop,
*failure_errno = EINVAL;
return -1;
}
+
+ if (trailing)
+ *trailing = p;
+
return 0;
}
@@ -3513,6 +3517,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
struct strbuf ref_content = STRBUF_INIT;
struct strbuf referent = STRBUF_INIT;
struct fsck_ref_report report = { 0 };
+ const char *trailing = NULL;
unsigned int type = 0;
int failure_errno = 0;
struct object_id oid;
@@ -3533,7 +3538,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
if (parse_loose_ref_contents(ref_store->repo->hash_algo,
ref_content.buf, &oid, &referent,
- &type, &failure_errno)) {
+ &type, &trailing, &failure_errno)) {
strbuf_rtrim(&ref_content);
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_CONTENT,
@@ -3541,6 +3546,21 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
goto cleanup;
}
+ if (!(type & REF_ISSYMREF)) {
+ if (!*trailing) {
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_REF_MISSING_NEWLINE,
+ "misses LF at the end");
+ goto cleanup;
+ }
+ if (*trailing != '\n' || *(trailing + 1)) {
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_TRAILING_REF_CONTENT,
+ "has trailing garbage: '%s'", trailing);
+ goto cleanup;
+ }
+ }
+
cleanup:
strbuf_release(&ref_content);
strbuf_release(&referent);
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 037d7991cd..125f1fe735 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -716,7 +716,7 @@ struct ref_store {
int parse_loose_ref_contents(const struct git_hash_algo *algop,
const char *buf, struct object_id *oid,
struct strbuf *referent, unsigned int *type,
- int *failure_errno);
+ const char **trailing, int *failure_errno);
/*
* Fill in the generic part of refs and add it to our collection of
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 29bdd3fc01..0418d79c4f 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -201,6 +201,61 @@ test_expect_success 'regular ref content should be checked (individual)' '
error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
EOF
rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err &&
+
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err &&
+
+ printf "%s\n\n\n" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-garbage-1: trailingRefContent: has trailing garbage: '\''
+
+
+ '\''
+ EOF
+ rm $tag_dir_prefix/tag-garbage-1 &&
+ test_cmp expect err &&
+
+ printf "%s\n\n\n garbage" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-2 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-garbage-2: trailingRefContent: has trailing garbage: '\''
+
+
+ garbage'\''
+ EOF
+ rm $tag_dir_prefix/tag-garbage-2 &&
+ test_cmp expect err &&
+
+ printf "%s garbage\na" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-3 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-garbage-3: trailingRefContent: has trailing garbage: '\'' garbage
+ a'\''
+ EOF
+ rm $tag_dir_prefix/tag-garbage-3 &&
+ test_cmp expect err &&
+
+ printf "%s garbage" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-4 &&
+ test_must_fail git -c fsck.trailingRefContent=error refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/tag-garbage-4: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ EOF
+ rm $tag_dir_prefix/tag-garbage-4 &&
test_cmp expect err
'
@@ -219,12 +274,16 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
EOF
sort err >sorted_err &&
test_cmp expect sorted_err
@@ -278,6 +337,14 @@ test_expect_success 'ref content checks should work with worktrees' '
error: worktrees/worktree-1/refs/worktree/bad-branch-3: badRefContent: $bad_content_3
EOF
rm $worktree1_refdir_prefix/bad-branch-3 &&
+ test_cmp expect err &&
+
+ printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $worktree1_refdir_prefix/branch-no-newline &&
test_cmp expect err
'
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* [PATCH v6 7/9] ref: add basic symref content check for files backend
2024-10-21 13:32 ` [PATCH v6 " shejialuo
` (5 preceding siblings ...)
2024-10-21 13:34 ` [PATCH v6 6/9] ref: add more strict checks for regular refs shejialuo
@ 2024-10-21 13:35 ` shejialuo
2024-10-21 13:35 ` [PATCH v6 8/9] ref: check whether the target of the symref is a ref shejialuo
` (4 subsequent siblings)
11 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-21 13:35 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We have code that checks regular ref contents, but we do not yet check
the contents of symbolic refs. By using "parse_loose_ref_contents" for
symbolic refs, we will get the information of the "referent".
We do not need to check the "referent" by opening the file. This is
because if "referent" exists in the file system, we will eventually
check its correctness by inspecting every file in the "refs" directory.
If the "referent" does not exist in the filesystem, this is OK as it is
seen as a dangling symref.
So we just need to check the "referent" string content. A regular ref
could be accepted as a textual symref if it begins with "ref:", followed
by zero or more whitespaces, followed by the full refname, followed only
by whitespace characters. However, we always write a single SP after
"ref:" and a single LF after the refname. It may seem that we should
report a fsck error message when the "referent" does not follow the
above rules, but we should not be so aggressive because third-party
reimplementations of Git may have taken advantage of the looser syntax.
More specifically, we accept the following contents:
1. "ref: refs/heads/master "
2. "ref: refs/heads/master \n \n"
3. "ref: refs/heads/master\n\n"
When introducing the regular ref content checks, we created two fsck
infos "refMissingNewline" and "trailingRefContent" which exactly
represent the above situations. So we will reuse these two fsck messages
to write checks that inform the user about these situations.
But we do not allow any other trailing garbage. The following are bad
symref contents which will be reported as fsck errors by "git-fsck(1)".
1. "ref: refs/heads/master garbage\n"
2. "ref: refs/heads/master \n\n\n garbage "
And we introduce a new "badReferentName(ERROR)" fsck message to report
the above errors by using "is_root_ref" and "check_refname_format" to
check the "referent". Since both "is_root_ref" and "check_refname_format"
don't work with whitespaces, we use the trimmed version of "referent"
with these functions.
In order to add checks, we will do the following things:
1. Record the untrimmed length "orig_len" and untrimmed last byte
"orig_last_byte".
2. Use "strbuf_rtrim" to trim the whitespaces or newlines to make sure
"is_root_ref" and "check_refname_format" do not fail because of them.
3. Use "orig_len" and "orig_last_byte" to check whether the "referent"
misses '\n' at the end or has trailing whitespaces or newlines.
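The three steps above can be sketched without "struct strbuf" as follows. The two comparisons are taken from the hunk in this patch; the flag macros, the function name `check_symref_trailing`, the manual trimming loop, and the guard for an empty string are assumptions made so the example stands alone.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags for this sketch, not part of Git. */
#define SYMREF_MISSING_NEWLINE  (1u << 0)
#define SYMREF_TRAILING_CONTENT (1u << 1)

/*
 * "referent" is the text after "ref:" with leading whitespace removed.
 * Returns a bitmask describing how its tail deviates from "name\n".
 */
static unsigned int check_symref_trailing(const char *referent)
{
	/* Step 1: remember the untrimmed length and last byte. */
	size_t orig_len = strlen(referent);
	char orig_last_byte = orig_len ? referent[orig_len - 1] : '\0';
	size_t len = orig_len;
	unsigned int flags = 0;

	/* Step 2: compute the length after trimming trailing whitespace. */
	while (len && isspace((unsigned char)referent[len - 1]))
		len--;

	/* Step 3: compare the trimmed and untrimmed views. */
	if (len == orig_len || (len < orig_len && orig_last_byte != '\n'))
		flags |= SYMREF_MISSING_NEWLINE;
	if (len != orig_len && len != orig_len - 1)
		flags |= SYMREF_TRAILING_CONTENT;
	return flags;
}
```

Both flags can be set at once, for example when the refname is followed by several spaces and no final LF, which is why the function returns a bitmask instead of a single code.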
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 3 +
fsck.h | 1 +
refs/files-backend.c | 40 ++++++++++++
t/t0602-reffiles-fsck.sh | 111 ++++++++++++++++++++++++++++++++++
4 files changed, 155 insertions(+)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 6db0eaa84a..dcea05edfc 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -28,6 +28,9 @@
`badRefName`::
(ERROR) A ref has an invalid format.
+`badReferentName`::
+ (ERROR) The referent name of a symref is invalid.
+
`badTagName`::
(INFO) A tag has an invalid format.
diff --git a/fsck.h b/fsck.h
index b85072df57..5227dfdef2 100644
--- a/fsck.h
+++ b/fsck.h
@@ -34,6 +34,7 @@ enum fsck_msg_type {
FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
FUNC(BAD_REF_NAME, ERROR) \
+ FUNC(BAD_REFERENT_NAME, ERROR) \
FUNC(BAD_TIMEZONE, ERROR) \
FUNC(BAD_TREE, ERROR) \
FUNC(BAD_TREE_SHA1, ERROR) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b1fba92e5f..1a267547f2 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3509,6 +3509,43 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
const char *target_name,
struct dir_iterator *iter);
+static int files_fsck_symref_target(struct fsck_options *o,
+ struct fsck_ref_report *report,
+ struct strbuf *referent)
+{
+ char orig_last_byte;
+ size_t orig_len;
+ int ret = 0;
+
+ orig_len = referent->len;
+ orig_last_byte = referent->buf[orig_len - 1];
+ strbuf_rtrim(referent);
+
+ if (!is_root_ref(referent->buf) &&
+ check_refname_format(referent->buf, 0)) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_BAD_REFERENT_NAME,
+ "points to invalid refname '%s'", referent->buf);
+ goto out;
+ }
+
+ if (referent->len == orig_len ||
+ (referent->len < orig_len && orig_last_byte != '\n')) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_REF_MISSING_NEWLINE,
+ "misses LF at the end");
+ }
+
+ if (referent->len != orig_len && referent->len != orig_len - 1) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_TRAILING_REF_CONTENT,
+ "has trailing whitespaces or newlines");
+ }
+
+out:
+ return ret;
+}
+
static int files_fsck_refs_content(struct ref_store *ref_store,
struct fsck_options *o,
const char *target_name,
@@ -3559,6 +3596,9 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
"has trailing garbage: '%s'", trailing);
goto cleanup;
}
+ } else {
+ ret = files_fsck_symref_target(o, &report, &referent);
+ goto cleanup;
}
cleanup:
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 0418d79c4f..f475966d7b 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -289,6 +289,109 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
test_cmp expect sorted_err
'
+test_expect_success 'textual symref content should be checked (individual)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err &&
+
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-head &&
+ test_must_be_empty err &&
+
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline-1 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-1 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-2 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-3 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-complicated &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err
+'
+
+test_expect_success 'textual symref content should be checked (aggregate)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+'
+
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
@@ -345,6 +448,14 @@ test_expect_success 'ref content checks should work with worktrees' '
warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
EOF
rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
+ printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ EOF
+ rm $worktree1_refdir_prefix/branch-garbage &&
test_cmp expect err
'
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* [PATCH v6 8/9] ref: check whether the target of the symref is a ref
2024-10-21 13:32 ` [PATCH v6 " shejialuo
` (6 preceding siblings ...)
2024-10-21 13:35 ` [PATCH v6 7/9] ref: add basic symref content check for files backend shejialuo
@ 2024-10-21 13:35 ` shejialuo
2024-10-21 13:35 ` [PATCH v6 9/9] ref: add symlink ref content check for files backend shejialuo
` (3 subsequent siblings)
11 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-21 13:35 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
Ideally, we want users to use "git symbolic-ref" to create symrefs
instead of writing raw contents into the filesystem. However, "git
symbolic-ref" is strict with the refname but not strict with the
referent. For example, we can make the "referent" point to
"$(gitdir)/logs/aaa" and manually write content into that file, and we
can still successfully parse this symref by using "git rev-parse".
$ git init repo && cd repo && git commit --allow-empty -mx
$ git symbolic-ref refs/heads/test logs/aaa
$ echo $(git rev-parse HEAD) > .git/logs/aaa
$ git rev-parse test
We may need to add some restrictions on the "referent" parameter when
using "git symbolic-ref" to create symrefs, because ideally all
non-pseudo refs should be located under the "refs" directory, and we
may tighten this in the future.
In order to tell the user that we may tighten the above situation,
create a new fsck message "symrefTargetIsNotARef" to notify the user
that this may become an error in the future.
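As an illustration only (not part of the patch), the prefix rule behind
"symrefTargetIsNotARef" can be sketched in shell. Both helper names are
hypothetical. The root-ref test is a simplified stand-in for the C
function "is_root_ref": it is assumed here to accept only "HEAD" and
slash-free names ending in "_HEAD", which is narrower than the real
function.

```shell
# Simplified stand-in for is_root_ref() (assumption: only "HEAD" and
# slash-free names ending in "_HEAD" count as root refs).
is_root_ref_like () {
	case "$1" in
	*/*) return 1 ;;
	HEAD|*_HEAD) return 0 ;;
	*) return 1 ;;
	esac
}

# Hypothetical helper: prints "ok" when the referent is a root ref or
# starts with "refs/" or "worktrees/"; otherwise prints the message id
# that the new check would report.
classify_referent () {
	referent=$1
	if is_root_ref_like "$referent"
	then
		echo "ok"
	else
		case "$referent" in
		refs/*|worktrees/*) echo "ok" ;;
		*) echo "symrefTargetIsNotARef" ;;
		esac
	fi
}
```

With this sketch, "classify_referent refs/foo" prints "ok", while
"classify_referent refs-back/heads/main" and "classify_referent logs/aaa"
print "symrefTargetIsNotARef". The refname format check
("badReferentName") is a separate step and is not modelled here.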
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 9 +++++++++
fsck.h | 1 +
refs/files-backend.c | 14 ++++++++++++--
t/t0602-reffiles-fsck.sh | 28 ++++++++++++++++++++++++++++
4 files changed, 50 insertions(+), 2 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index dcea05edfc..f82ebc58e8 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -183,6 +183,15 @@
git@vger.kernel.org mailing list if you see this error, as
we need to know what tools created such a file.
+`symrefTargetIsNotARef`::
+ (INFO) The target of a symbolic reference points neither to
+ a root reference nor to a reference starting with "refs/".
+ Although we allow create a symref pointing to the referent which
+ is outside the "ref" by using `git symbolic-ref`, we may tighten
+ the rule in the future. Report to the git@vger.kernel.org
+ mailing list if you see this error, as we need to know what tools
+ created such a file.
+
`trailingRefContent`::
(INFO) A loose ref has trailing content. As valid implementations
of Git never created such a loose ref file, it may become an
diff --git a/fsck.h b/fsck.h
index 5227dfdef2..53a47612e6 100644
--- a/fsck.h
+++ b/fsck.h
@@ -87,6 +87,7 @@ enum fsck_msg_type {
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
FUNC(REF_MISSING_NEWLINE, INFO) \
+ FUNC(SYMREF_TARGET_IS_NOT_A_REF, INFO) \
FUNC(TRAILING_REF_CONTENT, INFO) \
/* ignored (elevated when requested) */ \
FUNC(EXTRA_HEADER_ENTRY, IGNORE)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 1a267547f2..b4912af3b5 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3513,6 +3513,7 @@ static int files_fsck_symref_target(struct fsck_options *o,
struct fsck_ref_report *report,
struct strbuf *referent)
{
+ int is_referent_root;
char orig_last_byte;
size_t orig_len;
int ret = 0;
@@ -3521,8 +3522,17 @@ static int files_fsck_symref_target(struct fsck_options *o,
orig_last_byte = referent->buf[orig_len - 1];
strbuf_rtrim(referent);
- if (!is_root_ref(referent->buf) &&
- check_refname_format(referent->buf, 0)) {
+ is_referent_root = is_root_ref(referent->buf);
+ if (!is_referent_root &&
+ !starts_with(referent->buf, "refs/") &&
+ !starts_with(referent->buf, "worktrees/")) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_SYMREF_TARGET_IS_NOT_A_REF,
+ "points to non-ref target '%s'", referent->buf);
+
+ }
+
+ if (!is_referent_root && check_refname_format(referent->buf, 0)) {
ret = fsck_report_ref(o, report,
FSCK_MSG_BAD_REFERENT_NAME,
"points to invalid refname '%s'", referent->buf);
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index f475966d7b..c6d40ce9a1 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -392,6 +392,34 @@ test_expect_success 'textual symref content should be checked (aggregate)' '
test_cmp expect sorted_err
'
+test_expect_success 'the target of the textual symref should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err &&
+
+ printf "ref: refs/foo\n" >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err &&
+
+ printf "ref: refs-back/heads/main\n" >$branch_dir_prefix/branch-bad-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''refs-back/heads/main'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err
+'
+
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* [PATCH v6 9/9] ref: add symlink ref content check for files backend
2024-10-21 13:32 ` [PATCH v6 " shejialuo
` (7 preceding siblings ...)
2024-10-21 13:35 ` [PATCH v6 8/9] ref: check whether the target of the symref is a ref shejialuo
@ 2024-10-21 13:35 ` shejialuo
2024-10-21 16:09 ` [PATCH v6 0/9] add " Taylor Blau
` (2 subsequent siblings)
11 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-21 13:35 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
Besides the textual symref, we also allow symbolic links as symrefs.
So, we should provide the same consistency checks as we have done
for textual symrefs. We are also considering deprecating the writing of
symbolic links. We first need to assess whether symbolic links are
still being used. So, add a new fsck message "symlinkRef(INFO)" to make
the user aware of this.
We have already introduced "files_fsck_symref_target". We should reuse
this function to handle the symrefs which use legacy symbolic links. We
should not check for trailing garbage in symlink refs. Add a new
parameter "symbolic_link" to disable the checks which should only be
executed for textual symrefs.
We also need to generate the "referent" parameter in order to reuse
"files_fsck_symref_target", using the following steps:
1. Use "strbuf_add_real_path" to resolve the symlink and get the
absolute path "ref_content" which the symlink ref points to.
2. Generate the absolute path "abs_gitdir" of "gitdir" and combine
"ref_content" and "abs_gitdir" to extract the relative path
"relative_referent_path".
3. If "ref_content" is outside of "gitdir", we just set "referent" to
"ref_content". Otherwise, we set "referent" to
"relative_referent_path".
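Steps 2 and 3 above can be sketched in shell as follows. This is an
illustration under stated assumptions, not the C code in the patch: the
helper name "referent_from_real_path" is hypothetical, and both
arguments are assumed to be already-resolved absolute paths (step 1,
resolving the symlink, is not modelled).

```shell
# Hypothetical helper.
#   $1: absolute path of the gitdir (with or without a trailing slash)
#   $2: absolute path that the symlink ref resolves to
# Prints the path relative to the gitdir when $2 lies inside it,
# otherwise prints $2 unchanged.
referent_from_real_path () {
	# Ensure exactly one trailing slash, so that "/repo/.git" does
	# not match a sibling such as "/repo/.github".
	abs_gitdir=${1%/}/
	ref_content=$2
	case "$ref_content" in
	"$abs_gitdir"*)
		printf '%s\n' "${ref_content#"$abs_gitdir"}" ;;
	*)
		printf '%s\n' "$ref_content" ;;
	esac
}
```

For example, "referent_from_real_path /tmp/repo/.git
/tmp/repo/.git/logs/branch-escape" prints "logs/branch-escape", which
would then be handed to the symref target check.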
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 6 +++++
fsck.h | 1 +
refs/files-backend.c | 38 +++++++++++++++++++++++++----
t/t0602-reffiles-fsck.sh | 45 +++++++++++++++++++++++++++++++++++
4 files changed, 86 insertions(+), 4 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index f82ebc58e8..b14bc44ca4 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -183,6 +183,12 @@
git@vger.kernel.org mailing list if you see this error, as
we need to know what tools created such a file.
+`symlinkRef`::
+ (INFO) A symbolic link is used as a symref. Report to the
+ git@vger.kernel.org mailing list if you see this error, as we
+ are assessing the feasibility of dropping the support to drop
+ creating symbolic links as symrefs.
+
`symrefTargetIsNotARef`::
(INFO) The target of a symbolic reference points neither to
a root reference nor to a reference starting with "refs/".
diff --git a/fsck.h b/fsck.h
index 53a47612e6..a44c231a5f 100644
--- a/fsck.h
+++ b/fsck.h
@@ -86,6 +86,7 @@ enum fsck_msg_type {
FUNC(MAILMAP_SYMLINK, INFO) \
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
+ FUNC(SYMLINK_REF, INFO) \
FUNC(REF_MISSING_NEWLINE, INFO) \
FUNC(SYMREF_TARGET_IS_NOT_A_REF, INFO) \
FUNC(TRAILING_REF_CONTENT, INFO) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b4912af3b5..180f8e28b7 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1,6 +1,7 @@
#define USE_THE_REPOSITORY_VARIABLE
#include "../git-compat-util.h"
+#include "../abspath.h"
#include "../config.h"
#include "../copy.h"
#include "../environment.h"
@@ -3511,7 +3512,8 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
static int files_fsck_symref_target(struct fsck_options *o,
struct fsck_ref_report *report,
- struct strbuf *referent)
+ struct strbuf *referent,
+ unsigned int symbolic_link)
{
int is_referent_root;
char orig_last_byte;
@@ -3520,7 +3522,8 @@ static int files_fsck_symref_target(struct fsck_options *o,
orig_len = referent->len;
orig_last_byte = referent->buf[orig_len - 1];
- strbuf_rtrim(referent);
+ if (!symbolic_link)
+ strbuf_rtrim(referent);
is_referent_root = is_root_ref(referent->buf);
if (!is_referent_root &&
@@ -3539,6 +3542,9 @@ static int files_fsck_symref_target(struct fsck_options *o,
goto out;
}
+ if (symbolic_link)
+ goto out;
+
if (referent->len == orig_len ||
(referent->len < orig_len && orig_last_byte != '\n')) {
ret = fsck_report_ref(o, report,
@@ -3562,6 +3568,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
struct dir_iterator *iter)
{
struct strbuf ref_content = STRBUF_INIT;
+ struct strbuf abs_gitdir = STRBUF_INIT;
struct strbuf referent = STRBUF_INIT;
struct fsck_ref_report report = { 0 };
const char *trailing = NULL;
@@ -3572,8 +3579,30 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
report.path = target_name;
- if (S_ISLNK(iter->st.st_mode))
+ if (S_ISLNK(iter->st.st_mode)) {
+ const char* relative_referent_path = NULL;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_SYMLINK_REF,
+ "use deprecated symbolic link for symref");
+
+ strbuf_add_absolute_path(&abs_gitdir, ref_store->gitdir);
+ strbuf_normalize_path(&abs_gitdir);
+ if (!is_dir_sep(abs_gitdir.buf[abs_gitdir.len - 1]))
+ strbuf_addch(&abs_gitdir, '/');
+
+ strbuf_add_real_path(&ref_content, iter->path.buf);
+ skip_prefix(ref_content.buf, abs_gitdir.buf,
+ &relative_referent_path);
+
+ if (relative_referent_path)
+ strbuf_addstr(&referent, relative_referent_path);
+ else
+ strbuf_addbuf(&referent, &ref_content);
+
+ ret |= files_fsck_symref_target(o, &report, &referent, 1);
goto cleanup;
+ }
if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
ret = fsck_report_ref(o, &report,
@@ -3607,13 +3636,14 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
goto cleanup;
}
} else {
- ret = files_fsck_symref_target(o, &report, &referent);
+ ret = files_fsck_symref_target(o, &report, &referent, 0);
goto cleanup;
}
cleanup:
strbuf_release(&ref_content);
strbuf_release(&referent);
+ strbuf_release(&abs_gitdir);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index c6d40ce9a1..aee7e04b82 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -420,6 +420,51 @@ test_expect_success 'the target of the textual symref should be checked' '
test_cmp expect err
'
+test_expect_success SYMLINKS 'symlink symref content should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
+ error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-bad &&
+ test_cmp expect err &&
+
+ ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
+ error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
+ EOF
+ rm $tag_dir_prefix/tag-symbolic-1 &&
+ test_cmp expect err
+'
+
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* Re: [PATCH v6 0/9] add ref content check for files backend
2024-10-21 13:32 ` [PATCH v6 " shejialuo
` (8 preceding siblings ...)
2024-10-21 13:35 ` [PATCH v6 9/9] ref: add symlink ref content check for files backend shejialuo
@ 2024-10-21 16:09 ` Taylor Blau
2024-10-22 11:41 ` shejialuo
2024-10-21 16:18 ` Taylor Blau
2024-11-10 12:07 ` [PATCH v7 " shejialuo
11 siblings, 1 reply; 209+ messages in thread
From: Taylor Blau @ 2024-10-21 16:09 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Junio C Hamano
On Mon, Oct 21, 2024 at 09:32:20PM +0800, shejialuo wrote:
> Hi All:
>
> This new version updates the following things.
I am assuming that this new round was rebased onto the tip of 'master',
since I could not apply it on top of its original base
b3d175409d9 (Merge branch 'sj/ref-fsck', 2024-08-16)
In the future, please indicate when you rebase your series so that I
know what the correct base is for that round.
Thanks,
Taylor
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v6 0/9] add ref content check for files backend
2024-10-21 16:09 ` [PATCH v6 0/9] add " Taylor Blau
@ 2024-10-22 11:41 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-10-22 11:41 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Patrick Steinhardt, Karthik Nayak, Junio C Hamano
On Mon, Oct 21, 2024 at 12:09:44PM -0400, Taylor Blau wrote:
> On Mon, Oct 21, 2024 at 09:32:20PM +0800, shejialuo wrote:
> > Hi All:
> >
> > This new version updates the following things.
>
> I am assuming that this new round was rebased onto the tip of 'master',
> since I could not apply it on top of its original base
>
> b3d175409d9 (Merge branch 'sj/ref-fsck', 2024-08-16)
>
> In the future, please indicate when you rebase your series so that I
> know what the correct base is for that round.
>
Sorry about that, Taylor. I told Junio in the previous version that I
had rebased the series, but I forgot that you have become the interim
maintainer and didn't provide this information to you.
Thanks,
Jialuo
> Thanks,
> Taylor
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v6 0/9] add ref content check for files backend
2024-10-21 13:32 ` [PATCH v6 " shejialuo
` (9 preceding siblings ...)
2024-10-21 16:09 ` [PATCH v6 0/9] add " Taylor Blau
@ 2024-10-21 16:18 ` Taylor Blau
2024-11-10 12:07 ` [PATCH v7 " shejialuo
11 siblings, 0 replies; 209+ messages in thread
From: Taylor Blau @ 2024-10-21 16:18 UTC (permalink / raw)
To: shejialuo; +Cc: git, Patrick Steinhardt, Karthik Nayak, Junio C Hamano
On Mon, Oct 21, 2024 at 09:32:20PM +0800, shejialuo wrote:
> shejialuo (9):
> ref: initialize "fsck_ref_report" with zero
> ref: check the full refname instead of basename
> ref: initialize target name outside of check functions
> ref: support multiple worktrees check for refs
> ref: port git-fsck(1) regular refs check for files backend
> ref: add more strict checks for regular refs
> ref: add basic symref content check for files backend
> ref: check whether the target of the symref is a ref
> ref: add symlink ref content check for files backend
>
> Documentation/fsck-msgids.txt | 35 +++
> builtin/refs.c | 12 +-
> fsck.h | 6 +
> refs.c | 7 +-
> refs.h | 3 +-
> refs/debug.c | 5 +-
> refs/files-backend.c | 187 ++++++++++++--
> refs/packed-backend.c | 8 +-
> refs/refs-internal.h | 5 +-
> refs/reftable-backend.c | 3 +-
> t/t0602-reffiles-fsck.sh | 457 +++++++++++++++++++++++++++++++++-
> 11 files changed, 693 insertions(+), 35 deletions(-)
Great, thanks for the new round. Looking at the inter-diff, it looks
like this round also needs a fresh review. I'm catching up on new
threads from the weekend, so I'll put this on my review queue. But in
the meantime, if your mentors can look at it, that would be much
appreciated.
Thanks,
Taylor
^ permalink raw reply [flat|nested] 209+ messages in thread
* [PATCH v7 0/9] add ref content check for files backend
2024-10-21 13:32 ` [PATCH v6 " shejialuo
` (10 preceding siblings ...)
2024-10-21 16:18 ` Taylor Blau
@ 2024-11-10 12:07 ` shejialuo
2024-11-10 12:09 ` [PATCH v7 1/9] ref: initialize "fsck_ref_report" with zero shejialuo
` (10 more replies)
11 siblings, 11 replies; 209+ messages in thread
From: shejialuo @ 2024-11-10 12:07 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
Hi All:
This new version addresses the following points:
1. Enhance the commit messages as suggested by Patrick.
2. Rename "target_name" to "refname".
3. Enhance the shell scripts to use `for ... in` loops to avoid
repetition. This is the main change in this version.
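As a rough sketch of why the loops in the tests end each iteration with
"|| return 1": the exit status of a shell "for" loop is that of its
last iteration only, so an earlier failure would otherwise go
unnoticed. The function names below are hypothetical, and
"check_refname" is only a stand-in for the real per-iteration
assertions.

```shell
# Stand-in for one iteration's assertions (assumption: names that
# start with "." or "~" are treated as failures).
check_refname () {
	case "$1" in
	.*|'~'*) return 1 ;;
	*) return 0 ;;
	esac
}

# Without a guard, only the status of the last iteration survives.
verify_without_guard () {
	for refname in "$@"
	do
		check_refname "$refname"
	done
}

# With "|| return 1", the first failing iteration aborts the loop.
verify_with_guard () {
	for refname in "$@"
	do
		check_refname "$refname" || return 1
	done
}
```

Here "verify_without_guard .bad good" succeeds even though the first
name fails, while "verify_with_guard .bad good" returns non-zero.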
Thanks,
Jialuo
shejialuo (9):
ref: initialize "fsck_ref_report" with zero
ref: check the full refname instead of basename
ref: initialize ref name outside of check functions
ref: support multiple worktrees check for refs
ref: port git-fsck(1) regular refs check for files backend
ref: add more strict checks for regular refs
ref: add basic symref content check for files backend
ref: check whether the target of the symref is a ref
ref: add symlink ref content check for files backend
Documentation/fsck-msgids.txt | 35 +++
builtin/refs.c | 10 +-
fsck.h | 6 +
refs.c | 7 +-
refs.h | 3 +-
refs/debug.c | 5 +-
refs/files-backend.c | 190 ++++++++++++--
refs/packed-backend.c | 8 +-
refs/refs-internal.h | 5 +-
refs/reftable-backend.c | 3 +-
t/t0602-reffiles-fsck.sh | 480 +++++++++++++++++++++++++++++++---
11 files changed, 690 insertions(+), 62 deletions(-)
Range-diff against v6:
1: 319f384f1c = 1: bfb2a21af4 ref: initialize "fsck_ref_report" with zero
2: 8662fc9679 ! 2: 9efc83f7ea ref: check the full refname instead of basename
@@ Commit message
In order to fix the above problem, enhance "files_fsck_refs_name" to use
the full name for "check_refname_format". Then, replace the tests which
- are related to "@" and add tests to exercise the above situations.
+ are related to "@" and add tests to exercise the above situations using
+ for loop to avoid repetition.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
@@ refs/files-backend.c: static int files_fsck_refs_name(struct ref_store *ref_stor
goto cleanup;
- if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
++ /*
++ * This works right now because we never check the root refs.
++ */
+ strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
+ if (check_refname_format(sb.buf, 0)) {
struct fsck_ref_report report = { 0 };
@@ refs/files-backend.c: static int files_fsck_refs_name(struct ref_store *ref_stor
## t/t0602-reffiles-fsck.sh ##
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'ref name should be checked' '
- git tag tag-2 &&
- git tag multi_hierarchy/tag-2 &&
+ cd repo &&
-+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
-+ git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ EOF
-+ test_must_be_empty err &&
-+ rm $branch_dir_prefix/@ &&
-+
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
-@@ t/t0602-reffiles-fsck.sh: test_expect_success 'ref name should be checked' '
- rm $branch_dir_prefix/.branch-1 &&
- test_cmp expect err &&
+ git commit --allow-empty -m initial &&
+- git checkout -b branch-1 &&
+- git tag tag-1 &&
+- git commit --allow-empty -m second &&
+- git checkout -b branch-2 &&
+- git tag tag-2 &&
+- git tag multi_hierarchy/tag-2 &&
++ git checkout -b default-branch &&
++ git tag default-tag &&
++ git tag multi_hierarchy/default-tag &&
+- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+- test_must_fail git refs verify 2>err &&
+- cat >expect <<-EOF &&
+- error: refs/heads/.branch-1: badRefName: invalid refname format
+- EOF
+- rm $branch_dir_prefix/.branch-1 &&
+- test_cmp expect err &&
+-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
-+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/'\'' branch-1'\'' &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
+- test_must_fail git refs verify 2>err &&
+- cat >expect <<-EOF &&
- error: refs/heads/@: badRefName: invalid refname format
-+ error: refs/heads/ branch-1: badRefName: invalid refname format
- EOF
-- rm $branch_dir_prefix/@ &&
-+ rm $branch_dir_prefix/'\'' branch-1'\'' &&
- test_cmp expect err &&
+- EOF
++ cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
++ git refs verify 2>err &&
++ test_must_be_empty err &&
+ rm $branch_dir_prefix/@ &&
+- test_cmp expect err &&
- cp $tag_dir_prefix/multi_hierarchy/tag-2 $tag_dir_prefix/multi_hierarchy/@ &&
-+ cp $tag_dir_prefix/multi_hierarchy/tag-2 $tag_dir_prefix/multi_hierarchy/'\''~tag-2'\'' &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
+- test_must_fail git refs verify 2>err &&
+- cat >expect <<-EOF &&
- error: refs/tags/multi_hierarchy/@: badRefName: invalid refname format
-+ error: refs/tags/multi_hierarchy/~tag-2: badRefName: invalid refname format
- EOF
+- EOF
- rm $tag_dir_prefix/multi_hierarchy/@ &&
-+ rm $tag_dir_prefix/multi_hierarchy/'\''~tag-2'\'' &&
- test_cmp expect err &&
+- test_cmp expect err &&
+-
+- cp $tag_dir_prefix/tag-1 $tag_dir_prefix/tag-1.lock &&
++ cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
+ git refs verify 2>err &&
+ rm $tag_dir_prefix/tag-1.lock &&
+ test_must_be_empty err &&
- cp $tag_dir_prefix/tag-1 $tag_dir_prefix/tag-1.lock &&
-@@ t/t0602-reffiles-fsck.sh: test_expect_success 'ref name should be checked' '
+- cp $tag_dir_prefix/tag-1 $tag_dir_prefix/.lock &&
++ cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
error: refs/tags/.lock: badRefName: invalid refname format
EOF
rm $tag_dir_prefix/.lock &&
+- test_cmp expect err
+ test_cmp expect err &&
+
-+ mkdir $tag_dir_prefix/'\''~new-feature'\'' &&
-+ cp $tag_dir_prefix/tag-1 $tag_dir_prefix/'\''~new-feature'\''/tag-1 &&
-+ test_must_fail git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ error: refs/tags/~new-feature/tag-1: badRefName: invalid refname format
-+ EOF
-+ rm -rf $tag_dir_prefix/'\''~new-feature'\'' &&
- test_cmp expect err
++ for refname in ".refname-starts-with-dot" "~refname-has-stride"
++ do
++ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ error: refs/heads/$refname: badRefName: invalid refname format
++ EOF
++ rm "$branch_dir_prefix/$refname" &&
++ test_cmp expect err || return 1
++ done &&
++
++ for refname in ".refname-starts-with-dot" "~refname-has-stride"
++ do
++ cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ error: refs/tags/$refname: badRefName: invalid refname format
++ EOF
++ rm "$tag_dir_prefix/$refname" &&
++ test_cmp expect err || return 1
++ done &&
++
++ for refname in ".refname-starts-with-dot" "~refname-has-stride"
++ do
++ cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
++ EOF
++ rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
++ test_cmp expect err || return 1
++ done &&
++
++ for refname in ".refname-starts-with-dot" "~refname-has-stride"
++ do
++ mkdir "$branch_dir_prefix/$refname" &&
++ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ error: refs/heads/$refname/default-branch: badRefName: invalid refname format
++ EOF
++ rm -r "$branch_dir_prefix/$refname" &&
++ test_cmp expect err || return 1
++ done
'
+ test_expect_success 'ref name check should be adapted into fsck messages' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+- tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ git commit --allow-empty -m initial &&
+ git checkout -b branch-1 &&
+- git tag tag-1 &&
+- git commit --allow-empty -m second &&
+- git checkout -b branch-2 &&
+- git tag tag-2 &&
+
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
+ git -c fsck.badRefName=warn refs verify 2>err &&
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'ref name check should be adapted into fsck messages' '
rm $branch_dir_prefix/.branch-1 &&
test_cmp expect err &&
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
-+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/'\''~branch-1'\'' &&
++ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
git -c fsck.badRefName=ignore refs verify 2>err &&
test_must_be_empty err
'
3: 96144756fe ! 3: 5ea7d18203 ref: initialize target name outside of check functions
@@ Metadata
Author: shejialuo <shejialuo@gmail.com>
## Commit message ##
- ref: initialize target name outside of check functions
+ ref: initialize ref name outside of check functions
We passes "refs_check_dir" to the "files_fsck_refs_name" function which
allows it to create the checked ref name later. However, when we
- introduce a new check function, we have to re-calculate the target name.
- It's bad for us to do repeat calculation. Instead, we should calculate
- it only once and pass the target name to the check functions.
+ introduce a new check function, we have to allocate redundant memory and
+ re-calculate the ref name. It's bad for us to allocate redundant memory
+ and duplicate logic. Instead, we should allocate and calculate it only
+ once and pass the ref name to the check functions.
In order not to do repeat calculation, rename "refs_check_dir" to
- "target_name". And in "files_fsck_refs_dir", create a new strbuf
- "target_name", thus whenever we handle a new target, calculate the
- name and call the check functions one by one.
+ "refname". And in "files_fsck_refs_dir", create a new strbuf "refname",
+ thus whenever we handle a new ref, calculate the name and call the check
+ functions one by one.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
@@ refs/files-backend.c: static int files_ref_store_remove_on_disk(struct ref_store
typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
struct fsck_options *o,
- const char *refs_check_dir,
-+ const char *target_name,
++ const char *refname,
struct dir_iterator *iter);
static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
struct fsck_options *o,
- const char *refs_check_dir,
-+ const char *target_name,
++ const char *refname,
struct dir_iterator *iter)
{
struct strbuf sb = STRBUF_INIT;
@@ refs/files-backend.c: static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
- if (iter->basename[0] != '.' && ends_with(iter->basename, ".lock"))
- goto cleanup;
-
+ /*
+ * This works right now because we never check the root refs.
+ */
- strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
- if (check_refname_format(sb.buf, 0)) {
-+ if (check_refname_format(target_name, 0)) {
++ if (check_refname_format(refname, 0)) {
struct fsck_ref_report report = { 0 };
- report.path = sb.buf;
-+ report.path = target_name;
++ report.path = refname;
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_NAME,
"invalid refname format");
@@ refs/files-backend.c: static int files_fsck_refs_dir(struct ref_store *ref_store
const char *refs_check_dir,
files_fsck_refs_fn *fsck_refs_fn)
{
-+ struct strbuf target_name = STRBUF_INIT;
++ struct strbuf refname = STRBUF_INIT;
struct strbuf sb = STRBUF_INIT;
struct dir_iterator *iter;
int iter_status;
@@ refs/files-backend.c: static int files_fsck_refs_dir(struct ref_store *ref_store
continue;
} else if (S_ISREG(iter->st.st_mode) ||
S_ISLNK(iter->st.st_mode)) {
-+ strbuf_reset(&target_name);
-+ strbuf_addf(&target_name, "%s/%s", refs_check_dir,
++ strbuf_reset(&refname);
++ strbuf_addf(&refname, "%s/%s", refs_check_dir,
+ iter->relative_path);
+
if (o->verbose)
- fprintf_ln(stderr, "Checking %s/%s",
- refs_check_dir, iter->relative_path);
-+ fprintf_ln(stderr, "Checking %s", target_name.buf);
++ fprintf_ln(stderr, "Checking %s", refname.buf);
+
for (size_t i = 0; fsck_refs_fn[i]; i++) {
- if (fsck_refs_fn[i](ref_store, o, refs_check_dir, iter))
-+ if (fsck_refs_fn[i](ref_store, o, target_name.buf, iter))
++ if (fsck_refs_fn[i](ref_store, o, refname.buf, iter))
ret = -1;
}
} else {
@@ refs/files-backend.c: static int files_fsck_refs_dir(struct ref_store *ref_store
out:
strbuf_release(&sb);
-+ strbuf_release(&target_name);
++ strbuf_release(&refname);
return ret;
}
4: b396bf6bc2 ! 4: cb4669b64d ref: support multiple worktrees check for refs
@@ Commit message
ref: support multiple worktrees check for refs
We have already set up the infrastructure to check the consistency for
- refs, but we do not support multiple worktrees. As we decide to add more
- checks for ref content, we need to set up support for multiple
- worktrees.
+ refs, but we do not support multiple worktrees. However, "git-fsck(1)"
+ will check the refs of worktrees. As we decide to get feature parity
+ with "git-fsck(1)", we need to set up support for multiple worktrees.
Because each worktree has its own specific refs, instead of just showing
the users "refs/worktree/foo", we need to display the full name such as
@@ builtin/refs.c: static int cmd_refs_migrate(int argc, const char **argv, const c
static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
{
struct fsck_options fsck_refs_options = FSCK_REFS_OPTIONS_DEFAULT;
-+ struct worktree **worktrees, **p;
++ struct worktree **worktrees;
const char * const verify_usage[] = {
REFS_VERIFY_USAGE,
NULL,
@@ builtin/refs.c: static int cmd_refs_verify(int argc, const char **argv, const ch
- ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options);
+ worktrees = get_worktrees();
-+ for (p = worktrees; *p; p++) {
-+ struct worktree *wt = *p;
-+ ret |= refs_fsck(get_worktree_ref_store(wt), &fsck_refs_options, wt);
-+ }
-+
++ for (size_t i = 0; worktrees[i]; i++)
++ ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
++ &fsck_refs_options, worktrees[i]);
fsck_options_clear(&fsck_refs_options);
+ free_worktrees(worktrees);
@@ refs/files-backend.c: static int files_fsck_refs_name(struct ref_store *ref_stor
+ struct worktree *wt,
files_fsck_refs_fn *fsck_refs_fn)
{
- struct strbuf target_name = STRBUF_INIT;
+ struct strbuf refname = STRBUF_INIT;
@@ refs/files-backend.c: static int files_fsck_refs_dir(struct ref_store *ref_store,
} else if (S_ISREG(iter->st.st_mode) ||
S_ISLNK(iter->st.st_mode)) {
- strbuf_reset(&target_name);
+ strbuf_reset(&refname);
+
+ if (!is_main_worktree(wt))
-+ strbuf_addf(&target_name, "worktrees/%s/", wt->id);
- strbuf_addf(&target_name, "%s/%s", refs_check_dir,
++ strbuf_addf(&refname, "worktrees/%s/", wt->id);
+ strbuf_addf(&refname, "%s/%s", refs_check_dir,
iter->relative_path);
@@ refs/files-backend.c: static int files_fsck_refs_dir(struct ref_store *ref_store,
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'ref name check should be adapted
+ sort err >sorted_err &&
+ test_cmp expect sorted_err &&
+
-+ (
-+ cd worktree-1 &&
-+ test_must_fail git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
-+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
-+ EOF
-+ sort err >sorted_err &&
-+ test_cmp expect sorted_err
-+ ) &&
-+
-+ (
-+ cd worktree-2 &&
-+ test_must_fail git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
-+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
-+ EOF
-+ sort err >sorted_err &&
-+ test_cmp expect sorted_err
-+ )
++ for worktree in "worktree-1" "worktree-2"
++ do
++ (
++ cd $worktree &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
++ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
++ EOF
++ sort err >sorted_err &&
++ test_cmp expect sorted_err || return 1
++ )
++ done
+'
+
test_done
5: 6a9e297dfc ! 5: 4e1add6465 ref: port git-fsck(1) regular refs check for files backend
@@ fsck.h: enum fsck_msg_type {
## refs/files-backend.c ##
@@ refs/files-backend.c: typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
- const char *target_name,
+ const char *refname,
struct dir_iterator *iter);
+static int files_fsck_refs_content(struct ref_store *ref_store,
@@ refs/files-backend.c: typedef int (*files_fsck_refs_fn)(struct ref_store *ref_st
+ if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_CONTENT,
-+ "cannot read ref file '%s': (%s)",
++ "cannot read ref file '%s': %s",
+ iter->path.buf, strerror(errno));
+ goto cleanup;
+ }
@@ refs/files-backend.c: typedef int (*files_fsck_refs_fn)(struct ref_store *ref_st
+
static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
struct fsck_options *o,
- const char *target_name,
+ const char *refname,
@@ refs/files-backend.c: static int files_fsck_refs(struct ref_store *ref_store,
{
files_fsck_refs_fn fsck_refs_fn[]= {
@@ refs/files-backend.c: static int files_fsck_refs(struct ref_store *ref_store,
## t/t0602-reffiles-fsck.sh ##
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'ref name check should work for multiple worktrees' '
- )
+ done
'
+test_expect_success 'regular ref content should be checked (individual)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
-+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'ref name check should work for mu
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
-+ bad_content=$(git rev-parse main)x &&
-+ printf "%s" $bad_content >$tag_dir_prefix/tag-bad-1 &&
-+ test_must_fail git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ error: refs/tags/tag-bad-1: badRefContent: $bad_content
-+ EOF
-+ rm $tag_dir_prefix/tag-bad-1 &&
-+ test_cmp expect err &&
-+
-+ bad_content=xfsazqfxcadas &&
-+ printf "%s" $bad_content >$tag_dir_prefix/tag-bad-2 &&
-+ test_must_fail git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ error: refs/tags/tag-bad-2: badRefContent: $bad_content
-+ EOF
-+ rm $tag_dir_prefix/tag-bad-2 &&
-+ test_cmp expect err &&
-+
-+ bad_content=Xfsazqfxcadas &&
-+ printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
-+ test_must_fail git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
-+ EOF
-+ rm $branch_dir_prefix/a/b/branch-bad &&
-+ test_cmp expect err
++ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
++ do
++ printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ error: refs/heads/branch-bad: badRefContent: $bad_content
++ EOF
++ rm $branch_dir_prefix/branch-bad &&
++ test_cmp expect err || return 1
++ done &&
++
++ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
++ do
++ printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
++ EOF
++ rm $branch_dir_prefix/a/b/branch-bad &&
++ test_cmp expect err || return 1
++ done
+'
+
+test_expect_success 'regular ref content should be checked (aggregate)' '
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'ref name check should work for mu
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
-+ bad_content_1=$(git rev-parse HEAD)x &&
-+ bad_content_2=xfsazqfxcadas &&
-+ bad_content_3=Xfsazqfxcadas &&
-+
-+ printf "%s" $bad_content_1 >$worktree1_refdir_prefix/bad-branch-1 &&
-+ test_must_fail git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content_1
-+ EOF
-+ rm $worktree1_refdir_prefix/bad-branch-1 &&
-+ test_cmp expect err &&
-+
-+ printf "%s" $bad_content_2 >$worktree2_refdir_prefix/bad-branch-2 &&
-+ test_must_fail git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content_2
-+ EOF
-+ rm $worktree2_refdir_prefix/bad-branch-2 &&
-+ test_cmp expect err &&
-+
-+ printf "%s" $bad_content_3 >$worktree1_refdir_prefix/bad-branch-3 &&
-+ test_must_fail git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ error: worktrees/worktree-1/refs/worktree/bad-branch-3: badRefContent: $bad_content_3
-+ EOF
-+ rm $worktree1_refdir_prefix/bad-branch-3 &&
-+ test_cmp expect err
++ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
++ do
++ printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
++ EOF
++ rm $worktree1_refdir_prefix/bad-branch-1 &&
++ test_cmp expect err || return 1
++ done &&
++
++ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
++ do
++ printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
++ EOF
++ rm $worktree2_refdir_prefix/bad-branch-2 &&
++ test_cmp expect err || return 1
++ done
+'
+
test_done
6: 7eea024182 ! 6: 945322fab7 ref: add more strict checks for regular refs
@@ refs/refs-internal.h: struct ref_store {
## t/t0602-reffiles-fsck.sh ##
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'regular ref content should be checked (individual)' '
- error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
- EOF
- rm $branch_dir_prefix/a/b/branch-bad &&
-+ test_cmp expect err &&
+ EOF
+ rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err || return 1
+- done
++ done &&
+
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'regular ref content should be che
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
-+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
-+ git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
-+ EOF
-+ rm $branch_dir_prefix/branch-garbage &&
-+ test_cmp expect err &&
-+
-+ printf "%s\n\n\n" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-1 &&
++ for trailing_content in " garbage" " more garbage"
++ do
++ printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
++ git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
++ EOF
++ rm $branch_dir_prefix/branch-garbage &&
++ test_cmp expect err || return 1
++ done &&
++
++ printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
-+ warning: refs/tags/tag-garbage-1: trailingRefContent: has trailing garbage: '\''
++ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+
+
+ '\''
+ EOF
-+ rm $tag_dir_prefix/tag-garbage-1 &&
++ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err &&
+
-+ printf "%s\n\n\n garbage" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-2 &&
++ printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
-+ warning: refs/tags/tag-garbage-2: trailingRefContent: has trailing garbage: '\''
++ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+
+
+ garbage'\''
+ EOF
-+ rm $tag_dir_prefix/tag-garbage-2 &&
-+ test_cmp expect err &&
-+
-+ printf "%s garbage\na" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-3 &&
-+ git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ warning: refs/tags/tag-garbage-3: trailingRefContent: has trailing garbage: '\'' garbage
-+ a'\''
-+ EOF
-+ rm $tag_dir_prefix/tag-garbage-3 &&
-+ test_cmp expect err &&
-+
-+ printf "%s garbage" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-4 &&
-+ test_must_fail git -c fsck.trailingRefContent=error refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ error: refs/tags/tag-garbage-4: trailingRefContent: has trailing garbage: '\'' garbage'\''
-+ EOF
-+ rm $tag_dir_prefix/tag-garbage-4 &&
- test_cmp expect err
++ rm $branch_dir_prefix/branch-garbage-special &&
++ test_cmp expect err
'
+ test_expect_success 'regular ref content should be checked (aggregate)' '
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'regular ref content should be checked (aggregate)' '
printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'regular ref content should be che
sort err >sorted_err &&
test_cmp expect sorted_err
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'ref content checks should work with worktrees' '
- error: worktrees/worktree-1/refs/worktree/bad-branch-3: badRefContent: $bad_content_3
- EOF
- rm $worktree1_refdir_prefix/bad-branch-3 &&
-+ test_cmp expect err &&
+ EOF
+ rm $worktree2_refdir_prefix/bad-branch-2 &&
+ test_cmp expect err || return 1
+- done
++ done &&
+
+ printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'ref content checks should work wi
+ warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $worktree1_refdir_prefix/branch-no-newline &&
- test_cmp expect err
++ test_cmp expect err
'
+ test_done
7: 1bf36dd644 ! 7: 3006eb9431 ref: add basic symref content check for files backend
@@ fsck.h: enum fsck_msg_type {
## refs/files-backend.c ##
@@ refs/files-backend.c: typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
- const char *target_name,
+ const char *refname,
struct dir_iterator *iter);
+static int files_fsck_symref_target(struct fsck_options *o,
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'regular ref content should be che
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
-+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
-+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
-+ git refs verify 2>err &&
-+ rm $branch_dir_prefix/branch-good &&
-+ test_must_be_empty err &&
++ for good_referent in "refs/heads/branch" "HEAD"
++ do
++ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
++ git refs verify 2>err &&
++ rm $branch_dir_prefix/branch-good &&
++ test_must_be_empty err || return 1
++ done &&
+
-+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
-+ git refs verify 2>err &&
-+ rm $branch_dir_prefix/branch-head &&
-+ test_must_be_empty err &&
++ for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
++ do
++ printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
++ EOF
++ rm $branch_dir_prefix/branch-bad &&
++ test_cmp expect err || return 1
++ done &&
+
-+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
++ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
-+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
++ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
-+ rm $branch_dir_prefix/branch-no-newline-1 &&
++ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'regular ref content should be che
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-complicated &&
-+ test_cmp expect err &&
-+
-+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
-+ test_must_fail git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
-+ EOF
-+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err
+'
+
8: 1d200f2ade ! 8: c59d003d78 ref: check whether the target of the symref is a ref
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'textual symref content should be
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
-+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-good &&
-+ git refs verify 2>err &&
-+ rm $branch_dir_prefix/branch-good &&
-+ test_must_be_empty err &&
++ for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
++ do
++ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
++ git refs verify 2>err &&
++ rm $branch_dir_prefix/branch-good &&
++ test_must_be_empty err || return 1
++ done &&
+
-+ printf "ref: refs/foo\n" >$branch_dir_prefix/branch-good &&
-+ git refs verify 2>err &&
-+ rm $branch_dir_prefix/branch-good &&
-+ test_must_be_empty err &&
-+
-+ printf "ref: refs-back/heads/main\n" >$branch_dir_prefix/branch-bad-1 &&
-+ git refs verify 2>err &&
-+ cat >expect <<-EOF &&
-+ warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''refs-back/heads/main'\''
-+ EOF
-+ rm $branch_dir_prefix/branch-bad-1 &&
-+ test_cmp expect err
++ for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
++ do
++ printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
++ git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
++ EOF
++ rm $branch_dir_prefix/branch-bad-1 &&
++ test_cmp expect err || return 1
++ done
+'
+
test_expect_success 'ref content checks should work with worktrees' '
9: 752f0ad22e ! 9: bb6d7f3323 ref: add symlink ref content check for files backend
@@ refs/files-backend.c: static int files_fsck_refs_content(struct ref_store *ref_s
## t/t0602-reffiles-fsck.sh ##
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'the target of the textual symref should be checked' '
- test_cmp expect err
+ done
'
+test_expect_success SYMLINKS 'symlink symref content should be checked' '
--
2.47.0
* [PATCH v7 1/9] ref: initialize "fsck_ref_report" with zero
2024-11-10 12:07 ` [PATCH v7 " shejialuo
@ 2024-11-10 12:09 ` shejialuo
2024-11-10 12:09 ` [PATCH v7 2/9] ref: check the full refname instead of basename shejialuo
` (9 subsequent siblings)
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-10 12:09 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
In "fsck.c::fsck_refs_error_function", we need to tell whether "oid" and
"referent" are NULL. So, we need to always initialize these members to
NULL instead of leaving them uninitialized when creating a new
"fsck_ref_report" structure.
The original code explicitly initializes the "path" member in the
"struct fsck_ref_report" to NULL (which implicitly 0-initializes other
members in the struct). It is more customary to use "{ 0 }" to express
that we are 0-initializing everything. In order to align with the
codebase, initialize "fsck_ref_report" with zero.
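
As an illustration of the point above, here is a minimal sketch showing
that both spellings leave every member zeroed. The struct
"demo_ref_report" and the helper names are hypothetical stand-ins, not
the real "struct fsck_ref_report"; only the member names are taken from
this message, and their types are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for "struct fsck_ref_report". The member names
 * follow the commit message; the types are assumptions.
 */
struct demo_ref_report {
	const char *path;
	const void *oid;
	const char *referent;
};

/* Returns 1 when every pointer member of the report is NULL. */
static int demo_report_is_zeroed(const struct demo_ref_report *r)
{
	return !r->path && !r->oid && !r->referent;
}

/* Old spelling: name one member, the rest are zeroed implicitly. */
static struct demo_ref_report demo_report_designated(void)
{
	struct demo_ref_report report = { .path = NULL };
	return report;
}

/* Customary spelling: "{ 0 }" states the intent directly. */
static struct demo_ref_report demo_report_zero(void)
{
	struct demo_ref_report report = { 0 };
	return report;
}
```

Both helpers return a report whose "oid" and "referent" are NULL, so the
change is about readability and convention rather than behavior.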
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/files-backend.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0824c0b8a9..03d2503276 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3520,7 +3520,7 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
goto cleanup;
if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
- struct fsck_ref_report report = { .path = NULL };
+ struct fsck_ref_report report = { 0 };
strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
report.path = sb.buf;
--
2.47.0
* [PATCH v7 2/9] ref: check the full refname instead of basename
2024-11-10 12:07 ` [PATCH v7 " shejialuo
2024-11-10 12:09 ` [PATCH v7 1/9] ref: initialize "fsck_ref_report" with zero shejialuo
@ 2024-11-10 12:09 ` shejialuo
2024-11-10 12:09 ` [PATCH v7 3/9] ref: initialize ref name outside of check functions shejialuo
` (8 subsequent siblings)
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-10 12:09 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
In "files-backend.c::files_fsck_refs_name", we validate the refname
format by using "check_refname_format" to check the basename of the
iterator with "REFNAME_ALLOW_ONELEVEL" flag.
However, this is a bad implementation. Although we do not allow a
single "@" in the ".git" directory, we do allow "refs/heads/@". So, by
checking only the one-level refname "@", we wrongly report an error when
there is a "refs/heads/@" ref.
Because we only check the one-level refname, we also cannot check the
other parts of the full refname. As a result, we miss errors such as
the following:
"refs/heads/ new-feature/test"
"refs/heads/~new-feature/test"
In order to fix the above problem, enhance "files_fsck_refs_name" to use
the full name for "check_refname_format". Then, replace the tests which
are related to "@" and add tests to exercise the above situations using
a for loop to avoid repetition.
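
The difference between the two approaches can be sketched with a
simplified checker. Note that "demo_check_refname" and "demo_basename"
are hypothetical helpers written for this illustration; they implement
only a few of the rules and are not "check_refname_format".

```c
#include <assert.h>
#include <string.h>

/*
 * Simplified, hypothetical refname check (NOT git's
 * check_refname_format): rejects a name that is exactly "@", and any
 * '/'-separated component that is empty, starts with '.', or contains
 * ' ' or '~'. Returns 0 when the name is acceptable, -1 otherwise.
 */
static int demo_check_refname(const char *name)
{
	const char *p = name;

	if (!strcmp(name, "@"))
		return -1;

	while (1) {
		size_t len = strcspn(p, "/");

		if (!len || p[0] == '.')
			return -1;
		for (size_t i = 0; i < len; i++)
			if (p[i] == ' ' || p[i] == '~')
				return -1;
		if (!p[len])
			return 0;
		p += len + 1;
	}
}

/* Returns the part after the last '/', or the whole name if none. */
static const char *demo_basename(const char *name)
{
	const char *slash = strrchr(name, '/');
	return slash ? slash + 1 : name;
}
```

With these helpers, checking the basename of "refs/heads/@" fails while
the full name passes, and checking the basename of
"refs/heads/~new-feature/test" passes while the full name fails. These
are the two kinds of wrong results described above.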
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/files-backend.c | 7 ++-
t/t0602-reffiles-fsck.sh | 92 ++++++++++++++++++++++++----------------
2 files changed, 60 insertions(+), 39 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 03d2503276..b055edc061 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3519,10 +3519,13 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
if (iter->basename[0] != '.' && ends_with(iter->basename, ".lock"))
goto cleanup;
- if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
+ /*
+ * This works right now because we never check the root refs.
+ */
+ strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
+ if (check_refname_format(sb.buf, 0)) {
struct fsck_ref_report report = { 0 };
- strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
report.path = sb.buf;
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_NAME,
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 71a4d1a5ae..2a172c913d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -18,63 +18,81 @@ test_expect_success 'ref name should be checked' '
cd repo &&
git commit --allow-empty -m initial &&
- git checkout -b branch-1 &&
- git tag tag-1 &&
- git commit --allow-empty -m second &&
- git checkout -b branch-2 &&
- git tag tag-2 &&
- git tag multi_hierarchy/tag-2 &&
+ git checkout -b default-branch &&
+ git tag default-tag &&
+ git tag multi_hierarchy/default-tag &&
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/.branch-1: badRefName: invalid refname format
- EOF
- rm $branch_dir_prefix/.branch-1 &&
- test_cmp expect err &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/@: badRefName: invalid refname format
- EOF
+ cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
rm $branch_dir_prefix/@ &&
- test_cmp expect err &&
- cp $tag_dir_prefix/multi_hierarchy/tag-2 $tag_dir_prefix/multi_hierarchy/@ &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/multi_hierarchy/@: badRefName: invalid refname format
- EOF
- rm $tag_dir_prefix/multi_hierarchy/@ &&
- test_cmp expect err &&
-
- cp $tag_dir_prefix/tag-1 $tag_dir_prefix/tag-1.lock &&
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
git refs verify 2>err &&
rm $tag_dir_prefix/tag-1.lock &&
test_must_be_empty err &&
- cp $tag_dir_prefix/tag-1 $tag_dir_prefix/.lock &&
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
error: refs/tags/.lock: badRefName: invalid refname format
EOF
rm $tag_dir_prefix/.lock &&
- test_cmp expect err
+ test_cmp expect err &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname: badRefName: invalid refname format
+ EOF
+ rm "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ mkdir "$branch_dir_prefix/$refname" &&
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ EOF
+ rm -r "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done
'
test_expect_success 'ref name check should be adapted into fsck messages' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- tag_dir_prefix=.git/refs/tags &&
cd repo &&
git commit --allow-empty -m initial &&
git checkout -b branch-1 &&
- git tag tag-1 &&
- git commit --allow-empty -m second &&
- git checkout -b branch-2 &&
- git tag tag-2 &&
cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
git -c fsck.badRefName=warn refs verify 2>err &&
@@ -84,7 +102,7 @@ test_expect_success 'ref name check should be adapted into fsck messages' '
rm $branch_dir_prefix/.branch-1 &&
test_cmp expect err &&
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
git -c fsck.badRefName=ignore refs verify 2>err &&
test_must_be_empty err
'
--
2.47.0
* [PATCH v7 3/9] ref: initialize ref name outside of check functions
2024-11-10 12:07 ` [PATCH v7 " shejialuo
2024-11-10 12:09 ` [PATCH v7 1/9] ref: initialize "fsck_ref_report" with zero shejialuo
2024-11-10 12:09 ` [PATCH v7 2/9] ref: check the full refname instead of basename shejialuo
@ 2024-11-10 12:09 ` shejialuo
2024-11-10 12:09 ` [PATCH v7 4/9] ref: support multiple worktrees check for refs shejialuo
` (7 subsequent siblings)
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-10 12:09 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We pass "refs_check_dir" to the "files_fsck_refs_name" function which
allows it to create the checked ref name later. However, when we
introduce a new check function, we have to allocate redundant memory and
re-calculate the ref name. It's bad for us to allocate redundant memory
and duplicate logic. Instead, we should allocate and calculate it only
once and pass the ref name to the check functions.
To avoid repeating the calculation, rename the "refs_check_dir"
parameter of the check functions to "refname". And in
"files_fsck_refs_dir", create a new strbuf "refname", so that whenever
we handle a new ref, we calculate the name once and call the check
functions one by one.
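
The resulting pattern can be sketched as follows: the caller builds the
name once and hands the same buffer to every check. All names here
("demo_check_fn", "demo_check_ref" and the two checks) are hypothetical,
and a fixed-size buffer stands in for the strbuf used by the real code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical check signature: receives the already-built refname. */
typedef int (*demo_check_fn)(const char *refname);

/* Fails when the last path component starts with '.'. */
static int demo_check_not_hidden(const char *refname)
{
	const char *slash = strrchr(refname, '/');
	const char *base = slash ? slash + 1 : refname;
	return base[0] == '.' ? -1 : 0;
}

/* Fails when the name contains '~' anywhere. */
static int demo_check_no_tilde(const char *refname)
{
	return strchr(refname, '~') ? -1 : 0;
}

/*
 * Builds "<dir>/<relative_path>" once, then runs every check in the
 * NULL-terminated array on that single buffer. Returns 0 if all checks
 * pass, -1 otherwise.
 */
static int demo_check_ref(const char *dir, const char *relative_path,
			  demo_check_fn *checks)
{
	char refname[256];
	int ret = 0;
	int n = snprintf(refname, sizeof(refname), "%s/%s", dir,
			 relative_path);

	if (n < 0 || (size_t)n >= sizeof(refname))
		return -1; /* name does not fit in the demo buffer */

	for (size_t i = 0; checks[i]; i++)
		if (checks[i](refname))
			ret = -1;
	return ret;
}
```

Adding another check then means appending one function to the array; no
check needs to rebuild the name or allocate its own buffer.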
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
refs/files-backend.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b055edc061..8edb700568 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3501,12 +3501,12 @@ static int files_ref_store_remove_on_disk(struct ref_store *ref_store,
*/
typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
struct fsck_options *o,
- const char *refs_check_dir,
+ const char *refname,
struct dir_iterator *iter);
static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
struct fsck_options *o,
- const char *refs_check_dir,
+ const char *refname,
struct dir_iterator *iter)
{
struct strbuf sb = STRBUF_INIT;
@@ -3522,11 +3522,10 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
/*
* This works right now because we never check the root refs.
*/
- strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
- if (check_refname_format(sb.buf, 0)) {
+ if (check_refname_format(refname, 0)) {
struct fsck_ref_report report = { 0 };
- report.path = sb.buf;
+ report.path = refname;
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_NAME,
"invalid refname format");
@@ -3542,6 +3541,7 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
const char *refs_check_dir,
files_fsck_refs_fn *fsck_refs_fn)
{
+ struct strbuf refname = STRBUF_INIT;
struct strbuf sb = STRBUF_INIT;
struct dir_iterator *iter;
int iter_status;
@@ -3560,11 +3560,15 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
continue;
} else if (S_ISREG(iter->st.st_mode) ||
S_ISLNK(iter->st.st_mode)) {
+ strbuf_reset(&refname);
+ strbuf_addf(&refname, "%s/%s", refs_check_dir,
+ iter->relative_path);
+
if (o->verbose)
- fprintf_ln(stderr, "Checking %s/%s",
- refs_check_dir, iter->relative_path);
+ fprintf_ln(stderr, "Checking %s", refname.buf);
+
for (size_t i = 0; fsck_refs_fn[i]; i++) {
- if (fsck_refs_fn[i](ref_store, o, refs_check_dir, iter))
+ if (fsck_refs_fn[i](ref_store, o, refname.buf, iter))
ret = -1;
}
} else {
@@ -3581,6 +3585,7 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
out:
strbuf_release(&sb);
+ strbuf_release(&refname);
return ret;
}
--
2.47.0
* [PATCH v7 4/9] ref: support multiple worktrees check for refs
2024-11-10 12:07 ` [PATCH v7 " shejialuo
` (2 preceding siblings ...)
2024-11-10 12:09 ` [PATCH v7 3/9] ref: initialize ref name outside of check functions shejialuo
@ 2024-11-10 12:09 ` shejialuo
2024-11-10 12:09 ` [PATCH v7 5/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
` (6 subsequent siblings)
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-10 12:09 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We have already set up the infrastructure to check the consistency for
refs, but we do not support multiple worktrees. However, "git-fsck(1)"
will check the refs of worktrees. As we decide to get feature parity
with "git-fsck(1)", we need to set up support for multiple worktrees.
Because each worktree has its own specific refs, instead of just showing
the users "refs/worktree/foo", we need to display the full name such as
"worktrees/<id>/refs/worktree/foo". So we should know the id of the
worktree to get the full name. Add a new parameter "struct worktree *"
for "refs-internal.h::fsck_fn". Then change the related functions to
follow this new interface.
The "packed-refs" file only exists in the main worktree, so we should only
check "packed-refs" in the main worktree. Use "is_main_worktree" method
to skip checking "packed-refs" in "packed_fsck" function.
Then, enhance the "files-backend.c::files_fsck_refs_dir" function to add
"worktrees/<id>/" prefix when we are not in the main worktree.
Last, add a new test to check the refname when there are multiple
worktrees to exercise the code.
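
The naming rule described above can be sketched as follows. The struct
"demo_worktree" and the function "demo_full_refname" are hypothetical
and only model the prefixing; the real code uses "struct worktree",
"is_main_worktree" and a strbuf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for "struct worktree"; only what is needed. */
struct demo_worktree {
	const char *id; /* e.g. "worktree-1"; unused for the main worktree */
	int is_main;
};

/*
 * Writes the name to report into "out": "<dir>/<relative_path>" for the
 * main worktree, "worktrees/<id>/<dir>/<relative_path>" otherwise.
 * Returns 0 on success, -1 if the name does not fit into "out".
 */
static int demo_full_refname(const struct demo_worktree *wt,
			     const char *dir, const char *relative_path,
			     char *out, size_t outlen)
{
	int n;

	if (wt->is_main)
		n = snprintf(out, outlen, "%s/%s", dir, relative_path);
	else
		n = snprintf(out, outlen, "worktrees/%s/%s/%s", wt->id,
			     dir, relative_path);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

For a linked worktree with id "worktree-1", the ref "refs/worktree/foo"
is therefore reported as "worktrees/worktree-1/refs/worktree/foo", while
refs of the main worktree keep their plain names.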
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
builtin/refs.c | 10 ++++++--
refs.c | 5 ++--
refs.h | 3 ++-
refs/debug.c | 5 ++--
refs/files-backend.c | 17 ++++++++++----
refs/packed-backend.c | 8 ++++++-
refs/refs-internal.h | 3 ++-
refs/reftable-backend.c | 3 ++-
t/t0602-reffiles-fsck.sh | 51 ++++++++++++++++++++++++++++++++++++++++
9 files changed, 90 insertions(+), 15 deletions(-)
diff --git a/builtin/refs.c b/builtin/refs.c
index 24978a7b7b..394b4101c6 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -5,6 +5,7 @@
#include "parse-options.h"
#include "refs.h"
#include "strbuf.h"
+#include "worktree.h"
#define REFS_MIGRATE_USAGE \
N_("git refs migrate --ref-format=<format> [--dry-run]")
@@ -66,6 +67,7 @@ static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
{
struct fsck_options fsck_refs_options = FSCK_REFS_OPTIONS_DEFAULT;
+ struct worktree **worktrees;
const char * const verify_usage[] = {
REFS_VERIFY_USAGE,
NULL,
@@ -75,7 +77,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
OPT_BOOL(0, "strict", &fsck_refs_options.strict, N_("enable strict checking")),
OPT_END(),
};
- int ret;
+ int ret = 0;
argc = parse_options(argc, argv, prefix, options, verify_usage, 0);
if (argc)
@@ -84,9 +86,13 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
git_config(git_fsck_config, &fsck_refs_options);
prepare_repo_settings(the_repository);
- ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options);
+ worktrees = get_worktrees();
+ for (size_t i = 0; worktrees[i]; i++)
+ ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
+ &fsck_refs_options, worktrees[i]);
fsck_options_clear(&fsck_refs_options);
+ free_worktrees(worktrees);
return ret;
}
diff --git a/refs.c b/refs.c
index 5f729ed412..395a17273c 100644
--- a/refs.c
+++ b/refs.c
@@ -318,9 +318,10 @@ int check_refname_format(const char *refname, int flags)
return check_or_sanitize_refname(refname, flags, NULL);
}
-int refs_fsck(struct ref_store *refs, struct fsck_options *o)
+int refs_fsck(struct ref_store *refs, struct fsck_options *o,
+ struct worktree *wt)
{
- return refs->be->fsck(refs, o);
+ return refs->be->fsck(refs, o, wt);
}
void sanitize_refname_component(const char *refname, struct strbuf *out)
diff --git a/refs.h b/refs.h
index 108dfc93b3..341d43239c 100644
--- a/refs.h
+++ b/refs.h
@@ -549,7 +549,8 @@ int check_refname_format(const char *refname, int flags);
* reflogs are consistent, and non-zero otherwise. The errors will be
* written to stderr.
*/
-int refs_fsck(struct ref_store *refs, struct fsck_options *o);
+int refs_fsck(struct ref_store *refs, struct fsck_options *o,
+ struct worktree *wt);
/*
* Apply the rules from check_refname_format, but mutate the result until it
diff --git a/refs/debug.c b/refs/debug.c
index 45e2e784a0..72e80ddd6d 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -420,10 +420,11 @@ static int debug_reflog_expire(struct ref_store *ref_store, const char *refname,
}
static int debug_fsck(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
struct debug_ref_store *drefs = (struct debug_ref_store *)ref_store;
- int res = drefs->refs->be->fsck(drefs->refs, o);
+ int res = drefs->refs->be->fsck(drefs->refs, o, wt);
trace_printf_key(&trace_refs, "fsck: %d\n", res);
return res;
}
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 8edb700568..8bfdce64bc 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -23,6 +23,7 @@
#include "../dir.h"
#include "../chdir-notify.h"
#include "../setup.h"
+#include "../worktree.h"
#include "../wrapper.h"
#include "../write-or-die.h"
#include "../revision.h"
@@ -3539,6 +3540,7 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
static int files_fsck_refs_dir(struct ref_store *ref_store,
struct fsck_options *o,
const char *refs_check_dir,
+ struct worktree *wt,
files_fsck_refs_fn *fsck_refs_fn)
{
struct strbuf refname = STRBUF_INIT;
@@ -3561,6 +3563,9 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
} else if (S_ISREG(iter->st.st_mode) ||
S_ISLNK(iter->st.st_mode)) {
strbuf_reset(&refname);
+
+ if (!is_main_worktree(wt))
+ strbuf_addf(&refname, "worktrees/%s/", wt->id);
strbuf_addf(&refname, "%s/%s", refs_check_dir,
iter->relative_path);
@@ -3590,7 +3595,8 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
}
static int files_fsck_refs(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
files_fsck_refs_fn fsck_refs_fn[]= {
files_fsck_refs_name,
@@ -3599,17 +3605,18 @@ static int files_fsck_refs(struct ref_store *ref_store,
if (o->verbose)
fprintf_ln(stderr, _("Checking references consistency"));
- return files_fsck_refs_dir(ref_store, o, "refs", fsck_refs_fn);
+ return files_fsck_refs_dir(ref_store, o, "refs", wt, fsck_refs_fn);
}
static int files_fsck(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
struct files_ref_store *refs =
files_downcast(ref_store, REF_STORE_READ, "fsck");
- return files_fsck_refs(ref_store, o) |
- refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
+ return files_fsck_refs(ref_store, o, wt) |
+ refs->packed_ref_store->be->fsck(refs->packed_ref_store, o, wt);
}
struct ref_storage_be refs_be_files = {
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 07c57fd541..46dcaec654 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -13,6 +13,7 @@
#include "../lockfile.h"
#include "../chdir-notify.h"
#include "../statinfo.h"
+#include "../worktree.h"
#include "../wrapper.h"
#include "../write-or-die.h"
#include "../trace2.h"
@@ -1754,8 +1755,13 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
}
static int packed_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED)
+ struct fsck_options *o UNUSED,
+ struct worktree *wt)
{
+
+ if (!is_main_worktree(wt))
+ return 0;
+
return 0;
}
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 2313c830d8..037d7991cd 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -653,7 +653,8 @@ typedef int read_symbolic_ref_fn(struct ref_store *ref_store, const char *refnam
struct strbuf *referent);
typedef int fsck_fn(struct ref_store *ref_store,
- struct fsck_options *o);
+ struct fsck_options *o,
+ struct worktree *wt);
struct ref_storage_be {
const char *name;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index f5f957e6de..b6a63c1015 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -2443,7 +2443,8 @@ static int reftable_be_reflog_expire(struct ref_store *ref_store,
}
static int reftable_be_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED)
+ struct fsck_options *o UNUSED,
+ struct worktree *wt UNUSED)
{
return 0;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 2a172c913d..1e17393a3d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -107,4 +107,55 @@ test_expect_success 'ref name check should be adapted into fsck messages' '
test_must_be_empty err
'
+test_expect_success 'ref name check should work for multiple worktrees' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+
+ cd repo &&
+ test_commit initial &&
+ git checkout -b branch-1 &&
+ test_commit second &&
+ git checkout -b branch-2 &&
+ test_commit third &&
+ git checkout -b branch-3 &&
+ git worktree add ./worktree-1 branch-1 &&
+ git worktree add ./worktree-2 branch-2 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+
+ cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
+ cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err &&
+
+ for worktree in "worktree-1" "worktree-2"
+ do
+ (
+ cd $worktree &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err || return 1
+ )
+ done
+'
+
test_done
--
2.47.0
* [PATCH v7 5/9] ref: port git-fsck(1) regular refs check for files backend
2024-11-10 12:07 ` [PATCH v7 " shejialuo
` (3 preceding siblings ...)
2024-11-10 12:09 ` [PATCH v7 4/9] ref: support multiple worktrees check for refs shejialuo
@ 2024-11-10 12:09 ` shejialuo
2024-11-13 7:36 ` Patrick Steinhardt
2024-11-10 12:10 ` [PATCH v7 6/9] ref: add more strict checks for regular refs shejialuo
` (5 subsequent siblings)
10 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-11-10 12:09 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
"git-fsck(1)" implicitly checks the ref content by passing the
callback "fsck_handle_ref" to "refs.c::refs_for_each_rawref".
It then checks whether the ref content (eventually the "oid")
is valid. If not, it reports the following error to the user:
error: refs/heads/main: invalid sha1 pointer 0000...
It also wrongly reports the above error when there are dangling symrefs
in the repository. This does not align with the behavior of
the "git symbolic-ref" command, which allows users to create dangling
symrefs.
As we have already introduced the "git refs verify" command, we'd better
check the ref content explicitly in "git refs verify", so that
later we can remove these checks from "git-fsck(1)" and launch a
subprocess to call "git refs verify" from "git-fsck(1)", making
"git-fsck(1)" cleaner.
Following what "git-fsck(1)" does, add a similar check to "git refs
verify". Then add a new fsck error message "badRefContent(ERROR)" to
represent that a ref has invalid content.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 3 +
fsck.h | 1 +
refs/files-backend.c | 43 ++++++++++++++
t/t0602-reffiles-fsck.sh | 105 ++++++++++++++++++++++++++++++++++
4 files changed, 152 insertions(+)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 68a2801f15..22c385ea22 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -19,6 +19,9 @@
`badParentSha1`::
(ERROR) A commit object has a bad parent sha1.
+`badRefContent`::
+ (ERROR) A ref has bad content.
+
`badRefFiletype`::
(ERROR) A ref has a bad file type.
diff --git a/fsck.h b/fsck.h
index 500b4c04d2..0d99a87911 100644
--- a/fsck.h
+++ b/fsck.h
@@ -31,6 +31,7 @@ enum fsck_msg_type {
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
+ FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
FUNC(BAD_REF_NAME, ERROR) \
FUNC(BAD_TIMEZONE, ERROR) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 8bfdce64bc..2d126ecbbe 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3505,6 +3505,48 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
const char *refname,
struct dir_iterator *iter);
+static int files_fsck_refs_content(struct ref_store *ref_store,
+ struct fsck_options *o,
+ const char *target_name,
+ struct dir_iterator *iter)
+{
+ struct strbuf ref_content = STRBUF_INIT;
+ struct strbuf referent = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ unsigned int type = 0;
+ int failure_errno = 0;
+ struct object_id oid;
+ int ret = 0;
+
+ report.path = target_name;
+
+ if (S_ISLNK(iter->st.st_mode))
+ goto cleanup;
+
+ if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_CONTENT,
+ "cannot read ref file '%s': %s",
+ iter->path.buf, strerror(errno));
+ goto cleanup;
+ }
+
+ if (parse_loose_ref_contents(ref_store->repo->hash_algo,
+ ref_content.buf, &oid, &referent,
+ &type, &failure_errno)) {
+ strbuf_rtrim(&ref_content);
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_CONTENT,
+ "%s", ref_content.buf);
+ goto cleanup;
+ }
+
+cleanup:
+ strbuf_release(&ref_content);
+ strbuf_release(&referent);
+ return ret;
+}
+
static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
struct fsck_options *o,
const char *refname,
@@ -3600,6 +3642,7 @@ static int files_fsck_refs(struct ref_store *ref_store,
{
files_fsck_refs_fn fsck_refs_fn[]= {
files_fsck_refs_name,
+ files_fsck_refs_content,
NULL,
};
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 1e17393a3d..162370077b 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -158,4 +158,109 @@ test_expect_success 'ref name check should work for multiple worktrees' '
done
'
+test_expect_success 'regular ref content should be checked (individual)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
+
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err || return 1
+ done
+'
+
+test_expect_success 'regular ref content should be checked (aggregate)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ bad_content_1=$(git rev-parse main)x &&
+ bad_content_2=xfsazqfxcadas &&
+ bad_content_3=Xfsazqfxcadas &&
+ printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
+ printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
+ printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
+ error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
+ error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+'
+
+test_expect_success 'ref content checks should work with worktrees' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ EOF
+ rm $worktree1_refdir_prefix/bad-branch-1 &&
+ test_cmp expect err || return 1
+ done &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ EOF
+ rm $worktree2_refdir_prefix/bad-branch-2 &&
+ test_cmp expect err || return 1
+ done
+'
+
test_done
--
2.47.0
* Re: [PATCH v7 5/9] ref: port git-fsck(1) regular refs check for files backend
2024-11-10 12:09 ` [PATCH v7 5/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
@ 2024-11-13 7:36 ` Patrick Steinhardt
2024-11-14 12:09 ` shejialuo
0 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-11-13 7:36 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Sun, Nov 10, 2024 at 08:09:51PM +0800, shejialuo wrote:
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 8bfdce64bc..2d126ecbbe 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -3505,6 +3505,48 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
> const char *refname,
> struct dir_iterator *iter);
>
> +static int files_fsck_refs_content(struct ref_store *ref_store,
> + struct fsck_options *o,
> + const char *target_name,
> + struct dir_iterator *iter)
> +{
> + struct strbuf ref_content = STRBUF_INIT;
> + struct strbuf referent = STRBUF_INIT;
> + struct fsck_ref_report report = { 0 };
> + unsigned int type = 0;
> + int failure_errno = 0;
> + struct object_id oid;
> + int ret = 0;
> +
> + report.path = target_name;
> +
> + if (S_ISLNK(iter->st.st_mode))
> + goto cleanup;
> +
> + if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
> + ret = fsck_report_ref(o, &report,
> + FSCK_MSG_BAD_REF_CONTENT,
> + "cannot read ref file '%s': %s",
> + iter->path.buf, strerror(errno));
> + goto cleanup;
> + }
I didn't catch this in previous rounds, but it's a little dubious
whether we should report this as an actual fsck error. I can expect
multiple situations:
- The file has weird permissions and thus cannot be read, failing with
EPERM, which doesn't match well with BAD_REF_CONTENT.
- The file does not exist anymore because we were racing with a
concurrent writer, failing with ENOENT. This is benign and expected
to happen in busy repos, so generating an error here feels wrong.
- The file cannot be read at all due to an I/O error. This may be
reported with BAD_REF_CONTENT, but conflating this with the case
where we have actually bad content may not be the best idea.
So maybe we should ignore ENOENT, report bad permissions and otherwise
return an actual error to the caller?
Patrick
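The three cases in the review above can be sketched as a small errno
classifier. This is only an illustration of the suggestion, not code
from the series: the enum and the function name are hypothetical. Note
that a permission failure from open(2) normally surfaces as EACCES, so
the sketch treats EACCES and EPERM alike.

```c
#include <errno.h>

/* Hypothetical outcome categories; these names are not Git's. */
enum read_failure_action {
	READ_FAILURE_IGNORE,	/* file vanished: benign race with a writer */
	READ_FAILURE_REPORT,	/* unreadable: tell the user about it */
	READ_FAILURE_ERROR	/* anything else: hand a real error to the caller */
};

/* Map the errno left behind by a failed read to one of the actions. */
static enum read_failure_action classify_read_failure(int err)
{
	switch (err) {
	case ENOENT:
		return READ_FAILURE_IGNORE;
	case EACCES:
	case EPERM:
		return READ_FAILURE_REPORT;
	default:
		return READ_FAILURE_ERROR;
	}
}
```

Keeping the decision in one function makes it easy to see that an I/O
error such as EIO is never conflated with genuinely bad ref content.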
* Re: [PATCH v7 5/9] ref: port git-fsck(1) regular refs check for files backend
2024-11-13 7:36 ` Patrick Steinhardt
@ 2024-11-14 12:09 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-14 12:09 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Wed, Nov 13, 2024 at 08:36:12AM +0100, Patrick Steinhardt wrote:
> On Sun, Nov 10, 2024 at 08:09:51PM +0800, shejialuo wrote:
> > diff --git a/refs/files-backend.c b/refs/files-backend.c
> > index 8bfdce64bc..2d126ecbbe 100644
> > --- a/refs/files-backend.c
> > +++ b/refs/files-backend.c
> > @@ -3505,6 +3505,48 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
> > const char *refname,
> > struct dir_iterator *iter);
> >
> > +static int files_fsck_refs_content(struct ref_store *ref_store,
> > + struct fsck_options *o,
> > + const char *target_name,
> > + struct dir_iterator *iter)
> > +{
> > + struct strbuf ref_content = STRBUF_INIT;
> > + struct strbuf referent = STRBUF_INIT;
> > + struct fsck_ref_report report = { 0 };
> > + unsigned int type = 0;
> > + int failure_errno = 0;
> > + struct object_id oid;
> > + int ret = 0;
> > +
> > + report.path = target_name;
> > +
> > + if (S_ISLNK(iter->st.st_mode))
> > + goto cleanup;
> > +
> > + if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
> > + ret = fsck_report_ref(o, &report,
> > + FSCK_MSG_BAD_REF_CONTENT,
> > + "cannot read ref file '%s': %s",
> > + iter->path.buf, strerror(errno));
> > + goto cleanup;
> > + }
>
> I didn't catch this in previous rounds, but it's a little dubious
> whether we should report this as an actual fsck error. I can expect
> multiple situations:
>
> - The file has weird permissions and thus cannot be read, failing with
> EPERM, which doesn't match well with BAD_REF_CONTENT.
>
> - The file does not exist anymore because we were racing with a
> concurrent writer, failing with ENOENT. This is benign and expected
> to happen in busy repos, so generating an error here feels wrong.
>
> - The file cannot be read at all due to an I/O error. This may be
> reported with BAD_REF_CONTENT, but conflating this with the case
> where we have actually bad content may not be the best idea.
>
> So maybe we should ignore ENOENT, report bad permissions and otherwise
> return an actual error to the caller?
>
So, I think we should just use the "error_errno" function to report the
actual error to the caller. We also need to add some comments.
Thanks for this wonderful suggestion.
> Patrick
* [PATCH v7 6/9] ref: add more strict checks for regular refs
2024-11-10 12:07 ` [PATCH v7 " shejialuo
` (4 preceding siblings ...)
2024-11-10 12:09 ` [PATCH v7 5/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
@ 2024-11-10 12:10 ` shejialuo
2024-11-10 12:10 ` [PATCH v7 7/9] ref: add basic symref content check for files backend shejialuo
` (4 subsequent siblings)
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-10 12:10 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We already use the "parse_loose_ref_contents" function to check
whether the ref content is valid in the files backend. However, by
using "parse_loose_ref_contents", we allow the ref's content to end with
garbage or without a newline.
Even though we never create such loose refs ourselves, we have accepted
such loose refs. So, it is entirely possible that some third-party tools
may rely on such loose refs being valid. We should not report an fsck
error message for now. Instead, we should notify the users about such
"curiously formatted" loose refs so that adequate care is taken before
we decide to tighten the rules in the future.
It is not suitable to report an fsck warning message to the user either.
We don't yet want the "--strict" flag that controls this bit to end up
generating errors for such weirdly-formatted reference contents, as we
first want to assess whether this retroactive tightening will cause
issues for any tools out there. It may cause compatibility issues which
may break the repository. So, we add the following two fsck infos to
represent the situations where the ref content ends without a newline or
has trailing garbage:
1. refMissingNewline(INFO): A loose ref that does not end with
newline(LF).
2. trailingRefContent(INFO): A loose ref has trailing content.
It might appear that we can't provide the user with any warnings by
using FSCK_INFO. However, "fsck.c::fsck_vreport" converts
FSCK_INFO to FSCK_WARN, so we can still warn the user about these
situations when using "git refs verify" without introducing
compatibility issues.
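The distinction between the two infos can be sketched as a standalone
function over the text that follows the object ID. The enum and the
function name are hypothetical; only the two comparisons mirror the
checks added to "files_fsck_refs_content" in the diff below.

```c
#include <stddef.h>

/* Hypothetical result codes for this sketch; not Git identifiers. */
enum trailing_kind {
	TRAILING_OK,			/* exactly one LF after the object ID */
	TRAILING_MISSING_NEWLINE,	/* nothing at all after the object ID */
	TRAILING_GARBAGE		/* anything else after the object ID */
};

/* "trailing" points at the first byte after the parsed object ID. */
static enum trailing_kind classify_trailing(const char *trailing)
{
	if (!*trailing)
		return TRAILING_MISSING_NEWLINE;
	if (*trailing != '\n' || *(trailing + 1))
		return TRAILING_GARBAGE;
	return TRAILING_OK;
}
```

Note that a run of several newlines counts as trailing content here,
which is the case exercised by the "branch-garbage-special" test.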
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 14 +++++++++
fsck.h | 2 ++
refs.c | 2 +-
refs/files-backend.c | 26 ++++++++++++++--
refs/refs-internal.h | 2 +-
t/t0602-reffiles-fsck.sh | 57 +++++++++++++++++++++++++++++++++--
6 files changed, 96 insertions(+), 7 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 22c385ea22..6db0eaa84a 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -173,6 +173,20 @@
`nullSha1`::
(WARN) Tree contains entries pointing to a null sha1.
+`refMissingNewline`::
+ (INFO) A loose ref that does not end with newline(LF). As
+ valid implementations of Git never created such a loose ref
+ file, it may become an error in the future. Report to the
+ git@vger.kernel.org mailing list if you see this error, as
+ we need to know what tools created such a file.
+
+`trailingRefContent`::
+ (INFO) A loose ref has trailing content. As valid implementations
+ of Git never created such a loose ref file, it may become an
+ error in the future. Report to the git@vger.kernel.org mailing
+ list if you see this error, as we need to know what tools
+ created such a file.
+
`treeNotSorted`::
(ERROR) A tree is not properly sorted.
diff --git a/fsck.h b/fsck.h
index 0d99a87911..b85072df57 100644
--- a/fsck.h
+++ b/fsck.h
@@ -85,6 +85,8 @@ enum fsck_msg_type {
FUNC(MAILMAP_SYMLINK, INFO) \
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
+ FUNC(REF_MISSING_NEWLINE, INFO) \
+ FUNC(TRAILING_REF_CONTENT, INFO) \
/* ignored (elevated when requested) */ \
FUNC(EXTRA_HEADER_ENTRY, IGNORE)
diff --git a/refs.c b/refs.c
index 395a17273c..f88b32a633 100644
--- a/refs.c
+++ b/refs.c
@@ -1789,7 +1789,7 @@ static int refs_read_special_head(struct ref_store *ref_store,
}
result = parse_loose_ref_contents(ref_store->repo->hash_algo, content.buf,
- oid, referent, type, failure_errno);
+ oid, referent, type, NULL, failure_errno);
done:
strbuf_release(&full_path);
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 2d126ecbbe..871c8946f8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -569,7 +569,7 @@ static int read_ref_internal(struct ref_store *ref_store, const char *refname,
buf = sb_contents.buf;
ret = parse_loose_ref_contents(ref_store->repo->hash_algo, buf,
- oid, referent, type, &myerr);
+ oid, referent, type, NULL, &myerr);
out:
if (ret && !myerr)
@@ -606,7 +606,7 @@ static int files_read_symbolic_ref(struct ref_store *ref_store, const char *refn
int parse_loose_ref_contents(const struct git_hash_algo *algop,
const char *buf, struct object_id *oid,
struct strbuf *referent, unsigned int *type,
- int *failure_errno)
+ const char **trailing, int *failure_errno)
{
const char *p;
if (skip_prefix(buf, "ref:", &buf)) {
@@ -628,6 +628,10 @@ int parse_loose_ref_contents(const struct git_hash_algo *algop,
*failure_errno = EINVAL;
return -1;
}
+
+ if (trailing)
+ *trailing = p;
+
return 0;
}
@@ -3513,6 +3517,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
struct strbuf ref_content = STRBUF_INIT;
struct strbuf referent = STRBUF_INIT;
struct fsck_ref_report report = { 0 };
+ const char *trailing = NULL;
unsigned int type = 0;
int failure_errno = 0;
struct object_id oid;
@@ -3533,7 +3538,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
if (parse_loose_ref_contents(ref_store->repo->hash_algo,
ref_content.buf, &oid, &referent,
- &type, &failure_errno)) {
+ &type, &trailing, &failure_errno)) {
strbuf_rtrim(&ref_content);
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_CONTENT,
@@ -3541,6 +3546,21 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
goto cleanup;
}
+ if (!(type & REF_ISSYMREF)) {
+ if (!*trailing) {
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_REF_MISSING_NEWLINE,
+ "misses LF at the end");
+ goto cleanup;
+ }
+ if (*trailing != '\n' || *(trailing + 1)) {
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_TRAILING_REF_CONTENT,
+ "has trailing garbage: '%s'", trailing);
+ goto cleanup;
+ }
+ }
+
cleanup:
strbuf_release(&ref_content);
strbuf_release(&referent);
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 037d7991cd..125f1fe735 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -716,7 +716,7 @@ struct ref_store {
int parse_loose_ref_contents(const struct git_hash_algo *algop,
const char *buf, struct object_id *oid,
struct strbuf *referent, unsigned int *type,
- int *failure_errno);
+ const char **trailing, int *failure_errno);
/*
* Fill in the generic part of refs and add it to our collection of
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 162370077b..33e7a390ad 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -189,7 +189,48 @@ test_expect_success 'regular ref content should be checked (individual)' '
EOF
rm $branch_dir_prefix/a/b/branch-bad &&
test_cmp expect err || return 1
- done
+ done &&
+
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+
+
+ '\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err &&
+
+ printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+
+
+ garbage'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err
'
test_expect_success 'regular ref content should be checked (aggregate)' '
@@ -207,12 +248,16 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
EOF
sort err >sorted_err &&
test_cmp expect sorted_err
@@ -260,7 +305,15 @@ test_expect_success 'ref content checks should work with worktrees' '
EOF
rm $worktree2_refdir_prefix/bad-branch-2 &&
test_cmp expect err || return 1
- done
+ done &&
+
+ printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err
'
test_done
--
2.47.0
* [PATCH v7 7/9] ref: add basic symref content check for files backend
2024-11-10 12:07 ` [PATCH v7 " shejialuo
` (5 preceding siblings ...)
2024-11-10 12:10 ` [PATCH v7 6/9] ref: add more strict checks for regular refs shejialuo
@ 2024-11-10 12:10 ` shejialuo
2024-11-10 12:10 ` [PATCH v7 8/9] ref: check whether the target of the symref is a ref shejialuo
` (3 subsequent siblings)
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-10 12:10 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We have code that checks regular ref contents, but we do not yet check
the contents of symbolic refs. By using "parse_loose_ref_content" for
symbolic refs, we will get the information of the "referent".
We do not need to check the "referent" by opening the file. This is
because if "referent" exists in the file system, we will eventually
check its correctness by inspecting every file in the "refs" directory.
If the "referent" does not exist in the filesystem, this is OK as it is
seen as the dangling symref.
So we just need to check the "referent" string content. A regular ref
could be accepted as a textual symref if it begins with "ref:", followed
by zero or more whitespaces, followed by the full refname, followed only
by whitespace characters. However, we always write a single SP after
"ref:" and a single LF after the refname. It may seem that we should
report a fsck error message when the "referent" does not apply above
rules and we should not be so aggressive because third-party
reimplementations of Git may have taken advantage of the looser syntax.
Put it more specific, we accept the following contents:
1. "ref: refs/heads/master "
2. "ref: refs/heads/master \n \n"
3. "ref: refs/heads/master\n\n"
When introducing the regular ref content checks, we created two fsck
infos, "refMissingNewline" and "trailingRefContent", which exactly
represent the above situations. So we will reuse these two fsck messages
to inform the user about these situations.
But we do not allow any other trailing garbage. The following are bad
symref contents which will be reported as fsck errors by "git-fsck(1)".
1. "ref: refs/heads/master garbage\n"
2. "ref: refs/heads/master \n\n\n garbage "
We introduce a new "badReferentName(ERROR)" fsck message to report
the above errors, using "is_root_ref" and "check_refname_format" to check
the "referent". Since neither "is_root_ref" nor "check_refname_format"
works with whitespace, we pass the trimmed version of "referent"
to these functions.
In order to add the checks, we will do the following things:
1. Record the untrimmed length "orig_len" and untrimmed last byte
"orig_last_byte".
2. Use "strbuf_rtrim" to trim the whitespaces or newlines to make sure
"is_root_ref" and "check_refname_format" do not fail because of them.
3. Use "orig_len" and "orig_last_byte" to check whether the "referent"
misses '\n' at the end or has trailing whitespaces or newlines.
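The three steps above can be sketched as a small standalone function.
This is only an illustration under stated assumptions: the names
"classify_referent_tail", "SKETCH_MISSING_NEWLINE" and
"SKETCH_TRAILING_CONTENT" are hypothetical and not part of Git, plain C
strings stand in for "struct strbuf", and the guard for an empty
"referent" is an addition of the sketch.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags for this sketch; the patch reports fsck messages instead. */
enum {
	SKETCH_MISSING_NEWLINE = 1 << 0,
	SKETCH_TRAILING_CONTENT = 1 << 1,
};

/* Length of "s" once trailing whitespace (including newlines) is dropped. */
static size_t rtrimmed_len(const char *s, size_t len)
{
	while (len > 0 && isspace((unsigned char)s[len - 1]))
		len--;
	return len;
}

/*
 * Classify the tail of a symref "referent" (the text after "ref:" with
 * leading whitespace already skipped). Returns a bitmask of the flags above.
 */
static int classify_referent_tail(const char *referent)
{
	/* Step 1: remember the untrimmed length and last byte. */
	size_t orig_len = strlen(referent);
	char orig_last_byte = orig_len ? referent[orig_len - 1] : '\0';
	/* Step 2: trim, so that name checks would see the bare refname. */
	size_t len = rtrimmed_len(referent, orig_len);
	int flags = 0;

	/* Step 3: compare the trimmed length with what was recorded. */
	if (len == orig_len || orig_last_byte != '\n')
		flags |= SKETCH_MISSING_NEWLINE;
	if (len != orig_len && len != orig_len - 1)
		flags |= SKETCH_TRAILING_CONTENT;
	return flags;
}
```

With this sketch, "refs/heads/branch\n" yields no flag, while a
"referent" that ends in two spaces and no LF yields both flags.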
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 3 +
fsck.h | 1 +
refs/files-backend.c | 40 ++++++++++++
t/t0602-reffiles-fsck.sh | 111 ++++++++++++++++++++++++++++++++++
4 files changed, 155 insertions(+)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 6db0eaa84a..dcea05edfc 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -28,6 +28,9 @@
`badRefName`::
(ERROR) A ref has an invalid format.
+`badReferentName`::
+ (ERROR) The referent name of a symref is invalid.
+
`badTagName`::
(INFO) A tag has an invalid format.
diff --git a/fsck.h b/fsck.h
index b85072df57..5227dfdef2 100644
--- a/fsck.h
+++ b/fsck.h
@@ -34,6 +34,7 @@ enum fsck_msg_type {
FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
FUNC(BAD_REF_NAME, ERROR) \
+ FUNC(BAD_REFERENT_NAME, ERROR) \
FUNC(BAD_TIMEZONE, ERROR) \
FUNC(BAD_TREE, ERROR) \
FUNC(BAD_TREE_SHA1, ERROR) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 871c8946f8..8bc7c6e0c2 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3509,6 +3509,43 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
const char *refname,
struct dir_iterator *iter);
+static int files_fsck_symref_target(struct fsck_options *o,
+ struct fsck_ref_report *report,
+ struct strbuf *referent)
+{
+ char orig_last_byte;
+ size_t orig_len;
+ int ret = 0;
+
+ orig_len = referent->len;
+ orig_last_byte = referent->buf[orig_len - 1];
+ strbuf_rtrim(referent);
+
+ if (!is_root_ref(referent->buf) &&
+ check_refname_format(referent->buf, 0)) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_BAD_REFERENT_NAME,
+ "points to invalid refname '%s'", referent->buf);
+ goto out;
+ }
+
+ if (referent->len == orig_len ||
+ (referent->len < orig_len && orig_last_byte != '\n')) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_REF_MISSING_NEWLINE,
+ "misses LF at the end");
+ }
+
+ if (referent->len != orig_len && referent->len != orig_len - 1) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_TRAILING_REF_CONTENT,
+ "has trailing whitespaces or newlines");
+ }
+
+out:
+ return ret;
+}
+
static int files_fsck_refs_content(struct ref_store *ref_store,
struct fsck_options *o,
const char *target_name,
@@ -3559,6 +3596,9 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
"has trailing garbage: '%s'", trailing);
goto cleanup;
}
+ } else {
+ ret = files_fsck_symref_target(o, &report, &referent);
+ goto cleanup;
}
cleanup:
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 33e7a390ad..ee1e5f2864 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -263,6 +263,109 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
test_cmp expect sorted_err
'
+test_expect_success 'textual symref content should be checked (individual)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
+ do
+ printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-1 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-2 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-3 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-complicated &&
+ test_cmp expect err
+'
+
+test_expect_success 'textual symref content should be checked (aggregate)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+'
+
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
@@ -313,6 +416,14 @@ test_expect_success 'ref content checks should work with worktrees' '
warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
EOF
rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
+ printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ EOF
+ rm $worktree1_refdir_prefix/branch-garbage &&
test_cmp expect err
'
--
2.47.0
* [PATCH v7 8/9] ref: check whether the target of the symref is a ref
2024-11-10 12:07 ` [PATCH v7 " shejialuo
` (6 preceding siblings ...)
2024-11-10 12:10 ` [PATCH v7 7/9] ref: add basic symref content check for files backend shejialuo
@ 2024-11-10 12:10 ` shejialuo
2024-11-10 12:10 ` [PATCH v7 9/9] ref: add symlink ref content check for files backend shejialuo
` (2 subsequent siblings)
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-10 12:10 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
Ideally, we want users to use "git symbolic-ref" to create symrefs
instead of writing raw contents into the filesystem. However, "git
symbolic-ref" is strict with the refname but not strict with the
referent. For example, we can make the "referent" point to
"$(gitdir)/logs/aaa" and manually write content into that file; we
can still successfully parse this symref by using "git rev-parse".
$ git init repo && cd repo && git commit --allow-empty -mx
$ git symbolic-ref refs/heads/test logs/aaa
$ echo $(git rev-parse HEAD) > .git/logs/aaa
$ git rev-parse test
We may need to add some restrictions on the "referent" parameter when
using "git symbolic-ref" to create symrefs, because ideally all
non-pseudo refs should be located under the "refs" directory, and we may
tighten this in the future.
In order to tell the user that we may tighten this, create a new fsck
message "symrefTargetIsNotARef" to notify the user that this
may become an error in the future.
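The condition behind the new message can be sketched in isolation as
follows. This is a rough illustration, not the code in this patch:
"looks_like_root_ref" is a hypothetical, deliberately simplified
stand-in for Git's "is_root_ref" (which knows more cases), and
"has_prefix" stands in for "starts_with".

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for Git's starts_with(). */
static int has_prefix(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/*
 * Simplified assumption for this sketch: treat "HEAD" and names made of
 * uppercase letters and underscores that end in "_HEAD" as root refs.
 */
static int looks_like_root_ref(const char *name)
{
	size_t len = strlen(name);
	size_t i;

	if (!len)
		return 0;
	for (i = 0; i < len; i++)
		if (!isupper((unsigned char)name[i]) && name[i] != '_')
			return 0;
	return !strcmp(name, "HEAD") ||
	       (len > 5 && !strcmp(name + len - 5, "_HEAD"));
}

/* Non-zero when the trimmed referent would get "symrefTargetIsNotARef". */
static int symref_target_is_not_a_ref(const char *referent)
{
	return !looks_like_root_ref(referent) &&
	       !has_prefix(referent, "refs/") &&
	       !has_prefix(referent, "worktrees/");
}
```

Under these assumptions, "refs/heads/branch" and "HEAD" are accepted,
while "logs/aaa" from the example above is flagged.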
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 9 +++++++++
fsck.h | 1 +
refs/files-backend.c | 14 ++++++++++++--
t/t0602-reffiles-fsck.sh | 29 +++++++++++++++++++++++++++++
4 files changed, 51 insertions(+), 2 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index dcea05edfc..f82ebc58e8 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -183,6 +183,15 @@
git@vger.kernel.org mailing list if you see this error, as
we need to know what tools created such a file.
+`symrefTargetIsNotARef`::
+ (INFO) The target of a symbolic reference points neither to
+ a root reference nor to a reference starting with "refs/".
+ Although we allow creating a symref pointing to a referent which
+ is outside the "refs" directory by using `git symbolic-ref`, we may tighten
+ the rule in the future. Report to the git@vger.kernel.org
+ mailing list if you see this error, as we need to know what tools
+ created such a file.
+
`trailingRefContent`::
(INFO) A loose ref has trailing content. As valid implementations
of Git never created such a loose ref file, it may become an
diff --git a/fsck.h b/fsck.h
index 5227dfdef2..53a47612e6 100644
--- a/fsck.h
+++ b/fsck.h
@@ -87,6 +87,7 @@ enum fsck_msg_type {
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
FUNC(REF_MISSING_NEWLINE, INFO) \
+ FUNC(SYMREF_TARGET_IS_NOT_A_REF, INFO) \
FUNC(TRAILING_REF_CONTENT, INFO) \
/* ignored (elevated when requested) */ \
FUNC(EXTRA_HEADER_ENTRY, IGNORE)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 8bc7c6e0c2..b3ec409920 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3513,6 +3513,7 @@ static int files_fsck_symref_target(struct fsck_options *o,
struct fsck_ref_report *report,
struct strbuf *referent)
{
+ int is_referent_root;
char orig_last_byte;
size_t orig_len;
int ret = 0;
@@ -3521,8 +3522,17 @@ static int files_fsck_symref_target(struct fsck_options *o,
orig_last_byte = referent->buf[orig_len - 1];
strbuf_rtrim(referent);
- if (!is_root_ref(referent->buf) &&
- check_refname_format(referent->buf, 0)) {
+ is_referent_root = is_root_ref(referent->buf);
+ if (!is_referent_root &&
+ !starts_with(referent->buf, "refs/") &&
+ !starts_with(referent->buf, "worktrees/")) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_SYMREF_TARGET_IS_NOT_A_REF,
+ "points to non-ref target '%s'", referent->buf);
+
+ }
+
+ if (!is_referent_root && check_refname_format(referent->buf, 0)) {
ret = fsck_report_ref(o, report,
FSCK_MSG_BAD_REFERENT_NAME,
"points to invalid refname '%s'", referent->buf);
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index ee1e5f2864..692b30727a 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -366,6 +366,35 @@ test_expect_success 'textual symref content should be checked (aggregate)' '
test_cmp expect sorted_err
'
+test_expect_success 'the target of the textual symref should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
+ do
+ printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err || return 1
+ done
+'
+
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
--
2.47.0
* [PATCH v7 9/9] ref: add symlink ref content check for files backend
2024-11-10 12:07 ` [PATCH v7 " shejialuo
` (7 preceding siblings ...)
2024-11-10 12:10 ` [PATCH v7 8/9] ref: check whether the target of the symref is a ref shejialuo
@ 2024-11-10 12:10 ` shejialuo
2024-11-13 7:36 ` Patrick Steinhardt
2024-11-13 7:36 ` [PATCH v7 0/9] add " Patrick Steinhardt
2024-11-14 16:51 ` [PATCH v8 " shejialuo
10 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-11-10 12:10 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
Besides the textual symref, we also allow symbolic links as symrefs.
So, we should provide the same consistency checks as we have done
for textual symrefs. We are also considering deprecating writing
symbolic links. We first need to assess whether symbolic links are
still used. So, add a new fsck message "symlinkRef(INFO)" to make the
user aware of this.
We have already introduced "files_fsck_symref_target". We should reuse
this function to handle the symrefs which use legacy symbolic links. We
should not check the trailing garbage for symbolic links. Add a new
parameter "symbolic_link" to disable some checks which should only be
executed for textual symrefs.
We also need to generate the "referent" parameter for reusing
"files_fsck_symref_target" by the following steps:
1. Use "strbuf_add_real_path" to resolve the symlink and get the
absolute path "ref_content" which the symlink ref points to.
2. Generate the absolute path "abs_gitdir" of "gitdir" and combine
"ref_content" and "abs_gitdir" to extract the relative path
"relative_referent_path".
3. If "ref_content" is outside of "gitdir", we just set "referent" to
"ref_content". Otherwise, we set "referent" to
"relative_referent_path".
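Steps 2 and 3 amount to stripping a directory prefix. A minimal sketch,
assuming both paths are already absolute and normalized and that the
gitdir ends with a directory separator; "referent_from_resolved_path"
is a hypothetical name, and "strncmp" stands in for Git's
"skip_prefix":

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the part of "path" after "gitdir_with_slash" when the path lives
 * inside the gitdir; otherwise return "path" unchanged.
 */
static const char *referent_from_resolved_path(const char *path,
					       const char *gitdir_with_slash)
{
	size_t n = strlen(gitdir_with_slash);

	if (strncmp(path, gitdir_with_slash, n) == 0)
		return path + n; /* inside gitdir: use the relative part */
	return path; /* outside gitdir: keep the absolute path */
}
```

A path inside the gitdir becomes something like "refs/heads/main",
which the existing refname checks can handle. A path outside it stays
absolute, so it would not start with "refs/".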
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 6 +++++
fsck.h | 1 +
refs/files-backend.c | 38 +++++++++++++++++++++++++----
t/t0602-reffiles-fsck.sh | 45 +++++++++++++++++++++++++++++++++++
4 files changed, 86 insertions(+), 4 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index f82ebc58e8..b14bc44ca4 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -183,6 +183,12 @@
git@vger.kernel.org mailing list if you see this error, as
we need to know what tools created such a file.
+`symlinkRef`::
+ (INFO) A symbolic link is used as a symref. Report to the
+ git@vger.kernel.org mailing list if you see this error, as we
+ are assessing the feasibility of dropping the support for
+ creating symbolic links as symrefs.
+
`symrefTargetIsNotARef`::
(INFO) The target of a symbolic reference points neither to
a root reference nor to a reference starting with "refs/".
diff --git a/fsck.h b/fsck.h
index 53a47612e6..a44c231a5f 100644
--- a/fsck.h
+++ b/fsck.h
@@ -86,6 +86,7 @@ enum fsck_msg_type {
FUNC(MAILMAP_SYMLINK, INFO) \
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
+ FUNC(SYMLINK_REF, INFO) \
FUNC(REF_MISSING_NEWLINE, INFO) \
FUNC(SYMREF_TARGET_IS_NOT_A_REF, INFO) \
FUNC(TRAILING_REF_CONTENT, INFO) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b3ec409920..37c669a30f 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1,6 +1,7 @@
#define USE_THE_REPOSITORY_VARIABLE
#include "../git-compat-util.h"
+#include "../abspath.h"
#include "../config.h"
#include "../copy.h"
#include "../environment.h"
@@ -3511,7 +3512,8 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
static int files_fsck_symref_target(struct fsck_options *o,
struct fsck_ref_report *report,
- struct strbuf *referent)
+ struct strbuf *referent,
+ unsigned int symbolic_link)
{
int is_referent_root;
char orig_last_byte;
@@ -3520,7 +3522,8 @@ static int files_fsck_symref_target(struct fsck_options *o,
orig_len = referent->len;
orig_last_byte = referent->buf[orig_len - 1];
- strbuf_rtrim(referent);
+ if (!symbolic_link)
+ strbuf_rtrim(referent);
is_referent_root = is_root_ref(referent->buf);
if (!is_referent_root &&
@@ -3539,6 +3542,9 @@ static int files_fsck_symref_target(struct fsck_options *o,
goto out;
}
+ if (symbolic_link)
+ goto out;
+
if (referent->len == orig_len ||
(referent->len < orig_len && orig_last_byte != '\n')) {
ret = fsck_report_ref(o, report,
@@ -3562,6 +3568,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
struct dir_iterator *iter)
{
struct strbuf ref_content = STRBUF_INIT;
+ struct strbuf abs_gitdir = STRBUF_INIT;
struct strbuf referent = STRBUF_INIT;
struct fsck_ref_report report = { 0 };
const char *trailing = NULL;
@@ -3572,8 +3579,30 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
report.path = target_name;
- if (S_ISLNK(iter->st.st_mode))
+ if (S_ISLNK(iter->st.st_mode)) {
+ const char* relative_referent_path = NULL;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_SYMLINK_REF,
+ "use deprecated symbolic link for symref");
+
+ strbuf_add_absolute_path(&abs_gitdir, ref_store->gitdir);
+ strbuf_normalize_path(&abs_gitdir);
+ if (!is_dir_sep(abs_gitdir.buf[abs_gitdir.len - 1]))
+ strbuf_addch(&abs_gitdir, '/');
+
+ strbuf_add_real_path(&ref_content, iter->path.buf);
+ skip_prefix(ref_content.buf, abs_gitdir.buf,
+ &relative_referent_path);
+
+ if (relative_referent_path)
+ strbuf_addstr(&referent, relative_referent_path);
+ else
+ strbuf_addbuf(&referent, &ref_content);
+
+ ret |= files_fsck_symref_target(o, &report, &referent, 1);
goto cleanup;
+ }
if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
ret = fsck_report_ref(o, &report,
@@ -3607,13 +3636,14 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
goto cleanup;
}
} else {
- ret = files_fsck_symref_target(o, &report, &referent);
+ ret = files_fsck_symref_target(o, &report, &referent, 0);
goto cleanup;
}
cleanup:
strbuf_release(&ref_content);
strbuf_release(&referent);
+ strbuf_release(&abs_gitdir);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 692b30727a..0d5eda6d22 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -395,6 +395,51 @@ test_expect_success 'the target of the textual symref should be checked' '
done
'
+test_expect_success SYMLINKS 'symlink symref content should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
+ error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-bad &&
+ test_cmp expect err &&
+
+ ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
+ error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
+ EOF
+ rm $tag_dir_prefix/tag-symbolic-1 &&
+ test_cmp expect err
+'
+
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
--
2.47.0
* Re: [PATCH v7 9/9] ref: add symlink ref content check for files backend
2024-11-10 12:10 ` [PATCH v7 9/9] ref: add symlink ref content check for files backend shejialuo
@ 2024-11-13 7:36 ` Patrick Steinhardt
2024-11-14 12:18 ` shejialuo
0 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-11-13 7:36 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Sun, Nov 10, 2024 at 08:10:27PM +0800, shejialuo wrote:
> @@ -3572,8 +3579,30 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
>
> report.path = target_name;
>
> - if (S_ISLNK(iter->st.st_mode))
> + if (S_ISLNK(iter->st.st_mode)) {
> + const char* relative_referent_path = NULL;
Nit: the asterisk should stick with the variable name.
> + ret = fsck_report_ref(o, &report,
> + FSCK_MSG_SYMLINK_REF,
> + "use deprecated symbolic link for symref");
> +
> + strbuf_add_absolute_path(&abs_gitdir, ref_store->gitdir);
> + strbuf_normalize_path(&abs_gitdir);
> + if (!is_dir_sep(abs_gitdir.buf[abs_gitdir.len - 1]))
> + strbuf_addch(&abs_gitdir, '/');
> +
> + strbuf_add_real_path(&ref_content, iter->path.buf);
> + skip_prefix(ref_content.buf, abs_gitdir.buf,
> + &relative_referent_path);
> +
> + if (relative_referent_path)
> + strbuf_addstr(&referent, relative_referent_path);
> + else
> + strbuf_addbuf(&referent, &ref_content);
> +
> + ret |= files_fsck_symref_target(o, &report, &referent, 1);
> goto cleanup;
> + }
I wonder whether this logic works as expected with per-worktree symbolic
refs which are a symlink. On the other hand I wonder whether those work
as expected in the first place. Probably not. *shrug*
In any case, it would be nice to have a test for this.
Patrick
* Re: [PATCH v7 9/9] ref: add symlink ref content check for files backend
2024-11-13 7:36 ` Patrick Steinhardt
@ 2024-11-14 12:18 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-14 12:18 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Wed, Nov 13, 2024 at 08:36:16AM +0100, Patrick Steinhardt wrote:
> On Sun, Nov 10, 2024 at 08:10:27PM +0800, shejialuo wrote:
> > @@ -3572,8 +3579,30 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
> >
> > report.path = target_name;
> >
> > - if (S_ISLNK(iter->st.st_mode))
> > + if (S_ISLNK(iter->st.st_mode)) {
> > + const char* relative_referent_path = NULL;
>
> Nit: the asterisk should stick with the variable name.
>
I will improve this in the next version.
> > + ret = fsck_report_ref(o, &report,
> > + FSCK_MSG_SYMLINK_REF,
> > + "use deprecated symbolic link for symref");
> > +
> > + strbuf_add_absolute_path(&abs_gitdir, ref_store->gitdir);
> > + strbuf_normalize_path(&abs_gitdir);
> > + if (!is_dir_sep(abs_gitdir.buf[abs_gitdir.len - 1]))
> > + strbuf_addch(&abs_gitdir, '/');
> > +
> > + strbuf_add_real_path(&ref_content, iter->path.buf);
> > + skip_prefix(ref_content.buf, abs_gitdir.buf,
> > + &relative_referent_path);
> > +
> > + if (relative_referent_path)
> > + strbuf_addstr(&referent, relative_referent_path);
> > + else
> > + strbuf_addbuf(&referent, &ref_content);
> > +
> > + ret |= files_fsck_symref_target(o, &report, &referent, 1);
> > goto cleanup;
> > + }
>
> I wonder whether this logic works as expected with per-worktree symbolic
> refs which are a symlink. On the other hand I wonder whether those work
> as expected in the first place. Probably not. *shrug*
>
> In any case, it would be nice to have a test for this.
>
Correct, I overlooked this because I added worktree support in a later
version. Let me add a new test to verify this.
> Patrick
Thanks,
Jialuo
* Re: [PATCH v7 0/9] add ref content check for files backend
2024-11-10 12:07 ` [PATCH v7 " shejialuo
` (8 preceding siblings ...)
2024-11-10 12:10 ` [PATCH v7 9/9] ref: add symlink ref content check for files backend shejialuo
@ 2024-11-13 7:36 ` Patrick Steinhardt
2024-11-14 16:51 ` [PATCH v8 " shejialuo
10 siblings, 0 replies; 209+ messages in thread
From: Patrick Steinhardt @ 2024-11-13 7:36 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Sun, Nov 10, 2024 at 08:07:36PM +0800, shejialuo wrote:
> Hi All:
>
> This new version solves the follow problems:
>
> 1. Enhance the commit message suggested by Patrick.
> 2. Rename "target_name" to "refname".
> 3. Enhance the shell scripts to use `for in` to avoid repetition. And
> this is the main change of this new version.
>
> Thanks,
> Jialuo
I've got two more comments, but otherwise this series looks close now.
Thanks!
Patrick
* [PATCH v8 0/9] add ref content check for files backend
2024-11-10 12:07 ` [PATCH v7 " shejialuo
` (9 preceding siblings ...)
2024-11-13 7:36 ` [PATCH v7 0/9] add " Patrick Steinhardt
@ 2024-11-14 16:51 ` shejialuo
2024-11-14 16:53 ` [PATCH v8 1/9] ref: initialize "fsck_ref_report" with zero shejialuo
` (10 more replies)
10 siblings, 11 replies; 209+ messages in thread
From: shejialuo @ 2024-11-14 16:51 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
Hi all:
This new version solves the following problems:
1. When reading the content of the ref file fails, we no longer use the
"fsck_report_ref" function. It's not suitable.
2. Add a new test for symlink refs in worktrees in the last patch. After
writing the test, I found a bug. The fix is described below.
Because we have introduced the check for worktrees, we should not use
"ref_store->gitdir"; instead we need to use "ref_store->repo->gitdir" to
get the main worktree "gitdir". After fixing this, the test passes.
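To illustrate why the choice of "gitdir" matters, here is a standalone
sketch. The paths are made up, "relative_to_gitdir" is a hypothetical
helper, and it is an assumption of the sketch that a worktree ref store
carries a gitdir of the form ".git/worktrees/<id>":

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the path relative to "gitdir" (which must end with '/'), or NULL
 * when "resolved" does not live below it.
 */
static const char *relative_to_gitdir(const char *resolved, const char *gitdir)
{
	size_t n = strlen(gitdir);

	return strncmp(resolved, gitdir, n) ? NULL : resolved + n;
}
```

For a symlink in worktree-1 that resolves to
"/repo/.git/refs/heads/good-branch", stripping the worktree gitdir
"/repo/.git/worktrees/worktree-1/" finds no match, so the absolute path
would be kept as the referent. Stripping the main gitdir "/repo/.git/"
gives "refs/heads/good-branch" as intended.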
Thanks to Patrick for reminding me about this. I forgot to add a test,
which led to this mistake.
Thanks,
Jialuo
shejialuo (9):
ref: initialize "fsck_ref_report" with zero
ref: check the full refname instead of basename
ref: initialize ref name outside of check functions
ref: support multiple worktrees check for refs
ref: port git-fsck(1) regular refs check for files backend
ref: add more strict checks for regular refs
ref: add basic symref content check for files backend
ref: check whether the target of the symref is a ref
ref: add symlink ref content check for files backend
Documentation/fsck-msgids.txt | 35 +++
builtin/refs.c | 10 +-
fsck.h | 6 +
refs.c | 7 +-
refs.h | 3 +-
refs/debug.c | 5 +-
refs/files-backend.c | 195 +++++++++++-
refs/packed-backend.c | 8 +-
refs/refs-internal.h | 5 +-
refs/reftable-backend.c | 3 +-
t/t0602-reffiles-fsck.sh | 576 ++++++++++++++++++++++++++++++++--
11 files changed, 791 insertions(+), 62 deletions(-)
Range-diff against v7:
1: bfb2a21af4 = 1: bfb2a21af4 ref: initialize "fsck_ref_report" with zero
2: 9efc83f7ea = 2: 9efc83f7ea ref: check the full refname instead of basename
3: 5ea7d18203 = 3: 5ea7d18203 ref: initialize ref name outside of check functions
4: cb4669b64d = 4: cb4669b64d ref: support multiple worktrees check for refs
5: 4e1add6465 ! 5: c6c128c922 ref: port git-fsck(1) regular refs check for files backend
@@ refs/files-backend.c: typedef int (*files_fsck_refs_fn)(struct ref_store *ref_st
+ if (S_ISLNK(iter->st.st_mode))
+ goto cleanup;
+
-+ if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
-+ ret = fsck_report_ref(o, &report,
-+ FSCK_MSG_BAD_REF_CONTENT,
-+ "cannot read ref file '%s': %s",
-+ iter->path.buf, strerror(errno));
++ if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0 ) {
++ /*
++ * Ref file could be removed by another concurrent process. We should
++ * ignore this error and continue to the next ref.
++ */
++ if (errno == ENOENT)
++ goto cleanup;
++
++ ret = error_errno(_("cannot read ref file '%s': %s"),
++ iter->path.buf, strerror(errno));
+ goto cleanup;
+ }
+
6: 945322fab7 = 6: 911fa42717 ref: add more strict checks for regular refs
7: 3006eb9431 = 7: 7aa6a99206 ref: add basic symref content check for files backend
8: c59d003d78 = 8: dbb0787ad1 ref: check whether the target of the symref is a ref
9: bb6d7f3323 ! 9: a6d85b4864 ref: add symlink ref content check for files backend
@@ refs/files-backend.c: static int files_fsck_refs_content(struct ref_store *ref_s
- if (S_ISLNK(iter->st.st_mode))
+ if (S_ISLNK(iter->st.st_mode)) {
-+ const char* relative_referent_path = NULL;
++ const char *relative_referent_path = NULL;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_SYMLINK_REF,
+ "use deprecated symbolic link for symref");
+
-+ strbuf_add_absolute_path(&abs_gitdir, ref_store->gitdir);
++ strbuf_add_absolute_path(&abs_gitdir, ref_store->repo->gitdir);
+ strbuf_normalize_path(&abs_gitdir);
+ if (!is_dir_sep(abs_gitdir.buf[abs_gitdir.len - 1]))
+ strbuf_addch(&abs_gitdir, '/');
@@ refs/files-backend.c: static int files_fsck_refs_content(struct ref_store *ref_s
goto cleanup;
+ }
- if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
- ret = fsck_report_ref(o, &report,
+ if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0 ) {
+ /*
@@ refs/files-backend.c: static int files_fsck_refs_content(struct ref_store *ref_store,
goto cleanup;
}
@@ t/t0602-reffiles-fsck.sh: test_expect_success 'the target of the textual symref
+ rm $tag_dir_prefix/tag-symbolic-1 &&
+ test_cmp expect err
+'
++
++test_expect_success SYMLINKS 'symlink symref content should be checked (worktree)' '
++ test_when_finished "rm -rf repo" &&
++ git init repo &&
++ cd repo &&
++ test_commit default &&
++ git branch branch-1 &&
++ git branch branch-2 &&
++ git branch branch-3 &&
++ git worktree add ./worktree-1 branch-2 &&
++ git worktree add ./worktree-2 branch-3 &&
++ main_worktree_refdir_prefix=.git/refs/heads &&
++ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
++ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
++
++ (
++ cd worktree-1 &&
++ git update-ref refs/worktree/branch-4 refs/heads/branch-1
++ ) &&
++ (
++ cd worktree-2 &&
++ git update-ref refs/worktree/branch-4 refs/heads/branch-1
++ ) &&
++
++ ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
++ git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
++ EOF
++ rm $worktree1_refdir_prefix/branch-symbolic-good &&
++ test_cmp expect err &&
++
++ ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
++ git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
++ EOF
++ rm $worktree2_refdir_prefix/branch-symbolic-good &&
++ test_cmp expect err &&
++
++ ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
++ git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
++ EOF
++ rm $main_worktree_refdir_prefix/branch-symbolic-good &&
++ test_cmp expect err &&
++
++ ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
++ git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
++ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
++ EOF
++ rm $worktree1_refdir_prefix/branch-symbolic &&
++ test_cmp expect err &&
++
++ for bad_referent_name in ".tag" "branch "
++ do
++ ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
++ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
++ EOF
++ rm $worktree1_refdir_prefix/bad-symbolic &&
++ test_cmp expect err &&
++
++ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
++ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
++ EOF
++ rm $worktree1_refdir_prefix/bad-symbolic &&
++ test_cmp expect err &&
++
++ ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
++ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
++ EOF
++ rm $worktree2_refdir_prefix/bad-symbolic &&
++ test_cmp expect err &&
++
++ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
++ test_must_fail git refs verify 2>err &&
++ cat >expect <<-EOF &&
++ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
++ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
++ EOF
++ rm $worktree2_refdir_prefix/bad-symbolic &&
++ test_cmp expect err || return 1
++ done
++'
+
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
--
2.47.0
^ permalink raw reply [flat|nested] 209+ messages in thread
* [PATCH v8 1/9] ref: initialize "fsck_ref_report" with zero
2024-11-14 16:51 ` [PATCH v8 " shejialuo
@ 2024-11-14 16:53 ` shejialuo
2024-11-14 16:54 ` [PATCH v8 2/9] ref: check the full refname instead of basename shejialuo
` (9 subsequent siblings)
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-14 16:53 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
In "fsck.c::fsck_refs_error_function", we need to tell whether "oid" and
"referent" are NULL. So, we need to always initialize these members to
NULL instead of leaving them pointing at arbitrary memory when creating a
new "fsck_ref_report" structure.
The original code explicitly initializes the "path" member in the
"struct fsck_ref_report" to NULL (which implicitly 0-initializes other
members in the struct). It is more customary to use "{ 0 }" to express
that we are 0-initializing everything. In order to align with the
codebase, initialize "fsck_ref_report" with zero.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
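A note for readers of the message above: the following is a minimal
sketch of the initialization rule it relies on, not code from this
series. The struct "report_sketch" and both helper functions are
hypothetical stand-ins; the real "struct fsck_ref_report" lives in
"fsck.h". In C, members that an initializer list does not name are
zero-initialized, so both spellings leave every pointer NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the report struct; not the definition in fsck.h. */
struct report_sketch {
	const char *path;
	const void *oid;
	const char *referent;
};

/* Returns 1 when every member is a null pointer, 0 otherwise. */
static int report_is_zeroed(const struct report_sketch *r)
{
	return !r->path && !r->oid && !r->referent;
}

/* Both initializer spellings leave the unnamed members zeroed. */
static void check_initializers(void)
{
	struct report_sketch designated = { .path = NULL };
	struct report_sketch zeroed = { 0 };

	assert(report_is_zeroed(&designated));
	assert(report_is_zeroed(&zeroed));
}
```

The "{ 0 }" form says "everything is zero" without singling out one
member, which is why it reads better when no member is special.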
refs/files-backend.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0824c0b8a9..03d2503276 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3520,7 +3520,7 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
goto cleanup;
if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
- struct fsck_ref_report report = { .path = NULL };
+ struct fsck_ref_report report = { 0 };
strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
report.path = sb.buf;
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* [PATCH v8 2/9] ref: check the full refname instead of basename
2024-11-14 16:51 ` [PATCH v8 " shejialuo
2024-11-14 16:53 ` [PATCH v8 1/9] ref: initialize "fsck_ref_report" with zero shejialuo
@ 2024-11-14 16:54 ` shejialuo
2024-11-14 16:54 ` [PATCH v8 3/9] ref: initialize ref name outside of check functions shejialuo
` (8 subsequent siblings)
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-14 16:54 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
In "files-backend.c::files_fsck_refs_name", we validate the refname
format by using "check_refname_format" to check the basename of the
iterator with "REFNAME_ALLOW_ONELEVEL" flag.
However, this is a bad implementation. Although we do not allow a
single "@" in the ".git" directory, we do allow "refs/heads/@". So, by
checking only the one-level refname "@", we wrongly report an error
when there is a "refs/heads/@" ref.
Because we check only the one-level refname, we also cannot check the
other parts of the full refname. As a result, we miss errors such as
the following:
"refs/heads/ new-feature/test"
"refs/heads/~new-feature/test"
In order to fix the above problem, enhance "files_fsck_refs_name" to use
the full name for "check_refname_format". Then, replace the tests which
are related to "@" and add tests to exercise the above situations, using
a for loop to avoid repetition.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
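To illustrate the two failure modes described in the message above, here
is a small sketch. It is an assumption-laden toy: "toy_check_refname"
and "toy_basename" are hypothetical helpers that implement only a few
rules (a lone "@", empty components, a leading '.', and ' ' or '~'
inside a component). It is not git's "check_refname_format" and does not
cover the real rule set.

```c
#include <assert.h>
#include <string.h>

/*
 * Toy stand-in for a refname check; NOT git's check_refname_format().
 * Rejects an empty name, a name that is exactly "@", and any
 * '/'-separated component that is empty, starts with '.', or contains
 * ' ' or '~'. Returns 0 when the name is accepted and -1 otherwise.
 */
static int toy_check_refname(const char *name)
{
	const char *p = name;

	if (!*name || !strcmp(name, "@"))
		return -1;

	while (*p) {
		size_t len = strcspn(p, "/");

		if (!len || p[0] == '.')
			return -1;
		if (memchr(p, ' ', len) || memchr(p, '~', len))
			return -1;

		p += len;
		if (*p == '/') {
			p++;
			if (!*p)
				return -1; /* trailing '/' leaves an empty component */
		}
	}
	return 0;
}

/* Returns the part after the last '/', or the whole name if there is none. */
static const char *toy_basename(const char *name)
{
	const char *slash = strrchr(name, '/');

	return slash ? slash + 1 : name;
}

static void toy_refname_demo(void)
{
	/* Basename-only check: false positive on "@" ... */
	assert(toy_check_refname(toy_basename("refs/heads/@")) == -1);
	assert(toy_check_refname("refs/heads/@") == 0);

	/* ... and a missed error in an earlier component. */
	assert(toy_check_refname(toy_basename("refs/heads/ new-feature/test")) == 0);
	assert(toy_check_refname("refs/heads/ new-feature/test") == -1);
}
```

Checking the full name removes both problems at once, because every
component is inspected and the lone "@" rule no longer fires on a
basename.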
refs/files-backend.c | 7 ++-
t/t0602-reffiles-fsck.sh | 92 ++++++++++++++++++++++++----------------
2 files changed, 60 insertions(+), 39 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 03d2503276..b055edc061 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3519,10 +3519,13 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
if (iter->basename[0] != '.' && ends_with(iter->basename, ".lock"))
goto cleanup;
- if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
+ /*
+ * This works right now because we never check the root refs.
+ */
+ strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
+ if (check_refname_format(sb.buf, 0)) {
struct fsck_ref_report report = { 0 };
- strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
report.path = sb.buf;
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_NAME,
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 71a4d1a5ae..2a172c913d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -18,63 +18,81 @@ test_expect_success 'ref name should be checked' '
cd repo &&
git commit --allow-empty -m initial &&
- git checkout -b branch-1 &&
- git tag tag-1 &&
- git commit --allow-empty -m second &&
- git checkout -b branch-2 &&
- git tag tag-2 &&
- git tag multi_hierarchy/tag-2 &&
+ git checkout -b default-branch &&
+ git tag default-tag &&
+ git tag multi_hierarchy/default-tag &&
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/.branch-1: badRefName: invalid refname format
- EOF
- rm $branch_dir_prefix/.branch-1 &&
- test_cmp expect err &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/@: badRefName: invalid refname format
- EOF
+ cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
rm $branch_dir_prefix/@ &&
- test_cmp expect err &&
- cp $tag_dir_prefix/multi_hierarchy/tag-2 $tag_dir_prefix/multi_hierarchy/@ &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/multi_hierarchy/@: badRefName: invalid refname format
- EOF
- rm $tag_dir_prefix/multi_hierarchy/@ &&
- test_cmp expect err &&
-
- cp $tag_dir_prefix/tag-1 $tag_dir_prefix/tag-1.lock &&
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
git refs verify 2>err &&
rm $tag_dir_prefix/tag-1.lock &&
test_must_be_empty err &&
- cp $tag_dir_prefix/tag-1 $tag_dir_prefix/.lock &&
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
error: refs/tags/.lock: badRefName: invalid refname format
EOF
rm $tag_dir_prefix/.lock &&
- test_cmp expect err
+ test_cmp expect err &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname: badRefName: invalid refname format
+ EOF
+ rm "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ mkdir "$branch_dir_prefix/$refname" &&
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ EOF
+ rm -r "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done
'
test_expect_success 'ref name check should be adapted into fsck messages' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- tag_dir_prefix=.git/refs/tags &&
cd repo &&
git commit --allow-empty -m initial &&
git checkout -b branch-1 &&
- git tag tag-1 &&
- git commit --allow-empty -m second &&
- git checkout -b branch-2 &&
- git tag tag-2 &&
cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
git -c fsck.badRefName=warn refs verify 2>err &&
@@ -84,7 +102,7 @@ test_expect_success 'ref name check should be adapted into fsck messages' '
rm $branch_dir_prefix/.branch-1 &&
test_cmp expect err &&
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
git -c fsck.badRefName=ignore refs verify 2>err &&
test_must_be_empty err
'
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* [PATCH v8 3/9] ref: initialize ref name outside of check functions
2024-11-14 16:51 ` [PATCH v8 " shejialuo
2024-11-14 16:53 ` [PATCH v8 1/9] ref: initialize "fsck_ref_report" with zero shejialuo
2024-11-14 16:54 ` [PATCH v8 2/9] ref: check the full refname instead of basename shejialuo
@ 2024-11-14 16:54 ` shejialuo
2024-11-14 16:54 ` [PATCH v8 4/9] ref: support multiple worktrees check for refs shejialuo
` (7 subsequent siblings)
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-14 16:54 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We pass "refs_check_dir" to the "files_fsck_refs_name" function, which
allows it to create the checked ref name later. However, when we
introduce a new check function, we have to allocate redundant memory and
re-calculate the ref name. Instead of duplicating the allocation and
the logic, we should allocate and calculate the name only once and pass
it to the check functions.
To avoid repeating the calculation, rename the "refs_check_dir"
parameter of the check functions to "refname". And in
"files_fsck_refs_dir", create a new strbuf "refname", so that whenever we
handle a new ref, we calculate the name once and call the check
functions one by one.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
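The shape of the refactoring described above can be sketched outside of
git as follows. All names here ("check_fn_sketch", "run_checks_sketch",
and the two example checks) are hypothetical and chosen for
illustration; the fixed-size buffer is also an assumption of the sketch,
whereas the patch uses a strbuf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical check signature: receives the already-built ref name. */
typedef int (*check_fn_sketch)(const char *refname);

static int check_not_empty(const char *refname)
{
	return *refname ? 0 : -1;
}

static int check_no_tilde(const char *refname)
{
	return strchr(refname, '~') ? -1 : 0;
}

/*
 * Build "<dir>/<relative_path>" once, then hand the same string to every
 * check in the NULL-terminated array. Returns -1 if any check fails or
 * if the name does not fit in the buffer, and 0 otherwise.
 */
static int run_checks_sketch(const char *dir, const char *relative_path,
			     check_fn_sketch *checks)
{
	char refname[256];
	int ret = 0;
	int n = snprintf(refname, sizeof(refname), "%s/%s", dir, relative_path);

	if (n < 0 || (size_t)n >= sizeof(refname))
		return -1;

	for (size_t i = 0; checks[i]; i++)
		if (checks[i](refname))
			ret = -1;
	return ret;
}

static void run_checks_demo(void)
{
	check_fn_sketch checks[] = { check_not_empty, check_no_tilde, NULL };

	assert(run_checks_sketch("refs", "heads/main", checks) == 0);
	assert(run_checks_sketch("refs", "heads/~bad", checks) == -1);
}
```

Adding another check then only means appending one function to the
array; none of the checks needs to know how the name was assembled.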
refs/files-backend.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b055edc061..8edb700568 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3501,12 +3501,12 @@ static int files_ref_store_remove_on_disk(struct ref_store *ref_store,
*/
typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
struct fsck_options *o,
- const char *refs_check_dir,
+ const char *refname,
struct dir_iterator *iter);
static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
struct fsck_options *o,
- const char *refs_check_dir,
+ const char *refname,
struct dir_iterator *iter)
{
struct strbuf sb = STRBUF_INIT;
@@ -3522,11 +3522,10 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
/*
* This works right now because we never check the root refs.
*/
- strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
- if (check_refname_format(sb.buf, 0)) {
+ if (check_refname_format(refname, 0)) {
struct fsck_ref_report report = { 0 };
- report.path = sb.buf;
+ report.path = refname;
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_NAME,
"invalid refname format");
@@ -3542,6 +3541,7 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
const char *refs_check_dir,
files_fsck_refs_fn *fsck_refs_fn)
{
+ struct strbuf refname = STRBUF_INIT;
struct strbuf sb = STRBUF_INIT;
struct dir_iterator *iter;
int iter_status;
@@ -3560,11 +3560,15 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
continue;
} else if (S_ISREG(iter->st.st_mode) ||
S_ISLNK(iter->st.st_mode)) {
+ strbuf_reset(&refname);
+ strbuf_addf(&refname, "%s/%s", refs_check_dir,
+ iter->relative_path);
+
if (o->verbose)
- fprintf_ln(stderr, "Checking %s/%s",
- refs_check_dir, iter->relative_path);
+ fprintf_ln(stderr, "Checking %s", refname.buf);
+
for (size_t i = 0; fsck_refs_fn[i]; i++) {
- if (fsck_refs_fn[i](ref_store, o, refs_check_dir, iter))
+ if (fsck_refs_fn[i](ref_store, o, refname.buf, iter))
ret = -1;
}
} else {
@@ -3581,6 +3585,7 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
out:
strbuf_release(&sb);
+ strbuf_release(&refname);
return ret;
}
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* [PATCH v8 4/9] ref: support multiple worktrees check for refs
2024-11-14 16:51 ` [PATCH v8 " shejialuo
` (2 preceding siblings ...)
2024-11-14 16:54 ` [PATCH v8 3/9] ref: initialize ref name outside of check functions shejialuo
@ 2024-11-14 16:54 ` shejialuo
2024-11-14 16:54 ` [PATCH v8 5/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
` (6 subsequent siblings)
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-14 16:54 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We have already set up the infrastructure to check the consistency of
refs, but we do not support multiple worktrees. However, "git-fsck(1)"
will check the refs of worktrees. As we have decided to reach feature
parity with "git-fsck(1)", we need to set up support for multiple
worktrees.
Because each worktree has its own specific refs, instead of just showing
the users "refs/worktree/foo", we need to display the full name such as
"worktrees/<id>/refs/worktree/foo". So we need to know the id of the
worktree to get the full name. Add a new parameter "struct worktree *"
to "refs-internal.h::fsck_fn". Then change the related functions to
follow this new interface.
The "packed-refs" file only exists in the main worktree, so we should
only check "packed-refs" in the main worktree. Use the
"is_main_worktree" function to skip checking "packed-refs" in the
"packed_fsck" function for other worktrees.
Then, enhance the "files-backend.c::files_fsck_refs_dir" function to add
the "worktrees/<id>/" prefix when we are not in the main worktree.
Last, add a new test to check the refname when there are multiple
worktrees to exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
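The naming rule from the message above, shown as a standalone sketch.
The function "format_ref_display_name" is hypothetical, and using a NULL
id to mean "main worktree" is an assumption of this sketch; the patch
itself asks "is_main_worktree()" and appends to a strbuf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the name shown to the user. A NULL worktree_id stands for the
 * main worktree (a convention of this sketch only).
 * Returns 0 on success, -1 if the result does not fit in "out".
 */
static int format_ref_display_name(char *out, size_t outlen,
				   const char *worktree_id,
				   const char *refs_check_dir,
				   const char *relative_path)
{
	int n;

	if (worktree_id)
		n = snprintf(out, outlen, "worktrees/%s/%s/%s",
			     worktree_id, refs_check_dir, relative_path);
	else
		n = snprintf(out, outlen, "%s/%s",
			     refs_check_dir, relative_path);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

static void format_ref_display_name_demo(void)
{
	char buf[128];

	assert(!format_ref_display_name(buf, sizeof(buf), NULL,
					"refs", "heads/main"));
	assert(!strcmp(buf, "refs/heads/main"));

	assert(!format_ref_display_name(buf, sizeof(buf), "worktree-1",
					"refs", "worktree/foo"));
	assert(!strcmp(buf, "worktrees/worktree-1/refs/worktree/foo"));
}
```

With the prefix in place, two worktrees that both own a
"refs/worktree/foo" produce distinct lines in the report.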
builtin/refs.c | 10 ++++++--
refs.c | 5 ++--
refs.h | 3 ++-
refs/debug.c | 5 ++--
refs/files-backend.c | 17 ++++++++++----
refs/packed-backend.c | 8 ++++++-
refs/refs-internal.h | 3 ++-
refs/reftable-backend.c | 3 ++-
t/t0602-reffiles-fsck.sh | 51 ++++++++++++++++++++++++++++++++++++++++
9 files changed, 90 insertions(+), 15 deletions(-)
diff --git a/builtin/refs.c b/builtin/refs.c
index 24978a7b7b..394b4101c6 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -5,6 +5,7 @@
#include "parse-options.h"
#include "refs.h"
#include "strbuf.h"
+#include "worktree.h"
#define REFS_MIGRATE_USAGE \
N_("git refs migrate --ref-format=<format> [--dry-run]")
@@ -66,6 +67,7 @@ static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
{
struct fsck_options fsck_refs_options = FSCK_REFS_OPTIONS_DEFAULT;
+ struct worktree **worktrees;
const char * const verify_usage[] = {
REFS_VERIFY_USAGE,
NULL,
@@ -75,7 +77,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
OPT_BOOL(0, "strict", &fsck_refs_options.strict, N_("enable strict checking")),
OPT_END(),
};
- int ret;
+ int ret = 0;
argc = parse_options(argc, argv, prefix, options, verify_usage, 0);
if (argc)
@@ -84,9 +86,13 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
git_config(git_fsck_config, &fsck_refs_options);
prepare_repo_settings(the_repository);
- ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options);
+ worktrees = get_worktrees();
+ for (size_t i = 0; worktrees[i]; i++)
+ ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
+ &fsck_refs_options, worktrees[i]);
fsck_options_clear(&fsck_refs_options);
+ free_worktrees(worktrees);
return ret;
}
diff --git a/refs.c b/refs.c
index 5f729ed412..395a17273c 100644
--- a/refs.c
+++ b/refs.c
@@ -318,9 +318,10 @@ int check_refname_format(const char *refname, int flags)
return check_or_sanitize_refname(refname, flags, NULL);
}
-int refs_fsck(struct ref_store *refs, struct fsck_options *o)
+int refs_fsck(struct ref_store *refs, struct fsck_options *o,
+ struct worktree *wt)
{
- return refs->be->fsck(refs, o);
+ return refs->be->fsck(refs, o, wt);
}
void sanitize_refname_component(const char *refname, struct strbuf *out)
diff --git a/refs.h b/refs.h
index 108dfc93b3..341d43239c 100644
--- a/refs.h
+++ b/refs.h
@@ -549,7 +549,8 @@ int check_refname_format(const char *refname, int flags);
* reflogs are consistent, and non-zero otherwise. The errors will be
* written to stderr.
*/
-int refs_fsck(struct ref_store *refs, struct fsck_options *o);
+int refs_fsck(struct ref_store *refs, struct fsck_options *o,
+ struct worktree *wt);
/*
* Apply the rules from check_refname_format, but mutate the result until it
diff --git a/refs/debug.c b/refs/debug.c
index 45e2e784a0..72e80ddd6d 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -420,10 +420,11 @@ static int debug_reflog_expire(struct ref_store *ref_store, const char *refname,
}
static int debug_fsck(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
struct debug_ref_store *drefs = (struct debug_ref_store *)ref_store;
- int res = drefs->refs->be->fsck(drefs->refs, o);
+ int res = drefs->refs->be->fsck(drefs->refs, o, wt);
trace_printf_key(&trace_refs, "fsck: %d\n", res);
return res;
}
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 8edb700568..8bfdce64bc 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -23,6 +23,7 @@
#include "../dir.h"
#include "../chdir-notify.h"
#include "../setup.h"
+#include "../worktree.h"
#include "../wrapper.h"
#include "../write-or-die.h"
#include "../revision.h"
@@ -3539,6 +3540,7 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
static int files_fsck_refs_dir(struct ref_store *ref_store,
struct fsck_options *o,
const char *refs_check_dir,
+ struct worktree *wt,
files_fsck_refs_fn *fsck_refs_fn)
{
struct strbuf refname = STRBUF_INIT;
@@ -3561,6 +3563,9 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
} else if (S_ISREG(iter->st.st_mode) ||
S_ISLNK(iter->st.st_mode)) {
strbuf_reset(&refname);
+
+ if (!is_main_worktree(wt))
+ strbuf_addf(&refname, "worktrees/%s/", wt->id);
strbuf_addf(&refname, "%s/%s", refs_check_dir,
iter->relative_path);
@@ -3590,7 +3595,8 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
}
static int files_fsck_refs(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
files_fsck_refs_fn fsck_refs_fn[]= {
files_fsck_refs_name,
@@ -3599,17 +3605,18 @@ static int files_fsck_refs(struct ref_store *ref_store,
if (o->verbose)
fprintf_ln(stderr, _("Checking references consistency"));
- return files_fsck_refs_dir(ref_store, o, "refs", fsck_refs_fn);
+ return files_fsck_refs_dir(ref_store, o, "refs", wt, fsck_refs_fn);
}
static int files_fsck(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
struct files_ref_store *refs =
files_downcast(ref_store, REF_STORE_READ, "fsck");
- return files_fsck_refs(ref_store, o) |
- refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
+ return files_fsck_refs(ref_store, o, wt) |
+ refs->packed_ref_store->be->fsck(refs->packed_ref_store, o, wt);
}
struct ref_storage_be refs_be_files = {
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 07c57fd541..46dcaec654 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -13,6 +13,7 @@
#include "../lockfile.h"
#include "../chdir-notify.h"
#include "../statinfo.h"
+#include "../worktree.h"
#include "../wrapper.h"
#include "../write-or-die.h"
#include "../trace2.h"
@@ -1754,8 +1755,13 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
}
static int packed_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED)
+ struct fsck_options *o UNUSED,
+ struct worktree *wt)
{
+
+ if (!is_main_worktree(wt))
+ return 0;
+
return 0;
}
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 2313c830d8..037d7991cd 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -653,7 +653,8 @@ typedef int read_symbolic_ref_fn(struct ref_store *ref_store, const char *refnam
struct strbuf *referent);
typedef int fsck_fn(struct ref_store *ref_store,
- struct fsck_options *o);
+ struct fsck_options *o,
+ struct worktree *wt);
struct ref_storage_be {
const char *name;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index f5f957e6de..b6a63c1015 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -2443,7 +2443,8 @@ static int reftable_be_reflog_expire(struct ref_store *ref_store,
}
static int reftable_be_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED)
+ struct fsck_options *o UNUSED,
+ struct worktree *wt UNUSED)
{
return 0;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 2a172c913d..1e17393a3d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -107,4 +107,55 @@ test_expect_success 'ref name check should be adapted into fsck messages' '
test_must_be_empty err
'
+test_expect_success 'ref name check should work for multiple worktrees' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+
+ cd repo &&
+ test_commit initial &&
+ git checkout -b branch-1 &&
+ test_commit second &&
+ git checkout -b branch-2 &&
+ test_commit third &&
+ git checkout -b branch-3 &&
+ git worktree add ./worktree-1 branch-1 &&
+ git worktree add ./worktree-2 branch-2 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+
+ cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
+ cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err &&
+
+ for worktree in "worktree-1" "worktree-2"
+ do
+ (
+ cd $worktree &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err || return 1
+ )
+ done
+'
+
test_done
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* [PATCH v8 5/9] ref: port git-fsck(1) regular refs check for files backend
2024-11-14 16:51 ` [PATCH v8 " shejialuo
` (3 preceding siblings ...)
2024-11-14 16:54 ` [PATCH v8 4/9] ref: support multiple worktrees check for refs shejialuo
@ 2024-11-14 16:54 ` shejialuo
2024-11-15 7:11 ` Patrick Steinhardt
2024-11-14 16:54 ` [PATCH v8 6/9] ref: add more strict checks for regular refs shejialuo
` (5 subsequent siblings)
10 siblings, 1 reply; 209+ messages in thread
From: shejialuo @ 2024-11-14 16:54 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
"git-fsck(1)" implicitly checks the ref content by passing the
callback "fsck_handle_ref" to "refs.c::refs_for_each_rawref".
Then, it checks whether the ref content (eventually the "oid")
is valid. If not, it reports the following error to the user.
error: refs/heads/main: invalid sha1 pointer 0000...
It also wrongly reports the above error when there are dangling symrefs
in the repository. This does not align with the behavior of
the "git symbolic-ref" command, which allows users to create dangling
symrefs.
As we have already introduced the "git refs verify" command, we'd better
check the ref content explicitly in the "git refs verify" command, so
that later we can remove these checks from "git-fsck(1)" and launch a
subprocess to call "git refs verify" in "git-fsck(1)", making
"git-fsck(1)" cleaner.
Following what "git-fsck(1)" does, add a similar check to "git refs
verify". Then add a new fsck error message "badRefContent(ERROR)" to
represent that a ref has invalid content.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
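One detail of the new check is that a ref file may disappear between the
directory scan and the read. A minimal sketch of that handling, assuming
a POSIX system where "fopen" sets "errno": the function
"read_ref_file_sketch" and its return convention are hypothetical; the
patch uses "strbuf_read_file" and git's own error reporting instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Try to read the start of a loose ref file into "buf".
 * Returns 1 if the file was read, 0 if it does not exist (ENOENT, for
 * example because a concurrent process removed it), and -1 on any other
 * error.
 */
static int read_ref_file_sketch(const char *path, char *buf, size_t buflen)
{
	FILE *fp;
	size_t n;

	if (!buflen)
		return -1;

	fp = fopen(path, "r");
	if (!fp) {
		if (errno == ENOENT)
			return 0; /* not an error: skip to the next ref */
		perror(path);
		return -1;
	}

	n = fread(buf, 1, buflen - 1, fp);
	if (ferror(fp)) {
		fclose(fp);
		return -1;
	}
	buf[n] = '\0';
	fclose(fp);
	return 1;
}

static void read_ref_file_demo(void)
{
	char buf[64];

	/* A path under a directory that does not exist yields ENOENT. */
	assert(read_ref_file_sketch("no-such-dir-for-sketch/ref",
				    buf, sizeof(buf)) == 0);
}
```

Treating a vanished file as "nothing to check" keeps a concurrent ref
deletion from being reported as corruption.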
Documentation/fsck-msgids.txt | 3 +
fsck.h | 1 +
refs/files-backend.c | 48 ++++++++++++++++
t/t0602-reffiles-fsck.sh | 105 ++++++++++++++++++++++++++++++++++
4 files changed, 157 insertions(+)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 68a2801f15..22c385ea22 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -19,6 +19,9 @@
`badParentSha1`::
(ERROR) A commit object has a bad parent sha1.
+`badRefContent`::
+ (ERROR) A ref has bad content.
+
`badRefFiletype`::
(ERROR) A ref has a bad file type.
diff --git a/fsck.h b/fsck.h
index 500b4c04d2..0d99a87911 100644
--- a/fsck.h
+++ b/fsck.h
@@ -31,6 +31,7 @@ enum fsck_msg_type {
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
+ FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
FUNC(BAD_REF_NAME, ERROR) \
FUNC(BAD_TIMEZONE, ERROR) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 8bfdce64bc..f81b4c8dd5 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3505,6 +3505,53 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
const char *refname,
struct dir_iterator *iter);
+static int files_fsck_refs_content(struct ref_store *ref_store,
+ struct fsck_options *o,
+ const char *target_name,
+ struct dir_iterator *iter)
+{
+ struct strbuf ref_content = STRBUF_INIT;
+ struct strbuf referent = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ unsigned int type = 0;
+ int failure_errno = 0;
+ struct object_id oid;
+ int ret = 0;
+
+ report.path = target_name;
+
+ if (S_ISLNK(iter->st.st_mode))
+ goto cleanup;
+
+ if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0 ) {
+ /*
+ * Ref file could be removed by another concurrent process. We should
+ * ignore this error and continue to the next ref.
+ */
+ if (errno == ENOENT)
+ goto cleanup;
+
+ ret = error_errno(_("cannot read ref file '%s'"),
+ iter->path.buf);
+ goto cleanup;
+ }
+
+ if (parse_loose_ref_contents(ref_store->repo->hash_algo,
+ ref_content.buf, &oid, &referent,
+ &type, &failure_errno)) {
+ strbuf_rtrim(&ref_content);
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_CONTENT,
+ "%s", ref_content.buf);
+ goto cleanup;
+ }
+
+cleanup:
+ strbuf_release(&ref_content);
+ strbuf_release(&referent);
+ return ret;
+}
+
static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
struct fsck_options *o,
const char *refname,
@@ -3600,6 +3647,7 @@ static int files_fsck_refs(struct ref_store *ref_store,
{
files_fsck_refs_fn fsck_refs_fn[]= {
files_fsck_refs_name,
+ files_fsck_refs_content,
NULL,
};
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 1e17393a3d..162370077b 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -158,4 +158,109 @@ test_expect_success 'ref name check should work for multiple worktrees' '
done
'
+test_expect_success 'regular ref content should be checked (individual)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
+
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err || return 1
+ done
+'
+
+test_expect_success 'regular ref content should be checked (aggregate)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ bad_content_1=$(git rev-parse main)x &&
+ bad_content_2=xfsazqfxcadas &&
+ bad_content_3=Xfsazqfxcadas &&
+ printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
+ printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
+ printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
+ error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
+ error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+'
+
+test_expect_success 'ref content checks should work with worktrees' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ EOF
+ rm $worktree1_refdir_prefix/bad-branch-1 &&
+ test_cmp expect err || return 1
+ done &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ EOF
+ rm $worktree2_refdir_prefix/bad-branch-2 &&
+ test_cmp expect err || return 1
+ done
+'
+
test_done
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* Re: [PATCH v8 5/9] ref: port git-fsck(1) regular refs check for files backend
2024-11-14 16:54 ` [PATCH v8 5/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
@ 2024-11-15 7:11 ` Patrick Steinhardt
2024-11-15 11:08 ` shejialuo
0 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-11-15 7:11 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Fri, Nov 15, 2024 at 12:54:28AM +0800, shejialuo wrote:
> + if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0 ) {
Nit: there's a space too much here now.
> + /*
> + * Ref file could be removed by another concurrent process. We should
> + * ignore this error and continue to the next ref.
> + */
> + if (errno == ENOENT)
> + goto cleanup;
> +
> + ret = error_errno(_("cannot read ref file '%s': %s"),
> + iter->path.buf, strerror(errno));
> + goto cleanup;
> + }
You report `errno` twice. This should be:
ret = error_errno(_("cannot read ref file '%s'"), iter->path.buf);
Other than that this version looks good to me, thanks!
Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* Re: [PATCH v8 5/9] ref: port git-fsck(1) regular refs check for files backend
2024-11-15 7:11 ` Patrick Steinhardt
@ 2024-11-15 11:08 ` shejialuo
0 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-15 11:08 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano
On Fri, Nov 15, 2024 at 08:11:01AM +0100, Patrick Steinhardt wrote:
> On Fri, Nov 15, 2024 at 12:54:28AM +0800, shejialuo wrote:
> > + if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0 ) {
>
> Nit: there's a space too much here now.
>
I will improve this in the next version.
> > + /*
> > + * Ref file could be removed by another concurrent process. We should
> > + * ignore this error and continue to the next ref.
> > + */
> > + if (errno == ENOENT)
> > + goto cleanup;
> > +
> > + ret = error_errno(_("cannot read ref file '%s': %s"),
> > + iter->path.buf, strerror(errno));
> > + goto cleanup;
> > + }
>
> You report `errno` twice. This should be:
>
> ret = error_errno(_("cannot read ref file '%s'"), iter->path.buf);
>
> Other than that this version looks good to me, thanks!
>
Oops, I didn't think about it; I just copied it. I will fix this in the
next version.
> Patrick
^ permalink raw reply [flat|nested] 209+ messages in thread
* [PATCH v8 6/9] ref: add more strict checks for regular refs
2024-11-14 16:51 ` [PATCH v8 " shejialuo
` (4 preceding siblings ...)
2024-11-14 16:54 ` [PATCH v8 5/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
@ 2024-11-14 16:54 ` shejialuo
2024-11-14 16:54 ` [PATCH v8 7/9] ref: add basic symref content check for files backend shejialuo
` (4 subsequent siblings)
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-14 16:54 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We already use the "parse_loose_ref_contents" function to check whether
the ref content is valid in the files backend. However,
"parse_loose_ref_contents" allows the ref's content to end with garbage
or without a newline.
Even though we never create such loose refs ourselves, we have always
accepted them. So, it is entirely possible that some third-party tools
rely on such loose refs being valid. We should not report an fsck error
message at present. Instead, we should notify the users about such
"curiously formatted" loose refs so that adequate care is taken before
we decide to tighten the rules in the future.
Reporting an fsck warning message to the user is not suitable either.
We don't yet want the "--strict" flag that controls this bit to end up
generating errors for such weirdly-formatted reference contents, as we
first want to assess whether this retroactive tightening will cause
issues for any tools out there. It may cause compatibility issues which
may break the repository. So, we add the following two fsck infos to
represent the situation where the ref content ends without a newline or
has trailing garbage:
1. refMissingNewline(INFO): A loose ref that does not end with
newline(LF).
2. trailingRefContent(INFO): A loose ref has trailing content.
It might appear that we can't provide the user with any warnings by
using FSCK_INFO. However, in "fsck.c::fsck_vreport", we will convert
FSCK_INFO to FSCK_WARN and we can still warn the user about these
situations when using "git refs verify" without introducing
compatibility issues.
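The two checks described above can be sketched in isolation as follows.
This is only a rough illustration, not the code from this patch: the enum
and the name "classify_trailing" are made up here, and "trailing" is
assumed to point at the first byte after the object ID, which is what
"parse_loose_ref_contents" hands back through its new parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; the patch reports fsck messages instead. */
enum trailing_kind {
	TRAILING_OK,              /* exactly one LF after the object ID */
	TRAILING_MISSING_NEWLINE, /* nothing at all after the object ID */
	TRAILING_GARBAGE          /* anything else */
};

/*
 * "trailing" is assumed to point at the first byte after the object ID
 * in a NUL-terminated buffer.
 */
static enum trailing_kind classify_trailing(const char *trailing)
{
	assert(trailing != NULL);

	if (!*trailing)
		return TRAILING_MISSING_NEWLINE;
	if (*trailing != '\n' || *(trailing + 1))
		return TRAILING_GARBAGE;
	return TRAILING_OK;
}
```

Note that extra LFs after the first one count as trailing content in this
scheme, which matches the "branch-garbage-special" test cases in this
patch.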
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 14 +++++++++
fsck.h | 2 ++
refs.c | 2 +-
refs/files-backend.c | 26 ++++++++++++++--
refs/refs-internal.h | 2 +-
t/t0602-reffiles-fsck.sh | 57 +++++++++++++++++++++++++++++++++--
6 files changed, 96 insertions(+), 7 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 22c385ea22..6db0eaa84a 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -173,6 +173,20 @@
`nullSha1`::
(WARN) Tree contains entries pointing to a null sha1.
+`refMissingNewline`::
+ (INFO) A loose ref that does not end with newline(LF). As
+ valid implementations of Git never created such a loose ref
+ file, it may become an error in the future. Report to the
+ git@vger.kernel.org mailing list if you see this error, as
+ we need to know what tools created such a file.
+
+`trailingRefContent`::
+ (INFO) A loose ref has trailing content. As valid implementations
+ of Git never created such a loose ref file, it may become an
+ error in the future. Report to the git@vger.kernel.org mailing
+ list if you see this error, as we need to know what tools
+ created such a file.
+
`treeNotSorted`::
(ERROR) A tree is not properly sorted.
diff --git a/fsck.h b/fsck.h
index 0d99a87911..b85072df57 100644
--- a/fsck.h
+++ b/fsck.h
@@ -85,6 +85,8 @@ enum fsck_msg_type {
FUNC(MAILMAP_SYMLINK, INFO) \
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
+ FUNC(REF_MISSING_NEWLINE, INFO) \
+ FUNC(TRAILING_REF_CONTENT, INFO) \
/* ignored (elevated when requested) */ \
FUNC(EXTRA_HEADER_ENTRY, IGNORE)
diff --git a/refs.c b/refs.c
index 395a17273c..f88b32a633 100644
--- a/refs.c
+++ b/refs.c
@@ -1789,7 +1789,7 @@ static int refs_read_special_head(struct ref_store *ref_store,
}
result = parse_loose_ref_contents(ref_store->repo->hash_algo, content.buf,
- oid, referent, type, failure_errno);
+ oid, referent, type, NULL, failure_errno);
done:
strbuf_release(&full_path);
diff --git a/refs/files-backend.c b/refs/files-backend.c
index f81b4c8dd5..a325b102b8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -569,7 +569,7 @@ static int read_ref_internal(struct ref_store *ref_store, const char *refname,
buf = sb_contents.buf;
ret = parse_loose_ref_contents(ref_store->repo->hash_algo, buf,
- oid, referent, type, &myerr);
+ oid, referent, type, NULL, &myerr);
out:
if (ret && !myerr)
@@ -606,7 +606,7 @@ static int files_read_symbolic_ref(struct ref_store *ref_store, const char *refn
int parse_loose_ref_contents(const struct git_hash_algo *algop,
const char *buf, struct object_id *oid,
struct strbuf *referent, unsigned int *type,
- int *failure_errno)
+ const char **trailing, int *failure_errno)
{
const char *p;
if (skip_prefix(buf, "ref:", &buf)) {
@@ -628,6 +628,10 @@ int parse_loose_ref_contents(const struct git_hash_algo *algop,
*failure_errno = EINVAL;
return -1;
}
+
+ if (trailing)
+ *trailing = p;
+
return 0;
}
@@ -3513,6 +3517,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
struct strbuf ref_content = STRBUF_INIT;
struct strbuf referent = STRBUF_INIT;
struct fsck_ref_report report = { 0 };
+ const char *trailing = NULL;
unsigned int type = 0;
int failure_errno = 0;
struct object_id oid;
@@ -3538,7 +3543,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
if (parse_loose_ref_contents(ref_store->repo->hash_algo,
ref_content.buf, &oid, &referent,
- &type, &failure_errno)) {
+ &type, &trailing, &failure_errno)) {
strbuf_rtrim(&ref_content);
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_CONTENT,
@@ -3546,6 +3551,21 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
goto cleanup;
}
+ if (!(type & REF_ISSYMREF)) {
+ if (!*trailing) {
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_REF_MISSING_NEWLINE,
+ "misses LF at the end");
+ goto cleanup;
+ }
+ if (*trailing != '\n' || *(trailing + 1)) {
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_TRAILING_REF_CONTENT,
+ "has trailing garbage: '%s'", trailing);
+ goto cleanup;
+ }
+ }
+
cleanup:
strbuf_release(&ref_content);
strbuf_release(&referent);
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 037d7991cd..125f1fe735 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -716,7 +716,7 @@ struct ref_store {
int parse_loose_ref_contents(const struct git_hash_algo *algop,
const char *buf, struct object_id *oid,
struct strbuf *referent, unsigned int *type,
- int *failure_errno);
+ const char **trailing, int *failure_errno);
/*
* Fill in the generic part of refs and add it to our collection of
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 162370077b..33e7a390ad 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -189,7 +189,48 @@ test_expect_success 'regular ref content should be checked (individual)' '
EOF
rm $branch_dir_prefix/a/b/branch-bad &&
test_cmp expect err || return 1
- done
+ done &&
+
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+
+
+ '\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err &&
+
+ printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+
+
+ garbage'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err
'
test_expect_success 'regular ref content should be checked (aggregate)' '
@@ -207,12 +248,16 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
EOF
sort err >sorted_err &&
test_cmp expect sorted_err
@@ -260,7 +305,15 @@ test_expect_success 'ref content checks should work with worktrees' '
EOF
rm $worktree2_refdir_prefix/bad-branch-2 &&
test_cmp expect err || return 1
- done
+ done &&
+
+ printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err
'
test_done
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* [PATCH v8 7/9] ref: add basic symref content check for files backend
2024-11-14 16:51 ` [PATCH v8 " shejialuo
` (5 preceding siblings ...)
2024-11-14 16:54 ` [PATCH v8 6/9] ref: add more strict checks for regular refs shejialuo
@ 2024-11-14 16:54 ` shejialuo
2024-11-14 16:54 ` [PATCH v8 8/9] ref: check whether the target of the symref is a ref shejialuo
` (3 subsequent siblings)
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-14 16:54 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We have code that checks regular ref contents, but we do not yet check
the contents of symbolic refs. By using "parse_loose_ref_contents" for
symbolic refs, we will get the information of the "referent".
We do not need to check the "referent" by opening the file. This is
because if the "referent" exists in the filesystem, we will eventually
check its correctness by inspecting every file in the "refs" directory.
If the "referent" does not exist in the filesystem, this is OK as it is
seen as a dangling symref.
So we just need to check the "referent" string content. A regular ref
could be accepted as a textual symref if it begins with "ref:", followed
by zero or more whitespaces, followed by the full refname, followed only
by whitespace characters. However, we always write a single SP after
"ref:" and a single LF after the refname. It may seem that we should
report an fsck error message when the "referent" does not follow the
above rules, but we should not be so aggressive because third-party
reimplementations of Git may have taken advantage of the looser syntax.
More specifically, we accept the following contents:
1. "ref: refs/heads/master "
2. "ref: refs/heads/master \n \n"
3. "ref: refs/heads/master\n\n"
When introducing the regular ref content checks, we created two fsck
infos "refMissingNewline" and "trailingRefContent" which exactly
represent the above situations. So we will reuse these two fsck messages
to inform the user about these situations.
But we do not allow any other trailing garbage. The following are bad
symref contents which will be reported as fsck errors by "git-fsck(1)".
1. "ref: refs/heads/master garbage\n"
2. "ref: refs/heads/master \n\n\n garbage "
And we introduce a new "badReferentName(ERROR)" fsck message to report
the above errors by using "is_root_ref" and "check_refname_format" to check
the "referent". Since both "is_root_ref" and "check_refname_format"
don't work with whitespaces, we use the trimmed version of "referent"
with these functions.
In order to add checks, we will do the following things:
1. Record the untrimmed length "orig_len" and untrimmed last byte
"orig_last_byte".
2. Use "strbuf_rtrim" to trim the whitespaces or newlines to make sure
"is_root_ref" and "check_refname_format" do not fail because of them.
3. Use "orig_len" and "orig_last_byte" to check whether the "referent"
misses '\n' at the end or it has trailing whitespaces or newlines.
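The three steps above can be sketched on a plain C string as follows.
This is a rough illustration under a few assumptions, not the code from
this patch: the flag names and "classify_symref_tail" are made up here,
the standard isspace() stands in for the trimming done by "strbuf_rtrim",
and the guard for an empty "referent" is an addition of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical flags; the patch reports fsck messages instead. */
#define SYMREF_MISSING_NEWLINE (1 << 0)
#define SYMREF_TRAILING_CONTENT (1 << 1)

/*
 * "referent" is assumed to be the untrimmed text that follows "ref:" and
 * any leading whitespace. Returns a combination of the flags above.
 */
static unsigned int classify_symref_tail(const char *referent)
{
	size_t orig_len, len;
	char orig_last_byte;
	unsigned int flags = 0;

	assert(referent != NULL);

	orig_len = strlen(referent);
	if (!orig_len)
		/* Added guard: avoid reading referent[-1] on empty input. */
		return SYMREF_MISSING_NEWLINE;

	/* Step 1: remember the untrimmed length and last byte. */
	orig_last_byte = referent[orig_len - 1];

	/* Step 2: compute the length after trimming trailing whitespace. */
	len = orig_len;
	while (len && isspace((unsigned char)referent[len - 1]))
		len--;

	/* Step 3: compare the trimmed and untrimmed forms. */
	if (len == orig_len || (len < orig_len && orig_last_byte != '\n'))
		flags |= SYMREF_MISSING_NEWLINE;
	if (len != orig_len && len != orig_len - 1)
		flags |= SYMREF_TRAILING_CONTENT;

	return flags;
}
```

Working on lengths instead of modifying the buffer keeps the sketch free
of any strbuf dependency; the comparisons themselves follow the ones in
"files_fsck_symref_target".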
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 3 +
fsck.h | 1 +
refs/files-backend.c | 40 ++++++++++++
t/t0602-reffiles-fsck.sh | 111 ++++++++++++++++++++++++++++++++++
4 files changed, 155 insertions(+)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 6db0eaa84a..dcea05edfc 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -28,6 +28,9 @@
`badRefName`::
(ERROR) A ref has an invalid format.
+`badReferentName`::
+ (ERROR) The referent name of a symref is invalid.
+
`badTagName`::
(INFO) A tag has an invalid format.
diff --git a/fsck.h b/fsck.h
index b85072df57..5227dfdef2 100644
--- a/fsck.h
+++ b/fsck.h
@@ -34,6 +34,7 @@ enum fsck_msg_type {
FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
FUNC(BAD_REF_NAME, ERROR) \
+ FUNC(BAD_REFERENT_NAME, ERROR) \
FUNC(BAD_TIMEZONE, ERROR) \
FUNC(BAD_TREE, ERROR) \
FUNC(BAD_TREE_SHA1, ERROR) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index a325b102b8..c496006db1 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3509,6 +3509,43 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
const char *refname,
struct dir_iterator *iter);
+static int files_fsck_symref_target(struct fsck_options *o,
+ struct fsck_ref_report *report,
+ struct strbuf *referent)
+{
+ char orig_last_byte;
+ size_t orig_len;
+ int ret = 0;
+
+ orig_len = referent->len;
+ orig_last_byte = referent->buf[orig_len - 1];
+ strbuf_rtrim(referent);
+
+ if (!is_root_ref(referent->buf) &&
+ check_refname_format(referent->buf, 0)) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_BAD_REFERENT_NAME,
+ "points to invalid refname '%s'", referent->buf);
+ goto out;
+ }
+
+ if (referent->len == orig_len ||
+ (referent->len < orig_len && orig_last_byte != '\n')) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_REF_MISSING_NEWLINE,
+ "misses LF at the end");
+ }
+
+ if (referent->len != orig_len && referent->len != orig_len - 1) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_TRAILING_REF_CONTENT,
+ "has trailing whitespaces or newlines");
+ }
+
+out:
+ return ret;
+}
+
static int files_fsck_refs_content(struct ref_store *ref_store,
struct fsck_options *o,
const char *target_name,
@@ -3564,6 +3601,9 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
"has trailing garbage: '%s'", trailing);
goto cleanup;
}
+ } else {
+ ret = files_fsck_symref_target(o, &report, &referent);
+ goto cleanup;
}
cleanup:
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 33e7a390ad..ee1e5f2864 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -263,6 +263,109 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
test_cmp expect sorted_err
'
+test_expect_success 'textual symref content should be checked (individual)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
+ do
+ printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-1 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-2 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-3 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-complicated &&
+ test_cmp expect err
+'
+
+test_expect_success 'textual symref content should be checked (aggregate)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+'
+
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
@@ -313,6 +416,14 @@ test_expect_success 'ref content checks should work with worktrees' '
warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
EOF
rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
+ printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ EOF
+ rm $worktree1_refdir_prefix/branch-garbage &&
test_cmp expect err
'
--
2.47.0
^ permalink raw reply related [flat|nested] 209+ messages in thread
* [PATCH v8 8/9] ref: check whether the target of the symref is a ref
2024-11-14 16:51 ` [PATCH v8 " shejialuo
` (6 preceding siblings ...)
2024-11-14 16:54 ` [PATCH v8 7/9] ref: add basic symref content check for files backend shejialuo
@ 2024-11-14 16:54 ` shejialuo
2024-11-14 16:55 ` [PATCH v8 9/9] ref: add symlink ref content check for files backend shejialuo
` (2 subsequent siblings)
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-14 16:54 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
Ideally, we want users to use "git symbolic-ref" to create symrefs
instead of writing raw contents into the filesystem. However, "git
symbolic-ref" is strict with the refname but not strict with the
referent. For example, we can make the "referent" point at
"$(gitdir)/logs/aaa" and manually write the content into this file, and
we can still successfully parse this symref by using "git rev-parse".
$ git init repo && cd repo && git commit --allow-empty -mx
$ git symbolic-ref refs/heads/test logs/aaa
$ echo $(git rev-parse HEAD) > .git/logs/aaa
$ git rev-parse test
We may need to add some restrictions for the "referent" parameter when
using "git symbolic-ref" to create symrefs because ideally all the
non-pseudo refs should be located under the "refs" directory and we may
tighten this in the future.
In order to tell the user that we may tighten the rules for the above
situation, create a new fsck message "symrefTargetIsNotARef" to notify
the user that this may become an error in the future.
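The condition that triggers the new message can be sketched in isolation
as follows. This is only an illustration under stated assumptions:
"has_prefix" is a local stand-in for Git's "starts_with", the name
"target_is_not_a_ref" is made up here, and "is_root" is assumed to carry
the answer that this patch obtains from "is_root_ref".

```c
#include <assert.h>
#include <string.h>

/* Local stand-in for Git's starts_with() helper. */
static int has_prefix(const char *str, const char *prefix)
{
	return !strncmp(str, prefix, strlen(prefix));
}

/*
 * Returns 1 when the trimmed referent would trigger the new message.
 * "is_root" is assumed to be the caller's answer to "is this a root ref
 * such as HEAD?".
 */
static int target_is_not_a_ref(const char *referent, int is_root)
{
	assert(referent != NULL);

	return !is_root &&
	       !has_prefix(referent, "refs/") &&
	       !has_prefix(referent, "worktrees/");
}
```

With this predicate, a referent such as "logs/aaa" from the example above
is flagged, while "HEAD" and anything under "refs/" or "worktrees/" is
not.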
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 9 +++++++++
fsck.h | 1 +
refs/files-backend.c | 14 ++++++++++++--
t/t0602-reffiles-fsck.sh | 29 +++++++++++++++++++++++++++++
4 files changed, 51 insertions(+), 2 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index dcea05edfc..f82ebc58e8 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -183,6 +183,15 @@
git@vger.kernel.org mailing list if you see this error, as
we need to know what tools created such a file.
+`symrefTargetIsNotARef`::
+ (INFO) The target of a symbolic reference points neither to
+ a root reference nor to a reference starting with "refs/".
+ Although we allow create a symref pointing to the referent which
+ is outside the "ref" by using `git symbolic-ref`, we may tighten
+ the rule in the future. Report to the git@vger.kernel.org
+ mailing list if you see this error, as we need to know what tools
+ created such a file.
+
`trailingRefContent`::
(INFO) A loose ref has trailing content. As valid implementations
of Git never created such a loose ref file, it may become an
diff --git a/fsck.h b/fsck.h
index 5227dfdef2..53a47612e6 100644
--- a/fsck.h
+++ b/fsck.h
@@ -87,6 +87,7 @@ enum fsck_msg_type {
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
FUNC(REF_MISSING_NEWLINE, INFO) \
+ FUNC(SYMREF_TARGET_IS_NOT_A_REF, INFO) \
FUNC(TRAILING_REF_CONTENT, INFO) \
/* ignored (elevated when requested) */ \
FUNC(EXTRA_HEADER_ENTRY, IGNORE)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index c496006db1..edf73d6cce 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3513,6 +3513,7 @@ static int files_fsck_symref_target(struct fsck_options *o,
struct fsck_ref_report *report,
struct strbuf *referent)
{
+ int is_referent_root;
char orig_last_byte;
size_t orig_len;
int ret = 0;
@@ -3521,8 +3522,17 @@ static int files_fsck_symref_target(struct fsck_options *o,
orig_last_byte = referent->buf[orig_len - 1];
strbuf_rtrim(referent);
- if (!is_root_ref(referent->buf) &&
- check_refname_format(referent->buf, 0)) {
+ is_referent_root = is_root_ref(referent->buf);
+ if (!is_referent_root &&
+ !starts_with(referent->buf, "refs/") &&
+ !starts_with(referent->buf, "worktrees/")) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_SYMREF_TARGET_IS_NOT_A_REF,
+ "points to non-ref target '%s'", referent->buf);
+
+ }
+
+ if (!is_referent_root && check_refname_format(referent->buf, 0)) {
ret = fsck_report_ref(o, report,
FSCK_MSG_BAD_REFERENT_NAME,
"points to invalid refname '%s'", referent->buf);
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index ee1e5f2864..692b30727a 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -366,6 +366,35 @@ test_expect_success 'textual symref content should be checked (aggregate)' '
test_cmp expect sorted_err
'
+test_expect_success 'the target of the textual symref should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
+ do
+ printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err || return 1
+ done
+'
+
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
--
2.47.0
* [PATCH v8 9/9] ref: add symlink ref content check for files backend
2024-11-14 16:51 ` [PATCH v8 " shejialuo
` (7 preceding siblings ...)
2024-11-14 16:54 ` [PATCH v8 8/9] ref: check whether the target of the symref is a ref shejialuo
@ 2024-11-14 16:55 ` shejialuo
2024-11-15 11:10 ` [PATCH v8 0/9] add " shejialuo
2024-11-20 11:47 ` [PATCH v9 " shejialuo
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-14 16:55 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
Besides textual symrefs, we also allow symbolic links as symrefs. So, we
should provide the same consistency checks as we have done for textual
symrefs. We are also considering deprecating the writing of symbolic
links. We first need to assess whether symbolic links are still being
used. So, add a new fsck message "symlinkRef(INFO)" to make the user
aware of this.
We have already introduced "files_fsck_symref_target". We should reuse
this function to handle the symrefs which use legacy symbolic links. We
should not check for trailing garbage in symbolic links. Add a new
parameter "symbolic_link" to disable the checks which should only be
executed for textual symrefs.
We also need to generate the "referent" parameter in order to reuse
"files_fsck_symref_target", by the following steps:
1. Use "strbuf_add_real_path" to resolve the symlink and get the
absolute path "ref_content" which the symlink ref points to.
2. Generate the absolute path "abs_gitdir" of "gitdir" and strip it
from the front of "ref_content" to extract the relative path
"relative_referent_path".
3. If "ref_content" is outside of "gitdir", we just set "referent" to
"ref_content". Otherwise, we set "referent" to
"relative_referent_path".
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
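The three steps in the commit message above can be sketched in plain C as
follows. This is only an illustration under stated assumptions: the helper
"referent_from_real_path" and its fixed-size buffers are hypothetical and
are not git's strbuf API, and "real_path" is assumed to be the
already-resolved absolute target of the symlink.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: derive the "referent" for a
 * symlink ref from the resolved absolute path of its target.
 */
static void referent_from_real_path(const char *real_path,
				    const char *abs_gitdir,
				    char *out, size_t outsz)
{
	char prefix[4096];
	size_t len;

	assert(real_path && abs_gitdir && out && outsz);

	/* End the prefix with '/', so "/r/.git" never matches "/r/.github". */
	len = strlen(abs_gitdir);
	if (len && abs_gitdir[len - 1] == '/')
		snprintf(prefix, sizeof(prefix), "%s", abs_gitdir);
	else
		snprintf(prefix, sizeof(prefix), "%s/", abs_gitdir);

	len = strlen(prefix);
	if (!strncmp(real_path, prefix, len))
		/* Inside gitdir: keep only the part relative to it. */
		snprintf(out, outsz, "%s", real_path + len);
	else
		/* Outside gitdir: fall back to the absolute path. */
		snprintf(out, outsz, "%s", real_path);
}
```

With a gitdir of "/r/.git", a target of "/r/.git/logs/branch-escape" would
yield "logs/branch-escape", which is the kind of referent the
"symrefTargetIsNotARef" tests in this patch expect to be reported.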
Documentation/fsck-msgids.txt | 6 ++
fsck.h | 1 +
refs/files-backend.c | 38 ++++++++-
t/t0602-reffiles-fsck.sh | 141 ++++++++++++++++++++++++++++++++++
4 files changed, 182 insertions(+), 4 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index f82ebc58e8..b14bc44ca4 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -183,6 +183,12 @@
git@vger.kernel.org mailing list if you see this error, as
we need to know what tools created such a file.
+`symlinkRef`::
+ (INFO) A symbolic link is used as a symref. Report to the
+ git@vger.kernel.org mailing list if you see this error, as we
+ are assessing the feasibility of dropping the support to drop
+ creating symbolic links as symrefs.
+
`symrefTargetIsNotARef`::
(INFO) The target of a symbolic reference points neither to
a root reference nor to a reference starting with "refs/".
diff --git a/fsck.h b/fsck.h
index 53a47612e6..a44c231a5f 100644
--- a/fsck.h
+++ b/fsck.h
@@ -86,6 +86,7 @@ enum fsck_msg_type {
FUNC(MAILMAP_SYMLINK, INFO) \
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
+ FUNC(SYMLINK_REF, INFO) \
FUNC(REF_MISSING_NEWLINE, INFO) \
FUNC(SYMREF_TARGET_IS_NOT_A_REF, INFO) \
FUNC(TRAILING_REF_CONTENT, INFO) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index edf73d6cce..c715e411f3 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1,6 +1,7 @@
#define USE_THE_REPOSITORY_VARIABLE
#include "../git-compat-util.h"
+#include "../abspath.h"
#include "../config.h"
#include "../copy.h"
#include "../environment.h"
@@ -3511,7 +3512,8 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
static int files_fsck_symref_target(struct fsck_options *o,
struct fsck_ref_report *report,
- struct strbuf *referent)
+ struct strbuf *referent,
+ unsigned int symbolic_link)
{
int is_referent_root;
char orig_last_byte;
@@ -3520,7 +3522,8 @@ static int files_fsck_symref_target(struct fsck_options *o,
orig_len = referent->len;
orig_last_byte = referent->buf[orig_len - 1];
- strbuf_rtrim(referent);
+ if (!symbolic_link)
+ strbuf_rtrim(referent);
is_referent_root = is_root_ref(referent->buf);
if (!is_referent_root &&
@@ -3539,6 +3542,9 @@ static int files_fsck_symref_target(struct fsck_options *o,
goto out;
}
+ if (symbolic_link)
+ goto out;
+
if (referent->len == orig_len ||
(referent->len < orig_len && orig_last_byte != '\n')) {
ret = fsck_report_ref(o, report,
@@ -3562,6 +3568,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
struct dir_iterator *iter)
{
struct strbuf ref_content = STRBUF_INIT;
+ struct strbuf abs_gitdir = STRBUF_INIT;
struct strbuf referent = STRBUF_INIT;
struct fsck_ref_report report = { 0 };
const char *trailing = NULL;
@@ -3572,8 +3579,30 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
report.path = target_name;
- if (S_ISLNK(iter->st.st_mode))
+ if (S_ISLNK(iter->st.st_mode)) {
+ const char *relative_referent_path = NULL;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_SYMLINK_REF,
+ "use deprecated symbolic link for symref");
+
+ strbuf_add_absolute_path(&abs_gitdir, ref_store->repo->gitdir);
+ strbuf_normalize_path(&abs_gitdir);
+ if (!is_dir_sep(abs_gitdir.buf[abs_gitdir.len - 1]))
+ strbuf_addch(&abs_gitdir, '/');
+
+ strbuf_add_real_path(&ref_content, iter->path.buf);
+ skip_prefix(ref_content.buf, abs_gitdir.buf,
+ &relative_referent_path);
+
+ if (relative_referent_path)
+ strbuf_addstr(&referent, relative_referent_path);
+ else
+ strbuf_addbuf(&referent, &ref_content);
+
+ ret |= files_fsck_symref_target(o, &report, &referent, 1);
goto cleanup;
+ }
if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0 ) {
/*
@@ -3612,13 +3641,14 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
goto cleanup;
}
} else {
- ret = files_fsck_symref_target(o, &report, &referent);
+ ret = files_fsck_symref_target(o, &report, &referent, 0);
goto cleanup;
}
cleanup:
strbuf_release(&ref_content);
strbuf_release(&referent);
+ strbuf_release(&abs_gitdir);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 692b30727a..f8f27cfc6c 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -395,6 +395,147 @@ test_expect_success 'the target of the textual symref should be checked' '
done
'
+test_expect_success SYMLINKS 'symlink symref content should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
+ error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-bad &&
+ test_cmp expect err &&
+
+ ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
+ error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
+ EOF
+ rm $tag_dir_prefix/tag-symbolic-1 &&
+ test_cmp expect err
+'
+
+test_expect_success SYMLINKS 'symlink symref content should be checked (worktree)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ main_worktree_refdir_prefix=.git/refs/heads &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $worktree1_refdir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $worktree2_refdir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $main_worktree_refdir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
+ EOF
+ rm $worktree1_refdir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ for bad_referent_name in ".tag" "branch "
+ do
+ ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err || return 1
+ done
+'
+
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
--
2.47.0
* Re: [PATCH v8 0/9] add ref content check for files backend
2024-11-14 16:51 ` [PATCH v8 " shejialuo
` (8 preceding siblings ...)
2024-11-14 16:55 ` [PATCH v8 9/9] ref: add symlink ref content check for files backend shejialuo
@ 2024-11-15 11:10 ` shejialuo
2024-11-20 11:47 ` [PATCH v9 " shejialuo
10 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-15 11:10 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
On Fri, Nov 15, 2024 at 12:51:49AM +0800, shejialuo wrote:
> Hi all:
>
> This new version solves the following problems:
>
> 1. When reading the content of the ref file, we do not use the
> "fsck_report_ref" function. It's not suitable.
> 2. Add a new symlink worktree test in the last patch. Writing the
> test revealed a bug, which is fixed as described below.
>
> Because we have introduced the check for worktrees, we should not use
> "ref_store->gitdir", instead we need to use "ref_store->repo->gitdir" to
> get the main worktree "gitdir". After fixing this, the test passes.
>
> Thanks to Patrick for reminding me about this. I forgot to add the test
> and thus made mistakes.
>
> Thanks,
> Jialuo
I'd like to wait for a couple of days for more reviews and comments from
Junio and Karthik.
* [PATCH v9 0/9] add ref content check for files backend
2024-11-14 16:51 ` [PATCH v8 " shejialuo
` (9 preceding siblings ...)
2024-11-15 11:10 ` [PATCH v8 0/9] add " shejialuo
@ 2024-11-20 11:47 ` shejialuo
2024-11-20 11:51 ` [PATCH v9 1/9] ref: initialize "fsck_ref_report" with zero shejialuo
` (9 more replies)
10 siblings, 10 replies; 209+ messages in thread
From: shejialuo @ 2024-11-20 11:47 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
Hi All:
This version fixes two problems:
1. Remove unnecessary space.
2. Drop extra "strerror(errno)".
Thanks,
Jialuo
shejialuo (9):
ref: initialize "fsck_ref_report" with zero
ref: check the full refname instead of basename
ref: initialize ref name outside of check functions
ref: support multiple worktrees check for refs
ref: port git-fsck(1) regular refs check for files backend
ref: add more strict checks for regular refs
ref: add basic symref content check for files backend
ref: check whether the target of the symref is a ref
ref: add symlink ref content check for files backend
Documentation/fsck-msgids.txt | 35 +++
builtin/refs.c | 10 +-
fsck.h | 6 +
refs.c | 7 +-
refs.h | 3 +-
refs/debug.c | 5 +-
refs/files-backend.c | 194 +++++++++++-
refs/packed-backend.c | 8 +-
refs/refs-internal.h | 5 +-
refs/reftable-backend.c | 3 +-
t/t0602-reffiles-fsck.sh | 576 ++++++++++++++++++++++++++++++++--
11 files changed, 790 insertions(+), 62 deletions(-)
Range-diff against v8:
1: bfb2a21af4 = 1: bfb2a21af4 ref: initialize "fsck_ref_report" with zero
2: 9efc83f7ea = 2: 9efc83f7ea ref: check the full refname instead of basename
3: 5ea7d18203 = 3: 5ea7d18203 ref: initialize ref name outside of check functions
4: cb4669b64d = 4: cb4669b64d ref: support multiple worktrees check for refs
5: c6c128c922 ! 5: d6188063d9 ref: port git-fsck(1) regular refs check for files backend
@@ refs/files-backend.c: typedef int (*files_fsck_refs_fn)(struct ref_store *ref_st
+ if (S_ISLNK(iter->st.st_mode))
+ goto cleanup;
+
-+ if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0 ) {
++ if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
+ /*
+ * Ref file could be removed by another concurrent process. We should
+ * ignore this error and continue to the next ref.
@@ refs/files-backend.c: typedef int (*files_fsck_refs_fn)(struct ref_store *ref_st
+ if (errno == ENOENT)
+ goto cleanup;
+
-+ ret = error_errno(_("cannot read ref file '%s': %s"),
-+ iter->path.buf, strerror(errno));
++ ret = error_errno(_("cannot read ref file '%s'"), iter->path.buf);
+ goto cleanup;
+ }
+
6: 911fa42717 = 6: e5e97ba3ad ref: add more strict checks for regular refs
7: 7aa6a99206 = 7: 1dec0a56d2 ref: add basic symref content check for files backend
8: dbb0787ad1 = 8: dcc4a02102 ref: check whether the target of the symref is a ref
9: a6d85b4864 ! 9: fc10862f6f ref: add symlink ref content check for files backend
@@ refs/files-backend.c: static int files_fsck_refs_content(struct ref_store *ref_s
goto cleanup;
+ }
- if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0 ) {
+ if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
/*
@@ refs/files-backend.c: static int files_fsck_refs_content(struct ref_store *ref_store,
goto cleanup;
--
2.47.0
* [PATCH v9 1/9] ref: initialize "fsck_ref_report" with zero
2024-11-20 11:47 ` [PATCH v9 " shejialuo
@ 2024-11-20 11:51 ` shejialuo
2024-11-20 11:51 ` [PATCH v9 2/9] ref: check the full refname instead of basename shejialuo
` (8 subsequent siblings)
9 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-20 11:51 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
In "fsck.c::fsck_refs_error_function", we need to tell whether "oid" and
"referent" are NULL. So, we need to always initialize these members to
NULL instead of letting them point anywhere when creating a new
"fsck_ref_report" structure.
The original code explicitly initializes the "path" member in the
"struct fsck_ref_report" to NULL (which implicitly 0-initializes other
members in the struct). It is more customary to use "{ 0 }" to express
that we are 0-initializing everything. In order to align with the
codebase, initialize "fsck_ref_report" with zero.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
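As a small illustration of the point made in the commit message above,
both initializer spellings leave every member zeroed. The struct below is a
hypothetical stand-in: its name and field types are assumptions and do not
reproduce git's "struct fsck_ref_report".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for "struct fsck_ref_report"; field types are assumed. */
struct report_sketch {
	const char *path;
	const void *oid;
	const char *referent;
};

/* Returns 1 when every member is NULL. */
static int report_is_zeroed(const struct report_sketch *r)
{
	assert(r);
	return !r->path && !r->oid && !r->referent;
}

/* Naming one member still zero-initializes the members not named. */
static int designated_init_is_zeroed(void)
{
	struct report_sketch r = { .path = NULL };
	return report_is_zeroed(&r);
}

/* The customary "{ 0 }" spelling states the same intent directly. */
static int zero_init_is_zeroed(void)
{
	struct report_sketch r = { 0 };
	return report_is_zeroed(&r);
}
```

Both functions return 1, so the patch is about following the codebase's
convention rather than changing behavior.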
refs/files-backend.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0824c0b8a9..03d2503276 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3520,7 +3520,7 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
goto cleanup;
if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
- struct fsck_ref_report report = { .path = NULL };
+ struct fsck_ref_report report = { 0 };
strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
report.path = sb.buf;
--
2.47.0
* [PATCH v9 2/9] ref: check the full refname instead of basename
2024-11-20 11:47 ` [PATCH v9 " shejialuo
2024-11-20 11:51 ` [PATCH v9 1/9] ref: initialize "fsck_ref_report" with zero shejialuo
@ 2024-11-20 11:51 ` shejialuo
2024-11-20 11:51 ` [PATCH v9 3/9] ref: initialize ref name outside of check functions shejialuo
` (7 subsequent siblings)
9 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-20 11:51 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
In "files-backend.c::files_fsck_refs_name", we validate the refname
format by using "check_refname_format" to check the basename of the
iterator with the "REFNAME_ALLOW_ONELEVEL" flag.
However, this is a bad implementation. Although we do not allow a
single "@" in the ".git" directory, we do allow "refs/heads/@". So, by
checking the one-level refname "@", we wrongly report an error when
there is a "refs/heads/@" ref.
Because we only check the one-level refname, we also cannot check the
other parts of the full refname. So we will miss errors such as the
following:
"refs/heads/ new-feature/test"
"refs/heads/~new-feature/test"
In order to fix the above problems, enhance "files_fsck_refs_name" to use
the full name for "check_refname_format". Then, replace the tests which
are related to "@" and add tests to exercise the above situations, using
a for loop to avoid repetition.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
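To illustrate why checking only the basename gives both false positives and
false negatives, here is a minimal sketch. "toy_check_refname" is a
deliberately simplified, hypothetical validator and is NOT git's
"check_refname_format"; it only models the few rules mentioned in the
commit message above.

```c
#include <assert.h>
#include <string.h>

/*
 * Toy validator (an assumption for illustration): rejects a refname that
 * is exactly "@", and any component that is empty, starts with '.', or
 * contains ' ' or '~'. Returns 0 when valid, -1 otherwise.
 */
static int toy_check_refname(const char *refname)
{
	const char *p = refname;

	assert(refname);
	if (!strcmp(refname, "@"))
		return -1;

	while (1) {
		const char *end = strchr(p, '/');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		if (!len || p[0] == '.')
			return -1;
		for (size_t i = 0; i < len; i++)
			if (p[i] == ' ' || p[i] == '~')
				return -1;
		if (!end)
			return 0;
		p = end + 1;
	}
}

/* Return the last path component, similar to the iterator's basename. */
static const char *toy_basename(const char *refname)
{
	const char *slash = strrchr(refname, '/');
	return slash ? slash + 1 : refname;
}
```

Under this toy model, checking only toy_basename("refs/heads/@") rejects a
ref that is valid as a full name, while checking only the basename of
"refs/heads/~new-feature/test" accepts a ref whose middle component is
invalid. Passing the full name avoids both problems.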
refs/files-backend.c | 7 ++-
t/t0602-reffiles-fsck.sh | 92 ++++++++++++++++++++++++----------------
2 files changed, 60 insertions(+), 39 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 03d2503276..b055edc061 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3519,10 +3519,13 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
if (iter->basename[0] != '.' && ends_with(iter->basename, ".lock"))
goto cleanup;
- if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
+ /*
+ * This works right now because we never check the root refs.
+ */
+ strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
+ if (check_refname_format(sb.buf, 0)) {
struct fsck_ref_report report = { 0 };
- strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
report.path = sb.buf;
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_NAME,
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 71a4d1a5ae..2a172c913d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -18,63 +18,81 @@ test_expect_success 'ref name should be checked' '
cd repo &&
git commit --allow-empty -m initial &&
- git checkout -b branch-1 &&
- git tag tag-1 &&
- git commit --allow-empty -m second &&
- git checkout -b branch-2 &&
- git tag tag-2 &&
- git tag multi_hierarchy/tag-2 &&
+ git checkout -b default-branch &&
+ git tag default-tag &&
+ git tag multi_hierarchy/default-tag &&
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/.branch-1: badRefName: invalid refname format
- EOF
- rm $branch_dir_prefix/.branch-1 &&
- test_cmp expect err &&
-
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/heads/@: badRefName: invalid refname format
- EOF
+ cp $branch_dir_prefix/default-branch $branch_dir_prefix/@ &&
+ git refs verify 2>err &&
+ test_must_be_empty err &&
rm $branch_dir_prefix/@ &&
- test_cmp expect err &&
- cp $tag_dir_prefix/multi_hierarchy/tag-2 $tag_dir_prefix/multi_hierarchy/@ &&
- test_must_fail git refs verify 2>err &&
- cat >expect <<-EOF &&
- error: refs/tags/multi_hierarchy/@: badRefName: invalid refname format
- EOF
- rm $tag_dir_prefix/multi_hierarchy/@ &&
- test_cmp expect err &&
-
- cp $tag_dir_prefix/tag-1 $tag_dir_prefix/tag-1.lock &&
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/tag-1.lock &&
git refs verify 2>err &&
rm $tag_dir_prefix/tag-1.lock &&
test_must_be_empty err &&
- cp $tag_dir_prefix/tag-1 $tag_dir_prefix/.lock &&
+ cp $tag_dir_prefix/default-tag $tag_dir_prefix/.lock &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
error: refs/tags/.lock: badRefName: invalid refname format
EOF
rm $tag_dir_prefix/.lock &&
- test_cmp expect err
+ test_cmp expect err &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname: badRefName: invalid refname format
+ EOF
+ rm "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/default-tag "$tag_dir_prefix/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ cp $tag_dir_prefix/multi_hierarchy/default-tag "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/tags/multi_hierarchy/$refname: badRefName: invalid refname format
+ EOF
+ rm "$tag_dir_prefix/multi_hierarchy/$refname" &&
+ test_cmp expect err || return 1
+ done &&
+
+ for refname in ".refname-starts-with-dot" "~refname-has-stride"
+ do
+ mkdir "$branch_dir_prefix/$refname" &&
+ cp $branch_dir_prefix/default-branch "$branch_dir_prefix/$refname/default-branch" &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/$refname/default-branch: badRefName: invalid refname format
+ EOF
+ rm -r "$branch_dir_prefix/$refname" &&
+ test_cmp expect err || return 1
+ done
'
test_expect_success 'ref name check should be adapted into fsck messages' '
test_when_finished "rm -rf repo" &&
git init repo &&
branch_dir_prefix=.git/refs/heads &&
- tag_dir_prefix=.git/refs/tags &&
cd repo &&
git commit --allow-empty -m initial &&
git checkout -b branch-1 &&
- git tag tag-1 &&
- git commit --allow-empty -m second &&
- git checkout -b branch-2 &&
- git tag tag-2 &&
cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
git -c fsck.badRefName=warn refs verify 2>err &&
@@ -84,7 +102,7 @@ test_expect_success 'ref name check should be adapted into fsck messages' '
rm $branch_dir_prefix/.branch-1 &&
test_cmp expect err &&
- cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ &&
+ cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 &&
git -c fsck.badRefName=ignore refs verify 2>err &&
test_must_be_empty err
'
--
2.47.0
* [PATCH v9 3/9] ref: initialize ref name outside of check functions
2024-11-20 11:47 ` [PATCH v9 " shejialuo
2024-11-20 11:51 ` [PATCH v9 1/9] ref: initialize "fsck_ref_report" with zero shejialuo
2024-11-20 11:51 ` [PATCH v9 2/9] ref: check the full refname instead of basename shejialuo
@ 2024-11-20 11:51 ` shejialuo
2024-11-20 11:51 ` [PATCH v9 4/9] ref: support multiple worktrees check for refs shejialuo
` (6 subsequent siblings)
9 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-20 11:51 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We pass "refs_check_dir" to the "files_fsck_refs_name" function, which
allows it to create the checked ref name later. However, when we
introduce a new check function, we would have to allocate redundant
memory and re-calculate the ref name. It's bad for us to allocate
redundant memory and duplicate logic. Instead, we should allocate and
calculate it only once and pass the ref name to the check functions.
To avoid the repeated calculation, rename the "refs_check_dir" parameter
of the check functions to "refname". And in "files_fsck_refs_dir", create
a new strbuf "refname", so that whenever we handle a new ref, we
calculate the name once and call the check functions one by one.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
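The "compute once, pass to every check" pattern from the commit message
above can be sketched as below. The names "check_fn", "check_not_hidden",
"check_no_space" and "run_checks" are hypothetical and only stand in for
the real check functions; the fixed-size buffer replaces git's strbuf for
brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical check signature: each check receives the precomputed refname. */
typedef int (*check_fn)(const char *refname);

/* Example check: the last component must not start with '.'. */
static int check_not_hidden(const char *refname)
{
	const char *base = strrchr(refname, '/');

	base = base ? base + 1 : refname;
	return base[0] == '.' ? -1 : 0;
}

/* Example check: the refname must not contain a space. */
static int check_no_space(const char *refname)
{
	return strchr(refname, ' ') ? -1 : 0;
}

/*
 * Build "<dir>/<relative_path>" once, then hand the same string to every
 * check in the NULL-terminated "checks" array. Returns -1 if any check
 * fails, 0 otherwise.
 */
static int run_checks(const char *dir, const char *relative_path,
		      const check_fn *checks)
{
	char refname[1024];
	int ret = 0;

	assert(dir && relative_path && checks);
	snprintf(refname, sizeof(refname), "%s/%s", dir, relative_path);

	for (size_t i = 0; checks[i]; i++)
		if (checks[i](refname))
			ret = -1;
	return ret;
}
```

Adding a further check then only means appending it to the array; no check
needs to rebuild the name on its own.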
refs/files-backend.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b055edc061..8edb700568 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3501,12 +3501,12 @@ static int files_ref_store_remove_on_disk(struct ref_store *ref_store,
*/
typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
struct fsck_options *o,
- const char *refs_check_dir,
+ const char *refname,
struct dir_iterator *iter);
static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
struct fsck_options *o,
- const char *refs_check_dir,
+ const char *refname,
struct dir_iterator *iter)
{
struct strbuf sb = STRBUF_INIT;
@@ -3522,11 +3522,10 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
/*
* This works right now because we never check the root refs.
*/
- strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
- if (check_refname_format(sb.buf, 0)) {
+ if (check_refname_format(refname, 0)) {
struct fsck_ref_report report = { 0 };
- report.path = sb.buf;
+ report.path = refname;
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_NAME,
"invalid refname format");
@@ -3542,6 +3541,7 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
const char *refs_check_dir,
files_fsck_refs_fn *fsck_refs_fn)
{
+ struct strbuf refname = STRBUF_INIT;
struct strbuf sb = STRBUF_INIT;
struct dir_iterator *iter;
int iter_status;
@@ -3560,11 +3560,15 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
continue;
} else if (S_ISREG(iter->st.st_mode) ||
S_ISLNK(iter->st.st_mode)) {
+ strbuf_reset(&refname);
+ strbuf_addf(&refname, "%s/%s", refs_check_dir,
+ iter->relative_path);
+
if (o->verbose)
- fprintf_ln(stderr, "Checking %s/%s",
- refs_check_dir, iter->relative_path);
+ fprintf_ln(stderr, "Checking %s", refname.buf);
+
for (size_t i = 0; fsck_refs_fn[i]; i++) {
- if (fsck_refs_fn[i](ref_store, o, refs_check_dir, iter))
+ if (fsck_refs_fn[i](ref_store, o, refname.buf, iter))
ret = -1;
}
} else {
@@ -3581,6 +3585,7 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
out:
strbuf_release(&sb);
+ strbuf_release(&refname);
return ret;
}
--
2.47.0
* [PATCH v9 4/9] ref: support multiple worktrees check for refs
2024-11-20 11:47 ` [PATCH v9 " shejialuo
` (2 preceding siblings ...)
2024-11-20 11:51 ` [PATCH v9 3/9] ref: initialize ref name outside of check functions shejialuo
@ 2024-11-20 11:51 ` shejialuo
2024-11-20 11:51 ` [PATCH v9 5/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
` (5 subsequent siblings)
9 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-20 11:51 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We have already set up the infrastructure to check the consistency of
refs, but we do not support multiple worktrees. However, "git-fsck(1)"
will check the refs of worktrees. As we have decided to reach feature
parity with "git-fsck(1)", we need to set up support for multiple
worktrees.
Because each worktree has its own specific refs, instead of just showing
the users "refs/worktree/foo", we need to display the full name such as
"worktrees/<id>/refs/worktree/foo". So we should know the id of the
worktree to get the full name. Add a new parameter "struct worktree *"
for "refs-internal.h::fsck_fn". Then change the related functions to
follow this new interface.
The "packed-refs" only exists in the main worktree, so we should only
check "packed-refs" in the main worktree. Use "is_main_worktree" method
to skip checking "packed-refs" in "packed_fsck" function.
Then, enhance the "files-backend.c::files_fsck_refs_dir" function to add
the "worktrees/<id>/" prefix when we are not in the main worktree.
Last, add a new test to check the refname when there are multiple
worktrees to exercise the code.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
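The naming rule described above can be sketched outside of Git as follows.
This is only an illustration, not code from the patch: the helper name
"format_display_refname" is hypothetical, and a NULL "worktree_id" is an
assumption standing in for "is_main_worktree(wt)" being true.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: format the name shown to
 * the user for a ref found at "<refs_check_dir>/<relative_path>".
 * A NULL worktree_id stands in for the main worktree, which gets no
 * prefix; any other worktree gets "worktrees/<id>/" prepended.
 * Returns whatever snprintf() returns.
 */
static int format_display_refname(char *out, size_t outlen,
				  const char *worktree_id,
				  const char *refs_check_dir,
				  const char *relative_path)
{
	if (!worktree_id)
		return snprintf(out, outlen, "%s/%s",
				refs_check_dir, relative_path);
	return snprintf(out, outlen, "worktrees/%s/%s/%s",
			worktree_id, refs_check_dir, relative_path);
}
```

The patch itself appends to a "struct strbuf" instead of a fixed buffer, so
it does not have to care about truncation the way this sketch would.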
builtin/refs.c | 10 ++++++--
refs.c | 5 ++--
refs.h | 3 ++-
refs/debug.c | 5 ++--
refs/files-backend.c | 17 ++++++++++----
refs/packed-backend.c | 8 ++++++-
refs/refs-internal.h | 3 ++-
refs/reftable-backend.c | 3 ++-
t/t0602-reffiles-fsck.sh | 51 ++++++++++++++++++++++++++++++++++++++++
9 files changed, 90 insertions(+), 15 deletions(-)
diff --git a/builtin/refs.c b/builtin/refs.c
index 24978a7b7b..394b4101c6 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -5,6 +5,7 @@
#include "parse-options.h"
#include "refs.h"
#include "strbuf.h"
+#include "worktree.h"
#define REFS_MIGRATE_USAGE \
N_("git refs migrate --ref-format=<format> [--dry-run]")
@@ -66,6 +67,7 @@ static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
{
struct fsck_options fsck_refs_options = FSCK_REFS_OPTIONS_DEFAULT;
+ struct worktree **worktrees;
const char * const verify_usage[] = {
REFS_VERIFY_USAGE,
NULL,
@@ -75,7 +77,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
OPT_BOOL(0, "strict", &fsck_refs_options.strict, N_("enable strict checking")),
OPT_END(),
};
- int ret;
+ int ret = 0;
argc = parse_options(argc, argv, prefix, options, verify_usage, 0);
if (argc)
@@ -84,9 +86,13 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
git_config(git_fsck_config, &fsck_refs_options);
prepare_repo_settings(the_repository);
- ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options);
+ worktrees = get_worktrees();
+ for (size_t i = 0; worktrees[i]; i++)
+ ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
+ &fsck_refs_options, worktrees[i]);
fsck_options_clear(&fsck_refs_options);
+ free_worktrees(worktrees);
return ret;
}
diff --git a/refs.c b/refs.c
index 5f729ed412..395a17273c 100644
--- a/refs.c
+++ b/refs.c
@@ -318,9 +318,10 @@ int check_refname_format(const char *refname, int flags)
return check_or_sanitize_refname(refname, flags, NULL);
}
-int refs_fsck(struct ref_store *refs, struct fsck_options *o)
+int refs_fsck(struct ref_store *refs, struct fsck_options *o,
+ struct worktree *wt)
{
- return refs->be->fsck(refs, o);
+ return refs->be->fsck(refs, o, wt);
}
void sanitize_refname_component(const char *refname, struct strbuf *out)
diff --git a/refs.h b/refs.h
index 108dfc93b3..341d43239c 100644
--- a/refs.h
+++ b/refs.h
@@ -549,7 +549,8 @@ int check_refname_format(const char *refname, int flags);
* reflogs are consistent, and non-zero otherwise. The errors will be
* written to stderr.
*/
-int refs_fsck(struct ref_store *refs, struct fsck_options *o);
+int refs_fsck(struct ref_store *refs, struct fsck_options *o,
+ struct worktree *wt);
/*
* Apply the rules from check_refname_format, but mutate the result until it
diff --git a/refs/debug.c b/refs/debug.c
index 45e2e784a0..72e80ddd6d 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -420,10 +420,11 @@ static int debug_reflog_expire(struct ref_store *ref_store, const char *refname,
}
static int debug_fsck(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
struct debug_ref_store *drefs = (struct debug_ref_store *)ref_store;
- int res = drefs->refs->be->fsck(drefs->refs, o);
+ int res = drefs->refs->be->fsck(drefs->refs, o, wt);
trace_printf_key(&trace_refs, "fsck: %d\n", res);
return res;
}
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 8edb700568..8bfdce64bc 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -23,6 +23,7 @@
#include "../dir.h"
#include "../chdir-notify.h"
#include "../setup.h"
+#include "../worktree.h"
#include "../wrapper.h"
#include "../write-or-die.h"
#include "../revision.h"
@@ -3539,6 +3540,7 @@ static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
static int files_fsck_refs_dir(struct ref_store *ref_store,
struct fsck_options *o,
const char *refs_check_dir,
+ struct worktree *wt,
files_fsck_refs_fn *fsck_refs_fn)
{
struct strbuf refname = STRBUF_INIT;
@@ -3561,6 +3563,9 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
} else if (S_ISREG(iter->st.st_mode) ||
S_ISLNK(iter->st.st_mode)) {
strbuf_reset(&refname);
+
+ if (!is_main_worktree(wt))
+ strbuf_addf(&refname, "worktrees/%s/", wt->id);
strbuf_addf(&refname, "%s/%s", refs_check_dir,
iter->relative_path);
@@ -3590,7 +3595,8 @@ static int files_fsck_refs_dir(struct ref_store *ref_store,
}
static int files_fsck_refs(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
files_fsck_refs_fn fsck_refs_fn[]= {
files_fsck_refs_name,
@@ -3599,17 +3605,18 @@ static int files_fsck_refs(struct ref_store *ref_store,
if (o->verbose)
fprintf_ln(stderr, _("Checking references consistency"));
- return files_fsck_refs_dir(ref_store, o, "refs", fsck_refs_fn);
+ return files_fsck_refs_dir(ref_store, o, "refs", wt, fsck_refs_fn);
}
static int files_fsck(struct ref_store *ref_store,
- struct fsck_options *o)
+ struct fsck_options *o,
+ struct worktree *wt)
{
struct files_ref_store *refs =
files_downcast(ref_store, REF_STORE_READ, "fsck");
- return files_fsck_refs(ref_store, o) |
- refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
+ return files_fsck_refs(ref_store, o, wt) |
+ refs->packed_ref_store->be->fsck(refs->packed_ref_store, o, wt);
}
struct ref_storage_be refs_be_files = {
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 07c57fd541..46dcaec654 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -13,6 +13,7 @@
#include "../lockfile.h"
#include "../chdir-notify.h"
#include "../statinfo.h"
+#include "../worktree.h"
#include "../wrapper.h"
#include "../write-or-die.h"
#include "../trace2.h"
@@ -1754,8 +1755,13 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
}
static int packed_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED)
+ struct fsck_options *o UNUSED,
+ struct worktree *wt)
{
+
+ if (!is_main_worktree(wt))
+ return 0;
+
return 0;
}
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 2313c830d8..037d7991cd 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -653,7 +653,8 @@ typedef int read_symbolic_ref_fn(struct ref_store *ref_store, const char *refnam
struct strbuf *referent);
typedef int fsck_fn(struct ref_store *ref_store,
- struct fsck_options *o);
+ struct fsck_options *o,
+ struct worktree *wt);
struct ref_storage_be {
const char *name;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index f5f957e6de..b6a63c1015 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -2443,7 +2443,8 @@ static int reftable_be_reflog_expire(struct ref_store *ref_store,
}
static int reftable_be_fsck(struct ref_store *ref_store UNUSED,
- struct fsck_options *o UNUSED)
+ struct fsck_options *o UNUSED,
+ struct worktree *wt UNUSED)
{
return 0;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 2a172c913d..1e17393a3d 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -107,4 +107,55 @@ test_expect_success 'ref name check should be adapted into fsck messages' '
test_must_be_empty err
'
+test_expect_success 'ref name check should work for multiple worktrees' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+
+ cd repo &&
+ test_commit initial &&
+ git checkout -b branch-1 &&
+ test_commit second &&
+ git checkout -b branch-2 &&
+ test_commit third &&
+ git checkout -b branch-3 &&
+ git worktree add ./worktree-1 branch-1 &&
+ git worktree add ./worktree-2 branch-2 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-3
+ ) &&
+
+ cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/'\'' branch-5'\'' &&
+ cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/'\''~branch-6'\'' &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err &&
+
+ for worktree in "worktree-1" "worktree-2"
+ do
+ (
+ cd $worktree &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/ branch-5: badRefName: invalid refname format
+ error: worktrees/worktree-2/refs/worktree/~branch-6: badRefName: invalid refname format
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err || return 1
+ )
+ done
+'
+
test_done
--
2.47.0
* [PATCH v9 5/9] ref: port git-fsck(1) regular refs check for files backend
2024-11-20 11:47 ` [PATCH v9 " shejialuo
` (3 preceding siblings ...)
2024-11-20 11:51 ` [PATCH v9 4/9] ref: support multiple worktrees check for refs shejialuo
@ 2024-11-20 11:51 ` shejialuo
2024-11-20 11:51 ` [PATCH v9 6/9] ref: add more strict checks for regular refs shejialuo
` (4 subsequent siblings)
9 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-20 11:51 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
"git-fsck(1)" implicitly checks the ref content by passing the
callback "fsck_handle_ref" to the "refs.c::refs_for_each_rawref".
Then, it will check whether the ref content (eventually "oid")
is valid. If not, it will report the following error to the user.
error: refs/heads/main: invalid sha1 pointer 0000...
It will also wrongly report the above error when there are dangling
symrefs in the repository. This does not align with the behavior of
the "git symbolic-ref" command, which allows users to create dangling
symrefs.
As we have already introduced the "git refs verify" command, we had
better check the ref content explicitly there, so that later we can
remove these checks from "git-fsck(1)" and have it launch a subprocess
that calls "git refs verify", which makes "git-fsck(1)" cleaner.
Following what "git-fsck(1)" does, add a similar check to "git refs
verify". Then add a new fsck error message "badRefContent(ERROR)" to
represent that a ref has invalid content.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
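For readers who want to see which contents end up as "badRefContent", here
is a rough standalone sketch of the regular-ref check. It is a simplified
stand-in and not the real "parse_loose_ref_contents": it assumes SHA-1
(40 hex digits), ignores the "ref:" symref case, and the names
"SKETCH_HEXSZ" and "sketch_check_regular_ref" are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define SKETCH_HEXSZ 40 /* SHA-1 only; the real code asks the hash algorithm */

/*
 * Hypothetical, simplified stand-in for the regular-ref half of the
 * parser: return 0 if "buf" starts with a full hex object ID that is
 * followed by end-of-string or whitespace, and -1 otherwise.
 */
static int sketch_check_regular_ref(const char *buf)
{
	size_t i;

	/* A NUL byte is not a hex digit, so short input stops here. */
	for (i = 0; i < SKETCH_HEXSZ; i++)
		if (!isxdigit((unsigned char)buf[i]))
			return -1;
	if (buf[i] != '\0' && !isspace((unsigned char)buf[i]))
		return -1;
	return 0;
}
```

Under these assumptions, "<oid>x" and "xfsazqfxcadas" are rejected, which
matches the bad contents used in the new tests, while an object ID with no
trailing newline is still accepted at this stage.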
Documentation/fsck-msgids.txt | 3 +
fsck.h | 1 +
refs/files-backend.c | 47 +++++++++++++++
t/t0602-reffiles-fsck.sh | 105 ++++++++++++++++++++++++++++++++++
4 files changed, 156 insertions(+)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 68a2801f15..22c385ea22 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -19,6 +19,9 @@
`badParentSha1`::
(ERROR) A commit object has a bad parent sha1.
+`badRefContent`::
+ (ERROR) A ref has bad content.
+
`badRefFiletype`::
(ERROR) A ref has a bad file type.
diff --git a/fsck.h b/fsck.h
index 500b4c04d2..0d99a87911 100644
--- a/fsck.h
+++ b/fsck.h
@@ -31,6 +31,7 @@ enum fsck_msg_type {
FUNC(BAD_NAME, ERROR) \
FUNC(BAD_OBJECT_SHA1, ERROR) \
FUNC(BAD_PARENT_SHA1, ERROR) \
+ FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
FUNC(BAD_REF_NAME, ERROR) \
FUNC(BAD_TIMEZONE, ERROR) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 8bfdce64bc..9f300a7d3c 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3505,6 +3505,52 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
const char *refname,
struct dir_iterator *iter);
+static int files_fsck_refs_content(struct ref_store *ref_store,
+ struct fsck_options *o,
+ const char *target_name,
+ struct dir_iterator *iter)
+{
+ struct strbuf ref_content = STRBUF_INIT;
+ struct strbuf referent = STRBUF_INIT;
+ struct fsck_ref_report report = { 0 };
+ unsigned int type = 0;
+ int failure_errno = 0;
+ struct object_id oid;
+ int ret = 0;
+
+ report.path = target_name;
+
+ if (S_ISLNK(iter->st.st_mode))
+ goto cleanup;
+
+ if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
+ /*
+ * Ref file could be removed by another concurrent process. We should
+ * ignore this error and continue to the next ref.
+ */
+ if (errno == ENOENT)
+ goto cleanup;
+
+ ret = error_errno(_("cannot read ref file '%s'"), iter->path.buf);
+ goto cleanup;
+ }
+
+ if (parse_loose_ref_contents(ref_store->repo->hash_algo,
+ ref_content.buf, &oid, &referent,
+ &type, &failure_errno)) {
+ strbuf_rtrim(&ref_content);
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_BAD_REF_CONTENT,
+ "%s", ref_content.buf);
+ goto cleanup;
+ }
+
+cleanup:
+ strbuf_release(&ref_content);
+ strbuf_release(&referent);
+ return ret;
+}
+
static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
struct fsck_options *o,
const char *refname,
@@ -3600,6 +3646,7 @@ static int files_fsck_refs(struct ref_store *ref_store,
{
files_fsck_refs_fn fsck_refs_fn[]= {
files_fsck_refs_name,
+ files_fsck_refs_content,
NULL,
};
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 1e17393a3d..162370077b 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -158,4 +158,109 @@ test_expect_success 'ref name check should work for multiple worktrees' '
done
'
+test_expect_success 'regular ref content should be checked (individual)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ git refs verify 2>err &&
+ test_must_be_empty err &&
+
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
+
+ for bad_content in "$(git rev-parse main)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$branch_dir_prefix/a/b/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content
+ EOF
+ rm $branch_dir_prefix/a/b/branch-bad &&
+ test_cmp expect err || return 1
+ done
+'
+
+test_expect_success 'regular ref content should be checked (aggregate)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ bad_content_1=$(git rev-parse main)x &&
+ bad_content_2=xfsazqfxcadas &&
+ bad_content_3=Xfsazqfxcadas &&
+ printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
+ printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
+ printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
+ error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
+ error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+'
+
+test_expect_success 'ref content checks should work with worktrees' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree1_refdir_prefix/bad-branch-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-1/refs/worktree/bad-branch-1: badRefContent: $bad_content
+ EOF
+ rm $worktree1_refdir_prefix/bad-branch-1 &&
+ test_cmp expect err || return 1
+ done &&
+
+ for bad_content in "$(git rev-parse HEAD)x" "xfsazqfxcadas" "Xfsazqfxcadas"
+ do
+ printf "%s" $bad_content >$worktree2_refdir_prefix/bad-branch-2 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: worktrees/worktree-2/refs/worktree/bad-branch-2: badRefContent: $bad_content
+ EOF
+ rm $worktree2_refdir_prefix/bad-branch-2 &&
+ test_cmp expect err || return 1
+ done
+'
+
test_done
--
2.47.0
* [PATCH v9 6/9] ref: add more strict checks for regular refs
2024-11-20 11:47 ` [PATCH v9 " shejialuo
` (4 preceding siblings ...)
2024-11-20 11:51 ` [PATCH v9 5/9] ref: port git-fsck(1) regular refs check for files backend shejialuo
@ 2024-11-20 11:51 ` shejialuo
2024-11-20 11:52 ` [PATCH v9 7/9] ref: add basic symref content check for files backend shejialuo
` (3 subsequent siblings)
9 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-20 11:51 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We already use the "parse_loose_ref_contents" function to check
whether the ref content is valid in the files backend. However,
"parse_loose_ref_contents" allows the ref's content to end with
garbage or without a newline.
Even though we never create such loose refs ourselves, we have accepted
such loose refs. So, it is entirely possible that some third-party tools
may rely on such loose refs being valid. We should not report an fsck
error for them at present. Instead, we should notify the users about such
"curiously formatted" loose refs so that adequate care is taken before
we decide to tighten the rules in the future.
It is not suitable to report an fsck warning to the user either.
We don't yet want the "--strict" flag that controls this bit to end up
generating errors for such weirdly-formatted reference contents, as we
first want to assess whether this retroactive tightening will cause
issues for any tools out there. It may cause compatibility issues which
may break the repository. So, we add the following two fsck infos to
represent the situation where the ref content ends without a newline or
has trailing garbage:
1. refMissingNewline(INFO): A loose ref that does not end with a
newline (LF).
2. trailingRefContent(INFO): A loose ref has trailing content.
It might appear that we can't provide the user with any warnings by
using FSCK_INFO. However, "fsck.c::fsck_vreport" converts FSCK_INFO to
FSCK_WARN, so we can still warn the user about these situations when
using "git refs verify" without introducing compatibility issues.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
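The two new checks boil down to looking at what follows the object ID. A
minimal standalone sketch of that classification is shown below; the enum
and function names are hypothetical, and "trailing" is assumed to point at
the first byte after the parsed object ID, as it does in the patch.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_ref_tail {
	SKETCH_TAIL_OK,
	SKETCH_TAIL_MISSING_NEWLINE,	/* would be refMissingNewline */
	SKETCH_TAIL_TRAILING_CONTENT,	/* would be trailingRefContent */
};

/*
 * Hypothetical helper mirroring the two conditions added to
 * files_fsck_refs_content(): nothing after the object ID means the LF
 * is missing, and anything other than exactly one LF is trailing
 * content.
 */
static enum sketch_ref_tail sketch_classify_tail(const char *trailing)
{
	if (!*trailing)
		return SKETCH_TAIL_MISSING_NEWLINE;
	if (*trailing != '\n' || trailing[1])
		return SKETCH_TAIL_TRAILING_CONTENT;
	return SKETCH_TAIL_OK;
}
```

Note that extra newlines such as "\n\n\n" count as trailing content here,
which is why the tests print them inside the quoted garbage.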
Documentation/fsck-msgids.txt | 14 +++++++++
fsck.h | 2 ++
refs.c | 2 +-
refs/files-backend.c | 26 ++++++++++++++--
refs/refs-internal.h | 2 +-
t/t0602-reffiles-fsck.sh | 57 +++++++++++++++++++++++++++++++++--
6 files changed, 96 insertions(+), 7 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 22c385ea22..6db0eaa84a 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -173,6 +173,20 @@
`nullSha1`::
(WARN) Tree contains entries pointing to a null sha1.
+`refMissingNewline`::
+ (INFO) A loose ref that does not end with newline(LF). As
+ valid implementations of Git never created such a loose ref
+ file, it may become an error in the future. Report to the
+ git@vger.kernel.org mailing list if you see this error, as
+ we need to know what tools created such a file.
+
+`trailingRefContent`::
+ (INFO) A loose ref has trailing content. As valid implementations
+ of Git never created such a loose ref file, it may become an
+ error in the future. Report to the git@vger.kernel.org mailing
+ list if you see this error, as we need to know what tools
+ created such a file.
+
`treeNotSorted`::
(ERROR) A tree is not properly sorted.
diff --git a/fsck.h b/fsck.h
index 0d99a87911..b85072df57 100644
--- a/fsck.h
+++ b/fsck.h
@@ -85,6 +85,8 @@ enum fsck_msg_type {
FUNC(MAILMAP_SYMLINK, INFO) \
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
+ FUNC(REF_MISSING_NEWLINE, INFO) \
+ FUNC(TRAILING_REF_CONTENT, INFO) \
/* ignored (elevated when requested) */ \
FUNC(EXTRA_HEADER_ENTRY, IGNORE)
diff --git a/refs.c b/refs.c
index 395a17273c..f88b32a633 100644
--- a/refs.c
+++ b/refs.c
@@ -1789,7 +1789,7 @@ static int refs_read_special_head(struct ref_store *ref_store,
}
result = parse_loose_ref_contents(ref_store->repo->hash_algo, content.buf,
- oid, referent, type, failure_errno);
+ oid, referent, type, NULL, failure_errno);
done:
strbuf_release(&full_path);
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 9f300a7d3c..3d4d612420 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -569,7 +569,7 @@ static int read_ref_internal(struct ref_store *ref_store, const char *refname,
buf = sb_contents.buf;
ret = parse_loose_ref_contents(ref_store->repo->hash_algo, buf,
- oid, referent, type, &myerr);
+ oid, referent, type, NULL, &myerr);
out:
if (ret && !myerr)
@@ -606,7 +606,7 @@ static int files_read_symbolic_ref(struct ref_store *ref_store, const char *refn
int parse_loose_ref_contents(const struct git_hash_algo *algop,
const char *buf, struct object_id *oid,
struct strbuf *referent, unsigned int *type,
- int *failure_errno)
+ const char **trailing, int *failure_errno)
{
const char *p;
if (skip_prefix(buf, "ref:", &buf)) {
@@ -628,6 +628,10 @@ int parse_loose_ref_contents(const struct git_hash_algo *algop,
*failure_errno = EINVAL;
return -1;
}
+
+ if (trailing)
+ *trailing = p;
+
return 0;
}
@@ -3513,6 +3517,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
struct strbuf ref_content = STRBUF_INIT;
struct strbuf referent = STRBUF_INIT;
struct fsck_ref_report report = { 0 };
+ const char *trailing = NULL;
unsigned int type = 0;
int failure_errno = 0;
struct object_id oid;
@@ -3537,7 +3542,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
if (parse_loose_ref_contents(ref_store->repo->hash_algo,
ref_content.buf, &oid, &referent,
- &type, &failure_errno)) {
+ &type, &trailing, &failure_errno)) {
strbuf_rtrim(&ref_content);
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_CONTENT,
@@ -3545,6 +3550,21 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
goto cleanup;
}
+ if (!(type & REF_ISSYMREF)) {
+ if (!*trailing) {
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_REF_MISSING_NEWLINE,
+ "misses LF at the end");
+ goto cleanup;
+ }
+ if (*trailing != '\n' || *(trailing + 1)) {
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_TRAILING_REF_CONTENT,
+ "has trailing garbage: '%s'", trailing);
+ goto cleanup;
+ }
+ }
+
cleanup:
strbuf_release(&ref_content);
strbuf_release(&referent);
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 037d7991cd..125f1fe735 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -716,7 +716,7 @@ struct ref_store {
int parse_loose_ref_contents(const struct git_hash_algo *algop,
const char *buf, struct object_id *oid,
struct strbuf *referent, unsigned int *type,
- int *failure_errno);
+ const char **trailing, int *failure_errno);
/*
* Fill in the generic part of refs and add it to our collection of
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 162370077b..33e7a390ad 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -189,7 +189,48 @@ test_expect_success 'regular ref content should be checked (individual)' '
EOF
rm $branch_dir_prefix/a/b/branch-bad &&
test_cmp expect err || return 1
- done
+ done &&
+
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
+ for trailing_content in " garbage" " more garbage"
+ do
+ printf "%s" "$(git rev-parse main)$trailing_content" >$branch_dir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\''$trailing_content'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "%s\n\n\n" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+
+
+ '\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err &&
+
+ printf "%s\n\n\n garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage-special &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-garbage-special: trailingRefContent: has trailing garbage: '\''
+
+
+ garbage'\''
+ EOF
+ rm $branch_dir_prefix/branch-garbage-special &&
+ test_cmp expect err
'
test_expect_success 'regular ref content should be checked (aggregate)' '
@@ -207,12 +248,16 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
printf "%s" $bad_content_1 >$tag_dir_prefix/tag-bad-1 &&
printf "%s" $bad_content_2 >$tag_dir_prefix/tag-bad-2 &&
printf "%s" $bad_content_3 >$branch_dir_prefix/a/b/branch-bad &&
+ printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+ printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
error: refs/heads/a/b/branch-bad: badRefContent: $bad_content_3
error: refs/tags/tag-bad-1: badRefContent: $bad_content_1
error: refs/tags/tag-bad-2: badRefContent: $bad_content_2
+ warning: refs/heads/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
EOF
sort err >sorted_err &&
test_cmp expect sorted_err
@@ -260,7 +305,15 @@ test_expect_success 'ref content checks should work with worktrees' '
EOF
rm $worktree2_refdir_prefix/bad-branch-2 &&
test_cmp expect err || return 1
- done
+ done &&
+
+ printf "%s" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err
'
test_done
--
2.47.0
* [PATCH v9 7/9] ref: add basic symref content check for files backend
2024-11-20 11:47 ` [PATCH v9 " shejialuo
` (5 preceding siblings ...)
2024-11-20 11:51 ` [PATCH v9 6/9] ref: add more strict checks for regular refs shejialuo
@ 2024-11-20 11:52 ` shejialuo
2024-11-20 11:52 ` [PATCH v9 8/9] ref: check whether the target of the symref is a ref shejialuo
` (2 subsequent siblings)
9 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-20 11:52 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
We have code that checks regular ref contents, but we do not yet check
the contents of symbolic refs. By using "parse_loose_ref_contents" for
symbolic refs, we will get the information of the "referent".
We do not need to check the "referent" by opening the file. This is
because if the "referent" exists in the file system, we will eventually
check its correctness by inspecting every file in the "refs" directory.
If the "referent" does not exist in the file system, this is OK, as it
is treated as a dangling symref.
So we just need to check the "referent" string content. A regular ref
could be accepted as a textual symref if it begins with "ref:", followed
by zero or more whitespaces, followed by the full refname, followed only
by whitespace characters. However, we always write a single SP after
"ref:" and a single LF after the refname. It may seem that we should
report an fsck error when the "referent" does not follow the above
rules, but we should not be so aggressive, because third-party
reimplementations of Git may have taken advantage of the looser syntax.
More specifically, we accept the following contents:
1. "ref: refs/heads/master "
2. "ref: refs/heads/master \n \n"
3. "ref: refs/heads/master\n\n"
When introducing the regular ref content checks, we created two fsck
infos, "refMissingNewline" and "trailingRefContent", which exactly
represent the above situations. So we will reuse these two fsck messages
to inform the user about these situations.
But we do not allow any other trailing garbage. The following are bad
symref contents, which will be reported as fsck errors by "git-fsck(1)":
1. "ref: refs/heads/master garbage\n"
2. "ref: refs/heads/master \n\n\n garbage "
We introduce a new "badReferentName(ERROR)" fsck message to report the
above errors, using "is_root_ref" and "check_refname_format" to check
the "referent". Since neither "is_root_ref" nor "check_refname_format"
works with whitespace, we pass the trimmed version of "referent"
to these functions.
In order to add checks, we will do the following things:
1. Record the untrimmed length "orig_len" and untrimmed last byte
"orig_last_byte".
2. Use "strbuf_rtrim" to trim the whitespaces or newlines to make sure
"is_root_ref" and "check_refname_format" will not fail because of them.
3. Use "orig_len" and "orig_last_byte" to check whether the "referent"
misses '\n' at the end or has trailing whitespaces or newlines.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
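The three steps above can be sketched on a plain C string as follows. This
is an illustration only: the flag and function names are hypothetical, the
refname validity check ("is_root_ref" and "check_refname_format") is left
out, the empty-string guard is an addition of the sketch, and the C library
isspace() is assumed to be close enough to what "strbuf_rtrim" trims.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define SKETCH_MISSING_NEWLINE  (1u << 0)	/* would be refMissingNewline */
#define SKETCH_TRAILING_CONTENT (1u << 1)	/* would be trailingRefContent */

/*
 * Hypothetical helper: classify the tail of a symref "referent" (the
 * text after "ref:" with leading whitespace already skipped).
 */
static unsigned int sketch_check_referent_tail(const char *referent)
{
	size_t orig_len = strlen(referent);
	size_t len = orig_len;
	unsigned int flags = 0;
	char orig_last_byte;

	/* Guard added by this sketch; the patch has no such special case. */
	if (!orig_len)
		return SKETCH_MISSING_NEWLINE;

	/* Step 1: remember the untrimmed length and last byte. */
	orig_last_byte = referent[orig_len - 1];

	/* Step 2: compute the length after trimming trailing whitespace. */
	while (len && isspace((unsigned char)referent[len - 1]))
		len--;

	/* Step 3: compare trimmed and untrimmed state. */
	if (len == orig_len || (len < orig_len && orig_last_byte != '\n'))
		flags |= SKETCH_MISSING_NEWLINE;
	if (len != orig_len && len != orig_len - 1)
		flags |= SKETCH_TRAILING_CONTENT;
	return flags;
}
```

With this sketch, "refs/heads/master\n" yields no flags, a referent with a
single trailing SP yields only the missing-newline flag, and the
"\n \n" and "\n\n" tails from the list of accepted contents yield only the
trailing-content flag.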
Documentation/fsck-msgids.txt | 3 +
fsck.h | 1 +
refs/files-backend.c | 40 ++++++++++++
t/t0602-reffiles-fsck.sh | 111 ++++++++++++++++++++++++++++++++++
4 files changed, 155 insertions(+)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 6db0eaa84a..dcea05edfc 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -28,6 +28,9 @@
`badRefName`::
(ERROR) A ref has an invalid format.
+`badReferentName`::
+ (ERROR) The referent name of a symref is invalid.
+
`badTagName`::
(INFO) A tag has an invalid format.
diff --git a/fsck.h b/fsck.h
index b85072df57..5227dfdef2 100644
--- a/fsck.h
+++ b/fsck.h
@@ -34,6 +34,7 @@ enum fsck_msg_type {
FUNC(BAD_REF_CONTENT, ERROR) \
FUNC(BAD_REF_FILETYPE, ERROR) \
FUNC(BAD_REF_NAME, ERROR) \
+ FUNC(BAD_REFERENT_NAME, ERROR) \
FUNC(BAD_TIMEZONE, ERROR) \
FUNC(BAD_TREE, ERROR) \
FUNC(BAD_TREE_SHA1, ERROR) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 3d4d612420..f4342e3f3e 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3509,6 +3509,43 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
const char *refname,
struct dir_iterator *iter);
+static int files_fsck_symref_target(struct fsck_options *o,
+ struct fsck_ref_report *report,
+ struct strbuf *referent)
+{
+ char orig_last_byte;
+ size_t orig_len;
+ int ret = 0;
+
+ orig_len = referent->len;
+ orig_last_byte = referent->buf[orig_len - 1];
+ strbuf_rtrim(referent);
+
+ if (!is_root_ref(referent->buf) &&
+ check_refname_format(referent->buf, 0)) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_BAD_REFERENT_NAME,
+ "points to invalid refname '%s'", referent->buf);
+ goto out;
+ }
+
+ if (referent->len == orig_len ||
+ (referent->len < orig_len && orig_last_byte != '\n')) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_REF_MISSING_NEWLINE,
+ "misses LF at the end");
+ }
+
+ if (referent->len != orig_len && referent->len != orig_len - 1) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_TRAILING_REF_CONTENT,
+ "has trailing whitespaces or newlines");
+ }
+
+out:
+ return ret;
+}
+
static int files_fsck_refs_content(struct ref_store *ref_store,
struct fsck_options *o,
const char *target_name,
@@ -3563,6 +3600,9 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
"has trailing garbage: '%s'", trailing);
goto cleanup;
}
+ } else {
+ ret = files_fsck_symref_target(o, &report, &referent);
+ goto cleanup;
}
cleanup:
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 33e7a390ad..ee1e5f2864 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -263,6 +263,109 @@ test_expect_success 'regular ref content should be checked (aggregate)' '
test_cmp expect sorted_err
'
+test_expect_success 'textual symref content should be checked (individual)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for bad_referent in "refs/heads/.branch" "refs/heads/~branch" "refs/heads/?branch"
+ do
+ printf "ref: %s\n" $bad_referent >$branch_dir_prefix/branch-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad: badReferentName: points to invalid refname '\''$bad_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad &&
+ test_cmp expect err || return 1
+ done &&
+
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-no-newline: refMissingNewline: misses LF at the end
+ EOF
+ rm $branch_dir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-1 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-2 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-trailing-3 &&
+ test_cmp expect err &&
+
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ EOF
+ rm $branch_dir_prefix/a/b/branch-complicated &&
+ test_cmp expect err
+'
+
+test_expect_success 'textual symref content should be checked (aggregate)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ printf "ref: refs/heads/branch\n" >$branch_dir_prefix/branch-good &&
+ printf "ref: HEAD\n" >$branch_dir_prefix/branch-head &&
+ printf "ref: refs/heads/branch" >$branch_dir_prefix/branch-no-newline-1 &&
+ printf "ref: refs/heads/branch " >$branch_dir_prefix/a/b/branch-trailing-1 &&
+ printf "ref: refs/heads/branch\n\n" >$branch_dir_prefix/a/b/branch-trailing-2 &&
+ printf "ref: refs/heads/branch \n" >$branch_dir_prefix/a/b/branch-trailing-3 &&
+ printf "ref: refs/heads/branch \n " >$branch_dir_prefix/a/b/branch-complicated &&
+ printf "ref: refs/heads/.branch\n" >$branch_dir_prefix/branch-bad-1 &&
+
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ error: refs/heads/branch-bad-1: badReferentName: points to invalid refname '\''refs/heads/.branch'\''
+ warning: refs/heads/a/b/branch-complicated: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-complicated: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-1: refMissingNewline: misses LF at the end
+ warning: refs/heads/a/b/branch-trailing-1: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-2: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/a/b/branch-trailing-3: trailingRefContent: has trailing whitespaces or newlines
+ warning: refs/heads/branch-no-newline-1: refMissingNewline: misses LF at the end
+ EOF
+ sort err >sorted_err &&
+ test_cmp expect sorted_err
+'
+
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
@@ -313,6 +416,14 @@ test_expect_success 'ref content checks should work with worktrees' '
warning: worktrees/worktree-1/refs/worktree/branch-no-newline: refMissingNewline: misses LF at the end
EOF
rm $worktree1_refdir_prefix/branch-no-newline &&
+ test_cmp expect err &&
+
+ printf "%s garbage" "$(git rev-parse HEAD)" >$worktree1_refdir_prefix/branch-garbage &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/branch-garbage: trailingRefContent: has trailing garbage: '\'' garbage'\''
+ EOF
+ rm $worktree1_refdir_prefix/branch-garbage &&
test_cmp expect err
'
--
2.47.0
* [PATCH v9 8/9] ref: check whether the target of the symref is a ref
2024-11-20 11:47 ` [PATCH v9 " shejialuo
` (6 preceding siblings ...)
2024-11-20 11:52 ` [PATCH v9 7/9] ref: add basic symref content check for files backend shejialuo
@ 2024-11-20 11:52 ` shejialuo
2024-11-20 11:52 ` [PATCH v9 9/9] ref: add symlink ref content check for files backend shejialuo
2024-11-20 14:26 ` [PATCH v9 0/9] add " Patrick Steinhardt
9 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-20 11:52 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
Ideally, we want users to use "git symbolic-ref" to create symrefs
instead of writing raw contents into the filesystem. However, "git
symbolic-ref" is strict about the refname but not about the
referent. For example, we can point the "referent" at
"$(gitdir)/logs/aaa" and manually write an object ID into that file;
"git rev-parse" still resolves the symref successfully.
$ git init repo && cd repo && git commit --allow-empty -mx
$ git symbolic-ref refs/heads/test logs/aaa
$ echo $(git rev-parse HEAD) > .git/logs/aaa
$ git rev-parse test
We may need to add some restrictions on the "referent" parameter when
using "git symbolic-ref" to create symrefs, because ideally all
non-pseudo refs should be located under the "refs" directory, and we may
tighten this in the future.
To tell users that we may tighten this, create a new fsck message
"symrefTargetIsNotARef" notifying them that such a symref may become
an error in the future.
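The condition behind the new message can be sketched in isolation as
below. This is a simplified illustration, not the patch itself:
"is_root_ref_sketch" is a hypothetical stand-in that recognizes only
"HEAD", whereas Git's real "is_root_ref" accepts more names, and
"has_prefix" is a local helper playing the role of "starts_with". The
"worktrees/" prefix mirrors the check added in the diff.

```c
#include <string.h>

/* Local helper for this sketch: returns 1 if str begins with prefix. */
static int has_prefix(const char *str, const char *prefix)
{
	return !strncmp(str, prefix, strlen(prefix));
}

/*
 * Hypothetical stand-in for is_root_ref(): only "HEAD" is treated as a
 * root ref here. The real function in Git is broader.
 */
static int is_root_ref_sketch(const char *refname)
{
	return !strcmp(refname, "HEAD");
}

/*
 * Returns 1 when the referent is neither a root ref nor located under
 * "refs/" or "worktrees/", i.e. when the INFO message would be reported.
 */
static int symref_target_is_not_a_ref(const char *referent)
{
	return !is_root_ref_sketch(referent) &&
	       !has_prefix(referent, "refs/") &&
	       !has_prefix(referent, "worktrees/");
}
```

Note that the prefixes include the trailing slash, so a referent such as
"refs-back/heads/branch" is still flagged.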
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 9 +++++++++
fsck.h | 1 +
refs/files-backend.c | 14 ++++++++++++--
t/t0602-reffiles-fsck.sh | 29 +++++++++++++++++++++++++++++
4 files changed, 51 insertions(+), 2 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index dcea05edfc..f82ebc58e8 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -183,6 +183,15 @@
git@vger.kernel.org mailing list if you see this error, as
we need to know what tools created such a file.
+`symrefTargetIsNotARef`::
+ (INFO) The target of a symbolic reference points neither to
+ a root reference nor to a reference starting with "refs/".
+ Although `git symbolic-ref` allows creating a symref whose referent
+ is outside "refs/", we may tighten
+ the rule in the future. Report to the git@vger.kernel.org
+ mailing list if you see this error, as we need to know what tools
+ created such a file.
+
`trailingRefContent`::
(INFO) A loose ref has trailing content. As valid implementations
of Git never created such a loose ref file, it may become an
diff --git a/fsck.h b/fsck.h
index 5227dfdef2..53a47612e6 100644
--- a/fsck.h
+++ b/fsck.h
@@ -87,6 +87,7 @@ enum fsck_msg_type {
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
FUNC(REF_MISSING_NEWLINE, INFO) \
+ FUNC(SYMREF_TARGET_IS_NOT_A_REF, INFO) \
FUNC(TRAILING_REF_CONTENT, INFO) \
/* ignored (elevated when requested) */ \
FUNC(EXTRA_HEADER_ENTRY, IGNORE)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index f4342e3f3e..c2b99fdf40 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3513,6 +3513,7 @@ static int files_fsck_symref_target(struct fsck_options *o,
struct fsck_ref_report *report,
struct strbuf *referent)
{
+ int is_referent_root;
char orig_last_byte;
size_t orig_len;
int ret = 0;
@@ -3521,8 +3522,17 @@ static int files_fsck_symref_target(struct fsck_options *o,
orig_last_byte = referent->buf[orig_len - 1];
strbuf_rtrim(referent);
- if (!is_root_ref(referent->buf) &&
- check_refname_format(referent->buf, 0)) {
+ is_referent_root = is_root_ref(referent->buf);
+ if (!is_referent_root &&
+ !starts_with(referent->buf, "refs/") &&
+ !starts_with(referent->buf, "worktrees/")) {
+ ret = fsck_report_ref(o, report,
+ FSCK_MSG_SYMREF_TARGET_IS_NOT_A_REF,
+ "points to non-ref target '%s'", referent->buf);
+
+ }
+
+ if (!is_referent_root && check_refname_format(referent->buf, 0)) {
ret = fsck_report_ref(o, report,
FSCK_MSG_BAD_REFERENT_NAME,
"points to invalid refname '%s'", referent->buf);
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index ee1e5f2864..692b30727a 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -366,6 +366,35 @@ test_expect_success 'textual symref content should be checked (aggregate)' '
test_cmp expect sorted_err
'
+test_expect_success 'the target of the textual symref should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ for good_referent in "refs/heads/branch" "HEAD" "refs/tags/tag"
+ do
+ printf "ref: %s\n" $good_referent >$branch_dir_prefix/branch-good &&
+ git refs verify 2>err &&
+ rm $branch_dir_prefix/branch-good &&
+ test_must_be_empty err || return 1
+ done &&
+
+ for nonref_referent in "refs-back/heads/branch" "refs-back/tags/tag" "reflogs/refs/heads/branch"
+ do
+ printf "ref: %s\n" $nonref_referent >$branch_dir_prefix/branch-bad-1 &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-bad-1: symrefTargetIsNotARef: points to non-ref target '\''$nonref_referent'\''
+ EOF
+ rm $branch_dir_prefix/branch-bad-1 &&
+ test_cmp expect err || return 1
+ done
+'
+
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
--
2.47.0
* [PATCH v9 9/9] ref: add symlink ref content check for files backend
2024-11-20 11:47 ` [PATCH v9 " shejialuo
` (7 preceding siblings ...)
2024-11-20 11:52 ` [PATCH v9 8/9] ref: check whether the target of the symref is a ref shejialuo
@ 2024-11-20 11:52 ` shejialuo
2024-11-20 14:26 ` [PATCH v9 0/9] add " Patrick Steinhardt
9 siblings, 0 replies; 209+ messages in thread
From: shejialuo @ 2024-11-20 11:52 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano
Besides textual symrefs, we also allow symbolic links as symrefs.
So, we should provide the same consistency checks for them as we do
for textual symrefs. We are also considering deprecating the writing of
symbolic links, so we first need to assess whether symbolic links are
still used. Add a new fsck message "symlinkRef(INFO)" to make the
user aware of this.
We have already introduced "files_fsck_symref_target". Reuse
this function to handle symrefs which use legacy symbolic links. We
should not check for trailing garbage in symbolic links, so add a new
parameter "symbolic_link" to disable the checks which should only be
executed for textual symrefs.
To reuse "files_fsck_symref_target", we also need to generate the
"referent" parameter by the following steps:
1. Use "strbuf_add_real_path" to resolve the symlink and get the
absolute path "ref_content" which the symlink ref points to.
2. Generate the absolute path "abs_gitdir" of "gitdir" and combine
"ref_content" and "abs_gitdir" to extract the relative path
"relative_referent_path".
3. If "ref_content" is outside of "gitdir", set "referent" to
"ref_content". Otherwise, set "referent" to
"relative_referent_path".
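Steps 2 and 3 amount to stripping the absolute gitdir prefix from the
resolved path. A minimal sketch with plain C strings follows; the helper
"referent_from_symlink" is hypothetical, and it assumes that both paths
are already absolute and normalized and that the gitdir argument ends
with a directory separator, as the patch ensures before comparing.

```c
#include <string.h>

/*
 * Hypothetical helper, not part of Git. "resolved" is the absolute path
 * the symlink points to; "abs_gitdir" is the absolute gitdir and must
 * end with '/'. Returns the path relative to the gitdir when "resolved"
 * lies inside it, otherwise returns "resolved" unchanged.
 */
static const char *referent_from_symlink(const char *resolved,
					 const char *abs_gitdir)
{
	size_t prefix_len = strlen(abs_gitdir);

	if (!strncmp(resolved, abs_gitdir, prefix_len))
		return resolved + prefix_len;
	return resolved;
}
```

The trailing separator matters: without it, a sibling directory such as
"/repo/.git-backup" would be mistaken for a path inside "/repo/.git".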
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
Documentation/fsck-msgids.txt | 6 ++
fsck.h | 1 +
refs/files-backend.c | 38 ++++++++-
t/t0602-reffiles-fsck.sh | 141 ++++++++++++++++++++++++++++++++++
4 files changed, 182 insertions(+), 4 deletions(-)
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index f82ebc58e8..b14bc44ca4 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -183,6 +183,12 @@
git@vger.kernel.org mailing list if you see this error, as
we need to know what tools created such a file.
+`symlinkRef`::
+ (INFO) A symbolic link is used as a symref. Report to the
+ git@vger.kernel.org mailing list if you see this error, as we
+ are assessing the feasibility of dropping support for
+ creating symbolic links as symrefs.
+
`symrefTargetIsNotARef`::
(INFO) The target of a symbolic reference points neither to
a root reference nor to a reference starting with "refs/".
diff --git a/fsck.h b/fsck.h
index 53a47612e6..a44c231a5f 100644
--- a/fsck.h
+++ b/fsck.h
@@ -86,6 +86,7 @@ enum fsck_msg_type {
FUNC(MAILMAP_SYMLINK, INFO) \
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO) \
+ FUNC(SYMLINK_REF, INFO) \
FUNC(REF_MISSING_NEWLINE, INFO) \
FUNC(SYMREF_TARGET_IS_NOT_A_REF, INFO) \
FUNC(TRAILING_REF_CONTENT, INFO) \
diff --git a/refs/files-backend.c b/refs/files-backend.c
index c2b99fdf40..ea5961e48c 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1,6 +1,7 @@
#define USE_THE_REPOSITORY_VARIABLE
#include "../git-compat-util.h"
+#include "../abspath.h"
#include "../config.h"
#include "../copy.h"
#include "../environment.h"
@@ -3511,7 +3512,8 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
static int files_fsck_symref_target(struct fsck_options *o,
struct fsck_ref_report *report,
- struct strbuf *referent)
+ struct strbuf *referent,
+ unsigned int symbolic_link)
{
int is_referent_root;
char orig_last_byte;
@@ -3520,7 +3522,8 @@ static int files_fsck_symref_target(struct fsck_options *o,
orig_len = referent->len;
orig_last_byte = referent->buf[orig_len - 1];
- strbuf_rtrim(referent);
+ if (!symbolic_link)
+ strbuf_rtrim(referent);
is_referent_root = is_root_ref(referent->buf);
if (!is_referent_root &&
@@ -3539,6 +3542,9 @@ static int files_fsck_symref_target(struct fsck_options *o,
goto out;
}
+ if (symbolic_link)
+ goto out;
+
if (referent->len == orig_len ||
(referent->len < orig_len && orig_last_byte != '\n')) {
ret = fsck_report_ref(o, report,
@@ -3562,6 +3568,7 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
struct dir_iterator *iter)
{
struct strbuf ref_content = STRBUF_INIT;
+ struct strbuf abs_gitdir = STRBUF_INIT;
struct strbuf referent = STRBUF_INIT;
struct fsck_ref_report report = { 0 };
const char *trailing = NULL;
@@ -3572,8 +3579,30 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
report.path = target_name;
- if (S_ISLNK(iter->st.st_mode))
+ if (S_ISLNK(iter->st.st_mode)) {
+ const char *relative_referent_path = NULL;
+
+ ret = fsck_report_ref(o, &report,
+ FSCK_MSG_SYMLINK_REF,
+ "use deprecated symbolic link for symref");
+
+ strbuf_add_absolute_path(&abs_gitdir, ref_store->repo->gitdir);
+ strbuf_normalize_path(&abs_gitdir);
+ if (!is_dir_sep(abs_gitdir.buf[abs_gitdir.len - 1]))
+ strbuf_addch(&abs_gitdir, '/');
+
+ strbuf_add_real_path(&ref_content, iter->path.buf);
+ skip_prefix(ref_content.buf, abs_gitdir.buf,
+ &relative_referent_path);
+
+ if (relative_referent_path)
+ strbuf_addstr(&referent, relative_referent_path);
+ else
+ strbuf_addbuf(&referent, &ref_content);
+
+ ret |= files_fsck_symref_target(o, &report, &referent, 1);
goto cleanup;
+ }
if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
/*
@@ -3611,13 +3640,14 @@ static int files_fsck_refs_content(struct ref_store *ref_store,
goto cleanup;
}
} else {
- ret = files_fsck_symref_target(o, &report, &referent);
+ ret = files_fsck_symref_target(o, &report, &referent, 0);
goto cleanup;
}
cleanup:
strbuf_release(&ref_content);
strbuf_release(&referent);
+ strbuf_release(&abs_gitdir);
return ret;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 692b30727a..f8f27cfc6c 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -395,6 +395,147 @@ test_expect_success 'the target of the textual symref should be checked' '
done
'
+test_expect_success SYMLINKS 'symlink symref content should be checked' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ branch_dir_prefix=.git/refs/heads &&
+ tag_dir_prefix=.git/refs/tags &&
+ cd repo &&
+ test_commit default &&
+ mkdir -p "$branch_dir_prefix/a/b" &&
+
+ ln -sf ./main $branch_dir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../logs/branch-escape $branch_dir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: refs/heads/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"branch " $branch_dir_prefix/branch-symbolic-bad &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-bad: symlinkRef: use deprecated symbolic link for symref
+ error: refs/heads/branch-symbolic-bad: badReferentName: points to invalid refname '\''refs/heads/branch '\''
+ EOF
+ rm $branch_dir_prefix/branch-symbolic-bad &&
+ test_cmp expect err &&
+
+ ln -sf ./".tag" $tag_dir_prefix/tag-symbolic-1 &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/tags/tag-symbolic-1: symlinkRef: use deprecated symbolic link for symref
+ error: refs/tags/tag-symbolic-1: badReferentName: points to invalid refname '\''refs/tags/.tag'\''
+ EOF
+ rm $tag_dir_prefix/tag-symbolic-1 &&
+ test_cmp expect err
+'
+
+test_expect_success SYMLINKS 'symlink symref content should be checked (worktree)' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ cd repo &&
+ test_commit default &&
+ git branch branch-1 &&
+ git branch branch-2 &&
+ git branch branch-3 &&
+ git worktree add ./worktree-1 branch-2 &&
+ git worktree add ./worktree-2 branch-3 &&
+ main_worktree_refdir_prefix=.git/refs/heads &&
+ worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+ worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+ (
+ cd worktree-1 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+ (
+ cd worktree-2 &&
+ git update-ref refs/worktree/branch-4 refs/heads/branch-1
+ ) &&
+
+ ln -sf ../../../../refs/heads/good-branch $worktree1_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $worktree1_refdir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../worktrees/worktree-1/good-branch $worktree2_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $worktree2_refdir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../worktrees/worktree-2/good-branch $main_worktree_refdir_prefix/branch-symbolic-good &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: refs/heads/branch-symbolic-good: symlinkRef: use deprecated symbolic link for symref
+ EOF
+ rm $main_worktree_refdir_prefix/branch-symbolic-good &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../logs/branch-escape $worktree1_refdir_prefix/branch-symbolic &&
+ git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symlinkRef: use deprecated symbolic link for symref
+ warning: worktrees/worktree-1/refs/worktree/branch-symbolic: symrefTargetIsNotARef: points to non-ref target '\''logs/branch-escape'\''
+ EOF
+ rm $worktree1_refdir_prefix/branch-symbolic &&
+ test_cmp expect err &&
+
+ for bad_referent_name in ".tag" "branch "
+ do
+ ln -sf ./"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-1/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree1_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-1/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-1/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree1_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ./"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''worktrees/worktree-2/refs/worktree/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err &&
+
+ ln -sf ../../../../refs/heads/"$bad_referent_name" $worktree2_refdir_prefix/bad-symbolic &&
+ test_must_fail git refs verify 2>err &&
+ cat >expect <<-EOF &&
+ warning: worktrees/worktree-2/refs/worktree/bad-symbolic: symlinkRef: use deprecated symbolic link for symref
+ error: worktrees/worktree-2/refs/worktree/bad-symbolic: badReferentName: points to invalid refname '\''refs/heads/$bad_referent_name'\''
+ EOF
+ rm $worktree2_refdir_prefix/bad-symbolic &&
+ test_cmp expect err || return 1
+ done
+'
+
test_expect_success 'ref content checks should work with worktrees' '
test_when_finished "rm -rf repo" &&
git init repo &&
--
2.47.0
* Re: [PATCH v9 0/9] add ref content check for files backend
2024-11-20 11:47 ` [PATCH v9 " shejialuo
` (8 preceding siblings ...)
2024-11-20 11:52 ` [PATCH v9 9/9] ref: add symlink ref content check for files backend shejialuo
@ 2024-11-20 14:26 ` Patrick Steinhardt
2024-11-20 23:21 ` Junio C Hamano
9 siblings, 1 reply; 209+ messages in thread
From: Patrick Steinhardt @ 2024-11-20 14:26 UTC (permalink / raw)
To: shejialuo; +Cc: git, Karthik Nayak, Junio C Hamano
On Wed, Nov 20, 2024 at 07:47:04PM +0800, shejialuo wrote:
> Hi All:
>
> This version fixes two problems:
>
> 1. Remove unnecessary space.
> 2. Drop extra "strerror(errno)".
>
> Thanks,
> Jialuo
The range-diff looks as expected, so this version looks good to me.
Thanks!
Patrick
* Re: [PATCH v9 0/9] add ref content check for files backend
2024-11-20 14:26 ` [PATCH v9 0/9] add " Patrick Steinhardt
@ 2024-11-20 23:21 ` Junio C Hamano
0 siblings, 0 replies; 209+ messages in thread
From: Junio C Hamano @ 2024-11-20 23:21 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: shejialuo, git, Karthik Nayak
Patrick Steinhardt <ps@pks.im> writes:
> On Wed, Nov 20, 2024 at 07:47:04PM +0800, shejialuo wrote:
>> Hi All:
>>
>> This version fixes two problems:
>>
>> 1. Remove unnecessary space.
>> 2. Drop extra "strerror(errno)".
>>
>> Thanks,
>> Jialuo
>
> The range-diff looks as expected, so this version looks good to me.
Thanks, both. Looking good.