From: Junio C Hamano <gitster@pobox.com>
To: shejialuo <shejialuo@gmail.com>
Cc: git@vger.kernel.org, Patrick Steinhardt <ps@pks.im>,
Karthik Nayak <karthik.188@gmail.com>
Subject: Re: [PATCH v2 3/4] ref: add symbolic ref content check for files backend
Date: Tue, 27 Aug 2024 12:19:11 -0700
Message-ID: <xmqq1q2993kg.fsf@gitster.g>
In-Reply-To: <Zs3558scHssaG_XS@ArchLinux> (shejialuo@gmail.com's message of "Wed, 28 Aug 2024 00:08:07 +0800")
shejialuo <shejialuo@gmail.com> writes:
> We have already introduced the checks for regular refs. There is no need
> to check the consistency of the target which the symbolic ref points to.
> Instead, we just check the content of the symbolic ref itself.
Just in case you need it in the future, if you ever need to refer to
a symbolic ref in a way that it is clear which of the two kinds you
are talking about, you can say "textual symref" (a regular file
whose contents is "ref: " followed by the target), to contrast them
with "symbolic link used as symref".
In the proposed log message of this commit, all references to
"symbolic ref" talk about textual ones, so I do not see any need to
be extra explicit by saying "textual symref".
> In order to check the content of the symbolic ref, create a function
> "files_fsck_symref_target". It will first check whether the "pointee" is
> under the "refs/" directory and then we will check the "pointee" itself.
Hmph, as the pointee must be within the usual places that you would
find refs (either in refs/ directory or pseudo ref files immediately
below $GIT_DIR), wouldn't we check the pointee when fsck (or "git
refs verify") runs and checks everything? The pointee will have its
turn to be checked, and I am not sure why you need to check the
pointee when you find a symbolic ref is pointing at it, which will
lead to it being checked twice (or more).
I however did not find any additional code to "check the pointee itself"
in the patch, so perhaps it is OK---the only thing that needs fixing
may be the above paragraph if that is the case.
> There is no specification about the content of the symbolic ref.
> Although we do write "ref: %s\n" to create a symbolic ref by using
> "git-symbolic-ref(1)" command. However, this is not mandatory. We still
> accept symbolic refs with null trailing garbage. Put it more specific,
> the following are correct:
>
> 1. "ref: refs/heads/master "
> 2. "ref: refs/heads/master \n \n"
> 3. "ref: refs/heads/master\n\n"
>
> But we do not allow any non-null trailing garbage.
Your use of word "null" is probably too confusing to contributors to
this project. None of the above has NUL bytes in them. I think you
want to say something like this:
A regular file is accepted as a textual symbolic ref if it
begins with "ref:", followed by zero or more whitespaces,
followed by the full refname (e.g. "refs/heads/master",
"refs/tags/v1.0"), followed only by whitespace characters. We
always write a single SP after "ref:" and a single LF after the
full refname, but third-party reimplementations of Git may have
taken advantage of the looser syntax that is allowed as above.
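To make the accepted syntax described above concrete, here is a minimal
sketch in plain C. It is not the code in refs/files-backend.c; the helper
name parse_textual_symref() is hypothetical, and it uses the standard
isspace(), which also matches VT and FF. It only checks the overall shape
("ref:", optional whitespace, a name without embedded whitespace, trailing
whitespace only) and does not validate the refname itself.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only. Returns 1 if buf has
 * the shape of a textual symref and stores the start and length of
 * the refname in *name and *len; returns 0 otherwise.
 */
static int parse_textual_symref(const char *buf, const char **name, size_t *len)
{
	const char *p, *q, *end;

	if (strncmp(buf, "ref:", 4))
		return 0;

	/* zero or more whitespace bytes after "ref:" */
	p = buf + 4;
	while (isspace((unsigned char)*p))
		p++;

	/* ignore whitespace at the end, like an rtrim would */
	end = p + strlen(p);
	while (end > p && isspace((unsigned char)end[-1]))
		end--;
	if (end == p)
		return 0; /* nothing left to serve as a refname */

	/* whitespace inside what remains means trailing garbage */
	for (q = p; q < end; q++)
		if (isspace((unsigned char)*q))
			return 0;

	*name = p;
	*len = (size_t)(end - p);
	return 1;
}
```

With this sketch, "ref: refs/heads/master \n \n" is accepted with a
17-byte refname, while "ref: refs/heads/master garbage\n" is rejected.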
> The following are bad
> symbolic contents which will be reported as fsck error by "git-fsck(1)".
>
> 1. "ref: refs/heads/master garbage\n"
> 2. "ref: refs/heads/master \n\n\n garbage "
>
> In order to provide above checks, we will use "strrchr" to check whether
> we have newline in the ref content.
strrchr() to look for only LF is overly strict. You need to match
what refs/files-backend.c:read_ref_internal() does to the contents
read from such a loose ref file, i.e. strbuf_rtrim(). Any isspace()
bytes are trimmed at the end, including SP, HT, CR and LF.
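As a rough illustration of why looking only for LF is too strict, the
following sketch mimics the effect of a right trim on a plain C string.
rtrimmed_len() is a hypothetical stand-in, not strbuf_rtrim() itself, and
it relies on the standard isspace(), which additionally matches VT and FF.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the effect of strbuf_rtrim(): return the
 * length of s once every trailing isspace() byte has been dropped.
 */
static size_t rtrimmed_len(const char *s)
{
	size_t len = strlen(s);

	while (len > 0 && isspace((unsigned char)s[len - 1]))
		len--;
	return len;
}
```

Here "refs/heads/master \t\r\n" and "refs/heads/master" both yield 17, so
a reader that trims this way sees the same referent in either case.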
> +static int files_fsck_symref_target(struct fsck_options *o,
> + struct fsck_ref_report *report,
> + const char *refname,
> + struct strbuf *pointee_name,
> + struct strbuf *pointee_path)
> +{
> + const char *newline_pos = NULL;
> + const char *p = NULL;
> + struct stat st;
> + int ret = 0;
> +
> + if (!skip_prefix(pointee_name->buf, "refs/", &p)) {
> +
> + ret = fsck_report_ref(o, report,
> + FSCK_MSG_BAD_SYMREF_POINTEE,
> + "points to ref outside the refs directory");
> + goto out;
> + }
> +
> + newline_pos = strrchr(p, '\n');
> + if (!newline_pos || *(newline_pos + 1)) {
> + ret = fsck_report_ref(o, report,
> + FSCK_MSG_REF_MISSING_NEWLINE,
> + "missing newline");
If newline_pos is NULL, it is truly a "missing newline" situation.
If I am reading the code correctly, the severity level is set to
INFO, which is good.
If newline_pos is not NULL but newline_pos[1] is not NUL, however,
that is not a "missing newline". "ref: refs/heads/master\n " would
trigger this report, for example.
As far as I can tell, such a textual symbolic ref is taken as a
valid symbolic ref pointing at "refs/heads/master" by
refs/files-backend.c:read_ref_internal(), so we are trying to detect
a valid but curiously formatted textual symbolic ref file with the
above code?
And strrchr() to find the last LF is not sufficient for that
purpose. We would never write "ref: refs/heads/master \n",
but the above code will find the LF, be satisfied that the LF is
followed by NUL, without realizing that SP there is not something we
would have written!
I am not sure if it is worth detecting whether the file is something
we would have written, but if it is, then you would
probably need to do
(1) check the last byte of pointee_name.buf[] to make sure that
it is LF; and
(2) remember pointee_name.len, run strbuf_rtrim() on pointee_name,
and make sure that LF at the end was the only thing that was
trimmed by checking the pointee_name.len after trimming.
or something like that. Then you do not have to have an ugly "oh we
need to check again"---the production code would not do that, either.
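The two steps can be sketched on a plain C string as below. This is only
an illustration under stated assumptions: ends_with_single_lf() is a
hypothetical name, the trimming loop stands in for strbuf_rtrim(), and
the standard isspace() is used in place of Git's own.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 only if s ends with LF and that LF
 * is the single byte a right trim would remove.
 */
static int ends_with_single_lf(const char *s)
{
	size_t len = strlen(s);
	size_t trimmed = len;

	/* step (1): the last byte must be LF */
	if (!len || s[len - 1] != '\n')
		return 0;

	/* step (2): trim, then compare the lengths before and after */
	while (trimmed > 0 && isspace((unsigned char)s[trimmed - 1]))
		trimmed--;
	return trimmed == len - 1;
}
```

"refs/heads/master\n" passes, while "refs/heads/master \n",
"refs/heads/master\n " and "refs/heads/master\n\n" do not.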
> + if (check_refname_format(pointee_name->buf, 0)) {
> + /*
> + * When containing null-garbage, "check_refname_format" will
> + * fail, we should trim the "pointee" to check again.
> + */
> + strbuf_rtrim(pointee_name);
> + if (!check_refname_format(pointee_name->buf, 0)) {
> + ret = fsck_report_ref(o, report,
> + FSCK_MSG_TRAILING_REF_CONTENT,
> + "trailing null-garbage");
> + goto out;
> + }
IOW, the above "let's retry" feels totally wrong. You shouldn't
have to do so, and that comes from running check_refname_format()
before rtrimming the pointee_name.
> + ret = fsck_report_ref(o, report,
> + FSCK_MSG_BAD_SYMREF_POINTEE,
> + "points to refname with invalid format");
> + }
Good. With this check, we know that the referent, if it exists, is
well-formed. The contents of the referent will then be checked just
like all other refs that may not be pointed at by any symbolic ref.
> + /*
> + * Missing target should not be treated as any error worthy event and
> + * not even warn. It is a common case that a symbolic ref points to a
> + * ref that does not exist yet. If the target ref does not exist, just
> + * skip the check for the file type.
> + */
> + if (lstat(pointee_path->buf, &st) < 0)
> + goto out;
Good.
> + if (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode)) {
> + ret = fsck_report_ref(o, report,
> + FSCK_MSG_BAD_SYMREF_POINTEE,
> + "points to an invalid file type");
> + goto out;
I do not think it is wrong per se, but I am not sure if this check
is needed, either. When "git fsck" or "git refs verify" is told to
check the loose refs, wouldn't it walk the refs directory and report
such an unusual filesystem entity that is not a regular file,
symbolic link, or a directory as "unusual cruft exists
here"?