From: shejialuo <shejialuo@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Patrick Steinhardt <ps@pks.im>,
Karthik Nayak <karthik.188@gmail.com>,
Michael Haggerty <mhagger@alum.mit.edu>
Subject: Re: [PATCH v2 4/8] packed-backend: add "packed-refs" header consistency check
Date: Fri, 31 Jan 2025 22:23:57 +0800
Message-ID: <Z5zc_QAqYP-Dg4-K@ArchLinux>
In-Reply-To: <xmqq1pwkdt7r.fsf@gitster.g>
On Thu, Jan 30, 2025 at 10:58:32AM -0800, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
>
> > In "packed-backend.c::create_snapshot", if there is a header (the line
> > which starts with '#'), we will check whether the line starts with "#
> > pack-refs with:". As we are going to implement the header consistency
> > check, we should port this check into "packed_fsck".
> >
> > However, the above check is not enough, because "git pack-refs"
> > will always write "PACKED_REFS_HEADER" which is a constant string to the
> > "packed-refs" file. So, we should check the following things for the
> > header.
>
> I haven't done history digging in this area for a while, but we
> should make sure we are not flagging a file that was written in
> an ancient version of Git whose repository is still supported.
>
Understood.
> > 1. If the header does not exist, we may report an error to the user
> > because it should exist, but we do allow no header in "packed-refs"
> > file. So, create a new fsck message "packedRefMissingHeader(INFO)" to
> > warn the user and also keep compatibility.
>
> Are we sure "it should exist"? I think the header did not exist
> before "Git v1.5.0". I didn't check with other reimplementations of
> Git (like jgit or libgit2), but as long as our reading side of the
> runtime allows a packed-refs file without the header without
> complaint, I do not think it is a good idea to treat it as a
> report-worthy event from "git fsck".
>
OK, let me improve this in the next version.
> > 2. If the header content does not start with "# pack-refs with:", we
> > should report an error just like what "create_snapshot" does. So,
> > create a new fsck message "badPackedRefHeader(ERROR)" for this.
>
> This I can agree with. If the first line begins with "#" but not
> with that string (with a trailing SP), that is a sign that it may
> not even be a valid packed-refs file, which is a report-worthy
> event.
>
> > 3. If the header content is not the same as the constant string
> > "PACKED_REFS_HEADER", ideally, we should report an error to the user.
>
> NO. THAT IS NOT IDEAL AT ALL.
>
> The header was written like this:
>
> 	/* perhaps other traits later as well */
> 	fprintf(cbdata.refs_file, "# pack-refs with: peeled \n");
>
> in the older versions of Git before it was made into a separate
> preprocessor macro and lost the comment (the above excerpt is from
> "git show v1.5.0:builtin-pack-refs.c").
>
> Notice "other traits later" in the comment?
>
> The thing is _designed_ to be extensible. In fact, these days we
> support a few more traits
>
> 	static const char PACKED_REFS_HEADER[] =
> 		"# pack-refs with: peeled fully-peeled sorted \n";
>
> (an excerpt from the current refs/packed-backend.c).
>
> Reporting an error when you see something written by an older
> version of Git is far from ideal.
>
Understood. I think we should be consistent with the runtime check.
> > However, we allow other contents as long as the header content starts
> > with "# pack-refs with:". To keep compatibility, create a new fsck
> > message "unknownPackedRefHeader(INFO)" to warn about this. We may
> > tighten this rule in the future.
>
> Whatever we do, what we do with an unknown trait should be in line
> with what the runtime does. If the runtime failed (we do not, but
> this is to illustrate the principle [*]) on a packed-refs file
> without "sorted" trait, noticing that "sorted" is not there and
> flagging as an error is a good thing to do. But if the runtime
> gracefully degrades and sorts the list of refs read from such a
> packed-refs file before continuing, then a packed-refs file that
> lacks the "sorted" trait is not a report-worthy event.
>
Actually, the runtime won't complain about this. I agree with you here.
> I do not offhand recall if we introduced the concept of mandatory vs
> optional traits in the packed-refs part of the system (like we have
> in the index extension subsystem, where a version of Git that
> encounters an unknown *and* mandatory index extension must refuse to
> touch the repository), but if there is a mandatory trait declared in
> the header that our version of Git does not understand, it is a
> report-worthy event that must be flagged with "git refs verify".
>
I don't think any trait in "packed-refs" is mandatory, because I did
some experiments before implementing the code. So we should only check
case 2 here.
> > +static int packed_fsck_ref_header(struct fsck_options *o, const char *start, const char *eol)
> > +{
> > +	const char *err_fmt = NULL;
> > +	int fsck_msg_id = -1;
> > +
> > +	if (!starts_with(start, "# pack-refs with:")) {
> > +		err_fmt = "'%.*s' does not start with '# pack-refs with:'";
> > +		fsck_msg_id = FSCK_MSG_BAD_PACKED_REF_HEADER;
> > +	} else if (strncmp(start, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER))) {
> > +		err_fmt = "'%.*s' is an unknown packed-refs header";
> > +		fsck_msg_id = FSCK_MSG_UNKNOWN_PACKED_REF_HEADER;
> > +	}
>
> As I outlined above, this is totally unacceptable.
>
> Inspecting the header is good, but if this code claims to be a
> checker, it should do at least what the runtime does, i.e. parse the
> header to tell what traits the packed-file declares, not just
> assuming that it is a fixed string. And error on unknown trait(s)
> if they are mandatory (if such a concept is implemented in the
> runtime reading side). Informing on an unknown and optional
> trait(s) I can live with, but personally I wouldn't recommend it.
>
Got it, I don't want to report unknown trait(s) either.
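
To make the agreed behaviour concrete, here is a minimal sketch (not
the actual patch) of a header check that only does what the runtime
does for case 2: it requires the "# pack-refs with: " prefix, including
the trailing SP, and stays silent about whichever traits follow. The
function name `check_packed_refs_header` and the `enum header_check`
result codes are hypothetical and only stand in for the real fsck
reporting.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; the real code would report an fsck message. */
enum header_check {
	HEADER_OK = 0,
	HEADER_BAD = 1,
};

/*
 * Check one header line given as the half-open range [start, eol).
 * Only the prefix is verified. The traits after it are deliberately
 * not compared against any fixed list, so headers written by older
 * versions of Git (for example with only "peeled") are accepted.
 */
static enum header_check check_packed_refs_header(const char *start,
						  const char *eol)
{
	static const char prefix[] = "# pack-refs with: ";
	size_t prefix_len = sizeof(prefix) - 1;
	size_t line_len = (size_t)(eol - start);

	if (line_len < prefix_len || memcmp(start, prefix, prefix_len))
		return HEADER_BAD;
	return HEADER_OK;
}
```

With this shape, both the v1.5.0 header and the current one pass, and
only a line that starts with '#' but lacks the prefix is flagged.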
> In other words, report loudly if it is an error, but otherwise stay
> silent if we know we tolerate it well.
>
Thanks for this suggestion.
> > +static int packed_fsck_ref_content(struct fsck_options *o,
> > +				   const char *start, const char *eof)
> > +{
> > +	struct strbuf packed_entry = STRBUF_INIT;
> > +	int line_number = 1;
>
> We limit ourselves with about 1 billion refs in the packed-refs
> file, which may be plenty,
Let me change this to `size_t`. This would be better.
> but I do not quite understand the use of
> this variable. There is no loop inside this so ...
>
The reason I define this variable is that I am going to use a loop to
check each entry in the next patch.
> > +	const char *eol;
> > +	int ret = 0;
> > +
> > +	strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
>
> ... this is always line #1, and then
>
> > +	ret |= packed_fsck_ref_next_line(o, &packed_entry, start, eof, &eol);
> > +	if (*start == '#') {
> > +		ret |= packed_fsck_ref_header(o, start, eol);
> > +
> > +		start = eol + 1;
> > +		line_number++;
>
> ... it may be incremented, but upon returning from the function, it
> is lost.
>
> Perhaps you wanted to make it a function-scope static, but then you
> are allowed to read one single packed-refs file during the life of
> your process before you exit, which I am not sure is what you want?
>
Actually, what I want is to use this variable when looping over each
ref entry in the "packed-refs" file.
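
As a rough illustration of that loop (a sketch only, not the code from
the next patch), the counter can live in the function that walks the
buffer line by line, using `size_t` as mentioned above. The helper name
`count_packed_refs_lines` is hypothetical; a real checker would inspect
each line and use the counter in its report instead of just returning
it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Walk the buffer [start, eof) one line at a time and return how many
 * lines were seen. The counter is local to the loop that needs it, so
 * nothing is lost on return and no function-scope static is required.
 */
static size_t count_packed_refs_lines(const char *start, const char *eof)
{
	size_t line_number = 0;

	while (start < eof) {
		const char *eol = memchr(start, '\n', (size_t)(eof - start));

		line_number++;
		/* A real checker would examine [start, eol) here. */
		if (!eol)
			break; /* final line without a trailing newline */
		start = eol + 1;
	}
	return line_number;
}
```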
> > +	} else {
> > +		struct fsck_ref_report report = { 0 };
> > +		report.path = "packed-refs";
> > +
> > +		ret |= fsck_report_ref(o, &report,
> > +				       FSCK_MSG_PACKED_REF_MISSING_HEADER,
> > +				       "missing header line");
> > +	}
> > +
> > +	strbuf_release(&packed_entry);
> > +	return ret;
> > +}
Thanks,
Jialuo