From: shejialuo <shejialuo@gmail.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Karthik Nayak <karthik.188@gmail.com>,
Junio C Hamano <gitster@pobox.com>,
Michael Haggerty <mhagger@alum.mit.edu>
Subject: Re: [PATCH 04/10] packed-backend: add "packed-refs" header consistency check
Date: Fri, 17 Jan 2025 22:23:06 +0800
Message-ID: <Z4pnyhF2V2ykuHlg@ArchLinux>
In-Reply-To: <Z4kQUb7og2Ce1iCo@pks.im>
On Thu, Jan 16, 2025 at 02:57:37PM +0100, Patrick Steinhardt wrote:
> On Sun, Jan 05, 2025 at 09:49:37PM +0800, shejialuo wrote:
> > Add a new flag "safe_object_check" in "fsck_options", when there is
> > anything wrong with the parsing process, set this flag to 0 to avoid
> > checking objects in the later checks.
>
> Okay, I understand the motivation: a corrupted refdb may be completely
> bogus, so checking its objects may not be sensible.
>
> For one of the preceding commits I made the suggestion to split out the
> object checks into a generic part instead, as they aren't specific to
> the backend. With such a scheme we could adapt the logic to first do the
> backend-specific checks for the format, and only in case the backend
> looks sane to us we'd execute those generic checks for that specific
> backend. That'd allow us to get rid of the "safe object check" flag.
>
Yes, I agree with you here. I will not touch this topic in the next
version, so that this patch can concentrate on the "packed-refs" format.
> > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > index d9eb2f8b71..3b11abe5f8 100644
> > --- a/refs/packed-backend.c
> > +++ b/refs/packed-backend.c
> > @@ -1748,12 +1748,100 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
> >  	return empty_ref_iterator_begin();
> >  }
> >
> > +static int packed_fsck_ref_next_line(struct fsck_options *o,
> > +				     int line_number, const char *start,
> > +				     const char *eof, const char **eol)
> > +{
> > +	int ret = 0;
> > +
> > +	*eol = memchr(start, '\n', eof - start);
> > +	if (!*eol) {
> > +		struct strbuf packed_entry = STRBUF_INIT;
> > +		struct fsck_ref_report report = { 0 };
> > +
> > +		strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
> > +		report.path = packed_entry.buf;
> > +		ret = fsck_report_ref(o, &report,
> > +				      FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
> > +				      "'%.*s' is not terminated with a newline",
> > +				      (int)(eof - start), start);
> > +
> > +		/*
> > +		 * There is no newline but we still want to parse it to the end of
> > +		 * the buffer.
> > +		 */
> > +		*eol = eof;
>
> I don't quite understand. We've figured out that there isn't a newline,
> so wouldn't that mean that we _are_ at the end of the buffer already?
>
In the "packed-refs" file, the last line should end with a newline; if it
does not, that is a fatal error. The reason I set "eol" here is that each
line is checked by passing "line_start" and "eol" to the checks. Without
a newline, "eol" would be NULL, so I set it to "eof" to make sure the
later code can follow the same logic as when a newline is found.
I guess I should not handle this in this function, because doing so is
confusing. I will improve this in the next version.
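To make the intent concrete, here is a minimal standalone sketch of the
idea, not the code from the patch. The helper name "find_line_end" is
hypothetical, and the sketch leaves out all fsck reporting: it only shows
how "eol" can be made non-NULL for every line, so that callers can always
work on the range [start, eol).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): locate the end of the
 * line that begins at `start` inside the buffer [start, eof).
 *
 * Returns 1 if the line is terminated by '\n' and 0 if it runs to the
 * end of the buffer. In both cases *eol is non-NULL afterwards.
 */
static int find_line_end(const char *start, const char *eof, const char **eol)
{
	assert(start <= eof);

	*eol = memchr(start, '\n', (size_t)(eof - start));
	if (*eol)
		return 1;

	/* No newline: treat the end of the buffer as the end of the line. */
	*eol = eof;
	return 0;
}
```

A caller could treat a return value of 0 as the "not terminated with a
newline" condition and still inspect the bytes up to "eol".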
> > +		strbuf_release(&packed_entry);
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +static int packed_fsck_ref_header(struct fsck_options *o, const char *start, const char *eol)
> > +{
> > +	const char *err_fmt = NULL;
> > +	int fsck_msg_id = -1;
> > +
> > +	if (!starts_with(start, "# pack-refs with:")) {
> > +		err_fmt = "'%.*s' does not start with '# pack-refs with:'";
> > +		fsck_msg_id = FSCK_MSG_BAD_PACKED_REF_HEADER;
> > +	} else if (strncmp(start, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER))) {
> > +		err_fmt = "'%.*s' is not the official packed-refs header";
>
> I wouldn't say "official", because it could totally be that whatever is
> official changes in the future, e.g. when a new format is introduced.
> Unlikely to happen, but saying "unknown packed-refs header" might be a
> bit more future proof.
>
I will improve this in the next version.
> > +		fsck_msg_id = FSCK_MSG_UNKNOWN_PACKED_REF_HEADER;
> > +	}
> > +
> > +	if (err_fmt && fsck_msg_id >= 0) {
> > +		struct fsck_ref_report report = { 0 };
> > +		report.path = "packed-refs.header";
> > +
> > +		return fsck_report_ref(o, &report, fsck_msg_id, err_fmt,
> > +				       (int)(eol - start), start);
> > +
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int packed_fsck_ref_content(struct fsck_options *o,
> > +				   const char *start, const char *eof)
> > +{
> > +	int line_number = 1;
> > +	const char *eol;
> > +	int ret = 0;
> > +
> > +	ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
> > +	if (*start == '#') {
> > +		ret |= packed_fsck_ref_header(o, start, eol);
> > +
> > +		start = eol + 1;
> > +		line_number++;
>
> The header can only appear at the beginning of the file, can't it? But
> we accept it in every line here. We should likely verify that it's
> actually a header and not a line at some random place.
>
Yes, but we don't accept it in every line. Here we only fetch the "start"
and "eol" of the first line by calling "packed_fsck_ref_next_line", and
we check the header consistency only if that first line starts with "#".
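The first-line-only behaviour can be sketched as follows. This is an
illustrative standalone example rather than the patch itself: the enum
and the function "classify_first_line" are hypothetical names, it checks
only the "# pack-refs with:" prefix, and unlike the patch the comparison
is bounded by "eol" instead of relying on starts_with().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes, used only in this sketch. */
enum header_status {
	HEADER_OK,      /* first line starts with "# pack-refs with:" */
	HEADER_BAD,     /* first line starts with '#' but has another prefix */
	HEADER_MISSING  /* first line is not a '#' line at all */
};

/*
 * Classify the first line [start, eol) of a "packed-refs" buffer.
 * Later lines are never passed to this function, so a '#' line in the
 * middle of the file is not treated as a header here.
 */
static enum header_status classify_first_line(const char *start,
					      const char *eol)
{
	static const char prefix[] = "# pack-refs with:";
	size_t prefix_len = sizeof(prefix) - 1;
	size_t len;

	assert(start <= eol);
	len = (size_t)(eol - start);

	if (!len || *start != '#')
		return HEADER_MISSING;
	if (len < prefix_len || memcmp(start, prefix, prefix_len))
		return HEADER_BAD;
	return HEADER_OK;
}
```

The caller would run this once on the first line and then continue with
the regular entry checks for every following line.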
> > +	} else {
> > +		struct fsck_ref_report report = { 0 };
> > +		report.path = "packed-refs";
> > +
> > +		ret |= fsck_report_ref(o, &report,
> > +				       FSCK_MSG_PACKED_REF_MISSING_HEADER,
> > +				       "missing header line");
> > +	}
> > +
> > +	/*
> > +	 * If there is anything wrong during the parsing of the "packed-refs"
> > +	 * file, we should not check the object of the refs.
> > +	 */
> > +	if (ret)
> > +		o->safe_object_check = 0;
> > +
> > +
> > +	return ret;
> > +}
> > +
> >  static int packed_fsck(struct ref_store *ref_store,
> >  		       struct fsck_options *o,
> >  		       struct worktree *wt)
> >  {
> >  	struct packed_ref_store *refs = packed_downcast(ref_store,
> >  							REF_STORE_READ, "fsck");
> > +	struct strbuf packed_ref_content = STRBUF_INIT;
> >  	struct stat st;
> >  	int ret = 0;
> >
> > @@ -1779,7 +1867,24 @@ static int packed_fsck(struct ref_store *ref_store,
> >  		goto cleanup;
> >  	}
> >
> > +	if (strbuf_read_file(&packed_ref_content, refs->path, 0) < 0) {
> > +		/*
> > +		 * Although we have checked that the file exists, there is a possibility
> > +		 * that it has been removed between the lstat() and the read attempt by
> > +		 * another process. In that case, we should not report an error.
> > +		 */
> > +		if (errno == ENOENT)
> > +			goto cleanup;
>
> Unlikely, but good to guard us against that condition regardless. It's
> still not entirely race-free though because the file could meanwhile
> have changed into a symlink, and we wouldn't notice now. We could fix
> that by using open(O_NOFOLLOW), fstat the returned file descriptor and
> then use `strbuf_read()` to slurp in the file.
>
Wouldn't this be too complicated? Avoiding the race this way would
introduce a lot of code. And even then, the file could still be changed
into a symlink right after we have finished reading its content into
memory, and we would not notice. So I would say we cannot avoid the race
condition entirely. It would be good to avoid races, but my concern is
that this would make the logic too complicated. So, could we leave it
unchanged?
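For reference, the sequence suggested above (open without following
symlinks, fstat the descriptor, then read) could look roughly like the
standalone sketch below. It is only an illustration under stated
assumptions: "read_regular_nofollow" is a hypothetical helper, it uses a
plain read() loop instead of git's strbuf_read(), and it assumes a POSIX
system where O_NOFOLLOW is available.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a regular file without following a symlink
 * in the final path component. Returns a malloc'd, NUL-terminated
 * buffer and stores its length in *len, or returns NULL with errno set.
 */
static char *read_regular_nofollow(const char *path, size_t *len)
{
	struct stat st;
	char *buf = NULL;
	size_t used = 0, cap = 0;
	int saved_errno;
	int fd = open(path, O_RDONLY | O_NOFOLLOW);

	assert(path && len);
	if (fd < 0)
		return NULL; /* also fails here if `path` is a symlink */

	/* Check the type on the descriptor, not on the path. */
	if (fstat(fd, &st) < 0)
		goto fail;
	if (!S_ISREG(st.st_mode)) {
		errno = EINVAL;
		goto fail;
	}

	for (;;) {
		ssize_t n;

		if (cap - used < 4096) {
			char *tmp;

			cap = cap ? cap * 2 : 8192;
			tmp = realloc(buf, cap + 1); /* +1 for the NUL */
			if (!tmp)
				goto fail;
			buf = tmp;
		}
		n = read(fd, buf + used, cap - used);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			goto fail;
		}
		if (!n)
			break; /* end of file */
		used += (size_t)n;
	}

	buf[used] = '\0';
	*len = used;
	close(fd);
	return buf;

fail:
	saved_errno = errno;
	free(buf);
	close(fd);
	errno = saved_errno;
	return NULL;
}
```

Because the type check is done on the already opened descriptor, a later
replacement of the path does not change which file is read. Whether that
is worth the extra code is the question raised above.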