git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: shejialuo <shejialuo@gmail.com>
Cc: git@vger.kernel.org, Karthik Nayak <karthik.188@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Michael Haggerty <mhagger@alum.mit.edu>
Subject: Re: [PATCH 04/10] packed-backend: add "packed-refs" header consistency check
Date: Thu, 16 Jan 2025 14:57:37 +0100	[thread overview]
Message-ID: <Z4kQUb7og2Ce1iCo@pks.im> (raw)
In-Reply-To: <Z3qN8U2VbZBnUSWj@ArchLinux>

On Sun, Jan 05, 2025 at 09:49:37PM +0800, shejialuo wrote:
> Add a new flag "safe_object_check" in "fsck_options", when there is
> anything wrong with the parsing process, set this flag to 0 to avoid
> checking objects in the later checks.

Okay, I understand the motivation: a corrupted refdb may be completely
bogus, so checking its objects may not be sensible.

For one of the preceding commits I made the suggestion to split out the
object checks into a generic part instead, as they aren't specific to
the backend. With such a scheme we could adapt the logic to first do the
backend-specific checks for the format, and only in case the backend
looks sane to us we'd execute those generic checks for that specific
backend. That'd allow us to get rid of the "safe object check" flag.

> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index d9eb2f8b71..3b11abe5f8 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1748,12 +1748,100 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
>  	return empty_ref_iterator_begin();
>  }
>  
> +static int packed_fsck_ref_next_line(struct fsck_options *o,
> +				     int line_number, const char *start,
> +				     const char *eof, const char **eol)
> +{
> +	int ret = 0;
> +
> +	*eol = memchr(start, '\n', eof - start);
> +	if (!*eol) {
> +		struct strbuf packed_entry = STRBUF_INIT;
> +		struct fsck_ref_report report = { 0 };
> +
> +		strbuf_addf(&packed_entry, "packed-refs line %d", line_number);
> +		report.path = packed_entry.buf;
> +		ret = fsck_report_ref(o, &report,
> +				      FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
> +				      "'%.*s' is not terminated with a newline",
> +				      (int)(eof - start), start);
> +
> +		/*
> +		 * There is no newline but we still want to parse it to the end of
> +		 * the buffer.
> +		 */
> +		*eol = eof;

I don't quite understand. We've figured out that there isn't a newline,
so wouldn't that mean that we _are_ at the end of the buffer already?

> +		strbuf_release(&packed_entry);
> +	}
> +
> +	return ret;
> +}
> +
> +static int packed_fsck_ref_header(struct fsck_options *o, const char *start, const char *eol)
> +{
> +	const char *err_fmt = NULL;
> +	int fsck_msg_id = -1;
> +
> +	if (!starts_with(start, "# pack-refs with:")) {
> +		err_fmt = "'%.*s' does not start with '# pack-refs with:'";
> +		fsck_msg_id = FSCK_MSG_BAD_PACKED_REF_HEADER;
> +	} else if (strncmp(start, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER))) {
> +		err_fmt = "'%.*s' is not the official packed-refs header";

I wouldn't say "official", because it could totally be that whatever is
official changes in the future, e.g. when a new format is introduced.
Unlikely to happen, but saying "unknown packed-refs header" might be a
bit more future proof.

> +		fsck_msg_id = FSCK_MSG_UNKNOWN_PACKED_REF_HEADER;
> +	}
> +
> +	if (err_fmt && fsck_msg_id >= 0) {
> +		struct fsck_ref_report report = { 0 };
> +		report.path = "packed-refs.header";
> +
> +		return fsck_report_ref(o, &report, fsck_msg_id, err_fmt,
> +				       (int)(eol - start), start);
> +
> +	}
> +
> +	return 0;
> +}
> +
> +static int packed_fsck_ref_content(struct fsck_options *o,
> +				   const char *start, const char *eof)
> +{
> +	int line_number = 1;
> +	const char *eol;
> +	int ret = 0;
> +
> +	ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
> +	if (*start == '#') {
> +		ret |= packed_fsck_ref_header(o, start, eol);
> +
> +		start = eol + 1;
> +		line_number++;

The header can only appear at the beginning of the file, can't it? But
we accept it in every line here. We should likely verify that it's
actually a header and not a line at some random place.

> +	} else {
> +		struct fsck_ref_report report = { 0 };
> +		report.path = "packed-refs";
> +
> +		ret |= fsck_report_ref(o, &report,
> +				       FSCK_MSG_PACKED_REF_MISSING_HEADER,
> +				       "missing header line");
> +	}
> +
> +	/*
> +	 * If there is anything wrong during the parsing of the "packed-refs"
> +	 * file, we should not check the object of the refs.
> +	 */
> +	if (ret)
> +		o->safe_object_check = 0;
> +
> +
> +	return ret;
> +}
> +
>  static int packed_fsck(struct ref_store *ref_store,
>  		       struct fsck_options *o,
>  		       struct worktree *wt)
>  {
>  	struct packed_ref_store *refs = packed_downcast(ref_store,
>  							REF_STORE_READ, "fsck");
> +	struct strbuf packed_ref_content = STRBUF_INIT;
>  	struct stat st;
>  	int ret = 0;
>  
> @@ -1779,7 +1867,24 @@ static int packed_fsck(struct ref_store *ref_store,
>  		goto cleanup;
>  	}
>  
> +	if (strbuf_read_file(&packed_ref_content, refs->path, 0) < 0) {
> +		/*
> +		 * Although we have checked that the file exists, there is a possibility
> +		 * that it has been removed between the lstat() and the read attempt by
> +		 * another process. In that case, we should not report an error.
> +		 */
> +		if (errno == ENOENT)
> +			goto cleanup;

Unlikely, but good to guard us against that condition regardless. It's
still not entirely race-free though because the file could meanwhile
have changed into a symlink, and we wouldn't notice now. We could fix
that by using open(O_NOFOLLOW), fstat the returne file descriptor and
then use `strbuf_read()` to slurp in the file.

> +		ret = error_errno("could not read %s", refs->path);
> +		goto cleanup;
> +	}
> +
> +	ret = packed_fsck_ref_content(o, packed_ref_content.buf,
> +				      packed_ref_content.buf + packed_ref_content.len);
> +
>  cleanup:
> +	strbuf_release(&packed_ref_content);
>  	return ret;
>  }
>  
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index 307f94a3ca..6c729e749a 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -646,4 +646,48 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
>  	test_cmp expect err
>  '
>  
> +test_expect_success 'packed-refs header should be checked' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	cd repo &&

The same comment applies here as on a preceding test: cd should be
executed in a subshell.

Patrick

  parent reply	other threads:[~2025-01-16 13:57 UTC|newest]

Thread overview: 168+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-05 13:46 [PATCH 00/10] add more ref consistency checks shejialuo
2025-01-05 13:49 ` [PATCH 01/10] files-backend: add object check for regular ref shejialuo
2025-01-07 14:17   ` Karthik Nayak
2025-01-16 13:57   ` Patrick Steinhardt
2025-01-17 13:40     ` shejialuo
2025-01-24  7:54       ` Patrick Steinhardt
2025-01-05 13:49 ` [PATCH 02/10] builtin/refs.h: get worktrees without reading head info shejialuo
2025-01-07 14:57   ` Karthik Nayak
2025-01-07 16:34     ` shejialuo
2025-01-08  8:40       ` Karthik Nayak
2025-01-16 13:57   ` Patrick Steinhardt
2025-01-05 13:49 ` [PATCH 03/10] packed-backend: check whether the "packed-refs" is regular shejialuo
2025-01-07 16:33   ` Karthik Nayak
2025-01-17 14:00     ` shejialuo
2025-01-17 22:01       ` Eric Sunshine
2025-01-18  3:05         ` shejialuo
2025-01-19  8:03         ` Karthik Nayak
2025-01-16 13:57   ` Patrick Steinhardt
2025-01-05 13:49 ` [PATCH 04/10] packed-backend: add "packed-refs" header consistency check shejialuo
2025-01-08  0:54   ` shejialuo
2025-01-16 13:57   ` Patrick Steinhardt [this message]
2025-01-17 14:23     ` shejialuo
2025-01-24  7:51       ` Patrick Steinhardt
2025-02-17 13:16     ` shejialuo
2025-01-05 13:49 ` [PATCH 05/10] packed-backend: check whether the refname contains NULL binaries shejialuo
2025-01-16 13:57   ` Patrick Steinhardt
2025-01-17 14:33     ` shejialuo
2025-01-05 13:49 ` [PATCH 06/10] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-01-16 13:57   ` Patrick Steinhardt
2025-01-17 14:35     ` shejialuo
2025-01-05 13:50 ` [PATCH 07/10] packed-backend: create "fsck_packed_ref_entry" to store parsing info shejialuo
2025-01-16 13:57   ` Patrick Steinhardt
2025-01-05 13:50 ` [PATCH 08/10] packed-backend: add check for object consistency shejialuo
2025-01-16 13:57   ` Patrick Steinhardt
2025-01-05 13:50 ` [PATCH 09/10] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-01-16 13:57   ` Patrick Steinhardt
2025-01-05 13:50 ` [PATCH 10/10] builtin/fsck: add `git refs verify` child process shejialuo
2025-01-06 22:16   ` Junio C Hamano
2025-01-07 12:00     ` shejialuo
2025-01-07 15:52       ` Junio C Hamano
2025-01-30  4:04 ` [PATCH v2 0/8] add more ref consistency checks shejialuo
2025-01-30  4:06   ` [PATCH v2 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
2025-01-30 17:53     ` Junio C Hamano
2025-01-30  4:07   ` [PATCH v2 2/8] builtin/refs: get worktrees without reading head info shejialuo
2025-01-30 18:04     ` Junio C Hamano
2025-01-31 13:29       ` shejialuo
2025-01-31 16:16         ` Junio C Hamano
2025-01-30  4:07   ` [PATCH v2 3/8] packed-backend: check whether the "packed-refs" is regular shejialuo
2025-01-30 18:23     ` Junio C Hamano
2025-01-31 13:54       ` shejialuo
2025-01-31 16:20         ` Junio C Hamano
2025-02-01  9:47           ` shejialuo
2025-02-03 20:15             ` Junio C Hamano
2025-02-04  3:58               ` shejialuo
2025-02-03  8:40     ` Patrick Steinhardt
2025-01-30  4:07   ` [PATCH v2 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
2025-01-30 18:58     ` Junio C Hamano
2025-01-31 14:23       ` shejialuo
2025-01-30  4:07   ` [PATCH v2 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
2025-02-03  8:40     ` Patrick Steinhardt
2025-02-05 10:09       ` shejialuo
2025-01-30  4:07   ` [PATCH v2 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-02-03  8:40     ` Patrick Steinhardt
2025-02-04  4:28       ` shejialuo
2025-01-30  4:08   ` [PATCH v2 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-01-30 19:02     ` Junio C Hamano
2025-01-31 14:35       ` shejialuo
2025-01-31 16:23         ` Junio C Hamano
2025-02-01  9:50           ` shejialuo
2025-02-03  8:40         ` Patrick Steinhardt
2025-02-03  8:40     ` Patrick Steinhardt
2025-01-30  4:08   ` [PATCH v2 8/8] builtin/fsck: add `git refs verify` child process shejialuo
2025-01-30 19:03     ` Junio C Hamano
2025-01-31 14:37       ` shejialuo
2025-02-03  8:40     ` Patrick Steinhardt
2025-02-04  5:32       ` shejialuo
2025-02-06  5:56   ` [PATCH v3 0/8] add more ref consistency checks shejialuo
2025-02-06  5:58     ` [PATCH v3 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-06  5:58     ` [PATCH v3 2/8] builtin/refs: get worktrees without reading head information shejialuo
2025-02-06  5:58     ` [PATCH v3 3/8] packed-backend: check whether the "packed-refs" is regular file shejialuo
2025-02-06  5:59     ` [PATCH v3 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
2025-02-12  9:56       ` Patrick Steinhardt
2025-02-12 10:12         ` shejialuo
2025-02-12 17:48         ` Junio C Hamano
2025-02-14  3:53           ` shejialuo
2025-02-06  5:59     ` [PATCH v3 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
2025-02-06  5:59     ` [PATCH v3 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-02-12  9:56       ` Patrick Steinhardt
2025-02-12 10:18         ` shejialuo
2025-02-06  5:59     ` [PATCH v3 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-02-12  9:56       ` Patrick Steinhardt
2025-02-12 10:20         ` shejialuo
2025-02-12 10:42           ` Patrick Steinhardt
2025-02-12 10:56         ` shejialuo
2025-02-06  6:00     ` [PATCH v3 8/8] builtin/fsck: add `git refs verify` child process shejialuo
2025-02-12  9:56       ` Patrick Steinhardt
2025-02-12 10:21         ` shejialuo
2025-02-14  4:50     ` [PATCH v4 0/8] add more ref consistency checks shejialuo
2025-02-14  4:51       ` [PATCH v4 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-14  4:52       ` [PATCH v4 2/8] builtin/refs: get worktrees without reading head information shejialuo
2025-02-14  9:19         ` Karthik Nayak
2025-02-14 12:20           ` shejialuo
2025-02-14  4:52       ` [PATCH v4 3/8] packed-backend: check whether the "packed-refs" is regular file shejialuo
2025-02-14  9:50         ` Karthik Nayak
2025-02-14 12:37           ` shejialuo
2025-02-14  4:52       ` [PATCH v4 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
2025-02-14 10:30         ` Karthik Nayak
2025-02-14 12:43           ` shejialuo
2025-02-14 14:01         ` Junio C Hamano
2025-02-14  4:52       ` [PATCH v4 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
2025-02-14  4:53       ` [PATCH v4 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-02-14  4:59       ` [PATCH v4 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-02-14  4:59       ` [PATCH v4 8/8] builtin/fsck: add `git refs verify` child process shejialuo
2025-02-14  9:04       ` [PATCH v4 0/8] add more ref consistency checks Karthik Nayak
2025-02-14 12:16         ` shejialuo
2025-02-17 15:25       ` [PATCH v5 " shejialuo
2025-02-17 15:27         ` [PATCH v5 1/8] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-17 15:27         ` [PATCH v5 2/8] builtin/refs: get worktrees without reading head information shejialuo
2025-02-25  8:26           ` Patrick Steinhardt
2025-02-17 15:27         ` [PATCH v5 3/8] packed-backend: check whether the "packed-refs" is regular file shejialuo
2025-02-25  8:27           ` Patrick Steinhardt
2025-02-17 15:27         ` [PATCH v5 4/8] packed-backend: add "packed-refs" header consistency check shejialuo
2025-02-25  8:27           ` Patrick Steinhardt
2025-02-25 12:34             ` shejialuo
2025-02-17 15:27         ` [PATCH v5 5/8] packed-backend: check whether the refname contains NUL characters shejialuo
2025-02-17 15:28         ` [PATCH v5 6/8] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-02-17 15:28         ` [PATCH v5 7/8] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-02-17 15:28         ` [PATCH v5 8/8] builtin/fsck: add `git refs verify` child process shejialuo
2025-02-25  8:27         ` [PATCH v5 0/8] add more ref consistency checks Patrick Steinhardt
2025-02-25 13:19         ` [PATCH v6 0/9] " shejialuo
2025-02-25 13:21           ` [PATCH v6 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-25 13:21           ` [PATCH v6 2/9] builtin/refs: get worktrees without reading head information shejialuo
2025-02-25 13:21           ` [PATCH v6 3/9] packed-backend: check whether the "packed-refs" is regular file shejialuo
2025-02-25 17:44             ` Junio C Hamano
2025-02-26 12:05               ` shejialuo
2025-02-25 13:21           ` [PATCH v6 4/9] packed-backend: check if header starts with "# pack-refs with: " shejialuo
2025-02-26  8:08             ` Patrick Steinhardt
2025-02-26 12:28               ` shejialuo
2025-02-25 13:21           ` [PATCH v6 5/9] packed-backend: add "packed-refs" header consistency check shejialuo
2025-02-25 13:21           ` [PATCH v6 6/9] packed-backend: check whether the refname contains NUL characters shejialuo
2025-02-25 13:22           ` [PATCH v6 7/9] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-02-25 13:22           ` [PATCH v6 8/9] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-02-25 13:22           ` [PATCH v6 9/9] builtin/fsck: add `git refs verify` child process shejialuo
2025-02-26 13:48           ` [PATCH v7 0/9] add more ref consistency checks shejialuo
2025-02-26 13:49             ` [PATCH v7 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-26 13:49             ` [PATCH v7 2/9] builtin/refs: get worktrees without reading head information shejialuo
2025-02-26 13:49             ` [PATCH v7 3/9] packed-backend: check whether the "packed-refs" is regular file shejialuo
2025-02-26 18:36               ` Junio C Hamano
2025-02-27  0:57                 ` shejialuo
2025-02-27 14:10                   ` Patrick Steinhardt
2025-02-27 16:57                   ` Junio C Hamano
2025-02-28  5:02                     ` shejialuo
2025-02-26 13:50             ` [PATCH v7 4/9] packed-backend: check if header starts with "# pack-refs with: " shejialuo
2025-02-26 13:50             ` [PATCH v7 5/9] packed-backend: add "packed-refs" header consistency check shejialuo
2025-02-26 13:50             ` [PATCH v7 6/9] packed-backend: check whether the refname contains NUL characters shejialuo
2025-02-26 13:50             ` [PATCH v7 7/9] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-02-26 13:50             ` [PATCH v7 8/9] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-02-26 13:51             ` [PATCH v7 9/9] builtin/fsck: add `git refs verify` child process shejialuo
2025-02-27 16:03             ` [PATCH v8 0/9] add more ref consistency checks shejialuo
2025-02-27 16:05               ` [PATCH v8 1/9] t0602: use subshell to ensure working directory unchanged shejialuo
2025-02-27 16:06               ` [PATCH v8 2/9] builtin/refs: get worktrees without reading head information shejialuo
2025-02-27 16:06               ` [PATCH v8 3/9] packed-backend: check whether the "packed-refs" is regular file shejialuo
2025-02-27 16:06               ` [PATCH v8 4/9] packed-backend: check if header starts with "# pack-refs with: " shejialuo
2025-02-27 16:06               ` [PATCH v8 5/9] packed-backend: add "packed-refs" header consistency check shejialuo
2025-02-27 16:07               ` [PATCH v8 6/9] packed-backend: check whether the refname contains NUL characters shejialuo
2025-02-27 16:07               ` [PATCH v8 7/9] packed-backend: add "packed-refs" entry consistency check shejialuo
2025-02-27 16:07               ` [PATCH v8 8/9] packed-backend: check whether the "packed-refs" is sorted shejialuo
2025-02-27 16:07               ` [PATCH v8 9/9] builtin/fsck: add `git refs verify` child process shejialuo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z4kQUb7og2Ce1iCo@pks.im \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=karthik.188@gmail.com \
    --cc=mhagger@alum.mit.edu \
    --cc=shejialuo@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).