From: Taylor Blau <me@ttaylorr.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com,
Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH 1/4] fsck: create scaffolding for rev-index checks
Date: Mon, 17 Apr 2023 18:20:00 -0400 [thread overview]
Message-ID: <ZD3GEIsPptxRetUV@nand.local> (raw)
In-Reply-To: <6bc3c56453ee2d0263210c233dbc946b5dbdcb4d.1681748502.git.gitgitgadget@gmail.com>
On Mon, Apr 17, 2023 at 04:21:38PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
>
> The 'fsck' builtin checks many of Git's on-disk data structures, but
> does not currently validate the pack rev-index files (a .rev file to
> pair with a .pack and .idx file).
>
> Before doing a more-involved check process, create the scaffolding
> within builtin/fsck.c to have a new error type and add that error type
> when the API method verify_pack_revindex() returns an error. That method
> does nothing currently, but we will add checks to it in later changes.
>
> For now, check that 'git fsck' succeeds without any errors in the normal
> case. Future checks will be paired with tests that corrupt the .rev file
> appropriately.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
> builtin/fsck.c | 30 ++++++++++++++++++++++++++++++
> pack-revindex.c | 11 +++++++++++
> pack-revindex.h | 8 ++++++++
> t/t5325-reverse-index.sh | 14 ++++++++++++++
> 4 files changed, 63 insertions(+)
>
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index 095b39d3980..2ab78129bde 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -24,6 +24,7 @@
> #include "resolve-undo.h"
> #include "run-command.h"
> #include "worktree.h"
> +#include "pack-revindex.h"
>
> #define REACHABLE 0x0001
> #define SEEN 0x0002
> @@ -53,6 +54,7 @@ static int name_objects;
> #define ERROR_REFS 010
> #define ERROR_COMMIT_GRAPH 020
> #define ERROR_MULTI_PACK_INDEX 040
> +#define ERROR_PACK_REV_INDEX 0100
>
> static const char *describe_object(const struct object_id *oid)
> {
> @@ -856,6 +858,32 @@ static int mark_packed_for_connectivity(const struct object_id *oid,
> return 0;
> }
>
> +static int check_pack_rev_indexes(struct repository *r, int show_progress)
> +{
> + struct progress *progress = NULL;
> + uint32_t pack_count = 0;
> + int res = 0;
> +
> + if (show_progress) {
> + for (struct packed_git *p = get_all_packs(the_repository); p; p = p->next)
It's going to take me a while to get used to these declarations inside
of for-loops!
> + pack_count++;
> + progress = start_delayed_progress("Verifying reverse pack-indexes", pack_count);
I wonder if we want to count over the sum of objects in packs rather
than the number of packs themselves. My worry would be that a rather
large pack would make it appear as if nothing is happening when in
reality we're just churning through a lot of objects.
> + pack_count = 0;
> + }
> +
> + for (struct packed_git *p = get_all_packs(the_repository); p; p = p->next) {
> + if (!load_pack_revindex(the_repository, p) &&
I was going to comment that I wasn't sure if `load_pack_revindex()` was
the right thing here, since we don't care about validating the
on-the-fly reverse indexes that we generate.
But I see in your 3/4 that you are comparing the values on disk to those
in memory, which is very nice.
> + verify_pack_revindex(p)) {
Inside of verify_pack_revindex(), it says that a negative number is
returned on error. Do we care about disambiguating >= 0 here? IOW,
should this be:
if (!load_pack_revindex(the_repository, p) || verify_pack_revindex(p) < 0)
?
All looking good otherwise.
Thanks,
Taylor
next prev parent reply other threads:[~2023-04-17 22:20 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-17 16:21 [PATCH 0/4] git fsck: check pack rev-index files Derrick Stolee via GitGitGadget
2023-04-17 16:21 ` [PATCH 1/4] fsck: create scaffolding for rev-index checks Derrick Stolee via GitGitGadget
2023-04-17 22:20 ` Taylor Blau [this message]
2023-04-17 16:21 ` [PATCH 2/4] fsck: check rev-index checksums Derrick Stolee via GitGitGadget
2023-04-17 22:15 ` Junio C Hamano
2023-04-18 14:24 ` Derrick Stolee
2023-04-17 22:24 ` Taylor Blau
2023-04-18 14:27 ` Derrick Stolee
2023-04-18 14:51 ` Taylor Blau
2023-04-18 14:57 ` Derrick Stolee
2023-04-18 15:03 ` Taylor Blau
2023-04-17 16:21 ` [PATCH 3/4] fsck: check rev-index position values Derrick Stolee via GitGitGadget
2023-04-17 22:01 ` Junio C Hamano
2023-04-18 14:32 ` Derrick Stolee
2023-04-17 22:52 ` Taylor Blau
2023-04-17 16:21 ` [PATCH 4/4] fsck: validate .rev file header Derrick Stolee via GitGitGadget
2023-04-17 21:37 ` [PATCH 0/4] git fsck: check pack rev-index files Junio C Hamano
2023-04-18 15:23 ` Taylor Blau
2023-04-18 16:59 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZD3GEIsPptxRetUV@nand.local \
--to=me@ttaylorr.com \
--cc=derrickstolee@github.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.