* [PATCH 1/6] commit-graph: factor out chain opening function
2023-09-26 5:54 [PATCH 0/6] some "commit-graph verify" fixes for chains Jeff King
@ 2023-09-26 5:56 ` Jeff King
2023-09-26 5:57 ` [PATCH 2/6] commit-graph: check mixed generation validation when loading chain file Jeff King
` (5 subsequent siblings)
6 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2023-09-26 5:56 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee
The load_commit_graph_chain() function opens the chain file and all of
the slices of graph that it points to. If there is no chain file (which
is a totally normal condition), we return NULL. But if we run into
errors with the chain file or loading the actual graph data, we also
return NULL, and the caller cannot tell the difference.
The caller can check for ENOENT for the unremarkable "no such file"
case. But I'm hesitant to assume that the rest of the function would
never accidentally set errno to ENOENT itself, since it is opening the
slice files (and that would mean the caller fails to notice a real
error).
So let's break this into two functions: one to open the file, and one to
actually load it. This matches the interface we provide for the
non-chain graph file, which will also come in handy in a moment when we
fix some bugs in the "git commit-graph verify" code.
Some notes:
- I've kept the "1 is good, 0 is bad" return convention (and the weird
"fd" out-parameter) used by the matching open_commit_graph()
function and other parts of the commit-graph code. This is unlike
most of the rest of Git (which would just return the fd, with -1 for
error), but it makes sense to stay consistent with the adjacent bits
of the API here.
- The existing chain loading function will quietly return if the file
is too small to hold a single entry. I've retained that behavior
(and explicitly set ENOENT in the opener function) for now, under
the notion that it's probably valid (though I'd imagine unusual) to
have an empty chain file.
There are two small behavior changes here, but I think both are strictly
positive:
1. The original blindly did a stat() before checking if fopen()
succeeded, meaning we were making a pointless extra stat call.
2. We now use fstat() to check the file size. The previous code using
a regular stat() on the pathname meant we could technically race
with somebody updating the chain file, and end up with a size that
does not match what we just opened with fopen(). I doubt anybody
ever hit this in practice, but it may have caused an out-of-bounds
read.
We'll retain the load_commit_graph_chain() function which does both the
open and reading steps (most existing callers do not care about seeing
errors anyway, since loading commit-graphs is optimistic).
Signed-off-by: Jeff King <peff@peff.net>
---
commit-graph.c | 58 +++++++++++++++++++++++++++++++++-----------------
commit-graph.h | 3 +++
2 files changed, 42 insertions(+), 19 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 5e8a3a5085..12cdd9af8e 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -513,31 +513,34 @@ static int add_graph_to_chain(struct commit_graph *g,
return 1;
}
-static struct commit_graph *load_commit_graph_chain(struct repository *r,
- struct object_directory *odb)
+int open_commit_graph_chain(const char *chain_file,
+ int *fd, struct stat *st)
+{
+ *fd = git_open(chain_file);
+ if (*fd < 0)
+ return 0;
+ if (fstat(*fd, st)) {
+ close(*fd);
+ return 0;
+ }
+ if (st->st_size <= the_hash_algo->hexsz) {
+ close(*fd);
+ errno = ENOENT;
+ return 0;
+ }
+ return 1;
+}
+
+struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+ int fd, struct stat *st)
{
struct commit_graph *graph_chain = NULL;
struct strbuf line = STRBUF_INIT;
- struct stat st;
struct object_id *oids;
int i = 0, valid = 1, count;
- char *chain_name = get_commit_graph_chain_filename(odb);
- FILE *fp;
- int stat_res;
+ FILE *fp = xfdopen(fd, "r");
- fp = fopen(chain_name, "r");
- stat_res = stat(chain_name, &st);
- free(chain_name);
-
- if (!fp)
- return NULL;
- if (stat_res ||
- st.st_size <= the_hash_algo->hexsz) {
- fclose(fp);
- return NULL;
- }
-
- count = st.st_size / (the_hash_algo->hexsz + 1);
+ count = st->st_size / (the_hash_algo->hexsz + 1);
CALLOC_ARRAY(oids, count);
prepare_alt_odb(r);
@@ -585,6 +588,23 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r,
return graph_chain;
}
+static struct commit_graph *load_commit_graph_chain(struct repository *r,
+ struct object_directory *odb)
+{
+ char *chain_file = get_commit_graph_chain_filename(odb);
+ struct stat st;
+ int fd;
+ struct commit_graph *g = NULL;
+
+ if (open_commit_graph_chain(chain_file, &fd, &st)) {
+ /* ownership of fd is taken over by load function */
+ g = load_commit_graph_chain_fd_st(r, fd, &st);
+ }
+
+ free(chain_file);
+ return g;
+}
+
/*
* returns 1 if and only if all graphs in the chain have
* corrected commit dates stored in the generation_data chunk.
diff --git a/commit-graph.h b/commit-graph.h
index 5e534f0fcc..3b662fd2b5 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -26,6 +26,7 @@ struct string_list;
char *get_commit_graph_filename(struct object_directory *odb);
char *get_commit_graph_chain_filename(struct object_directory *odb);
int open_commit_graph(const char *graph_file, int *fd, struct stat *st);
+int open_commit_graph_chain(const char *chain_file, int *fd, struct stat *st);
/*
* Given a commit struct, try to fill the commit struct info, including:
@@ -105,6 +106,8 @@ struct commit_graph {
struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
int fd, struct stat *st,
struct object_directory *odb);
+struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+ int fd, struct stat *st);
struct commit_graph *read_commit_graph_one(struct repository *r,
struct object_directory *odb);
--
2.42.0.758.gd56856b565
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 2/6] commit-graph: check mixed generation validation when loading chain file
2023-09-26 5:54 [PATCH 0/6] some "commit-graph verify" fixes for chains Jeff King
2023-09-26 5:56 ` [PATCH 1/6] commit-graph: factor out chain opening function Jeff King
@ 2023-09-26 5:57 ` Jeff King
2023-09-26 5:58 ` [PATCH 3/6] t5324: harmonize sha1/sha256 graph chain corruption Jeff King
` (4 subsequent siblings)
6 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2023-09-26 5:57 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee
In read_commit_graph_one(), we call validate_mixed_generation_chain()
after loading the graph. Even though we don't check the return value,
this has the side effect of clearing the read_generation_data flag,
which is important when working with mixed generation numbers.
But doing this in load_commit_graph_chain_fd_st() makes more sense:
1. We are calling it even when we did not load a chain at all, which
is pointless (you cannot have mixed generations in a single file).
2. For now, all callers load the graph via read_commit_graph_one().
But the point of factoring out the open/load in the previous commit
was to let "commit-graph verify" call them separately. So it needs
to trigger this function as part of the load.
Without this patch, the mixed-generation tests in t5324 would start
failing on "git commit-graph verify" calls, once we switch to using
a separate open/load call there.
Signed-off-by: Jeff King <peff@peff.net>
---
commit-graph.c | 54 +++++++++++++++++++++++++-------------------------
1 file changed, 27 insertions(+), 27 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 12cdd9af8e..8b29c6de24 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -473,6 +473,31 @@ static struct commit_graph *load_commit_graph_v1(struct repository *r,
return g;
}
+/*
+ * returns 1 if and only if all graphs in the chain have
+ * corrected commit dates stored in the generation_data chunk.
+ */
+static int validate_mixed_generation_chain(struct commit_graph *g)
+{
+ int read_generation_data = 1;
+ struct commit_graph *p = g;
+
+ while (read_generation_data && p) {
+ read_generation_data = p->read_generation_data;
+ p = p->base_graph;
+ }
+
+ if (read_generation_data)
+ return 1;
+
+ while (g) {
+ g->read_generation_data = 0;
+ g = g->base_graph;
+ }
+
+ return 0;
+}
+
static int add_graph_to_chain(struct commit_graph *g,
struct commit_graph *chain,
struct object_id *oids,
@@ -581,6 +606,8 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
}
}
+ validate_mixed_generation_chain(graph_chain);
+
free(oids);
fclose(fp);
strbuf_release(&line);
@@ -605,31 +632,6 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r,
return g;
}
-/*
- * returns 1 if and only if all graphs in the chain have
- * corrected commit dates stored in the generation_data chunk.
- */
-static int validate_mixed_generation_chain(struct commit_graph *g)
-{
- int read_generation_data = 1;
- struct commit_graph *p = g;
-
- while (read_generation_data && p) {
- read_generation_data = p->read_generation_data;
- p = p->base_graph;
- }
-
- if (read_generation_data)
- return 1;
-
- while (g) {
- g->read_generation_data = 0;
- g = g->base_graph;
- }
-
- return 0;
-}
-
struct commit_graph *read_commit_graph_one(struct repository *r,
struct object_directory *odb)
{
@@ -638,8 +640,6 @@ struct commit_graph *read_commit_graph_one(struct repository *r,
if (!g)
g = load_commit_graph_chain(r, odb);
- validate_mixed_generation_chain(g);
-
return g;
}
--
2.42.0.758.gd56856b565
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 3/6] t5324: harmonize sha1/sha256 graph chain corruption
2023-09-26 5:54 [PATCH 0/6] some "commit-graph verify" fixes for chains Jeff King
2023-09-26 5:56 ` [PATCH 1/6] commit-graph: factor out chain opening function Jeff King
2023-09-26 5:57 ` [PATCH 2/6] commit-graph: check mixed generation validation when loading chain file Jeff King
@ 2023-09-26 5:58 ` Jeff King
2023-09-26 6:00 ` [PATCH 4/6] commit-graph: detect read errors when verifying graph chain Jeff King
` (3 subsequent siblings)
6 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2023-09-26 5:58 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee
In t5324.20, we corrupt a hex character 60 bytes into the graph chain
file. Since the file consists of two hash identifiers, one per line, the
corruption differs between sha1 and sha256. In a sha1 repository, the
corruption is on the second line, and in a sha256 repository, it is on
the first.
We should of course detect the problem with either line. But as the next
few patches will show (and fix), that is not the case (in fact, we
currently do not exit non-zero for either line!). And while at the end
of our series we'll catch all errors, our intermediate states will have
differing behavior between the two hashes.
Let's make this test behave consistently with either hash by always
corrupting the first line. We'll add additional tests that explicitly
cover the second line as we fix those bugs.
Signed-off-by: Jeff King <peff@peff.net>
---
t/t5324-split-commit-graph.sh | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index 36c4141e67..e335ef87a6 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -316,11 +316,11 @@ test_expect_success 'verify after commit-graph-chain corruption' '
git clone --no-hardlinks . verify-chain &&
(
cd verify-chain &&
- corrupt_file "$graphdir/commit-graph-chain" 60 "G" &&
+ corrupt_file "$graphdir/commit-graph-chain" 30 "G" &&
git commit-graph verify 2>test_err &&
grep -v "^+" test_err >err &&
test_i18ngrep "invalid commit-graph chain" err &&
- corrupt_file "$graphdir/commit-graph-chain" 60 "A" &&
+ corrupt_file "$graphdir/commit-graph-chain" 30 "A" &&
git commit-graph verify 2>test_err &&
grep -v "^+" test_err >err &&
test_i18ngrep "unable to find all commit-graph files" err
--
2.42.0.758.gd56856b565
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 4/6] commit-graph: detect read errors when verifying graph chain
2023-09-26 5:54 [PATCH 0/6] some "commit-graph verify" fixes for chains Jeff King
` (2 preceding siblings ...)
2023-09-26 5:58 ` [PATCH 3/6] t5324: harmonize sha1/sha256 graph chain corruption Jeff King
@ 2023-09-26 6:00 ` Jeff King
2023-09-26 6:03 ` [PATCH 5/6] commit-graph: tighten chain size check Jeff King
` (2 subsequent siblings)
6 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2023-09-26 6:00 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee
Because it's OK to not have a graph file at all, the graph_verify()
function needs to tell the difference between a missing file and a real
error. So when loading a traditional graph file, we call
open_commit_graph() separately from load_commit_graph_chain_fd_st(), and
don't complain if the first one fails with ENOENT.
When the function learned about chain files in 3da4b609bb (commit-graph:
verify chains with --shallow mode, 2019-06-18), we couldn't be as
careful, since the only way to load a chain was with
read_commit_graph_one(), which did both the open/load as a single unit.
So we'll miss errors in chain files we load, thinking instead that there
was just no chain file at all.
Note that we do still report some of these problems to stderr, as the
loading function calls error() and warning(). But we'd exit with a
successful exit code, which is wrong.
We can fix that by using the recently split open/load functions for
chains. That lets us treat the chain file just like a single file with
respect to error handling here.
An existing test (from 3da4b609bb) shows off the problem; we were
expecting "commit-graph verify" to report success, but that makes no
sense. We did not even verify the contents of the graph data, because we
couldn't load it! I don't think this was an intentional exception, but
rather just the test covering what happened to occur.
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/commit-graph.c | 23 ++++++++++++++++-------
t/t5324-split-commit-graph.sh | 4 ++--
2 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index c88389df24..50c15d9bfe 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -69,7 +69,8 @@ static int graph_verify(int argc, const char **argv, const char *prefix)
struct commit_graph *graph = NULL;
struct object_directory *odb = NULL;
char *graph_name;
- int open_ok;
+ char *chain_name;
+ enum { OPENED_NONE, OPENED_GRAPH, OPENED_CHAIN } opened = OPENED_NONE;
int fd;
struct stat st;
int flags = 0;
@@ -102,21 +103,29 @@ static int graph_verify(int argc, const char **argv, const char *prefix)
odb = find_odb(the_repository, opts.obj_dir);
graph_name = get_commit_graph_filename(odb);
- open_ok = open_commit_graph(graph_name, &fd, &st);
- if (!open_ok && errno != ENOENT)
+ chain_name = get_commit_graph_chain_filename(odb);
+ if (open_commit_graph(graph_name, &fd, &st))
+ opened = OPENED_GRAPH;
+ else if (errno != ENOENT)
die_errno(_("Could not open commit-graph '%s'"), graph_name);
+ else if (open_commit_graph_chain(chain_name, &fd, &st))
+ opened = OPENED_CHAIN;
+ else if (errno != ENOENT)
+ die_errno(_("could not open commit-graph chain '%s'"), chain_name);
FREE_AND_NULL(graph_name);
+ FREE_AND_NULL(chain_name);
FREE_AND_NULL(options);
- if (open_ok)
+ if (opened == OPENED_NONE)
+ return 0;
+ else if (opened == OPENED_GRAPH)
graph = load_commit_graph_one_fd_st(the_repository, fd, &st, odb);
else
- graph = read_commit_graph_one(the_repository, odb);
+ graph = load_commit_graph_chain_fd_st(the_repository, fd, &st);
- /* Return failure if open_ok predicted success */
if (!graph)
- return !!open_ok;
+ return 1;
ret = verify_commit_graph(the_repository, graph, flags);
free_commit_graph(graph);
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index e335ef87a6..0ac7bbd1dc 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -317,11 +317,11 @@ test_expect_success 'verify after commit-graph-chain corruption' '
(
cd verify-chain &&
corrupt_file "$graphdir/commit-graph-chain" 30 "G" &&
- git commit-graph verify 2>test_err &&
+ test_must_fail git commit-graph verify 2>test_err &&
grep -v "^+" test_err >err &&
test_i18ngrep "invalid commit-graph chain" err &&
corrupt_file "$graphdir/commit-graph-chain" 30 "A" &&
- git commit-graph verify 2>test_err &&
+ test_must_fail git commit-graph verify 2>test_err &&
grep -v "^+" test_err >err &&
test_i18ngrep "unable to find all commit-graph files" err
)
--
2.42.0.758.gd56856b565
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 5/6] commit-graph: tighten chain size check
2023-09-26 5:54 [PATCH 0/6] some "commit-graph verify" fixes for chains Jeff King
` (3 preceding siblings ...)
2023-09-26 6:00 ` [PATCH 4/6] commit-graph: detect read errors when verifying graph chain Jeff King
@ 2023-09-26 6:03 ` Jeff King
2023-09-26 6:04 ` [PATCH 6/6] commit-graph: report incomplete chains during verification Jeff King
2023-09-28 4:37 ` [PATCH v2 0/6] some "commit-graph verify" fixes for chains Jeff King
6 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2023-09-26 6:03 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee
When we open a commit-graph-chain file, if it's smaller than a single
entry, we just quietly treat that as ENOENT. That make some sense if the
file is truly zero bytes, but it means that "commit-graph verify" will
quietly ignore a file that contains garbage if that garbage happens to
be short.
Instead, let's only simulate ENOENT when the file is truly empty, and
otherwise return EINVAL. The normal graph-loading routines don't care,
but "commit-graph verify" will notice and complain about the difference.
It's not entirely clear to me that the 0-is-ENOENT case actually happens
in real life, so we could perhaps just eliminate this special-case
altogether. But this is how we've always behaved, so I'm preserving it
in the name of backwards compatibility (though again, it really only
matters for "verify", as the regular routines are happy to load what
they can).
Signed-off-by: Jeff King <peff@peff.net>
---
There's a related issue which I didn't fix in this series, which is that
the "load" function does a truncating division of (size / hash_size + 1)
to find the number of entries (the +1 is for the newline). So a
less-than-hash-sized chunk of garbage at the end of the file will be
ignored.
The simplest way to fix that is that we can insist that the size mod
hash_size+1 is 0. But I was wary of being too picky here (e.g., what
about a file that doesn't end with newline) especially in a way that
affected more than just "verify".
commit-graph.c | 10 ++++++++--
t/t5324-split-commit-graph.sh | 12 ++++++++++++
2 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 8b29c6de24..b1d3e5bf94 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -548,9 +548,15 @@ int open_commit_graph_chain(const char *chain_file,
close(*fd);
return 0;
}
- if (st->st_size <= the_hash_algo->hexsz) {
+ if (st->st_size < the_hash_algo->hexsz) {
close(*fd);
- errno = ENOENT;
+ if (!st->st_size) {
+ /* treat empty files the same as missing */
+ errno = ENOENT;
+ } else {
+ warning("commit-graph chain file too small");
+ errno = EINVAL;
+ }
return 0;
}
return 1;
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index 0ac7bbd1dc..86d56debf6 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -327,6 +327,18 @@ test_expect_success 'verify after commit-graph-chain corruption' '
)
'
+test_expect_success 'verify notices too-short chain file' '
+ git clone --no-hardlinks . verify-chain-garbage &&
+ (
+ cd verify-chain-garbage &&
+ git commit-graph verify &&
+ echo "garbage" >$graphdir/commit-graph-chain &&
+ test_must_fail git commit-graph verify 2>test_err &&
+ grep -v "^+" test_err >err &&
+ grep "commit-graph chain file too small" err
+ )
+'
+
test_expect_success 'verify across alternates' '
git clone --no-hardlinks . verify-alt &&
(
--
2.42.0.758.gd56856b565
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 6/6] commit-graph: report incomplete chains during verification
2023-09-26 5:54 [PATCH 0/6] some "commit-graph verify" fixes for chains Jeff King
` (4 preceding siblings ...)
2023-09-26 6:03 ` [PATCH 5/6] commit-graph: tighten chain size check Jeff King
@ 2023-09-26 6:04 ` Jeff King
2023-09-28 4:30 ` Jeff King
2023-09-28 4:37 ` [PATCH v2 0/6] some "commit-graph verify" fixes for chains Jeff King
6 siblings, 1 reply; 16+ messages in thread
From: Jeff King @ 2023-09-26 6:04 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee
The load_commit_graph_chain_fd_st() function will stop loading chains
when it sees an error. But if it has loaded any graph slice at all, it
will return it. This is a good thing for normal use (we use what data we
can, and this is just an optimization). But it's a bad thing for
"commit-graph verify", which should be careful about finding any
irregularities. We do complain to stderr with a warning(), but the
verify command still exits with a successful return code.
The new tests here cover corruption of both the base and tip slices of
the chain. The corruption of the base file already works (it is the
first file we look at, so when we see the error we return NULL). The
"tip" case is what is fixed by this patch (it complains to stderr but
still returns the base slice).
Note that this also causes us to adjust a test later in the file that
similarly corrupts a tip (though confusingly the test script calls this
"base"). It checks stderr but erroneously expects the whole "verify"
command to exit with a successful code.
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/commit-graph.c | 10 +++++++++-
commit-graph.c | 9 +++++++--
commit-graph.h | 3 ++-
t/t5324-split-commit-graph.sh | 28 +++++++++++++++++++++++++++-
4 files changed, 45 insertions(+), 5 deletions(-)
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 50c15d9bfe..f5e66e9969 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -74,6 +74,7 @@ static int graph_verify(int argc, const char **argv, const char *prefix)
int fd;
struct stat st;
int flags = 0;
+ int incomplete_chain = 0;
int ret;
static struct option builtin_commit_graph_verify_options[] = {
@@ -122,13 +123,20 @@ static int graph_verify(int argc, const char **argv, const char *prefix)
else if (opened == OPENED_GRAPH)
graph = load_commit_graph_one_fd_st(the_repository, fd, &st, odb);
else
- graph = load_commit_graph_chain_fd_st(the_repository, fd, &st);
+ graph = load_commit_graph_chain_fd_st(the_repository, fd, &st,
+ &incomplete_chain);
if (!graph)
return 1;
ret = verify_commit_graph(the_repository, graph, flags);
free_commit_graph(graph);
+
+ if (incomplete_chain) {
+ error("one or more commit-graph chain files could not be loaded");
+ ret |= 1;
+ }
+
return ret;
}
diff --git a/commit-graph.c b/commit-graph.c
index b1d3e5bf94..148af50058 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -563,14 +563,17 @@ int open_commit_graph_chain(const char *chain_file,
}
struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
- int fd, struct stat *st)
+ int fd, struct stat *st,
+ int *incomplete_chain)
{
struct commit_graph *graph_chain = NULL;
struct strbuf line = STRBUF_INIT;
struct object_id *oids;
int i = 0, valid = 1, count;
FILE *fp = xfdopen(fd, "r");
+ *incomplete_chain = 0;
+
count = st->st_size / (the_hash_algo->hexsz + 1);
CALLOC_ARRAY(oids, count);
@@ -608,6 +611,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
if (!valid) {
warning(_("unable to find all commit-graph files"));
+ *incomplete_chain = 1;
break;
}
}
@@ -630,8 +634,9 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r,
struct commit_graph *g = NULL;
if (open_commit_graph_chain(chain_file, &fd, &st)) {
+ int incomplete;
/* ownership of fd is taken over by load function */
- g = load_commit_graph_chain_fd_st(r, fd, &st);
+ g = load_commit_graph_chain_fd_st(r, fd, &st, &incomplete);
}
free(chain_file);
diff --git a/commit-graph.h b/commit-graph.h
index 3b662fd2b5..20ada7e891 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -107,7 +107,8 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
int fd, struct stat *st,
struct object_directory *odb);
struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
- int fd, struct stat *st);
+ int fd, struct stat *st,
+ int *incomplete_chain);
struct commit_graph *read_commit_graph_one(struct repository *r,
struct object_directory *odb);
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index 86d56debf6..598660557a 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -285,6 +285,32 @@ test_expect_success 'verify hashes along chain, even in shallow' '
)
'
+test_expect_success 'verify notices chain slice which is bogus (base)' '
+ git clone --no-hardlinks . verify-chain-bogus-base &&
+ (
+ cd verify-chain-bogus-base &&
+ git commit-graph verify &&
+ base_file=$graphdir/graph-$(sed -n 1p $graphdir/commit-graph-chain).graph &&
+ echo "garbage" >$base_file &&
+ test_must_fail git commit-graph verify 2>test_err &&
+ grep -v "^+" test_err >err &&
+ grep "commit-graph file is too small" err
+ )
+'
+
+test_expect_success 'verify notices chain slice which is bogus (tip)' '
+ git clone --no-hardlinks . verify-chain-bogus-tip &&
+ (
+ cd verify-chain-bogus-tip &&
+ git commit-graph verify &&
+ tip_file=$graphdir/graph-$(sed -n 2p $graphdir/commit-graph-chain).graph &&
+ echo "garbage" >$tip_file &&
+ test_must_fail git commit-graph verify 2>test_err &&
+ grep -v "^+" test_err >err &&
+ grep "commit-graph file is too small" err
+ )
+'
+
test_expect_success 'verify --shallow does not check base contents' '
git clone --no-hardlinks . verify-shallow &&
(
@@ -306,7 +332,7 @@ test_expect_success 'warn on base graph chunk incorrect' '
git commit-graph verify &&
base_file=$graphdir/graph-$(tail -n 1 $graphdir/commit-graph-chain).graph &&
corrupt_file "$base_file" $(test_oid base) "\01" &&
- git commit-graph verify --shallow 2>test_err &&
+ test_must_fail git commit-graph verify --shallow 2>test_err &&
grep -v "^+" test_err >err &&
test_i18ngrep "commit-graph chain does not match" err
)
--
2.42.0.758.gd56856b565
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 6/6] commit-graph: report incomplete chains during verification
2023-09-26 6:04 ` [PATCH 6/6] commit-graph: report incomplete chains during verification Jeff King
@ 2023-09-28 4:30 ` Jeff King
0 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2023-09-28 4:30 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee
On Tue, Sep 26, 2023 at 02:04:30AM -0400, Jeff King wrote:
> @@ -608,6 +611,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
>
> if (!valid) {
> warning(_("unable to find all commit-graph files"));
> + *incomplete_chain = 1;
> break;
> }
> }
Reviewing my own patch since I noticed it misses a case. ;)
The whole function here looks like (abbreviated):
int valid = 1;
for (i = 0; i < count; i++) {
if (get_oid_hex(line.buf, &oids[i])) {
warning(_("invalid commit-graph chain: line '%s' not a hash"),
line.buf);
valid = 0;
break;
}
valid = 0;
for (odb = r->objects->odb; odb; odb = odb->next) {
struct commit_graph *g = load_commit_graph_one(r, graph_name, odb);
if (g) {
if (add_graph_to_chain(g, graph_chain, oids, i)) {
graph_chain = g;
valid = 1;
}
break;
}
}
if (!valid) {
warning(_("unable to find all commit-graph files"));
break;
}
}
My patch updates the final "if (!valid)" check, which covers anything
that happened in the loop over the odb (i.e., finding and opening the
file itself). But if we see a line which does not parse as an oid, we
break out of the outer loop earlier. We set "valid", but nobody looks at
it! So the caller would not be correctly notified that we had an error
in that case.
The fix is simple: we can just check "valid" after leaving the outer
loop, which covers both cases. And we'll want an extra test to check it.
This is actually covered by the test modified earlier in patch 3, where
sha1 and sha256 produced different results. I fixed it there by
consistently corrupting the first line of the file, but we really want
to check both cases.
I'll send a re-roll in a moment.
-Peff
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v2 0/6] some "commit-graph verify" fixes for chains
2023-09-26 5:54 [PATCH 0/6] some "commit-graph verify" fixes for chains Jeff King
` (5 preceding siblings ...)
2023-09-26 6:04 ` [PATCH 6/6] commit-graph: report incomplete chains during verification Jeff King
@ 2023-09-28 4:37 ` Jeff King
2023-09-28 4:38 ` [PATCH v2 1/6] commit-graph: factor out chain opening function Jeff King
` (6 more replies)
6 siblings, 7 replies; 16+ messages in thread
From: Jeff King @ 2023-09-28 4:37 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee
On Tue, Sep 26, 2023 at 01:54:52AM -0400, Jeff King wrote:
> [1/6]: commit-graph: factor out chain opening function
> [2/6]: commit-graph: check mixed generation validation when loading chain file
> [3/6]: t5324: harmonize sha1/sha256 graph chain corruption
> [4/6]: commit-graph: detect read errors when verifying graph chain
> [5/6]: commit-graph: tighten chain size check
> [6/6]: commit-graph: report incomplete chains during verification
Here's a re-roll that fixes a missed case in patch 6 (I sent more
details elsewhere in the thread).
Range-diff is below, but the changes are:
- patch 3 extends its tests to cover corrupting both the first and
second lines of the file. The test and directory renaming spills
over into the range-diff context for the other patches (and I
renamed the directory used in the "too-short chain file" test in
patch 5 to align better with the others).
- patch 6 covers the case that get_oid_hex() fails (previously it only
detected that a parsed hex failed to result in us loading a valid
graph file). The range-diff is IMHO quite hard to read here, but the
patch itself is quite simple.
1: baa20e6391 = 1: a4893f325f commit-graph: factor out chain opening function
2: b77cc15584 = 2: 722181d8ed commit-graph: check mixed generation validation when loading chain file
3: af350d1d41 < -: ---------- t5324: harmonize sha1/sha256 graph chain corruption
-: ---------- > 3: 7de47054c7 t5324: harmonize sha1/sha256 graph chain corruption
4: 54ce3863f8 ! 4: 313c051a38 commit-graph: detect read errors when verifying graph chain
@@ builtin/commit-graph.c: static int graph_verify(int argc, const char **argv, con
free_commit_graph(graph);
## t/t5324-split-commit-graph.sh ##
-@@ t/t5324-split-commit-graph.sh: test_expect_success 'verify after commit-graph-chain corruption' '
+@@ t/t5324-split-commit-graph.sh: test_expect_success 'verify after commit-graph-chain corruption (base)' '
(
- cd verify-chain &&
+ cd verify-chain-base &&
corrupt_file "$graphdir/commit-graph-chain" 30 "G" &&
- git commit-graph verify 2>test_err &&
+ test_must_fail git commit-graph verify 2>test_err &&
5: 273f5e8b87 ! 5: 6d9efd761a commit-graph: tighten chain size check
@@ commit-graph.c: int open_commit_graph_chain(const char *chain_file,
return 1;
## t/t5324-split-commit-graph.sh ##
-@@ t/t5324-split-commit-graph.sh: test_expect_success 'verify after commit-graph-chain corruption' '
+@@ t/t5324-split-commit-graph.sh: test_expect_success 'verify after commit-graph-chain corruption (tip)' '
)
'
+test_expect_success 'verify notices too-short chain file' '
-+ git clone --no-hardlinks . verify-chain-garbage &&
++ git clone --no-hardlinks . verify-chain-short &&
+ (
-+ cd verify-chain-garbage &&
++ cd verify-chain-short &&
+ git commit-graph verify &&
+ echo "garbage" >$graphdir/commit-graph-chain &&
+ test_must_fail git commit-graph verify 2>test_err &&
6: 2661efed03 ! 6: 8fc82e2254 commit-graph: report incomplete chains during verification
@@ Commit message
"tip" case is what is fixed by this patch (it complains to stderr but
still returns the base slice).
+ Likewise the existing tests for corruption of the commit-graph-chain
+ file itself need to be updated. We already exited non-zero correctly for
+ the "base" case, but the "tip" case can now do so, too.
+
Note that this also causes us to adjust a test later in the file that
similarly corrupts a tip (though confusingly the test script calls this
"base"). It checks stderr but erroneously expects the whole "verify"
@@ commit-graph.c: int open_commit_graph_chain(const char *chain_file,
{
struct commit_graph *graph_chain = NULL;
struct strbuf line = STRBUF_INIT;
- struct object_id *oids;
- int i = 0, valid = 1, count;
- FILE *fp = xfdopen(fd, "r");
-
-+ *incomplete_chain = 0;
-+
- count = st->st_size / (the_hash_algo->hexsz + 1);
- CALLOC_ARRAY(oids, count);
-
@@ commit-graph.c: struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+ fclose(fp);
+ strbuf_release(&line);
+
++ *incomplete_chain = !valid;
+ return graph_chain;
+ }
- if (!valid) {
- warning(_("unable to find all commit-graph files"));
-+ *incomplete_chain = 1;
- break;
- }
- }
@@ commit-graph.c: static struct commit_graph *load_commit_graph_chain(struct repository *r,
struct commit_graph *g = NULL;
@@ t/t5324-split-commit-graph.sh: test_expect_success 'warn on base graph chunk inc
grep -v "^+" test_err >err &&
test_i18ngrep "commit-graph chain does not match" err
)
+@@ t/t5324-split-commit-graph.sh: test_expect_success 'verify after commit-graph-chain corruption (tip)' '
+ (
+ cd verify-chain-tip &&
+ corrupt_file "$graphdir/commit-graph-chain" 70 "G" &&
+- git commit-graph verify 2>test_err &&
++ test_must_fail git commit-graph verify 2>test_err &&
+ grep -v "^+" test_err >err &&
+ test_i18ngrep "invalid commit-graph chain" err &&
+ corrupt_file "$graphdir/commit-graph-chain" 70 "A" &&
+- git commit-graph verify 2>test_err &&
++ test_must_fail git commit-graph verify 2>test_err &&
+ grep -v "^+" test_err >err &&
+ test_i18ngrep "unable to find all commit-graph files" err
+ )
[1/6]: commit-graph: factor out chain opening function
[2/6]: commit-graph: check mixed generation validation when loading chain file
[3/6]: t5324: harmonize sha1/sha256 graph chain corruption
[4/6]: commit-graph: detect read errors when verifying graph chain
[5/6]: commit-graph: tighten chain size check
[6/6]: commit-graph: report incomplete chains during verification
builtin/commit-graph.c | 31 +++++++---
commit-graph.c | 109 +++++++++++++++++++++-------------
commit-graph.h | 4 ++
t/t5324-split-commit-graph.sh | 69 ++++++++++++++++++---
4 files changed, 158 insertions(+), 55 deletions(-)
-Peff
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v2 1/6] commit-graph: factor out chain opening function
2023-09-28 4:37 ` [PATCH v2 0/6] some "commit-graph verify" fixes for chains Jeff King
@ 2023-09-28 4:38 ` Jeff King
2023-09-28 4:38 ` [PATCH v2 2/6] commit-graph: check mixed generation validation when loading chain file Jeff King
` (5 subsequent siblings)
6 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2023-09-28 4:38 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee
The load_commit_graph_chain() function opens the chain file and all of
the slices of graph that it points to. If there is no chain file (which
is a totally normal condition), we return NULL. But if we run into
errors with the chain file or loading the actual graph data, we also
return NULL, and the caller cannot tell the difference.
The caller can check for ENOENT for the unremarkable "no such file"
case. But I'm hesitant to assume that the rest of the function would
never accidentally set errno to ENOENT itself, since it is opening the
slice files (and that would mean the caller fails to notice a real
error).
So let's break this into two functions: one to open the file, and one to
actually load it. This matches the interface we provide for the
non-chain graph file, which will also come in handy in a moment when we
fix some bugs in the "git commit-graph verify" code.
Some notes:
- I've kept the "1 is good, 0 is bad" return convention (and the weird
"fd" out-parameter) used by the matching open_commit_graph()
function and other parts of the commit-graph code. This is unlike
most of the rest of Git (which would just return the fd, with -1 for
error), but it makes sense to stay consistent with the adjacent bits
of the API here.
- The existing chain loading function will quietly return if the file
is too small to hold a single entry. I've retained that behavior
(and explicitly set ENOENT in the opener function) for now, under
the notion that it's probably valid (though I'd imagine unusual) to
have an empty chain file.
There are two small behavior changes here, but I think both are strictly
positive:
1. The original blindly did a stat() before checking if fopen()
succeeded, meaning we were making a pointless extra stat call.
2. We now use fstat() to check the file size. The previous code using
a regular stat() on the pathname meant we could technically race
with somebody updating the chain file, and end up with a size that
does not match what we just opened with fopen(). I doubt anybody
ever hit this in practice, but it may have caused an out-of-bounds
read.
We'll retain the load_commit_graph_chain() function which does both the
open and reading steps (most existing callers do not care about seeing
errors anyway, since loading commit-graphs is optimistic).
Signed-off-by: Jeff King <peff@peff.net>
---
Same as v1.
commit-graph.c | 58 +++++++++++++++++++++++++++++++++-----------------
commit-graph.h | 3 +++
2 files changed, 42 insertions(+), 19 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 5e8a3a5085..12cdd9af8e 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -513,31 +513,34 @@ static int add_graph_to_chain(struct commit_graph *g,
return 1;
}
-static struct commit_graph *load_commit_graph_chain(struct repository *r,
- struct object_directory *odb)
+int open_commit_graph_chain(const char *chain_file,
+ int *fd, struct stat *st)
+{
+ *fd = git_open(chain_file);
+ if (*fd < 0)
+ return 0;
+ if (fstat(*fd, st)) {
+ close(*fd);
+ return 0;
+ }
+ if (st->st_size <= the_hash_algo->hexsz) {
+ close(*fd);
+ errno = ENOENT;
+ return 0;
+ }
+ return 1;
+}
+
+struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+ int fd, struct stat *st)
{
struct commit_graph *graph_chain = NULL;
struct strbuf line = STRBUF_INIT;
- struct stat st;
struct object_id *oids;
int i = 0, valid = 1, count;
- char *chain_name = get_commit_graph_chain_filename(odb);
- FILE *fp;
- int stat_res;
+ FILE *fp = xfdopen(fd, "r");
- fp = fopen(chain_name, "r");
- stat_res = stat(chain_name, &st);
- free(chain_name);
-
- if (!fp)
- return NULL;
- if (stat_res ||
- st.st_size <= the_hash_algo->hexsz) {
- fclose(fp);
- return NULL;
- }
-
- count = st.st_size / (the_hash_algo->hexsz + 1);
+ count = st->st_size / (the_hash_algo->hexsz + 1);
CALLOC_ARRAY(oids, count);
prepare_alt_odb(r);
@@ -585,6 +588,23 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r,
return graph_chain;
}
+static struct commit_graph *load_commit_graph_chain(struct repository *r,
+ struct object_directory *odb)
+{
+ char *chain_file = get_commit_graph_chain_filename(odb);
+ struct stat st;
+ int fd;
+ struct commit_graph *g = NULL;
+
+ if (open_commit_graph_chain(chain_file, &fd, &st)) {
+ /* ownership of fd is taken over by load function */
+ g = load_commit_graph_chain_fd_st(r, fd, &st);
+ }
+
+ free(chain_file);
+ return g;
+}
+
/*
* returns 1 if and only if all graphs in the chain have
* corrected commit dates stored in the generation_data chunk.
diff --git a/commit-graph.h b/commit-graph.h
index 5e534f0fcc..3b662fd2b5 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -26,6 +26,7 @@ struct string_list;
char *get_commit_graph_filename(struct object_directory *odb);
char *get_commit_graph_chain_filename(struct object_directory *odb);
int open_commit_graph(const char *graph_file, int *fd, struct stat *st);
+int open_commit_graph_chain(const char *chain_file, int *fd, struct stat *st);
/*
* Given a commit struct, try to fill the commit struct info, including:
@@ -105,6 +106,8 @@ struct commit_graph {
struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
int fd, struct stat *st,
struct object_directory *odb);
+struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+ int fd, struct stat *st);
struct commit_graph *read_commit_graph_one(struct repository *r,
struct object_directory *odb);
--
2.42.0.773.ga6e30199be
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v2 2/6] commit-graph: check mixed generation validation when loading chain file
2023-09-28 4:37 ` [PATCH v2 0/6] some "commit-graph verify" fixes for chains Jeff King
2023-09-28 4:38 ` [PATCH v2 1/6] commit-graph: factor out chain opening function Jeff King
@ 2023-09-28 4:38 ` Jeff King
2023-09-28 4:38 ` [PATCH v2 3/6] t5324: harmonize sha1/sha256 graph chain corruption Jeff King
` (4 subsequent siblings)
6 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2023-09-28 4:38 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee
In read_commit_graph_one(), we call validate_mixed_generation_chain()
after loading the graph. Even though we don't check the return value,
this has the side effect of clearing the read_generation_data flag,
which is important when working with mixed generation numbers.
But doing this in load_commit_graph_chain_fd_st() makes more sense:
1. We are calling it even when we did not load a chain at all, which
is pointless (you cannot have mixed generations in a single file).
2. For now, all callers load the graph via read_commit_graph_one().
But the point of factoring out the open/load in the previous commit
was to let "commit-graph verify" call them separately. So it needs
to trigger this function as part of the load.
Without this patch, the mixed-generation tests in t5324 would start
failing on "git commit-graph verify" calls, once we switch to using
a separate open/load call there.
Signed-off-by: Jeff King <peff@peff.net>
---
Same as v1.
commit-graph.c | 54 +++++++++++++++++++++++++-------------------------
1 file changed, 27 insertions(+), 27 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 12cdd9af8e..8b29c6de24 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -473,6 +473,31 @@ static struct commit_graph *load_commit_graph_v1(struct repository *r,
return g;
}
+/*
+ * returns 1 if and only if all graphs in the chain have
+ * corrected commit dates stored in the generation_data chunk.
+ */
+static int validate_mixed_generation_chain(struct commit_graph *g)
+{
+ int read_generation_data = 1;
+ struct commit_graph *p = g;
+
+ while (read_generation_data && p) {
+ read_generation_data = p->read_generation_data;
+ p = p->base_graph;
+ }
+
+ if (read_generation_data)
+ return 1;
+
+ while (g) {
+ g->read_generation_data = 0;
+ g = g->base_graph;
+ }
+
+ return 0;
+}
+
static int add_graph_to_chain(struct commit_graph *g,
struct commit_graph *chain,
struct object_id *oids,
@@ -581,6 +606,8 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
}
}
+ validate_mixed_generation_chain(graph_chain);
+
free(oids);
fclose(fp);
strbuf_release(&line);
@@ -605,31 +632,6 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r,
return g;
}
-/*
- * returns 1 if and only if all graphs in the chain have
- * corrected commit dates stored in the generation_data chunk.
- */
-static int validate_mixed_generation_chain(struct commit_graph *g)
-{
- int read_generation_data = 1;
- struct commit_graph *p = g;
-
- while (read_generation_data && p) {
- read_generation_data = p->read_generation_data;
- p = p->base_graph;
- }
-
- if (read_generation_data)
- return 1;
-
- while (g) {
- g->read_generation_data = 0;
- g = g->base_graph;
- }
-
- return 0;
-}
-
struct commit_graph *read_commit_graph_one(struct repository *r,
struct object_directory *odb)
{
@@ -638,8 +640,6 @@ struct commit_graph *read_commit_graph_one(struct repository *r,
if (!g)
g = load_commit_graph_chain(r, odb);
- validate_mixed_generation_chain(g);
-
return g;
}
--
2.42.0.773.ga6e30199be
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v2 3/6] t5324: harmonize sha1/sha256 graph chain corruption
2023-09-28 4:37 ` [PATCH v2 0/6] some "commit-graph verify" fixes for chains Jeff King
2023-09-28 4:38 ` [PATCH v2 1/6] commit-graph: factor out chain opening function Jeff King
2023-09-28 4:38 ` [PATCH v2 2/6] commit-graph: check mixed generation validation when loading chain file Jeff King
@ 2023-09-28 4:38 ` Jeff King
2023-09-28 4:38 ` [PATCH v2 4/6] commit-graph: detect read errors when verifying graph chain Jeff King
` (3 subsequent siblings)
6 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2023-09-28 4:38 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee
In t5324.20, we corrupt a hex character 60 bytes into the graph chain
file. Since the file consists of two hash identifiers, one per line, the
corruption differs between sha1 and sha256. In a sha1 repository, the
corruption is on the second line, and in a sha256 repository, it is on
the first.
We should of course detect the problem with either line. But as the next
few patches will show (and fix), that is not the case (in fact, we
currently do not exit non-zero for either line!). And while at the end
of our series we'll catch all errors, our intermediate states will have
differing behavior between the two hashes.
Let's make sure we test corruption of both the first and second lines,
and do so consistently with either hash by choosing offsets which are
always in the first hash (30 bytes) or in the second (70).
Signed-off-by: Jeff King <peff@peff.net>
---
In v2 we are now testing both first and second lines, instead of just
the first.
t/t5324-split-commit-graph.sh | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index 36c4141e67..a9b2428b56 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -312,15 +312,30 @@ test_expect_success 'warn on base graph chunk incorrect' '
)
'
-test_expect_success 'verify after commit-graph-chain corruption' '
- git clone --no-hardlinks . verify-chain &&
+test_expect_success 'verify after commit-graph-chain corruption (base)' '
+ git clone --no-hardlinks . verify-chain-base &&
(
- cd verify-chain &&
- corrupt_file "$graphdir/commit-graph-chain" 60 "G" &&
+ cd verify-chain-base &&
+ corrupt_file "$graphdir/commit-graph-chain" 30 "G" &&
git commit-graph verify 2>test_err &&
grep -v "^+" test_err >err &&
test_i18ngrep "invalid commit-graph chain" err &&
- corrupt_file "$graphdir/commit-graph-chain" 60 "A" &&
+ corrupt_file "$graphdir/commit-graph-chain" 30 "A" &&
+ git commit-graph verify 2>test_err &&
+ grep -v "^+" test_err >err &&
+ test_i18ngrep "unable to find all commit-graph files" err
+ )
+'
+
+test_expect_success 'verify after commit-graph-chain corruption (tip)' '
+ git clone --no-hardlinks . verify-chain-tip &&
+ (
+ cd verify-chain-tip &&
+ corrupt_file "$graphdir/commit-graph-chain" 70 "G" &&
+ git commit-graph verify 2>test_err &&
+ grep -v "^+" test_err >err &&
+ test_i18ngrep "invalid commit-graph chain" err &&
+ corrupt_file "$graphdir/commit-graph-chain" 70 "A" &&
git commit-graph verify 2>test_err &&
grep -v "^+" test_err >err &&
test_i18ngrep "unable to find all commit-graph files" err
--
2.42.0.773.ga6e30199be
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v2 4/6] commit-graph: detect read errors when verifying graph chain
2023-09-28 4:37 ` [PATCH v2 0/6] some "commit-graph verify" fixes for chains Jeff King
` (2 preceding siblings ...)
2023-09-28 4:38 ` [PATCH v2 3/6] t5324: harmonize sha1/sha256 graph chain corruption Jeff King
@ 2023-09-28 4:38 ` Jeff King
2023-09-28 4:39 ` [PATCH v2 5/6] commit-graph: tighten chain size check Jeff King
` (2 subsequent siblings)
6 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2023-09-28 4:38 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee
Because it's OK to not have a graph file at all, the graph_verify()
function needs to tell the difference between a missing file and a real
error. So when loading a traditional graph file, we call
open_commit_graph() separately from load_commit_graph_chain_fd_st(), and
don't complain if the first one fails with ENOENT.
When the function learned about chain files in 3da4b609bb (commit-graph:
verify chains with --shallow mode, 2019-06-18), we couldn't be as
careful, since the only way to load a chain was with
read_commit_graph_one(), which did both the open/load as a single unit.
So we'll miss errors in chain files we load, thinking instead that there
was just no chain file at all.
Note that we do still report some of these problems to stderr, as the
loading function calls error() and warning(). But we'd exit with a
successful exit code, which is wrong.
We can fix that by using the recently split open/load functions for
chains. That lets us treat the chain file just like a single file with
respect to error handling here.
An existing test (from 3da4b609bb) shows off the problem; we were
expecting "commit-graph verify" to report success, but that makes no
sense. We did not even verify the contents of the graph data, because we
couldn't load it! I don't think this was an intentional exception, but
rather just the test covering what happened to occur.
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/commit-graph.c | 23 ++++++++++++++++-------
t/t5324-split-commit-graph.sh | 4 ++--
2 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index c88389df24..50c15d9bfe 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -69,7 +69,8 @@ static int graph_verify(int argc, const char **argv, const char *prefix)
struct commit_graph *graph = NULL;
struct object_directory *odb = NULL;
char *graph_name;
- int open_ok;
+ char *chain_name;
+ enum { OPENED_NONE, OPENED_GRAPH, OPENED_CHAIN } opened = OPENED_NONE;
int fd;
struct stat st;
int flags = 0;
@@ -102,21 +103,29 @@ static int graph_verify(int argc, const char **argv, const char *prefix)
odb = find_odb(the_repository, opts.obj_dir);
graph_name = get_commit_graph_filename(odb);
- open_ok = open_commit_graph(graph_name, &fd, &st);
- if (!open_ok && errno != ENOENT)
+ chain_name = get_commit_graph_chain_filename(odb);
+ if (open_commit_graph(graph_name, &fd, &st))
+ opened = OPENED_GRAPH;
+ else if (errno != ENOENT)
die_errno(_("Could not open commit-graph '%s'"), graph_name);
+ else if (open_commit_graph_chain(chain_name, &fd, &st))
+ opened = OPENED_CHAIN;
+ else if (errno != ENOENT)
+ die_errno(_("could not open commit-graph chain '%s'"), chain_name);
FREE_AND_NULL(graph_name);
+ FREE_AND_NULL(chain_name);
FREE_AND_NULL(options);
- if (open_ok)
+ if (opened == OPENED_NONE)
+ return 0;
+ else if (opened == OPENED_GRAPH)
graph = load_commit_graph_one_fd_st(the_repository, fd, &st, odb);
else
- graph = read_commit_graph_one(the_repository, odb);
+ graph = load_commit_graph_chain_fd_st(the_repository, fd, &st);
- /* Return failure if open_ok predicted success */
if (!graph)
- return !!open_ok;
+ return 1;
ret = verify_commit_graph(the_repository, graph, flags);
free_commit_graph(graph);
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index a9b2428b56..a5ac0440f1 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -317,11 +317,11 @@ test_expect_success 'verify after commit-graph-chain corruption (base)' '
(
cd verify-chain-base &&
corrupt_file "$graphdir/commit-graph-chain" 30 "G" &&
- git commit-graph verify 2>test_err &&
+ test_must_fail git commit-graph verify 2>test_err &&
grep -v "^+" test_err >err &&
test_i18ngrep "invalid commit-graph chain" err &&
corrupt_file "$graphdir/commit-graph-chain" 30 "A" &&
- git commit-graph verify 2>test_err &&
+ test_must_fail git commit-graph verify 2>test_err &&
grep -v "^+" test_err >err &&
test_i18ngrep "unable to find all commit-graph files" err
)
--
2.42.0.773.ga6e30199be
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v2 5/6] commit-graph: tighten chain size check
2023-09-28 4:37 ` [PATCH v2 0/6] some "commit-graph verify" fixes for chains Jeff King
` (3 preceding siblings ...)
2023-09-28 4:38 ` [PATCH v2 4/6] commit-graph: detect read errors when verifying graph chain Jeff King
@ 2023-09-28 4:39 ` Jeff King
2023-09-28 4:39 ` [PATCH v2 6/6] commit-graph: report incomplete chains during verification Jeff King
2023-10-05 17:38 ` [PATCH v2 0/6] some "commit-graph verify" fixes for chains Taylor Blau
6 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2023-09-28 4:39 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee
When we open a commit-graph-chain file, if it's smaller than a single
entry, we just quietly treat that as ENOENT. That make some sense if the
file is truly zero bytes, but it means that "commit-graph verify" will
quietly ignore a file that contains garbage if that garbage happens to
be short.
Instead, let's only simulate ENOENT when the file is truly empty, and
otherwise return EINVAL. The normal graph-loading routines don't care,
but "commit-graph verify" will notice and complain about the difference.
It's not entirely clear to me that the 0-is-ENOENT case actually happens
in real life, so we could perhaps just eliminate this special-case
altogether. But this is how we've always behaved, so I'm preserving it
in the name of backwards compatibility (though again, it really only
matters for "verify", as the regular routines are happy to load what
they can).
Signed-off-by: Jeff King <peff@peff.net>
---
commit-graph.c | 10 ++++++++--
t/t5324-split-commit-graph.sh | 12 ++++++++++++
2 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 8b29c6de24..b1d3e5bf94 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -548,9 +548,15 @@ int open_commit_graph_chain(const char *chain_file,
close(*fd);
return 0;
}
- if (st->st_size <= the_hash_algo->hexsz) {
+ if (st->st_size < the_hash_algo->hexsz) {
close(*fd);
- errno = ENOENT;
+ if (!st->st_size) {
+ /* treat empty files the same as missing */
+ errno = ENOENT;
+ } else {
+ warning("commit-graph chain file too small");
+ errno = EINVAL;
+ }
return 0;
}
return 1;
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index a5ac0440f1..be58545810 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -342,6 +342,18 @@ test_expect_success 'verify after commit-graph-chain corruption (tip)' '
)
'
+test_expect_success 'verify notices too-short chain file' '
+ git clone --no-hardlinks . verify-chain-short &&
+ (
+ cd verify-chain-short &&
+ git commit-graph verify &&
+ echo "garbage" >$graphdir/commit-graph-chain &&
+ test_must_fail git commit-graph verify 2>test_err &&
+ grep -v "^+" test_err >err &&
+ grep "commit-graph chain file too small" err
+ )
+'
+
test_expect_success 'verify across alternates' '
git clone --no-hardlinks . verify-alt &&
(
--
2.42.0.773.ga6e30199be
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v2 6/6] commit-graph: report incomplete chains during verification
2023-09-28 4:37 ` [PATCH v2 0/6] some "commit-graph verify" fixes for chains Jeff King
` (4 preceding siblings ...)
2023-09-28 4:39 ` [PATCH v2 5/6] commit-graph: tighten chain size check Jeff King
@ 2023-09-28 4:39 ` Jeff King
2023-10-05 17:38 ` [PATCH v2 0/6] some "commit-graph verify" fixes for chains Taylor Blau
6 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2023-09-28 4:39 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee
The load_commit_graph_chain_fd_st() function will stop loading chains
when it sees an error. But if it has loaded any graph slice at all, it
will return it. This is a good thing for normal use (we use what data we
can, and this is just an optimization). But it's a bad thing for
"commit-graph verify", which should be careful about finding any
irregularities. We do complain to stderr with a warning(), but the
verify command still exits with a successful return code.
The new tests here cover corruption of both the base and tip slices of
the chain. The corruption of the base file already works (it is the
first file we look at, so when we see the error we return NULL). The
"tip" case is what is fixed by this patch (it complains to stderr but
still returns the base slice).
Likewise the existing tests for corruption of the commit-graph-chain
file itself need to be updated. We already exited non-zero correctly for
the "base" case, but the "tip" case can now do so, too.
Note that this also causes us to adjust a test later in the file that
similarly corrupts a tip (though confusingly the test script calls this
"base"). It checks stderr but erroneously expects the whole "verify"
command to exit with a successful code.
Signed-off-by: Jeff King <peff@peff.net>
---
v2 now covers failed get_oid_hex() call on non-base lines.
builtin/commit-graph.c | 10 +++++++++-
commit-graph.c | 7 +++++--
commit-graph.h | 3 ++-
t/t5324-split-commit-graph.sh | 32 +++++++++++++++++++++++++++++---
4 files changed, 45 insertions(+), 7 deletions(-)
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 50c15d9bfe..f5e66e9969 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -74,6 +74,7 @@ static int graph_verify(int argc, const char **argv, const char *prefix)
int fd;
struct stat st;
int flags = 0;
+ int incomplete_chain = 0;
int ret;
static struct option builtin_commit_graph_verify_options[] = {
@@ -122,13 +123,20 @@ static int graph_verify(int argc, const char **argv, const char *prefix)
else if (opened == OPENED_GRAPH)
graph = load_commit_graph_one_fd_st(the_repository, fd, &st, odb);
else
- graph = load_commit_graph_chain_fd_st(the_repository, fd, &st);
+ graph = load_commit_graph_chain_fd_st(the_repository, fd, &st,
+ &incomplete_chain);
if (!graph)
return 1;
ret = verify_commit_graph(the_repository, graph, flags);
free_commit_graph(graph);
+
+ if (incomplete_chain) {
+ error("one or more commit-graph chain files could not be loaded");
+ ret |= 1;
+ }
+
return ret;
}
diff --git a/commit-graph.c b/commit-graph.c
index b1d3e5bf94..1a56efcf69 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -563,7 +563,8 @@ int open_commit_graph_chain(const char *chain_file,
}
struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
- int fd, struct stat *st)
+ int fd, struct stat *st,
+ int *incomplete_chain)
{
struct commit_graph *graph_chain = NULL;
struct strbuf line = STRBUF_INIT;
@@ -618,6 +619,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
fclose(fp);
strbuf_release(&line);
+ *incomplete_chain = !valid;
return graph_chain;
}
@@ -630,8 +632,9 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r,
struct commit_graph *g = NULL;
if (open_commit_graph_chain(chain_file, &fd, &st)) {
+ int incomplete;
/* ownership of fd is taken over by load function */
- g = load_commit_graph_chain_fd_st(r, fd, &st);
+ g = load_commit_graph_chain_fd_st(r, fd, &st, &incomplete);
}
free(chain_file);
diff --git a/commit-graph.h b/commit-graph.h
index 3b662fd2b5..20ada7e891 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -107,7 +107,8 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
int fd, struct stat *st,
struct object_directory *odb);
struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
- int fd, struct stat *st);
+ int fd, struct stat *st,
+ int *incomplete_chain);
struct commit_graph *read_commit_graph_one(struct repository *r,
struct object_directory *odb);
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index be58545810..06bb897f02 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -285,6 +285,32 @@ test_expect_success 'verify hashes along chain, even in shallow' '
)
'
+test_expect_success 'verify notices chain slice which is bogus (base)' '
+ git clone --no-hardlinks . verify-chain-bogus-base &&
+ (
+ cd verify-chain-bogus-base &&
+ git commit-graph verify &&
+ base_file=$graphdir/graph-$(sed -n 1p $graphdir/commit-graph-chain).graph &&
+ echo "garbage" >$base_file &&
+ test_must_fail git commit-graph verify 2>test_err &&
+ grep -v "^+" test_err >err &&
+ grep "commit-graph file is too small" err
+ )
+'
+
+test_expect_success 'verify notices chain slice which is bogus (tip)' '
+ git clone --no-hardlinks . verify-chain-bogus-tip &&
+ (
+ cd verify-chain-bogus-tip &&
+ git commit-graph verify &&
+ tip_file=$graphdir/graph-$(sed -n 2p $graphdir/commit-graph-chain).graph &&
+ echo "garbage" >$tip_file &&
+ test_must_fail git commit-graph verify 2>test_err &&
+ grep -v "^+" test_err >err &&
+ grep "commit-graph file is too small" err
+ )
+'
+
test_expect_success 'verify --shallow does not check base contents' '
git clone --no-hardlinks . verify-shallow &&
(
@@ -306,7 +332,7 @@ test_expect_success 'warn on base graph chunk incorrect' '
git commit-graph verify &&
base_file=$graphdir/graph-$(tail -n 1 $graphdir/commit-graph-chain).graph &&
corrupt_file "$base_file" $(test_oid base) "\01" &&
- git commit-graph verify --shallow 2>test_err &&
+ test_must_fail git commit-graph verify --shallow 2>test_err &&
grep -v "^+" test_err >err &&
test_i18ngrep "commit-graph chain does not match" err
)
@@ -332,11 +358,11 @@ test_expect_success 'verify after commit-graph-chain corruption (tip)' '
(
cd verify-chain-tip &&
corrupt_file "$graphdir/commit-graph-chain" 70 "G" &&
- git commit-graph verify 2>test_err &&
+ test_must_fail git commit-graph verify 2>test_err &&
grep -v "^+" test_err >err &&
test_i18ngrep "invalid commit-graph chain" err &&
corrupt_file "$graphdir/commit-graph-chain" 70 "A" &&
- git commit-graph verify 2>test_err &&
+ test_must_fail git commit-graph verify 2>test_err &&
grep -v "^+" test_err >err &&
test_i18ngrep "unable to find all commit-graph files" err
)
--
2.42.0.773.ga6e30199be
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v2 0/6] some "commit-graph verify" fixes for chains
2023-09-28 4:37 ` [PATCH v2 0/6] some "commit-graph verify" fixes for chains Jeff King
` (5 preceding siblings ...)
2023-09-28 4:39 ` [PATCH v2 6/6] commit-graph: report incomplete chains during verification Jeff King
@ 2023-10-05 17:38 ` Taylor Blau
6 siblings, 0 replies; 16+ messages in thread
From: Taylor Blau @ 2023-10-05 17:38 UTC (permalink / raw)
To: Jeff King; +Cc: git, Derrick Stolee
On Thu, Sep 28, 2023 at 12:37:46AM -0400, Jeff King wrote:
> On Tue, Sep 26, 2023 at 01:54:52AM -0400, Jeff King wrote:
>
> > [1/6]: commit-graph: factor out chain opening function
> > [2/6]: commit-graph: check mixed generation validation when loading chain file
> > [3/6]: t5324: harmonize sha1/sha256 graph chain corruption
> > [4/6]: commit-graph: detect read errors when verifying graph chain
> > [5/6]: commit-graph: tighten chain size check
> > [6/6]: commit-graph: report incomplete chains during verification
>
> Here's a re-roll that fixes a missed case in patch 6 (I sent more
> details elsewhere in the thread).
Thanks, and apologies for taking a little longer than I would have liked
to review both of these rounds. This round looks great to me (and it's
already in 'next').
Thanks,
Taylor
^ permalink raw reply [flat|nested] 16+ messages in thread