* [PATCH 0/9] commit-graph: remove reliance on global state
@ 2025-08-04 8:17 Patrick Steinhardt
2025-08-04 8:17 ` [PATCH 1/9] trace2: introduce function to trace unsigned integers Patrick Steinhardt
` (12 more replies)
0 siblings, 13 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-04 8:17 UTC
To: git
Hi,
this patch series is another step on our long road towards not having
global state. In addition to that, as commit-graphs are part of the
object database layer, this is also another step towards pluggable
object databases.
Thanks!
Patrick
---
Patrick Steinhardt (9):
trace2: introduce function to trace unsigned integers
commit-graph: stop using signed integers to count bloom filters
commit-graph: fix type for some write options
commit-graph: fix sign comparison warnings
commit-graph: stop using `the_hash_algo` via macros
commit-graph: store the hash algorithm instead of its length
commit-graph: stop using `the_hash_algo`
commit-graph: stop using `the_repository`
commit-graph: stop passing in redundant repository
builtin/commit-graph.c | 13 +-
builtin/commit.c | 2 +-
builtin/merge.c | 2 +-
commit-graph.c | 366 +++++++++++++++++++++----------------------
commit-graph.h | 24 +--
oss-fuzz/fuzz-commit-graph.c | 4 +-
t/helper/test-read-graph.c | 2 +-
trace2.c | 14 ++
trace2.h | 9 ++
9 files changed, 226 insertions(+), 210 deletions(-)
---
base-commit: e813a0200a7121b97fec535f0d0b460b0a33356c
change-id: 20250717-b4-pks-commit-graph-wo-the-repository-1dc2cacbc8e3
* [PATCH 1/9] trace2: introduce function to trace unsigned integers
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
@ 2025-08-04 8:17 ` Patrick Steinhardt
2025-08-04 21:33 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters Patrick Steinhardt
` (11 subsequent siblings)
12 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-04 8:17 UTC
To: git
While we have `trace2_data_intmax()`, there is no equivalent function
that takes an unsigned integer. Introduce `trace2_data_uintmax()` to
plug this gap.
This function will be used in a subsequent commit.
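As a rough illustration of why a separate unsigned variant is useful, the
following is a minimal sketch of the formatting step only. The helper names
are hypothetical and not part of trace2; only the `"%" PRIuMAX` format
matches what the patch does.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper mirroring the signed path: format via "%" PRIdMAX. */
static int format_as_intmax(char *buf, size_t len, intmax_t value)
{
	return snprintf(buf, len, "%" PRIdMAX, value);
}

/* Hypothetical helper mirroring the new path: format via "%" PRIuMAX. */
static int format_as_uintmax(char *buf, size_t len, uintmax_t value)
{
	return snprintf(buf, len, "%" PRIuMAX, value);
}

void trace_format_example(void)
{
	char buf[32];
	size_t counter = 42; /* e.g. a Bloom filter statistic */

	/* size_t widens to uintmax_t without changing its value. */
	assert(format_as_uintmax(buf, sizeof(buf), counter) == 2);
	assert(buf[0] == '4' && buf[1] == '2' && buf[2] == '\0');

	/* The largest unsigned value still prints without a sign. */
	format_as_uintmax(buf, sizeof(buf), UINTMAX_MAX);
	assert(buf[0] != '-');

	/* Negative values are what the signed formatter is for. */
	format_as_intmax(buf, sizeof(buf), -1);
	assert(buf[0] == '-');
}
```

Passing an unsigned counter through the signed formatter would need a
conversion that cannot represent the upper half of the unsigned range, which
is the gap the new function plugs.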
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
trace2.c | 14 ++++++++++++++
trace2.h | 9 +++++++++
2 files changed, 23 insertions(+)
diff --git a/trace2.c b/trace2.c
index c23c0a227b..a687944f7b 100644
--- a/trace2.c
+++ b/trace2.c
@@ -948,6 +948,20 @@ void trace2_data_intmax_fl(const char *file, int line, const char *category,
strbuf_release(&buf_string);
}
+void trace2_data_uintmax_fl(const char *file, int line, const char *category,
+ const struct repository *repo, const char *key,
+ uintmax_t value)
+{
+ struct strbuf buf_string = STRBUF_INIT;
+
+ if (!trace2_enabled)
+ return;
+
+ strbuf_addf(&buf_string, "%" PRIuMAX, value);
+ trace2_data_string_fl(file, line, category, repo, key, buf_string.buf);
+ strbuf_release(&buf_string);
+}
+
void trace2_data_json_fl(const char *file, int line, const char *category,
const struct repository *repo, const char *key,
const struct json_writer *value)
diff --git a/trace2.h b/trace2.h
index e4f23784e4..115c45a1eb 100644
--- a/trace2.h
+++ b/trace2.h
@@ -463,6 +463,15 @@ void trace2_data_intmax_fl(const char *file, int line, const char *category,
trace2_data_intmax_fl(__FILE__, __LINE__, (category), (repo), (key), \
(value))
+void trace2_data_uintmax_fl(const char *file, int line, const char *category,
+ const struct repository *repo, const char *key,
+ uintmax_t value);
+
+#define trace2_data_uintmax(category, repo, key, value) \
+ trace2_data_uintmax_fl(__FILE__, __LINE__, (category), (repo), (key), \
+ (value))
+
+
void trace2_data_json_fl(const char *file, int line, const char *category,
const struct repository *repo, const char *key,
const struct json_writer *jw);
--
2.50.1.723.g3e08bea96f.dirty
* [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
2025-08-04 8:17 ` [PATCH 1/9] trace2: introduce function to trace unsigned integers Patrick Steinhardt
@ 2025-08-04 8:17 ` Patrick Steinhardt
2025-08-04 9:13 ` Oswald Buddenhagen
2025-08-04 21:42 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 3/9] commit-graph: fix type for some write options Patrick Steinhardt
` (10 subsequent siblings)
12 siblings, 2 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-04 8:17 UTC
To: git
When writing a new commit graph we have a couple of counters that
provide statistics around what kind of bloom filters we have or have not
written. These counters naturally count from zero and are only ever
incremented, but they use a signed integer as type regardless.
Refactor those fields to be of type `size_t` instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index bd7b6f5338..a7a1a761bc 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1170,11 +1170,11 @@ struct write_commit_graph_context {
size_t total_bloom_filter_data_size;
const struct bloom_filter_settings *bloom_settings;
- int count_bloom_filter_computed;
- int count_bloom_filter_not_computed;
- int count_bloom_filter_trunc_empty;
- int count_bloom_filter_trunc_large;
- int count_bloom_filter_upgraded;
+ size_t count_bloom_filter_computed;
+ size_t count_bloom_filter_not_computed;
+ size_t count_bloom_filter_trunc_empty;
+ size_t count_bloom_filter_trunc_large;
+ size_t count_bloom_filter_upgraded;
};
static int write_graph_chunk_fanout(struct hashfile *f,
@@ -1779,16 +1779,16 @@ void ensure_generations_valid(struct repository *r,
static void trace2_bloom_filter_write_statistics(struct write_commit_graph_context *ctx)
{
- trace2_data_intmax("commit-graph", ctx->r, "filter-computed",
- ctx->count_bloom_filter_computed);
- trace2_data_intmax("commit-graph", ctx->r, "filter-not-computed",
- ctx->count_bloom_filter_not_computed);
- trace2_data_intmax("commit-graph", ctx->r, "filter-trunc-empty",
- ctx->count_bloom_filter_trunc_empty);
- trace2_data_intmax("commit-graph", ctx->r, "filter-trunc-large",
- ctx->count_bloom_filter_trunc_large);
- trace2_data_intmax("commit-graph", ctx->r, "filter-upgraded",
- ctx->count_bloom_filter_upgraded);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-computed",
+ ctx->count_bloom_filter_computed);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-not-computed",
+ ctx->count_bloom_filter_not_computed);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-trunc-empty",
+ ctx->count_bloom_filter_trunc_empty);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-trunc-large",
+ ctx->count_bloom_filter_trunc_large);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-upgraded",
+ ctx->count_bloom_filter_upgraded);
}
static void compute_bloom_filters(struct write_commit_graph_context *ctx)
--
2.50.1.723.g3e08bea96f.dirty
* [PATCH 3/9] commit-graph: fix type for some write options
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
2025-08-04 8:17 ` [PATCH 1/9] trace2: introduce function to trace unsigned integers Patrick Steinhardt
2025-08-04 8:17 ` [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters Patrick Steinhardt
@ 2025-08-04 8:17 ` Patrick Steinhardt
2025-08-04 21:52 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 4/9] commit-graph: fix sign comparison warnings Patrick Steinhardt
` (9 subsequent siblings)
12 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-04 8:17 UTC
To: git
The options "max-commits" and "size-multiple" are both supposed to be
positive integers and are documented as such, but we use a signed
integer field to store them. This causes sign comparison warnings in
`split_graph_merge_strategy()` because we end up comparing the option
values with the observed number of commits.
Fix the issue by converting the fields to be unsigned and converting the
options to use `OPT_UNSIGNED()` accordingly. This macro was only
introduced recently, which might explain why the option values were
signed in the first place.
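The sign comparison problem can be shown in isolation. This is a minimal
sketch, not the code in `split_graph_merge_strategy()`: both helper names
are hypothetical, and the explicit cast only spells out the conversion the
compiler would otherwise apply implicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Old shape (illustrative): the limit is a signed int. Comparing it with an
 * unsigned count converts it to unsigned, which -Wsign-compare warns about.
 */
static int over_limit_signed(int max_commits, uint32_t num_commits)
{
	return max_commits && num_commits > (unsigned int)max_commits;
}

/* New shape: limit and count are both unsigned, so nothing changes sign. */
static int over_limit_unsigned(size_t max_commits, uint32_t num_commits)
{
	return max_commits && num_commits > max_commits;
}

void option_type_example(void)
{
	/* A negative limit turns into a huge unsigned value: never exceeded. */
	assert(!over_limit_signed(-1, 100));
	assert(over_limit_signed(50, 100));

	/* With unsigned options, zero still means "no limit". */
	assert(!over_limit_unsigned(0, 100));
	assert(over_limit_unsigned(50, 100));
	assert(!over_limit_unsigned(200, 100));
}
```

With an unsigned field, a negative value can no longer reach the comparison
at all, provided the option parser rejects it.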
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/commit-graph.c | 4 ++--
commit-graph.c | 5 ++---
commit-graph.h | 4 ++--
3 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 25018a0b9d..145802afb7 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -241,9 +241,9 @@ static int graph_write(int argc, const char **argv, const char *prefix,
N_("allow writing an incremental commit-graph file"),
PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
write_option_parse_split),
- OPT_INTEGER(0, "max-commits", &write_opts.max_commits,
+ OPT_UNSIGNED(0, "max-commits", &write_opts.max_commits,
N_("maximum number of commits in a non-base split commit-graph")),
- OPT_INTEGER(0, "size-multiple", &write_opts.size_multiple,
+ OPT_UNSIGNED(0, "size-multiple", &write_opts.size_multiple,
N_("maximum ratio between two levels of a split commit-graph")),
OPT_EXPIRY_DATE(0, "expire-time", &write_opts.expire_time,
N_("only expire files older than a given date-time")),
diff --git a/commit-graph.c b/commit-graph.c
index a7a1a761bc..ad3f084dd4 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2235,9 +2235,8 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
uint32_t num_commits;
enum commit_graph_split_flags flags = COMMIT_GRAPH_SPLIT_UNSPECIFIED;
uint32_t i;
-
- int max_commits = 0;
- int size_mult = 2;
+ size_t max_commits = 0;
+ size_t size_mult = 2;
if (ctx->opts) {
max_commits = ctx->opts->max_commits;
diff --git a/commit-graph.h b/commit-graph.h
index 78ab7b875b..b71cb55697 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -160,8 +160,8 @@ enum commit_graph_split_flags {
};
struct commit_graph_opts {
- int size_multiple;
- int max_commits;
+ size_t size_multiple;
+ size_t max_commits;
timestamp_t expire_time;
enum commit_graph_split_flags split_flags;
int max_new_filters;
--
2.50.1.723.g3e08bea96f.dirty
* [PATCH 4/9] commit-graph: fix sign comparison warnings
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
` (2 preceding siblings ...)
2025-08-04 8:17 ` [PATCH 3/9] commit-graph: fix type for some write options Patrick Steinhardt
@ 2025-08-04 8:17 ` Patrick Steinhardt
2025-08-04 22:04 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 5/9] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
` (8 subsequent siblings)
12 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-04 8:17 UTC
To: git
The "commit-graph.c" file has a bunch of sign comparison warnings:
- There are a bunch of variables that are declared as signed integers
even though they are used to count entities, like for example
`num_commit_graphs_before` and `num_commit_graphs_after`.
- There are several cases where we use signed loop variables to
iterate through an unsigned entity count.
- In `write_graph_chunk_base_1()` we count how many chunks we have
written in total. But while the value represents a positive
quantity, we still return a signed integer that we then later
compare with unsigned values.
- The bloom settings hash version is being assigned `-1` even though
it's an unsigned value. This is used to indicate an unspecified
value and relies on the conversion of `-1` to an unsigned type, which
wraps around to the maximum value of that type.
Fix all of these cases by either using the proper variable type or by
adding casts as required.
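Two of the patterns listed above can be sketched in a few lines. The
sentinel macro and both function names are hypothetical; the patch itself
spells the sentinel inline as `(unsigned) -1`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical name for the "unspecified" marker. */
#define HASH_VERSION_UNSPECIFIED ((unsigned int)-1)

static int hash_version_is_unspecified(unsigned int hash_version)
{
	return hash_version == HASH_VERSION_UNSPECIFIED;
}

/* Iterate over an unsigned entity count with an unsigned loop variable. */
static size_t count_nonzero(const unsigned int *values, size_t nr)
{
	size_t count = 0;

	for (size_t i = 0; i < nr; i++)
		if (values[i])
			count++;
	return count;
}

void sign_compare_example(void)
{
	const unsigned int values[] = { 0, 3, 0, 7 };

	/* Converting -1 to unsigned int is defined to yield UINT_MAX. */
	assert(HASH_VERSION_UNSPECIFIED == UINT_MAX);
	assert(hash_version_is_unspecified(UINT_MAX));
	assert(!hash_version_is_unspecified(1));

	assert(count_nonzero(values, 4) == 2);
}
```

The cast makes the intent visible at the comparison site and keeps both
operands unsigned, which is all the warning asks for.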
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 54 +++++++++++++++++++++++++++---------------------------
1 file changed, 27 insertions(+), 27 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index ad3f084dd4..443177ffd3 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1,5 +1,4 @@
#define USE_THE_REPOSITORY_VARIABLE
-#define DISABLE_SIGN_COMPARE_WARNINGS
#include "git-compat-util.h"
#include "config.h"
@@ -569,7 +568,7 @@ static void validate_mixed_bloom_settings(struct commit_graph *g)
static int add_graph_to_chain(struct commit_graph *g,
struct commit_graph *chain,
struct object_id *oids,
- int n)
+ size_t n)
{
struct commit_graph *cur_g = chain;
@@ -622,7 +621,7 @@ int open_commit_graph_chain(const char *chain_file,
close(*fd);
return 0;
}
- if (st->st_size < the_hash_algo->hexsz) {
+ if (st->st_size < (ssize_t) the_hash_algo->hexsz) {
close(*fd);
if (!st->st_size) {
/* treat empty files the same as missing */
@@ -643,15 +642,16 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
struct commit_graph *graph_chain = NULL;
struct strbuf line = STRBUF_INIT;
struct object_id *oids;
- int i = 0, valid = 1, count;
+ int valid = 1;
FILE *fp = xfdopen(fd, "r");
+ size_t count;
count = st->st_size / (the_hash_algo->hexsz + 1);
CALLOC_ARRAY(oids, count);
odb_prepare_alternates(r->objects);
- for (i = 0; i < count; i++) {
+ for (size_t i = 0; i < count; i++) {
struct odb_source *source;
if (strbuf_getline_lf(&line, fp) == EOF)
@@ -1145,12 +1145,12 @@ struct write_commit_graph_context {
int num_generation_data_overflows;
unsigned long approx_nr_objects;
struct progress *progress;
- int progress_done;
+ uint64_t progress_done;
uint64_t progress_cnt;
char *base_graph_name;
- int num_commit_graphs_before;
- int num_commit_graphs_after;
+ size_t num_commit_graphs_before;
+ size_t num_commit_graphs_after;
char **commit_graph_filenames_before;
char **commit_graph_filenames_after;
char **commit_graph_hash_after;
@@ -1181,7 +1181,7 @@ static int write_graph_chunk_fanout(struct hashfile *f,
void *data)
{
struct write_commit_graph_context *ctx = data;
- int i, count = 0;
+ size_t i, count = 0;
struct commit **list = ctx->commits.list;
/*
@@ -1209,7 +1209,8 @@ static int write_graph_chunk_oids(struct hashfile *f,
{
struct write_commit_graph_context *ctx = data;
struct commit **list = ctx->commits.list;
- int count;
+ size_t count;
+
for (count = 0; count < ctx->commits.nr; count++, list++) {
display_progress(ctx->progress, ++ctx->progress_cnt);
hashwrite(f, (*list)->object.oid.hash, the_hash_algo->rawsz);
@@ -1331,9 +1332,9 @@ static int write_graph_chunk_generation_data(struct hashfile *f,
void *data)
{
struct write_commit_graph_context *ctx = data;
- int i, num_generation_data_overflows = 0;
+ int num_generation_data_overflows = 0;
- for (i = 0; i < ctx->commits.nr; i++) {
+ for (size_t i = 0; i < ctx->commits.nr; i++) {
struct commit *c = ctx->commits.list[i];
timestamp_t offset;
repo_parse_commit(ctx->r, c);
@@ -1355,8 +1356,8 @@ static int write_graph_chunk_generation_data_overflow(struct hashfile *f,
void *data)
{
struct write_commit_graph_context *ctx = data;
- int i;
- for (i = 0; i < ctx->commits.nr; i++) {
+
+ for (size_t i = 0; i < ctx->commits.nr; i++) {
struct commit *c = ctx->commits.list[i];
timestamp_t offset = commit_graph_data_at(c)->generation - c->date;
display_progress(ctx->progress, ++ctx->progress_cnt);
@@ -1526,7 +1527,7 @@ static void add_missing_parents(struct write_commit_graph_context *ctx, struct c
static void close_reachable(struct write_commit_graph_context *ctx)
{
- int i;
+ size_t i;
struct commit *commit;
enum commit_graph_split_flags flags = ctx->opts ?
ctx->opts->split_flags : COMMIT_GRAPH_SPLIT_UNSPECIFIED;
@@ -1620,10 +1621,9 @@ static void compute_reachable_generation_numbers(
struct compute_generation_info *info,
int generation_version)
{
- int i;
struct commit_list *list = NULL;
- for (i = 0; i < info->commits->nr; i++) {
+ for (size_t i = 0; i < info->commits->nr; i++) {
struct commit *c = info->commits->list[i];
timestamp_t gen;
repo_parse_commit(info->r, c);
@@ -1714,7 +1714,7 @@ static void set_generation_v2(struct commit *c, timestamp_t t,
static void compute_generation_numbers(struct write_commit_graph_context *ctx)
{
- int i;
+ size_t i;
struct compute_generation_info info = {
.r = ctx->r,
.commits = &ctx->commits,
@@ -1793,10 +1793,10 @@ static void trace2_bloom_filter_write_statistics(struct write_commit_graph_conte
static void compute_bloom_filters(struct write_commit_graph_context *ctx)
{
- int i;
+ size_t i;
struct progress *progress = NULL;
struct commit **sorted_commits;
- int max_new_filters;
+ size_t max_new_filters;
init_bloom_filters();
@@ -1814,7 +1814,7 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
QSORT(sorted_commits, ctx->commits.nr, commit_gen_cmp);
max_new_filters = ctx->opts && ctx->opts->max_new_filters >= 0 ?
- ctx->opts->max_new_filters : ctx->commits.nr;
+ (size_t) ctx->opts->max_new_filters : ctx->commits.nr;
for (i = 0; i < ctx->commits.nr; i++) {
enum bloom_filter_computed computed = 0;
@@ -2017,10 +2017,10 @@ static void copy_oids_to_commits(struct write_commit_graph_context *ctx)
stop_progress(&ctx->progress);
}
-static int write_graph_chunk_base_1(struct hashfile *f,
- struct commit_graph *g)
+static size_t write_graph_chunk_base_1(struct hashfile *f,
+ struct commit_graph *g)
{
- int num = 0;
+ size_t num = 0;
if (!g)
return 0;
@@ -2034,7 +2034,7 @@ static int write_graph_chunk_base(struct hashfile *f,
void *data)
{
struct write_commit_graph_context *ctx = data;
- int num = write_graph_chunk_base_1(f, ctx->new_base_graph);
+ size_t num = write_graph_chunk_base_1(f, ctx->new_base_graph);
if (num != ctx->num_commit_graphs_after - 1) {
error(_("failed to write correct number of base graph ids"));
@@ -2480,7 +2480,7 @@ static void expire_commit_graphs(struct write_commit_graph_context *ctx)
if (stat(path.buf, &st) < 0)
continue;
- if (st.st_mtime > expire_time)
+ if ((timestamp_t) st.st_mtime > expire_time)
continue;
if (path.len < 6 || strcmp(path.buf + path.len - 6, ".graph"))
continue;
@@ -2576,7 +2576,7 @@ int write_commit_graph(struct odb_source *source,
ctx.changed_paths = 1;
/* don't propagate the hash_version unless unspecified */
- if (bloom_settings.hash_version == -1)
+ if (bloom_settings.hash_version == (unsigned) -1)
bloom_settings.hash_version = g->bloom_filter_settings->hash_version;
bloom_settings.bits_per_entry = g->bloom_filter_settings->bits_per_entry;
bloom_settings.num_hashes = g->bloom_filter_settings->num_hashes;
--
2.50.1.723.g3e08bea96f.dirty
* [PATCH 5/9] commit-graph: stop using `the_hash_algo` via macros
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
` (3 preceding siblings ...)
2025-08-04 8:17 ` [PATCH 4/9] commit-graph: fix sign comparison warnings Patrick Steinhardt
@ 2025-08-04 8:17 ` Patrick Steinhardt
2025-08-04 22:05 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 6/9] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
` (7 subsequent siblings)
12 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-04 8:17 UTC
To: git
We have two macros `GRAPH_DATA_WIDTH` and `GRAPH_MIN_SIZE` that compute
hash-dependent sizes. They do so by using the global `the_hash_algo`
variable though, which we want to get rid of over time.
Convert these macros into functions that accept the hash algorithm as
an input parameter. Adapt callers accordingly.
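For a feel of the numbers involved, here is a self-contained sketch of the
two functions. `struct hash_algo_sketch` is a stand-in for
`struct git_hash_algo`, and the value of `CHUNK_TOC_ENTRY_SIZE` (a 4-byte
chunk ID plus an 8-byte offset) is an assumption, as it is not defined in
this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct git_hash_algo; only the raw hash size matters here. */
struct hash_algo_sketch {
	size_t rawsz;
};

#define GRAPH_HEADER_SIZE 8
#define GRAPH_FANOUT_SIZE (4 * 256)
/* Assumption: one table-of-contents entry is 4 + 8 bytes. */
#define CHUNK_TOC_ENTRY_SIZE (sizeof(uint32_t) + sizeof(uint64_t))

static size_t graph_data_width(const struct hash_algo_sketch *algop)
{
	return algop->rawsz + 16;
}

static size_t graph_min_size(const struct hash_algo_sketch *algop)
{
	return GRAPH_HEADER_SIZE + 4 * CHUNK_TOC_ENTRY_SIZE +
	       GRAPH_FANOUT_SIZE + algop->rawsz;
}

void graph_size_example(void)
{
	const struct hash_algo_sketch sha1 = { .rawsz = 20 };
	const struct hash_algo_sketch sha256 = { .rawsz = 32 };

	/* The same code yields different sizes per algorithm. */
	assert(graph_data_width(&sha1) == 36);
	assert(graph_data_width(&sha256) == 48);
	assert(graph_min_size(&sha1) == 1100);
	assert(graph_min_size(&sha256) == 1112);
}
```

Because the algorithm is an argument, both sizes can be computed side by
side, which a macro reading a single global cannot do.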
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 443177ffd3..3c40f4a470 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -52,8 +52,6 @@ void git_test_write_commit_graph_or_die(void)
#define GRAPH_CHUNKID_BLOOMDATA 0x42444154 /* "BDAT" */
#define GRAPH_CHUNKID_BASE 0x42415345 /* "BASE" */
-#define GRAPH_DATA_WIDTH (the_hash_algo->rawsz + 16)
-
#define GRAPH_VERSION_1 0x1
#define GRAPH_VERSION GRAPH_VERSION_1
@@ -65,8 +63,6 @@ void git_test_write_commit_graph_or_die(void)
#define GRAPH_HEADER_SIZE 8
#define GRAPH_FANOUT_SIZE (4 * 256)
-#define GRAPH_MIN_SIZE (GRAPH_HEADER_SIZE + 4 * CHUNK_TOC_ENTRY_SIZE \
- + GRAPH_FANOUT_SIZE + the_hash_algo->rawsz)
#define CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW (1ULL << 31)
@@ -79,6 +75,16 @@ define_commit_slab(topo_level_slab, uint32_t);
define_commit_slab(commit_pos, int);
static struct commit_pos commit_pos = COMMIT_SLAB_INIT(1, commit_pos);
+static size_t graph_data_width(const struct git_hash_algo *algop)
+{
+ return algop->rawsz + 16;
+}
+
+static size_t graph_min_size(const struct git_hash_algo *algop)
+{
+ return GRAPH_HEADER_SIZE + 4 * CHUNK_TOC_ENTRY_SIZE + GRAPH_FANOUT_SIZE + algop->rawsz;
+}
+
static void set_commit_pos(struct repository *r, const struct object_id *oid)
{
static int32_t max_pos;
@@ -257,7 +263,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_size = xsize_t(st->st_size);
- if (graph_size < GRAPH_MIN_SIZE) {
+ if (graph_size < graph_min_size(the_hash_algo)) {
close(fd);
error(_("commit-graph file is too small"));
return NULL;
@@ -313,7 +319,7 @@ static int graph_read_commit_data(const unsigned char *chunk_start,
size_t chunk_size, void *data)
{
struct commit_graph *g = data;
- if (chunk_size / GRAPH_DATA_WIDTH != g->num_commits)
+ if (chunk_size / graph_data_width(the_hash_algo) != g->num_commits)
return error(_("commit-graph commit data chunk is wrong size"));
g->chunk_commit_data = chunk_start;
return 0;
@@ -378,7 +384,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
if (!graph_map)
return NULL;
- if (graph_size < GRAPH_MIN_SIZE)
+ if (graph_size < graph_min_size(the_hash_algo))
return NULL;
data = (const unsigned char *)graph_map;
@@ -900,7 +906,7 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
die(_("invalid commit position. commit-graph is likely corrupt"));
lex_index = pos - g->num_commits_in_base;
- commit_data = g->chunk_commit_data + st_mult(GRAPH_DATA_WIDTH, lex_index);
+ commit_data = g->chunk_commit_data + st_mult(graph_data_width(the_hash_algo), lex_index);
graph_data = commit_graph_data_at(item);
graph_data->graph_pos = pos;
@@ -1104,7 +1110,8 @@ static struct tree *load_tree_for_commit(struct repository *r,
g = g->base_graph;
commit_data = g->chunk_commit_data +
- st_mult(GRAPH_DATA_WIDTH, graph_pos - g->num_commits_in_base);
+ st_mult(graph_data_width(the_hash_algo),
+ graph_pos - g->num_commits_in_base);
oidread(&oid, commit_data, the_repository->hash_algo);
set_commit_tree(c, lookup_tree(r, &oid));
--
2.50.1.723.g3e08bea96f.dirty
* [PATCH 6/9] commit-graph: store the hash algorithm instead of its length
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
` (4 preceding siblings ...)
2025-08-04 8:17 ` [PATCH 5/9] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
@ 2025-08-04 8:17 ` Patrick Steinhardt
2025-08-04 22:07 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 7/9] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
` (6 subsequent siblings)
12 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-04 8:17 UTC
To: git
The commit-graph stores the length of the hash algorithm it uses. In
subsequent commits we'll need to pass the whole hash algorithm around
though, which we currently don't have access to.
Refactor the code so that we store the hash algorithm instead of only
its size.
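The effect on readers of the commit data chunk can be sketched as follows.
The struct and function names are hypothetical stand-ins; only the field
offsets (`rawsz + 8` and `rawsz + 12`) and the bit manipulation are taken
from the code touched by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct hash_algo_sketch {
	size_t rawsz;
};

/* Stand-in for struct commit_graph: holds the algorithm, not its length. */
struct commit_graph_sketch {
	const struct hash_algo_sketch *hash_algo;
};

static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Upper 30 bits of the word after the two parent slots. */
static uint32_t record_generation(const struct commit_graph_sketch *g,
				  const unsigned char *record)
{
	return read_be32(record + g->hash_algo->rawsz + 8) >> 2;
}

/* Low 2 bits of that word are date bits 32-33; the next word is the rest. */
static uint64_t record_commit_date(const struct commit_graph_sketch *g,
				   const unsigned char *record)
{
	size_t rawsz = g->hash_algo->rawsz;
	uint64_t date_high = read_be32(record + rawsz + 8) & 0x3;
	uint64_t date_low = read_be32(record + rawsz + 12);

	return (date_high << 32) | date_low;
}

void commit_record_example(void)
{
	const struct hash_algo_sketch sha1 = { .rawsz = 20 };
	const struct commit_graph_sketch g = { .hash_algo = &sha1 };
	unsigned char record[36] = { 0 };

	record[31] = (5 << 2) | 1; /* generation 5, date bit 32 set */
	record[35] = 2;            /* low date word = 2 */

	assert(record_generation(&g, record) == 5);
	assert(record_commit_date(&g, record) == (((uint64_t)1 << 32) | 2));
}
```

Every offset is derived from `hash_algo->rawsz`, so the same pointer can
later be handed to helpers that need more than the length.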
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 36 ++++++++++++++++++------------------
commit-graph.h | 2 +-
2 files changed, 19 insertions(+), 19 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 3c40f4a470..9c2278dd7a 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -310,7 +310,7 @@ static int graph_read_oid_lookup(const unsigned char *chunk_start,
{
struct commit_graph *g = data;
g->chunk_oid_lookup = chunk_start;
- if (chunk_size / g->hash_len != g->num_commits)
+ if (chunk_size / g->hash_algo->rawsz != g->num_commits)
return error(_("commit-graph OID lookup chunk is the wrong size"));
return 0;
}
@@ -412,7 +412,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
graph = alloc_commit_graph();
- graph->hash_len = the_hash_algo->rawsz;
+ graph->hash_algo = the_hash_algo;
graph->num_chunks = *(unsigned char*)(data + 6);
graph->data = graph_map;
graph->data_len = graph_size;
@@ -477,7 +477,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
FREE_AND_NULL(graph->bloom_filter_settings);
}
- oidread(&graph->oid, graph->data + graph->data_len - graph->hash_len,
+ oidread(&graph->oid, graph->data + graph->data_len - graph->hash_algo->rawsz,
the_repository->hash_algo);
free_chunkfile(cf);
@@ -583,7 +583,7 @@ static int add_graph_to_chain(struct commit_graph *g,
return 0;
}
- if (g->chunk_base_graphs_size / g->hash_len < n) {
+ if (g->chunk_base_graphs_size / g->hash_algo->rawsz < n) {
warning(_("commit-graph base graphs chunk is too small"));
return 0;
}
@@ -593,7 +593,7 @@ static int add_graph_to_chain(struct commit_graph *g,
if (!cur_g ||
!oideq(&oids[n], &cur_g->oid) ||
- !hasheq(oids[n].hash, g->chunk_base_graphs + st_mult(g->hash_len, n),
+ !hasheq(oids[n].hash, g->chunk_base_graphs + st_mult(g->hash_algo->rawsz, n),
the_repository->hash_algo)) {
warning(_("commit-graph chain does not match"));
return 0;
@@ -805,7 +805,7 @@ int generation_numbers_enabled(struct repository *r)
return 0;
first_generation = get_be32(g->chunk_commit_data +
- g->hash_len + 8) >> 2;
+ g->hash_algo->rawsz + 8) >> 2;
return !!first_generation;
}
@@ -849,7 +849,7 @@ void close_commit_graph(struct object_database *o)
static int bsearch_graph(struct commit_graph *g, const struct object_id *oid, uint32_t *pos)
{
return bsearch_hash(oid->hash, g->chunk_oid_fanout,
- g->chunk_oid_lookup, g->hash_len, pos);
+ g->chunk_oid_lookup, g->hash_algo->rawsz, pos);
}
static void load_oid_from_graph(struct commit_graph *g,
@@ -869,7 +869,7 @@ static void load_oid_from_graph(struct commit_graph *g,
lex_index = pos - g->num_commits_in_base;
- oidread(oid, g->chunk_oid_lookup + st_mult(g->hash_len, lex_index),
+ oidread(oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, lex_index),
the_repository->hash_algo);
}
@@ -911,8 +911,8 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
graph_data = commit_graph_data_at(item);
graph_data->graph_pos = pos;
- date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
- date_low = get_be32(commit_data + g->hash_len + 12);
+ date_high = get_be32(commit_data + g->hash_algo->rawsz + 8) & 0x3;
+ date_low = get_be32(commit_data + g->hash_algo->rawsz + 12);
item->date = (timestamp_t)((date_high << 32) | date_low);
if (g->read_generation_data) {
@@ -930,10 +930,10 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
} else
graph_data->generation = item->date + offset;
} else
- graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
+ graph_data->generation = get_be32(commit_data + g->hash_algo->rawsz + 8) >> 2;
if (g->topo_levels)
- *topo_level_slab_at(g->topo_levels, item) = get_be32(commit_data + g->hash_len + 8) >> 2;
+ *topo_level_slab_at(g->topo_levels, item) = get_be32(commit_data + g->hash_algo->rawsz + 8) >> 2;
}
static inline void set_commit_tree(struct commit *c, struct tree *t)
@@ -957,7 +957,7 @@ static int fill_commit_in_graph(struct repository *r,
fill_commit_graph_info(item, g, pos);
lex_index = pos - g->num_commits_in_base;
- commit_data = g->chunk_commit_data + st_mult(g->hash_len + 16, lex_index);
+ commit_data = g->chunk_commit_data + st_mult(g->hash_algo->rawsz + 16, lex_index);
item->object.parsed = 1;
@@ -965,12 +965,12 @@ static int fill_commit_in_graph(struct repository *r,
pptr = &item->parents;
- edge_value = get_be32(commit_data + g->hash_len);
+ edge_value = get_be32(commit_data + g->hash_algo->rawsz);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
pptr = insert_parent_or_die(r, g, edge_value, pptr);
- edge_value = get_be32(commit_data + g->hash_len + 4);
+ edge_value = get_be32(commit_data + g->hash_algo->rawsz + 4);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
if (!(edge_value & GRAPH_EXTRA_EDGES_NEEDED)) {
@@ -2622,7 +2622,7 @@ int write_commit_graph(struct odb_source *source,
struct commit_graph *g = ctx.r->objects->commit_graph;
for (i = 0; i < g->num_commits; i++) {
struct object_id oid;
- oidread(&oid, g->chunk_oid_lookup + st_mult(g->hash_len, i),
+ oidread(&oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
the_repository->hash_algo);
oid_array_append(&ctx.oids, &oid);
}
@@ -2753,7 +2753,7 @@ static int verify_one_commit_graph(struct repository *r,
for (i = 0; i < g->num_commits; i++) {
struct commit *graph_commit;
- oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_len, i),
+ oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
the_repository->hash_algo);
if (i && oidcmp(&prev_oid, &cur_oid) >= 0)
@@ -2798,7 +2798,7 @@ static int verify_one_commit_graph(struct repository *r,
timestamp_t generation;
display_progress(progress, ++(*seen));
- oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_len, i),
+ oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
the_repository->hash_algo);
graph_commit = lookup_commit(r, &cur_oid);
diff --git a/commit-graph.h b/commit-graph.h
index b71cb55697..f20d28ff3a 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -84,7 +84,7 @@ struct commit_graph {
const unsigned char *data;
size_t data_len;
- unsigned char hash_len;
+ const struct git_hash_algo *hash_algo;
unsigned char num_chunks;
uint32_t num_commits;
struct object_id oid;
--
2.50.1.723.g3e08bea96f.dirty
* [PATCH 7/9] commit-graph: stop using `the_hash_algo`
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
` (5 preceding siblings ...)
2025-08-04 8:17 ` [PATCH 6/9] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
@ 2025-08-04 8:17 ` Patrick Steinhardt
2025-08-04 22:10 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 8/9] commit-graph: stop using `the_repository` Patrick Steinhardt
` (5 subsequent siblings)
12 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-04 8:17 UTC
To: git
Stop using `the_hash_algo` as it implicitly relies on `the_repository`.
Instead, we either use the hash algo provided via the context or, if
there is no such hash algo, we use `the_repository` explicitly. Such
uses will be removed in subsequent commits.
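The general pattern is to take the algorithm as a parameter rather than
read it from a global. Below is a minimal sketch of a header check written
that way. The struct, its `version` field (standing in for the result of
`oid_version()`), the function name, and the sample header bytes are
assumptions for illustration; only the position of the hash version at
byte 5 is taken from `parse_commit_graph()`.

```c
#include <assert.h>
#include <stddef.h>

struct hash_algo_sketch {
	size_t rawsz;
	unsigned char version; /* hypothetical stand-in for oid_version() */
};

#define GRAPH_HEADER_SIZE 8

/* Returns 1 if the header's hash version matches the caller's algorithm. */
static int header_matches_algo(const struct hash_algo_sketch *hash_algo,
			       const unsigned char *data, size_t size)
{
	if (!data || size < GRAPH_HEADER_SIZE)
		return 0;
	return data[5] == hash_algo->version;
}

void explicit_algo_example(void)
{
	/* Assumed version numbers: 1 for SHA-1, 2 for SHA-256. */
	const struct hash_algo_sketch sha1 = { .rawsz = 20, .version = 1 };
	const struct hash_algo_sketch sha256 = { .rawsz = 32, .version = 2 };
	/* Illustrative header: signature, file version, hash version 1. */
	const unsigned char header[GRAPH_HEADER_SIZE] = {
		'C', 'G', 'P', 'H', 1, 1, 0, 0
	};

	assert(header_matches_algo(&sha1, header, sizeof(header)));
	assert(!header_matches_algo(&sha256, header, sizeof(header)));
	assert(!header_matches_algo(&sha1, header, 4));
}
```

A function shaped like this can be exercised against any algorithm from a
test or fuzzer without first setting up global repository state.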
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/commit-graph.c | 3 ++-
commit-graph.c | 40 +++++++++++++++++++++-------------------
commit-graph.h | 4 +++-
oss-fuzz/fuzz-commit-graph.c | 4 +++-
4 files changed, 29 insertions(+), 22 deletions(-)
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 145802afb79..680b03a83a8 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -108,7 +108,8 @@ static int graph_verify(int argc, const char **argv, const char *prefix,
opened = OPENED_GRAPH;
else if (errno != ENOENT)
die_errno(_("Could not open commit-graph '%s'"), graph_name);
- else if (open_commit_graph_chain(chain_name, &fd, &st))
+ else if (open_commit_graph_chain(chain_name, &fd, &st,
+ the_repository->hash_algo))
opened = OPENED_CHAIN;
else if (errno != ENOENT)
die_errno(_("could not open commit-graph chain '%s'"), chain_name);
diff --git a/commit-graph.c b/commit-graph.c
index 9c2278dd7a1..b3feb6dfd77 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -263,7 +263,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_size = xsize_t(st->st_size);
- if (graph_size < graph_min_size(the_hash_algo)) {
+ if (graph_size < graph_min_size(r->hash_algo)) {
close(fd);
error(_("commit-graph file is too small"));
return NULL;
@@ -271,7 +271,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_map = xmmap(NULL, graph_size, PROT_READ, MAP_PRIVATE, fd, 0);
close(fd);
prepare_repo_settings(r);
- ret = parse_commit_graph(&r->settings, graph_map, graph_size);
+ ret = parse_commit_graph(&r->settings, r->hash_algo, graph_map, graph_size);
if (ret)
ret->odb_source = source;
@@ -319,7 +319,7 @@ static int graph_read_commit_data(const unsigned char *chunk_start,
size_t chunk_size, void *data)
{
struct commit_graph *g = data;
- if (chunk_size / graph_data_width(the_hash_algo) != g->num_commits)
+ if (chunk_size / graph_data_width(g->hash_algo) != g->num_commits)
return error(_("commit-graph commit data chunk is wrong size"));
g->chunk_commit_data = chunk_start;
return 0;
@@ -373,6 +373,7 @@ static int graph_read_bloom_data(const unsigned char *chunk_start,
}
struct commit_graph *parse_commit_graph(struct repo_settings *s,
+ const struct git_hash_algo *hash_algo,
void *graph_map, size_t graph_size)
{
const unsigned char *data;
@@ -384,7 +385,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
if (!graph_map)
return NULL;
- if (graph_size < graph_min_size(the_hash_algo))
+ if (graph_size < graph_min_size(hash_algo))
return NULL;
data = (const unsigned char *)graph_map;
@@ -404,22 +405,22 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
}
hash_version = *(unsigned char*)(data + 5);
- if (hash_version != oid_version(the_hash_algo)) {
+ if (hash_version != oid_version(hash_algo)) {
error(_("commit-graph hash version %X does not match version %X"),
- hash_version, oid_version(the_hash_algo));
+ hash_version, oid_version(hash_algo));
return NULL;
}
graph = alloc_commit_graph();
- graph->hash_algo = the_hash_algo;
+ graph->hash_algo = hash_algo;
graph->num_chunks = *(unsigned char*)(data + 6);
graph->data = graph_map;
graph->data_len = graph_size;
if (graph_size < GRAPH_HEADER_SIZE +
(graph->num_chunks + 1) * CHUNK_TOC_ENTRY_SIZE +
- GRAPH_FANOUT_SIZE + the_hash_algo->rawsz) {
+ GRAPH_FANOUT_SIZE + hash_algo->rawsz) {
error(_("commit-graph file is too small to hold %u chunks"),
graph->num_chunks);
free(graph);
@@ -618,7 +619,8 @@ static int add_graph_to_chain(struct commit_graph *g,
}
int open_commit_graph_chain(const char *chain_file,
- int *fd, struct stat *st)
+ int *fd, struct stat *st,
+ const struct git_hash_algo *hash_algo)
{
*fd = git_open(chain_file);
if (*fd < 0)
@@ -627,7 +629,7 @@ int open_commit_graph_chain(const char *chain_file,
close(*fd);
return 0;
}
- if (st->st_size < (ssize_t) the_hash_algo->hexsz) {
+ if (st->st_size < (ssize_t) hash_algo->hexsz) {
close(*fd);
if (!st->st_size) {
/* treat empty files the same as missing */
@@ -652,7 +654,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
FILE *fp = xfdopen(fd, "r");
size_t count;
- count = st->st_size / (the_hash_algo->hexsz + 1);
+ count = st->st_size / (r->hash_algo->hexsz + 1);
CALLOC_ARRAY(oids, count);
odb_prepare_alternates(r->objects);
@@ -714,7 +716,7 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r,
int fd;
struct commit_graph *g = NULL;
- if (open_commit_graph_chain(chain_file, &fd, &st)) {
+ if (open_commit_graph_chain(chain_file, &fd, &st, r->hash_algo)) {
int incomplete;
/* ownership of fd is taken over by load function */
g = load_commit_graph_chain_fd_st(r, fd, &st, &incomplete);
@@ -906,7 +908,7 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
die(_("invalid commit position. commit-graph is likely corrupt"));
lex_index = pos - g->num_commits_in_base;
- commit_data = g->chunk_commit_data + st_mult(graph_data_width(the_hash_algo), lex_index);
+ commit_data = g->chunk_commit_data + st_mult(graph_data_width(g->hash_algo), lex_index);
graph_data = commit_graph_data_at(item);
graph_data->graph_pos = pos;
@@ -1110,7 +1112,7 @@ static struct tree *load_tree_for_commit(struct repository *r,
g = g->base_graph;
commit_data = g->chunk_commit_data +
- st_mult(graph_data_width(the_hash_algo),
+ st_mult(graph_data_width(g->hash_algo),
graph_pos - g->num_commits_in_base);
oidread(&oid, commit_data, the_repository->hash_algo);
@@ -1220,7 +1222,7 @@ static int write_graph_chunk_oids(struct hashfile *f,
for (count = 0; count < ctx->commits.nr; count++, list++) {
display_progress(ctx->progress, ++ctx->progress_cnt);
- hashwrite(f, (*list)->object.oid.hash, the_hash_algo->rawsz);
+ hashwrite(f, (*list)->object.oid.hash, f->algop->rawsz);
}
return 0;
@@ -1251,7 +1253,7 @@ static int write_graph_chunk_data(struct hashfile *f,
die(_("unable to parse commit %s"),
oid_to_hex(&(*list)->object.oid));
tree = get_commit_tree_oid(*list);
- hashwrite(f, tree->hash, the_hash_algo->rawsz);
+ hashwrite(f, tree->hash, ctx->r->hash_algo->rawsz);
parent = (*list)->parents;
@@ -2033,7 +2035,7 @@ static size_t write_graph_chunk_base_1(struct hashfile *f,
return 0;
num = write_graph_chunk_base_1(f, g->base_graph);
- hashwrite(f, g->oid.hash, the_hash_algo->rawsz);
+ hashwrite(f, g->oid.hash, g->hash_algo->rawsz);
return num + 1;
}
@@ -2057,7 +2059,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
struct hashfile *f;
struct tempfile *graph_layer; /* when ctx->split is non-zero */
struct lock_file lk = LOCK_INIT;
- const unsigned hashsz = the_hash_algo->rawsz;
+ const unsigned hashsz = ctx->r->hash_algo->rawsz;
struct strbuf progress_title = STRBUF_INIT;
struct chunkfile *cf;
unsigned char file_hash[GIT_MAX_RAWSZ];
@@ -2145,7 +2147,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
hashwrite_be32(f, GRAPH_SIGNATURE);
hashwrite_u8(f, GRAPH_VERSION);
- hashwrite_u8(f, oid_version(the_hash_algo));
+ hashwrite_u8(f, oid_version(ctx->r->hash_algo));
hashwrite_u8(f, get_num_chunks(cf));
hashwrite_u8(f, ctx->num_commit_graphs_after - 1);
diff --git a/commit-graph.h b/commit-graph.h
index f20d28ff3a0..f26881849d6 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -32,7 +32,8 @@ struct string_list;
char *get_commit_graph_filename(struct odb_source *source);
char *get_commit_graph_chain_filename(struct odb_source *source);
int open_commit_graph(const char *graph_file, int *fd, struct stat *st);
-int open_commit_graph_chain(const char *chain_file, int *fd, struct stat *st);
+int open_commit_graph_chain(const char *chain_file, int *fd, struct stat *st,
+ const struct git_hash_algo *hash_algo);
/*
* Given a commit struct, try to fill the commit struct info, including:
@@ -129,6 +130,7 @@ struct repo_settings;
* prior to calling parse_commit_graph().
*/
struct commit_graph *parse_commit_graph(struct repo_settings *s,
+ const struct git_hash_algo *hash_algo,
void *graph_map, size_t graph_size);
/*
diff --git a/oss-fuzz/fuzz-commit-graph.c b/oss-fuzz/fuzz-commit-graph.c
index fbb77fec197..879072f9d3c 100644
--- a/oss-fuzz/fuzz-commit-graph.c
+++ b/oss-fuzz/fuzz-commit-graph.c
@@ -5,6 +5,7 @@
#include "repository.h"
struct commit_graph *parse_commit_graph(struct repo_settings *s,
+ const struct git_hash_algo *hash_algo,
void *graph_map, size_t graph_size);
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size);
@@ -24,7 +25,8 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
the_repository->settings.commit_graph_generation_version = 2;
the_repository->settings.commit_graph_changed_paths_version = 1;
- g = parse_commit_graph(&the_repository->settings, (void *)data, size);
+ g = parse_commit_graph(&the_repository->settings, the_repository->hash_algo,
+ (void *)data, size);
repo_clear(the_repository);
free_commit_graph(g);
--
2.50.1.723.g3e08bea96f.dirty
* [PATCH 8/9] commit-graph: stop using `the_repository`
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
` (6 preceding siblings ...)
2025-08-04 8:17 ` [PATCH 7/9] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
@ 2025-08-04 8:17 ` Patrick Steinhardt
2025-08-04 22:11 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 9/9] commit-graph: stop passing in redundant repository Patrick Steinhardt
` (4 subsequent siblings)
12 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-04 8:17 UTC (permalink / raw)
To: git
There are still several uses of `the_repository` in "commit-graph.c",
which we want to eliminate because it is a global variable. Refactor
the code to use the repository provided via the calling context
instead.
This allows us to drop the `USE_THE_REPOSITORY_VARIABLE` macro.
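One place where the calling context has to be extended is a callback: the
diff adds a `repo` member to `struct refs_cb_data` so that
`add_ref_to_set()` no longer reaches for the global. A minimal sketch of
that technique, assuming simplified types, might look as follows. All
names here (`struct repo`, `struct refs_cb_data_sketch`, `count_ref()`,
`for_each_ref_sketch()`) are hypothetical and do not match git's real
ref iteration API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a repository. */
struct repo {
	const char *name;
	size_t refs_seen;
};

/* Callback payload that carries the repository alongside its results. */
struct refs_cb_data_sketch {
	struct repo *repo;
	size_t commits;
};

typedef int (*each_ref_fn)(const char *refname, void *cb_data);

/* The callback only touches the repository handed in via the payload. */
static int count_ref(const char *refname, void *cb_data)
{
	struct refs_cb_data_sketch *data = cb_data;

	assert(data && data->repo);
	(void)refname; /* unused in this sketch */
	data->repo->refs_seen++;
	data->commits++;
	return 0;
}

/* Invoke `fn` for each ref name, stopping at the first non-zero result. */
static int for_each_ref_sketch(const char *const *refs, size_t nr,
			       each_ref_fn fn, void *cb_data)
{
	for (size_t i = 0; i < nr; i++) {
		int ret = fn(refs[i], cb_data);
		if (ret)
			return ret;
	}
	return 0;
}
```

Because the repository travels in the payload, iterating the refs of one
repository leaves every other repository in the process untouched.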
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/commit.c | 2 +-
builtin/merge.c | 2 +-
commit-graph.c | 79 ++++++++++++++++++++++++++++----------------------------
commit-graph.h | 2 +-
4 files changed, 43 insertions(+), 42 deletions(-)
diff --git a/builtin/commit.c b/builtin/commit.c
index 63e7158e98..8ca0aede48 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1933,7 +1933,7 @@ int cmd_commit(int argc,
"new index file. Check that disk is not full and quota is\n"
"not exceeded, and then \"git restore --staged :/\" to recover."));
- git_test_write_commit_graph_or_die();
+ git_test_write_commit_graph_or_die(the_repository->objects->sources);
repo_rerere(the_repository, 0);
run_auto_maintenance(quiet);
diff --git a/builtin/merge.c b/builtin/merge.c
index 18b22c0a26..263cb58471 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1862,7 +1862,7 @@ int cmd_merge(int argc,
if (squash) {
finish(head_commit, remoteheads, NULL, NULL);
- git_test_write_commit_graph_or_die();
+ git_test_write_commit_graph_or_die(the_repository->objects->sources);
} else
write_merge_state(remoteheads);
diff --git a/commit-graph.c b/commit-graph.c
index b3feb6dfd7..7371db9702 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1,5 +1,3 @@
-#define USE_THE_REPOSITORY_VARIABLE
-
#include "git-compat-util.h"
#include "config.h"
#include "csum-file.h"
@@ -27,7 +25,7 @@
#include "tree.h"
#include "chunk-format.h"
-void git_test_write_commit_graph_or_die(void)
+void git_test_write_commit_graph_or_die(struct odb_source *source)
{
int flags = 0;
if (!git_env_bool(GIT_TEST_COMMIT_GRAPH, 0))
@@ -36,8 +34,7 @@ void git_test_write_commit_graph_or_die(void)
if (git_env_bool(GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS, 0))
flags = COMMIT_GRAPH_WRITE_BLOOM_FILTERS;
- if (write_commit_graph_reachable(the_repository->objects->sources,
- flags, NULL))
+ if (write_commit_graph_reachable(source, flags, NULL))
die("failed to write commit-graph under GIT_TEST_COMMIT_GRAPH");
}
@@ -479,7 +476,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
}
oidread(&graph->oid, graph->data + graph->data_len - graph->hash_algo->rawsz,
- the_repository->hash_algo);
+ hash_algo);
free_chunkfile(cf);
return graph;
@@ -595,7 +592,7 @@ static int add_graph_to_chain(struct commit_graph *g,
if (!cur_g ||
!oideq(&oids[n], &cur_g->oid) ||
!hasheq(oids[n].hash, g->chunk_base_graphs + st_mult(g->hash_algo->rawsz, n),
- the_repository->hash_algo)) {
+ g->hash_algo)) {
warning(_("commit-graph chain does not match"));
return 0;
}
@@ -665,7 +662,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
if (strbuf_getline_lf(&line, fp) == EOF)
break;
- if (get_oid_hex(line.buf, &oids[i])) {
+ if (get_oid_hex_algop(line.buf, &oids[i], r->hash_algo)) {
warning(_("invalid commit-graph chain: line '%s' not a hash"),
line.buf);
valid = 0;
@@ -751,7 +748,7 @@ static void prepare_commit_graph_one(struct repository *r,
* Return 1 if commit_graph is non-NULL, and 0 otherwise.
*
* On the first invocation, this function attempts to load the commit
- * graph if the_repository is configured to have one.
+ * graph if the repository is configured to have one.
*/
static int prepare_commit_graph(struct repository *r)
{
@@ -872,7 +869,7 @@ static void load_oid_from_graph(struct commit_graph *g,
lex_index = pos - g->num_commits_in_base;
oidread(oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, lex_index),
- the_repository->hash_algo);
+ g->hash_algo);
}
static struct commit_list **insert_parent_or_die(struct repository *r,
@@ -1115,7 +1112,7 @@ static struct tree *load_tree_for_commit(struct repository *r,
st_mult(graph_data_width(g->hash_algo),
graph_pos - g->num_commits_in_base);
- oidread(&oid, commit_data, the_repository->hash_algo);
+ oidread(&oid, commit_data, g->hash_algo);
set_commit_tree(c, lookup_tree(r, &oid));
return c->maybe_tree;
@@ -1543,7 +1540,7 @@ static void close_reachable(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Loading known commits in commit graph"),
ctx->oids.nr);
for (i = 0; i < ctx->oids.nr; i++) {
@@ -1561,7 +1558,7 @@ static void close_reachable(struct write_commit_graph_context *ctx)
*/
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Expanding reachable commits in commit graph"),
0);
for (i = 0; i < ctx->oids.nr; i++) {
@@ -1582,7 +1579,7 @@ static void close_reachable(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Clearing commit marks in commit graph"),
ctx->oids.nr);
for (i = 0; i < ctx->oids.nr; i++) {
@@ -1699,7 +1696,7 @@ static void compute_topological_levels(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
info.progress = ctx->progress
= start_delayed_progress(
- the_repository,
+ ctx->r,
_("Computing commit graph topological levels"),
ctx->commits.nr);
@@ -1734,7 +1731,7 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
info.progress = ctx->progress
= start_delayed_progress(
- the_repository,
+ ctx->r,
_("Computing commit graph generation numbers"),
ctx->commits.nr);
@@ -1811,7 +1808,7 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Computing commit changed paths Bloom filters"),
ctx->commits.nr);
@@ -1857,6 +1854,7 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
}
struct refs_cb_data {
+ struct repository *repo;
struct oidset *commits;
struct progress *progress;
};
@@ -1869,9 +1867,9 @@ static int add_ref_to_set(const char *refname UNUSED,
struct object_id peeled;
struct refs_cb_data *data = (struct refs_cb_data *)cb_data;
- if (!peel_iterated_oid(the_repository, oid, &peeled))
+ if (!peel_iterated_oid(data->repo, oid, &peeled))
oid = &peeled;
- if (odb_read_object_info(the_repository->objects, oid, NULL) == OBJ_COMMIT)
+ if (odb_read_object_info(data->repo->objects, oid, NULL) == OBJ_COMMIT)
oidset_insert(data->commits, oid);
display_progress(data->progress, oidset_size(data->commits));
@@ -1888,13 +1886,15 @@ int write_commit_graph_reachable(struct odb_source *source,
int result;
memset(&data, 0, sizeof(data));
+ data.repo = source->odb->repo;
data.commits = &commits;
+
if (flags & COMMIT_GRAPH_WRITE_PROGRESS)
data.progress = start_delayed_progress(
- the_repository,
+ source->odb->repo,
_("Collecting referenced commits"), 0);
- refs_for_each_ref(get_main_ref_store(the_repository), add_ref_to_set,
+ refs_for_each_ref(get_main_ref_store(source->odb->repo), add_ref_to_set,
&data);
stop_progress(&data.progress);
@@ -1923,7 +1923,7 @@ static int fill_oids_from_packs(struct write_commit_graph_context *ctx,
"Finding commits for commit graph in %"PRIuMAX" packs",
pack_indexes->nr),
(uintmax_t)pack_indexes->nr);
- ctx->progress = start_delayed_progress(the_repository,
+ ctx->progress = start_delayed_progress(ctx->r,
progress_title.buf, 0);
ctx->progress_done = 0;
}
@@ -1977,7 +1977,7 @@ static void fill_oids_from_all_packs(struct write_commit_graph_context *ctx)
{
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Finding commits for commit graph among packed objects"),
ctx->approx_nr_objects);
for_each_packed_object(ctx->r, add_packed_commits, ctx,
@@ -1996,7 +1996,7 @@ static void copy_oids_to_commits(struct write_commit_graph_context *ctx)
ctx->num_extra_edges = 0;
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Finding extra edges in commit graph"),
ctx->oids.nr);
oid_array_sort(&ctx->oids);
@@ -2075,7 +2075,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
ctx->graph_name = get_commit_graph_filename(ctx->odb_source);
}
- if (safe_create_leading_directories(the_repository, ctx->graph_name)) {
+ if (safe_create_leading_directories(ctx->r, ctx->graph_name)) {
error(_("unable to create leading directories of %s"),
ctx->graph_name);
return -1;
@@ -2094,18 +2094,18 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
return -1;
}
- if (adjust_shared_perm(the_repository, get_tempfile_path(graph_layer))) {
+ if (adjust_shared_perm(ctx->r, get_tempfile_path(graph_layer))) {
error(_("unable to adjust shared permissions for '%s'"),
get_tempfile_path(graph_layer));
return -1;
}
- f = hashfd(the_repository->hash_algo,
+ f = hashfd(ctx->r->hash_algo,
get_tempfile_fd(graph_layer), get_tempfile_path(graph_layer));
} else {
hold_lock_file_for_update_mode(&lk, ctx->graph_name,
LOCK_DIE_ON_ERROR, 0444);
- f = hashfd(the_repository->hash_algo,
+ f = hashfd(ctx->r->hash_algo,
get_lock_file_fd(&lk), get_lock_file_path(&lk));
}
@@ -2158,7 +2158,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
get_num_chunks(cf)),
get_num_chunks(cf));
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
progress_title.buf,
st_mult(get_num_chunks(cf), ctx->commits.nr));
}
@@ -2216,7 +2216,8 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
}
free(ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1]);
- ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1] = xstrdup(hash_to_hex(file_hash));
+ ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1] =
+ xstrdup(hash_to_hex_algop(file_hash, ctx->r->hash_algo));
final_graph_name = get_split_graph_filename(ctx->odb_source,
ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1]);
free(ctx->commit_graph_filenames_after[ctx->num_commit_graphs_after - 1]);
@@ -2370,7 +2371,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Scanning merged commits"),
ctx->commits.nr);
@@ -2415,7 +2416,7 @@ static void merge_commit_graphs(struct write_commit_graph_context *ctx)
current_graph_number--;
if (ctx->report_progress)
- ctx->progress = start_delayed_progress(the_repository,
+ ctx->progress = start_delayed_progress(ctx->r,
_("Merging commit-graph"), 0);
merge_commit_graph(ctx, g);
@@ -2518,7 +2519,7 @@ int write_commit_graph(struct odb_source *source,
enum commit_graph_write_flags flags,
const struct commit_graph_opts *opts)
{
- struct repository *r = the_repository;
+ struct repository *r = source->odb->repo;
struct write_commit_graph_context ctx = {
.r = r,
.odb_source = source,
@@ -2618,14 +2619,14 @@ int write_commit_graph(struct odb_source *source,
replace = ctx.opts->split_flags & COMMIT_GRAPH_SPLIT_REPLACE;
}
- ctx.approx_nr_objects = repo_approximate_object_count(the_repository);
+ ctx.approx_nr_objects = repo_approximate_object_count(r);
if (ctx.append && ctx.r->objects->commit_graph) {
struct commit_graph *g = ctx.r->objects->commit_graph;
for (i = 0; i < g->num_commits; i++) {
struct object_id oid;
oidread(&oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
- the_repository->hash_algo);
+ r->hash_algo);
oid_array_append(&ctx.oids, &oid);
}
}
@@ -2733,7 +2734,7 @@ static void graph_report(const char *fmt, ...)
static int commit_graph_checksum_valid(struct commit_graph *g)
{
- return hashfile_checksum_valid(the_repository->hash_algo,
+ return hashfile_checksum_valid(g->hash_algo,
g->data, g->data_len);
}
@@ -2756,7 +2757,7 @@ static int verify_one_commit_graph(struct repository *r,
struct commit *graph_commit;
oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
- the_repository->hash_algo);
+ g->hash_algo);
if (i && oidcmp(&prev_oid, &cur_oid) >= 0)
graph_report(_("commit-graph has incorrect OID order: %s then %s"),
@@ -2801,7 +2802,7 @@ static int verify_one_commit_graph(struct repository *r,
display_progress(progress, ++(*seen));
oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
- the_repository->hash_algo);
+ g->hash_algo);
graph_commit = lookup_commit(r, &cur_oid);
odb_commit = (struct commit *)create_object(r, &cur_oid, alloc_commit_node(r));
@@ -2905,7 +2906,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
if (!(flags & COMMIT_GRAPH_VERIFY_SHALLOW))
total += g->num_commits_in_base;
- progress = start_progress(the_repository,
+ progress = start_progress(r,
_("Verifying commits in commit graph"),
total);
}
diff --git a/commit-graph.h b/commit-graph.h
index f26881849d..24c1aca69e 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -21,7 +21,7 @@
* call this method oustide of a builtin, and only if you know what
* you are doing!
*/
-void git_test_write_commit_graph_or_die(void);
+void git_test_write_commit_graph_or_die(struct odb_source *source);
struct commit;
struct bloom_filter_settings;
--
2.50.1.723.g3e08bea96f.dirty
* [PATCH 9/9] commit-graph: stop passing in redundant repository
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
` (7 preceding siblings ...)
2025-08-04 8:17 ` [PATCH 8/9] commit-graph: stop using `the_repository` Patrick Steinhardt
@ 2025-08-04 8:17 ` Patrick Steinhardt
2025-08-05 4:27 ` [PATCH 0/9] commit-graph: remove reliance on global state Derrick Stolee
` (3 subsequent siblings)
12 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-04 8:17 UTC (permalink / raw)
To: git
Many of the commit-graph related functions take in both a repository and
the object database source (directly or via `struct commit_graph`) for
which we are supposed to load such a commit-graph. In the best case this
information is simply redundant as the source already contains a
reference to its owning object database, which in turn has a reference
to its repository. In the worst case this information could even
mismatch when passing in a source that doesn't belong to the same
repository.
Refactor the code so that we only pass in the object database source in
those cases.
There is one exception though, namely `load_commit_graph_chain_fd_st()`,
which is responsible for loading a commit-graph chain. Parts of a
commit-graph chain may be located in a different object database source
than the chain file itself. Consequently, this function operates on the
whole object database rather than on a single source.
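The ownership chain described above can be sketched as follows. This is
only an illustration under simplifying assumptions: the `*_sketch`
structs are hypothetical reductions of the real types, keeping just the
`odb`, `repo`, `sources` and `next` links that the diff dereferences, and
the helper functions are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced versions of the structures involved. */
struct repo_sketch {
	int id;
};

struct source_sketch;

struct odb_sketch {
	struct repo_sketch *repo;      /* owning repository */
	struct source_sketch *sources; /* linked list of sources */
};

struct source_sketch {
	struct odb_sketch *odb;        /* owning object database */
	struct source_sketch *next;
};

/* The repository is derivable from the source, so no extra parameter. */
static struct repo_sketch *source_repo(const struct source_sketch *source)
{
	assert(source && source->odb);
	return source->odb->repo;
}

/* The inconsistency that a redundant repository parameter makes possible. */
static int params_consistent(const struct repo_sketch *r,
			     const struct source_sketch *source)
{
	return source_repo(source) == r;
}

/* A chain loader has to consider every source of the database. */
static size_t odb_count_sources(const struct odb_sketch *odb)
{
	size_t n = 0;

	assert(odb);
	for (const struct source_sketch *s = odb->sources; s; s = s->next)
		n++;
	return n;
}
```

Taking only the source removes the possibility of the mismatch that
`params_consistent()` detects, while the chain loader takes the database
because it must be able to walk all sources.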
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/commit-graph.c | 6 +--
commit-graph.c | 123 +++++++++++++++++++--------------------------
commit-graph.h | 12 ++---
t/helper/test-read-graph.c | 2 +-
4 files changed, 61 insertions(+), 82 deletions(-)
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 680b03a83a..1b80993b2d 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -121,15 +121,15 @@ static int graph_verify(int argc, const char **argv, const char *prefix,
if (opened == OPENED_NONE)
return 0;
else if (opened == OPENED_GRAPH)
- graph = load_commit_graph_one_fd_st(the_repository, fd, &st, source);
+ graph = load_commit_graph_one_fd_st(source, fd, &st);
else
- graph = load_commit_graph_chain_fd_st(the_repository, fd, &st,
+ graph = load_commit_graph_chain_fd_st(the_repository->objects, fd, &st,
&incomplete_chain);
if (!graph)
return 1;
- ret = verify_commit_graph(the_repository, graph, flags);
+ ret = verify_commit_graph(graph, flags);
free_commit_graph(graph);
if (incomplete_chain) {
diff --git a/commit-graph.c b/commit-graph.c
index 7371db9702..308f27046c 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -250,9 +250,8 @@ int open_commit_graph(const char *graph_file, int *fd, struct stat *st)
return 1;
}
-struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
- int fd, struct stat *st,
- struct odb_source *source)
+struct commit_graph *load_commit_graph_one_fd_st(struct odb_source *source,
+ int fd, struct stat *st)
{
void *graph_map;
size_t graph_size;
@@ -260,15 +259,16 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_size = xsize_t(st->st_size);
- if (graph_size < graph_min_size(r->hash_algo)) {
+ if (graph_size < graph_min_size(source->odb->repo->hash_algo)) {
close(fd);
error(_("commit-graph file is too small"));
return NULL;
}
graph_map = xmmap(NULL, graph_size, PROT_READ, MAP_PRIVATE, fd, 0);
close(fd);
- prepare_repo_settings(r);
- ret = parse_commit_graph(&r->settings, r->hash_algo, graph_map, graph_size);
+ prepare_repo_settings(source->odb->repo);
+ ret = parse_commit_graph(&source->odb->repo->settings, source->odb->repo->hash_algo,
+ graph_map, graph_size);
if (ret)
ret->odb_source = source;
@@ -488,11 +488,9 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
return NULL;
}
-static struct commit_graph *load_commit_graph_one(struct repository *r,
- const char *graph_file,
- struct odb_source *source)
+static struct commit_graph *load_commit_graph_one(struct odb_source *source,
+ const char *graph_file)
{
-
struct stat st;
int fd;
struct commit_graph *g;
@@ -501,19 +499,17 @@ static struct commit_graph *load_commit_graph_one(struct repository *r,
if (!open_ok)
return NULL;
- g = load_commit_graph_one_fd_st(r, fd, &st, source);
-
+ g = load_commit_graph_one_fd_st(source, fd, &st);
if (g)
g->filename = xstrdup(graph_file);
return g;
}
-static struct commit_graph *load_commit_graph_v1(struct repository *r,
- struct odb_source *source)
+static struct commit_graph *load_commit_graph_v1(struct odb_source *source)
{
char *graph_name = get_commit_graph_filename(source);
- struct commit_graph *g = load_commit_graph_one(r, graph_name, source);
+ struct commit_graph *g = load_commit_graph_one(source, graph_name);
free(graph_name);
return g;
@@ -640,7 +636,7 @@ int open_commit_graph_chain(const char *chain_file,
return 1;
}
-struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+struct commit_graph *load_commit_graph_chain_fd_st(struct object_database *odb,
int fd, struct stat *st,
int *incomplete_chain)
{
@@ -651,10 +647,10 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
FILE *fp = xfdopen(fd, "r");
size_t count;
- count = st->st_size / (r->hash_algo->hexsz + 1);
+ count = st->st_size / (odb->repo->hash_algo->hexsz + 1);
CALLOC_ARRAY(oids, count);
- odb_prepare_alternates(r->objects);
+ odb_prepare_alternates(odb);
for (size_t i = 0; i < count; i++) {
struct odb_source *source;
@@ -662,7 +658,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
if (strbuf_getline_lf(&line, fp) == EOF)
break;
- if (get_oid_hex_algop(line.buf, &oids[i], r->hash_algo)) {
+ if (get_oid_hex_algop(line.buf, &oids[i], odb->repo->hash_algo)) {
warning(_("invalid commit-graph chain: line '%s' not a hash"),
line.buf);
valid = 0;
@@ -670,9 +666,9 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
}
valid = 0;
- for (source = r->objects->sources; source; source = source->next) {
+ for (source = odb->sources; source; source = source->next) {
char *graph_name = get_split_graph_filename(source, line.buf);
- struct commit_graph *g = load_commit_graph_one(r, graph_name, source);
+ struct commit_graph *g = load_commit_graph_one(source, graph_name);
free(graph_name);
@@ -705,45 +701,33 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
return graph_chain;
}
-static struct commit_graph *load_commit_graph_chain(struct repository *r,
- struct odb_source *source)
+static struct commit_graph *load_commit_graph_chain(struct odb_source *source)
{
char *chain_file = get_commit_graph_chain_filename(source);
struct stat st;
int fd;
struct commit_graph *g = NULL;
- if (open_commit_graph_chain(chain_file, &fd, &st, r->hash_algo)) {
+ if (open_commit_graph_chain(chain_file, &fd, &st, source->odb->repo->hash_algo)) {
int incomplete;
/* ownership of fd is taken over by load function */
- g = load_commit_graph_chain_fd_st(r, fd, &st, &incomplete);
+ g = load_commit_graph_chain_fd_st(source->odb, fd, &st, &incomplete);
}
free(chain_file);
return g;
}
-struct commit_graph *read_commit_graph_one(struct repository *r,
- struct odb_source *source)
+struct commit_graph *read_commit_graph_one(struct odb_source *source)
{
- struct commit_graph *g = load_commit_graph_v1(r, source);
+ struct commit_graph *g = load_commit_graph_v1(source);
if (!g)
- g = load_commit_graph_chain(r, source);
+ g = load_commit_graph_chain(source);
return g;
}
-static void prepare_commit_graph_one(struct repository *r,
- struct odb_source *source)
-{
-
- if (r->objects->commit_graph)
- return;
-
- r->objects->commit_graph = read_commit_graph_one(r, source);
-}
-
/*
* Return 1 if commit_graph is non-NULL, and 0 otherwise.
*
@@ -784,10 +768,12 @@ static int prepare_commit_graph(struct repository *r)
return 0;
odb_prepare_alternates(r->objects);
- for (source = r->objects->sources;
- !r->objects->commit_graph && source;
- source = source->next)
- prepare_commit_graph_one(r, source);
+ for (source = r->objects->sources; source; source = source->next) {
+ r->objects->commit_graph = read_commit_graph_one(source);
+ if (r->objects->commit_graph)
+ break;
+ }
+
return !!r->objects->commit_graph;
}
@@ -872,8 +858,7 @@ static void load_oid_from_graph(struct commit_graph *g,
g->hash_algo);
}
-static struct commit_list **insert_parent_or_die(struct repository *r,
- struct commit_graph *g,
+static struct commit_list **insert_parent_or_die(struct commit_graph *g,
uint32_t pos,
struct commit_list **pptr)
{
@@ -884,7 +869,7 @@ static struct commit_list **insert_parent_or_die(struct repository *r,
die("invalid parent position %"PRIu32, pos);
load_oid_from_graph(g, pos, &oid);
- c = lookup_commit(r, &oid);
+ c = lookup_commit(g->odb_source->odb->repo, &oid);
if (!c)
die(_("could not find commit %s"), oid_to_hex(&oid));
commit_graph_data_at(c)->graph_pos = pos;
@@ -940,8 +925,7 @@ static inline void set_commit_tree(struct commit *c, struct tree *t)
c->maybe_tree = t;
}
-static int fill_commit_in_graph(struct repository *r,
- struct commit *item,
+static int fill_commit_in_graph(struct commit *item,
struct commit_graph *g, uint32_t pos)
{
uint32_t edge_value;
@@ -967,13 +951,13 @@ static int fill_commit_in_graph(struct repository *r,
edge_value = get_be32(commit_data + g->hash_algo->rawsz);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
- pptr = insert_parent_or_die(r, g, edge_value, pptr);
+ pptr = insert_parent_or_die(g, edge_value, pptr);
edge_value = get_be32(commit_data + g->hash_algo->rawsz + 4);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
if (!(edge_value & GRAPH_EXTRA_EDGES_NEEDED)) {
- pptr = insert_parent_or_die(r, g, edge_value, pptr);
+ pptr = insert_parent_or_die(g, edge_value, pptr);
return 1;
}
@@ -988,7 +972,7 @@ static int fill_commit_in_graph(struct repository *r,
}
edge_value = get_be32(g->chunk_extra_edges +
sizeof(uint32_t) * parent_data_pos);
- pptr = insert_parent_or_die(r, g,
+ pptr = insert_parent_or_die(g,
edge_value & GRAPH_EDGE_LAST_MASK,
pptr);
parent_data_pos++;
@@ -1054,14 +1038,13 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
if (commit->object.parsed)
return commit;
- if (!fill_commit_in_graph(repo, commit, repo->objects->commit_graph, pos))
+ if (!fill_commit_in_graph(commit, repo->objects->commit_graph, pos))
return NULL;
return commit;
}
-static int parse_commit_in_graph_one(struct repository *r,
- struct commit_graph *g,
+static int parse_commit_in_graph_one(struct commit_graph *g,
struct commit *item)
{
uint32_t pos;
@@ -1070,7 +1053,7 @@ static int parse_commit_in_graph_one(struct repository *r,
return 1;
if (find_commit_pos_in_graph(item, g, &pos))
- return fill_commit_in_graph(r, item, g, pos);
+ return fill_commit_in_graph(item, g, pos);
return 0;
}
@@ -1087,7 +1070,7 @@ int parse_commit_in_graph(struct repository *r, struct commit *item)
if (!prepare_commit_graph(r))
return 0;
- return parse_commit_in_graph_one(r, r->objects->commit_graph, item);
+ return parse_commit_in_graph_one(r->objects->commit_graph, item);
}
void load_commit_graph_info(struct repository *r, struct commit *item)
@@ -1097,8 +1080,7 @@ void load_commit_graph_info(struct repository *r, struct commit *item)
fill_commit_graph_info(item, r->objects->commit_graph, pos);
}
-static struct tree *load_tree_for_commit(struct repository *r,
- struct commit_graph *g,
+static struct tree *load_tree_for_commit(struct commit_graph *g,
struct commit *c)
{
struct object_id oid;
@@ -1113,13 +1095,12 @@ static struct tree *load_tree_for_commit(struct repository *r,
graph_pos - g->num_commits_in_base);
oidread(&oid, commit_data, g->hash_algo);
- set_commit_tree(c, lookup_tree(r, &oid));
+ set_commit_tree(c, lookup_tree(g->odb_source->odb->repo, &oid));
return c->maybe_tree;
}
-static struct tree *get_commit_tree_in_graph_one(struct repository *r,
- struct commit_graph *g,
+static struct tree *get_commit_tree_in_graph_one(struct commit_graph *g,
const struct commit *c)
{
if (c->maybe_tree)
@@ -1127,12 +1108,12 @@ static struct tree *get_commit_tree_in_graph_one(struct repository *r,
if (commit_graph_position(c) == COMMIT_NOT_FROM_GRAPH)
BUG("get_commit_tree_in_graph_one called from non-commit-graph commit");
- return load_tree_for_commit(r, g, (struct commit *)c);
+ return load_tree_for_commit(g, (struct commit *)c);
}
struct tree *get_commit_tree_in_graph(struct repository *r, const struct commit *c)
{
- return get_commit_tree_in_graph_one(r, r->objects->commit_graph, c);
+ return get_commit_tree_in_graph_one(r->objects->commit_graph, c);
}
struct packed_commit_list {
@@ -2738,11 +2719,11 @@ static int commit_graph_checksum_valid(struct commit_graph *g)
g->data, g->data_len);
}
-static int verify_one_commit_graph(struct repository *r,
- struct commit_graph *g,
+static int verify_one_commit_graph(struct commit_graph *g,
struct progress *progress,
uint64_t *seen)
{
+ struct repository *r = g->odb_source->odb->repo;
uint32_t i, cur_fanout_pos = 0;
struct object_id prev_oid, cur_oid;
struct commit *seen_gen_zero = NULL;
@@ -2776,7 +2757,7 @@ static int verify_one_commit_graph(struct repository *r,
}
graph_commit = lookup_commit(r, &cur_oid);
- if (!parse_commit_in_graph_one(r, g, graph_commit))
+ if (!parse_commit_in_graph_one(g, graph_commit))
graph_report(_("failed to parse commit %s from commit-graph"),
oid_to_hex(&cur_oid));
}
@@ -2812,7 +2793,7 @@ static int verify_one_commit_graph(struct repository *r,
continue;
}
- if (!oideq(&get_commit_tree_in_graph_one(r, g, graph_commit)->object.oid,
+ if (!oideq(&get_commit_tree_in_graph_one(g, graph_commit)->object.oid,
get_commit_tree_oid(odb_commit)))
graph_report(_("root tree OID for commit %s in commit-graph is %s != %s"),
oid_to_hex(&cur_oid),
@@ -2830,7 +2811,7 @@ static int verify_one_commit_graph(struct repository *r,
}
/* parse parent in case it is in a base graph */
- parse_commit_in_graph_one(r, g, graph_parents->item);
+ parse_commit_in_graph_one(g, graph_parents->item);
if (!oideq(&graph_parents->item->object.oid, &odb_parents->item->object.oid))
graph_report(_("commit-graph parent for %s is %s != %s"),
@@ -2890,7 +2871,7 @@ static int verify_one_commit_graph(struct repository *r,
return verify_commit_graph_error;
}
-int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
+int verify_commit_graph(struct commit_graph *g, int flags)
{
struct progress *progress = NULL;
int local_error = 0;
@@ -2906,13 +2887,13 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
if (!(flags & COMMIT_GRAPH_VERIFY_SHALLOW))
total += g->num_commits_in_base;
- progress = start_progress(r,
+ progress = start_progress(g->odb_source->odb->repo,
_("Verifying commits in commit graph"),
total);
}
for (; g; g = g->base_graph) {
- local_error |= verify_one_commit_graph(r, g, progress, &seen);
+ local_error |= verify_one_commit_graph(g, progress, &seen);
if (flags & COMMIT_GRAPH_VERIFY_SHALLOW)
break;
}
diff --git a/commit-graph.h b/commit-graph.h
index 24c1aca69e..c7970be661 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -114,14 +114,12 @@ struct commit_graph {
struct bloom_filter_settings *bloom_filter_settings;
};
-struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
- int fd, struct stat *st,
- struct odb_source *source);
-struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+struct commit_graph *load_commit_graph_one_fd_st(struct odb_source *source,
+ int fd, struct stat *st);
+struct commit_graph *load_commit_graph_chain_fd_st(struct object_database *odb,
int fd, struct stat *st,
int *incomplete_chain);
-struct commit_graph *read_commit_graph_one(struct repository *r,
- struct odb_source *source);
+struct commit_graph *read_commit_graph_one(struct odb_source *source);
struct repo_settings;
@@ -186,7 +184,7 @@ int write_commit_graph(struct odb_source *source,
#define COMMIT_GRAPH_VERIFY_SHALLOW (1 << 0)
-int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags);
+int verify_commit_graph(struct commit_graph *g, int flags);
void close_commit_graph(struct object_database *);
void free_commit_graph(struct commit_graph *);
diff --git a/t/helper/test-read-graph.c b/t/helper/test-read-graph.c
index ef5339bbee..6a5f64e473 100644
--- a/t/helper/test-read-graph.c
+++ b/t/helper/test-read-graph.c
@@ -81,7 +81,7 @@ int cmd__read_graph(int argc, const char **argv)
prepare_repo_settings(the_repository);
- graph = read_commit_graph_one(the_repository, source);
+ graph = read_commit_graph_one(source);
if (!graph) {
ret = 1;
goto done;
--
2.50.1.723.g3e08bea96f.dirty
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters
2025-08-04 8:17 ` [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters Patrick Steinhardt
@ 2025-08-04 9:13 ` Oswald Buddenhagen
2025-08-04 11:18 ` Patrick Steinhardt
2025-08-04 21:42 ` Taylor Blau
1 sibling, 1 reply; 69+ messages in thread
From: Oswald Buddenhagen @ 2025-08-04 9:13 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Aug 04, 2025 at 10:17:18AM +0200, Patrick Steinhardt wrote:
>When writing a new commit graph we have a couple of counters that
>provide statistics around what kind of bloom filters we have or have not
>written. These counters naturally count from zero and are only ever
>incremented, but they use a signed integer as type regardless.
>
>Refactor those fields to be of type `size_t` instead.
>
mind elaborating on that choice?
it feels like abuse at the semantic level, and it increases the data
size on lp64 platforms. is it even compatible with OPT_UNSIGNED (in
later commits)? that would be unexpected ...
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters
2025-08-04 9:13 ` Oswald Buddenhagen
@ 2025-08-04 11:18 ` Patrick Steinhardt
2025-08-04 18:34 ` Junio C Hamano
0 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-04 11:18 UTC (permalink / raw)
To: Oswald Buddenhagen; +Cc: git
On Mon, Aug 04, 2025 at 11:13:28AM +0200, Oswald Buddenhagen wrote:
> On Mon, Aug 04, 2025 at 10:17:18AM +0200, Patrick Steinhardt wrote:
> > When writing a new commit graph we have a couple of counters that
> > provide statistics around what kind of bloom filters we have or have not
> > written. These counters naturally count from zero and are only ever
> > incremented, but they use a signed integer as type regardless.
> >
> > Refactor those fields to be of type `size_t` instead.
> >
> mind elaborating on that choice?
We tend to use `size_t` when counting stuff.
> it feels like abuse at the semantic level, and it increases the data size on
> lp64 platforms. is it even compatible with OPT_UNSIGNED (in later commits)?
> that would be unexpected ...
Yes, it is, starting with my 2bc5414c41 (Merge branch
'ps/parse-options-integers', 2025-04-24). Regarding the data size I
don't really think that matters much. It's not like we have hundreds of
thousands of commit graphs in-memory at any point in time.
Patrick
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters
2025-08-04 11:18 ` Patrick Steinhardt
@ 2025-08-04 18:34 ` Junio C Hamano
2025-08-04 21:44 ` Taylor Blau
2025-08-05 15:13 ` Junio C Hamano
0 siblings, 2 replies; 69+ messages in thread
From: Junio C Hamano @ 2025-08-04 18:34 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Oswald Buddenhagen, git
Patrick Steinhardt <ps@pks.im> writes:
> On Mon, Aug 04, 2025 at 11:13:28AM +0200, Oswald Buddenhagen wrote:
>> On Mon, Aug 04, 2025 at 10:17:18AM +0200, Patrick Steinhardt wrote:
>> > When writing a new commit graph we have a couple of counters that
>> > provide statistics around what kind of bloom filters we have or have not
>> > written. These counters naturally count from zero and are only ever
>> > incremented, but they use a signed integer as type regardless.
>> >
>> > Refactor those fields to be of type `size_t` instead.
>> >
>> mind elaborating on that choice?
>
> We tend to use `size_t` when counting stuff.
And I would have to say that it is wrong and we need to wean
ourselves from such a superstition. Unless you are measuring how
big a memory block you ask from the allocator, the platform natural
integer is often the right type to do the counting.
Each of your "stuff" may weigh N megabytes in core, and if you have
M of them, you may have to ask (N*2**20)*M bytes of memory from the
allocator. Your (N*2**20)*M must fit size_t _and_ you must compute
it without overflowing or wrapping around.
None of the above mean you have to express N in size_t, though.
And more importantly, nobody gives you any extra guarantee that you
would compute the result correctly if you used size_t. You can write
the right code with platform natural integer, and you have to take
the same care (e.g. by using st_mult()) to catch integer overflows
even if you used size_t.
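
A minimal sketch of that point, assuming a hypothetical helper
checked_mult() that stands in for Git's st_mult() (which dies on
overflow instead of returning an error). The count stays a plain
unsigned int and only the resulting byte size is a size_t:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for Git's st_mult(): instead of dying on
 * overflow, report failure so that the caller can decide what to do.
 * Returns 0 on success and stores the product in *out, -1 on overflow.
 */
static int checked_mult(size_t a, size_t b, size_t *out)
{
	if (a && b > SIZE_MAX / a)
		return -1; /* a * b would wrap around */
	*out = a * b;
	return 0;
}

/*
 * Bytes needed for `count` items of `item_size` bytes each. The count is
 * a plain unsigned int; only the final byte size needs to be a size_t.
 */
static int bytes_needed(unsigned int count, size_t item_size, size_t *out)
{
	return checked_mult(count, item_size, out);
}
```

The overflow check is needed regardless of the type used for the
count, which is the argument being made above.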
> ... Regarding the data size I
> don't really think that matters much. It's not like we have hundreds of
> thousands of commit graphs in-memory at any point in time.
Aren't you saying that a platform natural integer is a much better
fit?
As to signedness, it sometimes is better for a struct member that is
used to record the number of "stuff" you have to be a signed integer
that is initialized to -1 to signal "we haven't counted so we do not
yet know how many there are". So
These counters naturally count from zero and are only ever
incremented.
is not always a valid excuse to insist that such a variable must be
unsigned.
In short, not all but much of the recent "use size_t" topics are
misguided, and -Wsign-compare is usually a wrong thing to rely on.
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 1/9] trace2: introduce function to trace unsigned integers
2025-08-04 8:17 ` [PATCH 1/9] trace2: introduce function to trace unsigned integers Patrick Steinhardt
@ 2025-08-04 21:33 ` Taylor Blau
0 siblings, 0 replies; 69+ messages in thread
From: Taylor Blau @ 2025-08-04 21:33 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Aug 04, 2025 at 10:17:17AM +0200, Patrick Steinhardt wrote:
> While we have `trace2_data_intmax()`, there is no equivalent function
> that takes an unsigned integer. Introduce `trace2_data_uintmax()` to
> plug this gap.
>
> This function will be used in a subsequent commit.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> trace2.c | 14 ++++++++++++++
> trace2.h | 9 +++++++++
> 2 files changed, 23 insertions(+)
>
> diff --git a/trace2.c b/trace2.c
> index c23c0a227b..a687944f7b 100644
> --- a/trace2.c
> +++ b/trace2.c
> @@ -948,6 +948,20 @@ void trace2_data_intmax_fl(const char *file, int line, const char *category,
> strbuf_release(&buf_string);
> }
>
> +void trace2_data_uintmax_fl(const char *file, int line, const char *category,
> + const struct repository *repo, const char *key,
> + uintmax_t value)
> +{
> + struct strbuf buf_string = STRBUF_INIT;
> +
> + if (!trace2_enabled)
> + return;
> +
> + strbuf_addf(&buf_string, "%" PRIuMAX, value);
> + trace2_data_string_fl(file, line, category, repo, key, buf_string.buf);
> + strbuf_release(&buf_string);
> +}
> +
Looks like a faithful copy of its signed counterpart above, which is
good. We *could* use a macro for this, but I don't think we *should*
;-).
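
For readers without the Git tree at hand, the formatting step can be
sketched in isolation. This is only an illustration: format_uintmax()
is a hypothetical name, and snprintf() replaces the strbuf_addf() call
from the patch.

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Format an unsigned value into `buf` using the PRIuMAX conversion,
 * mirroring the strbuf_addf() call in the patch. Returns the number of
 * characters that would have been written, as snprintf() does.
 */
static int format_uintmax(char *buf, size_t len, uintmax_t value)
{
	return snprintf(buf, len, "%" PRIuMAX, value);
}
```

Values above INTMAX_MAX, which the signed variant cannot represent,
are printed as-is.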
Thanks,
Taylor
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters
2025-08-04 8:17 ` [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters Patrick Steinhardt
2025-08-04 9:13 ` Oswald Buddenhagen
@ 2025-08-04 21:42 ` Taylor Blau
1 sibling, 0 replies; 69+ messages in thread
From: Taylor Blau @ 2025-08-04 21:42 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Aug 04, 2025 at 10:17:18AM +0200, Patrick Steinhardt wrote:
> When writing a new commit graph we have a couple of counters that
> provide statistics around what kind of bloom filters we have or have not
s/bloom/Bloom
> written. These counters naturally count from zero and are only ever
> incremented, but they use a signed integer as type regardless.
>
> Refactor those fields to be of type `size_t` instead.
I have some thoughts about this, but I see that others do as well, so
I'll refrain from sharing them here and instead join the discussion in
the sub-thread.
Thanks,
Taylor
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters
2025-08-04 18:34 ` Junio C Hamano
@ 2025-08-04 21:44 ` Taylor Blau
2025-08-06 6:23 ` Patrick Steinhardt
2025-08-05 15:13 ` Junio C Hamano
1 sibling, 1 reply; 69+ messages in thread
From: Taylor Blau @ 2025-08-04 21:44 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, Oswald Buddenhagen, git
On Mon, Aug 04, 2025 at 11:34:22AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > On Mon, Aug 04, 2025 at 11:13:28AM +0200, Oswald Buddenhagen wrote:
> >> On Mon, Aug 04, 2025 at 10:17:18AM +0200, Patrick Steinhardt wrote:
> >> > When writing a new commit graph we have a couple of counters that
> >> > provide statistics around what kind of bloom filters we have or have not
> >> > written. These counters naturally count from zero and are only ever
> >> > incremented, but they use a signed integer as type regardless.
> >> >
> >> > Refactor those fields to be of type `size_t` instead.
> >> >
> >> mind elaborating on that choice?
> >
> > We tend to use `size_t` when counting stuff.
>
> And I would have to say that it is wrong and we need to wean
> ourselves from such a superstition. Unless you are measuring how
> big a memory block you ask from the allocator, the platform natural
> integer is often the right type to do the counting.
>
> Each of your "stuff" may weigh N megabytes in core, and if you have
> M of them, you may have to ask (N*2**20)*M bytes of memory from the
> allocator. Your (N*2**20)*M must fit size_t _and_ you must compute
> it without overflowing or wrapping around.
>
> None of the above mean you have to express N in size_t, though.
> And more importantly, nobody gives you any extra guarantee that you
> would compute the result correctly if you used size_t. You can write
> the right code with platform natural integer, and you have to take
> the same care (e.g. by using st_mult()) to catch integer overflows
> even if you used size_t.
Agreed. I think it makes sense to use size_t to keep track of, say, the
length and allocated size of a buffer, but when it comes to "counting"
something that isn't directly related to memory allocation or pointer
arithmetic, size_t is usually not the right choice.
For instance, the MIDX code counts the number of objects and packs in a
given MIDX (and likewise for its base MIDX(s)), but those are all
uint32_t. You could make the case to say that, "well, they are encoded
in the file format as 4-byte unsigned values, so we should treat them
the same way in memory at read-time", and I think that's reasonable. In
that instance, using "int" would be the wrong choice, since I have
definitely seen repositories that have in excess of 2^31-1 objects.
But there is no reason that we should use size_t to make that count.
> > ... Regarding the data size I
> > don't really think that matters much. It's not like we have hundreds of
> > thousands of commit graphs in-memory at any point in time.
>
> Aren't you saying that a platform natural integer is a much better
> fit?
>
> As to signedness, it sometimes is better for a struct member that is
> used to record the number of "stuff" you have to be a signed integer
> that is initialized to -1 to signal "we haven't counted so we do not
> yet know how many there are". So
>
> These counters naturally count from zero and are only ever
> incremented.
>
> is not always a valid excuse to insist that such a variable must be
> unsigned.
I wrote these counters in 312cff5207 (bloom: split 'get_bloom_filter()'
in two, 2020-09-16) and 59f0d5073f (bloom: encode out-of-bounds filters
as non-empty, 2020-09-17), and I don't see a compelling reason that
these should be unsigned.
It's true that we don't have any need for negative values here since we
are counting from zero, but I don't think that alone justifies changing
the signed-ness here.
Is there a reason beyond "these are always non-negative" that changing
the signed-ness is warranted? If so, let's discuss that and make sure
that it is documented in the commit message. If not, I think we could
drop this patch (and optionally the patch before it as well).
Thanks,
Taylor
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 3/9] commit-graph: fix type for some write options
2025-08-04 8:17 ` [PATCH 3/9] commit-graph: fix type for some write options Patrick Steinhardt
@ 2025-08-04 21:52 ` Taylor Blau
0 siblings, 0 replies; 69+ messages in thread
From: Taylor Blau @ 2025-08-04 21:52 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Aug 04, 2025 at 10:17:19AM +0200, Patrick Steinhardt wrote:
> The options "max-commits" and "size-multiple" are both supposed to be
> positive integers and are documented as such, but we use a signed
> integer field to store them. This causes sign comparison warnings in
> `split_graph_merge_strategy()` because we end up comparing the option
> values with the observed number of commits.
>
> Fix the issue by converting the fields to be unsigned and convert the
> options to use `OPT_UNSIGNED()` accordingly. This macro has only been
> introduced recently, which might explain why the option values were
> signed in the first place.
I have the same general feeling about this patch as the previous one:
the -Wsign-compare warnings here are a bit of a red herring.
That said, I do think that we need stricter validation for these
options. Following the --size-multiple example, for instance, we get
this rather unfriendly error message if we pass a bogus value:
$ git.compile commit-graph write --split --size-multiple=-1 --reachable
fatal: size_t overflow: 18446744073709551615 * 34
This happens as a result of
commit-graph.c::split_graph_merge_strategy() doing the following:
while (g && (g->num_commits <= st_mult(size_mult, num_commits) ||
(max_commits && num_commits > max_commits))) {
, where size_mult is implicitly converted from an int to a size_t when
it is passed to st_mult().
Indeed, this patch does improve that warning, but I am not convinced
that changing the type is the right way to go about improving the error
message in and of itself.
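
To illustrate where the large number in that message comes from, here
is a small sketch. Both function names are hypothetical, and the
lower bound of 1 in the validation helper is an assumption made for
the example, not something taken from Git:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Models passing a signed option value to a function that takes size_t,
 * as happens when an `int` is handed to st_mult(). The conversion is
 * implicit and well-defined: negative values wrap modulo SIZE_MAX + 1.
 */
static size_t as_size(int option_value)
{
	return option_value;
}

/*
 * Hypothetical up-front validation: reject values below 1 before they
 * reach any size arithmetic. Returns 1 if the value is acceptable.
 */
static int size_multiple_is_valid(int option_value)
{
	return option_value >= 1;
}
```

On a platform with a 64-bit size_t, as_size(-1) is
18446744073709551615, the value shown in the error above. A check like
size_multiple_is_valid() at option-parsing time could produce a
friendlier message without changing the type of the field.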
Thanks,
Taylor
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 4/9] commit-graph: fix sign comparison warnings
2025-08-04 8:17 ` [PATCH 4/9] commit-graph: fix sign comparison warnings Patrick Steinhardt
@ 2025-08-04 22:04 ` Taylor Blau
2025-08-06 6:52 ` Patrick Steinhardt
0 siblings, 1 reply; 69+ messages in thread
From: Taylor Blau @ 2025-08-04 22:04 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Aug 04, 2025 at 10:17:20AM +0200, Patrick Steinhardt wrote:
> The "commit-graph.c" file has a bunch of sign comparison warnings:
>
> - There are a bunch of variables that are declared as signed integers
> even though they are used to count entities, like for example
> `num_commit_graphs_before` and `num_commit_graphs_after`.
I have similar thoughts as in the previous patch about this spot, too.
> - There are several cases where we use signed loop variables to
> iterate through an unsigned entity count.
Here I am more sympathetic to using a size_t to count the number of
allocated elements in some array/buffer.
> - The bloom settings hash version is being assigned `-1` even though
> it's an unsigned value. This is used to indicate an unspecified
> value and relies on 1's complement.
OK. I think that comparing "-1" to an unsigned value is meant to be equivalent
to saying "are all 32 of these bits set"? But making it explicit seems
reasonable, since it's not immediately clear from reading this function
that 'hash_version' is unsigned.
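
A short sketch of what the assignment does, assuming a 32-bit field as
described above. HASH_VERSION_UNSPECIFIED is a hypothetical name used
only for this illustration:

```c
#include <stdint.h>

/* Hypothetical sentinel name; not taken from the patch. */
#define HASH_VERSION_UNSPECIFIED UINT32_MAX

/* Assigning -1 to an unsigned type wraps to its maximum value. */
static uint32_t implicit_sentinel(void)
{
	uint32_t hash_version = -1;
	return hash_version;
}

/* The same check, spelled with an explicit unsigned constant. */
static int is_unspecified(uint32_t hash_version)
{
	return hash_version == HASH_VERSION_UNSPECIFIED;
}
```

C defines the conversion of -1 to an unsigned type as reduction modulo
2^N, so the result has all bits set regardless of how the platform
represents negative numbers.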
> @@ -622,7 +621,7 @@ int open_commit_graph_chain(const char *chain_file,
> close(*fd);
> return 0;
> }
> - if (st->st_size < the_hash_algo->hexsz) {
> + if (st->st_size < (ssize_t) the_hash_algo->hexsz) {
I understand why the compiler is telling you to make hexsz a signed
quantity, but I am not sure that the cast here is aiding the reader, nor
am I sure that it is making the code safer.
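
For context, this is roughly the situation the warning is about. The
sketch uses int64_t as a stand-in for the type of st_size and uint64_t
for the type of hexsz, which is an assumption about a typical 64-bit
platform, and both function names are hypothetical:

```c
#include <stdint.h>

/*
 * What a mixed comparison does when both types have the same width: the
 * signed operand is converted to unsigned first. The cast only spells
 * out the conversion that the compiler would otherwise apply implicitly.
 */
static int too_small_mixed(int64_t size, uint64_t min)
{
	return (uint64_t)size < min;
}

/* Handle the negative case explicitly before comparing as unsigned. */
static int too_small_checked(int64_t size, uint64_t min)
{
	return size < 0 || (uint64_t)size < min;
}
```

With a negative size, too_small_mixed() reports "not too small"
because the value wraps to a huge unsigned number. A negative st_size
is not expected in practice, and the patch casts the unsigned side to
ssize_t instead, which keeps the comparison signed.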
Thanks,
Taylor
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 5/9] commit-graph: stop using `the_hash_algo` via macros
2025-08-04 8:17 ` [PATCH 5/9] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
@ 2025-08-04 22:05 ` Taylor Blau
0 siblings, 0 replies; 69+ messages in thread
From: Taylor Blau @ 2025-08-04 22:05 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Aug 04, 2025 at 10:17:21AM +0200, Patrick Steinhardt wrote:
> We have two macros `GRAPH_DATA_WIDTH` and `GRAPH_MIN_SIZE` that compute
> hash-dependent sizes. They do so by using the global `the_hash_algo`
> variable though, which we want to get rid of over time.
>
> Convert these macros into functions that accept the hash algorithm as
> input parameter. Adapt callers accordingly.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> commit-graph.c | 25 ++++++++++++++++---------
> 1 file changed, 16 insertions(+), 9 deletions(-)
Very nice.
Thanks,
Taylor
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 6/9] commit-graph: store the hash algorithm instead of its length
2025-08-04 8:17 ` [PATCH 6/9] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
@ 2025-08-04 22:07 ` Taylor Blau
0 siblings, 0 replies; 69+ messages in thread
From: Taylor Blau @ 2025-08-04 22:07 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Aug 04, 2025 at 10:17:22AM +0200, Patrick Steinhardt wrote:
> The commit-graph stores the length of the hash algorithm it uses. In
> subsequent commits we'll need to pass the whole hash algorithm around
> though, which we currently don't have access to.
>
> Refactor the code so that we store the hash algorithm instead of only
> its size.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> commit-graph.c | 36 ++++++++++++++++++------------------
> commit-graph.h | 2 +-
> 2 files changed, 19 insertions(+), 19 deletions(-)
Also makes sense. I briefly wondered about hash version mismatches, but
parse_commit_graph() already covers us here by comparing the
hash_version field written in the commit-graph's header against
oid_version(the_hash_algo).
Thanks,
Taylor
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 7/9] commit-graph: stop using `the_hash_algo`
2025-08-04 8:17 ` [PATCH 7/9] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
@ 2025-08-04 22:10 ` Taylor Blau
2025-08-06 6:53 ` Patrick Steinhardt
0 siblings, 1 reply; 69+ messages in thread
From: Taylor Blau @ 2025-08-04 22:10 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Aug 04, 2025 at 10:17:23AM +0200, Patrick Steinhardt wrote:
> Stop using `the_hash_algo` as it implicitly relies on `the_repository`.
> Instead, we either use the hash algo provided via the context or, if
> there is no such hash algo, we use `the_repository` explicitly. Such
> uses will be removed in subsequent commits.
Seems reasonable, and the implementation looks straightforward to me,
however I wonder...
> @@ -129,6 +130,7 @@ struct repo_settings;
> * prior to calling parse_commit_graph().
> */
> struct commit_graph *parse_commit_graph(struct repo_settings *s,
> + const struct git_hash_algo *hash_algo,
> void *graph_map, size_t graph_size);
...does it make more sense to take a 'struct repository *' here instead
of passing both its settings and hash_algo separately? Is there a
scenario where we would want to parse a commit graph with a (settings,
hash_algo) pair that does not match that of any single repository?
Thanks,
Taylor
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 8/9] commit-graph: stop using `the_repository`
2025-08-04 8:17 ` [PATCH 8/9] commit-graph: stop using `the_repository` Patrick Steinhardt
@ 2025-08-04 22:11 ` Taylor Blau
0 siblings, 0 replies; 69+ messages in thread
From: Taylor Blau @ 2025-08-04 22:11 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Aug 04, 2025 at 10:17:24AM +0200, Patrick Steinhardt wrote:
> ---
> builtin/commit.c | 2 +-
> builtin/merge.c | 2 +-
> commit-graph.c | 79 ++++++++++++++++++++++++++++----------------------------
> commit-graph.h | 2 +-
> 4 files changed, 43 insertions(+), 42 deletions(-)
Looking good.
Thanks,
Taylor
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 0/9] commit-graph: remove reliance on global state
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
` (8 preceding siblings ...)
2025-08-04 8:17 ` [PATCH 9/9] commit-graph: stop passing in redundant repository Patrick Steinhardt
@ 2025-08-05 4:27 ` Derrick Stolee
2025-08-06 6:53 ` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 00/10] " Patrick Steinhardt
` (2 subsequent siblings)
12 siblings, 1 reply; 69+ messages in thread
From: Derrick Stolee @ 2025-08-05 4:27 UTC (permalink / raw)
To: Patrick Steinhardt, git
On 8/4/25 1:17 AM, Patrick Steinhardt wrote:
> Hi,
>
> this patch series is another step on our long road towards not having
> global state. In addition to that, as commit-graphs are part of the
> object database layer, this is also another step towards pluggable
> object databases.
Thanks for carefully working through this code full of bad patterns
and fixing not just the bare minimum to get it working. Each change
was sufficiently motivated and carefully done. LGTM.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters
2025-08-04 18:34 ` Junio C Hamano
2025-08-04 21:44 ` Taylor Blau
@ 2025-08-05 15:13 ` Junio C Hamano
1 sibling, 0 replies; 69+ messages in thread
From: Junio C Hamano @ 2025-08-05 15:13 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Oswald Buddenhagen, git
Junio C Hamano <gitster@pobox.com> writes:
> Each of your "stuff" may weigh N megabytes in core, and if you have
> M of them, you may have to ask (N*2**20)*M bytes of memory from the
> allocator. Your (N*2**20)*M must fit size_t _and_ you must compute
> it without overflowing or wrapping around.
>
> None of the above mean you have to express N in size_t, though.
Small correction. I meant "there is no reason to count M in size_t"
in the above. I am perfectly OK with expressing N in size_t.
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters
2025-08-04 21:44 ` Taylor Blau
@ 2025-08-06 6:23 ` Patrick Steinhardt
2025-08-06 12:54 ` Oswald Buddenhagen
2025-08-06 15:41 ` Junio C Hamano
0 siblings, 2 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 6:23 UTC (permalink / raw)
To: Taylor Blau; +Cc: Junio C Hamano, Oswald Buddenhagen, git
On Mon, Aug 04, 2025 at 05:44:06PM -0400, Taylor Blau wrote:
> On Mon, Aug 04, 2025 at 11:34:22AM -0700, Junio C Hamano wrote:
> > Patrick Steinhardt <ps@pks.im> writes:
> > Aren't you saying that a platform natural integer is a much better
> > fit?
> >
> > As to signedness, it sometimes is better for a struct member that is
> > used to record the number of "stuff" you have to be a signed integer
> > that is initialized to -1 to signal "we haven't counted so we do not
> > yet know how many there are". So
> >
> > These counters naturally count from zero and are only ever
> > incremented.
> >
> > is not always a valid excuse to insist that such a variable must be
> > unsigned.
>
> I wrote these counters in 312cff5207 (bloom: split 'get_bloom_filter()'
> in two, 2020-09-16) and 59f0d5073f (bloom: encode out-of-bounds filters
> as non-empty, 2020-09-17), and I don't see a compelling reason that
> these should be unsigned.
I think that is going backwards though: the question to ask is why
should these be signed if they cannot ever be negative?
> It's true that we don't have any need for negative values here since we
> are counting from zero, but I don't think that alone justifies changing
> the signed-ness here.
>
> Is there a reason beyond "these are always non-negative" that changing
> the signed-ness is warranted? If so, let's discuss that and make sure
> that it is documented in the commit message. If not, I think we could
> drop this patch (and optionally the patch before it as well).
Yes, there is: it makes code easier to reason about for the reader, and
it means that we don't have to guard ourselves against negative values.
If I see a counting variable that is signed I immediately jump to the
question of whether or not it can ever be a negative value. I assume
that the author of this code _intentionally_ made it signed to cater to a
specific edge case.
Whether or not we should be using `size_t`... I don't mind that one too
much, and I'm fine to use `unsigned` instead. But I really think that we
should do our best and help readers by using the proper type for the
task at hand and not let them figure out whether or not they have to
care about edge cases where the counting value could be negative.
Patrick
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 4/9] commit-graph: fix sign comparison warnings
2025-08-04 22:04 ` Taylor Blau
@ 2025-08-06 6:52 ` Patrick Steinhardt
0 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 6:52 UTC (permalink / raw)
To: Taylor Blau; +Cc: git
On Mon, Aug 04, 2025 at 06:04:22PM -0400, Taylor Blau wrote:
> On Mon, Aug 04, 2025 at 10:17:20AM +0200, Patrick Steinhardt wrote:
> > The "commit-graph.c" file has a bunch of sign comparison warnings:
> >
> > - There are a bunch of variables that are declared as signed integers
> > even though they are used to count entities, like for example
> > `num_commit_graphs_before` and `num_commit_graphs_after`.
>
> I have similar thoughts as in the previous patch about this spot, too.
I'll convert these to be `uint32_t` instead of `size_t`. We already use
that type to iterate through these counters anyway.
> > @@ -622,7 +621,7 @@ int open_commit_graph_chain(const char *chain_file,
> > close(*fd);
> > return 0;
> > }
> > - if (st->st_size < the_hash_algo->hexsz) {
> > + if (st->st_size < (ssize_t) the_hash_algo->hexsz) {
>
> I understand why the compiler is telling you to make hexsz a signed
> quantity, but I am not sure that the cast here is aiding the reader, nor
> am I sure that it is making the code safer.
No, it definitely isn't. What makes the code safer is that from now on
we'll get warnings about signedness mismatches.
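
As an illustration of the kind of bug those warnings catch, here is a
minimal sketch. The names are invented for this example, and `int` and
`unsigned` merely stand in for the `off_t` and `size_t` operands in the
hunk above. The first function spells out the conversion the compiler
performs implicitly in a mixed comparison; the second mirrors the cast
used in the patch.

```c
#include <assert.h>

/*
 * Mixed comparison: the signed operand is converted to unsigned first.
 * The cast is written out here only to make that conversion visible.
 */
static int below_minimum_mixed(int size, unsigned minimum)
{
	return (unsigned) size < minimum;
}

/* Cast the unsigned operand instead, so both sides compare as signed. */
static int below_minimum_signed(int size, unsigned minimum)
{
	return size < (int) minimum;
}

static void sign_compare_example_usage(void)
{
	/* Both forms agree for ordinary non-negative sizes. */
	assert(below_minimum_mixed(10, 40) == 1);
	assert(below_minimum_signed(10, 40) == 1);

	/* -1 converts to UINT_MAX, so the mixed form misses it. */
	assert(below_minimum_mixed(-1, 40) == 0);
	assert(below_minimum_signed(-1, 40) == 1);
}
```

The signed cast is only safe as long as the unsigned operand is known to
fit into the signed type, which is the case for small constants such as a
hash length.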
Patrick
* Re: [PATCH 7/9] commit-graph: stop using `the_hash_algo`
2025-08-04 22:10 ` Taylor Blau
@ 2025-08-06 6:53 ` Patrick Steinhardt
0 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 6:53 UTC (permalink / raw)
To: Taylor Blau; +Cc: git
On Mon, Aug 04, 2025 at 06:10:21PM -0400, Taylor Blau wrote:
> On Mon, Aug 04, 2025 at 10:17:23AM +0200, Patrick Steinhardt wrote:
> > Stop using `the_hash_algo` as it implicitly relies on `the_repository`.
> > Instead, we either use the hash algo provided via the context or, if
> > there is no such hash algo, we use `the_repository` explicitly. Such
> > uses will be removed in subsequent commits.
>
> Seems reasonable, and the implementation looks straightforward to me,
> however I wonder...
>
> > @@ -129,6 +130,7 @@ struct repo_settings;
> > * prior to calling parse_commit_graph().
> > */
> > struct commit_graph *parse_commit_graph(struct repo_settings *s,
> > + const struct git_hash_algo *hash_algo,
> > void *graph_map, size_t graph_size);
>
> ...does it make more sense to take a 'struct repository *' here instead
> of passing both its settings and hash_algo separately? Is there a
> scenario where we would want to parse a commit graph with a (settings,
> hash_algo) pair that does not match that of any single repository?
Fair. That'd also allow us to move the call of `prepare_repo_settings()`
into this function.
There's one catch though: in "oss-fuzz/fuzz-commit-graph.c" we manually
stub out both the repository's hash function and its settings. But we
can appease it by also setting `the_repository->settings.initialized`,
which ensures that we won't try to populate the settings anymore.
Will amend.
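
The "initialized" trick described above can be sketched as follows. All
types and functions here are hypothetical stand-ins written for this
example; they are not Git's `struct repository`, `struct repo_settings`
or `prepare_repo_settings()`, and the default value is invented.

```c
#include <assert.h>

/* Hypothetical stand-in for a lazily populated settings struct. */
struct demo_settings {
	int initialized;
	int generation_version;
};

struct demo_repo {
	struct demo_settings settings;
	int populate_calls; /* how often defaults were loaded */
};

/* Populate the settings once; do nothing if they are marked as ready. */
static void demo_prepare_settings(struct demo_repo *r)
{
	if (r->settings.initialized)
		return;
	r->settings.generation_version = 1; /* made-up default */
	r->populate_calls++;
	r->settings.initialized = 1;
}

/* A parser that prepares the settings itself before using them. */
static int demo_parse(struct demo_repo *r)
{
	demo_prepare_settings(r);
	return r->settings.generation_version;
}

static void settings_stub_example_usage(void)
{
	struct demo_repo stubbed = { { 0, 0 }, 0 };
	struct demo_repo fresh = { { 0, 0 }, 0 };

	/* A fuzzer-style caller stubs the value and marks it initialized. */
	stubbed.settings.generation_version = 2;
	stubbed.settings.initialized = 1;
	assert(demo_parse(&stubbed) == 2);
	assert(stubbed.populate_calls == 0);

	/* Without the flag, the defaults are loaded exactly once. */
	assert(demo_parse(&fresh) == 1);
	assert(demo_parse(&fresh) == 1);
	assert(fresh.populate_calls == 1);
}
```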
Patrick
* Re: [PATCH 0/9] commit-graph: remove reliance on global state
2025-08-05 4:27 ` [PATCH 0/9] commit-graph: remove reliance on global state Derrick Stolee
@ 2025-08-06 6:53 ` Patrick Steinhardt
0 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 6:53 UTC (permalink / raw)
To: Derrick Stolee; +Cc: git
On Mon, Aug 04, 2025 at 09:27:42PM -0700, Derrick Stolee wrote:
> On 8/4/25 1:17 AM, Patrick Steinhardt wrote:
> > Hi,
> >
> > this patch series is another step on our long road towards not having
> > global state. In addition to that, as commit-graphs are part of the
> > object database layer, this is also another step towards pluggable
> > object databases.
>
> Thanks for carefully working through this code full of bad patterns
> and fixing not just the bare minimum to get it working. Each change
> was sufficiently motivated and carefully done. LGTM.
Thanks for your review!
Patrick
* [PATCH v2 00/10] commit-graph: remove reliance on global state
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
` (9 preceding siblings ...)
2025-08-05 4:27 ` [PATCH 0/9] commit-graph: remove reliance on global state Derrick Stolee
@ 2025-08-06 12:00 ` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 01/10] trace2: introduce function to trace unsigned integers Patrick Steinhardt
` (9 more replies)
2025-08-07 8:04 ` [PATCH v3 00/10] commit-graph: remove reliance on global state Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Patrick Steinhardt
12 siblings, 10 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 12:00 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
Hi,
this patch series is another step on our long road towards not having
global state. In addition to that, as commit-graphs are part of the
object database layer, this is also another step towards pluggable
object databases.
Changes in v2:
- Use `unsigned` instead of `size_t` to count number of Bloom filters.
- Use `uint32_t` instead of `size_t` for number of commit graphs,
as this type is also used to iterate through this count already.
- Refactor `parse_commit_graph()` to take a repository instead of both
repo settings and a hash algo.
- Link to v1: https://lore.kernel.org/r/20250804-b4-pks-commit-graph-wo-the-repository-v1-0-850d626eb2e8@pks.im
Thanks!
Patrick
---
Patrick Steinhardt (10):
trace2: introduce function to trace unsigned integers
commit-graph: stop using signed integers to count Bloom filters
commit-graph: fix type for some write options
commit-graph: fix sign comparison warnings
commit-graph: stop using `the_hash_algo` via macros
commit-graph: store the hash algorithm instead of its length
commit-graph: refactor `parse_commit_graph()` to take a repository
commit-graph: stop using `the_hash_algo`
commit-graph: stop using `the_repository`
commit-graph: stop passing in redundant repository
builtin/commit-graph.c | 13 +-
builtin/commit.c | 2 +-
builtin/merge.c | 2 +-
commit-graph.c | 371 +++++++++++++++++++++----------------------
commit-graph.h | 25 ++-
oss-fuzz/fuzz-commit-graph.c | 6 +-
t/helper/test-read-graph.c | 2 +-
trace2.c | 14 ++
trace2.h | 9 ++
9 files changed, 227 insertions(+), 217 deletions(-)
Range-diff versus v1:
1: cb92085a3b = 1: a25e9cdbcc trace2: introduce function to trace unsigned integers
2: 25520448c6 ! 2: e03ca21ec2 commit-graph: stop using signed integers to count bloom filters
@@ Metadata
Author: Patrick Steinhardt <ps@pks.im>
## Commit message ##
- commit-graph: stop using signed integers to count bloom filters
+ commit-graph: stop using signed integers to count Bloom filters
When writing a new commit graph we have a couple of counters that
- provide statistics around what kind of bloom filters we have or have not
+ provide statistics around what kind of Bloom filters we have or have not
written. These counters naturally count from zero and are only ever
incremented, but they use a signed integer as type regardless.
- Refactor those fields to be of type `size_t` instead.
+ Refactor those fields to be unsigned instead. Using an unsigned type
+ makes it explicit to the reader that they never have to worry about
+ negative values and thus makes the code easier to understand.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
@@ commit-graph.c: struct write_commit_graph_context {
- int count_bloom_filter_trunc_empty;
- int count_bloom_filter_trunc_large;
- int count_bloom_filter_upgraded;
-+ size_t count_bloom_filter_computed;
-+ size_t count_bloom_filter_not_computed;
-+ size_t count_bloom_filter_trunc_empty;
-+ size_t count_bloom_filter_trunc_large;
-+ size_t count_bloom_filter_upgraded;
++ unsigned count_bloom_filter_computed;
++ unsigned count_bloom_filter_not_computed;
++ unsigned count_bloom_filter_trunc_empty;
++ unsigned count_bloom_filter_trunc_large;
++ unsigned count_bloom_filter_upgraded;
};
static int write_graph_chunk_fanout(struct hashfile *f,
3: 12e150a326 = 3: d569434715 commit-graph: fix type for some write options
4: 0bdaff4e76 ! 4: 3f820e3347 commit-graph: fix sign comparison warnings
@@ Commit message
quantity, we still return a signed integer that we then later
compare with unsigned values.
- - The bloom settings hash version is being assigned `-1` even though
+ - The Bloom settings hash version is being assigned `-1` even though
it's an unsigned value. This is used to indicate an unspecified
value and relies on `-1` converting to the maximum unsigned value.
@@ commit-graph.c: struct write_commit_graph_context {
char *base_graph_name;
- int num_commit_graphs_before;
- int num_commit_graphs_after;
-+ size_t num_commit_graphs_before;
-+ size_t num_commit_graphs_after;
++ uint32_t num_commit_graphs_before;
++ uint32_t num_commit_graphs_after;
char **commit_graph_filenames_before;
char **commit_graph_filenames_after;
char **commit_graph_hash_after;
5: 6e5d4da7f1 = 5: c3be366e36 commit-graph: stop using `the_hash_algo` via macros
6: 8e8bf531d1 = 6: 2476994769 commit-graph: store the hash algorithm instead of its length
-: ---------- > 7: b582b49437 commit-graph: refactor `parse_commit_graph()` to take a repository
7: 20bce2f981 ! 8: 0d0bd20ceb commit-graph: stop using `the_hash_algo`
@@ commit-graph.c: struct commit_graph *load_commit_graph_one_fd_st(struct reposito
close(fd);
error(_("commit-graph file is too small"));
return NULL;
-@@ commit-graph.c: struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
- graph_map = xmmap(NULL, graph_size, PROT_READ, MAP_PRIVATE, fd, 0);
- close(fd);
- prepare_repo_settings(r);
-- ret = parse_commit_graph(&r->settings, graph_map, graph_size);
-+ ret = parse_commit_graph(&r->settings, r->hash_algo, graph_map, graph_size);
-
- if (ret)
- ret->odb_source = source;
@@ commit-graph.c: static int graph_read_commit_data(const unsigned char *chunk_start,
size_t chunk_size, void *data)
{
@@ commit-graph.c: static int graph_read_commit_data(const unsigned char *chunk_sta
return error(_("commit-graph commit data chunk is wrong size"));
g->chunk_commit_data = chunk_start;
return 0;
-@@ commit-graph.c: static int graph_read_bloom_data(const unsigned char *chunk_start,
- }
-
- struct commit_graph *parse_commit_graph(struct repo_settings *s,
-+ const struct git_hash_algo *hash_algo,
- void *graph_map, size_t graph_size)
- {
- const unsigned char *data;
-@@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repo_settings *s,
- if (!graph_map)
- return NULL;
-
-- if (graph_size < graph_min_size(the_hash_algo))
-+ if (graph_size < graph_min_size(hash_algo))
- return NULL;
-
- data = (const unsigned char *)graph_map;
-@@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repo_settings *s,
- }
-
- hash_version = *(unsigned char*)(data + 5);
-- if (hash_version != oid_version(the_hash_algo)) {
-+ if (hash_version != oid_version(hash_algo)) {
- error(_("commit-graph hash version %X does not match version %X"),
-- hash_version, oid_version(the_hash_algo));
-+ hash_version, oid_version(hash_algo));
- return NULL;
- }
-
- graph = alloc_commit_graph();
-
-- graph->hash_algo = the_hash_algo;
-+ graph->hash_algo = hash_algo;
- graph->num_chunks = *(unsigned char*)(data + 6);
- graph->data = graph_map;
- graph->data_len = graph_size;
-
- if (graph_size < GRAPH_HEADER_SIZE +
- (graph->num_chunks + 1) * CHUNK_TOC_ENTRY_SIZE +
-- GRAPH_FANOUT_SIZE + the_hash_algo->rawsz) {
-+ GRAPH_FANOUT_SIZE + hash_algo->rawsz) {
- error(_("commit-graph file is too small to hold %u chunks"),
- graph->num_chunks);
- free(graph);
@@ commit-graph.c: static int add_graph_to_chain(struct commit_graph *g,
}
@@ commit-graph.h: struct string_list;
/*
* Given a commit struct, try to fill the commit struct info, including:
-@@ commit-graph.h: struct repo_settings;
- * prior to calling parse_commit_graph().
- */
- struct commit_graph *parse_commit_graph(struct repo_settings *s,
-+ const struct git_hash_algo *hash_algo,
- void *graph_map, size_t graph_size);
-
- /*
-
- ## oss-fuzz/fuzz-commit-graph.c ##
-@@
- #include "repository.h"
-
- struct commit_graph *parse_commit_graph(struct repo_settings *s,
-+ const struct git_hash_algo *hash_algo,
- void *graph_map, size_t graph_size);
-
- int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size);
-@@ oss-fuzz/fuzz-commit-graph.c: int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
- repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
- the_repository->settings.commit_graph_generation_version = 2;
- the_repository->settings.commit_graph_changed_paths_version = 1;
-- g = parse_commit_graph(&the_repository->settings, (void *)data, size);
-+ g = parse_commit_graph(&the_repository->settings, the_repository->hash_algo,
-+ (void *)data, size);
- repo_clear(the_repository);
- free_commit_graph(g);
-
8: 424567998e ! 9: a86c1ab958 commit-graph: stop using `the_repository`
@@ commit-graph.c: void git_test_write_commit_graph_or_die(void)
die("failed to write commit-graph under GIT_TEST_COMMIT_GRAPH");
}
-@@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repo_settings *s,
- }
-
- oidread(&graph->oid, graph->data + graph->data_len - graph->hash_algo->rawsz,
-- the_repository->hash_algo);
-+ hash_algo);
-
- free_chunkfile(cf);
- return graph;
@@ commit-graph.c: static int add_graph_to_chain(struct commit_graph *g,
if (!cur_g ||
!oideq(&oids[n], &cur_g->oid) ||
9: cff4bc0329 ! 10: 70a7f6fecf commit-graph: stop passing in redundant repository
@@ commit-graph.c: struct commit_graph *load_commit_graph_one_fd_st(struct reposito
close(fd);
error(_("commit-graph file is too small"));
return NULL;
- }
+@@ commit-graph.c: struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_map = xmmap(NULL, graph_size, PROT_READ, MAP_PRIVATE, fd, 0);
close(fd);
-- prepare_repo_settings(r);
-- ret = parse_commit_graph(&r->settings, r->hash_algo, graph_map, graph_size);
-+ prepare_repo_settings(source->odb->repo);
-+ ret = parse_commit_graph(&source->odb->repo->settings, source->odb->repo->hash_algo,
-+ graph_map, graph_size);
+- ret = parse_commit_graph(r, graph_map, graph_size);
++ ret = parse_commit_graph(source->odb->repo, graph_map, graph_size);
if (ret)
ret->odb_source = source;
-@@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repo_settings *s,
+ else
+@@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repository *r,
return NULL;
}
---
base-commit: e813a0200a7121b97fec535f0d0b460b0a33356c
change-id: 20250717-b4-pks-commit-graph-wo-the-repository-1dc2cacbc8e3
* [PATCH v2 01/10] trace2: introduce function to trace unsigned integers
2025-08-06 12:00 ` [PATCH v2 00/10] " Patrick Steinhardt
@ 2025-08-06 12:00 ` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 02/10] commit-graph: stop using signed integers to count Bloom filters Patrick Steinhardt
` (8 subsequent siblings)
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 12:00 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
While we have `trace2_data_intmax()`, there is no equivalent function
that takes an unsigned integer. Introduce `trace2_data_uintmax()` to
plug this gap.
This function will be used in a subsequent commit.
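
To show why a separate unsigned variant is useful, here is a minimal
standalone sketch. The two helpers are invented for this example and only
mimic the `strbuf_addf()` formatting step; they are not part of trace2.
A value above `INTMAX_MAX` keeps its magnitude with `PRIuMAX`, whereas
pushing it through a signed parameter changes what gets printed.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format like an unsigned trace value. */
static void demo_format_unsigned(char *buf, size_t len, uintmax_t value)
{
	snprintf(buf, len, "%" PRIuMAX, value);
}

/* Hypothetical helper: format like a signed trace value. */
static void demo_format_signed(char *buf, size_t len, intmax_t value)
{
	snprintf(buf, len, "%" PRIdMAX, value);
}

static void format_example_usage(void)
{
	char buf[64];

	demo_format_unsigned(buf, sizeof(buf), 42);
	assert(buf[0] == '4' && buf[1] == '2' && buf[2] == '\0');

	/* The largest unsigned value is printed as a plain number. */
	demo_format_unsigned(buf, sizeof(buf), UINTMAX_MAX);
	assert(buf[0] != '-');

	/*
	 * Converting it to intmax_t is implementation-defined; on the
	 * usual two's complement platforms it ends up negative.
	 */
	demo_format_signed(buf, sizeof(buf), (intmax_t) UINTMAX_MAX);
	assert(buf[0] == '-');
}
```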
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
trace2.c | 14 ++++++++++++++
trace2.h | 9 +++++++++
2 files changed, 23 insertions(+)
diff --git a/trace2.c b/trace2.c
index c23c0a227b..a687944f7b 100644
--- a/trace2.c
+++ b/trace2.c
@@ -948,6 +948,20 @@ void trace2_data_intmax_fl(const char *file, int line, const char *category,
strbuf_release(&buf_string);
}
+void trace2_data_uintmax_fl(const char *file, int line, const char *category,
+ const struct repository *repo, const char *key,
+ uintmax_t value)
+{
+ struct strbuf buf_string = STRBUF_INIT;
+
+ if (!trace2_enabled)
+ return;
+
+ strbuf_addf(&buf_string, "%" PRIuMAX, value);
+ trace2_data_string_fl(file, line, category, repo, key, buf_string.buf);
+ strbuf_release(&buf_string);
+}
+
void trace2_data_json_fl(const char *file, int line, const char *category,
const struct repository *repo, const char *key,
const struct json_writer *value)
diff --git a/trace2.h b/trace2.h
index e4f23784e4..115c45a1eb 100644
--- a/trace2.h
+++ b/trace2.h
@@ -463,6 +463,15 @@ void trace2_data_intmax_fl(const char *file, int line, const char *category,
trace2_data_intmax_fl(__FILE__, __LINE__, (category), (repo), (key), \
(value))
+void trace2_data_uintmax_fl(const char *file, int line, const char *category,
+ const struct repository *repo, const char *key,
+ uintmax_t value);
+
+#define trace2_data_uintmax(category, repo, key, value) \
+ trace2_data_uintmax_fl(__FILE__, __LINE__, (category), (repo), (key), \
+ (value))
+
+
void trace2_data_json_fl(const char *file, int line, const char *category,
const struct repository *repo, const char *key,
const struct json_writer *jw);
--
2.51.0.rc0.215.g125493bb4a.dirty
* [PATCH v2 02/10] commit-graph: stop using signed integers to count Bloom filters
2025-08-06 12:00 ` [PATCH v2 00/10] " Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 01/10] trace2: introduce function to trace unsigned integers Patrick Steinhardt
@ 2025-08-06 12:00 ` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 03/10] commit-graph: fix type for some write options Patrick Steinhardt
` (7 subsequent siblings)
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 12:00 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
When writing a new commit graph we have a couple of counters that
provide statistics around what kind of Bloom filters we have or have not
written. These counters naturally count from zero and are only ever
incremented, but they use a signed integer as type regardless.
Refactor those fields to be unsigned instead. Using an unsigned type
makes it explicit to the reader that they never have to worry about
negative values and thus makes the code easier to understand.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index bd7b6f5338..3fc1273ba5 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1170,11 +1170,11 @@ struct write_commit_graph_context {
size_t total_bloom_filter_data_size;
const struct bloom_filter_settings *bloom_settings;
- int count_bloom_filter_computed;
- int count_bloom_filter_not_computed;
- int count_bloom_filter_trunc_empty;
- int count_bloom_filter_trunc_large;
- int count_bloom_filter_upgraded;
+ unsigned count_bloom_filter_computed;
+ unsigned count_bloom_filter_not_computed;
+ unsigned count_bloom_filter_trunc_empty;
+ unsigned count_bloom_filter_trunc_large;
+ unsigned count_bloom_filter_upgraded;
};
static int write_graph_chunk_fanout(struct hashfile *f,
@@ -1779,16 +1779,16 @@ void ensure_generations_valid(struct repository *r,
static void trace2_bloom_filter_write_statistics(struct write_commit_graph_context *ctx)
{
- trace2_data_intmax("commit-graph", ctx->r, "filter-computed",
- ctx->count_bloom_filter_computed);
- trace2_data_intmax("commit-graph", ctx->r, "filter-not-computed",
- ctx->count_bloom_filter_not_computed);
- trace2_data_intmax("commit-graph", ctx->r, "filter-trunc-empty",
- ctx->count_bloom_filter_trunc_empty);
- trace2_data_intmax("commit-graph", ctx->r, "filter-trunc-large",
- ctx->count_bloom_filter_trunc_large);
- trace2_data_intmax("commit-graph", ctx->r, "filter-upgraded",
- ctx->count_bloom_filter_upgraded);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-computed",
+ ctx->count_bloom_filter_computed);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-not-computed",
+ ctx->count_bloom_filter_not_computed);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-trunc-empty",
+ ctx->count_bloom_filter_trunc_empty);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-trunc-large",
+ ctx->count_bloom_filter_trunc_large);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-upgraded",
+ ctx->count_bloom_filter_upgraded);
}
static void compute_bloom_filters(struct write_commit_graph_context *ctx)
--
2.51.0.rc0.215.g125493bb4a.dirty
* [PATCH v2 03/10] commit-graph: fix type for some write options
2025-08-06 12:00 ` [PATCH v2 00/10] " Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 01/10] trace2: introduce function to trace unsigned integers Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 02/10] commit-graph: stop using signed integers to count Bloom filters Patrick Steinhardt
@ 2025-08-06 12:00 ` Patrick Steinhardt
2025-08-06 12:34 ` Oswald Buddenhagen
2025-08-06 12:00 ` [PATCH v2 04/10] commit-graph: fix sign comparison warnings Patrick Steinhardt
` (6 subsequent siblings)
9 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 12:00 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
The options "max-commits" and "size-multiple" are both supposed to be
positive integers and are documented as such, but we use a signed
integer field to store them. This causes sign comparison warnings in
`split_graph_merge_strategy()` because we end up comparing the option
values with the observed number of commits.
Fix the issue by converting the fields to be unsigned and convert the
options to use `OPT_UNSIGNED()` accordingly. This macro has only been
introduced recently, which might explain why the option values were
signed in the first place.
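
What parsing such an option as unsigned involves can be sketched as
follows. This is a standalone illustration under stated assumptions: the
helper name is invented, and it is not how `OPT_UNSIGNED()` is
implemented in parse-options. It only shows that negative or malformed
input is rejected up front, so the stored value never needs a sign check
afterwards.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>

/*
 * Hypothetical helper: parse a decimal, non-negative option argument.
 * Returns 0 on success and -1 on malformed or out-of-range input.
 */
static int demo_parse_unsigned_arg(const char *arg, size_t *out)
{
	char *end;
	uintmax_t value;

	/*
	 * strtoumax() would accept "-1" and negate the result, so
	 * insist on a leading digit (this also rejects "" and "+5").
	 */
	if (!isdigit((unsigned char) arg[0]))
		return -1;

	errno = 0;
	value = strtoumax(arg, &end, 10);
	if (errno == ERANGE || *end != '\0' || value > SIZE_MAX)
		return -1;

	*out = (size_t) value;
	return 0;
}

static void parse_example_usage(void)
{
	size_t max_commits = 0;

	assert(demo_parse_unsigned_arg("64000", &max_commits) == 0);
	assert(max_commits == 64000);

	/* Rejected input leaves the previous value untouched. */
	assert(demo_parse_unsigned_arg("-1", &max_commits) == -1);
	assert(demo_parse_unsigned_arg("12x", &max_commits) == -1);
	assert(demo_parse_unsigned_arg("", &max_commits) == -1);
	assert(max_commits == 64000);
}
```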
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/commit-graph.c | 4 ++--
commit-graph.c | 5 ++---
commit-graph.h | 4 ++--
3 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 25018a0b9d..145802afb7 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -241,9 +241,9 @@ static int graph_write(int argc, const char **argv, const char *prefix,
N_("allow writing an incremental commit-graph file"),
PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
write_option_parse_split),
- OPT_INTEGER(0, "max-commits", &write_opts.max_commits,
+ OPT_UNSIGNED(0, "max-commits", &write_opts.max_commits,
N_("maximum number of commits in a non-base split commit-graph")),
- OPT_INTEGER(0, "size-multiple", &write_opts.size_multiple,
+ OPT_UNSIGNED(0, "size-multiple", &write_opts.size_multiple,
N_("maximum ratio between two levels of a split commit-graph")),
OPT_EXPIRY_DATE(0, "expire-time", &write_opts.expire_time,
N_("only expire files older than a given date-time")),
diff --git a/commit-graph.c b/commit-graph.c
index 3fc1273ba5..ba04fe75db 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2235,9 +2235,8 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
uint32_t num_commits;
enum commit_graph_split_flags flags = COMMIT_GRAPH_SPLIT_UNSPECIFIED;
uint32_t i;
-
- int max_commits = 0;
- int size_mult = 2;
+ size_t max_commits = 0;
+ size_t size_mult = 2;
if (ctx->opts) {
max_commits = ctx->opts->max_commits;
diff --git a/commit-graph.h b/commit-graph.h
index 78ab7b875b..b71cb55697 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -160,8 +160,8 @@ enum commit_graph_split_flags {
};
struct commit_graph_opts {
- int size_multiple;
- int max_commits;
+ size_t size_multiple;
+ size_t max_commits;
timestamp_t expire_time;
enum commit_graph_split_flags split_flags;
int max_new_filters;
--
2.51.0.rc0.215.g125493bb4a.dirty
* [PATCH v2 04/10] commit-graph: fix sign comparison warnings
2025-08-06 12:00 ` [PATCH v2 00/10] " Patrick Steinhardt
` (2 preceding siblings ...)
2025-08-06 12:00 ` [PATCH v2 03/10] commit-graph: fix type for some write options Patrick Steinhardt
@ 2025-08-06 12:00 ` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 05/10] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
` (5 subsequent siblings)
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 12:00 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
The "commit-graph.c" file has a bunch of sign comparison warnings:
- There are a bunch of variables that are declared as signed integers
even though they are used to count entities, like for example
`num_commit_graphs_before` and `num_commit_graphs_after`.
- There are several cases where we use signed loop variables to
iterate through an unsigned entity count.
- In `write_graph_chunk_base_1()` we count how many chunks we have
written in total. But while the value represents a positive
quantity, we still return a signed integer that we then later
compare with unsigned values.
- The Bloom settings hash version is being assigned `-1` even though
it's an unsigned value. This is used to indicate an unspecified
value and relies on `-1` converting to the maximum unsigned value.
Fix all of these cases by either using the proper variable type or by
adding casts as required.
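
The last item can be illustrated with a small standalone sketch. The
struct and helper are hypothetical stand-ins, not the real Bloom filter
settings. In C, converting `-1` to an unsigned type is well defined and
yields that type's maximum value, which is why `(unsigned) -1` works as
an "unspecified" sentinel once the comparison is written with an explicit
cast.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a settings struct with an unsigned field. */
struct demo_bloom_settings {
	unsigned hash_version;
};

/* Explicit cast: both sides of the comparison are unsigned. */
static int demo_hash_version_unspecified(const struct demo_bloom_settings *s)
{
	return s->hash_version == (unsigned) -1;
}

static void sentinel_example_usage(void)
{
	struct demo_bloom_settings s;

	/* Assigning -1 stores the largest unsigned value. */
	s.hash_version = -1;
	assert(s.hash_version == UINT_MAX);
	assert(demo_hash_version_unspecified(&s));

	s.hash_version = 2;
	assert(!demo_hash_version_unspecified(&s));
}
```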
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 54 +++++++++++++++++++++++++++---------------------------
1 file changed, 27 insertions(+), 27 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index ba04fe75db..6d5b5444e7 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1,5 +1,4 @@
#define USE_THE_REPOSITORY_VARIABLE
-#define DISABLE_SIGN_COMPARE_WARNINGS
#include "git-compat-util.h"
#include "config.h"
@@ -569,7 +568,7 @@ static void validate_mixed_bloom_settings(struct commit_graph *g)
static int add_graph_to_chain(struct commit_graph *g,
struct commit_graph *chain,
struct object_id *oids,
- int n)
+ size_t n)
{
struct commit_graph *cur_g = chain;
@@ -622,7 +621,7 @@ int open_commit_graph_chain(const char *chain_file,
close(*fd);
return 0;
}
- if (st->st_size < the_hash_algo->hexsz) {
+ if (st->st_size < (ssize_t) the_hash_algo->hexsz) {
close(*fd);
if (!st->st_size) {
/* treat empty files the same as missing */
@@ -643,15 +642,16 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
struct commit_graph *graph_chain = NULL;
struct strbuf line = STRBUF_INIT;
struct object_id *oids;
- int i = 0, valid = 1, count;
+ int valid = 1;
FILE *fp = xfdopen(fd, "r");
+ size_t count;
count = st->st_size / (the_hash_algo->hexsz + 1);
CALLOC_ARRAY(oids, count);
odb_prepare_alternates(r->objects);
- for (i = 0; i < count; i++) {
+ for (size_t i = 0; i < count; i++) {
struct odb_source *source;
if (strbuf_getline_lf(&line, fp) == EOF)
@@ -1145,12 +1145,12 @@ struct write_commit_graph_context {
int num_generation_data_overflows;
unsigned long approx_nr_objects;
struct progress *progress;
- int progress_done;
+ uint64_t progress_done;
uint64_t progress_cnt;
char *base_graph_name;
- int num_commit_graphs_before;
- int num_commit_graphs_after;
+ uint32_t num_commit_graphs_before;
+ uint32_t num_commit_graphs_after;
char **commit_graph_filenames_before;
char **commit_graph_filenames_after;
char **commit_graph_hash_after;
@@ -1181,7 +1181,7 @@ static int write_graph_chunk_fanout(struct hashfile *f,
void *data)
{
struct write_commit_graph_context *ctx = data;
- int i, count = 0;
+ size_t i, count = 0;
struct commit **list = ctx->commits.list;
/*
@@ -1209,7 +1209,8 @@ static int write_graph_chunk_oids(struct hashfile *f,
{
struct write_commit_graph_context *ctx = data;
struct commit **list = ctx->commits.list;
- int count;
+ size_t count;
+
for (count = 0; count < ctx->commits.nr; count++, list++) {
display_progress(ctx->progress, ++ctx->progress_cnt);
hashwrite(f, (*list)->object.oid.hash, the_hash_algo->rawsz);
@@ -1331,9 +1332,9 @@ static int write_graph_chunk_generation_data(struct hashfile *f,
void *data)
{
struct write_commit_graph_context *ctx = data;
- int i, num_generation_data_overflows = 0;
+ int num_generation_data_overflows = 0;
- for (i = 0; i < ctx->commits.nr; i++) {
+ for (size_t i = 0; i < ctx->commits.nr; i++) {
struct commit *c = ctx->commits.list[i];
timestamp_t offset;
repo_parse_commit(ctx->r, c);
@@ -1355,8 +1356,8 @@ static int write_graph_chunk_generation_data_overflow(struct hashfile *f,
void *data)
{
struct write_commit_graph_context *ctx = data;
- int i;
- for (i = 0; i < ctx->commits.nr; i++) {
+
+ for (size_t i = 0; i < ctx->commits.nr; i++) {
struct commit *c = ctx->commits.list[i];
timestamp_t offset = commit_graph_data_at(c)->generation - c->date;
display_progress(ctx->progress, ++ctx->progress_cnt);
@@ -1526,7 +1527,7 @@ static void add_missing_parents(struct write_commit_graph_context *ctx, struct c
static void close_reachable(struct write_commit_graph_context *ctx)
{
- int i;
+ size_t i;
struct commit *commit;
enum commit_graph_split_flags flags = ctx->opts ?
ctx->opts->split_flags : COMMIT_GRAPH_SPLIT_UNSPECIFIED;
@@ -1620,10 +1621,9 @@ static void compute_reachable_generation_numbers(
struct compute_generation_info *info,
int generation_version)
{
- int i;
struct commit_list *list = NULL;
- for (i = 0; i < info->commits->nr; i++) {
+ for (size_t i = 0; i < info->commits->nr; i++) {
struct commit *c = info->commits->list[i];
timestamp_t gen;
repo_parse_commit(info->r, c);
@@ -1714,7 +1714,7 @@ static void set_generation_v2(struct commit *c, timestamp_t t,
static void compute_generation_numbers(struct write_commit_graph_context *ctx)
{
- int i;
+ size_t i;
struct compute_generation_info info = {
.r = ctx->r,
.commits = &ctx->commits,
@@ -1793,10 +1793,10 @@ static void trace2_bloom_filter_write_statistics(struct write_commit_graph_conte
static void compute_bloom_filters(struct write_commit_graph_context *ctx)
{
- int i;
+ size_t i;
struct progress *progress = NULL;
struct commit **sorted_commits;
- int max_new_filters;
+ size_t max_new_filters;
init_bloom_filters();
@@ -1814,7 +1814,7 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
QSORT(sorted_commits, ctx->commits.nr, commit_gen_cmp);
max_new_filters = ctx->opts && ctx->opts->max_new_filters >= 0 ?
- ctx->opts->max_new_filters : ctx->commits.nr;
+ (size_t) ctx->opts->max_new_filters : ctx->commits.nr;
for (i = 0; i < ctx->commits.nr; i++) {
enum bloom_filter_computed computed = 0;
@@ -2017,10 +2017,10 @@ static void copy_oids_to_commits(struct write_commit_graph_context *ctx)
stop_progress(&ctx->progress);
}
-static int write_graph_chunk_base_1(struct hashfile *f,
- struct commit_graph *g)
+static size_t write_graph_chunk_base_1(struct hashfile *f,
+ struct commit_graph *g)
{
- int num = 0;
+ size_t num = 0;
if (!g)
return 0;
@@ -2034,7 +2034,7 @@ static int write_graph_chunk_base(struct hashfile *f,
void *data)
{
struct write_commit_graph_context *ctx = data;
- int num = write_graph_chunk_base_1(f, ctx->new_base_graph);
+ size_t num = write_graph_chunk_base_1(f, ctx->new_base_graph);
if (num != ctx->num_commit_graphs_after - 1) {
error(_("failed to write correct number of base graph ids"));
@@ -2480,7 +2480,7 @@ static void expire_commit_graphs(struct write_commit_graph_context *ctx)
if (stat(path.buf, &st) < 0)
continue;
- if (st.st_mtime > expire_time)
+ if ((unsigned) st.st_mtime > expire_time)
continue;
if (path.len < 6 || strcmp(path.buf + path.len - 6, ".graph"))
continue;
@@ -2576,7 +2576,7 @@ int write_commit_graph(struct odb_source *source,
ctx.changed_paths = 1;
/* don't propagate the hash_version unless unspecified */
- if (bloom_settings.hash_version == -1)
+ if (bloom_settings.hash_version == (unsigned) -1)
bloom_settings.hash_version = g->bloom_filter_settings->hash_version;
bloom_settings.bits_per_entry = g->bloom_filter_settings->bits_per_entry;
bloom_settings.num_hashes = g->bloom_filter_settings->num_hashes;
--
2.51.0.rc0.215.g125493bb4a.dirty
* [PATCH v2 05/10] commit-graph: stop using `the_hash_algo` via macros
2025-08-06 12:00 ` [PATCH v2 00/10] " Patrick Steinhardt
` (3 preceding siblings ...)
2025-08-06 12:00 ` [PATCH v2 04/10] commit-graph: fix sign comparison warnings Patrick Steinhardt
@ 2025-08-06 12:00 ` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 06/10] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
` (4 subsequent siblings)
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 12:00 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
We have two macros `GRAPH_DATA_WIDTH` and `GRAPH_MIN_SIZE` that compute
hash-dependent sizes. They do so by using the global `the_hash_algo`
variable though, which we want to get rid of over time.
Convert these macros into functions that accept the hash algorithm as
an input parameter. Adapt callers accordingly.
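For illustration only (this is not part of the patch), the conversion can
be sketched in isolation as below. `struct hash_algo_sketch` and the
`SKETCH_*` constants are hypothetical stand-ins for `struct git_hash_algo`
and the `GRAPH_*` defines, and the table-of-contents entry size of 12 bytes
is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for `struct git_hash_algo`; only `rawsz` is modeled. */
struct hash_algo_sketch {
	size_t rawsz; /* size of a raw object ID in bytes */
};

/* Assumed values; the real defines live in commit-graph.c and chunk-format.h. */
#define SKETCH_HEADER_SIZE 8
#define SKETCH_FANOUT_SIZE (4 * 256)
#define SKETCH_TOC_ENTRY_SIZE 12

/* Width of one commit-data record: tree OID plus 16 bytes of fixed fields. */
static size_t sketch_data_width(const struct hash_algo_sketch *algop)
{
	assert(algop);
	return algop->rawsz + 16;
}

/* Smallest plausible file: header, four TOC entries, fanout, trailing hash. */
static size_t sketch_min_size(const struct hash_algo_sketch *algop)
{
	assert(algop);
	return SKETCH_HEADER_SIZE + 4 * SKETCH_TOC_ENTRY_SIZE +
	       SKETCH_FANOUT_SIZE + algop->rawsz;
}
```

Because the algorithm is an argument, the same helpers give correct sizes
for any hash without consulting a global.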
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 6d5b5444e7..c4a0b12b6c 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -52,8 +52,6 @@ void git_test_write_commit_graph_or_die(void)
#define GRAPH_CHUNKID_BLOOMDATA 0x42444154 /* "BDAT" */
#define GRAPH_CHUNKID_BASE 0x42415345 /* "BASE" */
-#define GRAPH_DATA_WIDTH (the_hash_algo->rawsz + 16)
-
#define GRAPH_VERSION_1 0x1
#define GRAPH_VERSION GRAPH_VERSION_1
@@ -65,8 +63,6 @@ void git_test_write_commit_graph_or_die(void)
#define GRAPH_HEADER_SIZE 8
#define GRAPH_FANOUT_SIZE (4 * 256)
-#define GRAPH_MIN_SIZE (GRAPH_HEADER_SIZE + 4 * CHUNK_TOC_ENTRY_SIZE \
- + GRAPH_FANOUT_SIZE + the_hash_algo->rawsz)
#define CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW (1ULL << 31)
@@ -79,6 +75,16 @@ define_commit_slab(topo_level_slab, uint32_t);
define_commit_slab(commit_pos, int);
static struct commit_pos commit_pos = COMMIT_SLAB_INIT(1, commit_pos);
+static size_t graph_data_width(const struct git_hash_algo *algop)
+{
+ return algop->rawsz + 16;
+}
+
+static size_t graph_min_size(const struct git_hash_algo *algop)
+{
+ return GRAPH_HEADER_SIZE + 4 * CHUNK_TOC_ENTRY_SIZE + GRAPH_FANOUT_SIZE + algop->rawsz;
+}
+
static void set_commit_pos(struct repository *r, const struct object_id *oid)
{
static int32_t max_pos;
@@ -257,7 +263,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_size = xsize_t(st->st_size);
- if (graph_size < GRAPH_MIN_SIZE) {
+ if (graph_size < graph_min_size(the_hash_algo)) {
close(fd);
error(_("commit-graph file is too small"));
return NULL;
@@ -313,7 +319,7 @@ static int graph_read_commit_data(const unsigned char *chunk_start,
size_t chunk_size, void *data)
{
struct commit_graph *g = data;
- if (chunk_size / GRAPH_DATA_WIDTH != g->num_commits)
+ if (chunk_size / graph_data_width(the_hash_algo) != g->num_commits)
return error(_("commit-graph commit data chunk is wrong size"));
g->chunk_commit_data = chunk_start;
return 0;
@@ -378,7 +384,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
if (!graph_map)
return NULL;
- if (graph_size < GRAPH_MIN_SIZE)
+ if (graph_size < graph_min_size(the_hash_algo))
return NULL;
data = (const unsigned char *)graph_map;
@@ -900,7 +906,7 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
die(_("invalid commit position. commit-graph is likely corrupt"));
lex_index = pos - g->num_commits_in_base;
- commit_data = g->chunk_commit_data + st_mult(GRAPH_DATA_WIDTH, lex_index);
+ commit_data = g->chunk_commit_data + st_mult(graph_data_width(the_hash_algo), lex_index);
graph_data = commit_graph_data_at(item);
graph_data->graph_pos = pos;
@@ -1104,7 +1110,8 @@ static struct tree *load_tree_for_commit(struct repository *r,
g = g->base_graph;
commit_data = g->chunk_commit_data +
- st_mult(GRAPH_DATA_WIDTH, graph_pos - g->num_commits_in_base);
+ st_mult(graph_data_width(the_hash_algo),
+ graph_pos - g->num_commits_in_base);
oidread(&oid, commit_data, the_repository->hash_algo);
set_commit_tree(c, lookup_tree(r, &oid));
--
2.51.0.rc0.215.g125493bb4a.dirty
* [PATCH v2 06/10] commit-graph: store the hash algorithm instead of its length
2025-08-06 12:00 ` [PATCH v2 00/10] " Patrick Steinhardt
` (4 preceding siblings ...)
2025-08-06 12:00 ` [PATCH v2 05/10] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
@ 2025-08-06 12:00 ` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 07/10] commit-graph: refactor `parse_commit_graph()` to take a repository Patrick Steinhardt
` (3 subsequent siblings)
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 12:00 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
The commit-graph stores the length of the hash algorithm it uses. In
subsequent commits we'll need to pass the whole hash algorithm around
though, which we currently don't have access to.
Refactor the code so that we store the hash algorithm instead of only
its size.
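A minimal sketch of the idea, assuming simplified hypothetical types
(`struct hash_algo_sketch`, `struct graph_sketch`) rather than the real
`struct commit_graph`: once the graph holds a pointer to the algorithm,
every place that needed the length can derive it from `rawsz`, and later
callers can pass the whole algorithm on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for `struct git_hash_algo`. */
struct hash_algo_sketch {
	size_t rawsz; /* raw OID size in bytes */
	size_t hexsz; /* hex OID size in characters */
};

/* Hypothetical, heavily reduced version of `struct commit_graph`. */
struct graph_sketch {
	const struct hash_algo_sketch *hash_algo; /* replaces a bare length */
	const unsigned char *chunk_oid_lookup;
	uint32_t num_commits;
};

/* Returns 0 if the chunk holds exactly `num_commits` OIDs, -1 otherwise. */
static int sketch_read_oid_lookup(struct graph_sketch *g,
				  const unsigned char *chunk_start,
				  size_t chunk_size)
{
	assert(g && g->hash_algo);
	g->chunk_oid_lookup = chunk_start;
	if (chunk_size / g->hash_algo->rawsz != g->num_commits)
		return -1;
	return 0;
}

/* Pointer to the i-th OID in the lookup chunk. */
static const unsigned char *sketch_oid_at(const struct graph_sketch *g,
					  uint32_t i)
{
	assert(g && i < g->num_commits);
	return g->chunk_oid_lookup + g->hash_algo->rawsz * (size_t)i;
}
```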
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 36 ++++++++++++++++++------------------
commit-graph.h | 2 +-
2 files changed, 19 insertions(+), 19 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index c4a0b12b6c..590a6972d2 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -310,7 +310,7 @@ static int graph_read_oid_lookup(const unsigned char *chunk_start,
{
struct commit_graph *g = data;
g->chunk_oid_lookup = chunk_start;
- if (chunk_size / g->hash_len != g->num_commits)
+ if (chunk_size / g->hash_algo->rawsz != g->num_commits)
return error(_("commit-graph OID lookup chunk is the wrong size"));
return 0;
}
@@ -412,7 +412,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
graph = alloc_commit_graph();
- graph->hash_len = the_hash_algo->rawsz;
+ graph->hash_algo = the_hash_algo;
graph->num_chunks = *(unsigned char*)(data + 6);
graph->data = graph_map;
graph->data_len = graph_size;
@@ -477,7 +477,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
FREE_AND_NULL(graph->bloom_filter_settings);
}
- oidread(&graph->oid, graph->data + graph->data_len - graph->hash_len,
+ oidread(&graph->oid, graph->data + graph->data_len - graph->hash_algo->rawsz,
the_repository->hash_algo);
free_chunkfile(cf);
@@ -583,7 +583,7 @@ static int add_graph_to_chain(struct commit_graph *g,
return 0;
}
- if (g->chunk_base_graphs_size / g->hash_len < n) {
+ if (g->chunk_base_graphs_size / g->hash_algo->rawsz < n) {
warning(_("commit-graph base graphs chunk is too small"));
return 0;
}
@@ -593,7 +593,7 @@ static int add_graph_to_chain(struct commit_graph *g,
if (!cur_g ||
!oideq(&oids[n], &cur_g->oid) ||
- !hasheq(oids[n].hash, g->chunk_base_graphs + st_mult(g->hash_len, n),
+ !hasheq(oids[n].hash, g->chunk_base_graphs + st_mult(g->hash_algo->rawsz, n),
the_repository->hash_algo)) {
warning(_("commit-graph chain does not match"));
return 0;
@@ -805,7 +805,7 @@ int generation_numbers_enabled(struct repository *r)
return 0;
first_generation = get_be32(g->chunk_commit_data +
- g->hash_len + 8) >> 2;
+ g->hash_algo->rawsz + 8) >> 2;
return !!first_generation;
}
@@ -849,7 +849,7 @@ void close_commit_graph(struct object_database *o)
static int bsearch_graph(struct commit_graph *g, const struct object_id *oid, uint32_t *pos)
{
return bsearch_hash(oid->hash, g->chunk_oid_fanout,
- g->chunk_oid_lookup, g->hash_len, pos);
+ g->chunk_oid_lookup, g->hash_algo->rawsz, pos);
}
static void load_oid_from_graph(struct commit_graph *g,
@@ -869,7 +869,7 @@ static void load_oid_from_graph(struct commit_graph *g,
lex_index = pos - g->num_commits_in_base;
- oidread(oid, g->chunk_oid_lookup + st_mult(g->hash_len, lex_index),
+ oidread(oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, lex_index),
the_repository->hash_algo);
}
@@ -911,8 +911,8 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
graph_data = commit_graph_data_at(item);
graph_data->graph_pos = pos;
- date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
- date_low = get_be32(commit_data + g->hash_len + 12);
+ date_high = get_be32(commit_data + g->hash_algo->rawsz + 8) & 0x3;
+ date_low = get_be32(commit_data + g->hash_algo->rawsz + 12);
item->date = (timestamp_t)((date_high << 32) | date_low);
if (g->read_generation_data) {
@@ -930,10 +930,10 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
} else
graph_data->generation = item->date + offset;
} else
- graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
+ graph_data->generation = get_be32(commit_data + g->hash_algo->rawsz + 8) >> 2;
if (g->topo_levels)
- *topo_level_slab_at(g->topo_levels, item) = get_be32(commit_data + g->hash_len + 8) >> 2;
+ *topo_level_slab_at(g->topo_levels, item) = get_be32(commit_data + g->hash_algo->rawsz + 8) >> 2;
}
static inline void set_commit_tree(struct commit *c, struct tree *t)
@@ -957,7 +957,7 @@ static int fill_commit_in_graph(struct repository *r,
fill_commit_graph_info(item, g, pos);
lex_index = pos - g->num_commits_in_base;
- commit_data = g->chunk_commit_data + st_mult(g->hash_len + 16, lex_index);
+ commit_data = g->chunk_commit_data + st_mult(g->hash_algo->rawsz + 16, lex_index);
item->object.parsed = 1;
@@ -965,12 +965,12 @@ static int fill_commit_in_graph(struct repository *r,
pptr = &item->parents;
- edge_value = get_be32(commit_data + g->hash_len);
+ edge_value = get_be32(commit_data + g->hash_algo->rawsz);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
pptr = insert_parent_or_die(r, g, edge_value, pptr);
- edge_value = get_be32(commit_data + g->hash_len + 4);
+ edge_value = get_be32(commit_data + g->hash_algo->rawsz + 4);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
if (!(edge_value & GRAPH_EXTRA_EDGES_NEEDED)) {
@@ -2622,7 +2622,7 @@ int write_commit_graph(struct odb_source *source,
struct commit_graph *g = ctx.r->objects->commit_graph;
for (i = 0; i < g->num_commits; i++) {
struct object_id oid;
- oidread(&oid, g->chunk_oid_lookup + st_mult(g->hash_len, i),
+ oidread(&oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
the_repository->hash_algo);
oid_array_append(&ctx.oids, &oid);
}
@@ -2753,7 +2753,7 @@ static int verify_one_commit_graph(struct repository *r,
for (i = 0; i < g->num_commits; i++) {
struct commit *graph_commit;
- oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_len, i),
+ oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
the_repository->hash_algo);
if (i && oidcmp(&prev_oid, &cur_oid) >= 0)
@@ -2798,7 +2798,7 @@ static int verify_one_commit_graph(struct repository *r,
timestamp_t generation;
display_progress(progress, ++(*seen));
- oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_len, i),
+ oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
the_repository->hash_algo);
graph_commit = lookup_commit(r, &cur_oid);
diff --git a/commit-graph.h b/commit-graph.h
index b71cb55697..f20d28ff3a 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -84,7 +84,7 @@ struct commit_graph {
const unsigned char *data;
size_t data_len;
- unsigned char hash_len;
+ const struct git_hash_algo *hash_algo;
unsigned char num_chunks;
uint32_t num_commits;
struct object_id oid;
--
2.51.0.rc0.215.g125493bb4a.dirty
* [PATCH v2 07/10] commit-graph: refactor `parse_commit_graph()` to take a repository
2025-08-06 12:00 ` [PATCH v2 00/10] " Patrick Steinhardt
` (5 preceding siblings ...)
2025-08-06 12:00 ` [PATCH v2 06/10] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
@ 2025-08-06 12:00 ` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 08/10] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
` (2 subsequent siblings)
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 12:00 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
Refactor `parse_commit_graph()` so that it takes a repository instead of
taking repository settings. On the one hand this allows us to get rid of
instances where we access `the_hash_algo` by using the repository's hash
algorithm instead. On the other hand it also allows us to move the call
of `prepare_repo_settings()` into the function itself.
Note that there's one small catch, as the commit-graph fuzzer calls this
function directly without having a fully functional repository at hand.
And while the fuzzer already initializes `the_repository` with relevant
info, the call to `prepare_repo_settings()` would fail because we don't
have a fully-initialized repository.
Work around the issue by also setting `settings.initialized` to pretend
that we've already read the settings.
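The pattern behind this workaround can be sketched generically as follows.
This is not the implementation of `prepare_repo_settings()`; the struct,
the function, and the default values are hypothetical and only show how a
pre-set `initialized` flag makes a lazy loader skip its work and keep the
caller's values.

```c
#include <assert.h>

/* Hypothetical, reduced settings struct; field names are illustrative. */
struct settings_sketch {
	int initialized;
	int generation_version;
	int changed_paths_version;
};

/* Counts how often the (pretend) configuration was actually loaded. */
static int sketch_load_calls;

/* Lazily fill in settings; a no-op when `initialized` is already set. */
static void sketch_prepare_settings(struct settings_sketch *s)
{
	assert(s);
	if (s->initialized)
		return;
	sketch_load_calls++;
	s->generation_version = 2;     /* assumed default */
	s->changed_paths_version = -1; /* assumed default */
	s->initialized = 1;
}
```

A caller without a usable repository, such as a fuzzer, can therefore fill
in the fields by hand, set the flag, and rely on the loader leaving them
alone.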
While at it, remove the redundant `parse_commit_graph()` declaration in
the fuzzer. It was added together with aa658574bf (commit-graph, fuzz:
add fuzzer for commit-graph, 2019-01-15), but as we also declared the
same function in "commit-graph.h" it wasn't ever needed.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 23 ++++++++++++-----------
commit-graph.h | 2 +-
oss-fuzz/fuzz-commit-graph.c | 6 ++----
3 files changed, 15 insertions(+), 16 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 590a6972d2a..50391cc0f32 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -270,9 +270,8 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
}
graph_map = xmmap(NULL, graph_size, PROT_READ, MAP_PRIVATE, fd, 0);
close(fd);
- prepare_repo_settings(r);
- ret = parse_commit_graph(&r->settings, graph_map, graph_size);
+ ret = parse_commit_graph(r, graph_map, graph_size);
if (ret)
ret->odb_source = source;
else
@@ -372,7 +371,7 @@ static int graph_read_bloom_data(const unsigned char *chunk_start,
return 0;
}
-struct commit_graph *parse_commit_graph(struct repo_settings *s,
+struct commit_graph *parse_commit_graph(struct repository *r,
void *graph_map, size_t graph_size)
{
const unsigned char *data;
@@ -384,7 +383,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
if (!graph_map)
return NULL;
- if (graph_size < graph_min_size(the_hash_algo))
+ if (graph_size < graph_min_size(r->hash_algo))
return NULL;
data = (const unsigned char *)graph_map;
@@ -404,22 +403,22 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
}
hash_version = *(unsigned char*)(data + 5);
- if (hash_version != oid_version(the_hash_algo)) {
+ if (hash_version != oid_version(r->hash_algo)) {
error(_("commit-graph hash version %X does not match version %X"),
- hash_version, oid_version(the_hash_algo));
+ hash_version, oid_version(r->hash_algo));
return NULL;
}
graph = alloc_commit_graph();
- graph->hash_algo = the_hash_algo;
+ graph->hash_algo = r->hash_algo;
graph->num_chunks = *(unsigned char*)(data + 6);
graph->data = graph_map;
graph->data_len = graph_size;
if (graph_size < GRAPH_HEADER_SIZE +
(graph->num_chunks + 1) * CHUNK_TOC_ENTRY_SIZE +
- GRAPH_FANOUT_SIZE + the_hash_algo->rawsz) {
+ GRAPH_FANOUT_SIZE + r->hash_algo->rawsz) {
error(_("commit-graph file is too small to hold %u chunks"),
graph->num_chunks);
free(graph);
@@ -450,7 +449,9 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
pair_chunk(cf, GRAPH_CHUNKID_BASE, &graph->chunk_base_graphs,
&graph->chunk_base_graphs_size);
- if (s->commit_graph_generation_version >= 2) {
+ prepare_repo_settings(r);
+
+ if (r->settings.commit_graph_generation_version >= 2) {
read_chunk(cf, GRAPH_CHUNKID_GENERATION_DATA,
graph_read_generation_data, graph);
pair_chunk(cf, GRAPH_CHUNKID_GENERATION_DATA_OVERFLOW,
@@ -461,7 +462,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
graph->read_generation_data = 1;
}
- if (s->commit_graph_changed_paths_version) {
+ if (r->settings.commit_graph_changed_paths_version) {
read_chunk(cf, GRAPH_CHUNKID_BLOOMINDEXES,
graph_read_bloom_index, graph);
read_chunk(cf, GRAPH_CHUNKID_BLOOMDATA,
@@ -478,7 +479,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
}
oidread(&graph->oid, graph->data + graph->data_len - graph->hash_algo->rawsz,
- the_repository->hash_algo);
+ r->hash_algo);
free_chunkfile(cf);
return graph;
diff --git a/commit-graph.h b/commit-graph.h
index f20d28ff3a0..5a5c876af0b 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -128,7 +128,7 @@ struct repo_settings;
* Callers should initialize the repo_settings with prepare_repo_settings()
* prior to calling parse_commit_graph().
*/
-struct commit_graph *parse_commit_graph(struct repo_settings *s,
+struct commit_graph *parse_commit_graph(struct repository *r,
void *graph_map, size_t graph_size);
/*
diff --git a/oss-fuzz/fuzz-commit-graph.c b/oss-fuzz/fuzz-commit-graph.c
index fbb77fec197..fb8b8787a46 100644
--- a/oss-fuzz/fuzz-commit-graph.c
+++ b/oss-fuzz/fuzz-commit-graph.c
@@ -4,9 +4,6 @@
#include "commit-graph.h"
#include "repository.h"
-struct commit_graph *parse_commit_graph(struct repo_settings *s,
- void *graph_map, size_t graph_size);
-
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size);
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
@@ -22,9 +19,10 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
* possible.
*/
repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+ the_repository->settings.initialized = 1;
the_repository->settings.commit_graph_generation_version = 2;
the_repository->settings.commit_graph_changed_paths_version = 1;
- g = parse_commit_graph(&the_repository->settings, (void *)data, size);
+ g = parse_commit_graph(the_repository, (void *)data, size);
repo_clear(the_repository);
free_commit_graph(g);
--
2.51.0.rc0.215.g125493bb4a.dirty
* [PATCH v2 08/10] commit-graph: stop using `the_hash_algo`
2025-08-06 12:00 ` [PATCH v2 00/10] " Patrick Steinhardt
` (6 preceding siblings ...)
2025-08-06 12:00 ` [PATCH v2 07/10] commit-graph: refactor `parse_commit_graph()` to take a repository Patrick Steinhardt
@ 2025-08-06 12:00 ` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 09/10] commit-graph: stop using `the_repository` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 10/10] commit-graph: stop passing in redundant repository Patrick Steinhardt
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 12:00 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
Stop using `the_hash_algo` as it implicitly relies on `the_repository`.
Instead, we either use the hash algo provided via the context or, if
there is no such hash algo, we use `the_repository` explicitly. Such
uses will be removed in subsequent commits.
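As a rough sketch of what "provided via the context" means for the chain
file checks, assuming a hypothetical `struct hash_algo_sketch` and helper
names that do not exist in Git: the size arithmetic depends only on the
algorithm handed in by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for `struct git_hash_algo`. */
struct hash_algo_sketch {
	size_t rawsz; /* raw OID size in bytes */
	size_t hexsz; /* hex OID size in characters */
};

/* True if the file cannot hold even one hex object ID. */
static int sketch_chain_too_small(size_t file_size,
				  const struct hash_algo_sketch *algop)
{
	assert(algop);
	return file_size < algop->hexsz;
}

/* Number of complete "hex OID + newline" lines that fit in the file. */
static size_t sketch_chain_count(size_t file_size,
				 const struct hash_algo_sketch *algop)
{
	assert(algop);
	return file_size / (algop->hexsz + 1);
}
```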
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/commit-graph.c | 3 ++-
commit-graph.c | 27 ++++++++++++++-------------
commit-graph.h | 3 ++-
3 files changed, 18 insertions(+), 15 deletions(-)
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 145802afb7..680b03a83a 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -108,7 +108,8 @@ static int graph_verify(int argc, const char **argv, const char *prefix,
opened = OPENED_GRAPH;
else if (errno != ENOENT)
die_errno(_("Could not open commit-graph '%s'"), graph_name);
- else if (open_commit_graph_chain(chain_name, &fd, &st))
+ else if (open_commit_graph_chain(chain_name, &fd, &st,
+ the_repository->hash_algo))
opened = OPENED_CHAIN;
else if (errno != ENOENT)
die_errno(_("could not open commit-graph chain '%s'"), chain_name);
diff --git a/commit-graph.c b/commit-graph.c
index 50391cc0f3..d351ea5806 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -263,7 +263,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_size = xsize_t(st->st_size);
- if (graph_size < graph_min_size(the_hash_algo)) {
+ if (graph_size < graph_min_size(r->hash_algo)) {
close(fd);
error(_("commit-graph file is too small"));
return NULL;
@@ -318,7 +318,7 @@ static int graph_read_commit_data(const unsigned char *chunk_start,
size_t chunk_size, void *data)
{
struct commit_graph *g = data;
- if (chunk_size / graph_data_width(the_hash_algo) != g->num_commits)
+ if (chunk_size / graph_data_width(g->hash_algo) != g->num_commits)
return error(_("commit-graph commit data chunk is wrong size"));
g->chunk_commit_data = chunk_start;
return 0;
@@ -619,7 +619,8 @@ static int add_graph_to_chain(struct commit_graph *g,
}
int open_commit_graph_chain(const char *chain_file,
- int *fd, struct stat *st)
+ int *fd, struct stat *st,
+ const struct git_hash_algo *hash_algo)
{
*fd = git_open(chain_file);
if (*fd < 0)
@@ -628,7 +629,7 @@ int open_commit_graph_chain(const char *chain_file,
close(*fd);
return 0;
}
- if (st->st_size < (ssize_t) the_hash_algo->hexsz) {
+ if (st->st_size < (ssize_t) hash_algo->hexsz) {
close(*fd);
if (!st->st_size) {
/* treat empty files the same as missing */
@@ -653,7 +654,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
FILE *fp = xfdopen(fd, "r");
size_t count;
- count = st->st_size / (the_hash_algo->hexsz + 1);
+ count = st->st_size / (r->hash_algo->hexsz + 1);
CALLOC_ARRAY(oids, count);
odb_prepare_alternates(r->objects);
@@ -715,7 +716,7 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r,
int fd;
struct commit_graph *g = NULL;
- if (open_commit_graph_chain(chain_file, &fd, &st)) {
+ if (open_commit_graph_chain(chain_file, &fd, &st, r->hash_algo)) {
int incomplete;
/* ownership of fd is taken over by load function */
g = load_commit_graph_chain_fd_st(r, fd, &st, &incomplete);
@@ -907,7 +908,7 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
die(_("invalid commit position. commit-graph is likely corrupt"));
lex_index = pos - g->num_commits_in_base;
- commit_data = g->chunk_commit_data + st_mult(graph_data_width(the_hash_algo), lex_index);
+ commit_data = g->chunk_commit_data + st_mult(graph_data_width(g->hash_algo), lex_index);
graph_data = commit_graph_data_at(item);
graph_data->graph_pos = pos;
@@ -1111,7 +1112,7 @@ static struct tree *load_tree_for_commit(struct repository *r,
g = g->base_graph;
commit_data = g->chunk_commit_data +
- st_mult(graph_data_width(the_hash_algo),
+ st_mult(graph_data_width(g->hash_algo),
graph_pos - g->num_commits_in_base);
oidread(&oid, commit_data, the_repository->hash_algo);
@@ -1221,7 +1222,7 @@ static int write_graph_chunk_oids(struct hashfile *f,
for (count = 0; count < ctx->commits.nr; count++, list++) {
display_progress(ctx->progress, ++ctx->progress_cnt);
- hashwrite(f, (*list)->object.oid.hash, the_hash_algo->rawsz);
+ hashwrite(f, (*list)->object.oid.hash, f->algop->rawsz);
}
return 0;
@@ -1252,7 +1253,7 @@ static int write_graph_chunk_data(struct hashfile *f,
die(_("unable to parse commit %s"),
oid_to_hex(&(*list)->object.oid));
tree = get_commit_tree_oid(*list);
- hashwrite(f, tree->hash, the_hash_algo->rawsz);
+ hashwrite(f, tree->hash, ctx->r->hash_algo->rawsz);
parent = (*list)->parents;
@@ -2034,7 +2035,7 @@ static size_t write_graph_chunk_base_1(struct hashfile *f,
return 0;
num = write_graph_chunk_base_1(f, g->base_graph);
- hashwrite(f, g->oid.hash, the_hash_algo->rawsz);
+ hashwrite(f, g->oid.hash, g->hash_algo->rawsz);
return num + 1;
}
@@ -2058,7 +2059,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
struct hashfile *f;
struct tempfile *graph_layer; /* when ctx->split is non-zero */
struct lock_file lk = LOCK_INIT;
- const unsigned hashsz = the_hash_algo->rawsz;
+ const unsigned hashsz = ctx->r->hash_algo->rawsz;
struct strbuf progress_title = STRBUF_INIT;
struct chunkfile *cf;
unsigned char file_hash[GIT_MAX_RAWSZ];
@@ -2146,7 +2147,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
hashwrite_be32(f, GRAPH_SIGNATURE);
hashwrite_u8(f, GRAPH_VERSION);
- hashwrite_u8(f, oid_version(the_hash_algo));
+ hashwrite_u8(f, oid_version(ctx->r->hash_algo));
hashwrite_u8(f, get_num_chunks(cf));
hashwrite_u8(f, ctx->num_commit_graphs_after - 1);
diff --git a/commit-graph.h b/commit-graph.h
index 5a5c876af0..6de624785c 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -32,7 +32,8 @@ struct string_list;
char *get_commit_graph_filename(struct odb_source *source);
char *get_commit_graph_chain_filename(struct odb_source *source);
int open_commit_graph(const char *graph_file, int *fd, struct stat *st);
-int open_commit_graph_chain(const char *chain_file, int *fd, struct stat *st);
+int open_commit_graph_chain(const char *chain_file, int *fd, struct stat *st,
+ const struct git_hash_algo *hash_algo);
/*
* Given a commit struct, try to fill the commit struct info, including:
--
2.51.0.rc0.215.g125493bb4a.dirty
* [PATCH v2 09/10] commit-graph: stop using `the_repository`
2025-08-06 12:00 ` [PATCH v2 00/10] " Patrick Steinhardt
` (7 preceding siblings ...)
2025-08-06 12:00 ` [PATCH v2 08/10] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
@ 2025-08-06 12:00 ` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 10/10] commit-graph: stop passing in redundant repository Patrick Steinhardt
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 12:00 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
There are still a number of uses of `the_repository` in "commit-graph.c",
which we want to stop using because it is a global variable. Refactor
the code to stop using `the_repository` in favor of the repository
provided via the calling context.
This allows us to drop the `USE_THE_REPOSITORY_VARIABLE` macro.
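For callbacks, "provided via the calling context" amounts to carrying the
repository in the callback payload. The following is a simplified sketch
with hypothetical types and functions (`struct repo_sketch`,
`sketch_for_each_ref()`), not the refs API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for `struct repository`. */
struct repo_sketch {
	const char *name;
	int commit_count;
};

/* Callback payload that carries the repository instead of using a global. */
struct cb_data_sketch {
	struct repo_sketch *repo;
	int seen;
};

typedef int (*each_ref_fn_sketch)(const char *refname, void *cb_data);

/* Records one ref against the repository found in the payload. */
static int sketch_add_ref(const char *refname, void *cb_data)
{
	struct cb_data_sketch *data = cb_data;

	(void)refname; /* unused in this sketch */
	assert(data && data->repo);
	data->repo->commit_count++;
	data->seen++;
	return 0;
}

/* Calls `fn` for every ref name; stops at the first non-zero return. */
static int sketch_for_each_ref(const char *const *refs, size_t nr,
			       each_ref_fn_sketch fn, void *cb_data)
{
	for (size_t i = 0; i < nr; i++) {
		int ret = fn(refs[i], cb_data);
		if (ret)
			return ret;
	}
	return 0;
}
```

Only the repository named in the payload is touched, so the same callback
works for any number of repositories in one process.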
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/commit.c | 2 +-
builtin/merge.c | 2 +-
commit-graph.c | 77 ++++++++++++++++++++++++++++----------------------------
commit-graph.h | 2 +-
4 files changed, 42 insertions(+), 41 deletions(-)
diff --git a/builtin/commit.c b/builtin/commit.c
index 63e7158e98..8ca0aede48 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1933,7 +1933,7 @@ int cmd_commit(int argc,
"new index file. Check that disk is not full and quota is\n"
"not exceeded, and then \"git restore --staged :/\" to recover."));
- git_test_write_commit_graph_or_die();
+ git_test_write_commit_graph_or_die(the_repository->objects->sources);
repo_rerere(the_repository, 0);
run_auto_maintenance(quiet);
diff --git a/builtin/merge.c b/builtin/merge.c
index 18b22c0a26..263cb58471 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1862,7 +1862,7 @@ int cmd_merge(int argc,
if (squash) {
finish(head_commit, remoteheads, NULL, NULL);
- git_test_write_commit_graph_or_die();
+ git_test_write_commit_graph_or_die(the_repository->objects->sources);
} else
write_merge_state(remoteheads);
diff --git a/commit-graph.c b/commit-graph.c
index d351ea5806..35143c356c 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1,5 +1,3 @@
-#define USE_THE_REPOSITORY_VARIABLE
-
#include "git-compat-util.h"
#include "config.h"
#include "csum-file.h"
@@ -27,7 +25,7 @@
#include "tree.h"
#include "chunk-format.h"
-void git_test_write_commit_graph_or_die(void)
+void git_test_write_commit_graph_or_die(struct odb_source *source)
{
int flags = 0;
if (!git_env_bool(GIT_TEST_COMMIT_GRAPH, 0))
@@ -36,8 +34,7 @@ void git_test_write_commit_graph_or_die(void)
if (git_env_bool(GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS, 0))
flags = COMMIT_GRAPH_WRITE_BLOOM_FILTERS;
- if (write_commit_graph_reachable(the_repository->objects->sources,
- flags, NULL))
+ if (write_commit_graph_reachable(source, flags, NULL))
die("failed to write commit-graph under GIT_TEST_COMMIT_GRAPH");
}
@@ -595,7 +592,7 @@ static int add_graph_to_chain(struct commit_graph *g,
if (!cur_g ||
!oideq(&oids[n], &cur_g->oid) ||
!hasheq(oids[n].hash, g->chunk_base_graphs + st_mult(g->hash_algo->rawsz, n),
- the_repository->hash_algo)) {
+ g->hash_algo)) {
warning(_("commit-graph chain does not match"));
return 0;
}
@@ -665,7 +662,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
if (strbuf_getline_lf(&line, fp) == EOF)
break;
- if (get_oid_hex(line.buf, &oids[i])) {
+ if (get_oid_hex_algop(line.buf, &oids[i], r->hash_algo)) {
warning(_("invalid commit-graph chain: line '%s' not a hash"),
line.buf);
valid = 0;
@@ -751,7 +748,7 @@ static void prepare_commit_graph_one(struct repository *r,
* Return 1 if commit_graph is non-NULL, and 0 otherwise.
*
* On the first invocation, this function attempts to load the commit
- * graph if the_repository is configured to have one.
+ * graph if the repository is configured to have one.
*/
static int prepare_commit_graph(struct repository *r)
{
@@ -872,7 +869,7 @@ static void load_oid_from_graph(struct commit_graph *g,
lex_index = pos - g->num_commits_in_base;
oidread(oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, lex_index),
- the_repository->hash_algo);
+ g->hash_algo);
}
static struct commit_list **insert_parent_or_die(struct repository *r,
@@ -1115,7 +1112,7 @@ static struct tree *load_tree_for_commit(struct repository *r,
st_mult(graph_data_width(g->hash_algo),
graph_pos - g->num_commits_in_base);
- oidread(&oid, commit_data, the_repository->hash_algo);
+ oidread(&oid, commit_data, g->hash_algo);
set_commit_tree(c, lookup_tree(r, &oid));
return c->maybe_tree;
@@ -1543,7 +1540,7 @@ static void close_reachable(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Loading known commits in commit graph"),
ctx->oids.nr);
for (i = 0; i < ctx->oids.nr; i++) {
@@ -1561,7 +1558,7 @@ static void close_reachable(struct write_commit_graph_context *ctx)
*/
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Expanding reachable commits in commit graph"),
0);
for (i = 0; i < ctx->oids.nr; i++) {
@@ -1582,7 +1579,7 @@ static void close_reachable(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Clearing commit marks in commit graph"),
ctx->oids.nr);
for (i = 0; i < ctx->oids.nr; i++) {
@@ -1699,7 +1696,7 @@ static void compute_topological_levels(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
info.progress = ctx->progress
= start_delayed_progress(
- the_repository,
+ ctx->r,
_("Computing commit graph topological levels"),
ctx->commits.nr);
@@ -1734,7 +1731,7 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
info.progress = ctx->progress
= start_delayed_progress(
- the_repository,
+ ctx->r,
_("Computing commit graph generation numbers"),
ctx->commits.nr);
@@ -1811,7 +1808,7 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Computing commit changed paths Bloom filters"),
ctx->commits.nr);
@@ -1857,6 +1854,7 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
}
struct refs_cb_data {
+ struct repository *repo;
struct oidset *commits;
struct progress *progress;
};
@@ -1869,9 +1867,9 @@ static int add_ref_to_set(const char *refname UNUSED,
struct object_id peeled;
struct refs_cb_data *data = (struct refs_cb_data *)cb_data;
- if (!peel_iterated_oid(the_repository, oid, &peeled))
+ if (!peel_iterated_oid(data->repo, oid, &peeled))
oid = &peeled;
- if (odb_read_object_info(the_repository->objects, oid, NULL) == OBJ_COMMIT)
+ if (odb_read_object_info(data->repo->objects, oid, NULL) == OBJ_COMMIT)
oidset_insert(data->commits, oid);
display_progress(data->progress, oidset_size(data->commits));
@@ -1888,13 +1886,15 @@ int write_commit_graph_reachable(struct odb_source *source,
int result;
memset(&data, 0, sizeof(data));
+ data.repo = source->odb->repo;
data.commits = &commits;
+
if (flags & COMMIT_GRAPH_WRITE_PROGRESS)
data.progress = start_delayed_progress(
- the_repository,
+ source->odb->repo,
_("Collecting referenced commits"), 0);
- refs_for_each_ref(get_main_ref_store(the_repository), add_ref_to_set,
+ refs_for_each_ref(get_main_ref_store(source->odb->repo), add_ref_to_set,
&data);
stop_progress(&data.progress);
@@ -1923,7 +1923,7 @@ static int fill_oids_from_packs(struct write_commit_graph_context *ctx,
"Finding commits for commit graph in %"PRIuMAX" packs",
pack_indexes->nr),
(uintmax_t)pack_indexes->nr);
- ctx->progress = start_delayed_progress(the_repository,
+ ctx->progress = start_delayed_progress(ctx->r,
progress_title.buf, 0);
ctx->progress_done = 0;
}
@@ -1977,7 +1977,7 @@ static void fill_oids_from_all_packs(struct write_commit_graph_context *ctx)
{
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Finding commits for commit graph among packed objects"),
ctx->approx_nr_objects);
for_each_packed_object(ctx->r, add_packed_commits, ctx,
@@ -1996,7 +1996,7 @@ static void copy_oids_to_commits(struct write_commit_graph_context *ctx)
ctx->num_extra_edges = 0;
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Finding extra edges in commit graph"),
ctx->oids.nr);
oid_array_sort(&ctx->oids);
@@ -2075,7 +2075,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
ctx->graph_name = get_commit_graph_filename(ctx->odb_source);
}
- if (safe_create_leading_directories(the_repository, ctx->graph_name)) {
+ if (safe_create_leading_directories(ctx->r, ctx->graph_name)) {
error(_("unable to create leading directories of %s"),
ctx->graph_name);
return -1;
@@ -2094,18 +2094,18 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
return -1;
}
- if (adjust_shared_perm(the_repository, get_tempfile_path(graph_layer))) {
+ if (adjust_shared_perm(ctx->r, get_tempfile_path(graph_layer))) {
error(_("unable to adjust shared permissions for '%s'"),
get_tempfile_path(graph_layer));
return -1;
}
- f = hashfd(the_repository->hash_algo,
+ f = hashfd(ctx->r->hash_algo,
get_tempfile_fd(graph_layer), get_tempfile_path(graph_layer));
} else {
hold_lock_file_for_update_mode(&lk, ctx->graph_name,
LOCK_DIE_ON_ERROR, 0444);
- f = hashfd(the_repository->hash_algo,
+ f = hashfd(ctx->r->hash_algo,
get_lock_file_fd(&lk), get_lock_file_path(&lk));
}
@@ -2158,7 +2158,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
get_num_chunks(cf)),
get_num_chunks(cf));
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
progress_title.buf,
st_mult(get_num_chunks(cf), ctx->commits.nr));
}
@@ -2216,7 +2216,8 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
}
free(ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1]);
- ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1] = xstrdup(hash_to_hex(file_hash));
+ ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1] =
+ xstrdup(hash_to_hex_algop(file_hash, ctx->r->hash_algo));
final_graph_name = get_split_graph_filename(ctx->odb_source,
ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1]);
free(ctx->commit_graph_filenames_after[ctx->num_commit_graphs_after - 1]);
@@ -2370,7 +2371,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Scanning merged commits"),
ctx->commits.nr);
@@ -2415,7 +2416,7 @@ static void merge_commit_graphs(struct write_commit_graph_context *ctx)
current_graph_number--;
if (ctx->report_progress)
- ctx->progress = start_delayed_progress(the_repository,
+ ctx->progress = start_delayed_progress(ctx->r,
_("Merging commit-graph"), 0);
merge_commit_graph(ctx, g);
@@ -2518,7 +2519,7 @@ int write_commit_graph(struct odb_source *source,
enum commit_graph_write_flags flags,
const struct commit_graph_opts *opts)
{
- struct repository *r = the_repository;
+ struct repository *r = source->odb->repo;
struct write_commit_graph_context ctx = {
.r = r,
.odb_source = source,
@@ -2618,14 +2619,14 @@ int write_commit_graph(struct odb_source *source,
replace = ctx.opts->split_flags & COMMIT_GRAPH_SPLIT_REPLACE;
}
- ctx.approx_nr_objects = repo_approximate_object_count(the_repository);
+ ctx.approx_nr_objects = repo_approximate_object_count(r);
if (ctx.append && ctx.r->objects->commit_graph) {
struct commit_graph *g = ctx.r->objects->commit_graph;
for (i = 0; i < g->num_commits; i++) {
struct object_id oid;
oidread(&oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
- the_repository->hash_algo);
+ r->hash_algo);
oid_array_append(&ctx.oids, &oid);
}
}
@@ -2733,7 +2734,7 @@ static void graph_report(const char *fmt, ...)
static int commit_graph_checksum_valid(struct commit_graph *g)
{
- return hashfile_checksum_valid(the_repository->hash_algo,
+ return hashfile_checksum_valid(g->hash_algo,
g->data, g->data_len);
}
@@ -2756,7 +2757,7 @@ static int verify_one_commit_graph(struct repository *r,
struct commit *graph_commit;
oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
- the_repository->hash_algo);
+ g->hash_algo);
if (i && oidcmp(&prev_oid, &cur_oid) >= 0)
graph_report(_("commit-graph has incorrect OID order: %s then %s"),
@@ -2801,7 +2802,7 @@ static int verify_one_commit_graph(struct repository *r,
display_progress(progress, ++(*seen));
oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
- the_repository->hash_algo);
+ g->hash_algo);
graph_commit = lookup_commit(r, &cur_oid);
odb_commit = (struct commit *)create_object(r, &cur_oid, alloc_commit_node(r));
@@ -2905,7 +2906,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
if (!(flags & COMMIT_GRAPH_VERIFY_SHALLOW))
total += g->num_commits_in_base;
- progress = start_progress(the_repository,
+ progress = start_progress(r,
_("Verifying commits in commit graph"),
total);
}
diff --git a/commit-graph.h b/commit-graph.h
index 6de624785c..a2471e9bdc 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -21,7 +21,7 @@
* call this method oustide of a builtin, and only if you know what
* you are doing!
*/
-void git_test_write_commit_graph_or_die(void);
+void git_test_write_commit_graph_or_die(struct odb_source *source);
struct commit;
struct bloom_filter_settings;
--
2.51.0.rc0.215.g125493bb4a.dirty
^ permalink raw reply related [flat|nested] 69+ messages in thread
* [PATCH v2 10/10] commit-graph: stop passing in redundant repository
2025-08-06 12:00 ` [PATCH v2 00/10] " Patrick Steinhardt
` (8 preceding siblings ...)
2025-08-06 12:00 ` [PATCH v2 09/10] commit-graph: stop using `the_repository` Patrick Steinhardt
@ 2025-08-06 12:00 ` Patrick Steinhardt
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-06 12:00 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
Many of the commit-graph related functions take in both a repository and
the object database source (directly or via `struct commit_graph`) for
which we are supposed to load such a commit-graph. In the best case this
information is simply redundant as the source already contains a
reference to its owning object database, which in turn has a reference
to its repository. In the worst case this information could even
mismatch when passing in a source that doesn't belong to the same
repository.
Refactor the code so that we only pass in the object database source in
those cases.
There is one exception though, namely `load_commit_graph_chain_fd_st()`,
which is responsible for loading a commit-graph chain. It is expected
that parts of the commit-graph chain aren't located in the same object
source as the chain file itself, but in a different one. Consequently,
this function doesn't work on the source level but on the database level
instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/commit-graph.c | 6 +--
commit-graph.c | 120 +++++++++++++++++++--------------------------
commit-graph.h | 12 ++---
t/helper/test-read-graph.c | 2 +-
4 files changed, 59 insertions(+), 81 deletions(-)
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 680b03a83a..1b80993b2d 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -121,15 +121,15 @@ static int graph_verify(int argc, const char **argv, const char *prefix,
if (opened == OPENED_NONE)
return 0;
else if (opened == OPENED_GRAPH)
- graph = load_commit_graph_one_fd_st(the_repository, fd, &st, source);
+ graph = load_commit_graph_one_fd_st(source, fd, &st);
else
- graph = load_commit_graph_chain_fd_st(the_repository, fd, &st,
+ graph = load_commit_graph_chain_fd_st(the_repository->objects, fd, &st,
&incomplete_chain);
if (!graph)
return 1;
- ret = verify_commit_graph(the_repository, graph, flags);
+ ret = verify_commit_graph(graph, flags);
free_commit_graph(graph);
if (incomplete_chain) {
diff --git a/commit-graph.c b/commit-graph.c
index 35143c356c..e2e70ea4f6 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -250,9 +250,8 @@ int open_commit_graph(const char *graph_file, int *fd, struct stat *st)
return 1;
}
-struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
- int fd, struct stat *st,
- struct odb_source *source)
+struct commit_graph *load_commit_graph_one_fd_st(struct odb_source *source,
+ int fd, struct stat *st)
{
void *graph_map;
size_t graph_size;
@@ -260,7 +259,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_size = xsize_t(st->st_size);
- if (graph_size < graph_min_size(r->hash_algo)) {
+ if (graph_size < graph_min_size(source->odb->repo->hash_algo)) {
close(fd);
error(_("commit-graph file is too small"));
return NULL;
@@ -268,7 +267,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_map = xmmap(NULL, graph_size, PROT_READ, MAP_PRIVATE, fd, 0);
close(fd);
- ret = parse_commit_graph(r, graph_map, graph_size);
+ ret = parse_commit_graph(source->odb->repo, graph_map, graph_size);
if (ret)
ret->odb_source = source;
else
@@ -488,11 +487,9 @@ struct commit_graph *parse_commit_graph(struct repository *r,
return NULL;
}
-static struct commit_graph *load_commit_graph_one(struct repository *r,
- const char *graph_file,
- struct odb_source *source)
+static struct commit_graph *load_commit_graph_one(struct odb_source *source,
+ const char *graph_file)
{
-
struct stat st;
int fd;
struct commit_graph *g;
@@ -501,19 +498,17 @@ static struct commit_graph *load_commit_graph_one(struct repository *r,
if (!open_ok)
return NULL;
- g = load_commit_graph_one_fd_st(r, fd, &st, source);
-
+ g = load_commit_graph_one_fd_st(source, fd, &st);
if (g)
g->filename = xstrdup(graph_file);
return g;
}
-static struct commit_graph *load_commit_graph_v1(struct repository *r,
- struct odb_source *source)
+static struct commit_graph *load_commit_graph_v1(struct odb_source *source)
{
char *graph_name = get_commit_graph_filename(source);
- struct commit_graph *g = load_commit_graph_one(r, graph_name, source);
+ struct commit_graph *g = load_commit_graph_one(source, graph_name);
free(graph_name);
return g;
@@ -640,7 +635,7 @@ int open_commit_graph_chain(const char *chain_file,
return 1;
}
-struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+struct commit_graph *load_commit_graph_chain_fd_st(struct object_database *odb,
int fd, struct stat *st,
int *incomplete_chain)
{
@@ -651,10 +646,10 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
FILE *fp = xfdopen(fd, "r");
size_t count;
- count = st->st_size / (r->hash_algo->hexsz + 1);
+ count = st->st_size / (odb->repo->hash_algo->hexsz + 1);
CALLOC_ARRAY(oids, count);
- odb_prepare_alternates(r->objects);
+ odb_prepare_alternates(odb);
for (size_t i = 0; i < count; i++) {
struct odb_source *source;
@@ -662,7 +657,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
if (strbuf_getline_lf(&line, fp) == EOF)
break;
- if (get_oid_hex_algop(line.buf, &oids[i], r->hash_algo)) {
+ if (get_oid_hex_algop(line.buf, &oids[i], odb->repo->hash_algo)) {
warning(_("invalid commit-graph chain: line '%s' not a hash"),
line.buf);
valid = 0;
@@ -670,9 +665,9 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
}
valid = 0;
- for (source = r->objects->sources; source; source = source->next) {
+ for (source = odb->sources; source; source = source->next) {
char *graph_name = get_split_graph_filename(source, line.buf);
- struct commit_graph *g = load_commit_graph_one(r, graph_name, source);
+ struct commit_graph *g = load_commit_graph_one(source, graph_name);
free(graph_name);
@@ -705,45 +700,33 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
return graph_chain;
}
-static struct commit_graph *load_commit_graph_chain(struct repository *r,
- struct odb_source *source)
+static struct commit_graph *load_commit_graph_chain(struct odb_source *source)
{
char *chain_file = get_commit_graph_chain_filename(source);
struct stat st;
int fd;
struct commit_graph *g = NULL;
- if (open_commit_graph_chain(chain_file, &fd, &st, r->hash_algo)) {
+ if (open_commit_graph_chain(chain_file, &fd, &st, source->odb->repo->hash_algo)) {
int incomplete;
/* ownership of fd is taken over by load function */
- g = load_commit_graph_chain_fd_st(r, fd, &st, &incomplete);
+ g = load_commit_graph_chain_fd_st(source->odb, fd, &st, &incomplete);
}
free(chain_file);
return g;
}
-struct commit_graph *read_commit_graph_one(struct repository *r,
- struct odb_source *source)
+struct commit_graph *read_commit_graph_one(struct odb_source *source)
{
- struct commit_graph *g = load_commit_graph_v1(r, source);
+ struct commit_graph *g = load_commit_graph_v1(source);
if (!g)
- g = load_commit_graph_chain(r, source);
+ g = load_commit_graph_chain(source);
return g;
}
-static void prepare_commit_graph_one(struct repository *r,
- struct odb_source *source)
-{
-
- if (r->objects->commit_graph)
- return;
-
- r->objects->commit_graph = read_commit_graph_one(r, source);
-}
-
/*
* Return 1 if commit_graph is non-NULL, and 0 otherwise.
*
@@ -784,10 +767,12 @@ static int prepare_commit_graph(struct repository *r)
return 0;
odb_prepare_alternates(r->objects);
- for (source = r->objects->sources;
- !r->objects->commit_graph && source;
- source = source->next)
- prepare_commit_graph_one(r, source);
+ for (source = r->objects->sources; source; source = source->next) {
+ r->objects->commit_graph = read_commit_graph_one(source);
+ if (r->objects->commit_graph)
+ break;
+ }
+
return !!r->objects->commit_graph;
}
@@ -872,8 +857,7 @@ static void load_oid_from_graph(struct commit_graph *g,
g->hash_algo);
}
-static struct commit_list **insert_parent_or_die(struct repository *r,
- struct commit_graph *g,
+static struct commit_list **insert_parent_or_die(struct commit_graph *g,
uint32_t pos,
struct commit_list **pptr)
{
@@ -884,7 +868,7 @@ static struct commit_list **insert_parent_or_die(struct repository *r,
die("invalid parent position %"PRIu32, pos);
load_oid_from_graph(g, pos, &oid);
- c = lookup_commit(r, &oid);
+ c = lookup_commit(g->odb_source->odb->repo, &oid);
if (!c)
die(_("could not find commit %s"), oid_to_hex(&oid));
commit_graph_data_at(c)->graph_pos = pos;
@@ -940,8 +924,7 @@ static inline void set_commit_tree(struct commit *c, struct tree *t)
c->maybe_tree = t;
}
-static int fill_commit_in_graph(struct repository *r,
- struct commit *item,
+static int fill_commit_in_graph(struct commit *item,
struct commit_graph *g, uint32_t pos)
{
uint32_t edge_value;
@@ -967,13 +950,13 @@ static int fill_commit_in_graph(struct repository *r,
edge_value = get_be32(commit_data + g->hash_algo->rawsz);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
- pptr = insert_parent_or_die(r, g, edge_value, pptr);
+ pptr = insert_parent_or_die(g, edge_value, pptr);
edge_value = get_be32(commit_data + g->hash_algo->rawsz + 4);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
if (!(edge_value & GRAPH_EXTRA_EDGES_NEEDED)) {
- pptr = insert_parent_or_die(r, g, edge_value, pptr);
+ pptr = insert_parent_or_die(g, edge_value, pptr);
return 1;
}
@@ -988,7 +971,7 @@ static int fill_commit_in_graph(struct repository *r,
}
edge_value = get_be32(g->chunk_extra_edges +
sizeof(uint32_t) * parent_data_pos);
- pptr = insert_parent_or_die(r, g,
+ pptr = insert_parent_or_die(g,
edge_value & GRAPH_EDGE_LAST_MASK,
pptr);
parent_data_pos++;
@@ -1054,14 +1037,13 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
if (commit->object.parsed)
return commit;
- if (!fill_commit_in_graph(repo, commit, repo->objects->commit_graph, pos))
+ if (!fill_commit_in_graph(commit, repo->objects->commit_graph, pos))
return NULL;
return commit;
}
-static int parse_commit_in_graph_one(struct repository *r,
- struct commit_graph *g,
+static int parse_commit_in_graph_one(struct commit_graph *g,
struct commit *item)
{
uint32_t pos;
@@ -1070,7 +1052,7 @@ static int parse_commit_in_graph_one(struct repository *r,
return 1;
if (find_commit_pos_in_graph(item, g, &pos))
- return fill_commit_in_graph(r, item, g, pos);
+ return fill_commit_in_graph(item, g, pos);
return 0;
}
@@ -1087,7 +1069,7 @@ int parse_commit_in_graph(struct repository *r, struct commit *item)
if (!prepare_commit_graph(r))
return 0;
- return parse_commit_in_graph_one(r, r->objects->commit_graph, item);
+ return parse_commit_in_graph_one(r->objects->commit_graph, item);
}
void load_commit_graph_info(struct repository *r, struct commit *item)
@@ -1097,8 +1079,7 @@ void load_commit_graph_info(struct repository *r, struct commit *item)
fill_commit_graph_info(item, r->objects->commit_graph, pos);
}
-static struct tree *load_tree_for_commit(struct repository *r,
- struct commit_graph *g,
+static struct tree *load_tree_for_commit(struct commit_graph *g,
struct commit *c)
{
struct object_id oid;
@@ -1113,13 +1094,12 @@ static struct tree *load_tree_for_commit(struct repository *r,
graph_pos - g->num_commits_in_base);
oidread(&oid, commit_data, g->hash_algo);
- set_commit_tree(c, lookup_tree(r, &oid));
+ set_commit_tree(c, lookup_tree(g->odb_source->odb->repo, &oid));
return c->maybe_tree;
}
-static struct tree *get_commit_tree_in_graph_one(struct repository *r,
- struct commit_graph *g,
+static struct tree *get_commit_tree_in_graph_one(struct commit_graph *g,
const struct commit *c)
{
if (c->maybe_tree)
@@ -1127,12 +1107,12 @@ static struct tree *get_commit_tree_in_graph_one(struct repository *r,
if (commit_graph_position(c) == COMMIT_NOT_FROM_GRAPH)
BUG("get_commit_tree_in_graph_one called from non-commit-graph commit");
- return load_tree_for_commit(r, g, (struct commit *)c);
+ return load_tree_for_commit(g, (struct commit *)c);
}
struct tree *get_commit_tree_in_graph(struct repository *r, const struct commit *c)
{
- return get_commit_tree_in_graph_one(r, r->objects->commit_graph, c);
+ return get_commit_tree_in_graph_one(r->objects->commit_graph, c);
}
struct packed_commit_list {
@@ -2738,11 +2718,11 @@ static int commit_graph_checksum_valid(struct commit_graph *g)
g->data, g->data_len);
}
-static int verify_one_commit_graph(struct repository *r,
- struct commit_graph *g,
+static int verify_one_commit_graph(struct commit_graph *g,
struct progress *progress,
uint64_t *seen)
{
+ struct repository *r = g->odb_source->odb->repo;
uint32_t i, cur_fanout_pos = 0;
struct object_id prev_oid, cur_oid;
struct commit *seen_gen_zero = NULL;
@@ -2776,7 +2756,7 @@ static int verify_one_commit_graph(struct repository *r,
}
graph_commit = lookup_commit(r, &cur_oid);
- if (!parse_commit_in_graph_one(r, g, graph_commit))
+ if (!parse_commit_in_graph_one(g, graph_commit))
graph_report(_("failed to parse commit %s from commit-graph"),
oid_to_hex(&cur_oid));
}
@@ -2812,7 +2792,7 @@ static int verify_one_commit_graph(struct repository *r,
continue;
}
- if (!oideq(&get_commit_tree_in_graph_one(r, g, graph_commit)->object.oid,
+ if (!oideq(&get_commit_tree_in_graph_one(g, graph_commit)->object.oid,
get_commit_tree_oid(odb_commit)))
graph_report(_("root tree OID for commit %s in commit-graph is %s != %s"),
oid_to_hex(&cur_oid),
@@ -2830,7 +2810,7 @@ static int verify_one_commit_graph(struct repository *r,
}
/* parse parent in case it is in a base graph */
- parse_commit_in_graph_one(r, g, graph_parents->item);
+ parse_commit_in_graph_one(g, graph_parents->item);
if (!oideq(&graph_parents->item->object.oid, &odb_parents->item->object.oid))
graph_report(_("commit-graph parent for %s is %s != %s"),
@@ -2890,7 +2870,7 @@ static int verify_one_commit_graph(struct repository *r,
return verify_commit_graph_error;
}
-int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
+int verify_commit_graph(struct commit_graph *g, int flags)
{
struct progress *progress = NULL;
int local_error = 0;
@@ -2906,13 +2886,13 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
if (!(flags & COMMIT_GRAPH_VERIFY_SHALLOW))
total += g->num_commits_in_base;
- progress = start_progress(r,
+ progress = start_progress(g->odb_source->odb->repo,
_("Verifying commits in commit graph"),
total);
}
for (; g; g = g->base_graph) {
- local_error |= verify_one_commit_graph(r, g, progress, &seen);
+ local_error |= verify_one_commit_graph(g, progress, &seen);
if (flags & COMMIT_GRAPH_VERIFY_SHALLOW)
break;
}
diff --git a/commit-graph.h b/commit-graph.h
index a2471e9bdc..ad0bce5ed1 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -114,14 +114,12 @@ struct commit_graph {
struct bloom_filter_settings *bloom_filter_settings;
};
-struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
- int fd, struct stat *st,
- struct odb_source *source);
-struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+struct commit_graph *load_commit_graph_one_fd_st(struct odb_source *source,
+ int fd, struct stat *st);
+struct commit_graph *load_commit_graph_chain_fd_st(struct object_database *odb,
int fd, struct stat *st,
int *incomplete_chain);
-struct commit_graph *read_commit_graph_one(struct repository *r,
- struct odb_source *source);
+struct commit_graph *read_commit_graph_one(struct odb_source *source);
struct repo_settings;
@@ -185,7 +183,7 @@ int write_commit_graph(struct odb_source *source,
#define COMMIT_GRAPH_VERIFY_SHALLOW (1 << 0)
-int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags);
+int verify_commit_graph(struct commit_graph *g, int flags);
void close_commit_graph(struct object_database *);
void free_commit_graph(struct commit_graph *);
diff --git a/t/helper/test-read-graph.c b/t/helper/test-read-graph.c
index ef5339bbee..6a5f64e473 100644
--- a/t/helper/test-read-graph.c
+++ b/t/helper/test-read-graph.c
@@ -81,7 +81,7 @@ int cmd__read_graph(int argc, const char **argv)
prepare_repo_settings(the_repository);
- graph = read_commit_graph_one(the_repository, source);
+ graph = read_commit_graph_one(source);
if (!graph) {
ret = 1;
goto done;
--
2.51.0.rc0.215.g125493bb4a.dirty
^ permalink raw reply related [flat|nested] 69+ messages in thread
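The commit message above relies on a source pointing to its owning object database, which in turn points to its repository. The following is a minimal sketch of that chain, not git's actual code: every type and function ending in `_sketch` is a hypothetical, heavily simplified stand-in for the real `struct odb_source`, `struct object_database`, `struct repository` and hash algorithm structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; git's real structs have many more fields. */
struct hash_algo_sketch {
	size_t rawsz; /* size of a binary object ID in bytes */
	size_t hexsz; /* size of a hex object ID in characters */
};

struct repo_sketch {
	const struct hash_algo_sketch *hash_algo;
};

struct odb_sketch {
	struct repo_sketch *repo; /* the database knows its repository */
};

struct source_sketch {
	struct odb_sketch *odb; /* the source knows its database */
};

/* Everything a caller needs is reachable from the source alone. */
static const struct hash_algo_sketch *
source_hash_algo_sketch(const struct source_sketch *source)
{
	assert(source && source->odb && source->odb->repo);
	return source->odb->repo->hash_algo;
}

/*
 * Mirrors the "size / (hexsz + 1)" computation visible in the diff:
 * one hex object ID plus a newline per chain entry.
 */
static size_t chain_entry_count_sketch(const struct source_sketch *source,
				       size_t file_size)
{
	return file_size / (source_hash_algo_sketch(source)->hexsz + 1);
}
```

Because the repository is derived from the source, a caller cannot hand in a repository that disagrees with the source, which is the mismatch the commit message warns about.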
* Re: [PATCH v2 03/10] commit-graph: fix type for some write options
2025-08-06 12:00 ` [PATCH v2 03/10] commit-graph: fix type for some write options Patrick Steinhardt
@ 2025-08-06 12:34 ` Oswald Buddenhagen
2025-08-06 15:40 ` Junio C Hamano
0 siblings, 1 reply; 69+ messages in thread
From: Oswald Buddenhagen @ 2025-08-06 12:34 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Taylor Blau, Derrick Stolee, Junio C Hamano
On Wed, Aug 06, 2025 at 02:00:08PM +0200, Patrick Steinhardt wrote:
>+ OPT_UNSIGNED(0, "max-commits", &write_opts.max_commits,
>
>+ size_t max_commits;
>
dunno, this really seems to be crying for OPT_SIZE_T being split off.
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters
2025-08-06 6:23 ` Patrick Steinhardt
@ 2025-08-06 12:54 ` Oswald Buddenhagen
2025-08-06 19:04 ` Junio C Hamano
2025-08-06 15:41 ` Junio C Hamano
1 sibling, 1 reply; 69+ messages in thread
From: Oswald Buddenhagen @ 2025-08-06 12:54 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Taylor Blau, Junio C Hamano, git
On Wed, Aug 06, 2025 at 08:23:36AM +0200, Patrick Steinhardt wrote:
>On Mon, Aug 04, 2025 at 05:44:06PM -0400, Taylor Blau wrote:
>> I wrote these counters in 312cff5207 (bloom: split
>> 'get_bloom_filter()'
>> in two, 2020-09-16) and 59f0d5073f (bloom: encode out-of-bounds filters
>> as non-empty, 2020-09-17), and I don't see a compelling reason that
>> these should be unsigned.
>
>I think that is going backwards though: the question to ask is why
>should these be signed if they cannot ever be negative?
>
>> It's true that we don't have any need for negative values here since we
>> are counting from zero, but I don't think that alone justifies changing
>> the signed-ness here.
>>
>> Is there a reason beyond "these are always non-negative" that changing
>> the signed-ness is warranted? If so, let's discuss that and make sure
>> that it is documented in the commit message. If not, I think we could
>> drop this patch (and optionally the patch before it as well).
>
>Yes, there is: it makes code easier to reason about for the reader, and
>it means that we don't have to guard ourselves against negative values.
>
>If I see a counting variable that is signed I immediately jump to the
>question of whether or not it can ever be a negative value. I assume
>that the author of this code _intentionally_ made it signed to cater to a
>specific edge case.
>
well, there is also the diametrically opposed view:
https://google.github.io/styleguide/cppguide.html#Integer_Types
https://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/
https://soundsoftware.ac.uk/c-pitfall-unsigned
https://stackoverflow.com/questions/30395205/why-are-unsigned-integers-error-prone
..
in isync, i standardized on unsigned where possible (e2d3b4d55), and
sure enough, i introduced one of those underflow bugs not much later
(859b7dd7f => 12e30ce56) ...
^ permalink raw reply [flat|nested] 69+ messages in thread
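The underflow class of bug mentioned in the message above can be sketched in a few lines. This is a generic illustration and is not taken from isync or git; all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Naive: for n == 0 the subtraction wraps around to SIZE_MAX. */
static size_t last_index_naive(size_t n)
{
	return n - 1;
}

/* Checked: the "non-empty" precondition is stated explicitly. */
static size_t last_index_checked(size_t n)
{
	assert(n > 0);
	return n - 1;
}

/*
 * A loop that stays correct for n == 0: compare "i + 1 < n" instead
 * of "i < n - 1", so no subtraction on an unsigned value is needed.
 */
static size_t count_adjacent_pairs(size_t n)
{
	size_t pairs = 0;

	for (size_t i = 0; i + 1 < n; i++)
		pairs++;
	return pairs;
}
```

With a signed counter the naive variant would yield -1, which a later `i < n` check would catch; with an unsigned one it silently becomes a very large value.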
* Re: [PATCH v2 03/10] commit-graph: fix type for some write options
2025-08-06 12:34 ` Oswald Buddenhagen
@ 2025-08-06 15:40 ` Junio C Hamano
2025-08-07 7:07 ` Patrick Steinhardt
0 siblings, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2025-08-06 15:40 UTC (permalink / raw)
To: Oswald Buddenhagen; +Cc: Patrick Steinhardt, git, Taylor Blau, Derrick Stolee
Oswald Buddenhagen <oswald.buddenhagen@gmx.de> writes:
> On Wed, Aug 06, 2025 at 02:00:08PM +0200, Patrick Steinhardt wrote:
>>+ OPT_UNSIGNED(0, "max-commits", &write_opts.max_commits,
>>
>>+ size_t max_commits;
>>
> dunno, this really seems to be crying for OPT_SIZE_T being split off.
Or just use "unsigned int".
Really, what does the NUMBER OF commits we will handle have to do
with how many bytes of core we ask to grab from the system?
This "we count things in size_t" is a superstition we should stop.
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters
2025-08-06 6:23 ` Patrick Steinhardt
2025-08-06 12:54 ` Oswald Buddenhagen
@ 2025-08-06 15:41 ` Junio C Hamano
2025-08-07 7:04 ` Patrick Steinhardt
1 sibling, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2025-08-06 15:41 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Taylor Blau, Oswald Buddenhagen, git
Patrick Steinhardt <ps@pks.im> writes:
>> I wrote these counters in 312cff5207 (bloom: split 'get_bloom_filter()'
>> in two, 2020-09-16) and 59f0d5073f (bloom: encode out-of-bounds filters
>> as non-empty, 2020-09-17), and I don't see a compelling reason that
>> these should be unsigned.
>
> I think that is going backwards though: the question to ask is why
> should these be signed if they cannot ever be negative?
Earlier I gave an example of allowing for a "not yet counted"
sentinel value for a variable or a structure member. Another
example may be a function that counts but also needs to signal
an error, and as usual in any C program, the natural way to do so
for any function whose "normal" return values are non-negative
integers is to signal errors with a negative value.
Note that a structure member or a variable may not need such a
"not yet counted" sentinel value (e.g., it may have a separate
"counted already" member associated with it, or the nature of the
thing it counts does not have such "not yet counted" state), and it
is possible for such a variable to live happily with a function that
can signal an error.
It means the variable that receives the counted result from such a
function may be able to use only half the range of values that its type
implies, if that helper function is the only source of information
that is assigned to it, though.
If the counter in question never needs to store such a sentinel
value itself, then I am OK for it to be unsigned, and that is
exactly why I said "not always a valid excuse". But if the counter
variable or structure member has to work with functions that need to
return sentinel values (like platform natural int that can use the
usual "negative is an error, non-negative is a normal result"), it
may have less chance to trigger the -Wsign-compare irritation, if
you made it also signed.
^ permalink raw reply [flat|nested] 69+ messages in thread
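The interaction described in the message above, an unsigned counter fed by a helper that signals errors with a negative return value, might look roughly like this. It is a sketch under assumptions: `count_fields()` and `add_fields()` are hypothetical helpers invented for illustration and do not exist in git.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper following the usual C convention: a
 * non-negative count on success, a negative value on error.
 */
static int count_fields(const char *line)
{
	int n = 1;

	if (!line)
		return -1;
	for (; *line; line++)
		if (*line == ',')
			n++;
	return n;
}

/*
 * The counter itself never holds a sentinel, so it can be unsigned.
 * The signed result is checked before it is added to the counter.
 */
static int add_fields(unsigned int *total, const char *line)
{
	int n = count_fields(line);

	assert(total);
	if (n < 0)
		return -1;
	*total += (unsigned int)n;
	return 0;
}
```

Skipping the `n < 0` check and assigning the result straight to the unsigned counter would turn -1 into `UINT_MAX`. The receiving side also only ever sees the non-negative half of `int`'s range, as the message points out.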
* Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters
2025-08-06 12:54 ` Oswald Buddenhagen
@ 2025-08-06 19:04 ` Junio C Hamano
0 siblings, 0 replies; 69+ messages in thread
From: Junio C Hamano @ 2025-08-06 19:04 UTC (permalink / raw)
To: Oswald Buddenhagen; +Cc: Patrick Steinhardt, Taylor Blau, git
Oswald Buddenhagen <oswald.buddenhagen@gmx.de> writes:
> On Wed, Aug 06, 2025 at 08:23:36AM +0200, Patrick Steinhardt wrote:
>>If I see a counting variable that is signed I immediately jump to the
>>question of whether or not it can ever be a negative value. I assume
>>that the author of this code _intentionally_ made it signed to cater to a
>>specific edge case.
>>
> well, there is also the diametrically opposed view:
> https://google.github.io/styleguide/cppguide.html#Integer_Types
> https://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/
> https://soundsoftware.ac.uk/c-pitfall-unsigned
> https://stackoverflow.com/questions/30395205/why-are-unsigned-integers-error-prone
> ..
>
> in isync, i standardized on unsigned where possible (e2d3b4d55), and
> sure enough, i introduced one of those underflow bugs not much later
> (859b7dd7f => 12e30ce56) ...
Thanks for an amusing reading list ;-)
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters
2025-08-06 15:41 ` Junio C Hamano
@ 2025-08-07 7:04 ` Patrick Steinhardt
2025-08-07 22:41 ` Junio C Hamano
0 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-07 7:04 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Taylor Blau, Oswald Buddenhagen, git
On Wed, Aug 06, 2025 at 08:41:32AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> >> I wrote these counters in 312cff5207 (bloom: split 'get_bloom_filter()'
> >> in two, 2020-09-16) and 59f0d5073f (bloom: encode out-of-bounds filters
> >> as non-empty, 2020-09-17), and I don't see a compelling reason that
> >> these should be unsigned.
> >
> > I think that is going backwards though: the question to ask is why
> > should these be signed if they cannot ever be negative?
>
> Earlier I gave an example of allowing for a "not yet counted"
> sentinel value for a variable or a structure member. Another
> example may be a function that counts but also needs to signal
> an error, and as usual in any C program, the natural way to do so
> for any function whose "normal" return values are non-negative
> integers is to signal errors with a negative value.
>
> Note that a structure member or a variable may not need such a
> "not yet counted" sentinel value (e.g., it may have a separate
> "counted already" member associated with it, or the nature of the
> thing it counts does not have such "not yet counted" state), and it
> is possible for such a variable to live happily with a function that
> can signal an error.
>
> It means the variable that receives the counted result from such a
> function may be able to use only half the range of values that its type
> implies, if that helper function is the only source of information
> that is assigned to it, though.
>
> If the counter in question never needs to store such a sentinel
> value itself, then I am OK for it to be unsigned, and that is
> exactly why I said "not always a valid excuse". But if the counter
> variable or structure member has to work with functions that need to
> return sentinel values (like platform natural int that can use the
> usual "negative is an error, non-negative is a normal result"), it
> may have less chance to trigger the -Wsign-compare irritation, if
> you made it also signed.
Yup, fully agreed, and that is a good reason for such a counter to be
signed. In the case at hand though we never use such sentinel values, and
I think making that explicit by using an unsigned type is a good thing as
it tells the reader that "Yup, no sentinels involved, it's a plain counter
from 0 to $NUM_ENTRIES".
Patrick
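To illustrate the distinction discussed in this exchange, here is a minimal
sketch, assuming a made-up `struct filter_stats` and made-up helpers
`count_nonempty()` and `record_computed()` (none of these exist in Git). The
function that may fail keeps a signed return type so that a negative value can
signal an error, while the statistics counter that never holds a sentinel is
unsigned. The sentinel is checked before the result is folded into the counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical statistics block: plain counters that start at 0 and only grow. */
struct filter_stats {
	unsigned computed;
	unsigned not_computed;
};

/*
 * Hypothetical helper: counts the non-empty strings in `entries`, or
 * returns -1 on error. The negative sentinel is the reason why the
 * return type stays signed.
 */
static int count_nonempty(const char *const *entries, size_t nr)
{
	int count = 0;

	if (!entries)
		return -1;
	for (size_t i = 0; i < nr; i++)
		if (entries[i] && *entries[i])
			count++;
	return count;
}

/* Check the sentinel first, then fold the result into the unsigned counter. */
static int record_computed(struct filter_stats *stats,
			   const char *const *entries, size_t nr)
{
	int ret;

	assert(stats);

	ret = count_nonempty(entries, nr);
	if (ret < 0)
		return -1;
	stats->computed += (unsigned) ret;
	return 0;
}
```

The only place where signed and unsigned values meet is the explicit cast
after the error check, so no comparison mixes the two.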
* Re: [PATCH v2 03/10] commit-graph: fix type for some write options
2025-08-06 15:40 ` Junio C Hamano
@ 2025-08-07 7:07 ` Patrick Steinhardt
0 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-07 7:07 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Oswald Buddenhagen, git, Taylor Blau, Derrick Stolee
On Wed, Aug 06, 2025 at 08:40:32AM -0700, Junio C Hamano wrote:
> Oswald Buddenhagen <oswald.buddenhagen@gmx.de> writes:
>
> > On Wed, Aug 06, 2025 at 02:00:08PM +0200, Patrick Steinhardt wrote:
> >>+ OPT_UNSIGNED(0, "max-commits", &write_opts.max_commits,
> >>
> >>+ size_t max_commits;
> >>
> > dunno, this really seems to be crying for OPT_SIZE_T being split off.
>
> Or just use "unsigned int".
We don't need `OPT_SIZE_T` because `OPT_UNSIGNED()` knows to handle
unsigned integers of arbitrary widths. It does a `sizeof()` of the value
and passes that as precision to the parsing code.
> Really, what does NUMBER OF commits we will handle have anything to
> do with how many bytes of core we ask to grab from the system?
>
> This "we count things in size_t" is a superstition we should stop.
Will adapt to use `unsigned`.
Patrick
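The idea of deriving the accepted range from `sizeof()` of the destination can
be sketched as follows. This is not Git's parse-options code: the helper
`parse_unsigned_sized()` and the macro `PARSE_UNSIGNED()` are hypothetical
names made up for this illustration, and only decimal input is handled.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>

/*
 * Hypothetical helper: parse a decimal unsigned integer and store it into
 * a destination that is `precision` bytes wide. Returns 0 on success and
 * -1 on malformed or out-of-range input.
 */
static int parse_unsigned_sized(const char *arg, void *dest, size_t precision)
{
	uintmax_t max, value;
	char *end;

	assert(arg && dest);

	/* Largest value that fits into `precision` bytes. */
	if (precision >= sizeof(uintmax_t))
		max = UINTMAX_MAX;
	else
		max = ((uintmax_t) 1 << (8 * precision)) - 1;

	/* strtoumax() would accept signs and whitespace; require a digit. */
	if (*arg < '0' || *arg > '9')
		return -1;

	errno = 0;
	value = strtoumax(arg, &end, 10);
	if (errno || *end || value > max)
		return -1;

	switch (precision) {
	case 1: *(uint8_t *) dest = (uint8_t) value; break;
	case 2: *(uint16_t *) dest = (uint16_t) value; break;
	case 4: *(uint32_t *) dest = (uint32_t) value; break;
	case 8: *(uint64_t *) dest = (uint64_t) value; break;
	default: return -1; /* unsupported width */
	}
	return 0;
}

/* The width is taken from the variable itself, so callers never state it. */
#define PARSE_UNSIGNED(arg, var) \
	parse_unsigned_sized((arg), &(var), sizeof(var))
```

With such a scheme the same entry point serves an `unsigned`, a `uint8_t` or a
`size_t` field, and the range check follows the field's width automatically.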
* [PATCH v3 00/10] commit-graph: remove reliance on global state
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
` (10 preceding siblings ...)
2025-08-06 12:00 ` [PATCH v2 00/10] " Patrick Steinhardt
@ 2025-08-07 8:04 ` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 01/10] trace2: introduce function to trace unsigned integers Patrick Steinhardt
` (9 more replies)
2025-08-15 5:49 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Patrick Steinhardt
12 siblings, 10 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-07 8:04 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
Hi,
this patch series is another step on our long road towards not having
global state. In addition to that, as commit-graphs are part of the
object database layer, this is also another step towards pluggable
object databases.
Changes in v2:
- Use `unsigned` instead of `size_t` to count number of Bloom filters.
- Use `uint32_t` instead of `size_t` for number of commit graphs,
as this type is also used to iterate through this count already.
- Refactor `parse_commit_graph()` to take a repository instead of both
repo settings and a hash algo.
- Link to v1: https://lore.kernel.org/r/20250804-b4-pks-commit-graph-wo-the-repository-v1-0-850d626eb2e8@pks.im
Changes in v3:
- Use `unsigned` for commit-graph options instead of `size_t`.
- Link to v2: https://lore.kernel.org/r/20250806-b4-pks-commit-graph-wo-the-repository-v2-0-911bae638e61@pks.im
Thanks!
Patrick
---
Patrick Steinhardt (10):
trace2: introduce function to trace unsigned integers
commit-graph: stop using signed integers to count Bloom filters
commit-graph: fix type for some write options
commit-graph: fix sign comparison warnings
commit-graph: stop using `the_hash_algo` via macros
commit-graph: store the hash algorithm instead of its length
commit-graph: refactor `parse_commit_graph()` to take a repository
commit-graph: stop using `the_hash_algo`
commit-graph: stop using `the_repository`
commit-graph: stop passing in redundant repository
builtin/commit-graph.c | 13 +-
builtin/commit.c | 2 +-
builtin/merge.c | 2 +-
commit-graph.c | 371 +++++++++++++++++++++----------------------
commit-graph.h | 25 ++-
oss-fuzz/fuzz-commit-graph.c | 6 +-
t/helper/test-read-graph.c | 2 +-
trace2.c | 14 ++
trace2.h | 9 ++
9 files changed, 227 insertions(+), 217 deletions(-)
Range-diff versus v2:
1: 16f0fd6fb4 = 1: a652405a05 trace2: introduce function to trace unsigned integers
2: 53f12d827b = 2: 16a02e5dc0 commit-graph: stop using signed integers to count Bloom filters
3: f8d920e132 ! 3: 0cbd808dab commit-graph: fix type for some write options
@@ commit-graph.c: static void split_graph_merge_strategy(struct write_commit_graph
-
- int max_commits = 0;
- int size_mult = 2;
-+ size_t max_commits = 0;
-+ size_t size_mult = 2;
++ unsigned max_commits = 0;
++ unsigned size_mult = 2;
if (ctx->opts) {
max_commits = ctx->opts->max_commits;
@@ commit-graph.h: enum commit_graph_split_flags {
struct commit_graph_opts {
- int size_multiple;
- int max_commits;
-+ size_t size_multiple;
-+ size_t max_commits;
++ unsigned size_multiple;
++ unsigned max_commits;
timestamp_t expire_time;
enum commit_graph_split_flags split_flags;
int max_new_filters;
4: 12d8d8f087 = 4: 8c1e6dc24c commit-graph: fix sign comparison warnings
5: c7fc957de1 = 5: 9b0c61d221 commit-graph: stop using `the_hash_algo` via macros
6: d41c5a419a = 6: 41e2e742ee commit-graph: store the hash algorithm instead of its length
7: fec6cf25c7 = 7: 06dc3545fe commit-graph: refactor `parse_commit_graph()` to take a repository
8: 6a3ba128c2 = 8: e06517c3a2 commit-graph: stop using `the_hash_algo`
9: c2e549b474 = 9: 0f10e272bb commit-graph: stop using `the_repository`
10: 59a325475d = 10: e00dc6651c commit-graph: stop passing in redundant repository
---
base-commit: e813a0200a7121b97fec535f0d0b460b0a33356c
change-id: 20250717-b4-pks-commit-graph-wo-the-repository-1dc2cacbc8e3
* [PATCH v3 01/10] trace2: introduce function to trace unsigned integers
2025-08-07 8:04 ` [PATCH v3 00/10] commit-graph: remove reliance on global state Patrick Steinhardt
@ 2025-08-07 8:04 ` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 02/10] commit-graph: stop using signed integers to count Bloom filters Patrick Steinhardt
` (8 subsequent siblings)
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-07 8:04 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
While we have `trace2_data_intmax()`, there is no equivalent function
that takes an unsigned integer. Introduce `trace2_data_uintmax()` to
plug this gap.
This function will be used in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
trace2.c | 14 ++++++++++++++
trace2.h | 9 +++++++++
2 files changed, 23 insertions(+)
diff --git a/trace2.c b/trace2.c
index c23c0a227b..a687944f7b 100644
--- a/trace2.c
+++ b/trace2.c
@@ -948,6 +948,20 @@ void trace2_data_intmax_fl(const char *file, int line, const char *category,
strbuf_release(&buf_string);
}
+void trace2_data_uintmax_fl(const char *file, int line, const char *category,
+ const struct repository *repo, const char *key,
+ uintmax_t value)
+{
+ struct strbuf buf_string = STRBUF_INIT;
+
+ if (!trace2_enabled)
+ return;
+
+ strbuf_addf(&buf_string, "%" PRIuMAX, value);
+ trace2_data_string_fl(file, line, category, repo, key, buf_string.buf);
+ strbuf_release(&buf_string);
+}
+
void trace2_data_json_fl(const char *file, int line, const char *category,
const struct repository *repo, const char *key,
const struct json_writer *value)
diff --git a/trace2.h b/trace2.h
index e4f23784e4..115c45a1eb 100644
--- a/trace2.h
+++ b/trace2.h
@@ -463,6 +463,15 @@ void trace2_data_intmax_fl(const char *file, int line, const char *category,
trace2_data_intmax_fl(__FILE__, __LINE__, (category), (repo), (key), \
(value))
+void trace2_data_uintmax_fl(const char *file, int line, const char *category,
+ const struct repository *repo, const char *key,
+ uintmax_t value);
+
+#define trace2_data_uintmax(category, repo, key, value) \
+ trace2_data_uintmax_fl(__FILE__, __LINE__, (category), (repo), (key), \
+ (value))
+
+
void trace2_data_json_fl(const char *file, int line, const char *category,
const struct repository *repo, const char *key,
const struct json_writer *jw);
--
2.51.0.rc0.215.g125493bb4a.dirty
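As a side note on why a dedicated unsigned variant is useful: the signed and
unsigned paths differ only in the format macro, but that macro decides how the
top bit is read. The following sketch uses two hypothetical stand-ins,
`format_intmax()` and `format_uintmax()`, which are not part of trace2 and only
mirror the formatting step of the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the signed path: formats with PRIdMAX. */
static int format_intmax(char *buf, size_t len, intmax_t value)
{
	assert(buf && len);
	return snprintf(buf, len, "%" PRIdMAX, value);
}

/* Hypothetical stand-in for the unsigned path: formats with PRIuMAX. */
static int format_uintmax(char *buf, size_t len, uintmax_t value)
{
	assert(buf && len);
	return snprintf(buf, len, "%" PRIuMAX, value);
}
```

An unsigned value above `INTMAX_MAX` that is pushed through the signed path
first has to be converted to `intmax_t`, which is implementation-defined and
typically yields a negative number in the output. Passing it to the unsigned
variant avoids that conversion altogether.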
* [PATCH v3 02/10] commit-graph: stop using signed integers to count Bloom filters
2025-08-07 8:04 ` [PATCH v3 00/10] commit-graph: remove reliance on global state Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 01/10] trace2: introduce function to trace unsigned integers Patrick Steinhardt
@ 2025-08-07 8:04 ` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 03/10] commit-graph: fix type for some write options Patrick Steinhardt
` (7 subsequent siblings)
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-07 8:04 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
When writing a new commit graph we have a couple of counters that
provide statistics around what kind of Bloom filters we have or have not
written. These counters naturally count from zero and are only ever
incremented, but they use a signed integer as type regardless.
Refactor those fields to be unsigned instead. Using an unsigned type
makes it explicit to the reader that they never have to worry about
negative values and thus makes the code easier to understand.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index bd7b6f5338..3fc1273ba5 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1170,11 +1170,11 @@ struct write_commit_graph_context {
size_t total_bloom_filter_data_size;
const struct bloom_filter_settings *bloom_settings;
- int count_bloom_filter_computed;
- int count_bloom_filter_not_computed;
- int count_bloom_filter_trunc_empty;
- int count_bloom_filter_trunc_large;
- int count_bloom_filter_upgraded;
+ unsigned count_bloom_filter_computed;
+ unsigned count_bloom_filter_not_computed;
+ unsigned count_bloom_filter_trunc_empty;
+ unsigned count_bloom_filter_trunc_large;
+ unsigned count_bloom_filter_upgraded;
};
static int write_graph_chunk_fanout(struct hashfile *f,
@@ -1779,16 +1779,16 @@ void ensure_generations_valid(struct repository *r,
static void trace2_bloom_filter_write_statistics(struct write_commit_graph_context *ctx)
{
- trace2_data_intmax("commit-graph", ctx->r, "filter-computed",
- ctx->count_bloom_filter_computed);
- trace2_data_intmax("commit-graph", ctx->r, "filter-not-computed",
- ctx->count_bloom_filter_not_computed);
- trace2_data_intmax("commit-graph", ctx->r, "filter-trunc-empty",
- ctx->count_bloom_filter_trunc_empty);
- trace2_data_intmax("commit-graph", ctx->r, "filter-trunc-large",
- ctx->count_bloom_filter_trunc_large);
- trace2_data_intmax("commit-graph", ctx->r, "filter-upgraded",
- ctx->count_bloom_filter_upgraded);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-computed",
+ ctx->count_bloom_filter_computed);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-not-computed",
+ ctx->count_bloom_filter_not_computed);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-trunc-empty",
+ ctx->count_bloom_filter_trunc_empty);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-trunc-large",
+ ctx->count_bloom_filter_trunc_large);
+ trace2_data_uintmax("commit-graph", ctx->r, "filter-upgraded",
+ ctx->count_bloom_filter_upgraded);
}
static void compute_bloom_filters(struct write_commit_graph_context *ctx)
--
2.51.0.rc0.215.g125493bb4a.dirty
* [PATCH v3 03/10] commit-graph: fix type for some write options
2025-08-07 8:04 ` [PATCH v3 00/10] commit-graph: remove reliance on global state Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 01/10] trace2: introduce function to trace unsigned integers Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 02/10] commit-graph: stop using signed integers to count Bloom filters Patrick Steinhardt
@ 2025-08-07 8:04 ` Patrick Steinhardt
2025-08-07 22:40 ` Junio C Hamano
2025-08-07 8:04 ` [PATCH v3 04/10] commit-graph: fix sign comparison warnings Patrick Steinhardt
` (6 subsequent siblings)
9 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-07 8:04 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
The options "max-commits" and "size-multiple" are both supposed to be
positive integers and are documented as such, but we use a signed
integer field to store them. This causes sign comparison warnings in
`split_graph_merge_strategy()` because we end up comparing the option
values with the observed number of commits.
Fix the issue by converting the fields to be unsigned and converting the
options to use `OPT_UNSIGNED()` accordingly. This macro was only
introduced recently, which might explain why the option values were
signed in the first place.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/commit-graph.c | 4 ++--
commit-graph.c | 5 ++---
commit-graph.h | 4 ++--
3 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 25018a0b9d..145802afb7 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -241,9 +241,9 @@ static int graph_write(int argc, const char **argv, const char *prefix,
N_("allow writing an incremental commit-graph file"),
PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
write_option_parse_split),
- OPT_INTEGER(0, "max-commits", &write_opts.max_commits,
+ OPT_UNSIGNED(0, "max-commits", &write_opts.max_commits,
N_("maximum number of commits in a non-base split commit-graph")),
- OPT_INTEGER(0, "size-multiple", &write_opts.size_multiple,
+ OPT_UNSIGNED(0, "size-multiple", &write_opts.size_multiple,
N_("maximum ratio between two levels of a split commit-graph")),
OPT_EXPIRY_DATE(0, "expire-time", &write_opts.expire_time,
N_("only expire files older than a given date-time")),
diff --git a/commit-graph.c b/commit-graph.c
index 3fc1273ba5..53bf83f7b5 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2235,9 +2235,8 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
uint32_t num_commits;
enum commit_graph_split_flags flags = COMMIT_GRAPH_SPLIT_UNSPECIFIED;
uint32_t i;
-
- int max_commits = 0;
- int size_mult = 2;
+ unsigned max_commits = 0;
+ unsigned size_mult = 2;
if (ctx->opts) {
max_commits = ctx->opts->max_commits;
diff --git a/commit-graph.h b/commit-graph.h
index 78ab7b875b..f5b032e982 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -160,8 +160,8 @@ enum commit_graph_split_flags {
};
struct commit_graph_opts {
- int size_multiple;
- int max_commits;
+ unsigned size_multiple;
+ unsigned max_commits;
timestamp_t expire_time;
enum commit_graph_split_flags split_flags;
int max_new_filters;
--
2.51.0.rc0.215.g125493bb4a.dirty
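The class of problem that the sign comparison warning points at can be shown
with a small sketch. The two option structs and the `over_limit_*()` helpers
are hypothetical and only loosely modeled on the option fields touched by this
patch. The first helper spells out, as an explicit cast, the conversion that
the compiler applies implicitly when an `int` is compared with a `uint32_t` on
platforms where `int` is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical option structs: the same limit, stored signed and unsigned. */
struct split_opts_signed {
	int max_commits;
};

struct split_opts_unsigned {
	unsigned max_commits;
};

static int over_limit_signed(const struct split_opts_signed *opts,
			     uint32_t num_commits)
{
	assert(opts);
	/*
	 * Writing `num_commits > opts->max_commits` converts the signed
	 * operand to unsigned first. The cast makes that step visible:
	 * a limit of -1 turns into UINT32_MAX.
	 */
	return num_commits > (uint32_t) opts->max_commits;
}

static int over_limit_unsigned(const struct split_opts_unsigned *opts,
			       uint32_t num_commits)
{
	assert(opts);
	/* Both operands are unsigned, so no hidden conversion takes place. */
	return num_commits > opts->max_commits;
}
```

With a signed limit of -1, no commit count ever exceeds the limit, which is
rarely what the author of such a comparison had in mind. Storing the option as
an unsigned value removes the mixed comparison and with it the warning.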
* [PATCH v3 04/10] commit-graph: fix sign comparison warnings
2025-08-07 8:04 ` [PATCH v3 00/10] commit-graph: remove reliance on global state Patrick Steinhardt
` (2 preceding siblings ...)
2025-08-07 8:04 ` [PATCH v3 03/10] commit-graph: fix type for some write options Patrick Steinhardt
@ 2025-08-07 8:04 ` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 05/10] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
` (5 subsequent siblings)
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-07 8:04 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
The "commit-graph.c" file has a bunch of sign comparison warnings:
- There are a bunch of variables that are declared as signed integers
even though they are used to count entities, like for example
`num_commit_graphs_before` and `num_commit_graphs_after`.
- There are several cases where we use signed loop variables to
iterate through an unsigned entity count.
- In `write_graph_chunk_base_1()` we count how many chunks we have
written in total. But while the value represents a positive
quantity, we still return a signed integer that we then later
compare with unsigned values.
- The Bloom settings hash version is being assigned `-1` even though
it's an unsigned value. This is used to indicate an unspecified
value and relies on `-1` converting to the maximum unsigned value.
Fix all of these cases by either using the proper variable type or by
adding casts as required.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 54 +++++++++++++++++++++++++++---------------------------
1 file changed, 27 insertions(+), 27 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 53bf83f7b5..7c0ded4532 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1,5 +1,4 @@
#define USE_THE_REPOSITORY_VARIABLE
-#define DISABLE_SIGN_COMPARE_WARNINGS
#include "git-compat-util.h"
#include "config.h"
@@ -569,7 +568,7 @@ static void validate_mixed_bloom_settings(struct commit_graph *g)
static int add_graph_to_chain(struct commit_graph *g,
struct commit_graph *chain,
struct object_id *oids,
- int n)
+ size_t n)
{
struct commit_graph *cur_g = chain;
@@ -622,7 +621,7 @@ int open_commit_graph_chain(const char *chain_file,
close(*fd);
return 0;
}
- if (st->st_size < the_hash_algo->hexsz) {
+ if (st->st_size < (ssize_t) the_hash_algo->hexsz) {
close(*fd);
if (!st->st_size) {
/* treat empty files the same as missing */
@@ -643,15 +642,16 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
struct commit_graph *graph_chain = NULL;
struct strbuf line = STRBUF_INIT;
struct object_id *oids;
- int i = 0, valid = 1, count;
+ int valid = 1;
FILE *fp = xfdopen(fd, "r");
+ size_t count;
count = st->st_size / (the_hash_algo->hexsz + 1);
CALLOC_ARRAY(oids, count);
odb_prepare_alternates(r->objects);
- for (i = 0; i < count; i++) {
+ for (size_t i = 0; i < count; i++) {
struct odb_source *source;
if (strbuf_getline_lf(&line, fp) == EOF)
@@ -1145,12 +1145,12 @@ struct write_commit_graph_context {
int num_generation_data_overflows;
unsigned long approx_nr_objects;
struct progress *progress;
- int progress_done;
+ uint64_t progress_done;
uint64_t progress_cnt;
char *base_graph_name;
- int num_commit_graphs_before;
- int num_commit_graphs_after;
+ uint32_t num_commit_graphs_before;
+ uint32_t num_commit_graphs_after;
char **commit_graph_filenames_before;
char **commit_graph_filenames_after;
char **commit_graph_hash_after;
@@ -1181,7 +1181,7 @@ static int write_graph_chunk_fanout(struct hashfile *f,
void *data)
{
struct write_commit_graph_context *ctx = data;
- int i, count = 0;
+ size_t i, count = 0;
struct commit **list = ctx->commits.list;
/*
@@ -1209,7 +1209,8 @@ static int write_graph_chunk_oids(struct hashfile *f,
{
struct write_commit_graph_context *ctx = data;
struct commit **list = ctx->commits.list;
- int count;
+ size_t count;
+
for (count = 0; count < ctx->commits.nr; count++, list++) {
display_progress(ctx->progress, ++ctx->progress_cnt);
hashwrite(f, (*list)->object.oid.hash, the_hash_algo->rawsz);
@@ -1331,9 +1332,9 @@ static int write_graph_chunk_generation_data(struct hashfile *f,
void *data)
{
struct write_commit_graph_context *ctx = data;
- int i, num_generation_data_overflows = 0;
+ int num_generation_data_overflows = 0;
- for (i = 0; i < ctx->commits.nr; i++) {
+ for (size_t i = 0; i < ctx->commits.nr; i++) {
struct commit *c = ctx->commits.list[i];
timestamp_t offset;
repo_parse_commit(ctx->r, c);
@@ -1355,8 +1356,8 @@ static int write_graph_chunk_generation_data_overflow(struct hashfile *f,
void *data)
{
struct write_commit_graph_context *ctx = data;
- int i;
- for (i = 0; i < ctx->commits.nr; i++) {
+
+ for (size_t i = 0; i < ctx->commits.nr; i++) {
struct commit *c = ctx->commits.list[i];
timestamp_t offset = commit_graph_data_at(c)->generation - c->date;
display_progress(ctx->progress, ++ctx->progress_cnt);
@@ -1526,7 +1527,7 @@ static void add_missing_parents(struct write_commit_graph_context *ctx, struct c
static void close_reachable(struct write_commit_graph_context *ctx)
{
- int i;
+ size_t i;
struct commit *commit;
enum commit_graph_split_flags flags = ctx->opts ?
ctx->opts->split_flags : COMMIT_GRAPH_SPLIT_UNSPECIFIED;
@@ -1620,10 +1621,9 @@ static void compute_reachable_generation_numbers(
struct compute_generation_info *info,
int generation_version)
{
- int i;
struct commit_list *list = NULL;
- for (i = 0; i < info->commits->nr; i++) {
+ for (size_t i = 0; i < info->commits->nr; i++) {
struct commit *c = info->commits->list[i];
timestamp_t gen;
repo_parse_commit(info->r, c);
@@ -1714,7 +1714,7 @@ static void set_generation_v2(struct commit *c, timestamp_t t,
static void compute_generation_numbers(struct write_commit_graph_context *ctx)
{
- int i;
+ size_t i;
struct compute_generation_info info = {
.r = ctx->r,
.commits = &ctx->commits,
@@ -1793,10 +1793,10 @@ static void trace2_bloom_filter_write_statistics(struct write_commit_graph_conte
static void compute_bloom_filters(struct write_commit_graph_context *ctx)
{
- int i;
+ size_t i;
struct progress *progress = NULL;
struct commit **sorted_commits;
- int max_new_filters;
+ size_t max_new_filters;
init_bloom_filters();
@@ -1814,7 +1814,7 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
QSORT(sorted_commits, ctx->commits.nr, commit_gen_cmp);
max_new_filters = ctx->opts && ctx->opts->max_new_filters >= 0 ?
- ctx->opts->max_new_filters : ctx->commits.nr;
+ (size_t) ctx->opts->max_new_filters : ctx->commits.nr;
for (i = 0; i < ctx->commits.nr; i++) {
enum bloom_filter_computed computed = 0;
@@ -2017,10 +2017,10 @@ static void copy_oids_to_commits(struct write_commit_graph_context *ctx)
stop_progress(&ctx->progress);
}
-static int write_graph_chunk_base_1(struct hashfile *f,
- struct commit_graph *g)
+static size_t write_graph_chunk_base_1(struct hashfile *f,
+ struct commit_graph *g)
{
- int num = 0;
+ size_t num = 0;
if (!g)
return 0;
@@ -2034,7 +2034,7 @@ static int write_graph_chunk_base(struct hashfile *f,
void *data)
{
struct write_commit_graph_context *ctx = data;
- int num = write_graph_chunk_base_1(f, ctx->new_base_graph);
+ size_t num = write_graph_chunk_base_1(f, ctx->new_base_graph);
if (num != ctx->num_commit_graphs_after - 1) {
error(_("failed to write correct number of base graph ids"));
@@ -2480,7 +2480,7 @@ static void expire_commit_graphs(struct write_commit_graph_context *ctx)
if (stat(path.buf, &st) < 0)
continue;
- if (st.st_mtime > expire_time)
+ if ((unsigned) st.st_mtime > expire_time)
continue;
if (path.len < 6 || strcmp(path.buf + path.len - 6, ".graph"))
continue;
@@ -2576,7 +2576,7 @@ int write_commit_graph(struct odb_source *source,
ctx.changed_paths = 1;
/* don't propagate the hash_version unless unspecified */
- if (bloom_settings.hash_version == -1)
+ if (bloom_settings.hash_version == (unsigned) -1)
bloom_settings.hash_version = g->bloom_filter_settings->hash_version;
bloom_settings.bits_per_entry = g->bloom_filter_settings->bits_per_entry;
bloom_settings.num_hashes = g->bloom_filter_settings->num_hashes;
--
2.51.0.rc0.215.g125493bb4a.dirty
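The `(unsigned) -1` comparison in the last hunk deserves a short illustration.
Converting -1 to an unsigned type is well defined in C and yields the maximum
value of that type, so the cast keeps the "unspecified" marker working while
making the comparison unsigned on both sides. The sketch below uses a
hypothetical `struct filter_settings` and a hypothetical
`HASH_VERSION_UNSPECIFIED` macro; Git itself does not define these names.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical marker for "no version requested"; equals UINT_MAX. */
#define HASH_VERSION_UNSPECIFIED ((unsigned) -1)

/* Hypothetical settings struct with an unsigned version field. */
struct filter_settings {
	unsigned hash_version;
};

static int hash_version_is_unspecified(const struct filter_settings *s)
{
	assert(s);
	return s->hash_version == HASH_VERSION_UNSPECIFIED;
}

/* Only inherit the version from `src` when `dst` did not specify one. */
static void inherit_hash_version(struct filter_settings *dst,
				 const struct filter_settings *src)
{
	assert(dst && src);
	if (hash_version_is_unspecified(dst))
		dst->hash_version = src->hash_version;
}
```

Giving the marker a name is a design choice of this sketch rather than
something the patch does, but it documents at the point of use that the
all-ones value is reserved.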
* [PATCH v3 05/10] commit-graph: stop using `the_hash_algo` via macros
2025-08-07 8:04 ` [PATCH v3 00/10] commit-graph: remove reliance on global state Patrick Steinhardt
` (3 preceding siblings ...)
2025-08-07 8:04 ` [PATCH v3 04/10] commit-graph: fix sign comparison warnings Patrick Steinhardt
@ 2025-08-07 8:04 ` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 06/10] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
` (4 subsequent siblings)
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-07 8:04 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
We have two macros `GRAPH_DATA_WIDTH` and `GRAPH_MIN_SIZE` that compute
hash-dependent sizes. They do so by using the global `the_hash_algo`
variable though, which we want to get rid of over time.
Convert these macros into functions that accept the hash algorithm as
input parameter. Adapt callers accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 7c0ded4532..e75fb8e6ea 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -52,8 +52,6 @@ void git_test_write_commit_graph_or_die(void)
#define GRAPH_CHUNKID_BLOOMDATA 0x42444154 /* "BDAT" */
#define GRAPH_CHUNKID_BASE 0x42415345 /* "BASE" */
-#define GRAPH_DATA_WIDTH (the_hash_algo->rawsz + 16)
-
#define GRAPH_VERSION_1 0x1
#define GRAPH_VERSION GRAPH_VERSION_1
@@ -65,8 +63,6 @@ void git_test_write_commit_graph_or_die(void)
#define GRAPH_HEADER_SIZE 8
#define GRAPH_FANOUT_SIZE (4 * 256)
-#define GRAPH_MIN_SIZE (GRAPH_HEADER_SIZE + 4 * CHUNK_TOC_ENTRY_SIZE \
- + GRAPH_FANOUT_SIZE + the_hash_algo->rawsz)
#define CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW (1ULL << 31)
@@ -79,6 +75,16 @@ define_commit_slab(topo_level_slab, uint32_t);
define_commit_slab(commit_pos, int);
static struct commit_pos commit_pos = COMMIT_SLAB_INIT(1, commit_pos);
+static size_t graph_data_width(const struct git_hash_algo *algop)
+{
+ return algop->rawsz + 16;
+}
+
+static size_t graph_min_size(const struct git_hash_algo *algop)
+{
+ return GRAPH_HEADER_SIZE + 4 * CHUNK_TOC_ENTRY_SIZE + GRAPH_FANOUT_SIZE + algop->rawsz;
+}
+
static void set_commit_pos(struct repository *r, const struct object_id *oid)
{
static int32_t max_pos;
@@ -257,7 +263,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_size = xsize_t(st->st_size);
- if (graph_size < GRAPH_MIN_SIZE) {
+ if (graph_size < graph_min_size(the_hash_algo)) {
close(fd);
error(_("commit-graph file is too small"));
return NULL;
@@ -313,7 +319,7 @@ static int graph_read_commit_data(const unsigned char *chunk_start,
size_t chunk_size, void *data)
{
struct commit_graph *g = data;
- if (chunk_size / GRAPH_DATA_WIDTH != g->num_commits)
+ if (chunk_size / graph_data_width(the_hash_algo) != g->num_commits)
return error(_("commit-graph commit data chunk is wrong size"));
g->chunk_commit_data = chunk_start;
return 0;
@@ -378,7 +384,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
if (!graph_map)
return NULL;
- if (graph_size < GRAPH_MIN_SIZE)
+ if (graph_size < graph_min_size(the_hash_algo))
return NULL;
data = (const unsigned char *)graph_map;
@@ -900,7 +906,7 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
die(_("invalid commit position. commit-graph is likely corrupt"));
lex_index = pos - g->num_commits_in_base;
- commit_data = g->chunk_commit_data + st_mult(GRAPH_DATA_WIDTH, lex_index);
+ commit_data = g->chunk_commit_data + st_mult(graph_data_width(the_hash_algo), lex_index);
graph_data = commit_graph_data_at(item);
graph_data->graph_pos = pos;
@@ -1104,7 +1110,8 @@ static struct tree *load_tree_for_commit(struct repository *r,
g = g->base_graph;
commit_data = g->chunk_commit_data +
- st_mult(GRAPH_DATA_WIDTH, graph_pos - g->num_commits_in_base);
+ st_mult(graph_data_width(the_hash_algo),
+ graph_pos - g->num_commits_in_base);
oidread(&oid, commit_data, the_repository->hash_algo);
set_commit_tree(c, lookup_tree(r, &oid));
--
2.51.0.rc0.215.g125493bb4a.dirty
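The effect of turning a global-dependent macro into a function with an
explicit parameter can be sketched in isolation. The `struct hash_algo` below
is a hypothetical, trimmed-down stand-in for Git's hash algorithm descriptor,
and `record_width()` and `record_at()` are made-up names. Only the
`rawsz + 16` computation is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for a hash algorithm descriptor. */
struct hash_algo {
	const char *name;
	size_t rawsz; /* size of a raw object ID in bytes */
};

static const struct hash_algo algo_sha1 = { "sha1", 20 };
static const struct hash_algo algo_sha256 = { "sha256", 32 };

/* Width of one record: an object ID followed by 16 bytes of fixed data. */
static size_t record_width(const struct hash_algo *algop)
{
	assert(algop);
	return algop->rawsz + 16;
}

/* Locate record `index` in a chunk; the caller states which algorithm applies. */
static const unsigned char *record_at(const unsigned char *chunk,
				      const struct hash_algo *algop,
				      size_t index)
{
	assert(chunk);
	return chunk + record_width(algop) * index;
}
```

Because the algorithm is an argument, the same process can compute widths for
a SHA-1 and a SHA-256 repository side by side, which a macro that reads a
global variable cannot do.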
* [PATCH v3 06/10] commit-graph: store the hash algorithm instead of its length
2025-08-07 8:04 ` [PATCH v3 00/10] commit-graph: remove reliance on global state Patrick Steinhardt
` (4 preceding siblings ...)
2025-08-07 8:04 ` [PATCH v3 05/10] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
@ 2025-08-07 8:04 ` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 07/10] commit-graph: refactor `parse_commit_graph()` to take a repository Patrick Steinhardt
` (3 subsequent siblings)
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-07 8:04 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
The commit-graph stores the length of the hash algorithm it uses. In
subsequent commits we'll need to pass the whole hash algorithm around
though, which we currently don't have access to.
Refactor the code so that we store the hash algorithm instead of only
its size.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 36 ++++++++++++++++++------------------
commit-graph.h | 2 +-
2 files changed, 19 insertions(+), 19 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index e75fb8e6ea..430eab0b86 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -310,7 +310,7 @@ static int graph_read_oid_lookup(const unsigned char *chunk_start,
{
struct commit_graph *g = data;
g->chunk_oid_lookup = chunk_start;
- if (chunk_size / g->hash_len != g->num_commits)
+ if (chunk_size / g->hash_algo->rawsz != g->num_commits)
return error(_("commit-graph OID lookup chunk is the wrong size"));
return 0;
}
@@ -412,7 +412,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
graph = alloc_commit_graph();
- graph->hash_len = the_hash_algo->rawsz;
+ graph->hash_algo = the_hash_algo;
graph->num_chunks = *(unsigned char*)(data + 6);
graph->data = graph_map;
graph->data_len = graph_size;
@@ -477,7 +477,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
FREE_AND_NULL(graph->bloom_filter_settings);
}
- oidread(&graph->oid, graph->data + graph->data_len - graph->hash_len,
+ oidread(&graph->oid, graph->data + graph->data_len - graph->hash_algo->rawsz,
the_repository->hash_algo);
free_chunkfile(cf);
@@ -583,7 +583,7 @@ static int add_graph_to_chain(struct commit_graph *g,
return 0;
}
- if (g->chunk_base_graphs_size / g->hash_len < n) {
+ if (g->chunk_base_graphs_size / g->hash_algo->rawsz < n) {
warning(_("commit-graph base graphs chunk is too small"));
return 0;
}
@@ -593,7 +593,7 @@ static int add_graph_to_chain(struct commit_graph *g,
if (!cur_g ||
!oideq(&oids[n], &cur_g->oid) ||
- !hasheq(oids[n].hash, g->chunk_base_graphs + st_mult(g->hash_len, n),
+ !hasheq(oids[n].hash, g->chunk_base_graphs + st_mult(g->hash_algo->rawsz, n),
the_repository->hash_algo)) {
warning(_("commit-graph chain does not match"));
return 0;
@@ -805,7 +805,7 @@ int generation_numbers_enabled(struct repository *r)
return 0;
first_generation = get_be32(g->chunk_commit_data +
- g->hash_len + 8) >> 2;
+ g->hash_algo->rawsz + 8) >> 2;
return !!first_generation;
}
@@ -849,7 +849,7 @@ void close_commit_graph(struct object_database *o)
static int bsearch_graph(struct commit_graph *g, const struct object_id *oid, uint32_t *pos)
{
return bsearch_hash(oid->hash, g->chunk_oid_fanout,
- g->chunk_oid_lookup, g->hash_len, pos);
+ g->chunk_oid_lookup, g->hash_algo->rawsz, pos);
}
static void load_oid_from_graph(struct commit_graph *g,
@@ -869,7 +869,7 @@ static void load_oid_from_graph(struct commit_graph *g,
lex_index = pos - g->num_commits_in_base;
- oidread(oid, g->chunk_oid_lookup + st_mult(g->hash_len, lex_index),
+ oidread(oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, lex_index),
the_repository->hash_algo);
}
@@ -911,8 +911,8 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
graph_data = commit_graph_data_at(item);
graph_data->graph_pos = pos;
- date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
- date_low = get_be32(commit_data + g->hash_len + 12);
+ date_high = get_be32(commit_data + g->hash_algo->rawsz + 8) & 0x3;
+ date_low = get_be32(commit_data + g->hash_algo->rawsz + 12);
item->date = (timestamp_t)((date_high << 32) | date_low);
if (g->read_generation_data) {
@@ -930,10 +930,10 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
} else
graph_data->generation = item->date + offset;
} else
- graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
+ graph_data->generation = get_be32(commit_data + g->hash_algo->rawsz + 8) >> 2;
if (g->topo_levels)
- *topo_level_slab_at(g->topo_levels, item) = get_be32(commit_data + g->hash_len + 8) >> 2;
+ *topo_level_slab_at(g->topo_levels, item) = get_be32(commit_data + g->hash_algo->rawsz + 8) >> 2;
}
static inline void set_commit_tree(struct commit *c, struct tree *t)
@@ -957,7 +957,7 @@ static int fill_commit_in_graph(struct repository *r,
fill_commit_graph_info(item, g, pos);
lex_index = pos - g->num_commits_in_base;
- commit_data = g->chunk_commit_data + st_mult(g->hash_len + 16, lex_index);
+ commit_data = g->chunk_commit_data + st_mult(g->hash_algo->rawsz + 16, lex_index);
item->object.parsed = 1;
@@ -965,12 +965,12 @@ static int fill_commit_in_graph(struct repository *r,
pptr = &item->parents;
- edge_value = get_be32(commit_data + g->hash_len);
+ edge_value = get_be32(commit_data + g->hash_algo->rawsz);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
pptr = insert_parent_or_die(r, g, edge_value, pptr);
- edge_value = get_be32(commit_data + g->hash_len + 4);
+ edge_value = get_be32(commit_data + g->hash_algo->rawsz + 4);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
if (!(edge_value & GRAPH_EXTRA_EDGES_NEEDED)) {
@@ -2622,7 +2622,7 @@ int write_commit_graph(struct odb_source *source,
struct commit_graph *g = ctx.r->objects->commit_graph;
for (i = 0; i < g->num_commits; i++) {
struct object_id oid;
- oidread(&oid, g->chunk_oid_lookup + st_mult(g->hash_len, i),
+ oidread(&oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
the_repository->hash_algo);
oid_array_append(&ctx.oids, &oid);
}
@@ -2753,7 +2753,7 @@ static int verify_one_commit_graph(struct repository *r,
for (i = 0; i < g->num_commits; i++) {
struct commit *graph_commit;
- oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_len, i),
+ oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
the_repository->hash_algo);
if (i && oidcmp(&prev_oid, &cur_oid) >= 0)
@@ -2798,7 +2798,7 @@ static int verify_one_commit_graph(struct repository *r,
timestamp_t generation;
display_progress(progress, ++(*seen));
- oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_len, i),
+ oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
the_repository->hash_algo);
graph_commit = lookup_commit(r, &cur_oid);
diff --git a/commit-graph.h b/commit-graph.h
index f5b032e982..2228c714cb 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -84,7 +84,7 @@ struct commit_graph {
const unsigned char *data;
size_t data_len;
- unsigned char hash_len;
+ const struct git_hash_algo *hash_algo;
unsigned char num_chunks;
uint32_t num_commits;
struct object_id oid;
--
2.51.0.rc0.215.g125493bb4a.dirty
* [PATCH v3 07/10] commit-graph: refactor `parse_commit_graph()` to take a repository
2025-08-07 8:04 ` [PATCH v3 00/10] commit-graph: remove reliance on global state Patrick Steinhardt
` (5 preceding siblings ...)
2025-08-07 8:04 ` [PATCH v3 06/10] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
@ 2025-08-07 8:04 ` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 08/10] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
` (2 subsequent siblings)
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-07 8:04 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
Refactor `parse_commit_graph()` so that it takes a repository instead of
taking repository settings. On the one hand this allows us to get rid of
instances where we access `the_hash_algo` by using the repository's hash
algorithm instead. On the other hand it also allows us to move the call
of `prepare_repo_settings()` into the function itself.
Note that there's one small catch, as the commit-graph fuzzer calls this
function directly without having a fully functional repository at hand.
And while the fuzzer already initializes `the_repository` with relevant
info, the call to `prepare_repo_settings()` would fail because we don't
have a fully-initialized repository.
Work around the issue by also setting `settings.initialized` to pretend
that we've already read the settings.
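To illustrate why pre-setting the flag is enough, here is a minimal sketch
of a lazily prepared settings structure. All names (`demo_repo`,
`demo_settings`, `demo_prepare_settings()`) are hypothetical stand-ins and
not Git's real code; the sketch only assumes that the prepare function
returns early once the `initialized` flag is set.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; not Git's real structures. */
struct demo_settings {
	int initialized;
	int generation_version;
};

struct demo_repo {
	int has_gitdir; /* 0 mimics a repository that is not fully set up */
	struct demo_settings settings;
};

/*
 * Lazily fill in the settings. Returns 0 on success and -1 if the
 * repository is not set up well enough to read its configuration.
 */
int demo_prepare_settings(struct demo_repo *r)
{
	assert(r);
	/* Already prepared (or pretending to be): keep the values as-is. */
	if (r->settings.initialized)
		return 0;
	if (!r->has_gitdir)
		return -1;
	r->settings.generation_version = 1; /* pretend this came from config */
	r->settings.initialized = 1;
	return 0;
}
```

With the flag set by hand, values that a caller such as a fuzzer filled in
beforehand survive the call, and the check that would otherwise fail is
never reached.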
While at it, remove the redundant `parse_commit_graph()` declaration in
the fuzzer. It was added together with aa658574bf (commit-graph, fuzz:
add fuzzer for commit-graph, 2019-01-15), but as we also declared the
same function in "commit-graph.h" it wasn't ever needed.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 23 ++++++++++++-----------
commit-graph.h | 2 +-
oss-fuzz/fuzz-commit-graph.c | 6 ++----
3 files changed, 15 insertions(+), 16 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 430eab0b869..77b785a5e05 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -270,9 +270,8 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
}
graph_map = xmmap(NULL, graph_size, PROT_READ, MAP_PRIVATE, fd, 0);
close(fd);
- prepare_repo_settings(r);
- ret = parse_commit_graph(&r->settings, graph_map, graph_size);
+ ret = parse_commit_graph(r, graph_map, graph_size);
if (ret)
ret->odb_source = source;
else
@@ -372,7 +371,7 @@ static int graph_read_bloom_data(const unsigned char *chunk_start,
return 0;
}
-struct commit_graph *parse_commit_graph(struct repo_settings *s,
+struct commit_graph *parse_commit_graph(struct repository *r,
void *graph_map, size_t graph_size)
{
const unsigned char *data;
@@ -384,7 +383,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
if (!graph_map)
return NULL;
- if (graph_size < graph_min_size(the_hash_algo))
+ if (graph_size < graph_min_size(r->hash_algo))
return NULL;
data = (const unsigned char *)graph_map;
@@ -404,22 +403,22 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
}
hash_version = *(unsigned char*)(data + 5);
- if (hash_version != oid_version(the_hash_algo)) {
+ if (hash_version != oid_version(r->hash_algo)) {
error(_("commit-graph hash version %X does not match version %X"),
- hash_version, oid_version(the_hash_algo));
+ hash_version, oid_version(r->hash_algo));
return NULL;
}
graph = alloc_commit_graph();
- graph->hash_algo = the_hash_algo;
+ graph->hash_algo = r->hash_algo;
graph->num_chunks = *(unsigned char*)(data + 6);
graph->data = graph_map;
graph->data_len = graph_size;
if (graph_size < GRAPH_HEADER_SIZE +
(graph->num_chunks + 1) * CHUNK_TOC_ENTRY_SIZE +
- GRAPH_FANOUT_SIZE + the_hash_algo->rawsz) {
+ GRAPH_FANOUT_SIZE + r->hash_algo->rawsz) {
error(_("commit-graph file is too small to hold %u chunks"),
graph->num_chunks);
free(graph);
@@ -450,7 +449,9 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
pair_chunk(cf, GRAPH_CHUNKID_BASE, &graph->chunk_base_graphs,
&graph->chunk_base_graphs_size);
- if (s->commit_graph_generation_version >= 2) {
+ prepare_repo_settings(r);
+
+ if (r->settings.commit_graph_generation_version >= 2) {
read_chunk(cf, GRAPH_CHUNKID_GENERATION_DATA,
graph_read_generation_data, graph);
pair_chunk(cf, GRAPH_CHUNKID_GENERATION_DATA_OVERFLOW,
@@ -461,7 +462,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
graph->read_generation_data = 1;
}
- if (s->commit_graph_changed_paths_version) {
+ if (r->settings.commit_graph_changed_paths_version) {
read_chunk(cf, GRAPH_CHUNKID_BLOOMINDEXES,
graph_read_bloom_index, graph);
read_chunk(cf, GRAPH_CHUNKID_BLOOMDATA,
@@ -478,7 +479,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
}
oidread(&graph->oid, graph->data + graph->data_len - graph->hash_algo->rawsz,
- the_repository->hash_algo);
+ r->hash_algo);
free_chunkfile(cf);
return graph;
diff --git a/commit-graph.h b/commit-graph.h
index 2228c714cb1..4879643db0f 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -128,7 +128,7 @@ struct repo_settings;
* Callers should initialize the repo_settings with prepare_repo_settings()
* prior to calling parse_commit_graph().
*/
-struct commit_graph *parse_commit_graph(struct repo_settings *s,
+struct commit_graph *parse_commit_graph(struct repository *r,
void *graph_map, size_t graph_size);
/*
diff --git a/oss-fuzz/fuzz-commit-graph.c b/oss-fuzz/fuzz-commit-graph.c
index fbb77fec197..fb8b8787a46 100644
--- a/oss-fuzz/fuzz-commit-graph.c
+++ b/oss-fuzz/fuzz-commit-graph.c
@@ -4,9 +4,6 @@
#include "commit-graph.h"
#include "repository.h"
-struct commit_graph *parse_commit_graph(struct repo_settings *s,
- void *graph_map, size_t graph_size);
-
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size);
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
@@ -22,9 +19,10 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
* possible.
*/
repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+ the_repository->settings.initialized = 1;
the_repository->settings.commit_graph_generation_version = 2;
the_repository->settings.commit_graph_changed_paths_version = 1;
- g = parse_commit_graph(&the_repository->settings, (void *)data, size);
+ g = parse_commit_graph(the_repository, (void *)data, size);
repo_clear(the_repository);
free_commit_graph(g);
--
2.51.0.rc0.215.g125493bb4a.dirty
* [PATCH v3 08/10] commit-graph: stop using `the_hash_algo`
2025-08-07 8:04 ` [PATCH v3 00/10] commit-graph: remove reliance on global state Patrick Steinhardt
` (6 preceding siblings ...)
2025-08-07 8:04 ` [PATCH v3 07/10] commit-graph: refactor `parse_commit_graph()` to take a repository Patrick Steinhardt
@ 2025-08-07 8:04 ` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 09/10] commit-graph: stop using `the_repository` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 10/10] commit-graph: stop passing in redundant repository Patrick Steinhardt
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-07 8:04 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
Stop using `the_hash_algo` as it implicitly relies on `the_repository`.
Instead, we either use the hash algo provided via the context or, if
there is no such hash algo, we use `the_repository` explicitly. Such
uses will be removed in subsequent commits.
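As a rough sketch of the pattern, the helpers below take the hash algorithm
as an explicit parameter instead of consulting a global. The type
`demo_hash_algo` and both functions are hypothetical and only mirror the
shape of the size checks touched by this patch; the sizes used are the
well-known ones for SHA-1 (20 raw bytes, 40 hex characters) and SHA-256
(32 raw bytes, 64 hex characters).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hash algorithm descriptor. */
struct demo_hash_algo {
	size_t rawsz; /* length of a binary object ID in bytes */
	size_t hexsz; /* length of a hex object ID in characters */
};

const struct demo_hash_algo demo_sha1 = { 20, 40 };
const struct demo_hash_algo demo_sha256 = { 32, 64 };

/* Number of "<hex-oid>\n" lines that fit into a chain file of this size. */
size_t demo_chain_entries(size_t file_size, const struct demo_hash_algo *algo)
{
	assert(algo);
	return file_size / (algo->hexsz + 1);
}

/* A chain file shorter than one hex object ID cannot hold any entry. */
int demo_chain_too_small(size_t file_size, const struct demo_hash_algo *algo)
{
	assert(algo);
	return file_size < algo->hexsz;
}
```

Because the algorithm is an argument, the same file size can be judged
differently per repository: 82 bytes hold two SHA-1 entries but only one
SHA-256 entry.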
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/commit-graph.c | 3 ++-
commit-graph.c | 27 ++++++++++++++-------------
commit-graph.h | 3 ++-
3 files changed, 18 insertions(+), 15 deletions(-)
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 145802afb7..680b03a83a 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -108,7 +108,8 @@ static int graph_verify(int argc, const char **argv, const char *prefix,
opened = OPENED_GRAPH;
else if (errno != ENOENT)
die_errno(_("Could not open commit-graph '%s'"), graph_name);
- else if (open_commit_graph_chain(chain_name, &fd, &st))
+ else if (open_commit_graph_chain(chain_name, &fd, &st,
+ the_repository->hash_algo))
opened = OPENED_CHAIN;
else if (errno != ENOENT)
die_errno(_("could not open commit-graph chain '%s'"), chain_name);
diff --git a/commit-graph.c b/commit-graph.c
index 77b785a5e0..594eca4110 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -263,7 +263,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_size = xsize_t(st->st_size);
- if (graph_size < graph_min_size(the_hash_algo)) {
+ if (graph_size < graph_min_size(r->hash_algo)) {
close(fd);
error(_("commit-graph file is too small"));
return NULL;
@@ -318,7 +318,7 @@ static int graph_read_commit_data(const unsigned char *chunk_start,
size_t chunk_size, void *data)
{
struct commit_graph *g = data;
- if (chunk_size / graph_data_width(the_hash_algo) != g->num_commits)
+ if (chunk_size / graph_data_width(g->hash_algo) != g->num_commits)
return error(_("commit-graph commit data chunk is wrong size"));
g->chunk_commit_data = chunk_start;
return 0;
@@ -619,7 +619,8 @@ static int add_graph_to_chain(struct commit_graph *g,
}
int open_commit_graph_chain(const char *chain_file,
- int *fd, struct stat *st)
+ int *fd, struct stat *st,
+ const struct git_hash_algo *hash_algo)
{
*fd = git_open(chain_file);
if (*fd < 0)
@@ -628,7 +629,7 @@ int open_commit_graph_chain(const char *chain_file,
close(*fd);
return 0;
}
- if (st->st_size < (ssize_t) the_hash_algo->hexsz) {
+ if (st->st_size < (ssize_t) hash_algo->hexsz) {
close(*fd);
if (!st->st_size) {
/* treat empty files the same as missing */
@@ -653,7 +654,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
FILE *fp = xfdopen(fd, "r");
size_t count;
- count = st->st_size / (the_hash_algo->hexsz + 1);
+ count = st->st_size / (r->hash_algo->hexsz + 1);
CALLOC_ARRAY(oids, count);
odb_prepare_alternates(r->objects);
@@ -715,7 +716,7 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r,
int fd;
struct commit_graph *g = NULL;
- if (open_commit_graph_chain(chain_file, &fd, &st)) {
+ if (open_commit_graph_chain(chain_file, &fd, &st, r->hash_algo)) {
int incomplete;
/* ownership of fd is taken over by load function */
g = load_commit_graph_chain_fd_st(r, fd, &st, &incomplete);
@@ -907,7 +908,7 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
die(_("invalid commit position. commit-graph is likely corrupt"));
lex_index = pos - g->num_commits_in_base;
- commit_data = g->chunk_commit_data + st_mult(graph_data_width(the_hash_algo), lex_index);
+ commit_data = g->chunk_commit_data + st_mult(graph_data_width(g->hash_algo), lex_index);
graph_data = commit_graph_data_at(item);
graph_data->graph_pos = pos;
@@ -1111,7 +1112,7 @@ static struct tree *load_tree_for_commit(struct repository *r,
g = g->base_graph;
commit_data = g->chunk_commit_data +
- st_mult(graph_data_width(the_hash_algo),
+ st_mult(graph_data_width(g->hash_algo),
graph_pos - g->num_commits_in_base);
oidread(&oid, commit_data, the_repository->hash_algo);
@@ -1221,7 +1222,7 @@ static int write_graph_chunk_oids(struct hashfile *f,
for (count = 0; count < ctx->commits.nr; count++, list++) {
display_progress(ctx->progress, ++ctx->progress_cnt);
- hashwrite(f, (*list)->object.oid.hash, the_hash_algo->rawsz);
+ hashwrite(f, (*list)->object.oid.hash, f->algop->rawsz);
}
return 0;
@@ -1252,7 +1253,7 @@ static int write_graph_chunk_data(struct hashfile *f,
die(_("unable to parse commit %s"),
oid_to_hex(&(*list)->object.oid));
tree = get_commit_tree_oid(*list);
- hashwrite(f, tree->hash, the_hash_algo->rawsz);
+ hashwrite(f, tree->hash, ctx->r->hash_algo->rawsz);
parent = (*list)->parents;
@@ -2034,7 +2035,7 @@ static size_t write_graph_chunk_base_1(struct hashfile *f,
return 0;
num = write_graph_chunk_base_1(f, g->base_graph);
- hashwrite(f, g->oid.hash, the_hash_algo->rawsz);
+ hashwrite(f, g->oid.hash, g->hash_algo->rawsz);
return num + 1;
}
@@ -2058,7 +2059,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
struct hashfile *f;
struct tempfile *graph_layer; /* when ctx->split is non-zero */
struct lock_file lk = LOCK_INIT;
- const unsigned hashsz = the_hash_algo->rawsz;
+ const unsigned hashsz = ctx->r->hash_algo->rawsz;
struct strbuf progress_title = STRBUF_INIT;
struct chunkfile *cf;
unsigned char file_hash[GIT_MAX_RAWSZ];
@@ -2146,7 +2147,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
hashwrite_be32(f, GRAPH_SIGNATURE);
hashwrite_u8(f, GRAPH_VERSION);
- hashwrite_u8(f, oid_version(the_hash_algo));
+ hashwrite_u8(f, oid_version(ctx->r->hash_algo));
hashwrite_u8(f, get_num_chunks(cf));
hashwrite_u8(f, ctx->num_commit_graphs_after - 1);
diff --git a/commit-graph.h b/commit-graph.h
index 4879643db0..5f417f7666 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -32,7 +32,8 @@ struct string_list;
char *get_commit_graph_filename(struct odb_source *source);
char *get_commit_graph_chain_filename(struct odb_source *source);
int open_commit_graph(const char *graph_file, int *fd, struct stat *st);
-int open_commit_graph_chain(const char *chain_file, int *fd, struct stat *st);
+int open_commit_graph_chain(const char *chain_file, int *fd, struct stat *st,
+ const struct git_hash_algo *hash_algo);
/*
* Given a commit struct, try to fill the commit struct info, including:
--
2.51.0.rc0.215.g125493bb4a.dirty
* [PATCH v3 09/10] commit-graph: stop using `the_repository`
2025-08-07 8:04 ` [PATCH v3 00/10] commit-graph: remove reliance on global state Patrick Steinhardt
` (7 preceding siblings ...)
2025-08-07 8:04 ` [PATCH v3 08/10] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
@ 2025-08-07 8:04 ` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 10/10] commit-graph: stop passing in redundant repository Patrick Steinhardt
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-07 8:04 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
There are still a bunch of uses of `the_repository` in "commit-graph.c",
which we want to stop using due to it being a global variable. Refactor
the code to stop using `the_repository` in favor of the repository
provided via the calling context.
This allows us to drop the `USE_THE_REPOSITORY_VARIABLE` macro.
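For callbacks, providing the repository "via the calling context" roughly
means threading it through the callback payload. The following is a
simplified sketch of that idea; `demo_repo`, `demo_cb_data`,
`demo_count_ref()` and `demo_for_each_ref()` are hypothetical names and do
not correspond to Git's ref iteration API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a repository. */
struct demo_repo {
	const char *name;
	size_t refs_seen;
};

/* Callback payload: carries the repository instead of relying on a global. */
struct demo_cb_data {
	struct demo_repo *repo;
};

typedef int (*demo_each_ref_fn)(const char *refname, void *cb_data);

/* Callback that records each ref against the repository in the payload. */
int demo_count_ref(const char *refname, void *cb_data)
{
	struct demo_cb_data *data = cb_data;

	(void)refname;
	assert(data && data->repo);
	data->repo->refs_seen++;
	return 0;
}

/* Invoke fn for every ref name; stop early on a non-zero return value. */
int demo_for_each_ref(const char *const *refs, size_t nr,
		      demo_each_ref_fn fn, void *cb_data)
{
	for (size_t i = 0; i < nr; i++) {
		int ret = fn(refs[i], cb_data);
		if (ret)
			return ret;
	}
	return 0;
}
```

Only the repository referenced by the payload is touched, so two
repositories can be processed in the same process without interfering with
each other.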
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/commit.c | 2 +-
builtin/merge.c | 2 +-
commit-graph.c | 77 ++++++++++++++++++++++++++++----------------------------
commit-graph.h | 2 +-
4 files changed, 42 insertions(+), 41 deletions(-)
diff --git a/builtin/commit.c b/builtin/commit.c
index 63e7158e98..8ca0aede48 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1933,7 +1933,7 @@ int cmd_commit(int argc,
"new index file. Check that disk is not full and quota is\n"
"not exceeded, and then \"git restore --staged :/\" to recover."));
- git_test_write_commit_graph_or_die();
+ git_test_write_commit_graph_or_die(the_repository->objects->sources);
repo_rerere(the_repository, 0);
run_auto_maintenance(quiet);
diff --git a/builtin/merge.c b/builtin/merge.c
index 18b22c0a26..263cb58471 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1862,7 +1862,7 @@ int cmd_merge(int argc,
if (squash) {
finish(head_commit, remoteheads, NULL, NULL);
- git_test_write_commit_graph_or_die();
+ git_test_write_commit_graph_or_die(the_repository->objects->sources);
} else
write_merge_state(remoteheads);
diff --git a/commit-graph.c b/commit-graph.c
index 594eca4110..8e087fc4e6 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1,5 +1,3 @@
-#define USE_THE_REPOSITORY_VARIABLE
-
#include "git-compat-util.h"
#include "config.h"
#include "csum-file.h"
@@ -27,7 +25,7 @@
#include "tree.h"
#include "chunk-format.h"
-void git_test_write_commit_graph_or_die(void)
+void git_test_write_commit_graph_or_die(struct odb_source *source)
{
int flags = 0;
if (!git_env_bool(GIT_TEST_COMMIT_GRAPH, 0))
@@ -36,8 +34,7 @@ void git_test_write_commit_graph_or_die(void)
if (git_env_bool(GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS, 0))
flags = COMMIT_GRAPH_WRITE_BLOOM_FILTERS;
- if (write_commit_graph_reachable(the_repository->objects->sources,
- flags, NULL))
+ if (write_commit_graph_reachable(source, flags, NULL))
die("failed to write commit-graph under GIT_TEST_COMMIT_GRAPH");
}
@@ -595,7 +592,7 @@ static int add_graph_to_chain(struct commit_graph *g,
if (!cur_g ||
!oideq(&oids[n], &cur_g->oid) ||
!hasheq(oids[n].hash, g->chunk_base_graphs + st_mult(g->hash_algo->rawsz, n),
- the_repository->hash_algo)) {
+ g->hash_algo)) {
warning(_("commit-graph chain does not match"));
return 0;
}
@@ -665,7 +662,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
if (strbuf_getline_lf(&line, fp) == EOF)
break;
- if (get_oid_hex(line.buf, &oids[i])) {
+ if (get_oid_hex_algop(line.buf, &oids[i], r->hash_algo)) {
warning(_("invalid commit-graph chain: line '%s' not a hash"),
line.buf);
valid = 0;
@@ -751,7 +748,7 @@ static void prepare_commit_graph_one(struct repository *r,
* Return 1 if commit_graph is non-NULL, and 0 otherwise.
*
* On the first invocation, this function attempts to load the commit
- * graph if the_repository is configured to have one.
+ * graph if the repository is configured to have one.
*/
static int prepare_commit_graph(struct repository *r)
{
@@ -872,7 +869,7 @@ static void load_oid_from_graph(struct commit_graph *g,
lex_index = pos - g->num_commits_in_base;
oidread(oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, lex_index),
- the_repository->hash_algo);
+ g->hash_algo);
}
static struct commit_list **insert_parent_or_die(struct repository *r,
@@ -1115,7 +1112,7 @@ static struct tree *load_tree_for_commit(struct repository *r,
st_mult(graph_data_width(g->hash_algo),
graph_pos - g->num_commits_in_base);
- oidread(&oid, commit_data, the_repository->hash_algo);
+ oidread(&oid, commit_data, g->hash_algo);
set_commit_tree(c, lookup_tree(r, &oid));
return c->maybe_tree;
@@ -1543,7 +1540,7 @@ static void close_reachable(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Loading known commits in commit graph"),
ctx->oids.nr);
for (i = 0; i < ctx->oids.nr; i++) {
@@ -1561,7 +1558,7 @@ static void close_reachable(struct write_commit_graph_context *ctx)
*/
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Expanding reachable commits in commit graph"),
0);
for (i = 0; i < ctx->oids.nr; i++) {
@@ -1582,7 +1579,7 @@ static void close_reachable(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Clearing commit marks in commit graph"),
ctx->oids.nr);
for (i = 0; i < ctx->oids.nr; i++) {
@@ -1699,7 +1696,7 @@ static void compute_topological_levels(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
info.progress = ctx->progress
= start_delayed_progress(
- the_repository,
+ ctx->r,
_("Computing commit graph topological levels"),
ctx->commits.nr);
@@ -1734,7 +1731,7 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
info.progress = ctx->progress
= start_delayed_progress(
- the_repository,
+ ctx->r,
_("Computing commit graph generation numbers"),
ctx->commits.nr);
@@ -1811,7 +1808,7 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Computing commit changed paths Bloom filters"),
ctx->commits.nr);
@@ -1857,6 +1854,7 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
}
struct refs_cb_data {
+ struct repository *repo;
struct oidset *commits;
struct progress *progress;
};
@@ -1869,9 +1867,9 @@ static int add_ref_to_set(const char *refname UNUSED,
struct object_id peeled;
struct refs_cb_data *data = (struct refs_cb_data *)cb_data;
- if (!peel_iterated_oid(the_repository, oid, &peeled))
+ if (!peel_iterated_oid(data->repo, oid, &peeled))
oid = &peeled;
- if (odb_read_object_info(the_repository->objects, oid, NULL) == OBJ_COMMIT)
+ if (odb_read_object_info(data->repo->objects, oid, NULL) == OBJ_COMMIT)
oidset_insert(data->commits, oid);
display_progress(data->progress, oidset_size(data->commits));
@@ -1888,13 +1886,15 @@ int write_commit_graph_reachable(struct odb_source *source,
int result;
memset(&data, 0, sizeof(data));
+ data.repo = source->odb->repo;
data.commits = &commits;
+
if (flags & COMMIT_GRAPH_WRITE_PROGRESS)
data.progress = start_delayed_progress(
- the_repository,
+ source->odb->repo,
_("Collecting referenced commits"), 0);
- refs_for_each_ref(get_main_ref_store(the_repository), add_ref_to_set,
+ refs_for_each_ref(get_main_ref_store(source->odb->repo), add_ref_to_set,
&data);
stop_progress(&data.progress);
@@ -1923,7 +1923,7 @@ static int fill_oids_from_packs(struct write_commit_graph_context *ctx,
"Finding commits for commit graph in %"PRIuMAX" packs",
pack_indexes->nr),
(uintmax_t)pack_indexes->nr);
- ctx->progress = start_delayed_progress(the_repository,
+ ctx->progress = start_delayed_progress(ctx->r,
progress_title.buf, 0);
ctx->progress_done = 0;
}
@@ -1977,7 +1977,7 @@ static void fill_oids_from_all_packs(struct write_commit_graph_context *ctx)
{
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Finding commits for commit graph among packed objects"),
ctx->approx_nr_objects);
for_each_packed_object(ctx->r, add_packed_commits, ctx,
@@ -1996,7 +1996,7 @@ static void copy_oids_to_commits(struct write_commit_graph_context *ctx)
ctx->num_extra_edges = 0;
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Finding extra edges in commit graph"),
ctx->oids.nr);
oid_array_sort(&ctx->oids);
@@ -2075,7 +2075,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
ctx->graph_name = get_commit_graph_filename(ctx->odb_source);
}
- if (safe_create_leading_directories(the_repository, ctx->graph_name)) {
+ if (safe_create_leading_directories(ctx->r, ctx->graph_name)) {
error(_("unable to create leading directories of %s"),
ctx->graph_name);
return -1;
@@ -2094,18 +2094,18 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
return -1;
}
- if (adjust_shared_perm(the_repository, get_tempfile_path(graph_layer))) {
+ if (adjust_shared_perm(ctx->r, get_tempfile_path(graph_layer))) {
error(_("unable to adjust shared permissions for '%s'"),
get_tempfile_path(graph_layer));
return -1;
}
- f = hashfd(the_repository->hash_algo,
+ f = hashfd(ctx->r->hash_algo,
get_tempfile_fd(graph_layer), get_tempfile_path(graph_layer));
} else {
hold_lock_file_for_update_mode(&lk, ctx->graph_name,
LOCK_DIE_ON_ERROR, 0444);
- f = hashfd(the_repository->hash_algo,
+ f = hashfd(ctx->r->hash_algo,
get_lock_file_fd(&lk), get_lock_file_path(&lk));
}
@@ -2158,7 +2158,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
get_num_chunks(cf)),
get_num_chunks(cf));
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
progress_title.buf,
st_mult(get_num_chunks(cf), ctx->commits.nr));
}
@@ -2216,7 +2216,8 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
}
free(ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1]);
- ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1] = xstrdup(hash_to_hex(file_hash));
+ ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1] =
+ xstrdup(hash_to_hex_algop(file_hash, ctx->r->hash_algo));
final_graph_name = get_split_graph_filename(ctx->odb_source,
ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1]);
free(ctx->commit_graph_filenames_after[ctx->num_commit_graphs_after - 1]);
@@ -2370,7 +2371,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Scanning merged commits"),
ctx->commits.nr);
@@ -2415,7 +2416,7 @@ static void merge_commit_graphs(struct write_commit_graph_context *ctx)
current_graph_number--;
if (ctx->report_progress)
- ctx->progress = start_delayed_progress(the_repository,
+ ctx->progress = start_delayed_progress(ctx->r,
_("Merging commit-graph"), 0);
merge_commit_graph(ctx, g);
@@ -2518,7 +2519,7 @@ int write_commit_graph(struct odb_source *source,
enum commit_graph_write_flags flags,
const struct commit_graph_opts *opts)
{
- struct repository *r = the_repository;
+ struct repository *r = source->odb->repo;
struct write_commit_graph_context ctx = {
.r = r,
.odb_source = source,
@@ -2618,14 +2619,14 @@ int write_commit_graph(struct odb_source *source,
replace = ctx.opts->split_flags & COMMIT_GRAPH_SPLIT_REPLACE;
}
- ctx.approx_nr_objects = repo_approximate_object_count(the_repository);
+ ctx.approx_nr_objects = repo_approximate_object_count(r);
if (ctx.append && ctx.r->objects->commit_graph) {
struct commit_graph *g = ctx.r->objects->commit_graph;
for (i = 0; i < g->num_commits; i++) {
struct object_id oid;
oidread(&oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
- the_repository->hash_algo);
+ r->hash_algo);
oid_array_append(&ctx.oids, &oid);
}
}
@@ -2733,7 +2734,7 @@ static void graph_report(const char *fmt, ...)
static int commit_graph_checksum_valid(struct commit_graph *g)
{
- return hashfile_checksum_valid(the_repository->hash_algo,
+ return hashfile_checksum_valid(g->hash_algo,
g->data, g->data_len);
}
@@ -2756,7 +2757,7 @@ static int verify_one_commit_graph(struct repository *r,
struct commit *graph_commit;
oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
- the_repository->hash_algo);
+ g->hash_algo);
if (i && oidcmp(&prev_oid, &cur_oid) >= 0)
graph_report(_("commit-graph has incorrect OID order: %s then %s"),
@@ -2801,7 +2802,7 @@ static int verify_one_commit_graph(struct repository *r,
display_progress(progress, ++(*seen));
oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
- the_repository->hash_algo);
+ g->hash_algo);
graph_commit = lookup_commit(r, &cur_oid);
odb_commit = (struct commit *)create_object(r, &cur_oid, alloc_commit_node(r));
@@ -2905,7 +2906,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
if (!(flags & COMMIT_GRAPH_VERIFY_SHALLOW))
total += g->num_commits_in_base;
- progress = start_progress(the_repository,
+ progress = start_progress(r,
_("Verifying commits in commit graph"),
total);
}
diff --git a/commit-graph.h b/commit-graph.h
index 5f417f7666..ba2856437f 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -21,7 +21,7 @@
* call this method oustide of a builtin, and only if you know what
* you are doing!
*/
-void git_test_write_commit_graph_or_die(void);
+void git_test_write_commit_graph_or_die(struct odb_source *source);
struct commit;
struct bloom_filter_settings;
--
2.51.0.rc0.215.g125493bb4a.dirty
* [PATCH v3 10/10] commit-graph: stop passing in redundant repository
2025-08-07 8:04 ` [PATCH v3 00/10] commit-graph: remove reliance on global state Patrick Steinhardt
` (8 preceding siblings ...)
2025-08-07 8:04 ` [PATCH v3 09/10] commit-graph: stop using `the_repository` Patrick Steinhardt
@ 2025-08-07 8:04 ` Patrick Steinhardt
9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-07 8:04 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
Many of the commit-graph related functions take in both a repository and
the object database source (directly or via `struct commit_graph`) for
which we are supposed to load such a commit-graph. In the best case this
information is simply redundant as the source already contains a
reference to its owning object database, which in turn has a reference
to its repository. In the worst case this information could even
mismatch when passing in a source that doesn't belong to the same
repository.
Refactor the code so that we only pass in the object database source in
those cases.
There is one exception though, namely `load_commit_graph_chain_fd_st()`,
which is responsible for loading a commit-graph chain. It is expected
that parts of the commit-graph chain aren't located in the same object
source as the chain file itself, but in a different one. Consequently,
this function doesn't work on the source level but on the database level
instead.
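The redundancy can be sketched with three small structures that mimic the
source -> database -> repository back-references. The names below
(`demo_repo`, `demo_odb`, `demo_source` and both helpers) are hypothetical
and heavily simplified compared to Git's real types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified ownership chain: source -> odb -> repo. */
struct demo_repo {
	size_t hash_rawsz;
};

struct demo_odb {
	struct demo_repo *repo;
};

struct demo_source {
	struct demo_odb *odb;
	const char *path;
};

/* Everything needed is reachable from the source alone. */
size_t demo_source_rawsz(const struct demo_source *source)
{
	assert(source && source->odb && source->odb->repo);
	return source->odb->repo->hash_rawsz;
}

/*
 * With a separate repository argument, the caller could pass a pair
 * that does not belong together; this helper detects such a mismatch.
 */
int demo_source_belongs_to(const struct demo_repo *r,
			   const struct demo_source *source)
{
	assert(r && source && source->odb);
	return source->odb->repo == r;
}
```

Taking only the source removes the need for a check like
`demo_source_belongs_to()`, because a mismatching pair can no longer be
expressed in the function signature.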
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/commit-graph.c | 6 +--
commit-graph.c | 120 +++++++++++++++++++--------------------------
commit-graph.h | 12 ++---
t/helper/test-read-graph.c | 2 +-
4 files changed, 59 insertions(+), 81 deletions(-)
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 680b03a83a..1b80993b2d 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -121,15 +121,15 @@ static int graph_verify(int argc, const char **argv, const char *prefix,
if (opened == OPENED_NONE)
return 0;
else if (opened == OPENED_GRAPH)
- graph = load_commit_graph_one_fd_st(the_repository, fd, &st, source);
+ graph = load_commit_graph_one_fd_st(source, fd, &st);
else
- graph = load_commit_graph_chain_fd_st(the_repository, fd, &st,
+ graph = load_commit_graph_chain_fd_st(the_repository->objects, fd, &st,
&incomplete_chain);
if (!graph)
return 1;
- ret = verify_commit_graph(the_repository, graph, flags);
+ ret = verify_commit_graph(graph, flags);
free_commit_graph(graph);
if (incomplete_chain) {
diff --git a/commit-graph.c b/commit-graph.c
index 8e087fc4e6..b4a2924889 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -250,9 +250,8 @@ int open_commit_graph(const char *graph_file, int *fd, struct stat *st)
return 1;
}
-struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
- int fd, struct stat *st,
- struct odb_source *source)
+struct commit_graph *load_commit_graph_one_fd_st(struct odb_source *source,
+ int fd, struct stat *st)
{
void *graph_map;
size_t graph_size;
@@ -260,7 +259,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_size = xsize_t(st->st_size);
- if (graph_size < graph_min_size(r->hash_algo)) {
+ if (graph_size < graph_min_size(source->odb->repo->hash_algo)) {
close(fd);
error(_("commit-graph file is too small"));
return NULL;
@@ -268,7 +267,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_map = xmmap(NULL, graph_size, PROT_READ, MAP_PRIVATE, fd, 0);
close(fd);
- ret = parse_commit_graph(r, graph_map, graph_size);
+ ret = parse_commit_graph(source->odb->repo, graph_map, graph_size);
if (ret)
ret->odb_source = source;
else
@@ -488,11 +487,9 @@ struct commit_graph *parse_commit_graph(struct repository *r,
return NULL;
}
-static struct commit_graph *load_commit_graph_one(struct repository *r,
- const char *graph_file,
- struct odb_source *source)
+static struct commit_graph *load_commit_graph_one(struct odb_source *source,
+ const char *graph_file)
{
-
struct stat st;
int fd;
struct commit_graph *g;
@@ -501,19 +498,17 @@ static struct commit_graph *load_commit_graph_one(struct repository *r,
if (!open_ok)
return NULL;
- g = load_commit_graph_one_fd_st(r, fd, &st, source);
-
+ g = load_commit_graph_one_fd_st(source, fd, &st);
if (g)
g->filename = xstrdup(graph_file);
return g;
}
-static struct commit_graph *load_commit_graph_v1(struct repository *r,
- struct odb_source *source)
+static struct commit_graph *load_commit_graph_v1(struct odb_source *source)
{
char *graph_name = get_commit_graph_filename(source);
- struct commit_graph *g = load_commit_graph_one(r, graph_name, source);
+ struct commit_graph *g = load_commit_graph_one(source, graph_name);
free(graph_name);
return g;
@@ -640,7 +635,7 @@ int open_commit_graph_chain(const char *chain_file,
return 1;
}
-struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+struct commit_graph *load_commit_graph_chain_fd_st(struct object_database *odb,
int fd, struct stat *st,
int *incomplete_chain)
{
@@ -651,10 +646,10 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
FILE *fp = xfdopen(fd, "r");
size_t count;
- count = st->st_size / (r->hash_algo->hexsz + 1);
+ count = st->st_size / (odb->repo->hash_algo->hexsz + 1);
CALLOC_ARRAY(oids, count);
- odb_prepare_alternates(r->objects);
+ odb_prepare_alternates(odb);
for (size_t i = 0; i < count; i++) {
struct odb_source *source;
@@ -662,7 +657,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
if (strbuf_getline_lf(&line, fp) == EOF)
break;
- if (get_oid_hex_algop(line.buf, &oids[i], r->hash_algo)) {
+ if (get_oid_hex_algop(line.buf, &oids[i], odb->repo->hash_algo)) {
warning(_("invalid commit-graph chain: line '%s' not a hash"),
line.buf);
valid = 0;
@@ -670,9 +665,9 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
}
valid = 0;
- for (source = r->objects->sources; source; source = source->next) {
+ for (source = odb->sources; source; source = source->next) {
char *graph_name = get_split_graph_filename(source, line.buf);
- struct commit_graph *g = load_commit_graph_one(r, graph_name, source);
+ struct commit_graph *g = load_commit_graph_one(source, graph_name);
free(graph_name);
@@ -705,45 +700,33 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
return graph_chain;
}
-static struct commit_graph *load_commit_graph_chain(struct repository *r,
- struct odb_source *source)
+static struct commit_graph *load_commit_graph_chain(struct odb_source *source)
{
char *chain_file = get_commit_graph_chain_filename(source);
struct stat st;
int fd;
struct commit_graph *g = NULL;
- if (open_commit_graph_chain(chain_file, &fd, &st, r->hash_algo)) {
+ if (open_commit_graph_chain(chain_file, &fd, &st, source->odb->repo->hash_algo)) {
int incomplete;
/* ownership of fd is taken over by load function */
- g = load_commit_graph_chain_fd_st(r, fd, &st, &incomplete);
+ g = load_commit_graph_chain_fd_st(source->odb, fd, &st, &incomplete);
}
free(chain_file);
return g;
}
-struct commit_graph *read_commit_graph_one(struct repository *r,
- struct odb_source *source)
+struct commit_graph *read_commit_graph_one(struct odb_source *source)
{
- struct commit_graph *g = load_commit_graph_v1(r, source);
+ struct commit_graph *g = load_commit_graph_v1(source);
if (!g)
- g = load_commit_graph_chain(r, source);
+ g = load_commit_graph_chain(source);
return g;
}
-static void prepare_commit_graph_one(struct repository *r,
- struct odb_source *source)
-{
-
- if (r->objects->commit_graph)
- return;
-
- r->objects->commit_graph = read_commit_graph_one(r, source);
-}
-
/*
* Return 1 if commit_graph is non-NULL, and 0 otherwise.
*
@@ -784,10 +767,12 @@ static int prepare_commit_graph(struct repository *r)
return 0;
odb_prepare_alternates(r->objects);
- for (source = r->objects->sources;
- !r->objects->commit_graph && source;
- source = source->next)
- prepare_commit_graph_one(r, source);
+ for (source = r->objects->sources; source; source = source->next) {
+ r->objects->commit_graph = read_commit_graph_one(source);
+ if (r->objects->commit_graph)
+ break;
+ }
+
return !!r->objects->commit_graph;
}
@@ -872,8 +857,7 @@ static void load_oid_from_graph(struct commit_graph *g,
g->hash_algo);
}
-static struct commit_list **insert_parent_or_die(struct repository *r,
- struct commit_graph *g,
+static struct commit_list **insert_parent_or_die(struct commit_graph *g,
uint32_t pos,
struct commit_list **pptr)
{
@@ -884,7 +868,7 @@ static struct commit_list **insert_parent_or_die(struct repository *r,
die("invalid parent position %"PRIu32, pos);
load_oid_from_graph(g, pos, &oid);
- c = lookup_commit(r, &oid);
+ c = lookup_commit(g->odb_source->odb->repo, &oid);
if (!c)
die(_("could not find commit %s"), oid_to_hex(&oid));
commit_graph_data_at(c)->graph_pos = pos;
@@ -940,8 +924,7 @@ static inline void set_commit_tree(struct commit *c, struct tree *t)
c->maybe_tree = t;
}
-static int fill_commit_in_graph(struct repository *r,
- struct commit *item,
+static int fill_commit_in_graph(struct commit *item,
struct commit_graph *g, uint32_t pos)
{
uint32_t edge_value;
@@ -967,13 +950,13 @@ static int fill_commit_in_graph(struct repository *r,
edge_value = get_be32(commit_data + g->hash_algo->rawsz);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
- pptr = insert_parent_or_die(r, g, edge_value, pptr);
+ pptr = insert_parent_or_die(g, edge_value, pptr);
edge_value = get_be32(commit_data + g->hash_algo->rawsz + 4);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
if (!(edge_value & GRAPH_EXTRA_EDGES_NEEDED)) {
- pptr = insert_parent_or_die(r, g, edge_value, pptr);
+ pptr = insert_parent_or_die(g, edge_value, pptr);
return 1;
}
@@ -988,7 +971,7 @@ static int fill_commit_in_graph(struct repository *r,
}
edge_value = get_be32(g->chunk_extra_edges +
sizeof(uint32_t) * parent_data_pos);
- pptr = insert_parent_or_die(r, g,
+ pptr = insert_parent_or_die(g,
edge_value & GRAPH_EDGE_LAST_MASK,
pptr);
parent_data_pos++;
@@ -1054,14 +1037,13 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
if (commit->object.parsed)
return commit;
- if (!fill_commit_in_graph(repo, commit, repo->objects->commit_graph, pos))
+ if (!fill_commit_in_graph(commit, repo->objects->commit_graph, pos))
return NULL;
return commit;
}
-static int parse_commit_in_graph_one(struct repository *r,
- struct commit_graph *g,
+static int parse_commit_in_graph_one(struct commit_graph *g,
struct commit *item)
{
uint32_t pos;
@@ -1070,7 +1052,7 @@ static int parse_commit_in_graph_one(struct repository *r,
return 1;
if (find_commit_pos_in_graph(item, g, &pos))
- return fill_commit_in_graph(r, item, g, pos);
+ return fill_commit_in_graph(item, g, pos);
return 0;
}
@@ -1087,7 +1069,7 @@ int parse_commit_in_graph(struct repository *r, struct commit *item)
if (!prepare_commit_graph(r))
return 0;
- return parse_commit_in_graph_one(r, r->objects->commit_graph, item);
+ return parse_commit_in_graph_one(r->objects->commit_graph, item);
}
void load_commit_graph_info(struct repository *r, struct commit *item)
@@ -1097,8 +1079,7 @@ void load_commit_graph_info(struct repository *r, struct commit *item)
fill_commit_graph_info(item, r->objects->commit_graph, pos);
}
-static struct tree *load_tree_for_commit(struct repository *r,
- struct commit_graph *g,
+static struct tree *load_tree_for_commit(struct commit_graph *g,
struct commit *c)
{
struct object_id oid;
@@ -1113,13 +1094,12 @@ static struct tree *load_tree_for_commit(struct repository *r,
graph_pos - g->num_commits_in_base);
oidread(&oid, commit_data, g->hash_algo);
- set_commit_tree(c, lookup_tree(r, &oid));
+ set_commit_tree(c, lookup_tree(g->odb_source->odb->repo, &oid));
return c->maybe_tree;
}
-static struct tree *get_commit_tree_in_graph_one(struct repository *r,
- struct commit_graph *g,
+static struct tree *get_commit_tree_in_graph_one(struct commit_graph *g,
const struct commit *c)
{
if (c->maybe_tree)
@@ -1127,12 +1107,12 @@ static struct tree *get_commit_tree_in_graph_one(struct repository *r,
if (commit_graph_position(c) == COMMIT_NOT_FROM_GRAPH)
BUG("get_commit_tree_in_graph_one called from non-commit-graph commit");
- return load_tree_for_commit(r, g, (struct commit *)c);
+ return load_tree_for_commit(g, (struct commit *)c);
}
struct tree *get_commit_tree_in_graph(struct repository *r, const struct commit *c)
{
- return get_commit_tree_in_graph_one(r, r->objects->commit_graph, c);
+ return get_commit_tree_in_graph_one(r->objects->commit_graph, c);
}
struct packed_commit_list {
@@ -2738,11 +2718,11 @@ static int commit_graph_checksum_valid(struct commit_graph *g)
g->data, g->data_len);
}
-static int verify_one_commit_graph(struct repository *r,
- struct commit_graph *g,
+static int verify_one_commit_graph(struct commit_graph *g,
struct progress *progress,
uint64_t *seen)
{
+ struct repository *r = g->odb_source->odb->repo;
uint32_t i, cur_fanout_pos = 0;
struct object_id prev_oid, cur_oid;
struct commit *seen_gen_zero = NULL;
@@ -2776,7 +2756,7 @@ static int verify_one_commit_graph(struct repository *r,
}
graph_commit = lookup_commit(r, &cur_oid);
- if (!parse_commit_in_graph_one(r, g, graph_commit))
+ if (!parse_commit_in_graph_one(g, graph_commit))
graph_report(_("failed to parse commit %s from commit-graph"),
oid_to_hex(&cur_oid));
}
@@ -2812,7 +2792,7 @@ static int verify_one_commit_graph(struct repository *r,
continue;
}
- if (!oideq(&get_commit_tree_in_graph_one(r, g, graph_commit)->object.oid,
+ if (!oideq(&get_commit_tree_in_graph_one(g, graph_commit)->object.oid,
get_commit_tree_oid(odb_commit)))
graph_report(_("root tree OID for commit %s in commit-graph is %s != %s"),
oid_to_hex(&cur_oid),
@@ -2830,7 +2810,7 @@ static int verify_one_commit_graph(struct repository *r,
}
/* parse parent in case it is in a base graph */
- parse_commit_in_graph_one(r, g, graph_parents->item);
+ parse_commit_in_graph_one(g, graph_parents->item);
if (!oideq(&graph_parents->item->object.oid, &odb_parents->item->object.oid))
graph_report(_("commit-graph parent for %s is %s != %s"),
@@ -2890,7 +2870,7 @@ static int verify_one_commit_graph(struct repository *r,
return verify_commit_graph_error;
}
-int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
+int verify_commit_graph(struct commit_graph *g, int flags)
{
struct progress *progress = NULL;
int local_error = 0;
@@ -2906,13 +2886,13 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
if (!(flags & COMMIT_GRAPH_VERIFY_SHALLOW))
total += g->num_commits_in_base;
- progress = start_progress(r,
+ progress = start_progress(g->odb_source->odb->repo,
_("Verifying commits in commit graph"),
total);
}
for (; g; g = g->base_graph) {
- local_error |= verify_one_commit_graph(r, g, progress, &seen);
+ local_error |= verify_one_commit_graph(g, progress, &seen);
if (flags & COMMIT_GRAPH_VERIFY_SHALLOW)
break;
}
diff --git a/commit-graph.h b/commit-graph.h
index ba2856437f..269f86be56 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -114,14 +114,12 @@ struct commit_graph {
struct bloom_filter_settings *bloom_filter_settings;
};
-struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
- int fd, struct stat *st,
- struct odb_source *source);
-struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+struct commit_graph *load_commit_graph_one_fd_st(struct odb_source *source,
+ int fd, struct stat *st);
+struct commit_graph *load_commit_graph_chain_fd_st(struct object_database *odb,
int fd, struct stat *st,
int *incomplete_chain);
-struct commit_graph *read_commit_graph_one(struct repository *r,
- struct odb_source *source);
+struct commit_graph *read_commit_graph_one(struct odb_source *source);
struct repo_settings;
@@ -185,7 +183,7 @@ int write_commit_graph(struct odb_source *source,
#define COMMIT_GRAPH_VERIFY_SHALLOW (1 << 0)
-int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags);
+int verify_commit_graph(struct commit_graph *g, int flags);
void close_commit_graph(struct object_database *);
void free_commit_graph(struct commit_graph *);
diff --git a/t/helper/test-read-graph.c b/t/helper/test-read-graph.c
index ef5339bbee..6a5f64e473 100644
--- a/t/helper/test-read-graph.c
+++ b/t/helper/test-read-graph.c
@@ -81,7 +81,7 @@ int cmd__read_graph(int argc, const char **argv)
prepare_repo_settings(the_repository);
- graph = read_commit_graph_one(the_repository, source);
+ graph = read_commit_graph_one(source);
if (!graph) {
ret = 1;
goto done;
--
2.51.0.rc0.215.g125493bb4a.dirty
* Re: [PATCH v3 03/10] commit-graph: fix type for some write options
2025-08-07 8:04 ` [PATCH v3 03/10] commit-graph: fix type for some write options Patrick Steinhardt
@ 2025-08-07 22:40 ` Junio C Hamano
2025-08-11 8:24 ` Patrick Steinhardt
0 siblings, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2025-08-07 22:40 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Taylor Blau, Derrick Stolee, Oswald Buddenhagen
Patrick Steinhardt <ps@pks.im> writes:
> The options "max-commits" and "size-multiple" are both supposed to be
> positive integers and are documented as such, but we use a signed
> integer field to store them. This causes sign comparison warnings in
> `split_graph_merge_strategy()` because we end up comparing the option
> values with the observed number of commits.
>
> Fix the issue by converting the fields to be unsigned and convert the
> options to use `OPT_UNSIGNED()` accordingly. This macro has only been
> introduced recently, which might explain why the option values were
> signed in the first place.
These are platform natural "int" from their inception at c2bc6e6a
(commit-graph: create options for split files, 2019-06-18), which
way predates the recent push to appease -Wsign-compare, so yes, it
does explain it. But because the developer who wrote it in the
first place is around and with us, why not ask them instead of
speculating?
As the max_commits member is comparable to the 4-byte network byte order
integer that is .num_commits in the file, using platform natural
"int" or int32_t is not correct, because you may not be able to tell
the command to hold 3 billion objects before splitting, even though
the underlying file format does support such settings. It has to be
uint32_t or wider (but if it is wider, you'd need to be prepared to
correctly compare max_commits with num_commits, and take an overly
large max as "unlimited", or something). And unsigned usually is at
least that wide, so the change may be justified. I do not see a
reason why we want to avoid using uint32_t, though.
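To make the range argument concrete, here is a small sketch. It is not
code from the series: the helper names are made up for illustration, and
clamping is only one possible reading of "take an overly large max as
unlimited".

```c
#include <stdint.h>

/* Hypothetical helper: does the value fit into a signed 32-bit integer? */
static int fits_int32(uint64_t v)
{
	return v <= (uint64_t)INT32_MAX;
}

/* Hypothetical helper: does the value fit into an unsigned 32-bit integer? */
static int fits_uint32(uint64_t v)
{
	return v <= (uint64_t)UINT32_MAX;
}

/*
 * Hypothetical helper: if the option is parsed into a wider type, a
 * request beyond what the on-disk 4-byte field can express is treated
 * as "unlimited" by clamping it to UINT32_MAX.
 */
static uint32_t clamp_max_commits(uint64_t requested)
{
	return requested > UINT32_MAX ? UINT32_MAX : (uint32_t)requested;
}
```

A request for 3 billion commits fails fits_int32() but passes
fits_uint32(), which is the gap described above.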
As to size_multiple, it appears to me that the number is really
designed to be a small integer (for which even 100 is probably way
too many), so I do not see any reason to insist on it being unsigned.
Even "short" _ought_ to do fine. And if our macros and compiler
settings do not support it well and DEVELOPER=YesPlease builds
complain, that is what we need to fix. Papering over the problem by
using an unnecessarily wide type, or by using signedness that happens
to squelch the misguided compiler warnings, is skirting around it.
If there is any bug, it is st_mult(size_mult, num_commit) and the
compiler warning that complains about its CPP expansion, even though
it should be able to ensure that "int size_mult" (or "short
size_mult") that gets promoted to unsigned to match uint32_t
num_commit would not overflow the unsigned multiplication.
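For reference, a checked unsigned multiplication in the spirit of
st_mult() can be sketched as follows. This is a simplified illustration,
not the actual implementation: the name `checked_mult` is hypothetical,
and it reports failure through its return value instead of dying.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: store a * b in *out and return 0, or return -1
 * without touching *out if the product would wrap around.
 */
static int checked_mult(size_t a, size_t b, size_t *out)
{
	/* a * b wraps exactly when a is non-zero and b > SIZE_MAX / a. */
	if (a && b > SIZE_MAX / a)
		return -1;
	*out = a * b;
	return 0;
}
```

The check itself only involves unsigned operands; the sign-compare
question arises earlier, when a signed multiplier gets converted to the
unsigned parameter type.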
So my take on this change is that "fix type for ..." on the subject
line greatly misrepresents what this change does. It smells to me
that "squelch -Wsign-compare warnings" would be closer to what is
happening.
It really is arithmetic overflowing and wrapping around that we want
to be careful about. -Wsign-compare alone is not doing a good job
of that, is it?
* Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters
2025-08-07 7:04 ` Patrick Steinhardt
@ 2025-08-07 22:41 ` Junio C Hamano
2025-08-11 8:05 ` Patrick Steinhardt
0 siblings, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2025-08-07 22:41 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Taylor Blau, Oswald Buddenhagen, git
Patrick Steinhardt <ps@pks.im> writes:
> Yup, fully agreed, and this is a good reason why it should be signed. In
> the case at hand though we never use such sentinel values, I think
> making that explicit by using an unsigned type is a good thing as it
> tells the reader that "Yup, no sentinels involved, it's a plain counter
> from 0 to $NUM_ENTRIES".
I do not think such a "statement" has much value, especially as the
right $NUM_ENTRIES is different for specific cases and is not
expressed anywhere.
Also, by making it explicit, such a move is also making it explicit
that we want to close the door for certain future evolution of the
code paths involved. I.e. anything that starts to require the
member or the variable to use a sentinel value is unwelcome.
So, I am not sure if I buy the above as a justification for this
change.
* Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters
2025-08-07 22:41 ` Junio C Hamano
@ 2025-08-11 8:05 ` Patrick Steinhardt
0 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-11 8:05 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Taylor Blau, Oswald Buddenhagen, git
On Thu, Aug 07, 2025 at 03:41:38PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > Yup, fully agreed, and this is a good reason why it should be signed. In
> > the case at hand though we never use such sentinel values, I think
> > making that explicit by using an unsigned type is a good thing as it
> > tells the reader that "Yup, no sentinels involved, it's a plain counter
> > from 0 to $NUM_ENTRIES".
>
> I do not think such a "statement" has much value, especially as the
> right $NUM_ENTRIES is different for specific cases and is not
> expressed anywhere.
>
> Also, by making it explicit, such a move is also making it explicit
> that we want to close the door for certain future evolution of the
> code paths involved. I.e. anything that starts to require the
> member or the variable to use a sentinel value is unwelcome.
I think that's actually a good thing. If we wanted to start using a
sentinel value we'd have to change the type to be signed. Combined with
-Wsign-compare this would then alert us to cases where we compare this
counter with an unsigned index, which are exactly all the sites where we
might have to adjust the code to take into account the new sentinel
value.
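A minimal sketch of the kind of site this is about (the function names
are made up for illustration and do not appear in the series): once a
signed counter can hold a sentinel such as -1, comparing it against an
unsigned limit silently converts the sentinel into a huge value.

```c
/*
 * Hypothetical helper: models what an unadorned "counter < limit" does.
 * The signed operand is converted to unsigned first, so -1 becomes
 * UINT_MAX and the comparison is false.
 */
static int naive_less(int counter, unsigned limit)
{
	return (unsigned)counter < limit;
}

/*
 * Hypothetical helper: handles negative values explicitly before
 * falling back to the unsigned comparison.
 */
static int safe_less(int counter, unsigned limit)
{
	return counter < 0 || (unsigned)counter < limit;
}
```

For a sentinel of -1 and a limit of 10, naive_less() reports false while
safe_less() reports true; for non-negative counters both agree.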
Patrick
* Re: [PATCH v3 03/10] commit-graph: fix type for some write options
2025-08-07 22:40 ` Junio C Hamano
@ 2025-08-11 8:24 ` Patrick Steinhardt
0 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-11 8:24 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Taylor Blau, Derrick Stolee, Oswald Buddenhagen
On Thu, Aug 07, 2025 at 03:40:50PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > The options "max-commits" and "size-multiple" are both supposed to be
> > positive integers and are documented as such, but we use a signed
> > integer field to store them. This causes sign comparison warnings in
> > `split_graph_merge_strategy()` because we end up comparing the option
> > values with the observed number of commits.
> >
> > Fix the issue by converting the fields to be unsigned and convert the
> > options to use `OPT_UNSIGNED()` accordingly. This macro has only been
> > introduced recently, which might explain why the option values were
> > signed in the first place.
>
> These are platform natural "int" from their inception at c2bc6e6a
> (commit-graph: create options for split files, 2019-06-18), which
> way predates the recent push to appease -Wsign-compare, so yes, it
> does explain it. But because the developer who wrote it in the
> first place is around and with us, why not ask them instead of
> speculating?
>
> As the max_commits member is comparable to the 4-byte network byte order
> integer that is .num_commits in the file, using platform natural
> "int" or int32_t is not correct, because you may not be able to tell
> the command to hold 3 billion objects before splitting, even though
> the underlying file format does support such settings. It has to be
> uint32_t or wider (but if it is wider, you'd need to be prepared to
> correctly compare max_commits with num_commits, and take an overly
> large max as "unlimited", or something). And unsigned usually is at
> least that wide, so the change may be justified. I do not see a
> reason why we want to avoid using uint32_t, though.
Using `uint32_t` might cause regressions on some platforms. If for
example a signed integer was 64 bits on certain platforms and we
restrict it to `uint32_t` then we'd now refuse to take any values
between `2^32` and `2^63`, even though those are valid values that we
accepted beforehand. So if somebody was using any such value to say
"make this essentially unlimited" then we'd now die.
But by using `unsigned` we avoid this pitfall, as we only extend the
range of accepted valid values.
> As to size_multiple, it appears to me that the number is really
> designed to be a small integer (for which even 100 is probably way
> too many), so I do not see any reason to insist on it being unsigned.
> Even "short" _ought_ to do fine. And if our macros and compiler
> settings do not support it well and DEVELOPER=YesPlease builds
> complain, that is what we need to fix. Papering over the problem by
> using an unnecessarily wide type, or by using signedness that happens
> to squelch the misguided compiler warnings, is skirting around it.
Yeah, it is. At one point I was pondering whether we should extend our
parse-options interface to allow restricting to arbitrary values like
100. I even implemented all of this, but ultimately discarded it because
it wasn't easy to decide where we can retroactively harden accepted
values without causing regressions.
While large values for options may frequently be unreasonable, it
wouldn't be the first time that I see users doing unreasonable things
intentionally. And sometimes the outcome is even something sensible.
Patrick
* [PATCH v4 0/6] commit-graph: remove reliance on global state
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
` (11 preceding siblings ...)
2025-08-07 8:04 ` [PATCH v3 00/10] commit-graph: remove reliance on global state Patrick Steinhardt
@ 2025-08-15 5:49 ` Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 1/6] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
` (6 more replies)
12 siblings, 7 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-15 5:49 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
Hi,
this patch series is another step on our long road towards not having
global state. In addition to that, as commit-graphs are part of the
object database layer, this is also another step towards pluggable
object databases.
Changes in v2:
- Use `unsigned` instead of `size_t` to count number of Bloom filters.
- Use `uint32_t` instead of `size_t` for number of commit graphs,
as this type is also used to iterate through this count already.
- Refactor `parse_commit_graph()` to take a repository instead of both
repo settings and a hash algo.
- Link to v1: https://lore.kernel.org/r/20250804-b4-pks-commit-graph-wo-the-repository-v1-0-850d626eb2e8@pks.im
Changes in v3:
- Use `unsigned` for commit-graph options instead of `size_t`.
- Link to v2: https://lore.kernel.org/r/20250806-b4-pks-commit-graph-wo-the-repository-v2-0-911bae638e61@pks.im
Changes in v4:
- Drop the patches that fix `-Wsign-compare` warnings.
- Link to v3: https://lore.kernel.org/r/20250807-b4-pks-commit-graph-wo-the-repository-v3-0-82edef830a1e@pks.im
Thanks!
Patrick
---
Patrick Steinhardt (6):
commit-graph: stop using `the_hash_algo` via macros
commit-graph: store the hash algorithm instead of its length
commit-graph: refactor `parse_commit_graph()` to take a repository
commit-graph: stop using `the_hash_algo`
commit-graph: stop using `the_repository`
commit-graph: stop passing in redundant repository
builtin/commit-graph.c | 9 +-
builtin/commit.c | 2 +-
builtin/merge.c | 2 +-
commit-graph.c | 283 +++++++++++++++++++++----------------------
commit-graph.h | 21 ++--
oss-fuzz/fuzz-commit-graph.c | 6 +-
t/helper/test-read-graph.c | 2 +-
7 files changed, 157 insertions(+), 168 deletions(-)
Range-diff versus v3:
1: 26daf74e02 < -: ---------- trace2: introduce function to trace unsigned integers
2: 646805924e < -: ---------- commit-graph: stop using signed integers to count Bloom filters
3: 01e38f39cf < -: ---------- commit-graph: fix type for some write options
4: a362f63472 < -: ---------- commit-graph: fix sign comparison warnings
5: 041d9c07a0 = 1: f7083b7e3b commit-graph: stop using `the_hash_algo` via macros
6: 6c1fa6be4f = 2: 336e35e93b commit-graph: store the hash algorithm instead of its length
7: ed04f7b787 = 3: 1446e1d66f commit-graph: refactor `parse_commit_graph()` to take a repository
8: 1eb6316e8d ! 4: 368e5ada3e commit-graph: stop using `the_hash_algo`
@@ commit-graph.c: int open_commit_graph_chain(const char *chain_file,
close(*fd);
return 0;
}
-- if (st->st_size < (ssize_t) the_hash_algo->hexsz) {
-+ if (st->st_size < (ssize_t) hash_algo->hexsz) {
+- if (st->st_size < the_hash_algo->hexsz) {
++ if (st->st_size < hash_algo->hexsz) {
close(*fd);
if (!st->st_size) {
/* treat empty files the same as missing */
@@ commit-graph.c: struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+ int i = 0, valid = 1, count;
FILE *fp = xfdopen(fd, "r");
- size_t count;
- count = st->st_size / (the_hash_algo->hexsz + 1);
+ count = st->st_size / (r->hash_algo->hexsz + 1);
@@ commit-graph.c: static struct tree *load_tree_for_commit(struct repository *r,
oidread(&oid, commit_data, the_repository->hash_algo);
@@ commit-graph.c: static int write_graph_chunk_oids(struct hashfile *f,
-
+ int count;
for (count = 0; count < ctx->commits.nr; count++, list++) {
display_progress(ctx->progress, ++ctx->progress_cnt);
- hashwrite(f, (*list)->object.oid.hash, the_hash_algo->rawsz);
@@ commit-graph.c: static int write_graph_chunk_data(struct hashfile *f,
parent = (*list)->parents;
-@@ commit-graph.c: static size_t write_graph_chunk_base_1(struct hashfile *f,
+@@ commit-graph.c: static int write_graph_chunk_base_1(struct hashfile *f,
return 0;
num = write_graph_chunk_base_1(f, g->base_graph);
9: 8d599f5a37 ! 5: a51cccec0d commit-graph: stop using `the_repository`
@@ builtin/merge.c: int cmd_merge(int argc,
## commit-graph.c ##
@@
-#define USE_THE_REPOSITORY_VARIABLE
--
+ #define DISABLE_SIGN_COMPARE_WARNINGS
+
#include "git-compat-util.h"
- #include "config.h"
- #include "csum-file.h"
@@
#include "tree.h"
#include "chunk-format.h"
10: 41ccdd6da1 ! 6: 841d280105 commit-graph: stop passing in redundant repository
@@ commit-graph.c: int open_commit_graph_chain(const char *chain_file,
int *incomplete_chain)
{
@@ commit-graph.c: struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+ int i = 0, valid = 1, count;
FILE *fp = xfdopen(fd, "r");
- size_t count;
- count = st->st_size / (r->hash_algo->hexsz + 1);
+ count = st->st_size / (odb->repo->hash_algo->hexsz + 1);
@@ commit-graph.c: struct commit_graph *load_commit_graph_chain_fd_st(struct reposi
- odb_prepare_alternates(r->objects);
+ odb_prepare_alternates(odb);
- for (size_t i = 0; i < count; i++) {
+ for (i = 0; i < count; i++) {
struct odb_source *source;
@@ commit-graph.c: struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
if (strbuf_getline_lf(&line, fp) == EOF)
---
base-commit: e813a0200a7121b97fec535f0d0b460b0a33356c
change-id: 20250717-b4-pks-commit-graph-wo-the-repository-1dc2cacbc8e3
* [PATCH v4 1/6] commit-graph: stop using `the_hash_algo` via macros
2025-08-15 5:49 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Patrick Steinhardt
@ 2025-08-15 5:49 ` Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 2/6] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
` (5 subsequent siblings)
6 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-15 5:49 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
We have two macros `GRAPH_DATA_WIDTH` and `GRAPH_MIN_SIZE` that compute
hash-dependent sizes. They do so by using the global `the_hash_algo`
variable though, which we want to get rid of over time.
Convert these macros into functions that accept the hash algorithm as
input parameter. Adapt callers accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
commit-graph.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index bd7b6f5338..ecd50ea3ff 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -53,8 +53,6 @@ void git_test_write_commit_graph_or_die(void)
#define GRAPH_CHUNKID_BLOOMDATA 0x42444154 /* "BDAT" */
#define GRAPH_CHUNKID_BASE 0x42415345 /* "BASE" */
-#define GRAPH_DATA_WIDTH (the_hash_algo->rawsz + 16)
-
#define GRAPH_VERSION_1 0x1
#define GRAPH_VERSION GRAPH_VERSION_1
@@ -66,8 +64,6 @@ void git_test_write_commit_graph_or_die(void)
#define GRAPH_HEADER_SIZE 8
#define GRAPH_FANOUT_SIZE (4 * 256)
-#define GRAPH_MIN_SIZE (GRAPH_HEADER_SIZE + 4 * CHUNK_TOC_ENTRY_SIZE \
- + GRAPH_FANOUT_SIZE + the_hash_algo->rawsz)
#define CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW (1ULL << 31)
@@ -80,6 +76,16 @@ define_commit_slab(topo_level_slab, uint32_t);
define_commit_slab(commit_pos, int);
static struct commit_pos commit_pos = COMMIT_SLAB_INIT(1, commit_pos);
+static size_t graph_data_width(const struct git_hash_algo *algop)
+{
+ return algop->rawsz + 16;
+}
+
+static size_t graph_min_size(const struct git_hash_algo *algop)
+{
+ return GRAPH_HEADER_SIZE + 4 * CHUNK_TOC_ENTRY_SIZE + GRAPH_FANOUT_SIZE + algop->rawsz;
+}
+
static void set_commit_pos(struct repository *r, const struct object_id *oid)
{
static int32_t max_pos;
@@ -258,7 +264,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_size = xsize_t(st->st_size);
- if (graph_size < GRAPH_MIN_SIZE) {
+ if (graph_size < graph_min_size(the_hash_algo)) {
close(fd);
error(_("commit-graph file is too small"));
return NULL;
@@ -314,7 +320,7 @@ static int graph_read_commit_data(const unsigned char *chunk_start,
size_t chunk_size, void *data)
{
struct commit_graph *g = data;
- if (chunk_size / GRAPH_DATA_WIDTH != g->num_commits)
+ if (chunk_size / graph_data_width(the_hash_algo) != g->num_commits)
return error(_("commit-graph commit data chunk is wrong size"));
g->chunk_commit_data = chunk_start;
return 0;
@@ -379,7 +385,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
if (!graph_map)
return NULL;
- if (graph_size < GRAPH_MIN_SIZE)
+ if (graph_size < graph_min_size(the_hash_algo))
return NULL;
data = (const unsigned char *)graph_map;
@@ -900,7 +906,7 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
die(_("invalid commit position. commit-graph is likely corrupt"));
lex_index = pos - g->num_commits_in_base;
- commit_data = g->chunk_commit_data + st_mult(GRAPH_DATA_WIDTH, lex_index);
+ commit_data = g->chunk_commit_data + st_mult(graph_data_width(the_hash_algo), lex_index);
graph_data = commit_graph_data_at(item);
graph_data->graph_pos = pos;
@@ -1104,7 +1110,8 @@ static struct tree *load_tree_for_commit(struct repository *r,
g = g->base_graph;
commit_data = g->chunk_commit_data +
- st_mult(GRAPH_DATA_WIDTH, graph_pos - g->num_commits_in_base);
+ st_mult(graph_data_width(the_hash_algo),
+ graph_pos - g->num_commits_in_base);
oidread(&oid, commit_data, the_repository->hash_algo);
set_commit_tree(c, lookup_tree(r, &oid));
--
2.51.0.rc1.215.g0f929dcec7.dirty
* [PATCH v4 2/6] commit-graph: store the hash algorithm instead of its length
2025-08-15 5:49 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 1/6] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
@ 2025-08-15 5:49 ` Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 3/6] commit-graph: refactor `parse_commit_graph()` to take a repository Patrick Steinhardt
` (4 subsequent siblings)
6 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-15 5:49 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
The commit-graph stores the length of the hash algorithm it uses. In
subsequent commits we'll need to pass the whole hash algorithm around
though, which we currently don't have access to.
Refactor the code so that we store the hash algorithm instead of only
its size.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
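As a rough illustration of the refactoring described above, the
following standalone sketch stores a pointer to the hash algorithm in
the graph structure and derives every field offset of a commit data
record from `hash_algo->rawsz`. The offsets (+0, +4, +8, +12 past the
hash) and the bit split of the third word mirror the reads visible in
the diff; all `sketch_*` names are hypothetical and the record layout
is simplified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; only the fields needed here are modeled. */
struct sketch_hash_algo {
	size_t rawsz; /* raw hash size in bytes */
};

struct sketch_graph {
	const struct sketch_hash_algo *hash_algo; /* replaces a bare hash_len */
};

struct sketch_record {
	uint32_t parent1;
	uint32_t parent2;
	uint32_t generation;
	uint64_t date;
};

/* Read a big-endian 32-bit integer from an unaligned buffer. */
static uint32_t sketch_get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decode the fixed fields that follow the hash in one commit data record. */
static struct sketch_record sketch_decode_record(const struct sketch_graph *g,
						  const unsigned char *commit_data)
{
	struct sketch_record rec;
	size_t rawsz;
	uint32_t word;

	assert(g != NULL && g->hash_algo != NULL);
	rawsz = g->hash_algo->rawsz;

	rec.parent1 = sketch_get_be32(commit_data + rawsz);
	rec.parent2 = sketch_get_be32(commit_data + rawsz + 4);

	/* Upper 30 bits: generation; lower 2 bits: high bits of the date. */
	word = sketch_get_be32(commit_data + rawsz + 8);
	rec.generation = word >> 2;
	rec.date = ((uint64_t)(word & 0x3) << 32) |
		   sketch_get_be32(commit_data + rawsz + 12);
	return rec;
}
```

Keeping the whole algorithm around costs one pointer per graph, but it
lets later code pass `g->hash_algo` to helpers that need more than the
raw size.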
commit-graph.c | 36 ++++++++++++++++++------------------
commit-graph.h | 2 +-
2 files changed, 19 insertions(+), 19 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index ecd50ea3ff..5053d12534 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -311,7 +311,7 @@ static int graph_read_oid_lookup(const unsigned char *chunk_start,
{
struct commit_graph *g = data;
g->chunk_oid_lookup = chunk_start;
- if (chunk_size / g->hash_len != g->num_commits)
+ if (chunk_size / g->hash_algo->rawsz != g->num_commits)
return error(_("commit-graph OID lookup chunk is the wrong size"));
return 0;
}
@@ -413,7 +413,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
graph = alloc_commit_graph();
- graph->hash_len = the_hash_algo->rawsz;
+ graph->hash_algo = the_hash_algo;
graph->num_chunks = *(unsigned char*)(data + 6);
graph->data = graph_map;
graph->data_len = graph_size;
@@ -478,7 +478,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
FREE_AND_NULL(graph->bloom_filter_settings);
}
- oidread(&graph->oid, graph->data + graph->data_len - graph->hash_len,
+ oidread(&graph->oid, graph->data + graph->data_len - graph->hash_algo->rawsz,
the_repository->hash_algo);
free_chunkfile(cf);
@@ -584,7 +584,7 @@ static int add_graph_to_chain(struct commit_graph *g,
return 0;
}
- if (g->chunk_base_graphs_size / g->hash_len < n) {
+ if (g->chunk_base_graphs_size / g->hash_algo->rawsz < n) {
warning(_("commit-graph base graphs chunk is too small"));
return 0;
}
@@ -594,7 +594,7 @@ static int add_graph_to_chain(struct commit_graph *g,
if (!cur_g ||
!oideq(&oids[n], &cur_g->oid) ||
- !hasheq(oids[n].hash, g->chunk_base_graphs + st_mult(g->hash_len, n),
+ !hasheq(oids[n].hash, g->chunk_base_graphs + st_mult(g->hash_algo->rawsz, n),
the_repository->hash_algo)) {
warning(_("commit-graph chain does not match"));
return 0;
@@ -805,7 +805,7 @@ int generation_numbers_enabled(struct repository *r)
return 0;
first_generation = get_be32(g->chunk_commit_data +
- g->hash_len + 8) >> 2;
+ g->hash_algo->rawsz + 8) >> 2;
return !!first_generation;
}
@@ -849,7 +849,7 @@ void close_commit_graph(struct object_database *o)
static int bsearch_graph(struct commit_graph *g, const struct object_id *oid, uint32_t *pos)
{
return bsearch_hash(oid->hash, g->chunk_oid_fanout,
- g->chunk_oid_lookup, g->hash_len, pos);
+ g->chunk_oid_lookup, g->hash_algo->rawsz, pos);
}
static void load_oid_from_graph(struct commit_graph *g,
@@ -869,7 +869,7 @@ static void load_oid_from_graph(struct commit_graph *g,
lex_index = pos - g->num_commits_in_base;
- oidread(oid, g->chunk_oid_lookup + st_mult(g->hash_len, lex_index),
+ oidread(oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, lex_index),
the_repository->hash_algo);
}
@@ -911,8 +911,8 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
graph_data = commit_graph_data_at(item);
graph_data->graph_pos = pos;
- date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
- date_low = get_be32(commit_data + g->hash_len + 12);
+ date_high = get_be32(commit_data + g->hash_algo->rawsz + 8) & 0x3;
+ date_low = get_be32(commit_data + g->hash_algo->rawsz + 12);
item->date = (timestamp_t)((date_high << 32) | date_low);
if (g->read_generation_data) {
@@ -930,10 +930,10 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
} else
graph_data->generation = item->date + offset;
} else
- graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
+ graph_data->generation = get_be32(commit_data + g->hash_algo->rawsz + 8) >> 2;
if (g->topo_levels)
- *topo_level_slab_at(g->topo_levels, item) = get_be32(commit_data + g->hash_len + 8) >> 2;
+ *topo_level_slab_at(g->topo_levels, item) = get_be32(commit_data + g->hash_algo->rawsz + 8) >> 2;
}
static inline void set_commit_tree(struct commit *c, struct tree *t)
@@ -957,7 +957,7 @@ static int fill_commit_in_graph(struct repository *r,
fill_commit_graph_info(item, g, pos);
lex_index = pos - g->num_commits_in_base;
- commit_data = g->chunk_commit_data + st_mult(g->hash_len + 16, lex_index);
+ commit_data = g->chunk_commit_data + st_mult(g->hash_algo->rawsz + 16, lex_index);
item->object.parsed = 1;
@@ -965,12 +965,12 @@ static int fill_commit_in_graph(struct repository *r,
pptr = &item->parents;
- edge_value = get_be32(commit_data + g->hash_len);
+ edge_value = get_be32(commit_data + g->hash_algo->rawsz);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
pptr = insert_parent_or_die(r, g, edge_value, pptr);
- edge_value = get_be32(commit_data + g->hash_len + 4);
+ edge_value = get_be32(commit_data + g->hash_algo->rawsz + 4);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
if (!(edge_value & GRAPH_EXTRA_EDGES_NEEDED)) {
@@ -2623,7 +2623,7 @@ int write_commit_graph(struct odb_source *source,
struct commit_graph *g = ctx.r->objects->commit_graph;
for (i = 0; i < g->num_commits; i++) {
struct object_id oid;
- oidread(&oid, g->chunk_oid_lookup + st_mult(g->hash_len, i),
+ oidread(&oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
the_repository->hash_algo);
oid_array_append(&ctx.oids, &oid);
}
@@ -2754,7 +2754,7 @@ static int verify_one_commit_graph(struct repository *r,
for (i = 0; i < g->num_commits; i++) {
struct commit *graph_commit;
- oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_len, i),
+ oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
the_repository->hash_algo);
if (i && oidcmp(&prev_oid, &cur_oid) >= 0)
@@ -2799,7 +2799,7 @@ static int verify_one_commit_graph(struct repository *r,
timestamp_t generation;
display_progress(progress, ++(*seen));
- oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_len, i),
+ oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
the_repository->hash_algo);
graph_commit = lookup_commit(r, &cur_oid);
diff --git a/commit-graph.h b/commit-graph.h
index 78ab7b875b..7dc1f2b22b 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -84,7 +84,7 @@ struct commit_graph {
const unsigned char *data;
size_t data_len;
- unsigned char hash_len;
+ const struct git_hash_algo *hash_algo;
unsigned char num_chunks;
uint32_t num_commits;
struct object_id oid;
--
2.51.0.rc1.215.g0f929dcec7.dirty
* [PATCH v4 3/6] commit-graph: refactor `parse_commit_graph()` to take a repository
2025-08-15 5:49 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 1/6] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 2/6] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
@ 2025-08-15 5:49 ` Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 4/6] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
` (3 subsequent siblings)
6 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-15 5:49 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
Refactor `parse_commit_graph()` so that it takes a repository instead of
taking repository settings. On the one hand this allows us to get rid of
instances where we access `the_hash_algo` by using the repository's hash
algorithm instead. On the other hand it also allows us to move the call
of `prepare_repo_settings()` into the function itself.
Note that there's one small catch, as the commit-graph fuzzer calls this
function directly without having a fully functional repository at hand.
And while the fuzzer already initializes `the_repository` with relevant
info, the call to `prepare_repo_settings()` would fail because we don't
have a fully-initialized repository.
Work around the issue by also setting `settings.initialized` to pretend
that we've already read the settings.
While at it, remove the redundant `parse_commit_graph()` declaration in
the fuzzer. It was added together with aa658574bf (commit-graph, fuzz:
add fuzzer for commit-graph, 2019-01-15), but as we also declared the
same function in "commit-graph.h" it wasn't ever needed.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
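The fuzzer workaround described above relies on the settings loader
being a no-op once the `initialized` flag is set. A minimal sketch of
that guard pattern follows. It only models the behavior the commit
message describes and is not Git's implementation: the `sketch_*`
types, the `gitdir` check, the default values and the -1 return code
are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced model of repository settings. */
struct sketch_settings {
	int initialized;
	int commit_graph_generation_version;
	int commit_graph_changed_paths_version;
};

struct sketch_repo {
	const char *gitdir; /* NULL when there is no repository on disk */
	struct sketch_settings settings;
};

/*
 * Load settings once. Returns 0 on success and -1 when the repository is
 * not set up well enough to read its configuration.
 */
static int sketch_prepare_settings(struct sketch_repo *r)
{
	assert(r != NULL);

	/* Already populated, e.g. by hand in a fuzzer: leave values alone. */
	if (r->settings.initialized)
		return 0;

	if (!r->gitdir)
		return -1;

	/* Placeholder defaults; a real loader would consult the config. */
	r->settings.commit_graph_generation_version = 2;
	r->settings.commit_graph_changed_paths_version = 1;
	r->settings.initialized = 1;
	return 0;
}
```

A harness without an on-disk repository can thus fill in the fields it
cares about, set `initialized`, and call code that prepares settings
internally without tripping the setup check.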
commit-graph.c | 23 ++++++++++++-----------
commit-graph.h | 2 +-
oss-fuzz/fuzz-commit-graph.c | 6 ++----
3 files changed, 15 insertions(+), 16 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 5053d125340..2f314a7407e 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -271,9 +271,8 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
}
graph_map = xmmap(NULL, graph_size, PROT_READ, MAP_PRIVATE, fd, 0);
close(fd);
- prepare_repo_settings(r);
- ret = parse_commit_graph(&r->settings, graph_map, graph_size);
+ ret = parse_commit_graph(r, graph_map, graph_size);
if (ret)
ret->odb_source = source;
else
@@ -373,7 +372,7 @@ static int graph_read_bloom_data(const unsigned char *chunk_start,
return 0;
}
-struct commit_graph *parse_commit_graph(struct repo_settings *s,
+struct commit_graph *parse_commit_graph(struct repository *r,
void *graph_map, size_t graph_size)
{
const unsigned char *data;
@@ -385,7 +384,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
if (!graph_map)
return NULL;
- if (graph_size < graph_min_size(the_hash_algo))
+ if (graph_size < graph_min_size(r->hash_algo))
return NULL;
data = (const unsigned char *)graph_map;
@@ -405,22 +404,22 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
}
hash_version = *(unsigned char*)(data + 5);
- if (hash_version != oid_version(the_hash_algo)) {
+ if (hash_version != oid_version(r->hash_algo)) {
error(_("commit-graph hash version %X does not match version %X"),
- hash_version, oid_version(the_hash_algo));
+ hash_version, oid_version(r->hash_algo));
return NULL;
}
graph = alloc_commit_graph();
- graph->hash_algo = the_hash_algo;
+ graph->hash_algo = r->hash_algo;
graph->num_chunks = *(unsigned char*)(data + 6);
graph->data = graph_map;
graph->data_len = graph_size;
if (graph_size < GRAPH_HEADER_SIZE +
(graph->num_chunks + 1) * CHUNK_TOC_ENTRY_SIZE +
- GRAPH_FANOUT_SIZE + the_hash_algo->rawsz) {
+ GRAPH_FANOUT_SIZE + r->hash_algo->rawsz) {
error(_("commit-graph file is too small to hold %u chunks"),
graph->num_chunks);
free(graph);
@@ -451,7 +450,9 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
pair_chunk(cf, GRAPH_CHUNKID_BASE, &graph->chunk_base_graphs,
&graph->chunk_base_graphs_size);
- if (s->commit_graph_generation_version >= 2) {
+ prepare_repo_settings(r);
+
+ if (r->settings.commit_graph_generation_version >= 2) {
read_chunk(cf, GRAPH_CHUNKID_GENERATION_DATA,
graph_read_generation_data, graph);
pair_chunk(cf, GRAPH_CHUNKID_GENERATION_DATA_OVERFLOW,
@@ -462,7 +463,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
graph->read_generation_data = 1;
}
- if (s->commit_graph_changed_paths_version) {
+ if (r->settings.commit_graph_changed_paths_version) {
read_chunk(cf, GRAPH_CHUNKID_BLOOMINDEXES,
graph_read_bloom_index, graph);
read_chunk(cf, GRAPH_CHUNKID_BLOOMDATA,
@@ -479,7 +480,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
}
oidread(&graph->oid, graph->data + graph->data_len - graph->hash_algo->rawsz,
- the_repository->hash_algo);
+ r->hash_algo);
free_chunkfile(cf);
return graph;
diff --git a/commit-graph.h b/commit-graph.h
index 7dc1f2b22bd..7bbc69989ce 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -128,7 +128,7 @@ struct repo_settings;
* Callers should initialize the repo_settings with prepare_repo_settings()
* prior to calling parse_commit_graph().
*/
-struct commit_graph *parse_commit_graph(struct repo_settings *s,
+struct commit_graph *parse_commit_graph(struct repository *r,
void *graph_map, size_t graph_size);
/*
diff --git a/oss-fuzz/fuzz-commit-graph.c b/oss-fuzz/fuzz-commit-graph.c
index fbb77fec197..fb8b8787a46 100644
--- a/oss-fuzz/fuzz-commit-graph.c
+++ b/oss-fuzz/fuzz-commit-graph.c
@@ -4,9 +4,6 @@
#include "commit-graph.h"
#include "repository.h"
-struct commit_graph *parse_commit_graph(struct repo_settings *s,
- void *graph_map, size_t graph_size);
-
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size);
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
@@ -22,9 +19,10 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
* possible.
*/
repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+ the_repository->settings.initialized = 1;
the_repository->settings.commit_graph_generation_version = 2;
the_repository->settings.commit_graph_changed_paths_version = 1;
- g = parse_commit_graph(&the_repository->settings, (void *)data, size);
+ g = parse_commit_graph(the_repository, (void *)data, size);
repo_clear(the_repository);
free_commit_graph(g);
--
2.51.0.rc1.215.g0f929dcec7.dirty
* [PATCH v4 4/6] commit-graph: stop using `the_hash_algo`
2025-08-15 5:49 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Patrick Steinhardt
` (2 preceding siblings ...)
2025-08-15 5:49 ` [PATCH v4 3/6] commit-graph: refactor `parse_commit_graph()` to take a repository Patrick Steinhardt
@ 2025-08-15 5:49 ` Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 5/6] commit-graph: stop using `the_repository` Patrick Steinhardt
` (2 subsequent siblings)
6 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-15 5:49 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
Stop using `the_hash_algo` as it implicitly relies on `the_repository`.
Instead, we either use the hash algo provided via the context or, if
there is no such hash algo, we use `the_repository` explicitly. Such
uses will be removed in subsequent commits.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
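One place where the hash algorithm now arrives via a parameter is the
size handling for the commit-graph chain file. The sketch below shows
that arithmetic in isolation, assuming one hex hash plus a newline per
line as the diff suggests. `sketch_chain_entry_count()` is a
hypothetical helper, not a function from the patch, and it collapses
the "empty file" and "too small" cases into a single -1 for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for `struct git_hash_algo`; only `hexsz` is modeled. */
struct sketch_hash_algo {
	size_t hexsz; /* hash length in hexadecimal characters */
};

/*
 * Return the number of chain entries that fit into a file of `file_size`
 * bytes, or -1 when the file cannot hold even a single hex hash.
 */
static long sketch_chain_entry_count(size_t file_size,
				     const struct sketch_hash_algo *algop)
{
	assert(algop != NULL);

	if (file_size < algop->hexsz)
		return -1;

	/* Each line holds one hex hash followed by a newline. */
	return (long)(file_size / (algop->hexsz + 1));
}
```

With a 40-character hex hash, an 82-byte file holds two entries; with a
64-character hex hash, two entries need 130 bytes. Passing the
algorithm explicitly is what makes both cases expressible without a
global.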
builtin/commit-graph.c | 3 ++-
commit-graph.c | 27 ++++++++++++++-------------
commit-graph.h | 3 ++-
3 files changed, 18 insertions(+), 15 deletions(-)
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 25018a0b9d..fa6dd34d8d 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -108,7 +108,8 @@ static int graph_verify(int argc, const char **argv, const char *prefix,
opened = OPENED_GRAPH;
else if (errno != ENOENT)
die_errno(_("Could not open commit-graph '%s'"), graph_name);
- else if (open_commit_graph_chain(chain_name, &fd, &st))
+ else if (open_commit_graph_chain(chain_name, &fd, &st,
+ the_repository->hash_algo))
opened = OPENED_CHAIN;
else if (errno != ENOENT)
die_errno(_("could not open commit-graph chain '%s'"), chain_name);
diff --git a/commit-graph.c b/commit-graph.c
index 2f314a7407..46f55c8bb4 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -264,7 +264,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_size = xsize_t(st->st_size);
- if (graph_size < graph_min_size(the_hash_algo)) {
+ if (graph_size < graph_min_size(r->hash_algo)) {
close(fd);
error(_("commit-graph file is too small"));
return NULL;
@@ -319,7 +319,7 @@ static int graph_read_commit_data(const unsigned char *chunk_start,
size_t chunk_size, void *data)
{
struct commit_graph *g = data;
- if (chunk_size / graph_data_width(the_hash_algo) != g->num_commits)
+ if (chunk_size / graph_data_width(g->hash_algo) != g->num_commits)
return error(_("commit-graph commit data chunk is wrong size"));
g->chunk_commit_data = chunk_start;
return 0;
@@ -620,7 +620,8 @@ static int add_graph_to_chain(struct commit_graph *g,
}
int open_commit_graph_chain(const char *chain_file,
- int *fd, struct stat *st)
+ int *fd, struct stat *st,
+ const struct git_hash_algo *hash_algo)
{
*fd = git_open(chain_file);
if (*fd < 0)
@@ -629,7 +630,7 @@ int open_commit_graph_chain(const char *chain_file,
close(*fd);
return 0;
}
- if (st->st_size < the_hash_algo->hexsz) {
+ if (st->st_size < hash_algo->hexsz) {
close(*fd);
if (!st->st_size) {
/* treat empty files the same as missing */
@@ -653,7 +654,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
int i = 0, valid = 1, count;
FILE *fp = xfdopen(fd, "r");
- count = st->st_size / (the_hash_algo->hexsz + 1);
+ count = st->st_size / (r->hash_algo->hexsz + 1);
CALLOC_ARRAY(oids, count);
odb_prepare_alternates(r->objects);
@@ -715,7 +716,7 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r,
int fd;
struct commit_graph *g = NULL;
- if (open_commit_graph_chain(chain_file, &fd, &st)) {
+ if (open_commit_graph_chain(chain_file, &fd, &st, r->hash_algo)) {
int incomplete;
/* ownership of fd is taken over by load function */
g = load_commit_graph_chain_fd_st(r, fd, &st, &incomplete);
@@ -907,7 +908,7 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
die(_("invalid commit position. commit-graph is likely corrupt"));
lex_index = pos - g->num_commits_in_base;
- commit_data = g->chunk_commit_data + st_mult(graph_data_width(the_hash_algo), lex_index);
+ commit_data = g->chunk_commit_data + st_mult(graph_data_width(g->hash_algo), lex_index);
graph_data = commit_graph_data_at(item);
graph_data->graph_pos = pos;
@@ -1111,7 +1112,7 @@ static struct tree *load_tree_for_commit(struct repository *r,
g = g->base_graph;
commit_data = g->chunk_commit_data +
- st_mult(graph_data_width(the_hash_algo),
+ st_mult(graph_data_width(g->hash_algo),
graph_pos - g->num_commits_in_base);
oidread(&oid, commit_data, the_repository->hash_algo);
@@ -1220,7 +1221,7 @@ static int write_graph_chunk_oids(struct hashfile *f,
int count;
for (count = 0; count < ctx->commits.nr; count++, list++) {
display_progress(ctx->progress, ++ctx->progress_cnt);
- hashwrite(f, (*list)->object.oid.hash, the_hash_algo->rawsz);
+ hashwrite(f, (*list)->object.oid.hash, f->algop->rawsz);
}
return 0;
@@ -1251,7 +1252,7 @@ static int write_graph_chunk_data(struct hashfile *f,
die(_("unable to parse commit %s"),
oid_to_hex(&(*list)->object.oid));
tree = get_commit_tree_oid(*list);
- hashwrite(f, tree->hash, the_hash_algo->rawsz);
+ hashwrite(f, tree->hash, ctx->r->hash_algo->rawsz);
parent = (*list)->parents;
@@ -2034,7 +2035,7 @@ static int write_graph_chunk_base_1(struct hashfile *f,
return 0;
num = write_graph_chunk_base_1(f, g->base_graph);
- hashwrite(f, g->oid.hash, the_hash_algo->rawsz);
+ hashwrite(f, g->oid.hash, g->hash_algo->rawsz);
return num + 1;
}
@@ -2058,7 +2059,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
struct hashfile *f;
struct tempfile *graph_layer; /* when ctx->split is non-zero */
struct lock_file lk = LOCK_INIT;
- const unsigned hashsz = the_hash_algo->rawsz;
+ const unsigned hashsz = ctx->r->hash_algo->rawsz;
struct strbuf progress_title = STRBUF_INIT;
struct chunkfile *cf;
unsigned char file_hash[GIT_MAX_RAWSZ];
@@ -2146,7 +2147,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
hashwrite_be32(f, GRAPH_SIGNATURE);
hashwrite_u8(f, GRAPH_VERSION);
- hashwrite_u8(f, oid_version(the_hash_algo));
+ hashwrite_u8(f, oid_version(ctx->r->hash_algo));
hashwrite_u8(f, get_num_chunks(cf));
hashwrite_u8(f, ctx->num_commit_graphs_after - 1);
diff --git a/commit-graph.h b/commit-graph.h
index 7bbc69989c..df10daf01c 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -32,7 +32,8 @@ struct string_list;
char *get_commit_graph_filename(struct odb_source *source);
char *get_commit_graph_chain_filename(struct odb_source *source);
int open_commit_graph(const char *graph_file, int *fd, struct stat *st);
-int open_commit_graph_chain(const char *chain_file, int *fd, struct stat *st);
+int open_commit_graph_chain(const char *chain_file, int *fd, struct stat *st,
+ const struct git_hash_algo *hash_algo);
/*
* Given a commit struct, try to fill the commit struct info, including:
--
2.51.0.rc1.215.g0f929dcec7.dirty
* [PATCH v4 5/6] commit-graph: stop using `the_repository`
2025-08-15 5:49 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Patrick Steinhardt
` (3 preceding siblings ...)
2025-08-15 5:49 ` [PATCH v4 4/6] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
@ 2025-08-15 5:49 ` Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 6/6] commit-graph: stop passing in redundant repository Patrick Steinhardt
2025-08-15 15:17 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Derrick Stolee
6 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-15 5:49 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
There are still a number of uses of `the_repository` in "commit-graph.c",
which we want to stop using because it is a global variable. Refactor
the code to stop using `the_repository` in favor of the repository
provided via the calling context.
This allows us to drop the `USE_THE_REPOSITORY_VARIABLE` macro.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
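Several hunks in this patch follow the same pattern: a callback that
used to reach for a global now receives the repository through its
callback data. A minimal standalone sketch of that pattern is shown
here. Every `sketch_*` name is hypothetical, and the ref iteration is
reduced to a loop over an array of strings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced repository; counts lookups for demonstration. */
struct sketch_repo {
	const char *name;
	int objects_seen;
};

/* Callback payload: the repository travels with the rest of the state. */
struct sketch_cb_data {
	struct sketch_repo *repo;
	int commits;
};

typedef int (*sketch_each_ref_fn)(const char *refname, void *cb_data);

/* Uses only the repository handed in via `cb_data`, never a global. */
static int sketch_add_ref(const char *refname, void *cb_data)
{
	struct sketch_cb_data *data = cb_data;

	(void)refname;
	data->repo->objects_seen++;
	data->commits++;
	return 0;
}

/* Invoke `fn` for each ref name; stop early on a non-zero return. */
static int sketch_for_each_ref(const char **refs, size_t nr,
			       sketch_each_ref_fn fn, void *cb_data)
{
	size_t i;

	assert(fn != NULL);
	for (i = 0; i < nr; i++) {
		int ret = fn(refs[i], cb_data);
		if (ret)
			return ret;
	}
	return 0;
}
```

The caller decides which repository the callback operates on by filling
in the payload, so two repositories can be processed in the same
process without interfering with each other.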
builtin/commit.c | 2 +-
builtin/merge.c | 2 +-
commit-graph.c | 76 +++++++++++++++++++++++++++++---------------------------
commit-graph.h | 2 +-
4 files changed, 42 insertions(+), 40 deletions(-)
diff --git a/builtin/commit.c b/builtin/commit.c
index 63e7158e98..8ca0aede48 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1933,7 +1933,7 @@ int cmd_commit(int argc,
"new index file. Check that disk is not full and quota is\n"
"not exceeded, and then \"git restore --staged :/\" to recover."));
- git_test_write_commit_graph_or_die();
+ git_test_write_commit_graph_or_die(the_repository->objects->sources);
repo_rerere(the_repository, 0);
run_auto_maintenance(quiet);
diff --git a/builtin/merge.c b/builtin/merge.c
index 18b22c0a26..263cb58471 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1862,7 +1862,7 @@ int cmd_merge(int argc,
if (squash) {
finish(head_commit, remoteheads, NULL, NULL);
- git_test_write_commit_graph_or_die();
+ git_test_write_commit_graph_or_die(the_repository->objects->sources);
} else
write_merge_state(remoteheads);
diff --git a/commit-graph.c b/commit-graph.c
index 46f55c8bb4..180b826239 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1,4 +1,3 @@
-#define USE_THE_REPOSITORY_VARIABLE
#define DISABLE_SIGN_COMPARE_WARNINGS
#include "git-compat-util.h"
@@ -28,7 +27,7 @@
#include "tree.h"
#include "chunk-format.h"
-void git_test_write_commit_graph_or_die(void)
+void git_test_write_commit_graph_or_die(struct odb_source *source)
{
int flags = 0;
if (!git_env_bool(GIT_TEST_COMMIT_GRAPH, 0))
@@ -37,8 +36,7 @@ void git_test_write_commit_graph_or_die(void)
if (git_env_bool(GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS, 0))
flags = COMMIT_GRAPH_WRITE_BLOOM_FILTERS;
- if (write_commit_graph_reachable(the_repository->objects->sources,
- flags, NULL))
+ if (write_commit_graph_reachable(source, flags, NULL))
die("failed to write commit-graph under GIT_TEST_COMMIT_GRAPH");
}
@@ -596,7 +594,7 @@ static int add_graph_to_chain(struct commit_graph *g,
if (!cur_g ||
!oideq(&oids[n], &cur_g->oid) ||
!hasheq(oids[n].hash, g->chunk_base_graphs + st_mult(g->hash_algo->rawsz, n),
- the_repository->hash_algo)) {
+ g->hash_algo)) {
warning(_("commit-graph chain does not match"));
return 0;
}
@@ -665,7 +663,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
if (strbuf_getline_lf(&line, fp) == EOF)
break;
- if (get_oid_hex(line.buf, &oids[i])) {
+ if (get_oid_hex_algop(line.buf, &oids[i], r->hash_algo)) {
warning(_("invalid commit-graph chain: line '%s' not a hash"),
line.buf);
valid = 0;
@@ -751,7 +749,7 @@ static void prepare_commit_graph_one(struct repository *r,
* Return 1 if commit_graph is non-NULL, and 0 otherwise.
*
* On the first invocation, this function attempts to load the commit
- * graph if the_repository is configured to have one.
+ * graph if the repository is configured to have one.
*/
static int prepare_commit_graph(struct repository *r)
{
@@ -872,7 +870,7 @@ static void load_oid_from_graph(struct commit_graph *g,
lex_index = pos - g->num_commits_in_base;
oidread(oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, lex_index),
- the_repository->hash_algo);
+ g->hash_algo);
}
static struct commit_list **insert_parent_or_die(struct repository *r,
@@ -1115,7 +1113,7 @@ static struct tree *load_tree_for_commit(struct repository *r,
st_mult(graph_data_width(g->hash_algo),
graph_pos - g->num_commits_in_base);
- oidread(&oid, commit_data, the_repository->hash_algo);
+ oidread(&oid, commit_data, g->hash_algo);
set_commit_tree(c, lookup_tree(r, &oid));
return c->maybe_tree;
@@ -1542,7 +1540,7 @@ static void close_reachable(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Loading known commits in commit graph"),
ctx->oids.nr);
for (i = 0; i < ctx->oids.nr; i++) {
@@ -1560,7 +1558,7 @@ static void close_reachable(struct write_commit_graph_context *ctx)
*/
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Expanding reachable commits in commit graph"),
0);
for (i = 0; i < ctx->oids.nr; i++) {
@@ -1581,7 +1579,7 @@ static void close_reachable(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Clearing commit marks in commit graph"),
ctx->oids.nr);
for (i = 0; i < ctx->oids.nr; i++) {
@@ -1699,7 +1697,7 @@ static void compute_topological_levels(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
info.progress = ctx->progress
= start_delayed_progress(
- the_repository,
+ ctx->r,
_("Computing commit graph topological levels"),
ctx->commits.nr);
@@ -1734,7 +1732,7 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
info.progress = ctx->progress
= start_delayed_progress(
- the_repository,
+ ctx->r,
_("Computing commit graph generation numbers"),
ctx->commits.nr);
@@ -1811,7 +1809,7 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Computing commit changed paths Bloom filters"),
ctx->commits.nr);
@@ -1857,6 +1855,7 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
}
struct refs_cb_data {
+ struct repository *repo;
struct oidset *commits;
struct progress *progress;
};
@@ -1869,9 +1868,9 @@ static int add_ref_to_set(const char *refname UNUSED,
struct object_id peeled;
struct refs_cb_data *data = (struct refs_cb_data *)cb_data;
- if (!peel_iterated_oid(the_repository, oid, &peeled))
+ if (!peel_iterated_oid(data->repo, oid, &peeled))
oid = &peeled;
- if (odb_read_object_info(the_repository->objects, oid, NULL) == OBJ_COMMIT)
+ if (odb_read_object_info(data->repo->objects, oid, NULL) == OBJ_COMMIT)
oidset_insert(data->commits, oid);
display_progress(data->progress, oidset_size(data->commits));
@@ -1888,13 +1887,15 @@ int write_commit_graph_reachable(struct odb_source *source,
int result;
memset(&data, 0, sizeof(data));
+ data.repo = source->odb->repo;
data.commits = &commits;
+
if (flags & COMMIT_GRAPH_WRITE_PROGRESS)
data.progress = start_delayed_progress(
- the_repository,
+ source->odb->repo,
_("Collecting referenced commits"), 0);
- refs_for_each_ref(get_main_ref_store(the_repository), add_ref_to_set,
+ refs_for_each_ref(get_main_ref_store(source->odb->repo), add_ref_to_set,
&data);
stop_progress(&data.progress);
@@ -1923,7 +1924,7 @@ static int fill_oids_from_packs(struct write_commit_graph_context *ctx,
"Finding commits for commit graph in %"PRIuMAX" packs",
pack_indexes->nr),
(uintmax_t)pack_indexes->nr);
- ctx->progress = start_delayed_progress(the_repository,
+ ctx->progress = start_delayed_progress(ctx->r,
progress_title.buf, 0);
ctx->progress_done = 0;
}
@@ -1977,7 +1978,7 @@ static void fill_oids_from_all_packs(struct write_commit_graph_context *ctx)
{
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Finding commits for commit graph among packed objects"),
ctx->approx_nr_objects);
for_each_packed_object(ctx->r, add_packed_commits, ctx,
@@ -1996,7 +1997,7 @@ static void copy_oids_to_commits(struct write_commit_graph_context *ctx)
ctx->num_extra_edges = 0;
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Finding extra edges in commit graph"),
ctx->oids.nr);
oid_array_sort(&ctx->oids);
@@ -2075,7 +2076,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
ctx->graph_name = get_commit_graph_filename(ctx->odb_source);
}
- if (safe_create_leading_directories(the_repository, ctx->graph_name)) {
+ if (safe_create_leading_directories(ctx->r, ctx->graph_name)) {
error(_("unable to create leading directories of %s"),
ctx->graph_name);
return -1;
@@ -2094,18 +2095,18 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
return -1;
}
- if (adjust_shared_perm(the_repository, get_tempfile_path(graph_layer))) {
+ if (adjust_shared_perm(ctx->r, get_tempfile_path(graph_layer))) {
error(_("unable to adjust shared permissions for '%s'"),
get_tempfile_path(graph_layer));
return -1;
}
- f = hashfd(the_repository->hash_algo,
+ f = hashfd(ctx->r->hash_algo,
get_tempfile_fd(graph_layer), get_tempfile_path(graph_layer));
} else {
hold_lock_file_for_update_mode(&lk, ctx->graph_name,
LOCK_DIE_ON_ERROR, 0444);
- f = hashfd(the_repository->hash_algo,
+ f = hashfd(ctx->r->hash_algo,
get_lock_file_fd(&lk), get_lock_file_path(&lk));
}
@@ -2158,7 +2159,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
get_num_chunks(cf)),
get_num_chunks(cf));
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
progress_title.buf,
st_mult(get_num_chunks(cf), ctx->commits.nr));
}
@@ -2216,7 +2217,8 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
}
free(ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1]);
- ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1] = xstrdup(hash_to_hex(file_hash));
+ ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1] =
+ xstrdup(hash_to_hex_algop(file_hash, ctx->r->hash_algo));
final_graph_name = get_split_graph_filename(ctx->odb_source,
ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 1]);
free(ctx->commit_graph_filenames_after[ctx->num_commit_graphs_after - 1]);
@@ -2371,7 +2373,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
if (ctx->report_progress)
ctx->progress = start_delayed_progress(
- the_repository,
+ ctx->r,
_("Scanning merged commits"),
ctx->commits.nr);
@@ -2416,7 +2418,7 @@ static void merge_commit_graphs(struct write_commit_graph_context *ctx)
current_graph_number--;
if (ctx->report_progress)
- ctx->progress = start_delayed_progress(the_repository,
+ ctx->progress = start_delayed_progress(ctx->r,
_("Merging commit-graph"), 0);
merge_commit_graph(ctx, g);
@@ -2519,7 +2521,7 @@ int write_commit_graph(struct odb_source *source,
enum commit_graph_write_flags flags,
const struct commit_graph_opts *opts)
{
- struct repository *r = the_repository;
+ struct repository *r = source->odb->repo;
struct write_commit_graph_context ctx = {
.r = r,
.odb_source = source,
@@ -2619,14 +2621,14 @@ int write_commit_graph(struct odb_source *source,
replace = ctx.opts->split_flags & COMMIT_GRAPH_SPLIT_REPLACE;
}
- ctx.approx_nr_objects = repo_approximate_object_count(the_repository);
+ ctx.approx_nr_objects = repo_approximate_object_count(r);
if (ctx.append && ctx.r->objects->commit_graph) {
struct commit_graph *g = ctx.r->objects->commit_graph;
for (i = 0; i < g->num_commits; i++) {
struct object_id oid;
oidread(&oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
- the_repository->hash_algo);
+ r->hash_algo);
oid_array_append(&ctx.oids, &oid);
}
}
@@ -2734,7 +2736,7 @@ static void graph_report(const char *fmt, ...)
static int commit_graph_checksum_valid(struct commit_graph *g)
{
- return hashfile_checksum_valid(the_repository->hash_algo,
+ return hashfile_checksum_valid(g->hash_algo,
g->data, g->data_len);
}
@@ -2757,7 +2759,7 @@ static int verify_one_commit_graph(struct repository *r,
struct commit *graph_commit;
oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
- the_repository->hash_algo);
+ g->hash_algo);
if (i && oidcmp(&prev_oid, &cur_oid) >= 0)
graph_report(_("commit-graph has incorrect OID order: %s then %s"),
@@ -2802,7 +2804,7 @@ static int verify_one_commit_graph(struct repository *r,
display_progress(progress, ++(*seen));
oidread(&cur_oid, g->chunk_oid_lookup + st_mult(g->hash_algo->rawsz, i),
- the_repository->hash_algo);
+ g->hash_algo);
graph_commit = lookup_commit(r, &cur_oid);
odb_commit = (struct commit *)create_object(r, &cur_oid, alloc_commit_node(r));
@@ -2906,7 +2908,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
if (!(flags & COMMIT_GRAPH_VERIFY_SHALLOW))
total += g->num_commits_in_base;
- progress = start_progress(the_repository,
+ progress = start_progress(r,
_("Verifying commits in commit graph"),
total);
}
diff --git a/commit-graph.h b/commit-graph.h
index df10daf01c..0a67ac9280 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -21,7 +21,7 @@
* call this method oustide of a builtin, and only if you know what
* you are doing!
*/
-void git_test_write_commit_graph_or_die(void);
+void git_test_write_commit_graph_or_die(struct odb_source *source);
struct commit;
struct bloom_filter_settings;
--
2.51.0.rc1.215.g0f929dcec7.dirty
* [PATCH v4 6/6] commit-graph: stop passing in redundant repository
2025-08-15 5:49 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Patrick Steinhardt
` (4 preceding siblings ...)
2025-08-15 5:49 ` [PATCH v4 5/6] commit-graph: stop using `the_repository` Patrick Steinhardt
@ 2025-08-15 5:49 ` Patrick Steinhardt
2025-08-15 15:17 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Derrick Stolee
6 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-08-15 5:49 UTC (permalink / raw)
To: git; +Cc: Taylor Blau, Derrick Stolee, Oswald Buddenhagen, Junio C Hamano
Many of the commit-graph related functions take both a repository and
the object database source (directly or via `struct commit_graph`) for
which we are supposed to load a commit-graph. In the best case the
repository is simply redundant, as the source already contains a
reference to its owning object database, which in turn has a reference
to its repository. In the worst case the two could even mismatch when
a caller passes in a source that doesn't belong to the same repository.
Refactor the code so that we only pass in the object database source in
those cases.
There is one exception though, namely `load_commit_graph_chain_fd_st()`,
which is responsible for loading a commit-graph chain. It is expected
that parts of the commit-graph chain aren't located in the same object
source as the chain file itself, but in a different one. Consequently,
this function doesn't work on the source level but on the database level
instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/commit-graph.c | 6 +--
commit-graph.c | 120 +++++++++++++++++++--------------------------
commit-graph.h | 12 ++---
t/helper/test-read-graph.c | 2 +-
4 files changed, 59 insertions(+), 81 deletions(-)
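The ownership chain described in the commit message can be sketched
outside of Git as follows. This is only an illustration under stated
assumptions: the structs are stripped-down stand-ins for the real ones,
and `source_repo()`, `chain_line_length()`, `find_layer()` and the
`layer` and `hexsz` members are hypothetical names that do not exist in
Git. It shows why a source alone is enough for per-file operations,
while finding a chain layer has to walk every source of the database.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for Git's structures; only the links matter here. */
struct repository;
struct object_database;

struct odb_source {
	struct object_database *odb; /* owning object database */
	struct odb_source *next;     /* next source, e.g. an alternate */
	const char *layer;           /* hypothetical: hash of the one layer stored here */
};

struct object_database {
	struct repository *repo;     /* owning repository */
	struct odb_source *sources;  /* linked list of all sources */
};

struct repository {
	struct object_database *objects;
	size_t hexsz;                /* hypothetical stand-in for hash_algo->hexsz */
};

/* Derive the repository from the source instead of passing both in. */
static struct repository *source_repo(const struct odb_source *source)
{
	assert(source && source->odb);
	return source->odb->repo;
}

/*
 * Source-level operation: everything it needs is reachable from the
 * source. Each chain entry is one hex object ID plus a newline.
 */
static size_t chain_line_length(const struct odb_source *source)
{
	return source_repo(source)->hexsz + 1;
}

/*
 * Database-level operation: a layer named in the chain file may live in
 * any source, so all sources of the database are searched.
 */
static struct odb_source *find_layer(struct object_database *odb,
				     const char *hash)
{
	struct odb_source *source;

	for (source = odb->sources; source; source = source->next)
		if (source->layer && !strcmp(source->layer, hash))
			return source;
	return NULL;
}
```

With a local source and one alternate attached to the same database,
`source_repo()` yields the same repository for both, whereas
`find_layer()` may return either source depending on where the layer
file happens to be stored.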
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index fa6dd34d8d..dbeaeeaed3 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -121,15 +121,15 @@ static int graph_verify(int argc, const char **argv, const char *prefix,
if (opened == OPENED_NONE)
return 0;
else if (opened == OPENED_GRAPH)
- graph = load_commit_graph_one_fd_st(the_repository, fd, &st, source);
+ graph = load_commit_graph_one_fd_st(source, fd, &st);
else
- graph = load_commit_graph_chain_fd_st(the_repository, fd, &st,
+ graph = load_commit_graph_chain_fd_st(the_repository->objects, fd, &st,
&incomplete_chain);
if (!graph)
return 1;
- ret = verify_commit_graph(the_repository, graph, flags);
+ ret = verify_commit_graph(graph, flags);
free_commit_graph(graph);
if (incomplete_chain) {
diff --git a/commit-graph.c b/commit-graph.c
index 180b826239..5aaaa9715a 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -252,9 +252,8 @@ int open_commit_graph(const char *graph_file, int *fd, struct stat *st)
return 1;
}
-struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
- int fd, struct stat *st,
- struct odb_source *source)
+struct commit_graph *load_commit_graph_one_fd_st(struct odb_source *source,
+ int fd, struct stat *st)
{
void *graph_map;
size_t graph_size;
@@ -262,7 +261,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_size = xsize_t(st->st_size);
- if (graph_size < graph_min_size(r->hash_algo)) {
+ if (graph_size < graph_min_size(source->odb->repo->hash_algo)) {
close(fd);
error(_("commit-graph file is too small"));
return NULL;
@@ -270,7 +269,7 @@ struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_map = xmmap(NULL, graph_size, PROT_READ, MAP_PRIVATE, fd, 0);
close(fd);
- ret = parse_commit_graph(r, graph_map, graph_size);
+ ret = parse_commit_graph(source->odb->repo, graph_map, graph_size);
if (ret)
ret->odb_source = source;
else
@@ -490,11 +489,9 @@ struct commit_graph *parse_commit_graph(struct repository *r,
return NULL;
}
-static struct commit_graph *load_commit_graph_one(struct repository *r,
- const char *graph_file,
- struct odb_source *source)
+static struct commit_graph *load_commit_graph_one(struct odb_source *source,
+ const char *graph_file)
{
-
struct stat st;
int fd;
struct commit_graph *g;
@@ -503,19 +500,17 @@ static struct commit_graph *load_commit_graph_one(struct repository *r,
if (!open_ok)
return NULL;
- g = load_commit_graph_one_fd_st(r, fd, &st, source);
-
+ g = load_commit_graph_one_fd_st(source, fd, &st);
if (g)
g->filename = xstrdup(graph_file);
return g;
}
-static struct commit_graph *load_commit_graph_v1(struct repository *r,
- struct odb_source *source)
+static struct commit_graph *load_commit_graph_v1(struct odb_source *source)
{
char *graph_name = get_commit_graph_filename(source);
- struct commit_graph *g = load_commit_graph_one(r, graph_name, source);
+ struct commit_graph *g = load_commit_graph_one(source, graph_name);
free(graph_name);
return g;
@@ -642,7 +637,7 @@ int open_commit_graph_chain(const char *chain_file,
return 1;
}
-struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+struct commit_graph *load_commit_graph_chain_fd_st(struct object_database *odb,
int fd, struct stat *st,
int *incomplete_chain)
{
@@ -652,10 +647,10 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
int i = 0, valid = 1, count;
FILE *fp = xfdopen(fd, "r");
- count = st->st_size / (r->hash_algo->hexsz + 1);
+ count = st->st_size / (odb->repo->hash_algo->hexsz + 1);
CALLOC_ARRAY(oids, count);
- odb_prepare_alternates(r->objects);
+ odb_prepare_alternates(odb);
for (i = 0; i < count; i++) {
struct odb_source *source;
@@ -663,7 +658,7 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
if (strbuf_getline_lf(&line, fp) == EOF)
break;
- if (get_oid_hex_algop(line.buf, &oids[i], r->hash_algo)) {
+ if (get_oid_hex_algop(line.buf, &oids[i], odb->repo->hash_algo)) {
warning(_("invalid commit-graph chain: line '%s' not a hash"),
line.buf);
valid = 0;
@@ -671,9 +666,9 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
}
valid = 0;
- for (source = r->objects->sources; source; source = source->next) {
+ for (source = odb->sources; source; source = source->next) {
char *graph_name = get_split_graph_filename(source, line.buf);
- struct commit_graph *g = load_commit_graph_one(r, graph_name, source);
+ struct commit_graph *g = load_commit_graph_one(source, graph_name);
free(graph_name);
@@ -706,45 +701,33 @@ struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
return graph_chain;
}
-static struct commit_graph *load_commit_graph_chain(struct repository *r,
- struct odb_source *source)
+static struct commit_graph *load_commit_graph_chain(struct odb_source *source)
{
char *chain_file = get_commit_graph_chain_filename(source);
struct stat st;
int fd;
struct commit_graph *g = NULL;
- if (open_commit_graph_chain(chain_file, &fd, &st, r->hash_algo)) {
+ if (open_commit_graph_chain(chain_file, &fd, &st, source->odb->repo->hash_algo)) {
int incomplete;
/* ownership of fd is taken over by load function */
- g = load_commit_graph_chain_fd_st(r, fd, &st, &incomplete);
+ g = load_commit_graph_chain_fd_st(source->odb, fd, &st, &incomplete);
}
free(chain_file);
return g;
}
-struct commit_graph *read_commit_graph_one(struct repository *r,
- struct odb_source *source)
+struct commit_graph *read_commit_graph_one(struct odb_source *source)
{
- struct commit_graph *g = load_commit_graph_v1(r, source);
+ struct commit_graph *g = load_commit_graph_v1(source);
if (!g)
- g = load_commit_graph_chain(r, source);
+ g = load_commit_graph_chain(source);
return g;
}
-static void prepare_commit_graph_one(struct repository *r,
- struct odb_source *source)
-{
-
- if (r->objects->commit_graph)
- return;
-
- r->objects->commit_graph = read_commit_graph_one(r, source);
-}
-
/*
* Return 1 if commit_graph is non-NULL, and 0 otherwise.
*
@@ -785,10 +768,12 @@ static int prepare_commit_graph(struct repository *r)
return 0;
odb_prepare_alternates(r->objects);
- for (source = r->objects->sources;
- !r->objects->commit_graph && source;
- source = source->next)
- prepare_commit_graph_one(r, source);
+ for (source = r->objects->sources; source; source = source->next) {
+ r->objects->commit_graph = read_commit_graph_one(source);
+ if (r->objects->commit_graph)
+ break;
+ }
+
return !!r->objects->commit_graph;
}
@@ -873,8 +858,7 @@ static void load_oid_from_graph(struct commit_graph *g,
g->hash_algo);
}
-static struct commit_list **insert_parent_or_die(struct repository *r,
- struct commit_graph *g,
+static struct commit_list **insert_parent_or_die(struct commit_graph *g,
uint32_t pos,
struct commit_list **pptr)
{
@@ -885,7 +869,7 @@ static struct commit_list **insert_parent_or_die(struct repository *r,
die("invalid parent position %"PRIu32, pos);
load_oid_from_graph(g, pos, &oid);
- c = lookup_commit(r, &oid);
+ c = lookup_commit(g->odb_source->odb->repo, &oid);
if (!c)
die(_("could not find commit %s"), oid_to_hex(&oid));
commit_graph_data_at(c)->graph_pos = pos;
@@ -941,8 +925,7 @@ static inline void set_commit_tree(struct commit *c, struct tree *t)
c->maybe_tree = t;
}
-static int fill_commit_in_graph(struct repository *r,
- struct commit *item,
+static int fill_commit_in_graph(struct commit *item,
struct commit_graph *g, uint32_t pos)
{
uint32_t edge_value;
@@ -968,13 +951,13 @@ static int fill_commit_in_graph(struct repository *r,
edge_value = get_be32(commit_data + g->hash_algo->rawsz);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
- pptr = insert_parent_or_die(r, g, edge_value, pptr);
+ pptr = insert_parent_or_die(g, edge_value, pptr);
edge_value = get_be32(commit_data + g->hash_algo->rawsz + 4);
if (edge_value == GRAPH_PARENT_NONE)
return 1;
if (!(edge_value & GRAPH_EXTRA_EDGES_NEEDED)) {
- pptr = insert_parent_or_die(r, g, edge_value, pptr);
+ pptr = insert_parent_or_die(g, edge_value, pptr);
return 1;
}
@@ -989,7 +972,7 @@ static int fill_commit_in_graph(struct repository *r,
}
edge_value = get_be32(g->chunk_extra_edges +
sizeof(uint32_t) * parent_data_pos);
- pptr = insert_parent_or_die(r, g,
+ pptr = insert_parent_or_die(g,
edge_value & GRAPH_EDGE_LAST_MASK,
pptr);
parent_data_pos++;
@@ -1055,14 +1038,13 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
if (commit->object.parsed)
return commit;
- if (!fill_commit_in_graph(repo, commit, repo->objects->commit_graph, pos))
+ if (!fill_commit_in_graph(commit, repo->objects->commit_graph, pos))
return NULL;
return commit;
}
-static int parse_commit_in_graph_one(struct repository *r,
- struct commit_graph *g,
+static int parse_commit_in_graph_one(struct commit_graph *g,
struct commit *item)
{
uint32_t pos;
@@ -1071,7 +1053,7 @@ static int parse_commit_in_graph_one(struct repository *r,
return 1;
if (find_commit_pos_in_graph(item, g, &pos))
- return fill_commit_in_graph(r, item, g, pos);
+ return fill_commit_in_graph(item, g, pos);
return 0;
}
@@ -1088,7 +1070,7 @@ int parse_commit_in_graph(struct repository *r, struct commit *item)
if (!prepare_commit_graph(r))
return 0;
- return parse_commit_in_graph_one(r, r->objects->commit_graph, item);
+ return parse_commit_in_graph_one(r->objects->commit_graph, item);
}
void load_commit_graph_info(struct repository *r, struct commit *item)
@@ -1098,8 +1080,7 @@ void load_commit_graph_info(struct repository *r, struct commit *item)
fill_commit_graph_info(item, r->objects->commit_graph, pos);
}
-static struct tree *load_tree_for_commit(struct repository *r,
- struct commit_graph *g,
+static struct tree *load_tree_for_commit(struct commit_graph *g,
struct commit *c)
{
struct object_id oid;
@@ -1114,13 +1095,12 @@ static struct tree *load_tree_for_commit(struct repository *r,
graph_pos - g->num_commits_in_base);
oidread(&oid, commit_data, g->hash_algo);
- set_commit_tree(c, lookup_tree(r, &oid));
+ set_commit_tree(c, lookup_tree(g->odb_source->odb->repo, &oid));
return c->maybe_tree;
}
-static struct tree *get_commit_tree_in_graph_one(struct repository *r,
- struct commit_graph *g,
+static struct tree *get_commit_tree_in_graph_one(struct commit_graph *g,
const struct commit *c)
{
if (c->maybe_tree)
@@ -1128,12 +1108,12 @@ static struct tree *get_commit_tree_in_graph_one(struct repository *r,
if (commit_graph_position(c) == COMMIT_NOT_FROM_GRAPH)
BUG("get_commit_tree_in_graph_one called from non-commit-graph commit");
- return load_tree_for_commit(r, g, (struct commit *)c);
+ return load_tree_for_commit(g, (struct commit *)c);
}
struct tree *get_commit_tree_in_graph(struct repository *r, const struct commit *c)
{
- return get_commit_tree_in_graph_one(r, r->objects->commit_graph, c);
+ return get_commit_tree_in_graph_one(r->objects->commit_graph, c);
}
struct packed_commit_list {
@@ -2740,11 +2720,11 @@ static int commit_graph_checksum_valid(struct commit_graph *g)
g->data, g->data_len);
}
-static int verify_one_commit_graph(struct repository *r,
- struct commit_graph *g,
+static int verify_one_commit_graph(struct commit_graph *g,
struct progress *progress,
uint64_t *seen)
{
+ struct repository *r = g->odb_source->odb->repo;
uint32_t i, cur_fanout_pos = 0;
struct object_id prev_oid, cur_oid;
struct commit *seen_gen_zero = NULL;
@@ -2778,7 +2758,7 @@ static int verify_one_commit_graph(struct repository *r,
}
graph_commit = lookup_commit(r, &cur_oid);
- if (!parse_commit_in_graph_one(r, g, graph_commit))
+ if (!parse_commit_in_graph_one(g, graph_commit))
graph_report(_("failed to parse commit %s from commit-graph"),
oid_to_hex(&cur_oid));
}
@@ -2814,7 +2794,7 @@ static int verify_one_commit_graph(struct repository *r,
continue;
}
- if (!oideq(&get_commit_tree_in_graph_one(r, g, graph_commit)->object.oid,
+ if (!oideq(&get_commit_tree_in_graph_one(g, graph_commit)->object.oid,
get_commit_tree_oid(odb_commit)))
graph_report(_("root tree OID for commit %s in commit-graph is %s != %s"),
oid_to_hex(&cur_oid),
@@ -2832,7 +2812,7 @@ static int verify_one_commit_graph(struct repository *r,
}
/* parse parent in case it is in a base graph */
- parse_commit_in_graph_one(r, g, graph_parents->item);
+ parse_commit_in_graph_one(g, graph_parents->item);
if (!oideq(&graph_parents->item->object.oid, &odb_parents->item->object.oid))
graph_report(_("commit-graph parent for %s is %s != %s"),
@@ -2892,7 +2872,7 @@ static int verify_one_commit_graph(struct repository *r,
return verify_commit_graph_error;
}
-int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
+int verify_commit_graph(struct commit_graph *g, int flags)
{
struct progress *progress = NULL;
int local_error = 0;
@@ -2908,13 +2888,13 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
if (!(flags & COMMIT_GRAPH_VERIFY_SHALLOW))
total += g->num_commits_in_base;
- progress = start_progress(r,
+ progress = start_progress(g->odb_source->odb->repo,
_("Verifying commits in commit graph"),
total);
}
for (; g; g = g->base_graph) {
- local_error |= verify_one_commit_graph(r, g, progress, &seen);
+ local_error |= verify_one_commit_graph(g, progress, &seen);
if (flags & COMMIT_GRAPH_VERIFY_SHALLOW)
break;
}
diff --git a/commit-graph.h b/commit-graph.h
index 0a67ac9280..4899b54ef8 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -114,14 +114,12 @@ struct commit_graph {
struct bloom_filter_settings *bloom_filter_settings;
};
-struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
- int fd, struct stat *st,
- struct odb_source *source);
-struct commit_graph *load_commit_graph_chain_fd_st(struct repository *r,
+struct commit_graph *load_commit_graph_one_fd_st(struct odb_source *source,
+ int fd, struct stat *st);
+struct commit_graph *load_commit_graph_chain_fd_st(struct object_database *odb,
int fd, struct stat *st,
int *incomplete_chain);
-struct commit_graph *read_commit_graph_one(struct repository *r,
- struct odb_source *source);
+struct commit_graph *read_commit_graph_one(struct odb_source *source);
struct repo_settings;
@@ -185,7 +183,7 @@ int write_commit_graph(struct odb_source *source,
#define COMMIT_GRAPH_VERIFY_SHALLOW (1 << 0)
-int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags);
+int verify_commit_graph(struct commit_graph *g, int flags);
void close_commit_graph(struct object_database *);
void free_commit_graph(struct commit_graph *);
diff --git a/t/helper/test-read-graph.c b/t/helper/test-read-graph.c
index ef5339bbee..6a5f64e473 100644
--- a/t/helper/test-read-graph.c
+++ b/t/helper/test-read-graph.c
@@ -81,7 +81,7 @@ int cmd__read_graph(int argc, const char **argv)
prepare_repo_settings(the_repository);
- graph = read_commit_graph_one(the_repository, source);
+ graph = read_commit_graph_one(source);
if (!graph) {
ret = 1;
goto done;
--
2.51.0.rc1.215.g0f929dcec7.dirty
* Re: [PATCH v4 0/6] commit-graph: remove reliance on global state
2025-08-15 5:49 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Patrick Steinhardt
` (5 preceding siblings ...)
2025-08-15 5:49 ` [PATCH v4 6/6] commit-graph: stop passing in redundant repository Patrick Steinhardt
@ 2025-08-15 15:17 ` Derrick Stolee
6 siblings, 0 replies; 69+ messages in thread
From: Derrick Stolee @ 2025-08-15 15:17 UTC (permalink / raw)
To: Patrick Steinhardt, git; +Cc: Taylor Blau, Oswald Buddenhagen, Junio C Hamano
On 8/15/2025 1:49 AM, Patrick Steinhardt wrote:
> Changes in v4:
> - Drop the patches that fix `-Wsign-compare` warnings.
I appreciate the choice to leave the controversial changes
for a later series and instead focus on refactoring-only
changes in this version. LGTM.
Thanks,
-Stolee
Thread overview: 69+ messages
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
2025-08-04 8:17 ` [PATCH 1/9] trace2: introduce function to trace unsigned integers Patrick Steinhardt
2025-08-04 21:33 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters Patrick Steinhardt
2025-08-04 9:13 ` Oswald Buddenhagen
2025-08-04 11:18 ` Patrick Steinhardt
2025-08-04 18:34 ` Junio C Hamano
2025-08-04 21:44 ` Taylor Blau
2025-08-06 6:23 ` Patrick Steinhardt
2025-08-06 12:54 ` Oswald Buddenhagen
2025-08-06 19:04 ` Junio C Hamano
2025-08-06 15:41 ` Junio C Hamano
2025-08-07 7:04 ` Patrick Steinhardt
2025-08-07 22:41 ` Junio C Hamano
2025-08-11 8:05 ` Patrick Steinhardt
2025-08-05 15:13 ` Junio C Hamano
2025-08-04 21:42 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 3/9] commit-graph: fix type for some write options Patrick Steinhardt
2025-08-04 21:52 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 4/9] commit-graph: fix sign comparison warnings Patrick Steinhardt
2025-08-04 22:04 ` Taylor Blau
2025-08-06 6:52 ` Patrick Steinhardt
2025-08-04 8:17 ` [PATCH 5/9] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
2025-08-04 22:05 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 6/9] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
2025-08-04 22:07 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 7/9] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
2025-08-04 22:10 ` Taylor Blau
2025-08-06 6:53 ` Patrick Steinhardt
2025-08-04 8:17 ` [PATCH 8/9] commit-graph: stop using `the_repository` Patrick Steinhardt
2025-08-04 22:11 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 9/9] commit-graph: stop passing in redundant repository Patrick Steinhardt
2025-08-05 4:27 ` [PATCH 0/9] commit-graph: remove reliance on global state Derrick Stolee
2025-08-06 6:53 ` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 00/10] " Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 01/10] trace2: introduce function to trace unsigned integers Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 02/10] commit-graph: stop using signed integers to count Bloom filters Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 03/10] commit-graph: fix type for some write options Patrick Steinhardt
2025-08-06 12:34 ` Oswald Buddenhagen
2025-08-06 15:40 ` Junio C Hamano
2025-08-07 7:07 ` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 04/10] commit-graph: fix sign comparison warnings Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 05/10] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 06/10] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 07/10] commit-graph: refactor `parse_commit_graph()` to take a repository Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 08/10] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 09/10] commit-graph: stop using `the_repository` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 10/10] commit-graph: stop passing in redundant repository Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 00/10] commit-graph: remove reliance on global state Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 01/10] trace2: introduce function to trace unsigned integers Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 02/10] commit-graph: stop using signed integers to count Bloom filters Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 03/10] commit-graph: fix type for some write options Patrick Steinhardt
2025-08-07 22:40 ` Junio C Hamano
2025-08-11 8:24 ` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 04/10] commit-graph: fix sign comparison warnings Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 05/10] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 06/10] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 07/10] commit-graph: refactor `parse_commit_graph()` to take a repository Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 08/10] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 09/10] commit-graph: stop using `the_repository` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 10/10] commit-graph: stop passing in redundant repository Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 1/6] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 2/6] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 3/6] commit-graph: refactor `parse_commit_graph()` to take a repository Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 4/6] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 5/6] commit-graph: stop using `the_repository` Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 6/6] commit-graph: stop passing in redundant repository Patrick Steinhardt
2025-08-15 15:17 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Derrick Stolee