From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Taylor Blau <me@ttaylorr.com>, Derrick Stolee <stolee@gmail.com>,
Oswald Buddenhagen <oswald.buddenhagen@gmx.de>,
Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2 00/10] commit-graph: remove reliance on global state
Date: Wed, 06 Aug 2025 14:00:05 +0200 [thread overview]
Message-ID: <20250806-b4-pks-commit-graph-wo-the-repository-v2-0-911bae638e61@pks.im> (raw)
In-Reply-To: <20250804-b4-pks-commit-graph-wo-the-repository-v1-0-850d626eb2e8@pks.im>
Hi,
this patch series is another step on our long road towards not having
global state. In addition to that, as commit-graphs are part of the
object database layer, this is also another step towards pluggable
object databases.
Changes in v2:
- Use `unsigned` instead of `size_t` to count number of Bloom filters.
- Use `uint32_t` instead of `size_t` for number of commit graphs,
as this type is also used to iterate through this count already.
- Refactor `parse_commit_graph()` to take a repository instead of both
repo settings and a hash algo.
- Link to v1: https://lore.kernel.org/r/20250804-b4-pks-commit-graph-wo-the-repository-v1-0-850d626eb2e8@pks.im
Thanks!
Patrick
---
Patrick Steinhardt (10):
trace2: introduce function to trace unsigned integers
commit-graph: stop using signed integers to count Bloom filters
commit-graph: fix type for some write options
commit-graph: fix sign comparison warnings
commit-graph: stop using `the_hash_algo` via macros
commit-graph: store the hash algorithm instead of its length
commit-graph: refactor `parse_commit_graph()` to take a repository
commit-graph: stop using `the_hash_algo`
commit-graph: stop using `the_repository`
commit-graph: stop passing in redundant repository
builtin/commit-graph.c | 13 +-
builtin/commit.c | 2 +-
builtin/merge.c | 2 +-
commit-graph.c | 371 +++++++++++++++++++++----------------------
commit-graph.h | 25 ++-
oss-fuzz/fuzz-commit-graph.c | 6 +-
t/helper/test-read-graph.c | 2 +-
trace2.c | 14 ++
trace2.h | 9 ++
9 files changed, 227 insertions(+), 217 deletions(-)
Range-diff versus v1:
1: cb92085a3b = 1: a25e9cdbcc trace2: introduce function to trace unsigned integers
2: 25520448c6 ! 2: e03ca21ec2 commit-graph: stop using signed integers to count bloom filters
@@ Metadata
Author: Patrick Steinhardt <ps@pks.im>
## Commit message ##
- commit-graph: stop using signed integers to count bloom filters
+ commit-graph: stop using signed integers to count Bloom filters
When writing a new commit graph we have a couple of counters that
- provide statistics around what kind of bloom filters we have or have not
+ provide statistics around what kind of Bloom filters we have or have not
written. These counters naturally count from zero and are only ever
incremented, but they use a signed integer as type regardless.
- Refactor those fields to be of type `size_t` instead.
+ Refactor those fields to be unsigned instead. Using an unsigned type
+ makes it explicit to the reader that they never have to worry about
+ negative values and thus makes the code easier to understand.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
@@ commit-graph.c: struct write_commit_graph_context {
- int count_bloom_filter_trunc_empty;
- int count_bloom_filter_trunc_large;
- int count_bloom_filter_upgraded;
-+ size_t count_bloom_filter_computed;
-+ size_t count_bloom_filter_not_computed;
-+ size_t count_bloom_filter_trunc_empty;
-+ size_t count_bloom_filter_trunc_large;
-+ size_t count_bloom_filter_upgraded;
++ unsigned count_bloom_filter_computed;
++ unsigned count_bloom_filter_not_computed;
++ unsigned count_bloom_filter_trunc_empty;
++ unsigned count_bloom_filter_trunc_large;
++ unsigned count_bloom_filter_upgraded;
};
static int write_graph_chunk_fanout(struct hashfile *f,
3: 12e150a326 = 3: d569434715 commit-graph: fix type for some write options
4: 0bdaff4e76 ! 4: 3f820e3347 commit-graph: fix sign comparison warnings
@@ Commit message
quantity, we still return a signed integer that we then later
compare with unsigned values.
- - The bloom settings hash version is being assigned `-1` even though
+ - The Bloom settings hash version is being assigned `-1` even though
it's an unsigned value. This is used to indicate an unspecified
value and relies on 1's complement.
@@ commit-graph.c: struct write_commit_graph_context {
char *base_graph_name;
- int num_commit_graphs_before;
- int num_commit_graphs_after;
-+ size_t num_commit_graphs_before;
-+ size_t num_commit_graphs_after;
++ uint32_t num_commit_graphs_before;
++ uint32_t num_commit_graphs_after;
char **commit_graph_filenames_before;
char **commit_graph_filenames_after;
char **commit_graph_hash_after;
5: 6e5d4da7f1 = 5: c3be366e36 commit-graph: stop using `the_hash_algo` via macros
6: 8e8bf531d1 = 6: 2476994769 commit-graph: store the hash algorithm instead of its length
-: ---------- > 7: b582b49437 commit-graph: refactor `parse_commit_graph()` to take a repository
7: 20bce2f981 ! 8: 0d0bd20ceb commit-graph: stop using `the_hash_algo`
@@ commit-graph.c: struct commit_graph *load_commit_graph_one_fd_st(struct reposito
close(fd);
error(_("commit-graph file is too small"));
return NULL;
-@@ commit-graph.c: struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
- graph_map = xmmap(NULL, graph_size, PROT_READ, MAP_PRIVATE, fd, 0);
- close(fd);
- prepare_repo_settings(r);
-- ret = parse_commit_graph(&r->settings, graph_map, graph_size);
-+ ret = parse_commit_graph(&r->settings, r->hash_algo, graph_map, graph_size);
-
- if (ret)
- ret->odb_source = source;
@@ commit-graph.c: static int graph_read_commit_data(const unsigned char *chunk_start,
size_t chunk_size, void *data)
{
@@ commit-graph.c: static int graph_read_commit_data(const unsigned char *chunk_sta
return error(_("commit-graph commit data chunk is wrong size"));
g->chunk_commit_data = chunk_start;
return 0;
-@@ commit-graph.c: static int graph_read_bloom_data(const unsigned char *chunk_start,
- }
-
- struct commit_graph *parse_commit_graph(struct repo_settings *s,
-+ const struct git_hash_algo *hash_algo,
- void *graph_map, size_t graph_size)
- {
- const unsigned char *data;
-@@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repo_settings *s,
- if (!graph_map)
- return NULL;
-
-- if (graph_size < graph_min_size(the_hash_algo))
-+ if (graph_size < graph_min_size(hash_algo))
- return NULL;
-
- data = (const unsigned char *)graph_map;
-@@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repo_settings *s,
- }
-
- hash_version = *(unsigned char*)(data + 5);
-- if (hash_version != oid_version(the_hash_algo)) {
-+ if (hash_version != oid_version(hash_algo)) {
- error(_("commit-graph hash version %X does not match version %X"),
-- hash_version, oid_version(the_hash_algo));
-+ hash_version, oid_version(hash_algo));
- return NULL;
- }
-
- graph = alloc_commit_graph();
-
-- graph->hash_algo = the_hash_algo;
-+ graph->hash_algo = hash_algo;
- graph->num_chunks = *(unsigned char*)(data + 6);
- graph->data = graph_map;
- graph->data_len = graph_size;
-
- if (graph_size < GRAPH_HEADER_SIZE +
- (graph->num_chunks + 1) * CHUNK_TOC_ENTRY_SIZE +
-- GRAPH_FANOUT_SIZE + the_hash_algo->rawsz) {
-+ GRAPH_FANOUT_SIZE + hash_algo->rawsz) {
- error(_("commit-graph file is too small to hold %u chunks"),
- graph->num_chunks);
- free(graph);
@@ commit-graph.c: static int add_graph_to_chain(struct commit_graph *g,
}
@@ commit-graph.h: struct string_list;
/*
* Given a commit struct, try to fill the commit struct info, including:
-@@ commit-graph.h: struct repo_settings;
- * prior to calling parse_commit_graph().
- */
- struct commit_graph *parse_commit_graph(struct repo_settings *s,
-+ const struct git_hash_algo *hash_algo,
- void *graph_map, size_t graph_size);
-
- /*
-
- ## oss-fuzz/fuzz-commit-graph.c ##
-@@
- #include "repository.h"
-
- struct commit_graph *parse_commit_graph(struct repo_settings *s,
-+ const struct git_hash_algo *hash_algo,
- void *graph_map, size_t graph_size);
-
- int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size);
-@@ oss-fuzz/fuzz-commit-graph.c: int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
- repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
- the_repository->settings.commit_graph_generation_version = 2;
- the_repository->settings.commit_graph_changed_paths_version = 1;
-- g = parse_commit_graph(&the_repository->settings, (void *)data, size);
-+ g = parse_commit_graph(&the_repository->settings, the_repository->hash_algo,
-+ (void *)data, size);
- repo_clear(the_repository);
- free_commit_graph(g);
-
8: 424567998e ! 9: a86c1ab958 commit-graph: stop using `the_repository`
@@ commit-graph.c: void git_test_write_commit_graph_or_die(void)
die("failed to write commit-graph under GIT_TEST_COMMIT_GRAPH");
}
-@@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repo_settings *s,
- }
-
- oidread(&graph->oid, graph->data + graph->data_len - graph->hash_algo->rawsz,
-- the_repository->hash_algo);
-+ hash_algo);
-
- free_chunkfile(cf);
- return graph;
@@ commit-graph.c: static int add_graph_to_chain(struct commit_graph *g,
if (!cur_g ||
!oideq(&oids[n], &cur_g->oid) ||
9: cff4bc0329 ! 10: 70a7f6fecf commit-graph: stop passing in redundant repository
@@ commit-graph.c: struct commit_graph *load_commit_graph_one_fd_st(struct reposito
close(fd);
error(_("commit-graph file is too small"));
return NULL;
- }
+@@ commit-graph.c: struct commit_graph *load_commit_graph_one_fd_st(struct repository *r,
graph_map = xmmap(NULL, graph_size, PROT_READ, MAP_PRIVATE, fd, 0);
close(fd);
-- prepare_repo_settings(r);
-- ret = parse_commit_graph(&r->settings, r->hash_algo, graph_map, graph_size);
-+ prepare_repo_settings(source->odb->repo);
-+ ret = parse_commit_graph(&source->odb->repo->settings, source->odb->repo->hash_algo,
-+ graph_map, graph_size);
+- ret = parse_commit_graph(r, graph_map, graph_size);
++ ret = parse_commit_graph(source->odb->repo, graph_map, graph_size);
if (ret)
ret->odb_source = source;
-@@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repo_settings *s,
+ else
+@@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repository *r,
return NULL;
}
---
base-commit: e813a0200a7121b97fec535f0d0b460b0a33356c
change-id: 20250717-b4-pks-commit-graph-wo-the-repository-1dc2cacbc8e3
next prev parent reply other threads:[~2025-08-06 12:00 UTC|newest]
Thread overview: 69+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-04 8:17 [PATCH 0/9] commit-graph: remove reliance on global state Patrick Steinhardt
2025-08-04 8:17 ` [PATCH 1/9] trace2: introduce function to trace unsigned integers Patrick Steinhardt
2025-08-04 21:33 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters Patrick Steinhardt
2025-08-04 9:13 ` Oswald Buddenhagen
2025-08-04 11:18 ` Patrick Steinhardt
2025-08-04 18:34 ` Junio C Hamano
2025-08-04 21:44 ` Taylor Blau
2025-08-06 6:23 ` Patrick Steinhardt
2025-08-06 12:54 ` Oswald Buddenhagen
2025-08-06 19:04 ` Junio C Hamano
2025-08-06 15:41 ` Junio C Hamano
2025-08-07 7:04 ` Patrick Steinhardt
2025-08-07 22:41 ` Junio C Hamano
2025-08-11 8:05 ` Patrick Steinhardt
2025-08-05 15:13 ` Junio C Hamano
2025-08-04 21:42 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 3/9] commit-graph: fix type for some write options Patrick Steinhardt
2025-08-04 21:52 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 4/9] commit-graph: fix sign comparison warnings Patrick Steinhardt
2025-08-04 22:04 ` Taylor Blau
2025-08-06 6:52 ` Patrick Steinhardt
2025-08-04 8:17 ` [PATCH 5/9] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
2025-08-04 22:05 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 6/9] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
2025-08-04 22:07 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 7/9] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
2025-08-04 22:10 ` Taylor Blau
2025-08-06 6:53 ` Patrick Steinhardt
2025-08-04 8:17 ` [PATCH 8/9] commit-graph: stop using `the_repository` Patrick Steinhardt
2025-08-04 22:11 ` Taylor Blau
2025-08-04 8:17 ` [PATCH 9/9] commit-graph: stop passing in redundant repository Patrick Steinhardt
2025-08-05 4:27 ` [PATCH 0/9] commit-graph: remove reliance on global state Derrick Stolee
2025-08-06 6:53 ` Patrick Steinhardt
2025-08-06 12:00 ` Patrick Steinhardt [this message]
2025-08-06 12:00 ` [PATCH v2 01/10] trace2: introduce function to trace unsigned integers Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 02/10] commit-graph: stop using signed integers to count Bloom filters Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 03/10] commit-graph: fix type for some write options Patrick Steinhardt
2025-08-06 12:34 ` Oswald Buddenhagen
2025-08-06 15:40 ` Junio C Hamano
2025-08-07 7:07 ` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 04/10] commit-graph: fix sign comparison warnings Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 05/10] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 06/10] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 07/10] commit-graph: refactor `parse_commit_graph()` to take a repository Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 08/10] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 09/10] commit-graph: stop using `the_repository` Patrick Steinhardt
2025-08-06 12:00 ` [PATCH v2 10/10] commit-graph: stop passing in redundant repository Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 00/10] commit-graph: remove reliance on global state Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 01/10] trace2: introduce function to trace unsigned integers Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 02/10] commit-graph: stop using signed integers to count Bloom filters Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 03/10] commit-graph: fix type for some write options Patrick Steinhardt
2025-08-07 22:40 ` Junio C Hamano
2025-08-11 8:24 ` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 04/10] commit-graph: fix sign comparison warnings Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 05/10] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 06/10] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 07/10] commit-graph: refactor `parse_commit_graph()` to take a repository Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 08/10] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 09/10] commit-graph: stop using `the_repository` Patrick Steinhardt
2025-08-07 8:04 ` [PATCH v3 10/10] commit-graph: stop passing in redundant repository Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 1/6] commit-graph: stop using `the_hash_algo` via macros Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 2/6] commit-graph: store the hash algorithm instead of its length Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 3/6] commit-graph: refactor `parse_commit_graph()` to take a repository Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 4/6] commit-graph: stop using `the_hash_algo` Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 5/6] commit-graph: stop using `the_repository` Patrick Steinhardt
2025-08-15 5:49 ` [PATCH v4 6/6] commit-graph: stop passing in redundant repository Patrick Steinhardt
2025-08-15 15:17 ` [PATCH v4 0/6] commit-graph: remove reliance on global state Derrick Stolee
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250806-b4-pks-commit-graph-wo-the-repository-v2-0-911bae638e61@pks.im \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=me@ttaylorr.com \
--cc=oswald.buddenhagen@gmx.de \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).