[PATCH v2 00/17] Refactor chunk-format into an API - Derrick Stolee via GitGitGadget

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: me@ttaylorr.com, gitster@pobox.com, l.s.r@web.de,
	szeder.dev@gmail.com, Chris Torek <chris.torek@gmail.com>,
	Derrick Stolee <stolee@gmail.com>,
	Derrick Stolee <derrickstolee@github.com>
Subject: [PATCH v2 00/17] Refactor chunk-format into an API
Date: Wed, 27 Jan 2021 15:01:39 +0000	[thread overview]
Message-ID: <pull.848.v2.git.1611759716.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.848.git.1611676886.gitgitgadget@gmail.com>

This is a restart on the topic previously submitted [1] but dropped because
ak/corrected-commit-date was still in progress. This version is based on
that branch.

[1]
https://lore.kernel.org/git/pull.804.git.1607012215.gitgitgadget@gmail.com/

This version also changes the approach to use a more dynamic interaction
with a struct chunkfile pointer. This idea is credited to Taylor Blau [2],
but I started again from scratch. I also go further to make struct chunkfile
anonymous to API consumers. It is defined only in chunk-format.c, which
should hopefully deter future users from interacting with that data
directly.

[2] https://lore.kernel.org/git/X8%2FI%2FRzXZksio+ri@nand.local/

This combined API is beneficial to reduce duplicated logic. Or rather, to
ensure that similar file formats have similar protections against bad data.
The multi-pack-index code did not have as many guards as the commit-graph
code did, but now they both share a common base that checks for things like
duplicate chunks or offsets outside the size of the file.

Here are some stats for the end-to-end change:

 * 570 insertions(+), 456 deletions(-).
 * commit-graph.c: 107 insertions(+), 192 deletions(-)
 * midx.c: 164 insertions(+), 260 deletions(-)

While there is an overall increase to the code size, the consumers do get
smaller. Boilerplate things like abstracting method to match chunk_write_fn
and chunk_read_fn make up a lot of these insertions. The "interesting" code
gets a lot smaller and cleaner.


Updates in V2
=============

 * The method pair_chunk() now automatically sets a pointer while
   read_chunk() uses the callback. This greatly reduces the code size.

 * Pointer casts are now implicit instead of explicit.

 * Extra care is taken to not overflow when verifying chunk sizes on write.

Thanks, -Stolee

Derrick Stolee (17):
  commit-graph: anonymize data in chunk_write_fn
  chunk-format: create chunk format write API
  commit-graph: use chunk-format write API
  midx: rename pack_info to write_midx_context
  midx: use context in write_midx_pack_names()
  midx: add entries to write_midx_context
  midx: add pack_perm to write_midx_context
  midx: add num_large_offsets to write_midx_context
  midx: return success/failure in chunk write methods
  midx: drop chunk progress during write
  midx: use chunk-format API in write_midx_internal()
  chunk-format: create read chunk API
  commit-graph: use chunk-format read API
  midx: use chunk-format read API
  midx: use 64-bit multiplication for chunk sizes
  chunk-format: restore duplicate chunk checks
  chunk-format: add technical docs

 Documentation/technical/chunk-format.txt      |  54 +++
 .../technical/commit-graph-format.txt         |   3 +
 Documentation/technical/pack-format.txt       |   3 +
 Makefile                                      |   1 +
 chunk-format.c                                | 181 ++++++++
 chunk-format.h                                |  53 +++
 commit-graph.c                                | 299 +++++-------
 midx.c                                        | 424 +++++++-----------
 t/t5318-commit-graph.sh                       |   2 +-
 t/t5319-multi-pack-index.sh                   |   6 +-
 10 files changed, 570 insertions(+), 456 deletions(-)
 create mode 100644 Documentation/technical/chunk-format.txt
 create mode 100644 chunk-format.c
 create mode 100644 chunk-format.h


base-commit: 5a3b130cad0d5c770f766e3af6d32b41766374c0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-848%2Fderrickstolee%2Fchunk-format%2Frefactor-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-848/derrickstolee/chunk-format/refactor-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/848

Range-diff vs v1:

  1:  09b32829e4f !  1:  243dcec9436 commit-graph: anonymize data in chunk_write_fn
     @@ commit-graph.c: struct write_commit_graph_context {
      -				    struct write_commit_graph_context *ctx)
      +				    void *data)
       {
     -+	struct write_commit_graph_context *ctx =
     -+		(struct write_commit_graph_context *)data;
     ++	struct write_commit_graph_context *ctx = data;
       	int i, count = 0;
       	struct commit **list = ctx->commits.list;
       
     @@ commit-graph.c: static int write_graph_chunk_fanout(struct hashfile *f,
      -				  struct write_commit_graph_context *ctx)
      +				  void *data)
       {
     -+	struct write_commit_graph_context *ctx =
     -+		(struct write_commit_graph_context *)data;
     ++	struct write_commit_graph_context *ctx = data;
       	struct commit **list = ctx->commits.list;
       	int count;
       	for (count = 0; count < ctx->commits.nr; count++, list++) {
     @@ commit-graph.c: static const unsigned char *commit_to_sha1(size_t index, void *t
      -				  struct write_commit_graph_context *ctx)
      +				  void *data)
       {
     -+	struct write_commit_graph_context *ctx =
     -+		(struct write_commit_graph_context *)data;
     ++	struct write_commit_graph_context *ctx = data;
       	struct commit **list = ctx->commits.list;
       	struct commit **last = ctx->commits.list + ctx->commits.nr;
       	uint32_t num_extra_edges = 0;
     @@ commit-graph.c: static int write_graph_chunk_data(struct hashfile *f,
      -					      struct write_commit_graph_context *ctx)
      +					     void *data)
       {
     -+	struct write_commit_graph_context *ctx =
     -+		(struct write_commit_graph_context *)data;
     ++	struct write_commit_graph_context *ctx = data;
       	int i, num_generation_data_overflows = 0;
       
       	for (i = 0; i < ctx->commits.nr; i++) {
     @@ commit-graph.c: static int write_graph_chunk_generation_data(struct hashfile *f,
      -						       struct write_commit_graph_context *ctx)
      +						      void *data)
       {
     -+	struct write_commit_graph_context *ctx =
     -+		(struct write_commit_graph_context *)data;
     ++	struct write_commit_graph_context *ctx = data;
       	int i;
       	for (i = 0; i < ctx->commits.nr; i++) {
       		struct commit *c = ctx->commits.list[i];
     @@ commit-graph.c: static int write_graph_chunk_generation_data_overflow(struct has
      -					 struct write_commit_graph_context *ctx)
      +					 void *data)
       {
     -+	struct write_commit_graph_context *ctx =
     -+		(struct write_commit_graph_context *)data;
     ++	struct write_commit_graph_context *ctx = data;
       	struct commit **list = ctx->commits.list;
       	struct commit **last = ctx->commits.list + ctx->commits.nr;
       	struct commit_list *parent;
     @@ commit-graph.c: static int write_graph_chunk_extra_edges(struct hashfile *f,
      -					   struct write_commit_graph_context *ctx)
      +					   void *data)
       {
     -+	struct write_commit_graph_context *ctx =
     -+		(struct write_commit_graph_context *)data;
     ++	struct write_commit_graph_context *ctx = data;
       	struct commit **list = ctx->commits.list;
       	struct commit **last = ctx->commits.list + ctx->commits.nr;
       	uint32_t cur_pos = 0;
     @@ commit-graph.c: static void trace2_bloom_filter_settings(struct write_commit_gra
      -					struct write_commit_graph_context *ctx)
      +					void *data)
       {
     -+	struct write_commit_graph_context *ctx =
     -+		(struct write_commit_graph_context *)data;
     ++	struct write_commit_graph_context *ctx = data;
       	struct commit **list = ctx->commits.list;
       	struct commit **last = ctx->commits.list + ctx->commits.nr;
       
     @@ commit-graph.c: static int write_graph_chunk_base_1(struct hashfile *f,
      -				  struct write_commit_graph_context *ctx)
      +				    void *data)
       {
     -+	struct write_commit_graph_context *ctx =
     -+		(struct write_commit_graph_context *)data;
     ++	struct write_commit_graph_context *ctx = data;
       	int num = write_graph_chunk_base_1(f, ctx->new_base_graph);
       
       	if (num != ctx->num_commit_graphs_after - 1) {
  2:  9bd273f8c94 !  2:  814512f2167 chunk-format: create chunk format write API
     @@ chunk-format.c (new)
      +		if (result)
      +			return result;
      +
     -+		if (cf->f->total + cf->f->offset != start_offset + cf->chunks[i].size)
     ++		if (cf->f->total + cf->f->offset - start_offset != cf->chunks[i].size)
      +			BUG("expected to write %"PRId64" bytes to chunk %"PRIx32", but wrote %"PRId64" instead",
      +			    cf->chunks[i].size, cf->chunks[i].id,
      +			    cf->f->total + cf->f->offset - start_offset);
  3:  a3d6177a352 =  3:  70af6e3083f commit-graph: use chunk-format write API
  4:  9fe5ee8611c !  4:  0cac7890bed midx: rename pack_info to write_midx_context
     @@ midx.c: struct pack_list {
       			     const char *file_name, void *data)
       {
      -	struct pack_list *packs = (struct pack_list *)data;
     -+	struct write_midx_context *ctx = (struct write_midx_context *)data;
     ++	struct write_midx_context *ctx = data;
       
       	if (ends_with(file_name, ".idx")) {
      -		display_progress(packs->progress, ++packs->pack_paths_checked);
  5:  14a0246b982 !  5:  4a4e90b129a midx: use context in write_midx_pack_names()
     @@ midx.c: static struct pack_midx_entry *get_sorted_entries(struct multi_pack_inde
      -				    uint32_t num_packs)
      +static size_t write_midx_pack_names(struct hashfile *f, void *data)
       {
     -+	struct write_midx_context *ctx = (struct write_midx_context *)data;
     ++	struct write_midx_context *ctx = data;
       	uint32_t i;
       	unsigned char padding[MIDX_CHUNK_ALIGNMENT];
       	size_t written = 0;
  6:  79f479ef7d1 !  6:  30ad423997b midx: add entries to write_midx_context
     @@ midx.c: static size_t write_midx_pack_names(struct hashfile *f, void *data)
       {
      -	struct pack_midx_entry *list = objects;
      -	struct pack_midx_entry *last = objects + nr_objects;
     -+	struct write_midx_context *ctx = (struct write_midx_context *)data;
     ++	struct write_midx_context *ctx = data;
      +	struct pack_midx_entry *list = ctx->entries;
      +	struct pack_midx_entry *last = ctx->entries + ctx->entries_nr;
       	uint32_t count = 0;
     @@ midx.c: static size_t write_midx_oid_fanout(struct hashfile *f,
      +				    void *data)
       {
      -	struct pack_midx_entry *list = objects;
     -+	struct write_midx_context *ctx = (struct write_midx_context *)data;
     ++	struct write_midx_context *ctx = data;
      +	unsigned char hash_len = the_hash_algo->rawsz;
      +	struct pack_midx_entry *list = ctx->entries;
       	uint32_t i;
  7:  0b4ce3f1732 !  7:  2f1c496f3ab midx: add pack_perm to write_midx_context
     @@ midx.c: static size_t write_midx_oid_lookup(struct hashfile *f,
      +					void *data)
       {
      -	struct pack_midx_entry *list = objects;
     -+	struct write_midx_context *ctx = (struct write_midx_context *)data;
     ++	struct write_midx_context *ctx = data;
      +	struct pack_midx_entry *list = ctx->entries;
       	uint32_t i, nr_large_offset = 0;
       	size_t written = 0;
  8:  eabc7b73647 !  8:  c4939548e51 midx: add num_large_offsets to write_midx_context
     @@ midx.c: static size_t write_midx_object_offsets(struct hashfile *f,
      +				       void *data)
       {
      -	struct pack_midx_entry *list = objects, *end = objects + nr_objects;
     -+	struct write_midx_context *ctx = (struct write_midx_context *)data;
     ++	struct write_midx_context *ctx = data;
      +	struct pack_midx_entry *list = ctx->entries;
      +	struct pack_midx_entry *end = ctx->entries + ctx->entries_nr;
       	size_t written = 0;
  9:  909ca28e0ba !  9:  b3cc73c2256 midx: return success/failure in chunk write methods
     @@ midx.c: static struct pack_midx_entry *get_sorted_entries(struct multi_pack_inde
      -static size_t write_midx_pack_names(struct hashfile *f, void *data)
      +static int write_midx_pack_names(struct hashfile *f, void *data)
       {
     - 	struct write_midx_context *ctx = (struct write_midx_context *)data;
     + 	struct write_midx_context *ctx = data;
       	uint32_t i;
      @@ midx.c: static size_t write_midx_pack_names(struct hashfile *f, void *data)
       	if (i < MIDX_CHUNK_ALIGNMENT) {
     @@ midx.c: static size_t write_midx_pack_names(struct hashfile *f, void *data)
      +static int write_midx_oid_fanout(struct hashfile *f,
      +				 void *data)
       {
     - 	struct write_midx_context *ctx = (struct write_midx_context *)data;
     + 	struct write_midx_context *ctx = data;
       	struct pack_midx_entry *list = ctx->entries;
      @@ midx.c: static size_t write_midx_oid_fanout(struct hashfile *f,
       		list = next;
     @@ midx.c: static size_t write_midx_oid_fanout(struct hashfile *f,
      +static int write_midx_oid_lookup(struct hashfile *f,
      +				 void *data)
       {
     - 	struct write_midx_context *ctx = (struct write_midx_context *)data;
     + 	struct write_midx_context *ctx = data;
       	unsigned char hash_len = the_hash_algo->rawsz;
       	struct pack_midx_entry *list = ctx->entries;
       	uint32_t i;
     @@ midx.c: static size_t write_midx_oid_lookup(struct hashfile *f,
      +static int write_midx_object_offsets(struct hashfile *f,
      +				     void *data)
       {
     - 	struct write_midx_context *ctx = (struct write_midx_context *)data;
     + 	struct write_midx_context *ctx = data;
       	struct pack_midx_entry *list = ctx->entries;
       	uint32_t i, nr_large_offset = 0;
      -	size_t written = 0;
     @@ midx.c: static size_t write_midx_object_offsets(struct hashfile *f,
      +static int write_midx_large_offsets(struct hashfile *f,
      +				    void *data)
       {
     - 	struct write_midx_context *ctx = (struct write_midx_context *)data;
     + 	struct write_midx_context *ctx = data;
       	struct pack_midx_entry *list = ctx->entries;
       	struct pack_midx_entry *end = ctx->entries + ctx->entries_nr;
      -	size_t written = 0;
 10:  e613ffa9ac6 = 10:  78744d3b701 midx: drop chunk progress during write
 11:  49cfb4f63e2 = 11:  07dc0cf8c68 midx: use chunk-format API in write_midx_internal()
 12:  e3475633e1d ! 12:  d8d8e9e2aa3 chunk-format: create read chunk API
     @@ Commit message
      
           1. initialize a 'struct chunkfile' with init_chunkfile(NULL).
           2. call read_table_of_contents().
     -     3. for each chunk to parse, call pair_chunk() with appropriate pointers.
     +     3. for each chunk to parse,
     +        a. call pair_chunk() to assign a pointer with the chunk position, or
     +        b. call read_chunk() to run a callback on the chunk start and size.
           4. call free_chunkfile() to clear the 'struct chunkfile' data.
      
          We are re-using the anonymous 'struct chunkfile' data, as it is internal
     @@ chunk-format.c: int write_chunkfile(struct chunkfile *cf, void *data)
      +
      +int pair_chunk(struct chunkfile *cf,
      +	       uint32_t chunk_id,
     ++	       const unsigned char **p)
     ++{
     ++	int i;
     ++
     ++	for (i = 0; i < cf->chunks_nr; i++) {
     ++		if (cf->chunks[i].id == chunk_id) {
     ++			*p = cf->chunks[i].start;
     ++			return 0;
     ++		}
     ++	}
     ++
     ++	return CHUNK_NOT_FOUND;
     ++}
     ++
     ++int read_chunk(struct chunkfile *cf,
     ++	       uint32_t chunk_id,
      +	       chunk_read_fn fn,
      +	       void *data)
      +{
     @@ chunk-format.h: void add_chunk(struct chunkfile *cf,
      +			   uint64_t toc_offset,
      +			   int toc_length);
      +
     ++#define CHUNK_NOT_FOUND (-2)
     ++
      +/*
     -+ * When reading a table of contents, we find the chunk with matching 'id'
     -+ * then call its read_fn to populate the necessary 'data' based on the
     -+ * chunk start and size.
     ++ * Find 'chunk_id' in the given chunkfile and assign the
     ++ * given pointer to the position in the mmap'd file where
     ++ * that chunk begins.
     ++ *
     ++ * Returns CHUNK_NOT_FOUND if the chunk does not exist.
      + */
     ++int pair_chunk(struct chunkfile *cf,
     ++	       uint32_t chunk_id,
     ++	       const unsigned char **p);
     ++
      +typedef int (*chunk_read_fn)(const unsigned char *chunk_start,
      +			     size_t chunk_size, void *data);
     -+
     -+
     -+#define CHUNK_NOT_FOUND (-2)
     -+int pair_chunk(struct chunkfile *cf,
     ++/*
     ++ * Find 'chunk_id' in the given chunkfile and call the
     ++ * given chunk_read_fn method with the information for
     ++ * that chunk.
     ++ *
     ++ * Returns CHUNK_NOT_FOUND if the chunk does not exist.
     ++ */
     ++int read_chunk(struct chunkfile *cf,
      +	       uint32_t chunk_id,
      +	       chunk_read_fn fn,
      +	       void *data);
 13:  7339990f07d ! 13:  8744d278596 commit-graph: use chunk-format read API
     @@ commit-graph.c: static int verify_commit_graph_lite(struct commit_graph *g)
       	return 0;
       }
       
     -+static int graph_read_oid_fanout(const unsigned char *chunk_start,
     -+				 size_t chunk_size, void *data)
     -+{
     -+	struct commit_graph *g = (struct commit_graph *)data;
     -+	g->chunk_oid_fanout = (uint32_t*)chunk_start;
     -+	return 0;
     -+}
     -+
      +static int graph_read_oid_lookup(const unsigned char *chunk_start,
      +				 size_t chunk_size, void *data)
      +{
     -+	struct commit_graph *g = (struct commit_graph *)data;
     ++	struct commit_graph *g = data;
      +	g->chunk_oid_lookup = chunk_start;
      +	g->num_commits = chunk_size / g->hash_len;
      +	return 0;
      +}
      +
     -+static int graph_read_data(const unsigned char *chunk_start,
     -+				 size_t chunk_size, void *data)
     -+{
     -+	struct commit_graph *g = (struct commit_graph *)data;
     -+	g->chunk_commit_data = chunk_start;
     -+	return 0;
     -+}
     -+
     -+static int graph_read_extra_edges(const unsigned char *chunk_start,
     -+				  size_t chunk_size, void *data)
     -+{
     -+	struct commit_graph *g = (struct commit_graph *)data;
     -+	g->chunk_extra_edges = chunk_start;
     -+	return 0;
     -+}
     -+
     -+static int graph_read_base_graphs(const unsigned char *chunk_start,
     -+				  size_t chunk_size, void *data)
     -+{
     -+	struct commit_graph *g = (struct commit_graph *)data;
     -+	g->chunk_base_graphs = chunk_start;
     -+	return 0;
     -+}
     -+
     -+static int graph_read_generation_data(const unsigned char *chunk_start,
     -+				      size_t chunk_size, void *data)
     -+{
     -+	struct commit_graph *g = (struct commit_graph *)data;
     -+	g->chunk_generation_data = chunk_start;
     -+	return 0;
     -+}
     -+
     -+static int graph_read_generation_overflow(const unsigned char *chunk_start,
     -+					  size_t chunk_size, void *data)
     -+{
     -+	struct commit_graph *g = (struct commit_graph *)data;
     -+	g->chunk_generation_data_overflow = chunk_start;
     -+	return 0;
     -+}
     -+
     -+static int graph_read_bloom_indices(const unsigned char *chunk_start,
     -+				    size_t chunk_size, void *data)
     -+{
     -+	struct commit_graph *g = (struct commit_graph *)data;
     -+	g->chunk_bloom_indexes = chunk_start;
     -+	return 0;
     -+}
     -+
      +static int graph_read_bloom_data(const unsigned char *chunk_start,
      +				  size_t chunk_size, void *data)
      +{
     -+	struct commit_graph *g = (struct commit_graph *)data;
     ++	struct commit_graph *g = data;
      +	uint32_t hash_version;
      +	g->chunk_bloom_data = chunk_start;
      +	hash_version = get_be32(chunk_start);
     @@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repository *r,
      -						     / graph->hash_len;
      -			}
      -			break;
     --
     ++	cf = init_chunkfile(NULL);
     + 
      -		case GRAPH_CHUNKID_DATA:
      -			if (graph->chunk_commit_data)
      -				chunk_repeated = 1;
     @@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repository *r,
      -			else
      -				graph->chunk_generation_data = data + chunk_offset;
      -			break;
     -+	cf = init_chunkfile(NULL);
     - 
     +-
      -		case GRAPH_CHUNKID_GENERATION_DATA_OVERFLOW:
      -			if (graph->chunk_generation_data_overflow)
      -				chunk_repeated = 1;
     @@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repository *r,
      -			}
      -			break;
      -		}
     -+	pair_chunk(cf, GRAPH_CHUNKID_OIDFANOUT, graph_read_oid_fanout, graph);
     -+	pair_chunk(cf, GRAPH_CHUNKID_OIDLOOKUP, graph_read_oid_lookup, graph);
     -+	pair_chunk(cf, GRAPH_CHUNKID_DATA, graph_read_data, graph);
     -+	pair_chunk(cf, GRAPH_CHUNKID_EXTRAEDGES, graph_read_extra_edges, graph);
     -+	pair_chunk(cf, GRAPH_CHUNKID_BASE, graph_read_base_graphs, graph);
     ++	pair_chunk(cf, GRAPH_CHUNKID_OIDFANOUT,
     ++		   (const unsigned char **)&graph->chunk_oid_fanout);
     ++	read_chunk(cf, GRAPH_CHUNKID_OIDLOOKUP, graph_read_oid_lookup, graph);
     ++	pair_chunk(cf, GRAPH_CHUNKID_DATA, &graph->chunk_commit_data);
     ++	pair_chunk(cf, GRAPH_CHUNKID_EXTRAEDGES, &graph->chunk_extra_edges);
     ++	pair_chunk(cf, GRAPH_CHUNKID_BASE, &graph->chunk_base_graphs);
      +	pair_chunk(cf, GRAPH_CHUNKID_GENERATION_DATA,
     -+		   graph_read_generation_data, graph);
     ++		   &graph->chunk_generation_data);
      +	pair_chunk(cf, GRAPH_CHUNKID_GENERATION_DATA_OVERFLOW,
     -+		   graph_read_generation_overflow, graph);
     ++		   &graph->chunk_generation_data_overflow);
       
      -		if (chunk_repeated) {
      -			error(_("commit-graph chunk id %08x appears multiple times"), chunk_id);
     @@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repository *r,
      -		}
      +	if (r->settings.commit_graph_read_changed_paths) {
      +		pair_chunk(cf, GRAPH_CHUNKID_BLOOMINDEXES,
     -+			   graph_read_bloom_indices, graph);
     -+		pair_chunk(cf, GRAPH_CHUNKID_BLOOMDATA,
     ++			   &graph->chunk_bloom_indexes);
     ++		read_chunk(cf, GRAPH_CHUNKID_BLOOMDATA,
      +			   graph_read_bloom_data, graph);
       	}
       
 14:  cb145e0e32a ! 14:  750c03253c9 midx: use chunk-format read API
     @@ midx.c: static char *get_midx_filename(const char *object_dir)
       	return xstrfmt("%s/pack/multi-pack-index", object_dir);
       }
       
     -+static int midx_read_pack_names(const unsigned char *chunk_start,
     -+				size_t chunk_size, void *data)
     -+{
     -+	struct multi_pack_index *m = (struct multi_pack_index *)data;
     -+	m->chunk_pack_names = chunk_start;
     -+	return 0;
     -+}
     -+
      +static int midx_read_oid_fanout(const unsigned char *chunk_start,
      +				size_t chunk_size, void *data)
      +{
     -+	struct multi_pack_index *m = (struct multi_pack_index *)data;
     ++	struct multi_pack_index *m = data;
      +	m->chunk_oid_fanout = (uint32_t *)chunk_start;
      +
      +	if (chunk_size != 4 * 256) {
     @@ midx.c: static char *get_midx_filename(const char *object_dir)
      +	}
      +	return 0;
      +}
     -+
     -+static int midx_read_oid_lookup(const unsigned char *chunk_start,
     -+				size_t chunk_size, void *data)
     -+{
     -+	struct multi_pack_index *m = (struct multi_pack_index *)data;
     -+	m->chunk_oid_lookup = chunk_start;
     -+	return 0;
     -+}
     -+
     -+static int midx_read_offsets(const unsigned char *chunk_start,
     -+			     size_t chunk_size, void *data)
     -+{
     -+	struct multi_pack_index *m = (struct multi_pack_index *)data;
     -+	m->chunk_object_offsets = chunk_start;
     -+	return 0;
     -+}
     -+
     -+static int midx_read_large_offsets(const unsigned char *chunk_start,
     -+				   size_t chunk_size, void *data)
     -+{
     -+	struct multi_pack_index *m = (struct multi_pack_index *)data;
     -+	m->chunk_large_offsets = chunk_start;
     -+	return 0;
     -+}
      +
       struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local)
       {
     @@ midx.c: struct multi_pack_index *load_multi_pack_index(const char *object_dir, i
      +				   MIDX_HEADER_SIZE, m->num_chunks))
      +		goto cleanup_fail;
      +
     -+	if (pair_chunk(cf, MIDX_CHUNKID_PACKNAMES, midx_read_pack_names, m) == CHUNK_NOT_FOUND)
     ++	if (pair_chunk(cf, MIDX_CHUNKID_PACKNAMES, &m->chunk_pack_names) == CHUNK_NOT_FOUND)
       		die(_("multi-pack-index missing required pack-name chunk"));
      -	if (!m->chunk_oid_fanout)
     -+	if (pair_chunk(cf, MIDX_CHUNKID_OIDFANOUT, midx_read_oid_fanout, m) == CHUNK_NOT_FOUND)
     ++	if (read_chunk(cf, MIDX_CHUNKID_OIDFANOUT, midx_read_oid_fanout, m) == CHUNK_NOT_FOUND)
       		die(_("multi-pack-index missing required OID fanout chunk"));
      -	if (!m->chunk_oid_lookup)
     -+	if (pair_chunk(cf, MIDX_CHUNKID_OIDLOOKUP, midx_read_oid_lookup, m) == CHUNK_NOT_FOUND)
     ++	if (pair_chunk(cf, MIDX_CHUNKID_OIDLOOKUP, &m->chunk_oid_lookup) == CHUNK_NOT_FOUND)
       		die(_("multi-pack-index missing required OID lookup chunk"));
      -	if (!m->chunk_object_offsets)
     -+	if (pair_chunk(cf, MIDX_CHUNKID_OBJECTOFFSETS, midx_read_offsets, m) == CHUNK_NOT_FOUND)
     ++	if (pair_chunk(cf, MIDX_CHUNKID_OBJECTOFFSETS, &m->chunk_object_offsets) == CHUNK_NOT_FOUND)
       		die(_("multi-pack-index missing required object offsets chunk"));
       
     -+	pair_chunk(cf, MIDX_CHUNKID_LARGEOFFSETS, midx_read_large_offsets, m);
     ++	pair_chunk(cf, MIDX_CHUNKID_LARGEOFFSETS, &m->chunk_large_offsets);
      +
       	m->num_objects = ntohl(m->chunk_oid_fanout[255]);
       
 15:  f6c58ff72d2 = 15:  83d292532a0 midx: use 64-bit multiplication for chunk sizes
 16:  62a23842aa6 = 16:  669eeec707a chunk-format: restore duplicate chunk checks
 17:  05cbd0a8d93 = 17:  8f3985ab5df chunk-format: add technical docs

-- 
gitgitgadget

next prev parent reply	other threads:[~2021-01-27 15:08 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-26 16:01 [PATCH 00/17] Refactor chunk-format into an API Derrick Stolee via GitGitGadget
2021-01-26 16:01 ` [PATCH 01/17] commit-graph: anonymize data in chunk_write_fn Derrick Stolee via GitGitGadget
2021-01-27  1:53   ` Chris Torek
2021-01-27  2:36     ` Taylor Blau
2021-01-26 16:01 ` [PATCH 02/17] chunk-format: create chunk format write API Derrick Stolee via GitGitGadget
2021-01-27  2:42   ` Taylor Blau
2021-01-27 13:49     ` Derrick Stolee
2021-01-26 16:01 ` [PATCH 03/17] commit-graph: use chunk-format " Derrick Stolee via GitGitGadget
2021-01-27  2:47   ` Taylor Blau
2021-01-26 16:01 ` [PATCH 04/17] midx: rename pack_info to write_midx_context Derrick Stolee via GitGitGadget
2021-01-27  2:49   ` Taylor Blau
2021-01-26 16:01 ` [PATCH 05/17] midx: use context in write_midx_pack_names() Derrick Stolee via GitGitGadget
2021-01-26 16:01 ` [PATCH 06/17] midx: add entries to write_midx_context Derrick Stolee via GitGitGadget
2021-01-26 16:01 ` [PATCH 07/17] midx: add pack_perm " Derrick Stolee via GitGitGadget
2021-01-26 16:01 ` [PATCH 08/17] midx: add num_large_offsets " Derrick Stolee via GitGitGadget
2021-01-26 16:01 ` [PATCH 09/17] midx: return success/failure in chunk write methods Derrick Stolee via GitGitGadget
2021-01-26 16:01 ` [PATCH 10/17] midx: drop chunk progress during write Derrick Stolee via GitGitGadget
2021-01-26 16:01 ` [PATCH 11/17] midx: use chunk-format API in write_midx_internal() Derrick Stolee via GitGitGadget
2021-01-26 16:01 ` [PATCH 12/17] chunk-format: create read chunk API Derrick Stolee via GitGitGadget
2021-01-27  3:02   ` Taylor Blau
2021-01-26 16:01 ` [PATCH 13/17] commit-graph: use chunk-format read API Derrick Stolee via GitGitGadget
2021-01-26 16:01 ` [PATCH 14/17] midx: " Derrick Stolee via GitGitGadget
2021-01-27  3:06   ` Taylor Blau
2021-01-27 13:50     ` Derrick Stolee
2021-01-26 16:01 ` [PATCH 15/17] midx: use 64-bit multiplication for chunk sizes Derrick Stolee via GitGitGadget
2021-01-26 16:01 ` [PATCH 16/17] chunk-format: restore duplicate chunk checks Derrick Stolee via GitGitGadget
2021-01-26 16:01 ` [PATCH 17/17] chunk-format: add technical docs Derrick Stolee via GitGitGadget
2021-01-26 22:37 ` [PATCH 00/17] Refactor chunk-format into an API Junio C Hamano
2021-01-27  2:29 ` Taylor Blau
2021-01-27 15:01 ` Derrick Stolee via GitGitGadget [this message]
2021-01-27 15:01   ` [PATCH v2 01/17] commit-graph: anonymize data in chunk_write_fn Derrick Stolee via GitGitGadget
2021-01-27 15:01   ` [PATCH v2 02/17] chunk-format: create chunk format write API Derrick Stolee via GitGitGadget
2021-02-04 21:24     ` Junio C Hamano
2021-02-04 22:40       ` Junio C Hamano
2021-02-05 11:37       ` Derrick Stolee
2021-02-05 19:25         ` Junio C Hamano
2021-01-27 15:01   ` [PATCH v2 03/17] commit-graph: use chunk-format " Derrick Stolee via GitGitGadget
2021-01-27 15:01   ` [PATCH v2 04/17] midx: rename pack_info to write_midx_context Derrick Stolee via GitGitGadget
2021-01-27 15:01   ` [PATCH v2 05/17] midx: use context in write_midx_pack_names() Derrick Stolee via GitGitGadget
2021-01-27 15:01   ` [PATCH v2 06/17] midx: add entries to write_midx_context Derrick Stolee via GitGitGadget
2021-01-27 15:01   ` [PATCH v2 07/17] midx: add pack_perm " Derrick Stolee via GitGitGadget
2021-01-27 15:01   ` [PATCH v2 08/17] midx: add num_large_offsets " Derrick Stolee via GitGitGadget
2021-01-27 15:01   ` [PATCH v2 09/17] midx: return success/failure in chunk write methods Derrick Stolee via GitGitGadget
2021-02-04 22:59     ` Junio C Hamano
2021-02-05 11:42       ` Derrick Stolee
2021-01-27 15:01   ` [PATCH v2 10/17] midx: drop chunk progress during write Derrick Stolee via GitGitGadget
2021-01-27 15:01   ` [PATCH v2 11/17] midx: use chunk-format API in write_midx_internal() Derrick Stolee via GitGitGadget
2021-01-27 15:01   ` [PATCH v2 12/17] chunk-format: create read chunk API Derrick Stolee via GitGitGadget
2021-02-04 23:40     ` Junio C Hamano
2021-02-05 12:19       ` Derrick Stolee
2021-02-05 19:37         ` Junio C Hamano
2021-02-08 22:26           ` Junio C Hamano
2021-02-09  1:33             ` Derrick Stolee
2021-02-09 20:47               ` Junio C Hamano
2021-01-27 15:01   ` [PATCH v2 13/17] commit-graph: use chunk-format read API Derrick Stolee via GitGitGadget
2021-01-27 15:01   ` [PATCH v2 14/17] midx: " Derrick Stolee via GitGitGadget
2021-01-27 15:01   ` [PATCH v2 15/17] midx: use 64-bit multiplication for chunk sizes Derrick Stolee via GitGitGadget
2021-02-05  0:00     ` Junio C Hamano
2021-02-05 10:59       ` Chris Torek
2021-02-05 20:41         ` Junio C Hamano
2021-02-06 20:35           ` Chris Torek
2021-02-05 12:30       ` Derrick Stolee
2021-02-05 19:42         ` Junio C Hamano
2021-02-07 19:50       ` SZEDER Gábor
2021-02-08  5:41         ` Junio C Hamano
2021-01-27 15:01   ` [PATCH v2 16/17] chunk-format: restore duplicate chunk checks Derrick Stolee via GitGitGadget
2021-02-05  0:05     ` Junio C Hamano
2021-02-05 12:31       ` Derrick Stolee
2021-01-27 15:01   ` [PATCH v2 17/17] chunk-format: add technical docs Derrick Stolee via GitGitGadget
2021-02-05  0:15     ` Junio C Hamano
2021-01-27 16:03   ` [PATCH v2 00/17] Refactor chunk-format into an API Taylor Blau
2021-02-05  2:08   ` Junio C Hamano
2021-02-05  2:27     ` Derrick Stolee
2021-02-05 14:30   ` [PATCH v3 " Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 01/17] commit-graph: anonymize data in chunk_write_fn Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 02/17] chunk-format: create chunk format write API Derrick Stolee via GitGitGadget
2021-02-07 21:13       ` SZEDER Gábor
2021-02-08 13:44         ` Derrick Stolee
2021-02-11 19:43           ` SZEDER Gábor
2021-02-05 14:30     ` [PATCH v3 03/17] commit-graph: use chunk-format " Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 04/17] midx: rename pack_info to write_midx_context Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 05/17] midx: use context in write_midx_pack_names() Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 06/17] midx: add entries to write_midx_context Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 07/17] midx: add pack_perm " Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 08/17] midx: add num_large_offsets " Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 09/17] midx: return success/failure in chunk write methods Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 10/17] midx: drop chunk progress during write Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 11/17] midx: use chunk-format API in write_midx_internal() Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 12/17] chunk-format: create read chunk API Derrick Stolee via GitGitGadget
2021-02-07 20:20       ` SZEDER Gábor
2021-02-08 13:35         ` Derrick Stolee
2021-02-05 14:30     ` [PATCH v3 13/17] commit-graph: use chunk-format read API Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 14/17] midx: " Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 15/17] midx: use 64-bit multiplication for chunk sizes Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 16/17] chunk-format: restore duplicate chunk checks Derrick Stolee via GitGitGadget
2021-02-05 14:30     ` [PATCH v3 17/17] chunk-format: add technical docs Derrick Stolee via GitGitGadget
2021-02-18 14:07     ` [PATCH v4 00/17] Refactor chunk-format into an API Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 01/17] commit-graph: anonymize data in chunk_write_fn Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 02/17] chunk-format: create chunk format write API Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 03/17] commit-graph: use chunk-format " Derrick Stolee via GitGitGadget
2021-02-24 16:52         ` SZEDER Gábor
2021-02-24 17:12           ` Taylor Blau
2021-02-24 17:52             ` Derrick Stolee
2021-02-24 19:44               ` Junio C Hamano
2021-02-18 14:07       ` [PATCH v4 04/17] midx: rename pack_info to write_midx_context Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 05/17] midx: use context in write_midx_pack_names() Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 06/17] midx: add entries to write_midx_context Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 07/17] midx: add pack_perm " Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 08/17] midx: add num_large_offsets " Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 09/17] midx: return success/failure in chunk write methods Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 10/17] midx: drop chunk progress during write Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 11/17] midx: use chunk-format API in write_midx_internal() Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 12/17] chunk-format: create read chunk API Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 13/17] commit-graph: use chunk-format read API Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 14/17] midx: " Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 15/17] midx: use 64-bit multiplication for chunk sizes Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 16/17] chunk-format: restore duplicate chunk checks Derrick Stolee via GitGitGadget
2021-02-18 14:07       ` [PATCH v4 17/17] chunk-format: add technical docs Derrick Stolee via GitGitGadget
2021-02-18 21:47         ` Junio C Hamano
2021-02-19 12:42           ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.848.v2.git.1611759716.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=chris.torek@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=l.s.r@web.de \
    --cc=me@ttaylorr.com \
    --cc=stolee@gmail.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.