git.vger.kernel.org archive mirror
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>
Subject: [PATCH 1/7] chunk-format: introduce `pair_chunk_expect()` helper
Date: Thu, 9 Nov 2023 17:34:11 -0500
Message-ID: <af5fe3b7237caeba8f970e967933db96c83a230e.1699569246.git.me@ttaylorr.com>
In-Reply-To: <cover.1699569246.git.me@ttaylorr.com>

In 570b8b8836 (chunk-format: note that pair_chunk() is unsafe,
2023-10-09), the pair_chunk() interface grew a required "size" pointer,
so that the caller is forced to at least have a handle on the actual
size of the given chunk.

Many callers were converted to the new interface. A handful were instead
converted from the unsafe version of pair_chunk() to read_chunk() so
that they could check their expected size.

This led to a lot of code like:

    static int graph_read_oid_lookup(const unsigned char *chunk_start,
                                     size_t chunk_size, void *data)
    {
      struct commit_graph *g = data;
      g->chunk_oid_lookup = chunk_start;
      if (chunk_size / g->hash_len != g->num_commits)
        return error(_("commit-graph OID lookup chunk is the wrong size"));
      return 0;
    }

That leaves the caller to invoke it through read_chunk(), like so:

    read_chunk(cf, GRAPH_CHUNKID_OIDLOOKUP, graph_read_oid_lookup, graph);

The callback to read_chunk() (in the above, `graph_read_oid_lookup()`)
does nothing more than (a) assign a pointer to the location of the start
of the chunk in the mmap'd file, and (b) assert that it has the correct
size.

For callers that know the expected size of their chunk(s) up-front (most
often because they are made up of a known number of fixed-size records),
we can simplify this by teaching the chunk-format API itself to validate
the expected size for us.

This is wrapped in a new function, `pair_chunk_expect()`, which takes a
pair of "size_t"s (the record size and the record count) instead of a
"size_t *", and validates that the chunk's size matches what those two
values imply.

This will allow us to reduce the number of lines of code it takes to
perform these basic read_chunk() operations, by replacing the callback
and its read_chunk() call above with something like:

    if (pair_chunk_expect(cf, GRAPH_CHUNKID_OIDLOOKUP,
                          &graph->chunk_oid_lookup,
                          graph->hash_len, graph->num_commits))
      error(_("commit-graph OID lookup chunk is the wrong size"));

We will perform those transformations in the following commits.
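
As a standalone illustration of the helper described above, here is a
small model that compiles on its own. It is only a sketch: every
`toy_`-prefixed name is invented for the example and is not part of
chunk-format.c, the table-of-contents lookup is reduced to a linear scan,
and the "chunk not found" return value is an assumption made for this
model.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real chunk-format types. */
struct toy_chunk {
	uint32_t id;
	const unsigned char *start;
	size_t size;
};

struct toy_chunkfile {
	const struct toy_chunk *chunks;
	size_t nr;
};

typedef int (*toy_read_fn)(const unsigned char *chunk_start,
			   size_t chunk_size, void *data);

/* Assumed "no such chunk" return value, for this model only. */
#define TOY_CHUNK_NOT_FOUND (-2)

/* Look the chunk up by ID and pass its start and size to the callback. */
static int toy_read_chunk(const struct toy_chunkfile *cf, uint32_t id,
			  toy_read_fn fn, void *data)
{
	size_t i;

	for (i = 0; i < cf->nr; i++)
		if (cf->chunks[i].id == id)
			return fn(cf->chunks[i].start, cf->chunks[i].size, data);
	return TOY_CHUNK_NOT_FOUND;
}

struct toy_pair_data {
	const unsigned char **p;
	size_t record_size;
	size_t record_nr;
};

/* Same comparison as in the patch: reject a mismatch, then assign. */
static int toy_pair_expect_fn(const unsigned char *chunk_start,
			      size_t chunk_size, void *data)
{
	struct toy_pair_data *pcd = data;

	if (chunk_size / pcd->record_size != pcd->record_nr)
		return -1;
	*pcd->p = chunk_start;
	return 0;
}

/* The wrapper: bundle the expectations and reuse the generic reader. */
static int toy_pair_chunk_expect(const struct toy_chunkfile *cf, uint32_t id,
				 const unsigned char **p,
				 size_t record_size, size_t record_nr)
{
	struct toy_pair_data pcd = {
		.p = p,
		.record_size = record_size,
		.record_nr = record_nr,
	};
	return toy_read_chunk(cf, id, toy_pair_expect_fn, &pcd);
}
```

With a single 8-byte chunk in the toy file, asking for two 4-byte records
succeeds and sets the pointer, while asking for three returns -1 and
leaves the pointer untouched.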

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
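
A note on the arithmetic, as a standalone sketch rather than part of this
patch: the check in pair_chunk_expect_fn() below compares a quotient, and
integer division truncates, so a chunk whose size is not an exact
multiple of the record size can still compare equal. It also relies on
the record size being non-zero. The two function names here are
hypothetical, and the multiplication-based variant is just one possible
stricter form, shown for comparison.

```c
#include <stddef.h>
#include <stdint.h>

/* The quotient comparison used by the patch; record_size must be non-zero. */
static int size_matches_div(size_t chunk_size, size_t record_size,
			    size_t record_nr)
{
	return chunk_size / record_size == record_nr;
}

/* Hypothetical stricter form: exact product, guarded against overflow. */
static int size_matches_exact(size_t chunk_size, size_t record_size,
			      size_t record_nr)
{
	if (record_size && record_nr > SIZE_MAX / record_size)
		return 0; /* record_size * record_nr would overflow size_t */
	return chunk_size == record_size * record_nr;
}
```

For a 10-byte chunk with 4-byte records and an expected count of 2, the
quotient form accepts (10 / 4 truncates to 2) while the product form
rejects (4 * 2 is 8, not 10). For an 8-byte chunk both agree.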
 chunk-format.c | 29 +++++++++++++++++++++++++++++
 chunk-format.h | 13 ++++++++++++-
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/chunk-format.c b/chunk-format.c
index cdc7f39b70..be078dcca8 100644
--- a/chunk-format.c
+++ b/chunk-format.c
@@ -163,6 +163,10 @@ int read_table_of_contents(struct chunkfile *cf,
 struct pair_chunk_data {
 	const unsigned char **p;
 	size_t *size;
+
+	/* for pair_chunk_expect() only */
+	size_t record_size;
+	size_t record_nr;
 };
 
 static int pair_chunk_fn(const unsigned char *chunk_start,
@@ -175,6 +179,17 @@ static int pair_chunk_fn(const unsigned char *chunk_start,
 	return 0;
 }
 
+static int pair_chunk_expect_fn(const unsigned char *chunk_start,
+				size_t chunk_size,
+				void *data)
+{
+	struct pair_chunk_data *pcd = data;
+	if (chunk_size / pcd->record_size != pcd->record_nr)
+		return -1;
+	*pcd->p = chunk_start;
+	return 0;
+}
+
 int pair_chunk(struct chunkfile *cf,
 	       uint32_t chunk_id,
 	       const unsigned char **p,
@@ -184,6 +199,20 @@ int pair_chunk(struct chunkfile *cf,
 	return read_chunk(cf, chunk_id, pair_chunk_fn, &pcd);
 }
 
+int pair_chunk_expect(struct chunkfile *cf,
+		      uint32_t chunk_id,
+		      const unsigned char **p,
+		      size_t record_size,
+		      size_t record_nr)
+{
+	struct pair_chunk_data pcd = {
+		.p = p,
+		.record_size = record_size,
+		.record_nr = record_nr,
+	};
+	return read_chunk(cf, chunk_id, pair_chunk_expect_fn, &pcd);
+}
+
 int read_chunk(struct chunkfile *cf,
 	       uint32_t chunk_id,
 	       chunk_read_fn fn,
diff --git a/chunk-format.h b/chunk-format.h
index 14b76180ef..10806d7a9a 100644
--- a/chunk-format.h
+++ b/chunk-format.h
@@ -17,7 +17,8 @@ struct chunkfile;
  *
  * If reading a file, use a NULL 'struct hashfile *' and then call
  * read_table_of_contents(). Supply the memory-mapped data to the
- * pair_chunk() or read_chunk() methods, as appropriate.
+ * pair_chunk(), pair_chunk_expect(), or read_chunk() methods, as
+ * appropriate.
  *
  * DO NOT MIX THESE MODES. Use different 'struct chunkfile' instances
  * for reading and writing.
@@ -54,6 +55,16 @@ int pair_chunk(struct chunkfile *cf,
 	       const unsigned char **p,
 	       size_t *size);
 
+/*
+ * Similar to 'pair_chunk', but used for callers who are reading a chunk
+ * with a known number of fixed-width records.
+ */
+int pair_chunk_expect(struct chunkfile *cf,
+		      uint32_t chunk_id,
+		      const unsigned char **p,
+		      size_t record_size,
+		      size_t record_nr);
+
 typedef int (*chunk_read_fn)(const unsigned char *chunk_start,
 			     size_t chunk_size, void *data);
 /*
-- 
2.43.0.rc0.39.g44bd344727

