[PATCH 00/16] odb: introduce "inmemory" source

* [PATCH 00/16] odb: introduce "inmemory" source
@ 2026-04-03  6:01 Patrick Steinhardt
  2026-04-03  6:01 ` [PATCH 01/16] " Patrick Steinhardt
                   ` (18 more replies)
  0 siblings, 19 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
  To: git

Hi,

this patch series introduces the second object database source type,
which is the "inmemory" source.

This source may seem somewhat odd at first: it always starts out empty,
and any object written into it will only exist in memory until the
process exits. But the source already serves a purpose in our codebase,
where some commands, for example git-blame(1), write an in-memory
worktree commit.

Furthermore, I think that going forward it can serve more purposes as we
now have an easy way to write and read objects that will not get
persisted. I could see that this may be useful when for example
re-merging diffs. But eventually, once we have the object storage format
extension wired up, callers might even want to manually set up an
in-memory database as the primary ODB for write operations so that no
data will be persisted in an arbitrary write.

Last but not least, this patch series also serves the purpose of
eventually getting rid of the `struct object_info::whence` member.
Instead, we'll simply yield the ODB source a specific object has been
read from, together with some backend-specific data, which gives
strictly more information compared to the status quo.

The series is based on cf2139f8e1 (The 24th batch, 2026-04-01) with
ps/odb-cleanup at 109bcb7d1d (odb: drop unneeded headers and forward
decls, 2026-04-01) merged into it.

Thanks!

Patrick

---
Patrick Steinhardt (16):
      odb: introduce "inmemory" source
      odb/source-inmemory: implement `free()` callback
      odb: fix unnecessary call to `find_cached_object()`
      odb/source-inmemory: implement `read_object_info()` callback
      odb/source-inmemory: implement `read_object_stream()` callback
      odb/source-inmemory: implement `write_object()` callback
      odb/source-inmemory: implement `write_object_stream()` callback
      cbtree: allow using arbitrary wrapper structures for nodes
      oidtree: add ability to store data
      odb/source-inmemory: convert to use oidtree
      odb/source-inmemory: implement `for_each_object()` callback
      odb/source-inmemory: implement `find_abbrev_len()` callback
      odb/source-inmemory: implement `count_objects()` callback
      odb/source-inmemory: implement `freshen_object()` callback
      odb/source-inmemory: stub out remaining functions
      odb: generic inmemory source

 Makefile                 |   1 +
 cbtree.c                 |  25 +++-
 cbtree.h                 |  11 +-
 loose.c                  |   2 +-
 meson.build              |   1 +
 object-file.c            |   3 +-
 odb.c                    |  82 ++---------
 odb.h                    |   4 +-
 odb/source-inmemory.c    | 375 +++++++++++++++++++++++++++++++++++++++++++++++
 odb/source-inmemory.h    |  33 +++++
 odb/source.h             |   3 +
 oidtree.c                |  66 ++++++---
 oidtree.h                |  12 +-
 t/unit-tests/u-oidtree.c |  26 +++-
 14 files changed, 529 insertions(+), 115 deletions(-)

---
base-commit: 3d05c3e2906489caa9f12f0af18dc233a6b8032c
change-id: 20260401-b4-pks-odb-source-inmemory-7b17c83d9e43

* [PATCH 01/16] odb: introduce "inmemory" source
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-08 21:00   ` Justin Tobler
  2026-04-03  6:01 ` [PATCH 02/16] odb/source-inmemory: implement `free()` callback Patrick Steinhardt
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
  To: git

Next to our typical object database sources, each object database also
has an implicit source of "cached" objects. These cached objects only
exist in memory and serve a couple of use cases:

  - They contain evergreen objects that we expect to always exist, like
    for example the empty tree.

  - They can be used to store temporary objects that we don't want to
    persist to disk.

Overall, their use is somewhat restricted though. For example, we don't
provide the ability to use them as a temporary object database source
that allows the user to write objects, but discard them after Git exits.
So while these cached objects behave almost like a source, they aren't
used as one.

This is about to change over the following commits, where we will turn
cached objects into a new "inmemory" source. This will allow us to use
it exactly the same as any other source by providing the same common
interface as the "files" source.

For now, the inmemory source only hosts the cached objects and doesn't
provide any logic yet. This will change with subsequent commits, where
we move respective functionality into the source.
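
To illustrate the layout described above, here is a minimal standalone
sketch of a backend struct that embeds a generic source and is recovered
from it via a checked downcast. All names (`struct source`,
`source_inmemory_new()`, and so on) are hypothetical stand-ins and not
the identifiers used in Git, and `assert()` takes the place of `BUG()`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the generic source and its type tag. */
enum source_type { SOURCE_FILES, SOURCE_INMEMORY };

struct source {
	enum source_type type;
};

/* Backend-specific struct that embeds the generic source as `base`. */
struct source_inmemory {
	struct source base;
	size_t objects_nr;
};

/* Recover the wrapper from a pointer to its embedded `base` member. */
static struct source_inmemory *source_inmemory_downcast(struct source *source)
{
	/* The sketch aborts on a type mismatch instead of calling BUG(). */
	assert(source->type == SOURCE_INMEMORY);
	return (struct source_inmemory *)((char *)source -
					  offsetof(struct source_inmemory, base));
}

/* Allocate a zeroed in-memory source and tag it with its type. */
static struct source_inmemory *source_inmemory_new(void)
{
	struct source_inmemory *source = calloc(1, sizeof(*source));
	if (!source)
		abort();
	source->base.type = SOURCE_INMEMORY;
	return source;
}
```

Generic code would only ever pass `&source->base` around, and the
backend would get its own state back through the downcast.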

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Makefile              |  1 +
 meson.build           |  1 +
 odb.c                 | 21 +++++++++++++--------
 odb.h                 |  4 ++--
 odb/source-inmemory.c | 12 ++++++++++++
 odb/source-inmemory.h | 35 +++++++++++++++++++++++++++++++++++
 odb/source.h          |  3 +++
 7 files changed, 67 insertions(+), 10 deletions(-)

diff --git a/Makefile b/Makefile
index dbf0022054..175391e6f8 100644
--- a/Makefile
+++ b/Makefile
@@ -1218,6 +1218,7 @@ LIB_OBJS += object.o
 LIB_OBJS += odb.o
 LIB_OBJS += odb/source.o
 LIB_OBJS += odb/source-files.o
+LIB_OBJS += odb/source-inmemory.o
 LIB_OBJS += odb/streaming.o
 LIB_OBJS += oid-array.o
 LIB_OBJS += oidmap.o
diff --git a/meson.build b/meson.build
index 8309942d18..8f55d2650e 100644
--- a/meson.build
+++ b/meson.build
@@ -404,6 +404,7 @@ libgit_sources = [
   'odb.c',
   'odb/source.c',
   'odb/source-files.c',
+  'odb/source-inmemory.c',
   'odb/streaming.c',
   'oid-array.c',
   'oidmap.c',
diff --git a/odb.c b/odb.c
index 9b28fe25ef..95b21e2cfd 100644
--- a/odb.c
+++ b/odb.c
@@ -14,6 +14,7 @@
 #include "object-file.h"
 #include "object-name.h"
 #include "odb.h"
+#include "odb/source-inmemory.h"
 #include "packfile.h"
 #include "path.h"
 #include "promisor-remote.h"
@@ -53,9 +54,9 @@ static const struct cached_object *find_cached_object(struct object_database *ob
 		.type = OBJ_TREE,
 		.buf = "",
 	};
-	const struct cached_object_entry *co = object_store->cached_objects;
+	const struct cached_object_entry *co = object_store->inmemory_objects->objects;
 
-	for (size_t i = 0; i < object_store->cached_object_nr; i++, co++)
+	for (size_t i = 0; i < object_store->inmemory_objects->objects_nr; i++, co++)
 		if (oideq(&co->oid, oid))
 			return &co->value;
 
@@ -792,9 +793,10 @@ int odb_pretend_object(struct object_database *odb,
 	    find_cached_object(odb, oid))
 		return 0;
 
-	ALLOC_GROW(odb->cached_objects,
-		   odb->cached_object_nr + 1, odb->cached_object_alloc);
-	co = &odb->cached_objects[odb->cached_object_nr++];
+	ALLOC_GROW(odb->inmemory_objects->objects,
+		   odb->inmemory_objects->objects_nr + 1,
+		   odb->inmemory_objects->objects_alloc);
+	co = &odb->inmemory_objects->objects[odb->inmemory_objects->objects_nr++];
 	co->value.size = len;
 	co->value.type = type;
 	co_buf = xmalloc(len);
@@ -1083,6 +1085,7 @@ struct object_database *odb_new(struct repository *repo,
 	o->sources = odb_source_new(o, primary_source, true);
 	o->sources_tail = &o->sources->next;
 	o->alternate_db = xstrdup_or_null(secondary_sources);
+	o->inmemory_objects = odb_source_inmemory_new(o);
 
 	free(to_free);
 
@@ -1123,9 +1126,11 @@ void odb_free(struct object_database *o)
 	odb_close(o);
 	odb_free_sources(o);
 
-	for (size_t i = 0; i < o->cached_object_nr; i++)
-		free((char *) o->cached_objects[i].value.buf);
-	free(o->cached_objects);
+	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
+		free((char *) o->inmemory_objects->objects[i].value.buf);
+	free(o->inmemory_objects->objects);
+	free(o->inmemory_objects->base.path);
+	free(o->inmemory_objects);
 
 	string_list_clear(&o->submodule_source_paths, 0);
 
diff --git a/odb.h b/odb.h
index 3a711f6547..3d20270a05 100644
--- a/odb.h
+++ b/odb.h
@@ -8,6 +8,7 @@
 #include "thread-utils.h"
 
 struct cached_object_entry;
+struct odb_source_inmemory;
 struct packed_git;
 struct repository;
 struct strbuf;
@@ -98,8 +99,7 @@ struct object_database {
 	 * to write them into the object store (e.g. a browse-only
 	 * application).
 	 */
-	struct cached_object_entry *cached_objects;
-	size_t cached_object_nr, cached_object_alloc;
+	struct odb_source_inmemory *inmemory_objects;
 
 	/*
 	 * A fast, rough count of the number of objects in the repository.
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
new file mode 100644
index 0000000000..c7ac5c24f0
--- /dev/null
+++ b/odb/source-inmemory.c
@@ -0,0 +1,12 @@
+#include "git-compat-util.h"
+#include "odb/source-inmemory.h"
+
+struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
+{
+	struct odb_source_inmemory *source;
+
+	CALLOC_ARRAY(source, 1);
+	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
+
+	return source;
+}
diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
new file mode 100644
index 0000000000..95477bf36d
--- /dev/null
+++ b/odb/source-inmemory.h
@@ -0,0 +1,35 @@
+#ifndef ODB_SOURCE_INMEMORY_H
+#define ODB_SOURCE_INMEMORY_H
+
+#include "odb/source.h"
+
+struct cached_object_entry;
+
+/*
+ * An inmemory source that you can write objects to that shall be made
+ * available for reading, but that shouldn't ever be persisted to disk. Note
+ * that any objects written to this source will be stored in memory, so the
+ * number of objects you can store is limited by available system memory.
+ */
+struct odb_source_inmemory {
+	struct odb_source base;
+
+	struct cached_object_entry *objects;
+	size_t objects_nr, objects_alloc;
+};
+
+/* Create a new in-memory object database source. */
+struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb);
+
+/*
+ * Cast the given object database source to the inmemory backend. This will
+ * cause a BUG in case the source doesn't use this backend.
+ */
+static inline struct odb_source_inmemory *odb_source_inmemory_downcast(struct odb_source *source)
+{
+	if (source->type != ODB_SOURCE_INMEMORY)
+		BUG("trying to downcast source of type '%d' to inmemory", source->type);
+	return container_of(source, struct odb_source_inmemory, base);
+}
+
+#endif
diff --git a/odb/source.h b/odb/source.h
index f706e0608a..cd14f9e046 100644
--- a/odb/source.h
+++ b/odb/source.h
@@ -13,6 +13,9 @@ enum odb_source_type {
 
 	/* The "files" backend that uses loose objects and packfiles. */
 	ODB_SOURCE_FILES,
+
+	/* The "inmemory" backend that stores objects in memory. */
+	ODB_SOURCE_INMEMORY,
 };
 
 struct object_id;

-- 
2.53.0.1323.g189a785ab5.dirty


* [PATCH 02/16] odb/source-inmemory: implement `free()` callback
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
  2026-04-03  6:01 ` [PATCH 01/16] " Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-08 21:05   ` Justin Tobler
  2026-04-03  6:01 ` [PATCH 03/16] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
  To: git

Implement the `free()` callback function for the "inmemory" source.

Note that this requires us to define `struct cached_object_entry` in
"odb/source-inmemory.h", as it is accessed in both "odb.c" and
"odb/source-inmemory.c" now. This will be fixed in subsequent commits
though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c                 | 25 ++++---------------------
 odb/source-inmemory.c | 12 ++++++++++++
 odb/source-inmemory.h |  9 ++++++++-
 3 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/odb.c b/odb.c
index 95b21e2cfd..d321242353 100644
--- a/odb.c
+++ b/odb.c
@@ -32,21 +32,6 @@
 KHASH_INIT(odb_path_map, const char * /* key: odb_path */,
 	struct odb_source *, 1, fspathhash, fspatheq)
 
-/*
- * This is meant to hold a *small* number of objects that you would
- * want odb_read_object() to be able to return, but yet you do not want
- * to write them into the object store (e.g. a browse-only
- * application).
- */
-struct cached_object_entry {
-	struct object_id oid;
-	struct cached_object {
-		enum object_type type;
-		const void *buf;
-		unsigned long size;
-	} value;
-};
-
 static const struct cached_object *find_cached_object(struct object_database *object_store,
 						      const struct object_id *oid)
 {
@@ -1109,6 +1094,10 @@ static void odb_free_sources(struct object_database *o)
 		odb_source_free(o->sources);
 		o->sources = next;
 	}
+
+	odb_source_free(&o->inmemory_objects->base);
+	o->inmemory_objects = NULL;
+
 	kh_destroy_odb_path_map(o->source_by_path);
 	o->source_by_path = NULL;
 }
@@ -1126,12 +1115,6 @@ void odb_free(struct object_database *o)
 	odb_close(o);
 	odb_free_sources(o);
 
-	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
-		free((char *) o->inmemory_objects->objects[i].value.buf);
-	free(o->inmemory_objects->objects);
-	free(o->inmemory_objects->base.path);
-	free(o->inmemory_objects);
-
 	string_list_clear(&o->submodule_source_paths, 0);
 
 	free(o);
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index c7ac5c24f0..ccbb622eae 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -1,6 +1,16 @@
 #include "git-compat-util.h"
 #include "odb/source-inmemory.h"
 
+static void odb_source_inmemory_free(struct odb_source *source)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	for (size_t i = 0; i < inmemory->objects_nr; i++)
+		free((char *) inmemory->objects[i].value.buf);
+	free(inmemory->objects);
+	free(inmemory->base.path);
+	free(inmemory);
+}
+
 struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 {
 	struct odb_source_inmemory *source;
@@ -8,5 +18,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	CALLOC_ARRAY(source, 1);
 	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
 
+	source->base.free = odb_source_inmemory_free;
+
 	return source;
 }
diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
index 95477bf36d..14dc06f7c3 100644
--- a/odb/source-inmemory.h
+++ b/odb/source-inmemory.h
@@ -3,7 +3,14 @@
 
 #include "odb/source.h"
 
-struct cached_object_entry;
+struct cached_object_entry {
+	struct object_id oid;
+	struct cached_object {
+		enum object_type type;
+		const void *buf;
+		unsigned long size;
+	} value;
+};
 
 /*
  * An inmemory source that you can write objects to that shall be made

-- 
2.53.0.1323.g189a785ab5.dirty


* [PATCH 03/16] odb: fix unnecessary call to `find_cached_object()`
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
  2026-04-03  6:01 ` [PATCH 01/16] " Patrick Steinhardt
  2026-04-03  6:01 ` [PATCH 02/16] odb/source-inmemory: implement `free()` callback Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-08 21:13   ` Justin Tobler
  2026-04-03  6:01 ` [PATCH 04/16] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
  To: git

The function `odb_pretend_object()` writes an object into the in-memory
object database source. The effect of this is that the object will now
become readable, but it won't ever be persisted to disk.

Before storing the object, we first verify whether the object already
exists. This is done by calling `odb_has_object()` to check all sources,
followed by `find_cached_object()` to check whether we have already
stored the object in our in-memory source.

This is unnecessary though, as `odb_has_object()` already checks the
in-memory source transitively via:

  - `odb_has_object()`
  - `odb_read_object_info_extended()`
  - `do_oid_object_info_extended()`
  - `find_cached_object()`

Drop the explicit call to `find_cached_object()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/odb.c b/odb.c
index d321242353..21cdedc31c 100644
--- a/odb.c
+++ b/odb.c
@@ -774,8 +774,7 @@ int odb_pretend_object(struct object_database *odb,
 	char *co_buf;
 
 	hash_object_file(odb->repo->hash_algo, buf, len, type, oid);
-	if (odb_has_object(odb, oid, 0) ||
-	    find_cached_object(odb, oid))
+	if (odb_has_object(odb, oid, 0))
 		return 0;
 
 	ALLOC_GROW(odb->inmemory_objects->objects,

-- 
2.53.0.1323.g189a785ab5.dirty


* [PATCH 04/16] odb/source-inmemory: implement `read_object_info()` callback
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (2 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 03/16] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-03  6:01 ` [PATCH 05/16] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
  To: git

Implement the `read_object_info()` callback function for the inmemory
source.
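
As a rough standalone sketch of what such a callback does: the caller
passes a request struct whose members are optional out-pointers, and
only the non-NULL ones get filled in. The types `struct mem_object` and
`struct info_request` as well as `mem_read_info()` are made up for this
example and merely mirror the shape of `struct object_info`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory object: a type tag plus its raw contents. */
struct mem_object {
	int type;
	const void *buf;
	size_t size;
};

/* Hypothetical request: every member is optional and may be NULL. */
struct info_request {
	int *typep;
	size_t *sizep;
	void **contentp;
};

/*
 * Returns -1 if there is no object, 0 otherwise. A NULL request is a
 * pure existence check. Content is handed out as a NUL-terminated copy
 * that the caller owns and must free.
 */
static int mem_read_info(const struct mem_object *object,
			 struct info_request *req)
{
	if (!object)
		return -1;
	if (!req)
		return 0;

	assert(object->buf || !object->size);
	if (req->typep)
		*req->typep = object->type;
	if (req->sizep)
		*req->sizep = object->size;
	if (req->contentp) {
		char *copy = malloc(object->size + 1);
		if (!copy)
			abort();
		if (object->size)
			memcpy(copy, object->buf, object->size);
		copy[object->size] = '\0';
		*req->contentp = copy;
	}

	return 0;
}
```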

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c                 | 39 +------------------------------------
 odb/source-inmemory.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 38 deletions(-)

diff --git a/odb.c b/odb.c
index 21cdedc31c..b8e7356951 100644
--- a/odb.c
+++ b/odb.c
@@ -32,25 +32,6 @@
 KHASH_INIT(odb_path_map, const char * /* key: odb_path */,
 	struct odb_source *, 1, fspathhash, fspatheq)
 
-static const struct cached_object *find_cached_object(struct object_database *object_store,
-						      const struct object_id *oid)
-{
-	static const struct cached_object empty_tree = {
-		.type = OBJ_TREE,
-		.buf = "",
-	};
-	const struct cached_object_entry *co = object_store->inmemory_objects->objects;
-
-	for (size_t i = 0; i < object_store->inmemory_objects->objects_nr; i++, co++)
-		if (oideq(&co->oid, oid))
-			return &co->value;
-
-	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
-		return &empty_tree;
-
-	return NULL;
-}
-
 int odb_mkstemp(struct object_database *odb,
 		struct strbuf *temp_filename, const char *pattern)
 {
@@ -570,7 +551,6 @@ static int do_oid_object_info_extended(struct object_database *odb,
 				       const struct object_id *oid,
 				       struct object_info *oi, unsigned flags)
 {
-	const struct cached_object *co;
 	const struct object_id *real = oid;
 	int already_retried = 0;
 
@@ -580,25 +560,8 @@ static int do_oid_object_info_extended(struct object_database *odb,
 	if (is_null_oid(real))
 		return -1;
 
-	co = find_cached_object(odb, real);
-	if (co) {
-		if (oi) {
-			if (oi->typep)
-				*(oi->typep) = co->type;
-			if (oi->sizep)
-				*(oi->sizep) = co->size;
-			if (oi->disk_sizep)
-				*(oi->disk_sizep) = 0;
-			if (oi->delta_base_oid)
-				oidclr(oi->delta_base_oid, odb->repo->hash_algo);
-			if (oi->contentp)
-				*oi->contentp = xmemdupz(co->buf, co->size);
-			if (oi->mtimep)
-				*oi->mtimep = 0;
-			oi->whence = OI_CACHED;
-		}
+	if (!odb_source_read_object_info(&odb->inmemory_objects->base, oid, oi, flags))
 		return 0;
-	}
 
 	odb_prepare_alternates(odb);
 
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index ccbb622eae..12c80f9b34 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -1,5 +1,57 @@
 #include "git-compat-util.h"
+#include "odb.h"
 #include "odb/source-inmemory.h"
+#include "repository.h"
+
+static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
+						      const struct object_id *oid)
+{
+	static const struct cached_object empty_tree = {
+		.type = OBJ_TREE,
+		.buf = "",
+	};
+	const struct cached_object_entry *co = source->objects;
+
+	for (size_t i = 0; i < source->objects_nr; i++, co++)
+		if (oideq(&co->oid, oid))
+			return &co->value;
+
+	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
+		return &empty_tree;
+
+	return NULL;
+}
+
+static int odb_source_inmemory_read_object_info(struct odb_source *source,
+						const struct object_id *oid,
+						struct object_info *oi,
+						enum object_info_flags flags UNUSED)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	const struct cached_object *object;
+
+	object = find_cached_object(inmemory, oid);
+	if (!object)
+		return -1;
+
+	if (oi) {
+		if (oi->typep)
+			*(oi->typep) = object->type;
+		if (oi->sizep)
+			*(oi->sizep) = object->size;
+		if (oi->disk_sizep)
+			*(oi->disk_sizep) = 0;
+		if (oi->delta_base_oid)
+			oidclr(oi->delta_base_oid, source->odb->repo->hash_algo);
+		if (oi->contentp)
+			*oi->contentp = xmemdupz(object->buf, object->size);
+		if (oi->mtimep)
+			*oi->mtimep = 0;
+		oi->whence = OI_CACHED;
+	}
+
+	return 0;
+}
 
 static void odb_source_inmemory_free(struct odb_source *source)
 {
@@ -19,6 +71,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
 
 	source->base.free = odb_source_inmemory_free;
+	source->base.read_object_info = odb_source_inmemory_read_object_info;
 
 	return source;
 }

-- 
2.53.0.1323.g189a785ab5.dirty


* [PATCH 05/16] odb/source-inmemory: implement `read_object_stream()` callback
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (3 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 04/16] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-08 21:24   ` Justin Tobler
  2026-04-03  6:01 ` [PATCH 06/16] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
  To: git

Implement the `read_object_stream()` callback function for the inmemory
source.
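
For illustration, reading from an in-memory buffer in chunks could be
sketched as below. The stream remembers how far it has been consumed, so
that successive reads continue where the previous one stopped and
eventually return 0. `struct mem_read_stream` and
`mem_read_stream_read()` are hypothetical names for this standalone
example, not the ones from "odb/streaming.h":

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical read stream over a buffer that is owned elsewhere. */
struct mem_read_stream {
	const char *buf;
	size_t size;   /* total number of bytes in `buf` */
	size_t offset; /* number of bytes already handed out */
};

/*
 * Copy up to `out_len` bytes into `out`, starting at the current
 * offset, and advance the offset. Returns the number of bytes copied,
 * which is 0 once the stream is exhausted.
 */
static ssize_t mem_read_stream_read(struct mem_read_stream *stream,
				    char *out, size_t out_len)
{
	size_t remaining, bytes;

	assert(stream->offset <= stream->size);
	remaining = stream->size - stream->offset;
	bytes = out_len < remaining ? out_len : remaining;

	memcpy(out, stream->buf + stream->offset, bytes);
	stream->offset += bytes;

	return (ssize_t)bytes;
}
```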

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 12c80f9b34..4a68169430 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -1,6 +1,7 @@
 #include "git-compat-util.h"
 #include "odb.h"
 #include "odb/source-inmemory.h"
+#include "odb/streaming.h"
 #include "repository.h"
 
 static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
@@ -53,6 +54,54 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
 	return 0;
 }
 
+struct odb_read_stream_inmemory {
+	struct odb_read_stream base;
+	const void *buf;
+	size_t offset;
+};
+
+static ssize_t odb_read_stream_inmemory_read(struct odb_read_stream *stream,
+					     char *buf, size_t buf_len)
+{
+	struct odb_read_stream_inmemory *inmemory =
+		container_of(stream, struct odb_read_stream_inmemory, base);
+	size_t bytes = buf_len;
+
+	if (buf_len > inmemory->base.size - inmemory->offset)
+		bytes = inmemory->base.size - inmemory->offset;
+	memcpy(buf, (const char *) inmemory->buf + inmemory->offset, bytes);
+	inmemory->offset += bytes;
+	return bytes;
+}
+
+static int odb_read_stream_inmemory_close(struct odb_read_stream *stream UNUSED)
+{
+	return 0;
+}
+
+static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
+						  struct odb_source *source,
+						  const struct object_id *oid)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	struct odb_read_stream_inmemory *stream;
+	const struct cached_object *object;
+
+	object = find_cached_object(inmemory, oid);
+	if (!object)
+		return -1;
+
+	CALLOC_ARRAY(stream, 1);
+	stream->base.read = odb_read_stream_inmemory_read;
+	stream->base.close = odb_read_stream_inmemory_close;
+	stream->base.size = object->size;
+	stream->base.type = object->type;
+	stream->buf = object->buf;
+
+	*out = &stream->base;
+	return 0;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
@@ -72,6 +121,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 
 	source->base.free = odb_source_inmemory_free;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
+	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 
 	return source;
 }

-- 
2.53.0.1323.g189a785ab5.dirty


* [PATCH 06/16] odb/source-inmemory: implement `write_object()` callback
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (4 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 05/16] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-03  6:01 ` [PATCH 07/16] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
  To: git

Implement the `write_object()` callback function for the inmemory
source.
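
The gist of writing into such a source is to append to a growable array
and to keep a private copy of the data, so that the caller remains free
to reuse or release its own buffer. Below is a standalone sketch under
those assumptions. It uses plain `realloc()` instead of `ALLOC_GROW()`,
and `struct mem_store`, `mem_store_write()` and `mem_store_release()`
are hypothetical names:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stored object: type tag plus an owned copy of the data. */
struct mem_entry {
	int type;
	void *buf;
	size_t size;
};

/* Hypothetical store: a growable array of entries. */
struct mem_store {
	struct mem_entry *entries;
	size_t nr, alloc;
};

/* Append a copy of `buf` to the store. Returns 0 on success, -1 on OOM. */
static int mem_store_write(struct mem_store *store, const void *buf,
			   size_t len, int type)
{
	struct mem_entry *entry;

	assert(buf || !len);
	if (store->nr == store->alloc) {
		size_t alloc = store->alloc ? 2 * store->alloc : 8;
		struct mem_entry *grown =
			realloc(store->entries, alloc * sizeof(*grown));
		if (!grown)
			return -1;
		store->entries = grown;
		store->alloc = alloc;
	}

	entry = &store->entries[store->nr];
	entry->buf = malloc(len + 1);
	if (!entry->buf)
		return -1;
	if (len)
		memcpy(entry->buf, buf, len);
	((char *)entry->buf)[len] = '\0';
	entry->size = len;
	entry->type = type;
	store->nr++;

	return 0;
}

/* Free all stored copies and reset the store to its empty state. */
static void mem_store_release(struct mem_store *store)
{
	for (size_t i = 0; i < store->nr; i++)
		free(store->entries[i].buf);
	free(store->entries);
	memset(store, 0, sizeof(*store));
}
```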

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c                 | 16 ++--------------
 odb/source-inmemory.c | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/odb.c b/odb.c
index b8e7356951..34228c0cd5 100644
--- a/odb.c
+++ b/odb.c
@@ -733,24 +733,12 @@ int odb_pretend_object(struct object_database *odb,
 		       void *buf, unsigned long len, enum object_type type,
 		       struct object_id *oid)
 {
-	struct cached_object_entry *co;
-	char *co_buf;
-
 	hash_object_file(odb->repo->hash_algo, buf, len, type, oid);
 	if (odb_has_object(odb, oid, 0))
 		return 0;
 
-	ALLOC_GROW(odb->inmemory_objects->objects,
-		   odb->inmemory_objects->objects_nr + 1,
-		   odb->inmemory_objects->objects_alloc);
-	co = &odb->inmemory_objects->objects[odb->inmemory_objects->objects_nr++];
-	co->value.size = len;
-	co->value.type = type;
-	co_buf = xmalloc(len);
-	memcpy(co_buf, buf, len);
-	co->value.buf = co_buf;
-	oidcpy(&co->oid, oid);
-	return 0;
+	return odb_source_write_object(&odb->inmemory_objects->base,
+				       buf, len, type, oid, NULL, 0);
 }
 
 void *odb_read_object(struct object_database *odb,
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 4a68169430..d2fc4c4054 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -102,6 +102,27 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
 	return 0;
 }
 
+static int odb_source_inmemory_write_object(struct odb_source *source,
+					    const void *buf, unsigned long len,
+					    enum object_type type,
+					    struct object_id *oid,
+					    struct object_id *compat_oid UNUSED,
+					    enum odb_write_object_flags flags UNUSED)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	struct cached_object_entry *object;
+
+	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
+		   inmemory->objects_alloc);
+	object = &inmemory->objects[inmemory->objects_nr++];
+	object->value.size = len;
+	object->value.type = type;
+	object->value.buf = xmemdupz(buf, len);
+	oidcpy(&object->oid, oid);
+
+	return 0;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
@@ -122,6 +143,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.free = odb_source_inmemory_free;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
+	source->base.write_object = odb_source_inmemory_write_object;
 
 	return source;
 }

-- 
2.53.0.1323.g189a785ab5.dirty


* [PATCH 07/16] odb/source-inmemory: implement `write_object_stream()` callback
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (5 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 06/16] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-03 22:11   ` Junio C Hamano
  2026-04-03  6:01 ` [PATCH 08/16] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
  To: git

Implement the `write_object_stream()` callback function for the inmemory
source.
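
Conceptually, a streamed write collects chunks into one buffer of the
announced length: each chunk is appended at the current write position,
and the total is checked against the expected size in both directions.
The sketch below is standalone and simplified. The chunk source `struct
chunk_stream` and the helper `collect_stream()` are hypothetical and do
not correspond to `struct odb_write_stream`:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stream that yields a fixed list of string chunks. */
struct chunk_stream {
	const char **chunks;
	size_t nr, pos;
};

/* Hand out the next chunk and report its length via `len`. */
static const char *chunk_stream_read(struct chunk_stream *stream, size_t *len)
{
	const char *chunk;

	assert(stream->pos < stream->nr);
	chunk = stream->chunks[stream->pos++];
	*len = strlen(chunk);
	return chunk;
}

/*
 * Drain the stream into a newly allocated buffer of exactly `len`
 * bytes. Returns NULL if the stream yields more or fewer bytes than
 * announced, or if the allocation fails. The caller owns the result.
 */
static char *collect_stream(struct chunk_stream *stream, size_t len)
{
	size_t total = 0;
	char *data = malloc(len ? len : 1);

	if (!data)
		return NULL;

	while (stream->pos < stream->nr) {
		size_t bytes;
		const char *in = chunk_stream_read(stream, &bytes);

		if (bytes > len - total) {
			free(data);
			return NULL;
		}
		/* Append after what has been collected so far. */
		memcpy(data + total, in, bytes);
		total += bytes;
	}

	if (total != len) {
		free(data);
		return NULL;
	}

	return data;
}
```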

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index d2fc4c4054..890e2a8c7c 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -123,6 +123,45 @@ static int odb_source_inmemory_write_object(struct odb_source *source,
 	return 0;
 }
 
+static int odb_source_inmemory_write_object_stream(struct odb_source *source,
+						   struct odb_write_stream *stream,
+						   size_t len,
+						   struct object_id *oid)
+{
+	size_t total_read = 0;
+	char *data;
+	int ret;
+
+	CALLOC_ARRAY(data, len);
+	while (!stream->is_finished) {
+		unsigned long bytes_read;
+		const void *in;
+
+		in = stream->read(stream, &bytes_read);
+		if (total_read + bytes_read > len) {
+			ret = error("object stream yielded more bytes than expected");
+			goto out;
+		}
+
+		memcpy(data + total_read, in, bytes_read);
+		total_read += bytes_read;
+	}
+
+	if (total_read != len) {
+		ret = error("object stream yielded less bytes than expected");
+		goto out;
+	}
+
+	ret = odb_source_inmemory_write_object(source, data, len, OBJ_BLOB, oid,
+					       NULL, 0);
+	if (ret < 0)
+		goto out;
+
+out:
+	free(data);
+	return ret;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
@@ -144,6 +183,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.write_object = odb_source_inmemory_write_object;
+	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 
 	return source;
 }

-- 
2.53.0.1323.g189a785ab5.dirty


* [PATCH 08/16] cbtree: allow using arbitrary wrapper structures for nodes
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (6 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 07/16] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-03  6:01 ` [PATCH 09/16] oidtree: add ability to store data Patrick Steinhardt
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
  To: git

The cbtree subsystem allows the user to store arbitrary data in a
prefix-free set of strings. This is used by us to store object IDs in a
way that we can easily iterate through them in lexicographic order, and so
that we can easily perform lookups with shortened object IDs.

In its current form, it is not easily possible to store arbitrary data
with the tree nodes. There are a couple of approaches a caller could
try to use, but none of them really work:

  - One may embed the `struct cb_node` in a custom structure. This does
    not work though as `struct cb_node` contains a flex array, and
    embedding such a struct in another struct is forbidden.

  - One may use a `union` over `struct cb_node` and one's own data type,
    which _is_ allowed even if the struct contains a flex array. This
    does not work though, as the compiler may align members of the
    struct so that the node key would not immediately start where the
    flex array starts.

  - One may allocate `struct cb_node` such that it has room for both its
    key and the custom data. This has the downside though that if the
    custom data is itself a pointer to allocated memory, then the leak
    checker will not consider the pointer to be alive anymore.

Refactor the cbtree to drop the flex array and instead take in an
explicit offset for where to find the key, which allows the caller to
embed `struct cb_node` in a wrapper struct.

Note that this change has the downside that we now have a bit of padding
in our structure, which grows the size from 60 to 64 bytes on a 64-bit
system. On the other hand though, it allows us to get rid of the memory
copies that we previously had to do to ensure proper alignment. This
seems like a reasonable tradeoff.
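
To illustrate the idea, here is a minimal standalone sketch of the
key-offset approach. It is not the actual cbtree code: the `demo_*`
names are hypothetical stand-ins, and only the way the key is located
relative to the embedded node mirrors what this patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for `struct cb_node` without the flex array. */
struct demo_node {
	struct demo_node *child[2];
	uint32_t byte;
	uint8_t otherbits;
};

/* Hypothetical stand-in for `struct cb_tree`, carrying the key offset. */
struct demo_tree {
	struct demo_node *root;
	ptrdiff_t key_offset;
};

/* Caller-defined wrapper: generic node first, then the key and a payload. */
struct demo_entry {
	struct demo_node base;
	uint8_t key[20];
	void *payload;
};

/* The key lives `key_offset` bytes after the start of the embedded node. */
const uint8_t *demo_node_key(const struct demo_tree *t,
			     const struct demo_node *node)
{
	/* Sanity check: the key must not overlap the node itself. */
	assert(t->key_offset >= (ptrdiff_t)sizeof(struct demo_node));
	return (const uint8_t *)node + t->key_offset;
}

/* Recover the wrapper from the embedded node, in the spirit of container_of(). */
struct demo_entry *demo_entry_of(struct demo_node *node)
{
	return (struct demo_entry *)((char *)node -
				     offsetof(struct demo_entry, base));
}
```

A caller would initialize the tree with
`offsetof(struct demo_entry, key)` as the key offset, which is the same
pattern oidtree uses with `struct oidtree_node` in this patch.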

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 cbtree.c  | 25 ++++++++++++++++++-------
 cbtree.h  | 11 ++++++-----
 oidtree.c | 33 ++++++++++++++-------------------
 3 files changed, 38 insertions(+), 31 deletions(-)

diff --git a/cbtree.c b/cbtree.c
index 4ab794bddc..8f5edbb80a 100644
--- a/cbtree.c
+++ b/cbtree.c
@@ -7,6 +7,11 @@
 #include "git-compat-util.h"
 #include "cbtree.h"
 
+static inline uint8_t *cb_node_key(struct cb_tree *t, struct cb_node *node)
+{
+	return (uint8_t *) node + t->key_offset;
+}
+
 static struct cb_node *cb_node_of(const void *p)
 {
 	return (struct cb_node *)((uintptr_t)p - 1);
@@ -33,6 +38,7 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen)
 	uint8_t c;
 	int newdirection;
 	struct cb_node **wherep, *p;
+	uint8_t *node_key, *p_key;
 
 	assert(!((uintptr_t)node & 1)); /* allocations must be aligned */
 
@@ -41,23 +47,26 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen)
 		return NULL;	/* success */
 	}
 
+	node_key = cb_node_key(t, node);
+
 	/* see if a node already exists */
-	p = cb_internal_best_match(t->root, node->k, klen);
+	p = cb_internal_best_match(t->root, node_key, klen);
+	p_key = cb_node_key(t, p);
 
 	/* find first differing byte */
 	for (newbyte = 0; newbyte < klen; newbyte++) {
-		if (p->k[newbyte] != node->k[newbyte])
+		if (p_key[newbyte] != node_key[newbyte])
 			goto different_byte_found;
 	}
 	return p;	/* element exists, let user deal with it */
 
 different_byte_found:
-	newotherbits = p->k[newbyte] ^ node->k[newbyte];
+	newotherbits = p_key[newbyte] ^ node_key[newbyte];
 	newotherbits |= newotherbits >> 1;
 	newotherbits |= newotherbits >> 2;
 	newotherbits |= newotherbits >> 4;
 	newotherbits = (newotherbits & ~(newotherbits >> 1)) ^ 255;
-	c = p->k[newbyte];
+	c = p_key[newbyte];
 	newdirection = (1 + (newotherbits | c)) >> 8;
 
 	node->byte = newbyte;
@@ -78,7 +87,7 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen)
 			break;
 		if (q->byte == newbyte && q->otherbits > newotherbits)
 			break;
-		c = q->byte < klen ? node->k[q->byte] : 0;
+		c = q->byte < klen ? node_key[q->byte] : 0;
 		direction = (1 + (q->otherbits | c)) >> 8;
 		wherep = q->child + direction;
 	}
@@ -93,7 +102,7 @@ struct cb_node *cb_lookup(struct cb_tree *t, const uint8_t *k, size_t klen)
 {
 	struct cb_node *p = cb_internal_best_match(t->root, k, klen);
 
-	return p && !memcmp(p->k, k, klen) ? p : NULL;
+	return p && !memcmp(cb_node_key(t, p), k, klen) ? p : NULL;
 }
 
 static int cb_descend(struct cb_node *p, cb_iter fn, void *arg)
@@ -115,6 +124,7 @@ int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen,
 	struct cb_node *p = t->root;
 	struct cb_node *top = p;
 	size_t i = 0;
+	uint8_t *p_key;
 
 	if (!p)
 		return 0; /* empty tree */
@@ -130,8 +140,9 @@ int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen,
 			top = p;
 	}
 
+	p_key = cb_node_key(t, p);
 	for (i = 0; i < klen; i++) {
-		if (p->k[i] != kpfx[i])
+		if (p_key[i] != kpfx[i])
 			return 0; /* "best" match failed */
 	}
 
diff --git a/cbtree.h b/cbtree.h
index c374b1b3db..3ce0d6b287 100644
--- a/cbtree.h
+++ b/cbtree.h
@@ -23,18 +23,19 @@ struct cb_node {
 	 */
 	uint32_t byte;
 	uint8_t otherbits;
-	uint8_t k[FLEX_ARRAY]; /* arbitrary data, unaligned */
 };
 
 struct cb_tree {
 	struct cb_node *root;
+	ptrdiff_t key_offset;
 };
 
-#define CBTREE_INIT { 0 }
-
-static inline void cb_init(struct cb_tree *t)
+static inline void cb_init(struct cb_tree *t,
+			   ptrdiff_t key_offset)
 {
-	struct cb_tree blank = CBTREE_INIT;
+	struct cb_tree blank = {
+		.key_offset = key_offset,
+	};
 	memcpy(t, &blank, sizeof(*t));
 }
 
diff --git a/oidtree.c b/oidtree.c
index ab9fe7ec7a..117649753f 100644
--- a/oidtree.c
+++ b/oidtree.c
@@ -6,9 +6,14 @@
 #include "oidtree.h"
 #include "hash.h"
 
+struct oidtree_node {
+	struct cb_node base;
+	struct object_id key;
+};
+
 void oidtree_init(struct oidtree *ot)
 {
-	cb_init(&ot->tree);
+	cb_init(&ot->tree, offsetof(struct oidtree_node, key));
 	mem_pool_init(&ot->mem_pool, 0);
 }
 
@@ -22,20 +27,13 @@ void oidtree_clear(struct oidtree *ot)
 
 void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
 {
-	struct cb_node *on;
-	struct object_id k;
+	struct oidtree_node *on;
 
 	if (!oid->algo)
 		BUG("oidtree_insert requires oid->algo");
 
-	on = mem_pool_alloc(&ot->mem_pool, sizeof(*on) + sizeof(*oid));
-
-	/*
-	 * Clear the padding and copy the result in separate steps to
-	 * respect the 4-byte alignment needed by struct object_id.
-	 */
-	oidcpy(&k, oid);
-	memcpy(on->k, &k, sizeof(k));
+	on = mem_pool_alloc(&ot->mem_pool, sizeof(*on));
+	oidcpy(&on->key, oid);
 
 	/*
 	 * n.b. Current callers won't get us duplicates, here.  If a
@@ -43,7 +41,7 @@ void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
 	 * that won't be freed until oidtree_clear.  Currently it's not
 	 * worth maintaining a free list
 	 */
-	cb_insert(&ot->tree, on, sizeof(*oid));
+	cb_insert(&ot->tree, &on->base, sizeof(*oid));
 }
 
 bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
@@ -73,21 +71,18 @@ struct oidtree_each_data {
 
 static int iter(struct cb_node *n, void *cb_data)
 {
+	struct oidtree_node *node = container_of(n, struct oidtree_node, base);
 	struct oidtree_each_data *data = cb_data;
-	struct object_id k;
-
-	/* Copy to provide 4-byte alignment needed by struct object_id. */
-	memcpy(&k, n->k, sizeof(k));
 
-	if (data->algo != GIT_HASH_UNKNOWN && data->algo != k.algo)
+	if (data->algo != GIT_HASH_UNKNOWN && data->algo != node->key.algo)
 		return 0;
 
 	if (data->last_nibble_at) {
-		if ((k.hash[*data->last_nibble_at] ^ data->last_byte) & 0xf0)
+		if ((node->key.hash[*data->last_nibble_at] ^ data->last_byte) & 0xf0)
 			return 0;
 	}
 
-	return data->cb(&k, data->cb_data);
+	return data->cb(&node->key, data->cb_data);
 }
 
 int oidtree_each(struct oidtree *ot, const struct object_id *prefix,

-- 
2.53.0.1323.g189a785ab5.dirty



* [PATCH 09/16] oidtree: add ability to store data
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (7 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 08/16] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-03  6:01 ` [PATCH 10/16] odb/source-inmemory: convert to use oidtree Patrick Steinhardt
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
  To: git

The oidtree data structure is currently only used to store object IDs,
without any associated data. Consequently, it can only really be used
to track which object IDs exist, and we can use the tree structure to
efficiently operate on OID prefixes.

But there are valid use cases where we want to both:

  - Store object IDs in a sorted order.

  - Associate arbitrary data with them.

Refactor the oidtree interface so that it allows us to store arbitrary
payloads within the respective nodes. This will be used in the next
commit.
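
The intended semantics can be sketched with a small standalone example.
This is not the oidtree implementation: it uses a hypothetical
fixed-size sorted array (`demo_map`) in place of the crit-bit tree, and
only models the behaviour that a second insert of the same key
overwrites the stored pointer and that lookups return the payload.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_KEY_LEN 4
#define DEMO_MAX 16

/* Hypothetical sorted map from fixed-size keys to opaque payload pointers. */
struct demo_map {
	unsigned char keys[DEMO_MAX][DEMO_KEY_LEN];
	void *data[DEMO_MAX];
	size_t nr;
};

/* Return the index of `key`, or the position where it would be inserted. */
static size_t demo_find(const struct demo_map *m, const unsigned char *key,
			int *found)
{
	size_t i;

	*found = 0;
	for (i = 0; i < m->nr; i++) {
		int cmp = memcmp(m->keys[i], key, DEMO_KEY_LEN);
		if (cmp == 0)
			*found = 1;
		if (cmp >= 0)
			break;
	}
	return i;
}

/* Insert `key` with `data`; a preexisting entry gets its payload overwritten. */
int demo_insert(struct demo_map *m, const unsigned char *key, void *data)
{
	int found;
	size_t pos = demo_find(m, key, &found);

	if (found) {
		m->data[pos] = data;
		return 0;
	}
	if (m->nr == DEMO_MAX)
		return -1; /* map is full */
	if (pos < m->nr) {
		/* Shift the tail by one slot to keep the keys sorted. */
		memmove(m->keys[pos + 1], m->keys[pos],
			(m->nr - pos) * DEMO_KEY_LEN);
		memmove(&m->data[pos + 1], &m->data[pos],
			(m->nr - pos) * sizeof(m->data[0]));
	}
	memcpy(m->keys[pos], key, DEMO_KEY_LEN);
	m->data[pos] = data;
	m->nr++;
	return 0;
}

/* Return the payload stored with `key`, or NULL if there is no such key. */
void *demo_get(const struct demo_map *m, const unsigned char *key)
{
	int found;
	size_t pos = demo_find(m, key, &found);

	return found ? m->data[pos] : NULL;
}
```

Note that, as with `oidtree_get()`, a NULL result cannot distinguish
between "no such key" and "key stored with a NULL payload"; callers
that care need a separate containment check.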

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 loose.c                  |  2 +-
 object-file.c            |  3 ++-
 oidtree.c                | 37 ++++++++++++++++++++++++++++++++-----
 oidtree.h                | 12 ++++++++++--
 t/unit-tests/u-oidtree.c | 26 +++++++++++++++++++++++---
 5 files changed, 68 insertions(+), 12 deletions(-)

diff --git a/loose.c b/loose.c
index 07333be696..f7a3dd1a72 100644
--- a/loose.c
+++ b/loose.c
@@ -57,7 +57,7 @@ static int insert_loose_map(struct odb_source *source,
 	inserted |= insert_oid_pair(map->to_compat, oid, compat_oid);
 	inserted |= insert_oid_pair(map->to_storage, compat_oid, oid);
 	if (inserted)
-		oidtree_insert(files->loose->cache, compat_oid);
+		oidtree_insert(files->loose->cache, compat_oid, NULL);
 
 	return inserted;
 }
diff --git a/object-file.c b/object-file.c
index 98a4678ca4..c0805f0ebb 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1850,6 +1850,7 @@ static int for_each_object_wrapper_cb(const struct object_id *oid,
 }
 
 static int for_each_prefixed_object_wrapper_cb(const struct object_id *oid,
+					       void *node_data UNUSED,
 					       void *cb_data)
 {
 	struct for_each_object_wrapper_data *data = cb_data;
@@ -1995,7 +1996,7 @@ static int append_loose_object(const struct object_id *oid,
 			       const char *path UNUSED,
 			       void *data)
 {
-	oidtree_insert(data, oid);
+	oidtree_insert(data, oid, NULL);
 	return 0;
 }
 
diff --git a/oidtree.c b/oidtree.c
index 117649753f..e43f18026e 100644
--- a/oidtree.c
+++ b/oidtree.c
@@ -9,6 +9,7 @@
 struct oidtree_node {
 	struct cb_node base;
 	struct object_id key;
+	void *data;
 };
 
 void oidtree_init(struct oidtree *ot)
@@ -25,15 +26,22 @@ void oidtree_clear(struct oidtree *ot)
 	}
 }
 
-void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
+struct oidtree_data {
+	struct object_id oid;
+};
+
+void oidtree_insert(struct oidtree *ot, const struct object_id *oid,
+		    void *data)
 {
 	struct oidtree_node *on;
+	struct cb_node *node;
 
 	if (!oid->algo)
 		BUG("oidtree_insert requires oid->algo");
 
 	on = mem_pool_alloc(&ot->mem_pool, sizeof(*on));
 	oidcpy(&on->key, oid);
+	on->data = data;
 
 	/*
 	 * n.b. Current callers won't get us duplicates, here.  If a
@@ -41,13 +49,19 @@ void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
 	 * that won't be freed until oidtree_clear.  Currently it's not
 	 * worth maintaining a free list
 	 */
-	cb_insert(&ot->tree, &on->base, sizeof(*oid));
+	node = cb_insert(&ot->tree, &on->base, sizeof(*oid));
+	if (node) {
+		struct oidtree_node *preexisting = container_of(node, struct oidtree_node, base);
+		preexisting->data = data;
+	}
 }
 
-bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
+static struct oidtree_node *oidtree_lookup(struct oidtree *ot,
+					   const struct object_id *oid)
 {
 	struct object_id k;
 	size_t klen = sizeof(k);
+	struct cb_node *node;
 
 	oidcpy(&k, oid);
 
@@ -58,7 +72,20 @@ bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
 	klen += BUILD_ASSERT_OR_ZERO(offsetof(struct object_id, hash) <
 				offsetof(struct object_id, algo));
 
-	return !!cb_lookup(&ot->tree, (const uint8_t *)&k, klen);
+	node = cb_lookup(&ot->tree, (const uint8_t *)&k, klen);
+	return node ? container_of(node, struct oidtree_node, base) : NULL;
+}
+
+bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
+{
+	struct oidtree_node *node = oidtree_lookup(ot, oid);
+	return node ? 1 : 0;
+}
+
+void *oidtree_get(struct oidtree *ot, const struct object_id *oid)
+{
+	struct oidtree_node *node = oidtree_lookup(ot, oid);
+	return node ? node->data : NULL;
 }
 
 struct oidtree_each_data {
@@ -82,7 +109,7 @@ static int iter(struct cb_node *n, void *cb_data)
 			return 0;
 	}
 
-	return data->cb(&node->key, data->cb_data);
+	return data->cb(&node->key, node->data, data->cb_data);
 }
 
 int oidtree_each(struct oidtree *ot, const struct object_id *prefix,
diff --git a/oidtree.h b/oidtree.h
index 2b7bad2e60..baa5a436ea 100644
--- a/oidtree.h
+++ b/oidtree.h
@@ -29,18 +29,26 @@ void oidtree_init(struct oidtree *ot);
  */
 void oidtree_clear(struct oidtree *ot);
 
-/* Insert the object ID into the tree. */
-void oidtree_insert(struct oidtree *ot, const struct object_id *oid);
+/*
+ * Insert the object ID into the tree and store the given pointer alongside
+ * it. The data pointer of any preexisting entry will be overwritten.
+ */
+void oidtree_insert(struct oidtree *ot, const struct object_id *oid,
+		    void *data);
 
 /* Check whether the tree contains the given object ID. */
 bool oidtree_contains(struct oidtree *ot, const struct object_id *oid);
 
+/* Get the payload stored with the given object ID. */
+void *oidtree_get(struct oidtree *ot, const struct object_id *oid);
+
 /*
  * Callback function used for `oidtree_each()`. Returning a non-zero exit code
  * will cause iteration to stop. The exit code will be propagated to the caller
  * of `oidtree_each()`.
  */
 typedef int (*oidtree_each_cb)(const struct object_id *oid,
+			       void *node_data,
 			       void *cb_data);
 
 /*
diff --git a/t/unit-tests/u-oidtree.c b/t/unit-tests/u-oidtree.c
index d4d05c7dc3..f0d5ebb733 100644
--- a/t/unit-tests/u-oidtree.c
+++ b/t/unit-tests/u-oidtree.c
@@ -19,7 +19,7 @@ static int fill_tree_loc(struct oidtree *ot, const char *hexes[], size_t n)
 	for (size_t i = 0; i < n; i++) {
 		struct object_id oid;
 		cl_parse_any_oid(hexes[i], &oid);
-		oidtree_insert(ot, &oid);
+		oidtree_insert(ot, &oid, NULL);
 	}
 	return 0;
 }
@@ -38,9 +38,9 @@ struct expected_hex_iter {
 	const char *query;
 };
 
-static int check_each_cb(const struct object_id *oid, void *data)
+static int check_each_cb(const struct object_id *oid, void *node_data UNUSED, void *cb_data)
 {
-	struct expected_hex_iter *hex_iter = data;
+	struct expected_hex_iter *hex_iter = cb_data;
 	struct object_id expected;
 
 	cl_assert(hex_iter->i < hex_iter->expected_hexes.nr);
@@ -105,3 +105,23 @@ void test_oidtree__each(void)
 	check_each(&ot, "32100", "321", NULL);
 	check_each(&ot, "32", "320", "321", NULL);
 }
+
+void test_oidtree__insert_overwrites_data(void)
+{
+	struct object_id oid;
+	struct oidtree ot;
+	int a, b;
+
+	cl_parse_any_oid("1", &oid);
+
+	oidtree_init(&ot);
+
+	oidtree_insert(&ot, &oid, NULL);
+	cl_assert_equal_p(oidtree_get(&ot, &oid), NULL);
+	oidtree_insert(&ot, &oid, &a);
+	cl_assert_equal_p(oidtree_get(&ot, &oid), &a);
+	oidtree_insert(&ot, &oid, &b);
+	cl_assert_equal_p(oidtree_get(&ot, &oid), &b);
+
+	oidtree_clear(&ot);
+}

-- 
2.53.0.1323.g189a785ab5.dirty



* [PATCH 10/16] odb/source-inmemory: convert to use oidtree
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (8 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 09/16] oidtree: add ability to store data Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-03  6:01 ` [PATCH 11/16] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
  To: git

The inmemory source stores its objects in a simple array that we grow as
needed. This has a couple of downsides:

  - The object lookup is O(n). This doesn't matter in practice because
    we only store a small number of objects.

  - We don't have an easy way to iterate over all objects in
    lexicographic order.

  - We don't have an easy way to compute unique object ID prefixes.

Refactor the code to use an oidtree instead. This is the same data
structure used by our loose object source, and thus it means we get a
bunch of functionality for free.
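
As a rough illustration of the prefix handling we get from this, the
following standalone sketch checks whether a binary hash matches a
prefix that is given as a length in hex digits. The function name is
hypothetical; the odd-length case mirrors the nibble comparison done in
the oidtree iterator, but this is a simplified model and not the actual
lookup code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the first `hex_len` hex digits of `hash` equal those of
 * `prefix`, 0 otherwise. Two hex digits make up one byte, so an odd
 * length leaves a trailing half byte whose high nibble is compared.
 */
int demo_has_hex_prefix(const unsigned char *hash,
			const unsigned char *prefix, size_t hex_len)
{
	size_t full = hex_len / 2;

	if (memcmp(hash, prefix, full))
		return 0;
	if (hex_len & 1)
		return ((hash[full] ^ prefix[full]) & 0xf0) == 0;
	return 1;
}
```

With objects kept in sorted order, all entries matching such a prefix
are adjacent, which is what makes iteration by prefix cheap compared to
scanning an unsorted array.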

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 72 +++++++++++++++++++++++++++++++++++++--------------
 odb/source-inmemory.h | 13 ++--------
 2 files changed, 54 insertions(+), 31 deletions(-)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 890e2a8c7c..22bae6927e 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -2,20 +2,29 @@
 #include "odb.h"
 #include "odb/source-inmemory.h"
 #include "odb/streaming.h"
+#include "oidtree.h"
 #include "repository.h"
 
-static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
-						      const struct object_id *oid)
+struct inmemory_object {
+	enum object_type type;
+	const void *buf;
+	unsigned long size;
+};
+
+static const struct inmemory_object *find_cached_object(struct odb_source_inmemory *source,
+							const struct object_id *oid)
 {
-	static const struct cached_object empty_tree = {
+	static const struct inmemory_object empty_tree = {
 		.type = OBJ_TREE,
 		.buf = "",
 	};
-	const struct cached_object_entry *co = source->objects;
+	const struct inmemory_object *object;
 
-	for (size_t i = 0; i < source->objects_nr; i++, co++)
-		if (oideq(&co->oid, oid))
-			return &co->value;
+	if (source->objects) {
+		object = oidtree_get(source->objects, oid);
+		if (object)
+			return object;
+	}
 
 	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
 		return &empty_tree;
@@ -29,7 +38,7 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
 						enum object_info_flags flags UNUSED)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
-	const struct cached_object *object;
+	const struct inmemory_object *object;
 
 	object = find_cached_object(inmemory, oid);
 	if (!object)
@@ -85,7 +94,7 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
 	struct odb_read_stream_inmemory *stream;
-	const struct cached_object *object;
+	const struct inmemory_object *object;
 
 	object = find_cached_object(inmemory, oid);
 	if (!object)
@@ -110,15 +119,21 @@ static int odb_source_inmemory_write_object(struct odb_source *source,
 					    enum odb_write_object_flags flags UNUSED)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
-	struct cached_object_entry *object;
+	struct inmemory_object *object;
 
-	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
-		   inmemory->objects_alloc);
-	object = &inmemory->objects[inmemory->objects_nr++];
-	object->value.size = len;
-	object->value.type = type;
-	object->value.buf = xmemdupz(buf, len);
-	oidcpy(&object->oid, oid);
+	if (!inmemory->objects) {
+		CALLOC_ARRAY(inmemory->objects, 1);
+		oidtree_init(inmemory->objects);
+	} else if (oidtree_contains(inmemory->objects, oid)) {
+		return 0;
+	}
+
+	CALLOC_ARRAY(object, 1);
+	object->size = len;
+	object->type = type;
+	object->buf = xmemdupz(buf, len);
+
+	oidtree_insert(inmemory->objects, oid, object);
 
 	return 0;
 }
@@ -162,12 +177,29 @@ static int odb_source_inmemory_write_object_stream(struct odb_source *source,
 	return ret;
 }
 
+static int inmemory_object_free(const struct object_id *oid UNUSED,
+				void *node_data,
+				void *cb_data UNUSED)
+{
+	struct inmemory_object *object = node_data;
+	free((void *) object->buf);
+	free(object);
+	return 0;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
-	for (size_t i = 0; i < inmemory->objects_nr; i++)
-		free((char *) inmemory->objects[i].value.buf);
-	free(inmemory->objects);
+
+	if (inmemory->objects) {
+		struct object_id null_oid = { 0 };
+
+		oidtree_each(inmemory->objects, &null_oid, 0,
+			     inmemory_object_free, NULL);
+		oidtree_clear(inmemory->objects);
+		free(inmemory->objects);
+	}
+
 	free(inmemory->base.path);
 	free(inmemory);
 }
diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
index 14dc06f7c3..02cf586b63 100644
--- a/odb/source-inmemory.h
+++ b/odb/source-inmemory.h
@@ -3,14 +3,7 @@
 
 #include "odb/source.h"
 
-struct cached_object_entry {
-	struct object_id oid;
-	struct cached_object {
-		enum object_type type;
-		const void *buf;
-		unsigned long size;
-	} value;
-};
+struct oidtree;
 
 /*
  * An inmemory source that you can write objects to that shall be made
@@ -20,9 +13,7 @@ struct cached_object_entry {
  */
 struct odb_source_inmemory {
 	struct odb_source base;
-
-	struct cached_object_entry *objects;
-	size_t objects_nr, objects_alloc;
+	struct oidtree *objects;
 };
 
 /* Create a new in-memory object database source. */

-- 
2.53.0.1323.g189a785ab5.dirty



* [PATCH 11/16] odb/source-inmemory: implement `for_each_object()` callback
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (9 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 10/16] odb/source-inmemory: convert to use oidtree Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-03  6:01 ` [PATCH 12/16] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
  To: git

Implement the `for_each_object()` callback function for the inmemory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 86 +++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 70 insertions(+), 16 deletions(-)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 22bae6927e..0ac20df323 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -32,6 +32,28 @@ static const struct inmemory_object *find_cached_object(struct odb_source_inmemo
 	return NULL;
 }
 
+static void populate_object_info(struct odb_source_inmemory *source,
+				 struct object_info *oi,
+				 const struct inmemory_object *object)
+{
+	if (!oi)
+		return;
+
+	if (oi->typep)
+		*(oi->typep) = object->type;
+	if (oi->sizep)
+		*(oi->sizep) = object->size;
+	if (oi->disk_sizep)
+		*(oi->disk_sizep) = 0;
+	if (oi->delta_base_oid)
+		oidclr(oi->delta_base_oid, source->base.odb->repo->hash_algo);
+	if (oi->contentp)
+		*oi->contentp = xmemdupz(object->buf, object->size);
+	if (oi->mtimep)
+		*oi->mtimep = 0;
+	oi->whence = OI_CACHED;
+}
+
 static int odb_source_inmemory_read_object_info(struct odb_source *source,
 						const struct object_id *oid,
 						struct object_info *oi,
@@ -44,22 +66,7 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
 	if (!object)
 		return -1;
 
-	if (oi) {
-		if (oi->typep)
-			*(oi->typep) = object->type;
-		if (oi->sizep)
-			*(oi->sizep) = object->size;
-		if (oi->disk_sizep)
-			*(oi->disk_sizep) = 0;
-		if (oi->delta_base_oid)
-			oidclr(oi->delta_base_oid, source->odb->repo->hash_algo);
-		if (oi->contentp)
-			*oi->contentp = xmemdupz(object->buf, object->size);
-		if (oi->mtimep)
-			*oi->mtimep = 0;
-		oi->whence = OI_CACHED;
-	}
-
+	populate_object_info(inmemory, oi, object);
 	return 0;
 }
 
@@ -111,6 +118,52 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
 	return 0;
 }
 
+struct odb_source_inmemory_for_each_object_data {
+	struct odb_source_inmemory *inmemory;
+	const struct object_info *request;
+	odb_for_each_object_cb cb;
+	void *cb_data;
+};
+
+static int odb_source_inmemory_for_each_object_cb(const struct object_id *oid,
+						  void *node_data, void *cb_data)
+{
+	struct odb_source_inmemory_for_each_object_data *data = cb_data;
+	struct inmemory_object *object = node_data;
+
+	if (data->request) {
+		struct object_info oi = *data->request;
+		populate_object_info(data->inmemory, &oi, object);
+		return data->cb(oid, &oi, data->cb_data);
+	} else {
+		return data->cb(oid, NULL, data->cb_data);
+	}
+}
+
+static int odb_source_inmemory_for_each_object(struct odb_source *source,
+					       const struct object_info *request,
+					       odb_for_each_object_cb cb,
+					       void *cb_data,
+					       const struct odb_for_each_object_options *opts)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	struct odb_source_inmemory_for_each_object_data payload = {
+		.inmemory = inmemory,
+		.request = request,
+		.cb = cb,
+		.cb_data = cb_data,
+	};
+	struct object_id null_oid = { 0 };
+
+	if ((opts->flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY) ||
+	    (opts->flags & ODB_FOR_EACH_OBJECT_LOCAL_ONLY && !source->local))
+		return 0;
+
+	return oidtree_each(inmemory->objects,
+			    opts->prefix ? opts->prefix : &null_oid, opts->prefix_hex_len,
+			    odb_source_inmemory_for_each_object_cb, &payload);
+}
+
 static int odb_source_inmemory_write_object(struct odb_source *source,
 					    const void *buf, unsigned long len,
 					    enum object_type type,
@@ -214,6 +267,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.free = odb_source_inmemory_free;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
+	source->base.for_each_object = odb_source_inmemory_for_each_object;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 

-- 
2.53.0.1323.g189a785ab5.dirty



* [PATCH 12/16] odb/source-inmemory: implement `find_abbrev_len()` callback
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (10 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 11/16] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-03  6:02 ` [PATCH 13/16] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
  To: git

Implement the `find_abbrev_len()` callback function for the inmemory
source.
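
The computation can be sketched on plain hex strings as follows. This
is a simplified standalone model with hypothetical `demo_*` names: it
scans every candidate instead of iterating only over objects sharing
the minimum-length prefix, but it applies the same rule as the callback
in this patch, namely that any other object sharing `n` leading hex
digits forces an abbreviation of at least `n + 1` digits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of leading hex digits that `a` and `b` have in common. */
size_t demo_common_prefix_hexlen(const char *a, const char *b)
{
	size_t n = 0;

	while (a[n] && a[n] == b[n])
		n++;
	return n;
}

/*
 * Return the shortest length >= `min_len` at which `oid` is unambiguous
 * among the `nr` hex strings in `others`. An entry identical to `oid`
 * is the object itself and does not make the abbreviation ambiguous.
 */
size_t demo_find_abbrev_len(const char *const *others, size_t nr,
			    const char *oid, size_t min_len)
{
	size_t hexsz = strlen(oid);
	size_t len = min_len;
	size_t i;

	for (i = 0; i < nr; i++) {
		size_t common = demo_common_prefix_hexlen(others[i], oid);
		if (common != hexsz && common >= len)
			len = common + 1;
	}
	return len;
}
```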

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 0ac20df323..16182bded3 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -164,6 +164,44 @@ static int odb_source_inmemory_for_each_object(struct odb_source *source,
 			    odb_source_inmemory_for_each_object_cb, &payload);
 }
 
+struct find_abbrev_len_data {
+	const struct object_id *oid;
+	unsigned len;
+};
+
+static int find_abbrev_len_cb(const struct object_id *oid,
+			      struct object_info *oi UNUSED,
+			      void *cb_data)
+{
+	struct find_abbrev_len_data *data = cb_data;
+	unsigned len = oid_common_prefix_hexlen(oid, data->oid);
+	if (len != hash_algos[oid->algo].hexsz && len >= data->len)
+		data->len = len + 1;
+	return 0;
+}
+
+static int odb_source_inmemory_find_abbrev_len(struct odb_source *source,
+					       const struct object_id *oid,
+					       unsigned min_len,
+					       unsigned *out)
+{
+	struct odb_for_each_object_options opts = {
+		.prefix = oid,
+		.prefix_hex_len = min_len,
+	};
+	struct find_abbrev_len_data data = {
+		.oid = oid,
+		.len = min_len,
+	};
+	int ret;
+
+	ret = odb_source_inmemory_for_each_object(source, NULL, find_abbrev_len_cb,
+						  &data, &opts);
+	*out = data.len;
+
+	return ret;
+}
+
 static int odb_source_inmemory_write_object(struct odb_source *source,
 					    const void *buf, unsigned long len,
 					    enum object_type type,
@@ -268,6 +306,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.for_each_object = odb_source_inmemory_for_each_object;
+	source->base.find_abbrev_len = odb_source_inmemory_find_abbrev_len;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 

-- 
2.53.0.1323.g189a785ab5.dirty



* [PATCH 13/16] odb/source-inmemory: implement `count_objects()` callback
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (11 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 12/16] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt
@ 2026-04-03  6:02 ` Patrick Steinhardt
  2026-04-03  6:02 ` [PATCH 14/16] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:02 UTC (permalink / raw)
  To: git

Implement the `count_objects()` callback function for the inmemory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 16182bded3..bd89a7ef14 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -202,6 +202,25 @@ static int odb_source_inmemory_find_abbrev_len(struct odb_source *source,
 	return ret;
 }
 
+static int count_objects_cb(const struct object_id *oid UNUSED,
+			    struct object_info *oi UNUSED,
+			    void *cb_data)
+{
+	unsigned long *counter = cb_data;
+	(*counter)++;
+	return 0;
+}
+
+static int odb_source_inmemory_count_objects(struct odb_source *source,
+					     enum odb_count_objects_flags flags UNUSED,
+					     unsigned long *out)
+{
+	struct odb_for_each_object_options opts = { 0 };
+	*out = 0;
+	return odb_source_inmemory_for_each_object(source, NULL, count_objects_cb,
+						   out, &opts);
+}
+
 static int odb_source_inmemory_write_object(struct odb_source *source,
 					    const void *buf, unsigned long len,
 					    enum object_type type,
@@ -307,6 +326,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.for_each_object = odb_source_inmemory_for_each_object;
 	source->base.find_abbrev_len = odb_source_inmemory_find_abbrev_len;
+	source->base.count_objects = odb_source_inmemory_count_objects;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 

-- 
2.53.0.1323.g189a785ab5.dirty
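
For readers skimming the thread, the counting approach in this patch
(reuse the for-each iterator and bump a counter in the callback) can be
sketched outside of Git as follows. This is only an illustration: the
`toy_*` names and the array-backed store are assumptions made for the
example and are not part of the series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an in-memory object store: an array of IDs. */
struct toy_store {
	const int *ids;
	size_t nr;
};

typedef int (*toy_each_fn)(int id, void *cb_data);

/* Invoke `fn` for every object; abort the iteration on a non-zero return. */
static int toy_for_each(const struct toy_store *store, toy_each_fn fn,
			void *cb_data)
{
	assert(store && fn);
	for (size_t i = 0; i < store->nr; i++) {
		int ret = fn(store->ids[i], cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Callback that ignores the object and only increments the counter. */
static int toy_count_cb(int id, void *cb_data)
{
	unsigned long *counter = cb_data;
	(void)id;
	(*counter)++;
	return 0;
}

/* Count objects by reusing the iterator instead of a dedicated loop. */
static int toy_count(const struct toy_store *store, unsigned long *out)
{
	*out = 0;
	return toy_for_each(store, toy_count_cb, out);
}
```

The trade-off is a linear walk per count, which seems acceptable for a
source that is expected to hold few objects.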



* [PATCH 14/16] odb/source-inmemory: implement `freshen_object()` callback
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (12 preceding siblings ...)
  2026-04-03  6:02 ` [PATCH 13/16] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt
@ 2026-04-03  6:02 ` Patrick Steinhardt
  2026-04-03  6:02 ` [PATCH 15/16] odb/source-inmemory: stub out remaining functions Patrick Steinhardt
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:02 UTC (permalink / raw)
  To: git

Implement the `freshen_object()` callback function for the inmemory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index bd89a7ef14..c5249d04bc 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -287,6 +287,15 @@ static int odb_source_inmemory_write_object_stream(struct odb_source *source,
 	return ret;
 }
 
+static int odb_source_inmemory_freshen_object(struct odb_source *source,
+					      const struct object_id *oid)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	if (find_cached_object(inmemory, oid))
+		return 1;
+	return 0;
+}
+
 static int inmemory_object_free(const struct object_id *oid UNUSED,
 				void *node_data,
 				void *cb_data UNUSED)
@@ -329,6 +338,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.count_objects = odb_source_inmemory_count_objects;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
+	source->base.freshen_object = odb_source_inmemory_freshen_object;
 
 	return source;
 }

-- 
2.53.0.1323.g189a785ab5.dirty
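
As the diff shows, freshening in this source boils down to an existence
check: return 1 when the object is present and 0 otherwise. My
understanding is that on-disk sources additionally bump a timestamp so
the object is not pruned, which has no equivalent in memory. A
standalone sketch of that reduction, with hypothetical `toy_*` names
and string IDs standing in for object IDs:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory source: a plain list of object ID strings. */
struct toy_source {
	const char *const *ids;
	size_t nr;
};

/* Linear lookup; returns 1 if `id` is stored in the source. */
static int toy_has_object(const struct toy_source *source, const char *id)
{
	assert(source && id);
	for (size_t i = 0; i < source->nr; i++)
		if (!strcmp(source->ids[i], id))
			return 1;
	return 0;
}

/*
 * "Freshen" an object. There is no timestamp to update in memory, so
 * the only thing left to report is whether the object exists.
 */
static int toy_freshen_object(const struct toy_source *source, const char *id)
{
	return toy_has_object(source, id);
}
```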



* [PATCH 15/16] odb/source-inmemory: stub out remaining functions
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (13 preceding siblings ...)
  2026-04-03  6:02 ` [PATCH 14/16] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt
@ 2026-04-03  6:02 ` Patrick Steinhardt
  2026-04-03  6:02 ` [PATCH 16/16] odb: generic inmemory source Patrick Steinhardt
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:02 UTC (permalink / raw)
  To: git

Stub out remaining functions that we either don't need or that are
basically no-ops.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index c5249d04bc..53009be032 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -296,6 +296,32 @@ static int odb_source_inmemory_freshen_object(struct odb_source *source,
 	return 0;
 }
 
+static int odb_source_inmemory_begin_transaction(struct odb_source *source UNUSED,
+						 struct odb_transaction **out UNUSED)
+{
+	return error("inmemory source does not support transactions");
+}
+
+static int odb_source_inmemory_read_alternates(struct odb_source *source UNUSED,
+					       struct strvec *out UNUSED)
+{
+	return 0;
+}
+
+static int odb_source_inmemory_write_alternate(struct odb_source *source UNUSED,
+					       const char *alternate UNUSED)
+{
+	return error("inmemory source does not support alternates");
+}
+
+static void odb_source_inmemory_close(struct odb_source *source UNUSED)
+{
+}
+
+static void odb_source_inmemory_reprepare(struct odb_source *source UNUSED)
+{
+}
+
 static int inmemory_object_free(const struct object_id *oid UNUSED,
 				void *node_data,
 				void *cb_data UNUSED)
@@ -331,6 +357,8 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
 
 	source->base.free = odb_source_inmemory_free;
+	source->base.close = odb_source_inmemory_close;
+	source->base.reprepare = odb_source_inmemory_reprepare;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.for_each_object = odb_source_inmemory_for_each_object;
@@ -339,6 +367,9 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 	source->base.freshen_object = odb_source_inmemory_freshen_object;
+	source->base.begin_transaction = odb_source_inmemory_begin_transaction;
+	source->base.read_alternates = odb_source_inmemory_read_alternates;
+	source->base.write_alternate = odb_source_inmemory_write_alternate;
 
 	return source;
 }

-- 
2.53.0.1323.g189a785ab5.dirty
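
The pattern in this patch is to fill every slot of the callback table,
even for operations the backend cannot perform: unsupported operations
report an error, while operations that have nothing to do succeed
trivially. A minimal sketch of the idea, assuming a made-up
`struct toy_source` with three callbacks (none of these names exist in
Git):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical callback table shared by all backends. */
struct toy_source {
	int (*begin_transaction)(struct toy_source *source);
	int (*read_alternates)(struct toy_source *source, size_t *nr_out);
	void (*close)(struct toy_source *source);
};

/* Unsupported operation: report the problem and return -1. */
static int toy_begin_transaction(struct toy_source *source)
{
	(void)source;
	fprintf(stderr, "error: toy source does not support transactions\n");
	return -1;
}

/* Nothing to read: succeed and report zero alternates. */
static int toy_read_alternates(struct toy_source *source, size_t *nr_out)
{
	(void)source;
	*nr_out = 0;
	return 0;
}

/* No resources to release, so closing is a no-op. */
static void toy_close(struct toy_source *source)
{
	(void)source;
}

/* Wire up all callbacks so that generic callers never see a NULL slot. */
static void toy_source_init(struct toy_source *source)
{
	assert(source);
	source->begin_transaction = toy_begin_transaction;
	source->read_alternates = toy_read_alternates;
	source->close = toy_close;
}
```

Filling every slot means generic code can call through the table
without checking for NULL first.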



* [PATCH 16/16] odb: generic inmemory source
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (14 preceding siblings ...)
  2026-04-03  6:02 ` [PATCH 15/16] odb/source-inmemory: stub out remaining functions Patrick Steinhardt
@ 2026-04-03  6:02 ` Patrick Steinhardt
  2026-04-03 15:41 ` [PATCH 00/16] odb: introduce "inmemory" source Junio C Hamano
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:02 UTC (permalink / raw)
  To: git

Make the in-memory source generic.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c | 8 ++++----
 odb.h | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/odb.c b/odb.c
index 34228c0cd5..70c59fef91 100644
--- a/odb.c
+++ b/odb.c
@@ -560,7 +560,7 @@ static int do_oid_object_info_extended(struct object_database *odb,
 	if (is_null_oid(real))
 		return -1;
 
-	if (!odb_source_read_object_info(&odb->inmemory_objects->base, oid, oi, flags))
+	if (!odb_source_read_object_info(odb->inmemory_objects, oid, oi, flags))
 		return 0;
 
 	odb_prepare_alternates(odb);
@@ -737,7 +737,7 @@ int odb_pretend_object(struct object_database *odb,
 	if (odb_has_object(odb, oid, 0))
 		return 0;
 
-	return odb_source_write_object(&odb->inmemory_objects->base,
+	return odb_source_write_object(odb->inmemory_objects,
 				       buf, len, type, oid, NULL, 0);
 }
 
@@ -1020,7 +1020,7 @@ struct object_database *odb_new(struct repository *repo,
 	o->sources = odb_source_new(o, primary_source, true);
 	o->sources_tail = &o->sources->next;
 	o->alternate_db = xstrdup_or_null(secondary_sources);
-	o->inmemory_objects = odb_source_inmemory_new(o);
+	o->inmemory_objects = &odb_source_inmemory_new(o)->base;
 
 	free(to_free);
 
@@ -1045,7 +1045,7 @@ static void odb_free_sources(struct object_database *o)
 		o->sources = next;
 	}
 
-	odb_source_free(&o->inmemory_objects->base);
+	odb_source_free(o->inmemory_objects);
 	o->inmemory_objects = NULL;
 
 	kh_destroy_odb_path_map(o->source_by_path);
diff --git a/odb.h b/odb.h
index 3d20270a05..e3211ad8d4 100644
--- a/odb.h
+++ b/odb.h
@@ -99,7 +99,7 @@ struct object_database {
 	 * to write them into the object store (e.g. a browse-only
 	 * application).
 	 */
-	struct odb_source_inmemory *inmemory_objects;
+	struct odb_source *inmemory_objects;
 
 	/*
 	 * A fast, rough count of the number of objects in the repository.

-- 
2.53.0.1323.g189a785ab5.dirty
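
The change above stores a pointer to the embedded base structure
instead of the concrete type, so callers go through the generic
interface and only the backend itself downcasts. A self-contained
sketch of that layout follows; the `toy_*` types and the type tag are
assumptions for illustration, not the definitions from odb/source.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum toy_source_type { TOY_SOURCE_FILES, TOY_SOURCE_INMEMORY };

/* Generic part that every backend embeds. */
struct toy_source {
	enum toy_source_type type;
	void (*free)(struct toy_source *source);
};

/* Concrete backend with the generic part embedded as `base`. */
struct toy_source_inmemory {
	struct toy_source base;
	size_t objects_nr;
};

/* Recover the concrete structure from a pointer to its embedded base. */
static struct toy_source_inmemory *toy_inmemory_downcast(struct toy_source *source)
{
	assert(source && source->type == TOY_SOURCE_INMEMORY);
	return (struct toy_source_inmemory *)
		((char *)source - offsetof(struct toy_source_inmemory, base));
}

static void toy_inmemory_free(struct toy_source *source)
{
	free(toy_inmemory_downcast(source));
}

static struct toy_source_inmemory *toy_inmemory_new(void)
{
	struct toy_source_inmemory *source = calloc(1, sizeof(*source));
	if (!source)
		abort();
	source->base.type = TOY_SOURCE_INMEMORY;
	source->base.free = toy_inmemory_free;
	return source;
}

/* The database only knows about the generic type. */
struct toy_database {
	struct toy_source *inmemory_objects;
};
```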



* Re: [PATCH 00/16] odb: introduce "inmemory" source
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (15 preceding siblings ...)
  2026-04-03  6:02 ` [PATCH 16/16] odb: generic inmemory source Patrick Steinhardt
@ 2026-04-03 15:41 ` Junio C Hamano
  2026-04-08  8:22   ` Patrick Steinhardt
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
  18 siblings, 1 reply; 85+ messages in thread
From: Junio C Hamano @ 2026-04-03 15:41 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> this patch series introduces the second object database source type,
> which is the "inmemory" source.

I cannot read the word without a hyphen, i.e., "in-memory".

> This source may seem somewhat odd at first: it always starts out empty,
> and any object written into it will only exist in memory until the
> process exits. But the source already serves a purpose in our codebase,
> where some commands, for example git-blame(1), write an in-memory
> worktree commit.

Intermediate tree and blob objects you need while making an octopus
merge may also benefit from this feature, to stay only in-core
without having to get written out to the outside world.

I understand that this is not meant to be used as a "we create them
only for ourselves and they are available only to us while we work,
but once we are satisfied we make them available to others", which
is much better done by creating an on-disk ephemeral object store,
writing such objects into it, and then deciding at the end of the
process between discarding the ephemeral store and moving objects
from there to the main object store.

> Furthermore, I think that going forward it can serve more purposes as we
> now have an easy way to write and read objects that will not get
> persisted. I could see that this may be useful when for example
> re-merging diffs. But eventually, once we have the object storage format
> extension wired up, callers might even want to manually set up an
> in-memory database as the primary ODB for write operations so that no
> data will be persisted in an arbitrary write.

;-)

> Last but not least, this patch series also serves the purpose of
> eventually getting rid of the `struct object_info::whence` member.
> Instead, we'll simply yield the ODB source a specific object has been
> read from, together with some backend-specific data, which gives
> strictly more information compared to the status quo.
>
> The series is based on cf2139f8e1 (The 24th batch, 2026-04-01) with
> ps/odb-cleanup at 109bcb7d1d (odb: drop unneeded headers and forward
> decls, 2026-04-01) merged into it.
>
> Thanks!
>
> Patrick
>
> ---
> Patrick Steinhardt (16):
>       odb: introduce "inmemory" source
>       odb/source-inmemory: implement `free()` callback
>       odb: fix unnecessary call to `find_cached_object()`
>       odb/source-inmemory: implement `read_object_info()` callback
>       odb/source-inmemory: implement `read_object_stream()` callback
>       odb/source-inmemory: implement `write_object()` callback
>       odb/source-inmemory: implement `write_object_stream()` callback
>       cbtree: allow using arbitrary wrapper structures for nodes
>       oidtree: add ability to store data
>       odb/source-inmemory: convert to use oidtree
>       odb/source-inmemory: implement `for_each_object()` callback
>       odb/source-inmemory: implement `find_abbrev_len()` callback
>       odb/source-inmemory: implement `count_objects()` callback
>       odb/source-inmemory: implement `freshen_object()` callback
>       odb/source-inmemory: stub out remaining functions
>       odb: generic inmemory source
>
>  Makefile                 |   1 +
>  cbtree.c                 |  25 +++-
>  cbtree.h                 |  11 +-
>  loose.c                  |   2 +-
>  meson.build              |   1 +
>  object-file.c            |   3 +-
>  odb.c                    |  82 ++---------
>  odb.h                    |   4 +-
>  odb/source-inmemory.c    | 375 +++++++++++++++++++++++++++++++++++++++++++++++
>  odb/source-inmemory.h    |  33 +++++
>  odb/source.h             |   3 +
>  oidtree.c                |  66 ++++++---
>  oidtree.h                |  12 +-
>  t/unit-tests/u-oidtree.c |  26 +++-
>  14 files changed, 529 insertions(+), 115 deletions(-)
>
>
> ---
> base-commit: 3d05c3e2906489caa9f12f0af18dc233a6b8032c
> change-id: 20260401-b4-pks-odb-source-inmemory-7b17c83d9e43


* Re: [PATCH 07/16] odb/source-inmemory: implement `write_object_stream()` callback
  2026-04-03  6:01 ` [PATCH 07/16] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt
@ 2026-04-03 22:11   ` Junio C Hamano
  2026-04-08  8:22     ` Patrick Steinhardt
  0 siblings, 1 reply; 85+ messages in thread
From: Junio C Hamano @ 2026-04-03 22:11 UTC (permalink / raw)
  To: Patrick Steinhardt, Justin Tobler; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> Implement the `write_object_stream()` callback function for the inmemory
> source.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  odb/source-inmemory.c | 40 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 40 insertions(+)

As the signature of the .read() method drastically changes in
another topic in flight,

  https://lore.kernel.org/git/20260402213220.2651523-4-jltobler@gmail.com/

this needs a bit of inter-topic coordination.

> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> index d2fc4c4054..890e2a8c7c 100644
> --- a/odb/source-inmemory.c
> +++ b/odb/source-inmemory.c
> @@ -123,6 +123,45 @@ static int odb_source_inmemory_write_object(struct odb_source *source,
>  	return 0;
>  }
>  
> +static int odb_source_inmemory_write_object_stream(struct odb_source *source,
> +						   struct odb_write_stream *stream,
> +						   size_t len,
> +						   struct object_id *oid)
> +{
> +	size_t total_read = 0;
> +	char *data;
> +	int ret;
> +
> +	CALLOC_ARRAY(data, len);
> +	while (!stream->is_finished) {
> +		unsigned long bytes_read;
> +		const void *in;
> +
> +		in = stream->read(stream, &bytes_read);
> +		if (total_read + bytes_read > len) {
> +			ret = error("object stream yielded more bytes than expected");
> +			goto out;
> +		}
> +
> +		memcpy(data + total_read, in, bytes_read);
> +		total_read += bytes_read;
> +	}
> +
> +	if (total_read != len) {
> +		ret = error("object stream yielded less bytes than expected");
> +		goto out;
> +	}
> +
> +	ret = odb_source_inmemory_write_object(source, data, len, OBJ_BLOB, oid,
> +					       NULL, 0);
> +	if (ret < 0)
> +		goto out;
> +
> +out:
> +	free(data);
> +	return ret;
> +}
> +
>  static void odb_source_inmemory_free(struct odb_source *source)
>  {
>  	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
> @@ -144,6 +183,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
>  	source->base.read_object_info = odb_source_inmemory_read_object_info;
>  	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
>  	source->base.write_object = odb_source_inmemory_write_object;
> +	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
>  
>  	return source;
>  }
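
When draining a chunked stream into a preallocated buffer, each chunk
generally has to be appended at the running offset, and the total has
to be checked against the announced length in both directions. The
following standalone sketch illustrates that loop. The `toy_stream`
type and its read function are hypothetical and are not the
`odb_write_stream` interface, whose signature is being changed by the
other topic mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stream that hands out `src` in pieces of `chunk` bytes. */
struct toy_stream {
	const char *src;
	size_t len, pos, chunk; /* `chunk` must be greater than zero */
	int is_finished;
};

/* Return a pointer to the next chunk and store its size in `bytes_read`. */
static const void *toy_stream_read(struct toy_stream *st, size_t *bytes_read)
{
	const char *p = st->src + st->pos;
	size_t n = st->len - st->pos;
	if (n > st->chunk)
		n = st->chunk;
	st->pos += n;
	if (st->pos == st->len)
		st->is_finished = 1;
	*bytes_read = n;
	return p;
}

/* Copy exactly `len` bytes into `out`; -1 if the stream is longer or shorter. */
static int toy_slurp(struct toy_stream *st, char *out, size_t len)
{
	size_t total = 0;

	assert(st && out && st->chunk > 0);
	while (!st->is_finished) {
		size_t n;
		const void *in = toy_stream_read(st, &n);
		if (n > len - total)
			return -1; /* more bytes than expected */
		memcpy(out + total, in, n); /* append at the running offset */
		total += n;
	}
	return total == len ? 0 : -1; /* fewer bytes than expected */
}
```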


* Re: [PATCH 00/16] odb: introduce "inmemory" source
  2026-04-03 15:41 ` [PATCH 00/16] odb: introduce "inmemory" source Junio C Hamano
@ 2026-04-08  8:22   ` Patrick Steinhardt
  2026-04-08 21:48     ` Junio C Hamano
  0 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-08  8:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Fri, Apr 03, 2026 at 08:41:16AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > this patch series introduces the second object database source type,
> > which is the "inmemory" source.
> 
> I cannot read the word without a hyphen, i.e., "in-memory".

Fair. I think I'll keep it as `odb_source_inmemory` in the sources,
which I find easier to parse than `odb_source_in_memory`, but will adapt
to "in-memory" in prose. I already did this for the most part, but
not in the cover letter indeed.

> > This source may seem somewhat odd at first: it always starts out empty,
> > and any object written into it will only exist in memory until the
> > process exits. But the source already serves a purpose in our codebase,
> > where some commands, for example git-blame(1), write an in-memory
> > worktree commit.
> 
> Intermediate tree and blob objects you need while making an octopus
> merge may also benefit from this feature, to stay only in-core
> without having to get written out to the outside world.
> 
> I understand that this is not meant to be used as a "we create them
> only for ourselves and they are available only to us while we work,
> but once we are satisfied we make them available to others", which
> is much better done by creating an on-disk ephemeral object store,
> writing such objects into it, and then deciding at the end of the
> process between discarding the ephemeral store and moving objects
> from there to the main object store.

Yeah, I think there's a bunch of use cases where this could be useful
going forward.

Patrick


* Re: [PATCH 07/16] odb/source-inmemory: implement `write_object_stream()` callback
  2026-04-03 22:11   ` Junio C Hamano
@ 2026-04-08  8:22     ` Patrick Steinhardt
  0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-08  8:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Justin Tobler, git

On Fri, Apr 03, 2026 at 03:11:16PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Implement the `write_object_stream()` callback function for the inmemory
> > source.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  odb/source-inmemory.c | 40 ++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 40 insertions(+)
> 
> As the signature of the .read() method drastically changes in
> another topic in flight,
> 
>   https://lore.kernel.org/git/20260402213220.2651523-4-jltobler@gmail.com/
> 
> this needs a bit of inter-topic coordination.

Fair. I think Justin's patch series is close to landing, so I'll rebase
my patch series on top of his. Thanks for flagging.

Patrick


* Re: [PATCH 01/16] odb: introduce "inmemory" source
  2026-04-03  6:01 ` [PATCH 01/16] " Patrick Steinhardt
@ 2026-04-08 21:00   ` Justin Tobler
  2026-04-09  5:22     ` Patrick Steinhardt
  0 siblings, 1 reply; 85+ messages in thread
From: Justin Tobler @ 2026-04-08 21:00 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> Next to our typical object database sources, each object database also
> has an implicit source of "cached" objects. These cached objects only
> exist in memory and have some use cases:
> 
>   - They contain evergreen objects that we expect to always exist, like
>     for example the empty tree.
> 
>   - They can be used to store temporary objects that we don't want to
>     persist to disk.
> 
> Overall, their use is somewhat restricted though. For example, we don't
> provide the ability to use it as a temporary object database source that
> allows the user to write objects, but discard them after Git exits. So
> while these cached objects behave almost like a source, they aren't used
> as one.

I find the wording of the second bullet point and paragraph above a
little confusing. Are there existing uses where new objects are written
to only the cache?

> This is about to change over the following commits, where we will turn
> cached objects into a new "inmemory" source. This will allow us to use
> it exactly the same as any other source by providing the same common
> interface as the "files" source.

Treating the object cache just like any other ODB source seems like a
good direction.

> For now, the inmemory source only hosts the cached objects and doesn't
> provide any logic yet. This will change with subsequent commits, where
> we move respective functionality into the source.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Makefile              |  1 +
>  meson.build           |  1 +
>  odb.c                 | 21 +++++++++++++--------
>  odb.h                 |  4 ++--
>  odb/source-inmemory.c | 12 ++++++++++++
>  odb/source-inmemory.h | 35 +++++++++++++++++++++++++++++++++++
>  odb/source.h          |  3 +++
>  7 files changed, 67 insertions(+), 10 deletions(-)
> 
[snip]
> @@ -1123,9 +1126,11 @@ void odb_free(struct object_database *o)
>  	odb_close(o);
>  	odb_free_sources(o);
>  
> -	for (size_t i = 0; i < o->cached_object_nr; i++)
> -		free((char *) o->cached_objects[i].value.buf);
> -	free(o->cached_objects);
> +	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
> +		free((char *) o->inmemory_objects->objects[i].value.buf);
> +	free(o->inmemory_objects->objects);
> +	free(o->inmemory_objects->base.path);
> +	free(o->inmemory_objects);

Should we have some sort of `odb_source_inmemory_release()`?

>  
>  	string_list_clear(&o->submodule_source_paths, 0);
>  
> diff --git a/odb.h b/odb.h
> index 3a711f6547..3d20270a05 100644
> --- a/odb.h
> +++ b/odb.h
> @@ -8,6 +8,7 @@
>  #include "thread-utils.h"
>  
>  struct cached_object_entry;
> +struct odb_source_inmemory;
>  struct packed_git;
>  struct repository;
>  struct strbuf;
> @@ -98,8 +99,7 @@ struct object_database {
>  	 * to write them into the object store (e.g. a browse-only
>  	 * application).
>  	 */
> -	struct cached_object_entry *cached_objects;
> -	size_t cached_object_nr, cached_object_alloc;
> +	struct odb_source_inmemory *inmemory_objects;

We store an inmemory ODB source instead of the cache object info
directly. Makes sense. 

>  
>  	/*
>  	 * A fast, rough count of the number of objects in the repository.
> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> new file mode 100644
> index 0000000000..c7ac5c24f0
> --- /dev/null
> +++ b/odb/source-inmemory.c
> @@ -0,0 +1,12 @@
> +#include "git-compat-util.h"
> +#include "odb/source-inmemory.h"
> +
> +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
> +{
> +	struct odb_source_inmemory *source;
> +
> +	CALLOC_ARRAY(source, 1);
> +	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);

huh, so we set the path for the `struct odb_source` to "source". In the
context of an inmemory source, a path doesn't make much sense. I suspect
though that storing a path is likely only useful in the context of the
files ODB source. Is there a reason for us to still keep this around in
the generic ODB source?

> +
> +	return source;
> +}
> diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
> new file mode 100644
> index 0000000000..95477bf36d
> --- /dev/null
> +++ b/odb/source-inmemory.h
> @@ -0,0 +1,35 @@
> +#ifndef ODB_SOURCE_INMEMORY_H
> +#define ODB_SOURCE_INMEMORY_H
> +
> +#include "odb/source.h"
> +
> +struct cached_object_entry;
> +
> +/*
> + * An inmemory source that you can write objects to that shall be made
> + * available for reading, but that shouldn't ever be persisted to disk. Note
> + * that any objects written to this source will be stored in memory, so the
> + * number of objects you can store is limited by available system memory.
> + */
> +struct odb_source_inmemory {
> +	struct odb_source base;
> +
> +	struct cached_object_entry *objects;
> +	size_t objects_nr, objects_alloc;
> +};

This new ODB source now just contains the object cache info. Looks good.

-Justin


* Re: [PATCH 02/16] odb/source-inmemory: implement `free()` callback
  2026-04-03  6:01 ` [PATCH 02/16] odb/source-inmemory: implement `free()` callback Patrick Steinhardt
@ 2026-04-08 21:05   ` Justin Tobler
  0 siblings, 0 replies; 85+ messages in thread
From: Justin Tobler @ 2026-04-08 21:05 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> @@ -1126,12 +1115,6 @@ void odb_free(struct object_database *o)
>  	odb_close(o);
>  	odb_free_sources(o);
>  
> -	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
> -		free((char *) o->inmemory_objects->objects[i].value.buf);
> -	free(o->inmemory_objects->objects);
> -	free(o->inmemory_objects->base.path);
> -	free(o->inmemory_objects);

Ah ok, this addresses a comment in the previous patch.

> -
>  	string_list_clear(&o->submodule_source_paths, 0);
>  
>  	free(o);
> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> index c7ac5c24f0..ccbb622eae 100644
> --- a/odb/source-inmemory.c
> +++ b/odb/source-inmemory.c
> @@ -1,6 +1,16 @@
>  #include "git-compat-util.h"
>  #include "odb/source-inmemory.h"
>  
> +static void odb_source_inmemory_free(struct odb_source *source)
> +{
> +	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
> +	for (size_t i = 0; i < inmemory->objects_nr; i++)
> +		free((char *) inmemory->objects[i].value.buf);
> +	free(inmemory->objects);
> +	free(inmemory->base.path);
> +	free(inmemory);
> +}
> +
>  struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
>  {
>  	struct odb_source_inmemory *source;
> @@ -8,5 +18,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
>  	CALLOC_ARRAY(source, 1);
>  	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
>  
> +	source->base.free = odb_source_inmemory_free;

We wire up a function to specifically handle freeing the inmemory ODB
source. Looks good.

-Justin


* Re: [PATCH 03/16] odb: fix unnecessary call to `find_cached_object()`
  2026-04-03  6:01 ` [PATCH 03/16] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt
@ 2026-04-08 21:13   ` Justin Tobler
  2026-04-09  5:22     ` Patrick Steinhardt
  0 siblings, 1 reply; 85+ messages in thread
From: Justin Tobler @ 2026-04-08 21:13 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> diff --git a/odb.c b/odb.c
> index d321242353..21cdedc31c 100644
> --- a/odb.c
> +++ b/odb.c
> @@ -774,8 +774,7 @@ int odb_pretend_object(struct object_database *odb,
>  	char *co_buf;
>  
>  	hash_object_file(odb->repo->hash_algo, buf, len, type, oid);
> -	if (odb_has_object(odb, oid, 0) ||
> -	    find_cached_object(odb, oid))
> +	if (odb_has_object(odb, oid, 0))

Nice, odb_has_object() does indeed already check the object cache so
that makes the explicit find_cached_object() redundant.

In a future where temporary objects could be written to the inmemory ODB
source, would there ever be a reason for odb_has_object() to
differentiate between inmemory and real objects?

-Justin


* Re: [PATCH 05/16] odb/source-inmemory: implement `read_object_stream()` callback
  2026-04-03  6:01 ` [PATCH 05/16] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
@ 2026-04-08 21:24   ` Justin Tobler
  2026-04-09  5:22     ` Patrick Steinhardt
  0 siblings, 1 reply; 85+ messages in thread
From: Justin Tobler @ 2026-04-08 21:24 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> Implement the `read_object_stream()` callback function for the inmemory
> source.

Hmmm, if the whole object is already in memory, outside providing a
complete ODB source interface, is there really much reason for streaming
the object in practice?

The patch itself looks good though.

-Justin


* Re: [PATCH 00/16] odb: introduce "inmemory" source
  2026-04-08  8:22   ` Patrick Steinhardt
@ 2026-04-08 21:48     ` Junio C Hamano
  2026-04-09  5:22       ` Patrick Steinhardt
  0 siblings, 1 reply; 85+ messages in thread
From: Junio C Hamano @ 2026-04-08 21:48 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> On Fri, Apr 03, 2026 at 08:41:16AM -0700, Junio C Hamano wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>> 
>> > this patch series introduces the second object database source type,
>> > which is the "inmemory" source.
>> 
>> I cannot read the word without a hyphen, i.e., "in-memory".
>
> Fair. I think I'll keep it as `odb_source_inmemory` in the sources,
> which I find easier to parse than `odb_source_in_memory`, but will adapt
> to "in-memory" in prose. I already did this for the most part, but
> not in the cover letter indeed.

Fair.

FWIW, we do the same for "in core" or "in-core" in prose, and
"incore" in identifier names, so the above is an understandable
position to take.

But stepping back a bit, does this new "in memory" refer to a
concept that is different from what the rest of the system uses "in
core" to represent?


* Re: [PATCH 00/16] odb: introduce "inmemory" source
  2026-04-08 21:48     ` Junio C Hamano
@ 2026-04-09  5:22       ` Patrick Steinhardt
  2026-04-09 13:46         ` Junio C Hamano
  0 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  5:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Apr 08, 2026 at 02:48:52PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > On Fri, Apr 03, 2026 at 08:41:16AM -0700, Junio C Hamano wrote:
> >> Patrick Steinhardt <ps@pks.im> writes:
> >> 
> >> > this patch series introduces the second object database source type,
> >> > which is the "inmemory" source.
> >> 
> >> I cannot read the word without a hyphen, i.e., "in-memory".
> >
> > Fair. I think I'll keep it as `odb_source_inmemory` in the sources,
> > which I find easier to parse than `odb_source_in_memory`, but will adapt
> > to "in-memory" in prose. I already did this for the most part, but
> > not in the cover letter indeed.
> 
> Fair.
> 
> FWIW, we do the same for "in core" or "in-core" in prose, and
> "incore" in identifier names, so the above is an understandable
> position to take.
> 
> But stepping back a bit, does this new "in memory" refer to a
> concept that is different from what the rest of the system uses "in
> core" to represent?

No, in principle it's not any different. One of the reasons I decided to
go with "in memory" though is that this backend may eventually be
(power-)user-facing via the planned "objectStorage" extension.

This extension will work similarly to how the "refStorage" extension
works, where every backend has a schema followed by an optional payload.
So for the files backend it would be "files://<path>", and if one wants
to configure a temporary ODB source that doesn't store objects it would
be "inmemory://". And overall, I think that "inmemory" is a lot easier
to understand intuitively compared to "incore".
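
To make the "schema followed by an optional payload" idea concrete,
here is a rough sketch of how such a value could be split. Everything
in it is an assumption for illustration: the extension is only planned,
and the `toy_*` names, the enum, and the parsing rules are invented
here rather than taken from Git.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_backend {
	TOY_BACKEND_UNKNOWN,
	TOY_BACKEND_FILES,
	TOY_BACKEND_INMEMORY,
};

/*
 * Split "<schema>://<payload>" and map the schema to a backend. On
 * success `*payload` points at the (possibly empty) payload inside
 * `value`; otherwise it is set to NULL.
 */
static enum toy_backend toy_parse_storage(const char *value,
					  const char **payload)
{
	const char *sep;
	size_t schema_len;

	assert(value && payload);
	*payload = NULL;

	sep = strstr(value, "://");
	if (!sep)
		return TOY_BACKEND_UNKNOWN;
	schema_len = (size_t)(sep - value);

	if (schema_len == strlen("files") &&
	    !strncmp(value, "files", schema_len)) {
		*payload = sep + 3;
		return TOY_BACKEND_FILES;
	}
	if (schema_len == strlen("inmemory") &&
	    !strncmp(value, "inmemory", schema_len)) {
		*payload = sep + 3;
		return TOY_BACKEND_INMEMORY;
	}
	return TOY_BACKEND_UNKNOWN;
}
```

With this sketch, "files:///tmp/objects" yields the files backend with
the payload "/tmp/objects", while "inmemory://" yields the in-memory
backend with an empty payload.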

The counter argument may be that this really only is for power users
anyway, as it's a rather risky thing to do (e.g. you must not update any
refs), and such power users may understand the concept of "in-core". But
even there I feel like it makes sense to rather say "in-memory".

Patrick


* Re: [PATCH 01/16] odb: introduce "inmemory" source
  2026-04-08 21:00   ` Justin Tobler
@ 2026-04-09  5:22     ` Patrick Steinhardt
  0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  5:22 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

On Wed, Apr 08, 2026 at 04:00:48PM -0500, Justin Tobler wrote:
> On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> > Next to our typical object database sources, each object database also
> > has an implicit source of "cached" objects. These cached objects only
> > exist in memory and have some use cases:
> > 
> >   - They contain evergreen objects that we expect to always exist, like
> >     for example the empty tree.
> > 
> >   - They can be used to store temporary objects that we don't want to
> >     persist to disk.
> > 
> > Overall, their use is somewhat restricted though. For example, we don't
> > provide the ability to use it as a temporary object database source that
> > allows the user to write objects, but discard them after Git exits. So
> > while these cached objects behave almost like a source, they aren't used
> > as one.
> 
> I find the wording of the second bullet point and paragraph above a
> little confusing. Are there existing uses where new objects are written
> to only the cache?

Yes, there's a single user with git-blame(1). I'll mention that user
explicitly.

> > @@ -1123,9 +1126,11 @@ void odb_free(struct object_database *o)
> >  	odb_close(o);
> >  	odb_free_sources(o);
> >  
> > -	for (size_t i = 0; i < o->cached_object_nr; i++)
> > -		free((char *) o->cached_objects[i].value.buf);
> > -	free(o->cached_objects);
> > +	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
> > +		free((char *) o->inmemory_objects->objects[i].value.buf);
> > +	free(o->inmemory_objects->objects);
> > +	free(o->inmemory_objects->base.path);
> > +	free(o->inmemory_objects);
> 
> Should we have some sort of `odb_source_inmemory_release()`?

Yup, this is coming in subsequent commits.

> > diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> > new file mode 100644
> > index 0000000000..c7ac5c24f0
> > --- /dev/null
> > +++ b/odb/source-inmemory.c
> > @@ -0,0 +1,12 @@
> > +#include "git-compat-util.h"
> > +#include "odb/source-inmemory.h"
> > +
> > +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
> > +{
> > +	struct odb_source_inmemory *source;
> > +
> > +	CALLOC_ARRAY(source, 1);
> > +	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
> 
> huh, so we set the path for the `struct odb_source` to "source". In the
> context of an inmemory source, a path doesn't make much sense. I suspect
> though that storing a path is likely only useful in the context of the
> files ODB source. Is there reason for us to still keep this around in
> the generic ODB source?

There are two reasons for the "path" field to exist:

  - It is used to compare sources with one another to figure out whether
    two sources are actually the same. This is used when reloading
    sources. This usage makes sense in principle, but it's wrong that we
    consider this to be a "path" -- it should rather be considered an
    opaque "payload".

  - The path field is used in a bunch of sites to actually figure out
    paths. This is plain wrong, as we cannot guarantee that the field
    even is a path for backends that don't store data on the filesystem.

Disentangling this is one of the topics that we've got on our plate.
The goal is ultimately to move the path into the files backend, fix up
callers to do the right thing (TM) and then convert the current path
field that we have into a payload.
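
As a rough sketch of what comparing sources by an opaque payload could look
like, consider the following. It is an illustration only: `struct
toy_source` and `toy_source_equal()` are made-up names and do not reflect
the actual `struct odb_source` layout.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the real source types. */
enum toy_source_type { TOY_SOURCE_FILES, TOY_SOURCE_INMEMORY };

struct toy_source {
	enum toy_source_type type;
	/* Opaque to generic code; only the backend knows what it means. */
	const char *payload;
};

/*
 * Two sources are considered the same if they use the same backend and
 * carry the same payload. The payload is compared byte-wise and never
 * interpreted as a filesystem path.
 */
static int toy_source_equal(const struct toy_source *a,
			    const struct toy_source *b)
{
	assert(a && b && a->payload && b->payload);
	return a->type == b->type && !strcmp(a->payload, b->payload);
}
```

The point of the sketch is that generic code only ever compares the
payload, while anything that needs an actual path would have to ask the
files backend for it.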

Patrick


* Re: [PATCH 03/16] odb: fix unnecessary call to `find_cached_object()`
  2026-04-08 21:13   ` Justin Tobler
@ 2026-04-09  5:22     ` Patrick Steinhardt
  0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  5:22 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

On Wed, Apr 08, 2026 at 04:13:45PM -0500, Justin Tobler wrote:
> On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> > diff --git a/odb.c b/odb.c
> > index d321242353..21cdedc31c 100644
> > --- a/odb.c
> > +++ b/odb.c
> > @@ -774,8 +774,7 @@ int odb_pretend_object(struct object_database *odb,
> >  	char *co_buf;
> >  
> >  	hash_object_file(odb->repo->hash_algo, buf, len, type, oid);
> > -	if (odb_has_object(odb, oid, 0) ||
> > -	    find_cached_object(odb, oid))
> > +	if (odb_has_object(odb, oid, 0))
> 
> Nice, odb_has_object() does indeed already check the object cache so
> that makes the explicit find_cached_object() redundant.
> 
> In a future where temporary objects could be written to the inmemory ODB
> source, would there ever be a reason for odb_has_object() to
> differentiate between inmemory and real objects?

We could in theory just append the in-memory source to the normal list
of sources, and that would ensure that all the usual operations would
know to also consider this source. But there's a couple of points that
speak against it, at least for now:

  - Callers that explicitly want to write temporary objects
    need to have a handle to the in-memory source. That handle would be
    hard to obtain if we were to only store the source in the list of
    sources.

  - It would be a change in behaviour if functions like
    `odb_for_each_object()` were to also enumerate in-memory objects.

The former could be solved by keeping both the direct pointer and the
source in the list. The latter could be solved by having a separate flag
for `odb_for_each_object()` that tells the ODB whether we want to
include or exclude in-memory objects.

But overall it feels like this would only complicate things without much
of a tangible benefit.
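
For illustration, the alternative described above (a direct pointer plus
list membership, with a flag that controls enumeration) could be sketched
as follows. All names here are hypothetical toy types made up for this
example, not the actual ODB interfaces, and the sketch iterates over
sources rather than individual objects to stay short.

```c
#include <assert.h>
#include <stddef.h>

enum toy_source_type { TOY_SOURCE_FILES, TOY_SOURCE_INMEMORY };

struct toy_source {
	enum toy_source_type type;
	struct toy_source *next;
	size_t nr_objects;
};

struct toy_odb {
	struct toy_source *sources;   /* all sources, in-memory one included */
	struct toy_source *inmemory;  /* direct handle for writers */
};

/* Hypothetical flag: also visit the in-memory source. */
#define TOY_FOR_EACH_INCLUDE_INMEMORY (1u << 0)

typedef int (*toy_each_source_fn)(const struct toy_source *source, void *data);

/*
 * Invoke the callback for every source. The in-memory source is skipped
 * unless the caller opts in, which preserves the old enumeration
 * behaviour by default. Stops at the first non-zero callback result.
 */
static int toy_for_each_source(const struct toy_odb *odb, unsigned flags,
			       toy_each_source_fn fn, void *data)
{
	assert(odb && fn);

	for (const struct toy_source *s = odb->sources; s; s = s->next) {
		int ret;

		if (s->type == TOY_SOURCE_INMEMORY &&
		    !(flags & TOY_FOR_EACH_INCLUDE_INMEMORY))
			continue;

		ret = fn(s, data);
		if (ret)
			return ret;
	}

	return 0;
}

/* Example callback: accumulate the object count into a size_t. */
static int toy_sum_objects(const struct toy_source *source, void *data)
{
	*(size_t *)data += source->nr_objects;
	return 0;
}
```

Even in this toy form the extra flag and the duplicated handle are visible,
which is the added complexity referred to above.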

Patrick


* Re: [PATCH 05/16] odb/source-inmemory: implement `read_object_stream()` callback
  2026-04-08 21:24   ` Justin Tobler
@ 2026-04-09  5:22     ` Patrick Steinhardt
  0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  5:22 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

On Wed, Apr 08, 2026 at 04:24:13PM -0500, Justin Tobler wrote:
> On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> > Implement the `read_object_stream()` callback function for the inmemory
> > source.
> 
> Hmmm, if the whole object is already in memory, outside of providing a
> complete ODB source interface, is there really much reason for streaming
> the object in practice?

I cannot think of any, but wanted to provide this function anyway so
that the backend is complete.
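
To show why streaming is cheap to provide when the object already lives in
memory, here is a minimal sketch of a chunked reader over a buffer. The
names `struct mem_stream` and `mem_stream_read()` are assumptions made for
this example; the actual callback in the series uses Git's own stream
types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical read cursor over an object that is fully in memory. */
struct mem_stream {
	const unsigned char *buf;
	size_t len;
	size_t pos;
};

/*
 * Copy up to `out_len` bytes into `out` and advance the cursor.
 * Returns the number of bytes copied; 0 signals end of stream.
 */
static size_t mem_stream_read(struct mem_stream *st, void *out, size_t out_len)
{
	size_t remaining, n;

	assert(st && out && st->pos <= st->len);

	remaining = st->len - st->pos;
	n = out_len < remaining ? out_len : remaining;
	if (n)
		memcpy(out, st->buf + st->pos, n);
	st->pos += n;

	return n;
}
```

The reader is nothing more than a bounded `memcpy()` with an offset, so the
cost of offering the streaming interface is small even if no caller depends
on it.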

Patrick


* [PATCH v2 00/17] odb: introduce "in-memory" source
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (16 preceding siblings ...)
  2026-04-03 15:41 ` [PATCH 00/16] odb: introduce "inmemory" source Junio C Hamano
@ 2026-04-09  7:24 ` Patrick Steinhardt
  2026-04-09  7:24   ` [PATCH v2 01/17] " Patrick Steinhardt
                     ` (17 more replies)
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
  18 siblings, 18 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Hi,

this patch series introduces the second object database source type,
which is the "in-memory" source.

This source may seem somewhat odd at first: it always starts out empty,
and any object written into it will only exist in memory until the
process exits. But the source already serves a purpose in our codebase,
where some commands, for example git-blame(1), write an in-memory
worktree commit.

Furthermore, I think that going forward it can serve more purposes as we
now have an easy way to write and read objects that will not get
persisted. I could see that this may be useful when for example
re-merging diffs. But eventually, once we have the object storage format
extension wired up, callers might even want to manually set up an
in-memory database as the primary ODB for write operations so that no
data will be persisted in an arbitrary write.

Last but not least, this patch series also serves the purpose of
eventually getting rid of the `struct object_info::whence` member.
Instead, we'll simply yield the ODB source a specific object has been
read from, together with some backend-specific data, which gives
strictly more information compared to the status quo.

The series is based on b15384c06f (A bit more post -rc1, 2026-04-08)
with jt/odb-transaction-write at ddf6aee9c6 (odb/transaction: make
`write_object_stream()` pluggable, 2026-04-02) merged into it.

Changes in v2:
  - Fix handling of object IDs when writing objects.
  - I've changed the base of this series to include Justin's
    refactorings for the ODB write streams. I've updated the above
    paragraph detailing the merge base accordingly. @Junio: I'm fine to
    defer this patch series a bit until Justin's patch series has been
    merged to `next` in case this causes inconvenience.
  - Use "in-memory" instead of "inmemory" in commit messages.
  - Link to v1: https://patch.msgid.link/20260403-b4-pks-odb-source-inmemory-v1-0-8b8d1abaa25e@pks.im

Thanks!

Patrick

---
Patrick Steinhardt (17):
      odb: introduce "in-memory" source
      odb/source-inmemory: implement `free()` callback
      odb: fix unnecessary call to `find_cached_object()`
      odb/source-inmemory: implement `read_object_info()` callback
      odb/source-inmemory: implement `read_object_stream()` callback
      odb/source-inmemory: implement `write_object()` callback
      odb/source-inmemory: implement `write_object()` callback
      odb/source-inmemory: implement `write_object_stream()` callback
      cbtree: allow using arbitrary wrapper structures for nodes
      oidtree: add ability to store data
      odb/source-inmemory: convert to use oidtree
      odb/source-inmemory: implement `for_each_object()` callback
      odb/source-inmemory: implement `find_abbrev_len()` callback
      odb/source-inmemory: implement `count_objects()` callback
      odb/source-inmemory: implement `freshen_object()` callback
      odb/source-inmemory: stub out remaining functions
      odb: generic in-memory source

 Makefile                 |   1 +
 cbtree.c                 |  25 +++-
 cbtree.h                 |  11 +-
 loose.c                  |   2 +-
 meson.build              |   1 +
 object-file.c            |   3 +-
 odb.c                    |  82 ++--------
 odb.h                    |   4 +-
 odb/source-inmemory.c    | 378 +++++++++++++++++++++++++++++++++++++++++++++++
 odb/source-inmemory.h    |  33 +++++
 odb/source.h             |   3 +
 oidtree.c                |  66 ++++++---
 oidtree.h                |  12 +-
 t/unit-tests/u-oidtree.c |  26 +++-
 14 files changed, 532 insertions(+), 115 deletions(-)

Range-diff versus v1:

 1:  b7cd1ae8d1 !  1:  df8567d908 odb: introduce "inmemory" source
    @@ Metadata
     Author: Patrick Steinhardt <ps@pks.im>
     
      ## Commit message ##
    -    odb: introduce "inmemory" source
    +    odb: introduce "in-memory" source
     
         Next to our typical object database sources, each object database also
         has an implicit source of "cached" objects. These cached objects only
    @@ Commit message
             for example the empty tree.
     
           - They can be used to store temporary objects that we don't want to
    -        persist to disk.
    +        persist to disk, which is used by git-blame(1) to create a fake
    +        worktree commit.
     
         Overall, their use is somewhat restricted though. For example, we don't
         provide the ability to use it as a temporary object database source that
    @@ Commit message
         as one.
     
         This is about to change over the following commits, where we will turn
    -    cached objects into a new "inmemory" source. This will allow us to use
    +    cached objects into a new "in-memory" source. This will allow us to use
         it exactly the same as any other source by providing the same common
         interface as the "files" source.
     
    -    For now, the inmemory source only hosts the cached objects and doesn't
    +    For now, the in-memory source only hosts the cached objects and doesn't
         provide any logic yet. This will change with subsequent commits, where
         we move respective functionality into the source.
     
    @@ Makefile: LIB_OBJS += object.o
      LIB_OBJS += odb/source-files.o
     +LIB_OBJS += odb/source-inmemory.o
      LIB_OBJS += odb/streaming.o
    + LIB_OBJS += odb/transaction.o
      LIB_OBJS += oid-array.o
    - LIB_OBJS += oidmap.o
     
      ## meson.build ##
     @@ meson.build: libgit_sources = [
    @@ meson.build: libgit_sources = [
        'odb/source-files.c',
     +  'odb/source-inmemory.c',
        'odb/streaming.c',
    +   'odb/transaction.c',
        'oid-array.c',
    -   'oidmap.c',
     
      ## odb.c ##
     @@
 2:  298758b4d5 !  2:  e1ffe26ca9 odb/source-inmemory: implement `free()` callback
    @@ Metadata
      ## Commit message ##
         odb/source-inmemory: implement `free()` callback
     
    -    Implement the `free()` callback function for the "inmemory" source.
    +    Implement the `free()` callback function for the "in-memory" source.
     
         Note that this requires us to define `struct cached_object_entry` in
         "odb/source-inmemory.h", as it is accessed in both "odb.c" and
 3:  b57997d027 =  3:  f58424bb80 odb: fix unnecessary call to `find_cached_object()`
 4:  9ae26b9aa1 !  4:  786a240391 odb/source-inmemory: implement `read_object_info()` callback
    @@ Metadata
      ## Commit message ##
         odb/source-inmemory: implement `read_object_info()` callback
     
    -    Implement the `read_object_info()` callback function for the inmemory
    +    Implement the `read_object_info()` callback function for the in-memory
         source.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
 5:  5d9781009e !  5:  22d3e7134b odb/source-inmemory: implement `read_object_stream()` callback
    @@ Metadata
      ## Commit message ##
         odb/source-inmemory: implement `read_object_stream()` callback
     
    -    Implement the `read_object_stream()` callback function for the inmemory
    +    Implement the `read_object_stream()` callback function for the in-memory
         source.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
 6:  bc9620c608 !  6:  139e7f2beb odb/source-inmemory: implement `write_object()` callback
    @@ Metadata
      ## Commit message ##
         odb/source-inmemory: implement `write_object()` callback
     
    -    Implement the `write_object()` callback function for the inmemory
    +    Implement the `write_object()` callback function for the in-memory
         source.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
 -:  ---------- >  7:  7f5ab16d1c odb/source-inmemory: implement `write_object()` callback
 7:  6d9f8634e1 !  8:  6006f5e782 odb/source-inmemory: implement `write_object_stream()` callback
    @@ Metadata
      ## Commit message ##
         odb/source-inmemory: implement `write_object_stream()` callback
     
    -    Implement the `write_object_stream()` callback function for the inmemory
    +    Implement the `write_object_stream()` callback function for the in-memory
         source.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
    @@ odb/source-inmemory.c: static int odb_source_inmemory_write_object(struct odb_so
     +						   size_t len,
     +						   struct object_id *oid)
     +{
    ++	char buf[16384];
     +	size_t total_read = 0;
     +	char *data;
     +	int ret;
     +
     +	CALLOC_ARRAY(data, len);
     +	while (!stream->is_finished) {
    -+		unsigned long bytes_read;
    -+		const void *in;
    ++		ssize_t bytes_read;
     +
    -+		in = stream->read(stream, &bytes_read);
    ++		bytes_read = odb_write_stream_read(stream, buf, sizeof(buf));
     +		if (total_read + bytes_read > len) {
     +			ret = error("object stream yielded more bytes than expected");
     +			goto out;
     +		}
     +
    -+		memcpy(data, in, bytes_read);
    ++		memcpy(data, buf, bytes_read);
     +		total_read += bytes_read;
     +	}
     +
 8:  45f9c761ce =  9:  392d9bf6ed cbtree: allow using arbitrary wrapper structures for nodes
 9:  5eb7742886 = 10:  9fd88ffd16 oidtree: add ability to store data
10:  4f95cd0a51 ! 11:  6d4a77b47c odb/source-inmemory: convert to use oidtree
    @@ Metadata
      ## Commit message ##
         odb/source-inmemory: convert to use oidtree
     
    -    The inmemory source stores its objects in a simple array that we grow as
    +    The in-memory source stores its objects in a simple array that we grow as
         needed. This has a couple of downsides:
     
           - The object lookup is O(n). This doesn't matter in practice because
    @@ odb/source-inmemory.c: static int odb_source_inmemory_write_object(struct odb_so
     -	struct cached_object_entry *object;
     +	struct inmemory_object *object;
      
    + 	hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid);
    + 
     -	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
     -		   inmemory->objects_alloc);
     -	object = &inmemory->objects[inmemory->objects_nr++];
11:  fc231e22dc ! 12:  5f345d76ef odb/source-inmemory: implement `for_each_object()` callback
    @@ Metadata
      ## Commit message ##
         odb/source-inmemory: implement `for_each_object()` callback
     
    -    Implement the `for_each_object()` callback function for the inmemory
    +    Implement the `for_each_object()` callback function for the in-memory
         source.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
12:  c2437b2ba5 ! 13:  b428a1760b odb/source-inmemory: implement `find_abbrev_len()` callback
    @@ Metadata
      ## Commit message ##
         odb/source-inmemory: implement `find_abbrev_len()` callback
     
    -    Implement the `find_abbrev_len()` callback function for the inmemory
    +    Implement the `find_abbrev_len()` callback function for the in-memory
         source.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
13:  fee0586da7 ! 14:  564cc60392 odb/source-inmemory: implement `count_objects()` callback
    @@ Metadata
      ## Commit message ##
         odb/source-inmemory: implement `count_objects()` callback
     
    -    Implement the `count_objects()` callback function for the inmemory
    +    Implement the `count_objects()` callback function for the in-memory
         source.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
14:  634392eaf9 ! 15:  9ddfb6f67b odb/source-inmemory: implement `freshen_object()` callback
    @@ Metadata
      ## Commit message ##
         odb/source-inmemory: implement `freshen_object()` callback
     
    -    Implement the `freshen_object()` callback function for the inmemory
    +    Implement the `freshen_object()` callback function for the in-memory
         source.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
15:  3d1f08a849 = 16:  d76329a424 odb/source-inmemory: stub out remaining functions
16:  29deff493d ! 17:  41cd562975 odb: generic inmemory source
    @@ Metadata
     Author: Patrick Steinhardt <ps@pks.im>
     
      ## Commit message ##
    -    odb: generic inmemory source
    +    odb: generic in-memory source
     
         Make the in-memory source generic.
     

---
base-commit: a3ebc5a08e67ccac4c915622049a968a31e48662
change-id: 20260401-b4-pks-odb-source-inmemory-7b17c83d9e43



* [PATCH v2 01/17] odb: introduce "in-memory" source
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09  9:26     ` Karthik Nayak
  2026-04-09  7:24   ` [PATCH v2 02/17] odb/source-inmemory: implement `free()` callback Patrick Steinhardt
                     ` (16 subsequent siblings)
  17 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Next to our typical object database sources, each object database also
has an implicit source of "cached" objects. These cached objects only
exist in memory and serve some use cases:

  - They contain evergreen objects that we expect to always exist, like
    for example the empty tree.

  - They can be used to store temporary objects that we don't want to
    persist to disk, which is used by git-blame(1) to create a fake
    worktree commit.

Overall, their use is somewhat restricted though. For example, we don't
provide the ability to use it as a temporary object database source that
allows the user to write objects, but discard them after Git exits. So
while these cached objects behave almost like a source, they aren't used
as one.

This is about to change over the following commits, where we will turn
cached objects into a new "in-memory" source. This will allow us to use
it exactly the same as any other source by providing the same common
interface as the "files" source.

For now, the in-memory source only hosts the cached objects and doesn't
provide any logic yet. This will change with subsequent commits, where
we move respective functionality into the source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Makefile              |  1 +
 meson.build           |  1 +
 odb.c                 | 21 +++++++++++++--------
 odb.h                 |  4 ++--
 odb/source-inmemory.c | 12 ++++++++++++
 odb/source-inmemory.h | 35 +++++++++++++++++++++++++++++++++++
 odb/source.h          |  3 +++
 7 files changed, 67 insertions(+), 10 deletions(-)

diff --git a/Makefile b/Makefile
index 22a8993482..3cda12c455 100644
--- a/Makefile
+++ b/Makefile
@@ -1218,6 +1218,7 @@ LIB_OBJS += object.o
 LIB_OBJS += odb.o
 LIB_OBJS += odb/source.o
 LIB_OBJS += odb/source-files.o
+LIB_OBJS += odb/source-inmemory.o
 LIB_OBJS += odb/streaming.o
 LIB_OBJS += odb/transaction.o
 LIB_OBJS += oid-array.o
diff --git a/meson.build b/meson.build
index 6dc23b3af2..ffa73ce7ce 100644
--- a/meson.build
+++ b/meson.build
@@ -404,6 +404,7 @@ libgit_sources = [
   'odb.c',
   'odb/source.c',
   'odb/source-files.c',
+  'odb/source-inmemory.c',
   'odb/streaming.c',
   'odb/transaction.c',
   'oid-array.c',
diff --git a/odb.c b/odb.c
index 40a5e9c4e0..60e1eead25 100644
--- a/odb.c
+++ b/odb.c
@@ -14,6 +14,7 @@
 #include "object-file.h"
 #include "object-name.h"
 #include "odb.h"
+#include "odb/source-inmemory.h"
 #include "packfile.h"
 #include "path.h"
 #include "promisor-remote.h"
@@ -53,9 +54,9 @@ static const struct cached_object *find_cached_object(struct object_database *ob
 		.type = OBJ_TREE,
 		.buf = "",
 	};
-	const struct cached_object_entry *co = object_store->cached_objects;
+	const struct cached_object_entry *co = object_store->inmemory_objects->objects;
 
-	for (size_t i = 0; i < object_store->cached_object_nr; i++, co++)
+	for (size_t i = 0; i < object_store->inmemory_objects->objects_nr; i++, co++)
 		if (oideq(&co->oid, oid))
 			return &co->value;
 
@@ -792,9 +793,10 @@ int odb_pretend_object(struct object_database *odb,
 	    find_cached_object(odb, oid))
 		return 0;
 
-	ALLOC_GROW(odb->cached_objects,
-		   odb->cached_object_nr + 1, odb->cached_object_alloc);
-	co = &odb->cached_objects[odb->cached_object_nr++];
+	ALLOC_GROW(odb->inmemory_objects->objects,
+		   odb->inmemory_objects->objects_nr + 1,
+		   odb->inmemory_objects->objects_alloc);
+	co = &odb->inmemory_objects->objects[odb->inmemory_objects->objects_nr++];
 	co->value.size = len;
 	co->value.type = type;
 	co_buf = xmalloc(len);
@@ -1083,6 +1085,7 @@ struct object_database *odb_new(struct repository *repo,
 	o->sources = odb_source_new(o, primary_source, true);
 	o->sources_tail = &o->sources->next;
 	o->alternate_db = xstrdup_or_null(secondary_sources);
+	o->inmemory_objects = odb_source_inmemory_new(o);
 
 	free(to_free);
 
@@ -1123,9 +1126,11 @@ void odb_free(struct object_database *o)
 	odb_close(o);
 	odb_free_sources(o);
 
-	for (size_t i = 0; i < o->cached_object_nr; i++)
-		free((char *) o->cached_objects[i].value.buf);
-	free(o->cached_objects);
+	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
+		free((char *) o->inmemory_objects->objects[i].value.buf);
+	free(o->inmemory_objects->objects);
+	free(o->inmemory_objects->base.path);
+	free(o->inmemory_objects);
 
 	string_list_clear(&o->submodule_source_paths, 0);
 
diff --git a/odb.h b/odb.h
index 9eb8355aca..c3a7edf9c8 100644
--- a/odb.h
+++ b/odb.h
@@ -8,6 +8,7 @@
 #include "thread-utils.h"
 
 struct cached_object_entry;
+struct odb_source_inmemory;
 struct packed_git;
 struct repository;
 struct strbuf;
@@ -80,8 +81,7 @@ struct object_database {
 	 * to write them into the object store (e.g. a browse-only
 	 * application).
 	 */
-	struct cached_object_entry *cached_objects;
-	size_t cached_object_nr, cached_object_alloc;
+	struct odb_source_inmemory *inmemory_objects;
 
 	/*
 	 * A fast, rough count of the number of objects in the repository.
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
new file mode 100644
index 0000000000..c7ac5c24f0
--- /dev/null
+++ b/odb/source-inmemory.c
@@ -0,0 +1,12 @@
+#include "git-compat-util.h"
+#include "odb/source-inmemory.h"
+
+struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
+{
+	struct odb_source_inmemory *source;
+
+	CALLOC_ARRAY(source, 1);
+	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
+
+	return source;
+}
diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
new file mode 100644
index 0000000000..95477bf36d
--- /dev/null
+++ b/odb/source-inmemory.h
@@ -0,0 +1,35 @@
+#ifndef ODB_SOURCE_INMEMORY_H
+#define ODB_SOURCE_INMEMORY_H
+
+#include "odb/source.h"
+
+struct cached_object_entry;
+
+/*
+ * An inmemory source that you can write objects to that shall be made
+ * available for reading, but that shouldn't ever be persisted to disk. Note
+ * that any objects written to this source will be stored in memory, so the
+ * number of objects you can store is limited by available system memory.
+ */
+struct odb_source_inmemory {
+	struct odb_source base;
+
+	struct cached_object_entry *objects;
+	size_t objects_nr, objects_alloc;
+};
+
+/* Create a new in-memory object database source. */
+struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb);
+
+/*
+ * Cast the given object database source to the inmemory backend. This will
+ * cause a BUG in case the source doesn't use this backend.
+ */
+static inline struct odb_source_inmemory *odb_source_inmemory_downcast(struct odb_source *source)
+{
+	if (source->type != ODB_SOURCE_INMEMORY)
+		BUG("trying to downcast source of type '%d' to inmemory", source->type);
+	return container_of(source, struct odb_source_inmemory, base);
+}
+
+#endif
diff --git a/odb/source.h b/odb/source.h
index f706e0608a..cd14f9e046 100644
--- a/odb/source.h
+++ b/odb/source.h
@@ -13,6 +13,9 @@ enum odb_source_type {
 
 	/* The "files" backend that uses loose objects and packfiles. */
 	ODB_SOURCE_FILES,
+
+	/* The "inmemory" backend that stores objects in memory. */
+	ODB_SOURCE_INMEMORY,
 };
 
 struct object_id;

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 02/17] odb/source-inmemory: implement `free()` callback
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
  2026-04-09  7:24   ` [PATCH v2 01/17] " Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09  7:24   ` [PATCH v2 03/17] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt
                     ` (15 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `free()` callback function for the "in-memory" source.

Note that this requires us to define `struct cached_object_entry` in
"odb/source-inmemory.h", as it is accessed in both "odb.c" and
"odb/source-inmemory.c" now. This will be fixed in subsequent commits
though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c                 | 25 ++++---------------------
 odb/source-inmemory.c | 12 ++++++++++++
 odb/source-inmemory.h |  9 ++++++++-
 3 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/odb.c b/odb.c
index 60e1eead25..1d65825ed3 100644
--- a/odb.c
+++ b/odb.c
@@ -32,21 +32,6 @@
 KHASH_INIT(odb_path_map, const char * /* key: odb_path */,
 	struct odb_source *, 1, fspathhash, fspatheq)
 
-/*
- * This is meant to hold a *small* number of objects that you would
- * want odb_read_object() to be able to return, but yet you do not want
- * to write them into the object store (e.g. a browse-only
- * application).
- */
-struct cached_object_entry {
-	struct object_id oid;
-	struct cached_object {
-		enum object_type type;
-		const void *buf;
-		unsigned long size;
-	} value;
-};
-
 static const struct cached_object *find_cached_object(struct object_database *object_store,
 						      const struct object_id *oid)
 {
@@ -1109,6 +1094,10 @@ static void odb_free_sources(struct object_database *o)
 		odb_source_free(o->sources);
 		o->sources = next;
 	}
+
+	odb_source_free(&o->inmemory_objects->base);
+	o->inmemory_objects = NULL;
+
 	kh_destroy_odb_path_map(o->source_by_path);
 	o->source_by_path = NULL;
 }
@@ -1126,12 +1115,6 @@ void odb_free(struct object_database *o)
 	odb_close(o);
 	odb_free_sources(o);
 
-	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
-		free((char *) o->inmemory_objects->objects[i].value.buf);
-	free(o->inmemory_objects->objects);
-	free(o->inmemory_objects->base.path);
-	free(o->inmemory_objects);
-
 	string_list_clear(&o->submodule_source_paths, 0);
 
 	free(o);
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index c7ac5c24f0..ccbb622eae 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -1,6 +1,16 @@
 #include "git-compat-util.h"
 #include "odb/source-inmemory.h"
 
+static void odb_source_inmemory_free(struct odb_source *source)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	for (size_t i = 0; i < inmemory->objects_nr; i++)
+		free((char *) inmemory->objects[i].value.buf);
+	free(inmemory->objects);
+	free(inmemory->base.path);
+	free(inmemory);
+}
+
 struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 {
 	struct odb_source_inmemory *source;
@@ -8,5 +18,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	CALLOC_ARRAY(source, 1);
 	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
 
+	source->base.free = odb_source_inmemory_free;
+
 	return source;
 }
diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
index 95477bf36d..14dc06f7c3 100644
--- a/odb/source-inmemory.h
+++ b/odb/source-inmemory.h
@@ -3,7 +3,14 @@
 
 #include "odb/source.h"
 
-struct cached_object_entry;
+struct cached_object_entry {
+	struct object_id oid;
+	struct cached_object {
+		enum object_type type;
+		const void *buf;
+		unsigned long size;
+	} value;
+};
 
 /*
  * An inmemory source that you can write objects to that shall be made

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 03/17] odb: fix unnecessary call to `find_cached_object()`
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
  2026-04-09  7:24   ` [PATCH v2 01/17] " Patrick Steinhardt
  2026-04-09  7:24   ` [PATCH v2 02/17] odb/source-inmemory: implement `free()` callback Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09  7:24   ` [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt
                     ` (14 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

The function `odb_pretend_object()` writes an object into the in-memory
object database source. The effect of this is that the object will now
become readable, but it won't ever be persisted to disk.

Before storing the object, we first verify whether the object already
exists. This is done by calling `odb_has_object()` to check all sources,
followed by `find_cached_object()` to check whether we have already
stored the object in our in-memory source.

This is unnecessary though, as `odb_has_object()` already checks the
in-memory source transitively via:

  - `odb_has_object()`
  - `odb_read_object_info_extended()`
  - `do_oid_object_info_extended()`
  - `find_cached_object()`

Drop the explicit call to `find_cached_object()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/odb.c b/odb.c
index 1d65825ed3..ea3fcf5e11 100644
--- a/odb.c
+++ b/odb.c
@@ -774,8 +774,7 @@ int odb_pretend_object(struct object_database *odb,
 	char *co_buf;
 
 	hash_object_file(odb->repo->hash_algo, buf, len, type, oid);
-	if (odb_has_object(odb, oid, 0) ||
-	    find_cached_object(odb, oid))
+	if (odb_has_object(odb, oid, 0))
 		return 0;
 
 	ALLOC_GROW(odb->inmemory_objects->objects,

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 03/17] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09  9:40     ` Karthik Nayak
  2026-04-09  7:24   ` [PATCH v2 05/17] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
                     ` (13 subsequent siblings)
  17 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `read_object_info()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c                 | 39 +------------------------------------
 odb/source-inmemory.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 38 deletions(-)

diff --git a/odb.c b/odb.c
index ea3fcf5e11..6a3912adac 100644
--- a/odb.c
+++ b/odb.c
@@ -32,25 +32,6 @@
 KHASH_INIT(odb_path_map, const char * /* key: odb_path */,
 	struct odb_source *, 1, fspathhash, fspatheq)
 
-static const struct cached_object *find_cached_object(struct object_database *object_store,
-						      const struct object_id *oid)
-{
-	static const struct cached_object empty_tree = {
-		.type = OBJ_TREE,
-		.buf = "",
-	};
-	const struct cached_object_entry *co = object_store->inmemory_objects->objects;
-
-	for (size_t i = 0; i < object_store->inmemory_objects->objects_nr; i++, co++)
-		if (oideq(&co->oid, oid))
-			return &co->value;
-
-	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
-		return &empty_tree;
-
-	return NULL;
-}
-
 int odb_mkstemp(struct object_database *odb,
 		struct strbuf *temp_filename, const char *pattern)
 {
@@ -570,7 +551,6 @@ static int do_oid_object_info_extended(struct object_database *odb,
 				       const struct object_id *oid,
 				       struct object_info *oi, unsigned flags)
 {
-	const struct cached_object *co;
 	const struct object_id *real = oid;
 	int already_retried = 0;
 
@@ -580,25 +560,8 @@ static int do_oid_object_info_extended(struct object_database *odb,
 	if (is_null_oid(real))
 		return -1;
 
-	co = find_cached_object(odb, real);
-	if (co) {
-		if (oi) {
-			if (oi->typep)
-				*(oi->typep) = co->type;
-			if (oi->sizep)
-				*(oi->sizep) = co->size;
-			if (oi->disk_sizep)
-				*(oi->disk_sizep) = 0;
-			if (oi->delta_base_oid)
-				oidclr(oi->delta_base_oid, odb->repo->hash_algo);
-			if (oi->contentp)
-				*oi->contentp = xmemdupz(co->buf, co->size);
-			if (oi->mtimep)
-				*oi->mtimep = 0;
-			oi->whence = OI_CACHED;
-		}
+	if (!odb_source_read_object_info(&odb->inmemory_objects->base, oid, oi, flags))
 		return 0;
-	}
 
 	odb_prepare_alternates(odb);
 
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index ccbb622eae..12c80f9b34 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -1,5 +1,57 @@
 #include "git-compat-util.h"
+#include "odb.h"
 #include "odb/source-inmemory.h"
+#include "repository.h"
+
+static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
+						      const struct object_id *oid)
+{
+	static const struct cached_object empty_tree = {
+		.type = OBJ_TREE,
+		.buf = "",
+	};
+	const struct cached_object_entry *co = source->objects;
+
+	for (size_t i = 0; i < source->objects_nr; i++, co++)
+		if (oideq(&co->oid, oid))
+			return &co->value;
+
+	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
+		return &empty_tree;
+
+	return NULL;
+}
+
+static int odb_source_inmemory_read_object_info(struct odb_source *source,
+						const struct object_id *oid,
+						struct object_info *oi,
+						enum object_info_flags flags UNUSED)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	const struct cached_object *object;
+
+	object = find_cached_object(inmemory, oid);
+	if (!object)
+		return -1;
+
+	if (oi) {
+		if (oi->typep)
+			*(oi->typep) = object->type;
+		if (oi->sizep)
+			*(oi->sizep) = object->size;
+		if (oi->disk_sizep)
+			*(oi->disk_sizep) = 0;
+		if (oi->delta_base_oid)
+			oidclr(oi->delta_base_oid, source->odb->repo->hash_algo);
+		if (oi->contentp)
+			*oi->contentp = xmemdupz(object->buf, object->size);
+		if (oi->mtimep)
+			*oi->mtimep = 0;
+		oi->whence = OI_CACHED;
+	}
+
+	return 0;
+}
 
 static void odb_source_inmemory_free(struct odb_source *source)
 {
@@ -19,6 +71,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
 
 	source->base.free = odb_source_inmemory_free;
+	source->base.read_object_info = odb_source_inmemory_read_object_info;
 
 	return source;
 }

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 05/17] odb/source-inmemory: implement `read_object_stream()` callback
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09  9:49     ` Karthik Nayak
  2026-04-09  7:24   ` [PATCH v2 06/17] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt
                     ` (12 subsequent siblings)
  17 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `read_object_stream()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
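For illustration only, a standalone sketch of a read callback over an
in-memory buffer that copies from the current offset and then advances
it, so that successive reads walk the buffer and eventually return 0.
`struct mem_stream` and `mem_stream_read()` are hypothetical names that
do not exist in Git; they merely mirror the `buf`/`size`/`offset` layout
of the stream structure in the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a read stream over an in-memory buffer. */
struct mem_stream {
	const void *buf; /* start of the object contents */
	size_t size;     /* total number of bytes in buf */
	size_t offset;   /* number of bytes already handed out */
};

/*
 * Copy up to out_len bytes starting at the current offset and advance
 * the offset. Returns the number of bytes copied, which is 0 once the
 * buffer has been exhausted.
 */
static size_t mem_stream_read(struct mem_stream *s, char *out, size_t out_len)
{
	size_t remaining, bytes;

	assert(s->offset <= s->size);
	remaining = s->size - s->offset;
	bytes = out_len < remaining ? out_len : remaining;

	if (bytes)
		memcpy(out, (const char *)s->buf + s->offset, bytes);
	s->offset += bytes;

	return bytes;
}
```

A caller would invoke `mem_stream_read()` in a loop until it returns 0.
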
 odb/source-inmemory.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 12c80f9b34..4a68169430 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -1,6 +1,7 @@
 #include "git-compat-util.h"
 #include "odb.h"
 #include "odb/source-inmemory.h"
+#include "odb/streaming.h"
 #include "repository.h"
 
 static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
@@ -53,6 +54,54 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
 	return 0;
 }
 
+struct odb_read_stream_inmemory {
+	struct odb_read_stream base;
+	const void *buf;
+	size_t offset;
+};
+
+static ssize_t odb_read_stream_inmemory_read(struct odb_read_stream *stream,
+					     char *buf, size_t buf_len)
+{
+	struct odb_read_stream_inmemory *inmemory =
+		container_of(stream, struct odb_read_stream_inmemory, base);
+	size_t bytes = buf_len;
+
+	if (buf_len > inmemory->base.size - inmemory->offset)
+		bytes = inmemory->base.size - inmemory->offset;
+	memcpy(buf, inmemory->buf, bytes);
+
+	return bytes;
+}
+
+static int odb_read_stream_inmemory_close(struct odb_read_stream *stream UNUSED)
+{
+	return 0;
+}
+
+static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
+						  struct odb_source *source,
+						  const struct object_id *oid)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	struct odb_read_stream_inmemory *stream;
+	const struct cached_object *object;
+
+	object = find_cached_object(inmemory, oid);
+	if (!object)
+		return -1;
+
+	CALLOC_ARRAY(stream, 1);
+	stream->base.read = odb_read_stream_inmemory_read;
+	stream->base.close = odb_read_stream_inmemory_close;
+	stream->base.size = object->size;
+	stream->base.type = object->type;
+	stream->buf = object->buf;
+
+	*out = &stream->base;
+	return 0;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
@@ -72,6 +121,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 
 	source->base.free = odb_source_inmemory_free;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
+	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 
 	return source;
 }

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 06/17] odb/source-inmemory: implement `write_object()` callback
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 05/17] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09  7:24   ` [PATCH v2 07/17] " Patrick Steinhardt
                     ` (11 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `write_object()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c                 | 16 ++--------------
 odb/source-inmemory.c | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/odb.c b/odb.c
index 6a3912adac..24e929f03c 100644
--- a/odb.c
+++ b/odb.c
@@ -733,24 +733,12 @@ int odb_pretend_object(struct object_database *odb,
 		       void *buf, unsigned long len, enum object_type type,
 		       struct object_id *oid)
 {
-	struct cached_object_entry *co;
-	char *co_buf;
-
 	hash_object_file(odb->repo->hash_algo, buf, len, type, oid);
 	if (odb_has_object(odb, oid, 0))
 		return 0;
 
-	ALLOC_GROW(odb->inmemory_objects->objects,
-		   odb->inmemory_objects->objects_nr + 1,
-		   odb->inmemory_objects->objects_alloc);
-	co = &odb->inmemory_objects->objects[odb->inmemory_objects->objects_nr++];
-	co->value.size = len;
-	co->value.type = type;
-	co_buf = xmalloc(len);
-	memcpy(co_buf, buf, len);
-	co->value.buf = co_buf;
-	oidcpy(&co->oid, oid);
-	return 0;
+	return odb_source_write_object(&odb->inmemory_objects->base,
+				       buf, len, type, oid, NULL, 0);
 }
 
 void *odb_read_object(struct object_database *odb,
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 4a68169430..d2fc4c4054 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -102,6 +102,27 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
 	return 0;
 }
 
+static int odb_source_inmemory_write_object(struct odb_source *source,
+					    const void *buf, unsigned long len,
+					    enum object_type type,
+					    struct object_id *oid,
+					    struct object_id *compat_oid UNUSED,
+					    enum odb_write_object_flags flags UNUSED)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	struct cached_object_entry *object;
+
+	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
+		   inmemory->objects_alloc);
+	object = &inmemory->objects[inmemory->objects_nr++];
+	object->value.size = len;
+	object->value.type = type;
+	object->value.buf = xmemdupz(buf, len);
+	oidcpy(&object->oid, oid);
+
+	return 0;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
@@ -122,6 +143,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.free = odb_source_inmemory_free;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
+	source->base.write_object = odb_source_inmemory_write_object;
 
 	return source;
 }

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 07/17] odb/source-inmemory: implement `write_object()` callback
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 06/17] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09 10:27     ` Karthik Nayak
  2026-04-09  7:24   ` [PATCH v2 08/17] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt
                     ` (10 subsequent siblings)
  17 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `write_object()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index d2fc4c4054..96e8efd327 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -1,4 +1,5 @@
 #include "git-compat-util.h"
+#include "object-file.h"
 #include "odb.h"
 #include "odb/source-inmemory.h"
 #include "odb/streaming.h"
@@ -112,6 +113,8 @@ static int odb_source_inmemory_write_object(struct odb_source *source,
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
 	struct cached_object_entry *object;
 
+	hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid);
+
 	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
 		   inmemory->objects_alloc);
 	object = &inmemory->objects[inmemory->objects_nr++];

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 08/17] odb/source-inmemory: implement `write_object_stream()` callback
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 07/17] " Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09  7:24   ` [PATCH v2 09/17] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
                     ` (9 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `write_object_stream()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
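For illustration only, a standalone sketch of draining a chunked source
into a buffer of a known size: every chunk is appended at the running
offset, and the result is rejected if the source yields more or fewer
bytes than announced. All names here (`chunk_read_fn`,
`collect_exact()`, `struct str_source`) are hypothetical and not part of
Git's API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical chunk source: returns the number of bytes written to
 * out (0 at end of input), or -1 on error.
 */
typedef long (*chunk_read_fn)(void *ctx, char *out, size_t out_len);

/*
 * Drain the source into dst, which has room for exactly len bytes.
 * Returns 0 on success, or -1 if the source fails or yields a
 * different number of bytes than len.
 */
static int collect_exact(chunk_read_fn read_chunk, void *ctx,
			 char *dst, size_t len)
{
	char chunk[8];
	size_t total = 0;

	for (;;) {
		long n = read_chunk(ctx, chunk, sizeof(chunk));

		if (n < 0)
			return -1;
		if (!n)
			break;
		assert((size_t)n <= sizeof(chunk));
		if ((size_t)n > len - total)
			return -1; /* more bytes than expected */

		/* Append at the running offset, not at the start. */
		memcpy(dst + total, chunk, (size_t)n);
		total += (size_t)n;
	}

	return total == len ? 0 : -1;
}

/* Hypothetical source that hands out a NUL-terminated string. */
struct str_source {
	const char *s;
	size_t pos;
};

static long str_source_read(void *ctx, char *out, size_t out_len)
{
	struct str_source *src = ctx;
	size_t remaining = strlen(src->s) - src->pos;
	size_t n = out_len < remaining ? out_len : remaining;

	memcpy(out, src->s + src->pos, n);
	src->pos += n;
	return (long)n;
}
```
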
 odb/source-inmemory.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 96e8efd327..578ceea550 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -126,6 +126,45 @@ static int odb_source_inmemory_write_object(struct odb_source *source,
 	return 0;
 }
 
+static int odb_source_inmemory_write_object_stream(struct odb_source *source,
+						   struct odb_write_stream *stream,
+						   size_t len,
+						   struct object_id *oid)
+{
+	char buf[16384];
+	size_t total_read = 0;
+	char *data;
+	int ret;
+
+	CALLOC_ARRAY(data, len);
+	while (!stream->is_finished) {
+		ssize_t bytes_read;
+
+		bytes_read = odb_write_stream_read(stream, buf, sizeof(buf));
+		if (total_read + bytes_read > len) {
+			ret = error("object stream yielded more bytes than expected");
+			goto out;
+		}
+
+		memcpy(data, buf, bytes_read);
+		total_read += bytes_read;
+	}
+
+	if (total_read != len) {
+		ret = error("object stream yielded less bytes than expected");
+		goto out;
+	}
+
+	ret = odb_source_inmemory_write_object(source, data, len, OBJ_BLOB, oid,
+					       NULL, 0);
+	if (ret < 0)
+		goto out;
+
+out:
+	free(data);
+	return ret;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
@@ -147,6 +186,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.write_object = odb_source_inmemory_write_object;
+	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 
 	return source;
 }

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 09/17] cbtree: allow using arbitrary wrapper structures for nodes
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 08/17] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09 11:36     ` Karthik Nayak
  2026-04-09  7:24   ` [PATCH v2 10/17] oidtree: add ability to store data Patrick Steinhardt
                     ` (8 subsequent siblings)
  17 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

The cbtree subsystem allows the user to store arbitrary data in a
prefix-free set of strings. We use it to store object IDs in a way that
lets us easily iterate through them in lexicographic order and perform
lookups with shortened object IDs.

In its current form, it is not easily possible to store arbitrary data
alongside the tree nodes. There are a couple of approaches a caller
could try, but none of them really works:

  - One may embed the `struct cb_node` in a custom structure. This does
    not work though as `struct cb_node` contains a flex array, and
    embedding such a struct in another struct is forbidden.

  - One may use a `union` over `struct cb_node` and one's own data type,
    which _is_ allowed even if the struct contains a flex array. This
    does not work though, as the compiler may align members of the
    struct so that the node key would not immediately start where the
    flex array starts.

  - One may allocate `struct cb_node` such that it has room for both its
    key and the custom data. This has the downside though that if the
    custom data is itself a pointer to allocated memory, then the leak
    checker will not consider the pointer to be alive anymore.

Refactor the cbtree to drop the flex array and instead take in an
explicit offset for where to find the key, which allows the caller to
embed `struct cb_node` in a wrapper struct.

Note that this change has the downside that we now have a bit of padding
in our structure, which grows the size from 60 to 64 bytes on a 64-bit
system. On the other hand though, it allows us to get rid of the memory
copies that we previously had to do to ensure proper alignment. This
seems like a reasonable tradeoff.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
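For illustration only, a standalone sketch of the "generic node plus
explicit key offset" pattern described above, using a plain linked list
instead of a crit-bit tree to keep it short. The names `struct node`,
`struct list`, `struct entry` and the helper functions are hypothetical
and not part of cbtree. As in the wrapper used by oidtree, the generic
node is assumed to be the first member of the wrapper, so that the key
offset can be applied to the node pointer directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical generic node: it carries no key storage of its own. */
struct node {
	struct node *next;
};

/* Hypothetical container that records where callers keep the key. */
struct list {
	struct node *head;
	size_t key_offset; /* offset of the key relative to the node */
};

static const uint8_t *node_key(const struct list *l, const struct node *n)
{
	return (const uint8_t *)n + l->key_offset;
}

static void list_push(struct list *l, struct node *n)
{
	assert(n);
	n->next = l->head;
	l->head = n;
}

static struct node *list_find(const struct list *l, const void *key,
			      size_t klen)
{
	for (struct node *n = l->head; n; n = n->next)
		if (!memcmp(node_key(l, n), key, klen))
			return n;
	return NULL;
}

/* Caller-defined wrapper: generic node first, then key and payload. */
struct entry {
	struct node base;
	uint8_t key[4];
	int payload;
};

/* Recover the wrapper from a pointer to its embedded node. */
static struct entry *entry_of(struct node *n)
{
	return (struct entry *)((char *)n - offsetof(struct entry, base));
}
```

The container is initialized with `offsetof(struct entry, key)`, after
which it can compare keys without knowing anything else about the
wrapper's layout.
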
 cbtree.c  | 25 ++++++++++++++++++-------
 cbtree.h  | 11 ++++++-----
 oidtree.c | 33 ++++++++++++++-------------------
 3 files changed, 38 insertions(+), 31 deletions(-)

diff --git a/cbtree.c b/cbtree.c
index 4ab794bddc..8f5edbb80a 100644
--- a/cbtree.c
+++ b/cbtree.c
@@ -7,6 +7,11 @@
 #include "git-compat-util.h"
 #include "cbtree.h"
 
+static inline uint8_t *cb_node_key(struct cb_tree *t, struct cb_node *node)
+{
+	return (uint8_t *) node + t->key_offset;
+}
+
 static struct cb_node *cb_node_of(const void *p)
 {
 	return (struct cb_node *)((uintptr_t)p - 1);
@@ -33,6 +38,7 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen)
 	uint8_t c;
 	int newdirection;
 	struct cb_node **wherep, *p;
+	uint8_t *node_key, *p_key;
 
 	assert(!((uintptr_t)node & 1)); /* allocations must be aligned */
 
@@ -41,23 +47,26 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen)
 		return NULL;	/* success */
 	}
 
+	node_key = cb_node_key(t, node);
+
 	/* see if a node already exists */
-	p = cb_internal_best_match(t->root, node->k, klen);
+	p = cb_internal_best_match(t->root, node_key, klen);
+	p_key = cb_node_key(t, p);
 
 	/* find first differing byte */
 	for (newbyte = 0; newbyte < klen; newbyte++) {
-		if (p->k[newbyte] != node->k[newbyte])
+		if (p_key[newbyte] != node_key[newbyte])
 			goto different_byte_found;
 	}
 	return p;	/* element exists, let user deal with it */
 
 different_byte_found:
-	newotherbits = p->k[newbyte] ^ node->k[newbyte];
+	newotherbits = p_key[newbyte] ^ node_key[newbyte];
 	newotherbits |= newotherbits >> 1;
 	newotherbits |= newotherbits >> 2;
 	newotherbits |= newotherbits >> 4;
 	newotherbits = (newotherbits & ~(newotherbits >> 1)) ^ 255;
-	c = p->k[newbyte];
+	c = p_key[newbyte];
 	newdirection = (1 + (newotherbits | c)) >> 8;
 
 	node->byte = newbyte;
@@ -78,7 +87,7 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen)
 			break;
 		if (q->byte == newbyte && q->otherbits > newotherbits)
 			break;
-		c = q->byte < klen ? node->k[q->byte] : 0;
+		c = q->byte < klen ? node_key[q->byte] : 0;
 		direction = (1 + (q->otherbits | c)) >> 8;
 		wherep = q->child + direction;
 	}
@@ -93,7 +102,7 @@ struct cb_node *cb_lookup(struct cb_tree *t, const uint8_t *k, size_t klen)
 {
 	struct cb_node *p = cb_internal_best_match(t->root, k, klen);
 
-	return p && !memcmp(p->k, k, klen) ? p : NULL;
+	return p && !memcmp(cb_node_key(t, p), k, klen) ? p : NULL;
 }
 
 static int cb_descend(struct cb_node *p, cb_iter fn, void *arg)
@@ -115,6 +124,7 @@ int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen,
 	struct cb_node *p = t->root;
 	struct cb_node *top = p;
 	size_t i = 0;
+	uint8_t *p_key;
 
 	if (!p)
 		return 0; /* empty tree */
@@ -130,8 +140,9 @@ int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen,
 			top = p;
 	}
 
+	p_key = cb_node_key(t, p);
 	for (i = 0; i < klen; i++) {
-		if (p->k[i] != kpfx[i])
+		if (p_key[i] != kpfx[i])
 			return 0; /* "best" match failed */
 	}
 
diff --git a/cbtree.h b/cbtree.h
index c374b1b3db..3ce0d6b287 100644
--- a/cbtree.h
+++ b/cbtree.h
@@ -23,18 +23,19 @@ struct cb_node {
 	 */
 	uint32_t byte;
 	uint8_t otherbits;
-	uint8_t k[FLEX_ARRAY]; /* arbitrary data, unaligned */
 };
 
 struct cb_tree {
 	struct cb_node *root;
+	ptrdiff_t key_offset;
 };
 
-#define CBTREE_INIT { 0 }
-
-static inline void cb_init(struct cb_tree *t)
+static inline void cb_init(struct cb_tree *t,
+			   ptrdiff_t key_offset)
 {
-	struct cb_tree blank = CBTREE_INIT;
+	struct cb_tree blank = {
+		.key_offset = key_offset,
+	};
 	memcpy(t, &blank, sizeof(*t));
 }
 
diff --git a/oidtree.c b/oidtree.c
index ab9fe7ec7a..117649753f 100644
--- a/oidtree.c
+++ b/oidtree.c
@@ -6,9 +6,14 @@
 #include "oidtree.h"
 #include "hash.h"
 
+struct oidtree_node {
+	struct cb_node base;
+	struct object_id key;
+};
+
 void oidtree_init(struct oidtree *ot)
 {
-	cb_init(&ot->tree);
+	cb_init(&ot->tree, offsetof(struct oidtree_node, key));
 	mem_pool_init(&ot->mem_pool, 0);
 }
 
@@ -22,20 +27,13 @@ void oidtree_clear(struct oidtree *ot)
 
 void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
 {
-	struct cb_node *on;
-	struct object_id k;
+	struct oidtree_node *on;
 
 	if (!oid->algo)
 		BUG("oidtree_insert requires oid->algo");
 
-	on = mem_pool_alloc(&ot->mem_pool, sizeof(*on) + sizeof(*oid));
-
-	/*
-	 * Clear the padding and copy the result in separate steps to
-	 * respect the 4-byte alignment needed by struct object_id.
-	 */
-	oidcpy(&k, oid);
-	memcpy(on->k, &k, sizeof(k));
+	on = mem_pool_alloc(&ot->mem_pool, sizeof(*on));
+	oidcpy(&on->key, oid);
 
 	/*
 	 * n.b. Current callers won't get us duplicates, here.  If a
@@ -43,7 +41,7 @@ void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
 	 * that won't be freed until oidtree_clear.  Currently it's not
 	 * worth maintaining a free list
 	 */
-	cb_insert(&ot->tree, on, sizeof(*oid));
+	cb_insert(&ot->tree, &on->base, sizeof(*oid));
 }
 
 bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
@@ -73,21 +71,18 @@ struct oidtree_each_data {
 
 static int iter(struct cb_node *n, void *cb_data)
 {
+	struct oidtree_node *node = container_of(n, struct oidtree_node, base);
 	struct oidtree_each_data *data = cb_data;
-	struct object_id k;
-
-	/* Copy to provide 4-byte alignment needed by struct object_id. */
-	memcpy(&k, n->k, sizeof(k));
 
-	if (data->algo != GIT_HASH_UNKNOWN && data->algo != k.algo)
+	if (data->algo != GIT_HASH_UNKNOWN && data->algo != node->key.algo)
 		return 0;
 
 	if (data->last_nibble_at) {
-		if ((k.hash[*data->last_nibble_at] ^ data->last_byte) & 0xf0)
+		if ((node->key.hash[*data->last_nibble_at] ^ data->last_byte) & 0xf0)
 			return 0;
 	}
 
-	return data->cb(&k, data->cb_data);
+	return data->cb(&node->key, data->cb_data);
 }
 
 int oidtree_each(struct oidtree *ot, const struct object_id *prefix,

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 10/17] oidtree: add ability to store data
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 09/17] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09  7:24   ` [PATCH v2 11/17] odb/source-inmemory: convert to use oidtree Patrick Steinhardt
                     ` (7 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

The oidtree data structure is currently only used to store object IDs,
without any associated data. Consequently, it can only be used to track
which object IDs exist and to operate efficiently on OID prefixes via
the tree structure.

But there are valid use cases where we want to both:

  - Store object IDs in a sorted order.

  - Associate arbitrary data with them.

Refactor the oidtree interface so that it allows us to store arbitrary
payloads within the respective nodes. This will be used in the next
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 loose.c                  |  2 +-
 object-file.c            |  3 ++-
 oidtree.c                | 37 ++++++++++++++++++++++++++++++++-----
 oidtree.h                | 12 ++++++++++--
 t/unit-tests/u-oidtree.c | 26 +++++++++++++++++++++++---
 5 files changed, 68 insertions(+), 12 deletions(-)

diff --git a/loose.c b/loose.c
index 07333be696..f7a3dd1a72 100644
--- a/loose.c
+++ b/loose.c
@@ -57,7 +57,7 @@ static int insert_loose_map(struct odb_source *source,
 	inserted |= insert_oid_pair(map->to_compat, oid, compat_oid);
 	inserted |= insert_oid_pair(map->to_storage, compat_oid, oid);
 	if (inserted)
-		oidtree_insert(files->loose->cache, compat_oid);
+		oidtree_insert(files->loose->cache, compat_oid, NULL);
 
 	return inserted;
 }
diff --git a/object-file.c b/object-file.c
index 3e70e5d668..d04ab57253 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1857,6 +1857,7 @@ static int for_each_object_wrapper_cb(const struct object_id *oid,
 }
 
 static int for_each_prefixed_object_wrapper_cb(const struct object_id *oid,
+					       void *node_data UNUSED,
 					       void *cb_data)
 {
 	struct for_each_object_wrapper_data *data = cb_data;
@@ -2002,7 +2003,7 @@ static int append_loose_object(const struct object_id *oid,
 			       const char *path UNUSED,
 			       void *data)
 {
-	oidtree_insert(data, oid);
+	oidtree_insert(data, oid, NULL);
 	return 0;
 }
 
diff --git a/oidtree.c b/oidtree.c
index 117649753f..e43f18026e 100644
--- a/oidtree.c
+++ b/oidtree.c
@@ -9,6 +9,7 @@
 struct oidtree_node {
 	struct cb_node base;
 	struct object_id key;
+	void *data;
 };
 
 void oidtree_init(struct oidtree *ot)
@@ -25,15 +26,22 @@ void oidtree_clear(struct oidtree *ot)
 	}
 }
 
-void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
+struct oidtree_data {
+	struct object_id oid;
+};
+
+void oidtree_insert(struct oidtree *ot, const struct object_id *oid,
+		    void *data)
 {
 	struct oidtree_node *on;
+	struct cb_node *node;
 
 	if (!oid->algo)
 		BUG("oidtree_insert requires oid->algo");
 
 	on = mem_pool_alloc(&ot->mem_pool, sizeof(*on));
 	oidcpy(&on->key, oid);
+	on->data = data;
 
 	/*
 	 * n.b. Current callers won't get us duplicates, here.  If a
@@ -41,13 +49,19 @@ void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
 	 * that won't be freed until oidtree_clear.  Currently it's not
 	 * worth maintaining a free list
 	 */
-	cb_insert(&ot->tree, &on->base, sizeof(*oid));
+	node = cb_insert(&ot->tree, &on->base, sizeof(*oid));
+	if (node) {
+		struct oidtree_node *preexisting = container_of(node, struct oidtree_node, base);
+		preexisting->data = data;
+	}
 }
 
-bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
+static struct oidtree_node *oidtree_lookup(struct oidtree *ot,
+					   const struct object_id *oid)
 {
 	struct object_id k;
 	size_t klen = sizeof(k);
+	struct cb_node *node;
 
 	oidcpy(&k, oid);
 
@@ -58,7 +72,20 @@ bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
 	klen += BUILD_ASSERT_OR_ZERO(offsetof(struct object_id, hash) <
 				offsetof(struct object_id, algo));
 
-	return !!cb_lookup(&ot->tree, (const uint8_t *)&k, klen);
+	node = cb_lookup(&ot->tree, (const uint8_t *)&k, klen);
+	return node ? container_of(node, struct oidtree_node, base) : NULL;
+}
+
+bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
+{
+	struct oidtree_node *node = oidtree_lookup(ot, oid);
+	return node ? 1 : 0;
+}
+
+void *oidtree_get(struct oidtree *ot, const struct object_id *oid)
+{
+	struct oidtree_node *node = oidtree_lookup(ot, oid);
+	return node ? node->data : NULL;
 }
 
 struct oidtree_each_data {
@@ -82,7 +109,7 @@ static int iter(struct cb_node *n, void *cb_data)
 			return 0;
 	}
 
-	return data->cb(&node->key, data->cb_data);
+	return data->cb(&node->key, node->data, data->cb_data);
 }
 
 int oidtree_each(struct oidtree *ot, const struct object_id *prefix,
diff --git a/oidtree.h b/oidtree.h
index 2b7bad2e60..baa5a436ea 100644
--- a/oidtree.h
+++ b/oidtree.h
@@ -29,18 +29,26 @@ void oidtree_init(struct oidtree *ot);
  */
 void oidtree_clear(struct oidtree *ot);
 
-/* Insert the object ID into the tree. */
-void oidtree_insert(struct oidtree *ot, const struct object_id *oid);
+/*
+ * Insert the object ID into the tree and store the given pointer alongside
+ * with it. The data pointer of any preexisting entry will be overwritten.
+ */
+void oidtree_insert(struct oidtree *ot, const struct object_id *oid,
+		    void *data);
 
 /* Check whether the tree contains the given object ID. */
 bool oidtree_contains(struct oidtree *ot, const struct object_id *oid);
 
+/* Get the payload stored with the given object ID. */
+void *oidtree_get(struct oidtree *ot, const struct object_id *oid);
+
 /*
  * Callback function used for `oidtree_each()`. Returning a non-zero exit code
  * will cause iteration to stop. The exit code will be propagated to the caller
  * of `oidtree_each()`.
  */
 typedef int (*oidtree_each_cb)(const struct object_id *oid,
+			       void *node_data,
 			       void *cb_data);
 
 /*
diff --git a/t/unit-tests/u-oidtree.c b/t/unit-tests/u-oidtree.c
index d4d05c7dc3..f0d5ebb733 100644
--- a/t/unit-tests/u-oidtree.c
+++ b/t/unit-tests/u-oidtree.c
@@ -19,7 +19,7 @@ static int fill_tree_loc(struct oidtree *ot, const char *hexes[], size_t n)
 	for (size_t i = 0; i < n; i++) {
 		struct object_id oid;
 		cl_parse_any_oid(hexes[i], &oid);
-		oidtree_insert(ot, &oid);
+		oidtree_insert(ot, &oid, NULL);
 	}
 	return 0;
 }
@@ -38,9 +38,9 @@ struct expected_hex_iter {
 	const char *query;
 };
 
-static int check_each_cb(const struct object_id *oid, void *data)
+static int check_each_cb(const struct object_id *oid, void *node_data UNUSED, void *cb_data)
 {
-	struct expected_hex_iter *hex_iter = data;
+	struct expected_hex_iter *hex_iter = cb_data;
 	struct object_id expected;
 
 	cl_assert(hex_iter->i < hex_iter->expected_hexes.nr);
@@ -105,3 +105,23 @@ void test_oidtree__each(void)
 	check_each(&ot, "32100", "321", NULL);
 	check_each(&ot, "32", "320", "321", NULL);
 }
+
+void test_oidtree__insert_overwrites_data(void)
+{
+	struct object_id oid;
+	struct oidtree ot;
+	int a, b;
+
+	cl_parse_any_oid("1", &oid);
+
+	oidtree_init(&ot);
+
+	oidtree_insert(&ot, &oid, NULL);
+	cl_assert_equal_p(oidtree_get(&ot, &oid), NULL);
+	oidtree_insert(&ot, &oid, &a);
+	cl_assert_equal_p(oidtree_get(&ot, &oid), &a);
+	oidtree_insert(&ot, &oid, &b);
+	cl_assert_equal_p(oidtree_get(&ot, &oid), &b);
+
+	oidtree_clear(&ot);
+}

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 11/17] odb/source-inmemory: convert to use oidtree
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (9 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 10/17] oidtree: add ability to store data Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09  7:24   ` [PATCH v2 12/17] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt
                     ` (6 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

The in-memory source stores its objects in a simple array that we grow as
needed. This has a couple of downsides:

  - The object lookup is O(n). This doesn't matter in practice because
    we only store a small number of objects.

  - We don't have an easy way to iterate over all objects in
    lexicographic order.

  - We don't have an easy way to compute unique object ID prefixes.

Refactor the code to use an oidtree instead. This is the same data
structure used by our loose object source, and thus it means we get a
bunch of functionality for free.
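
As a rough illustration of what the list above asks for, the sketch below
keeps entries sorted by ID so that lookup, ordered iteration and prefix
queries all fall out of one structure. It is a hypothetical stand-in, not
git's oidtree: `struct store`, `STORE_MAX` and the `store_*()` functions are
made up for this example, IDs are plain hex strings, and a sorted array with
binary search replaces the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an ID-keyed tree; not git's oidtree. */
#define STORE_MAX 64

struct entry {
	const char *id;   /* hex object ID, owned by the caller */
	void *data;       /* payload stored alongside the ID */
};

struct store {
	struct entry entries[STORE_MAX];
	size_t nr;
};

/* Binary search: index of `id`, or of the slot where it would be inserted. */
static size_t store_pos(const struct store *s, const char *id, int *found)
{
	size_t lo = 0, hi = s->nr;

	*found = 0;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = strcmp(s->entries[mid].id, id);
		if (!cmp) {
			*found = 1;
			return mid;
		}
		if (cmp < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Insert `id`, overwriting the payload of any preexisting entry. */
static void store_insert(struct store *s, const char *id, void *data)
{
	int found;
	size_t pos = store_pos(s, id, &found);

	if (!found) {
		assert(s->nr < STORE_MAX);
		/* Shift the tail up by one slot to keep the array sorted. */
		memmove(&s->entries[pos + 1], &s->entries[pos],
			(s->nr - pos) * sizeof(s->entries[0]));
		s->entries[pos].id = id;
		s->nr++;
	}
	s->entries[pos].data = data;
}

/* Return the payload stored for `id`, or NULL if there is none. */
static void *store_get(const struct store *s, const char *id)
{
	int found;
	size_t pos = store_pos(s, id, &found);
	return found ? s->entries[pos].data : NULL;
}

/* Count entries whose ID starts with `prefix`; they are adjacent when sorted. */
static size_t store_count_prefix(const struct store *s, const char *prefix)
{
	int found;
	size_t len = strlen(prefix), n = 0;

	for (size_t i = store_pos(s, prefix, &found);
	     i < s->nr && !strncmp(s->entries[i].id, prefix, len); i++)
		n++;
	return n;
}
```

Note that insertion into the sorted array is still O(n) because of the
`memmove()`, which a tree avoids. The sketch is only meant to show why an
ordered, keyed structure makes lookups and prefix queries straightforward.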

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 72 +++++++++++++++++++++++++++++++++++++--------------
 odb/source-inmemory.h | 13 ++--------
 2 files changed, 54 insertions(+), 31 deletions(-)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 578ceea550..0420b98d00 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -3,20 +3,29 @@
 #include "odb.h"
 #include "odb/source-inmemory.h"
 #include "odb/streaming.h"
+#include "oidtree.h"
 #include "repository.h"
 
-static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
-						      const struct object_id *oid)
+struct inmemory_object {
+	enum object_type type;
+	const void *buf;
+	unsigned long size;
+};
+
+static const struct inmemory_object *find_cached_object(struct odb_source_inmemory *source,
+							const struct object_id *oid)
 {
-	static const struct cached_object empty_tree = {
+	static const struct inmemory_object empty_tree = {
 		.type = OBJ_TREE,
 		.buf = "",
 	};
-	const struct cached_object_entry *co = source->objects;
+	const struct inmemory_object *object;
 
-	for (size_t i = 0; i < source->objects_nr; i++, co++)
-		if (oideq(&co->oid, oid))
-			return &co->value;
+	if (source->objects) {
+		object = oidtree_get(source->objects, oid);
+		if (object)
+			return object;
+	}
 
 	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
 		return &empty_tree;
@@ -30,7 +39,7 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
 						enum object_info_flags flags UNUSED)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
-	const struct cached_object *object;
+	const struct inmemory_object *object;
 
 	object = find_cached_object(inmemory, oid);
 	if (!object)
@@ -86,7 +95,7 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
 	struct odb_read_stream_inmemory *stream;
-	const struct cached_object *object;
+	const struct inmemory_object *object;
 
 	object = find_cached_object(inmemory, oid);
 	if (!object)
@@ -111,17 +120,23 @@ static int odb_source_inmemory_write_object(struct odb_source *source,
 					    enum odb_write_object_flags flags UNUSED)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
-	struct cached_object_entry *object;
+	struct inmemory_object *object;
 
 	hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid);
 
-	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
-		   inmemory->objects_alloc);
-	object = &inmemory->objects[inmemory->objects_nr++];
-	object->value.size = len;
-	object->value.type = type;
-	object->value.buf = xmemdupz(buf, len);
-	oidcpy(&object->oid, oid);
+	if (!inmemory->objects) {
+		CALLOC_ARRAY(inmemory->objects, 1);
+		oidtree_init(inmemory->objects);
+	} else if (oidtree_contains(inmemory->objects, oid)) {
+		return 0;
+	}
+
+	CALLOC_ARRAY(object, 1);
+	object->size = len;
+	object->type = type;
+	object->buf = xmemdupz(buf, len);
+
+	oidtree_insert(inmemory->objects, oid, object);
 
 	return 0;
 }
@@ -165,12 +180,29 @@ static int odb_source_inmemory_write_object_stream(struct odb_source *source,
 	return ret;
 }
 
+static int inmemory_object_free(const struct object_id *oid UNUSED,
+				void *node_data,
+				void *cb_data UNUSED)
+{
+	struct inmemory_object *object = node_data;
+	free((void *) object->buf);
+	free(object);
+	return 0;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
-	for (size_t i = 0; i < inmemory->objects_nr; i++)
-		free((char *) inmemory->objects[i].value.buf);
-	free(inmemory->objects);
+
+	if (inmemory->objects) {
+		struct object_id null_oid = { 0 };
+
+		oidtree_each(inmemory->objects, &null_oid, 0,
+			     inmemory_object_free, NULL);
+		oidtree_clear(inmemory->objects);
+		free(inmemory->objects);
+	}
+
 	free(inmemory->base.path);
 	free(inmemory);
 }
diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
index 14dc06f7c3..02cf586b63 100644
--- a/odb/source-inmemory.h
+++ b/odb/source-inmemory.h
@@ -3,14 +3,7 @@
 
 #include "odb/source.h"
 
-struct cached_object_entry {
-	struct object_id oid;
-	struct cached_object {
-		enum object_type type;
-		const void *buf;
-		unsigned long size;
-	} value;
-};
+struct oidtree;
 
 /*
  * An inmemory source that you can write objects to that shall be made
@@ -20,9 +13,7 @@ struct cached_object_entry {
  */
 struct odb_source_inmemory {
 	struct odb_source base;
-
-	struct cached_object_entry *objects;
-	size_t objects_nr, objects_alloc;
+	struct oidtree *objects;
 };
 
 /* Create a new in-memory object database source. */

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 12/17] odb/source-inmemory: implement `for_each_object()` callback
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (10 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 11/17] odb/source-inmemory: convert to use oidtree Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09  7:24   ` [PATCH v2 13/17] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt
                     ` (5 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `for_each_object()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 86 +++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 70 insertions(+), 16 deletions(-)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 0420b98d00..d1674836cc 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -33,6 +33,28 @@ static const struct inmemory_object *find_cached_object(struct odb_source_inmemo
 	return NULL;
 }
 
+static void populate_object_info(struct odb_source_inmemory *source,
+				 struct object_info *oi,
+				 const struct inmemory_object *object)
+{
+	if (!oi)
+		return;
+
+	if (oi->typep)
+		*(oi->typep) = object->type;
+	if (oi->sizep)
+		*(oi->sizep) = object->size;
+	if (oi->disk_sizep)
+		*(oi->disk_sizep) = 0;
+	if (oi->delta_base_oid)
+		oidclr(oi->delta_base_oid, source->base.odb->repo->hash_algo);
+	if (oi->contentp)
+		*oi->contentp = xmemdupz(object->buf, object->size);
+	if (oi->mtimep)
+		*oi->mtimep = 0;
+	oi->whence = OI_CACHED;
+}
+
 static int odb_source_inmemory_read_object_info(struct odb_source *source,
 						const struct object_id *oid,
 						struct object_info *oi,
@@ -45,22 +67,7 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
 	if (!object)
 		return -1;
 
-	if (oi) {
-		if (oi->typep)
-			*(oi->typep) = object->type;
-		if (oi->sizep)
-			*(oi->sizep) = object->size;
-		if (oi->disk_sizep)
-			*(oi->disk_sizep) = 0;
-		if (oi->delta_base_oid)
-			oidclr(oi->delta_base_oid, source->odb->repo->hash_algo);
-		if (oi->contentp)
-			*oi->contentp = xmemdupz(object->buf, object->size);
-		if (oi->mtimep)
-			*oi->mtimep = 0;
-		oi->whence = OI_CACHED;
-	}
-
+	populate_object_info(inmemory, oi, object);
 	return 0;
 }
 
@@ -112,6 +119,52 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
 	return 0;
 }
 
+struct odb_source_inmemory_for_each_object_data {
+	struct odb_source_inmemory *inmemory;
+	const struct object_info *request;
+	odb_for_each_object_cb cb;
+	void *cb_data;
+};
+
+static int odb_source_inmemory_for_each_object_cb(const struct object_id *oid,
+						  void *node_data, void *cb_data)
+{
+	struct odb_source_inmemory_for_each_object_data *data = cb_data;
+	struct inmemory_object *object = node_data;
+
+	if (data->request) {
+		struct object_info oi = *data->request;
+		populate_object_info(data->inmemory, &oi, object);
+		return data->cb(oid, &oi, data->cb_data);
+	} else {
+		return data->cb(oid, NULL, data->cb_data);
+	}
+}
+
+static int odb_source_inmemory_for_each_object(struct odb_source *source,
+					       const struct object_info *request,
+					       odb_for_each_object_cb cb,
+					       void *cb_data,
+					       const struct odb_for_each_object_options *opts)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	struct odb_source_inmemory_for_each_object_data payload = {
+		.inmemory = inmemory,
+		.request = request,
+		.cb = cb,
+		.cb_data = cb_data,
+	};
+	struct object_id null_oid = { 0 };
+
+	if ((opts->flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY) ||
+	    (opts->flags & ODB_FOR_EACH_OBJECT_LOCAL_ONLY && !source->local))
+		return 0;
+
+	return oidtree_each(inmemory->objects,
+			    opts->prefix ? opts->prefix : &null_oid, opts->prefix_hex_len,
+			    odb_source_inmemory_for_each_object_cb, &payload);
+}
+
 static int odb_source_inmemory_write_object(struct odb_source *source,
 					    const void *buf, unsigned long len,
 					    enum object_type type,
@@ -217,6 +270,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.free = odb_source_inmemory_free;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
+	source->base.for_each_object = odb_source_inmemory_for_each_object;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 13/17] odb/source-inmemory: implement `find_abbrev_len()` callback
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (11 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 12/17] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09  7:24   ` [PATCH v2 14/17] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt
                     ` (4 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `find_abbrev_len()` callback function for the in-memory
source.
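
The idea behind the callback can be sketched in isolation: for every other
object, compute how many leading hex characters it shares with the ID in
question, and make the abbreviation one character longer than the longest
such overlap. The sketch below assumes equal-length hex strings instead of
`struct object_id`. `common_prefix_hexlen()` and this `find_abbrev_len()`
are illustrative helpers, not the functions from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of leading hex characters that two IDs have in common. */
static size_t common_prefix_hexlen(const char *a, const char *b)
{
	size_t n = 0;
	while (a[n] && b[n] && a[n] == b[n])
		n++;
	return n;
}

/*
 * Return the shortest prefix length (at least `min_len`) of `id` that does
 * not match any other entry in `ids`. An entry equal to `id` is skipped,
 * mirroring how the patch ignores a common prefix that spans the full hash.
 * Assumes all IDs are hex strings of the same length.
 */
static size_t find_abbrev_len(const char *const *ids, size_t nr,
			      const char *id, size_t min_len)
{
	size_t hexsz = strlen(id);
	size_t len = min_len;

	assert(min_len <= hexsz);

	for (size_t i = 0; i < nr; i++) {
		size_t common = common_prefix_hexlen(ids[i], id);
		if (common != hexsz && common >= len)
			len = common + 1;
	}
	return len;
}
```

The patch restricts the iteration to objects that already share the first
`min_len` characters. The sketch visits every ID instead, which gives the
same result because shorter overlaps never raise `len`.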

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index d1674836cc..a8eba373ee 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -165,6 +165,44 @@ static int odb_source_inmemory_for_each_object(struct odb_source *source,
 			    odb_source_inmemory_for_each_object_cb, &payload);
 }
 
+struct find_abbrev_len_data {
+	const struct object_id *oid;
+	unsigned len;
+};
+
+static int find_abbrev_len_cb(const struct object_id *oid,
+			      struct object_info *oi UNUSED,
+			      void *cb_data)
+{
+	struct find_abbrev_len_data *data = cb_data;
+	unsigned len = oid_common_prefix_hexlen(oid, data->oid);
+	if (len != hash_algos[oid->algo].hexsz && len >= data->len)
+		data->len = len + 1;
+	return 0;
+}
+
+static int odb_source_inmemory_find_abbrev_len(struct odb_source *source,
+					       const struct object_id *oid,
+					       unsigned min_len,
+					       unsigned *out)
+{
+	struct odb_for_each_object_options opts = {
+		.prefix = oid,
+		.prefix_hex_len = min_len,
+	};
+	struct find_abbrev_len_data data = {
+		.oid = oid,
+		.len = min_len,
+	};
+	int ret;
+
+	ret = odb_source_inmemory_for_each_object(source, NULL, find_abbrev_len_cb,
+						  &data, &opts);
+	*out = data.len;
+
+	return ret;
+}
+
 static int odb_source_inmemory_write_object(struct odb_source *source,
 					    const void *buf, unsigned long len,
 					    enum object_type type,
@@ -271,6 +309,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.for_each_object = odb_source_inmemory_for_each_object;
+	source->base.find_abbrev_len = odb_source_inmemory_find_abbrev_len;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 14/17] odb/source-inmemory: implement `count_objects()` callback
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (12 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 13/17] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09  7:24   ` [PATCH v2 15/17] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt
                     ` (3 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `count_objects()` callback function for the in-memory
source.
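
Counting is implemented on top of iteration: a callback bumps a counter that
is passed through the opaque `cb_data` pointer. A minimal sketch of that
pattern follows, with string IDs in place of `struct object_id`.
`for_each_id()`, `count_cb()` and `count_ids()` are hypothetical names
chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Callback type: a non-zero return value stops the iteration. */
typedef int (*each_cb)(const char *id, void *cb_data);

/* Call `cb` for every ID and propagate the first non-zero return value. */
static int for_each_id(const char *const *ids, size_t nr,
		       each_cb cb, void *cb_data)
{
	assert(cb);
	for (size_t i = 0; i < nr; i++) {
		int ret = cb(ids[i], cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Iteration callback that only increments the counter behind `cb_data`. */
static int count_cb(const char *id, void *cb_data)
{
	unsigned long *counter = cb_data;
	(void)id; /* the ID itself is irrelevant for counting */
	(*counter)++;
	return 0;
}

/* Reset `*out` and count all IDs by iterating over them. */
static int count_ids(const char *const *ids, size_t nr, unsigned long *out)
{
	*out = 0;
	return for_each_id(ids, nr, count_cb, out);
}
```

Resetting `*out` before iterating matters: the caller's variable may hold
stale data, and an empty source must report zero objects.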

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index a8eba373ee..f038debaa3 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -203,6 +203,25 @@ static int odb_source_inmemory_find_abbrev_len(struct odb_source *source,
 	return ret;
 }
 
+static int count_objects_cb(const struct object_id *oid UNUSED,
+			    struct object_info *oi UNUSED,
+			    void *cb_data)
+{
+	unsigned long *counter = cb_data;
+	(*counter)++;
+	return 0;
+}
+
+static int odb_source_inmemory_count_objects(struct odb_source *source,
+					     enum odb_count_objects_flags flags UNUSED,
+					     unsigned long *out)
+{
+	struct odb_for_each_object_options opts = { 0 };
+	*out = 0;
+	return odb_source_inmemory_for_each_object(source, NULL, count_objects_cb,
+						   out, &opts);
+}
+
 static int odb_source_inmemory_write_object(struct odb_source *source,
 					    const void *buf, unsigned long len,
 					    enum object_type type,
@@ -310,6 +329,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.for_each_object = odb_source_inmemory_for_each_object;
 	source->base.find_abbrev_len = odb_source_inmemory_find_abbrev_len;
+	source->base.count_objects = odb_source_inmemory_count_objects;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 15/17] odb/source-inmemory: implement `freshen_object()` callback
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (13 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 14/17] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09  7:24   ` [PATCH v2 16/17] odb/source-inmemory: stub out remaining functions Patrick Steinhardt
                     ` (2 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `freshen_object()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index f038debaa3..15a6a5ae64 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -290,6 +290,15 @@ static int odb_source_inmemory_write_object_stream(struct odb_source *source,
 	return ret;
 }
 
+static int odb_source_inmemory_freshen_object(struct odb_source *source,
+					      const struct object_id *oid)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	if (find_cached_object(inmemory, oid))
+		return 1;
+	return 0;
+}
+
 static int inmemory_object_free(const struct object_id *oid UNUSED,
 				void *node_data,
 				void *cb_data UNUSED)
@@ -332,6 +341,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.count_objects = odb_source_inmemory_count_objects;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
+	source->base.freshen_object = odb_source_inmemory_freshen_object;
 
 	return source;
 }

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 16/17] odb/source-inmemory: stub out remaining functions
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (14 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 15/17] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09 19:39     ` Junio C Hamano
  2026-04-09  7:24   ` [PATCH v2 17/17] odb: generic in-memory source Patrick Steinhardt
  2026-04-09 11:44   ` [PATCH v2 00/17] odb: introduce "in-memory" source Karthik Nayak
  17 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Stub out remaining functions that we either don't need or that are
basically no-ops.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 15a6a5ae64..1140b1b916 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -299,6 +299,32 @@ static int odb_source_inmemory_freshen_object(struct odb_source *source,
 	return 0;
 }
 
+static int odb_source_inmemory_begin_transaction(struct odb_source *source UNUSED,
+						 struct odb_transaction **out UNUSED)
+{
+	return error("inmemory source does not support transactions");
+}
+
+static int odb_source_inmemory_read_alternates(struct odb_source *source UNUSED,
+					       struct strvec *out UNUSED)
+{
+	return 0;
+}
+
+static int odb_source_inmemory_write_alternate(struct odb_source *source UNUSED,
+					       const char *alternate UNUSED)
+{
+	return error("inmemory source does not support alternates");
+}
+
+static void odb_source_inmemory_close(struct odb_source *source UNUSED)
+{
+}
+
+static void odb_source_inmemory_reprepare(struct odb_source *source UNUSED)
+{
+}
+
 static int inmemory_object_free(const struct object_id *oid UNUSED,
 				void *node_data,
 				void *cb_data UNUSED)
@@ -334,6 +360,8 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
 
 	source->base.free = odb_source_inmemory_free;
+	source->base.close = odb_source_inmemory_close;
+	source->base.reprepare = odb_source_inmemory_reprepare;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.for_each_object = odb_source_inmemory_for_each_object;
@@ -342,6 +370,9 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 	source->base.freshen_object = odb_source_inmemory_freshen_object;
+	source->base.begin_transaction = odb_source_inmemory_begin_transaction;
+	source->base.read_alternates = odb_source_inmemory_read_alternates;
+	source->base.write_alternate = odb_source_inmemory_write_alternate;
 
 	return source;
 }

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* [PATCH v2 17/17] odb: generic in-memory source
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (15 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 16/17] odb/source-inmemory: stub out remaining functions Patrick Steinhardt
@ 2026-04-09  7:24   ` Patrick Steinhardt
  2026-04-09 11:44   ` [PATCH v2 00/17] odb: introduce "in-memory" source Karthik Nayak
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

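Make the in-memory source generic: store it in `struct object_database`
as a plain `struct odb_source` pointer instead of a
`struct odb_source_inmemory` pointer, so that callers only go through the
generic source interface.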

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c | 8 ++++----
 odb.h | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/odb.c b/odb.c
index 24e929f03c..965ef68e4e 100644
--- a/odb.c
+++ b/odb.c
@@ -560,7 +560,7 @@ static int do_oid_object_info_extended(struct object_database *odb,
 	if (is_null_oid(real))
 		return -1;
 
-	if (!odb_source_read_object_info(&odb->inmemory_objects->base, oid, oi, flags))
+	if (!odb_source_read_object_info(odb->inmemory_objects, oid, oi, flags))
 		return 0;
 
 	odb_prepare_alternates(odb);
@@ -737,7 +737,7 @@ int odb_pretend_object(struct object_database *odb,
 	if (odb_has_object(odb, oid, 0))
 		return 0;
 
-	return odb_source_write_object(&odb->inmemory_objects->base,
+	return odb_source_write_object(odb->inmemory_objects,
 				       buf, len, type, oid, NULL, 0);
 }
 
@@ -1020,7 +1020,7 @@ struct object_database *odb_new(struct repository *repo,
 	o->sources = odb_source_new(o, primary_source, true);
 	o->sources_tail = &o->sources->next;
 	o->alternate_db = xstrdup_or_null(secondary_sources);
-	o->inmemory_objects = odb_source_inmemory_new(o);
+	o->inmemory_objects = &odb_source_inmemory_new(o)->base;
 
 	free(to_free);
 
@@ -1045,7 +1045,7 @@ static void odb_free_sources(struct object_database *o)
 		o->sources = next;
 	}
 
-	odb_source_free(&o->inmemory_objects->base);
+	odb_source_free(o->inmemory_objects);
 	o->inmemory_objects = NULL;
 
 	kh_destroy_odb_path_map(o->source_by_path);
diff --git a/odb.h b/odb.h
index c3a7edf9c8..73553ed5a7 100644
--- a/odb.h
+++ b/odb.h
@@ -81,7 +81,7 @@ struct object_database {
 	 * to write them into the object store (e.g. a browse-only
 	 * application).
 	 */
-	struct odb_source_inmemory *inmemory_objects;
+	struct odb_source *inmemory_objects;
 
 	/*
 	 * A fast, rough count of the number of objects in the repository.

-- 
2.54.0.rc0.680.geaeac8ef83.dirty



* Re: [PATCH v2 01/17] odb: introduce "in-memory" source
  2026-04-09  7:24   ` [PATCH v2 01/17] " Patrick Steinhardt
@ 2026-04-09  9:26     ` Karthik Nayak
  2026-04-09 10:41       ` Patrick Steinhardt
  0 siblings, 1 reply; 85+ messages in thread
From: Karthik Nayak @ 2026-04-09  9:26 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler


Patrick Steinhardt <ps@pks.im> writes:

> Next to our typical object database sources, each object database also
> has an implicit source of "cached" objects. These cached objects only
> exist in memory and serve two use cases:
>
>   - They contain evergreen objects that we expect to always exist, like
>     for example the empty tree.
>
>   - They can be used to store temporary objects that we don't want to
>     persist to disk, which is used by git-blame(1) to create a fake
>     worktree commit.
>
> Overall, their use is somewhat restricted though. For example, we don't
> provide the ability to use it as a temporary object database source that
> allows the user to write objects, but discards them after Git exits. So
> while these cached objects behave almost like a source, they aren't used
> as one.
>
> This is about to change over the following commits, where we will turn
> cached objects into a new "in-memory" source. This will allow us to use
> it exactly the same as any other source by providing the same common
> interface as the "files" source.
>
> For now, the in-memory source only hosts the cached objects and doesn't
> provide any logic yet. This will change with subsequent commits, where
> we move respective functionality into the source.

[snip]

> diff --git a/odb.c b/odb.c
> index 40a5e9c4e0..60e1eead25 100644
> --- a/odb.c
> +++ b/odb.c
> @@ -14,6 +14,7 @@
>  #include "object-file.h"
>  #include "object-name.h"
>  #include "odb.h"
> +#include "odb/source-inmemory.h"
>  #include "packfile.h"
>  #include "path.h"
>  #include "promisor-remote.h"
> @@ -53,9 +54,9 @@ static const struct cached_object *find_cached_object(struct object_database *ob
>  		.type = OBJ_TREE,
>  		.buf = "",
>  	};
> -	const struct cached_object_entry *co = object_store->cached_objects;
> +	const struct cached_object_entry *co = object_store->inmemory_objects->objects;
>
> -	for (size_t i = 0; i < object_store->cached_object_nr; i++, co++)
> +	for (size_t i = 0; i < object_store->inmemory_objects->objects_nr; i++, co++)
>  		if (oideq(&co->oid, oid))
>  			return &co->value;
>
> @@ -792,9 +793,10 @@ int odb_pretend_object(struct object_database *odb,
>  	    find_cached_object(odb, oid))
>  		return 0;
>
> -	ALLOC_GROW(odb->cached_objects,
> -		   odb->cached_object_nr + 1, odb->cached_object_alloc);
> -	co = &odb->cached_objects[odb->cached_object_nr++];
> +	ALLOC_GROW(odb->inmemory_objects->objects,
> +		   odb->inmemory_objects->objects_nr + 1,
> +		   odb->inmemory_objects->objects_alloc);
> +	co = &odb->inmemory_objects->objects[odb->inmemory_objects->objects_nr++];

Okay so we introduce the inmemory object storage and directly write
objects to it. I guess in the upcoming commits, we'll swap to using the
API as we implement them.

Makes sense for now.

>  	co->value.size = len;
>  	co->value.type = type;
>  	co_buf = xmalloc(len);
> @@ -1083,6 +1085,7 @@ struct object_database *odb_new(struct repository *repo,
>  	o->sources = odb_source_new(o, primary_source, true);
>  	o->sources_tail = &o->sources->next;
>  	o->alternate_db = xstrdup_or_null(secondary_sources);
> +	o->inmemory_objects = odb_source_inmemory_new(o);
>
>  	free(to_free);
>
> @@ -1123,9 +1126,11 @@ void odb_free(struct object_database *o)
>  	odb_close(o);
>  	odb_free_sources(o);
>
> -	for (size_t i = 0; i < o->cached_object_nr; i++)
> -		free((char *) o->cached_objects[i].value.buf);
> -	free(o->cached_objects);
> +	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
> +		free((char *) o->inmemory_objects->objects[i].value.buf);
> +	free(o->inmemory_objects->objects);
> +	free(o->inmemory_objects->base.path);
> +	free(o->inmemory_objects);
>
>  	string_list_clear(&o->submodule_source_paths, 0);
>
> diff --git a/odb.h b/odb.h
> index 9eb8355aca..c3a7edf9c8 100644
> --- a/odb.h
> +++ b/odb.h
> @@ -8,6 +8,7 @@
>  #include "thread-utils.h"
>
>  struct cached_object_entry;
> +struct odb_source_inmemory;
>  struct packed_git;
>  struct repository;
>  struct strbuf;
> @@ -80,8 +81,7 @@ struct object_database {
>  	 * to write them into the object store (e.g. a browse-only
>  	 * application).
>  	 */
> -	struct cached_object_entry *cached_objects;
> -	size_t cached_object_nr, cached_object_alloc;
> +	struct odb_source_inmemory *inmemory_objects;
>
>  	/*
>  	 * A fast, rough count of the number of objects in the repository.
> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> new file mode 100644
> index 0000000000..c7ac5c24f0
> --- /dev/null
> +++ b/odb/source-inmemory.c
> @@ -0,0 +1,12 @@
> +#include "git-compat-util.h"
> +#include "odb/source-inmemory.h"
> +
> +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
> +{
> +	struct odb_source_inmemory *source;
> +
> +	CALLOC_ARRAY(source, 1);
> +	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
> +
> +	return source;
> +}
> diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
> new file mode 100644
> index 0000000000..95477bf36d
> --- /dev/null
> +++ b/odb/source-inmemory.h
> @@ -0,0 +1,35 @@
> +#ifndef ODB_SOURCE_INMEMORY_H
> +#define ODB_SOURCE_INMEMORY_H
> +
> +#include "odb/source.h"
> +
> +struct cached_object_entry;
> +
> +/*
> + * An inmemory source that you can write objects to that shall be made
> + * available for reading, but that shouldn't ever be persisted to disk. Note
> + * that any objects written to this source will be stored in memory, so the
> + * number of objects you can store is limited by available system memory.
> + */
> +struct odb_source_inmemory {
> +	struct odb_source base;
> +
> +	struct cached_object_entry *objects;
> +	size_t objects_nr, objects_alloc;
> +};
> +
> +/* Create a new in-memory object database source. */
> +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb);
> +
> +/*
> + * Cast the given object database source to the inmemory backend. This will
> + * cause a BUG in case the source doesn't use this backend.
> + */
> +static inline struct odb_source_inmemory *odb_source_inmemory_downcast(struct odb_source *source)
> +{
> +	if (source->type != ODB_SOURCE_INMEMORY)
> +		BUG("trying to downcast source of type '%d' to inmemory", source->type);
> +	return container_of(source, struct odb_source_inmemory, base);
> +}
> +

Interesting, in the refs namespace the downcast functions are added to
the source file (.c). This works too, is there any reason though?

[snip]

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback
  2026-04-09  7:24   ` [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt
@ 2026-04-09  9:40     ` Karthik Nayak
  2026-04-09 10:41       ` Patrick Steinhardt
  0 siblings, 1 reply; 85+ messages in thread
From: Karthik Nayak @ 2026-04-09  9:40 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 995 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

[snip]

> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> index ccbb622eae..12c80f9b34 100644
> --- a/odb/source-inmemory.c
> +++ b/odb/source-inmemory.c
> @@ -1,5 +1,57 @@
>  #include "git-compat-util.h"
> +#include "odb.h"
>  #include "odb/source-inmemory.h"
> +#include "repository.h"
> +
> +static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
> +						      const struct object_id *oid)
> +{
> +	static const struct cached_object empty_tree = {
> +		.type = OBJ_TREE,
> +		.buf = "",
> +	};
> +	const struct cached_object_entry *co = source->objects;
> +
> +	for (size_t i = 0; i < source->objects_nr; i++, co++)
> +		if (oideq(&co->oid, oid))
> +			return &co->value;
> +
> +	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
> +		return &empty_tree;
> +

Silly question: would it make more sense to check for empty_tree before
iterating over all objects?

The rest looks good

[snip]

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 05/17] odb/source-inmemory: implement `read_object_stream()` callback
  2026-04-09  7:24   ` [PATCH v2 05/17] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
@ 2026-04-09  9:49     ` Karthik Nayak
  2026-04-09 10:41       ` Patrick Steinhardt
  0 siblings, 1 reply; 85+ messages in thread
From: Karthik Nayak @ 2026-04-09  9:49 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 2494 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> Implement the `read_object_stream()` callback function for the in-memory
> source.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  odb/source-inmemory.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 50 insertions(+)
>
> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> index 12c80f9b34..4a68169430 100644
> --- a/odb/source-inmemory.c
> +++ b/odb/source-inmemory.c
> @@ -1,6 +1,7 @@
>  #include "git-compat-util.h"
>  #include "odb.h"
>  #include "odb/source-inmemory.h"
> +#include "odb/streaming.h"
>  #include "repository.h"
>
>  static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
> @@ -53,6 +54,54 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
>  	return 0;
>  }
>
> +struct odb_read_stream_inmemory {
> +	struct odb_read_stream base;
> +	const void *buf;
> +	size_t offset;
> +};
> +

To stream objects, we have a new structure which is used in the callback.

> +static ssize_t odb_read_stream_inmemory_read(struct odb_read_stream *stream,
> +					     char *buf, size_t buf_len)
> +{
> +	struct odb_read_stream_inmemory *inmemory =
> +		container_of(stream, struct odb_read_stream_inmemory, base);
> +	size_t bytes = buf_len;



> +	if (buf_len > inmemory->base.size - inmemory->offset)
> +		bytes = inmemory->base.size - inmemory->offset;
> +	memcpy(buf, inmemory->buf, bytes);
> +

Shouldn't the offset also be set and we only memcpy offset onwards?

> +	return bytes;
> +}
> +
> +static int odb_read_stream_inmemory_close(struct odb_read_stream *stream UNUSED)
> +{
> +	return 0;
> +}
> +
> +static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
> +						  struct odb_source *source,
> +						  const struct object_id *oid)
> +{
> +	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
> +	struct odb_read_stream_inmemory *stream;
> +	const struct cached_object *object;
> +
> +	object = find_cached_object(inmemory, oid);
> +	if (!object)
> +		return -1;
> +
> +	CALLOC_ARRAY(stream, 1);
> +	stream->base.read = odb_read_stream_inmemory_read;
> +	stream->base.close = odb_read_stream_inmemory_close;
> +	stream->base.size = object->size;
> +	stream->base.type = object->type;
> +	stream->buf = object->buf;
> +

So the object is simply mapped to the structure which is propagated in
`read()`. Since we don't copy any new data over, `close()` has nothing
to do.

[snip]

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 07/17] odb/source-inmemory: implement `write_object()` callback
  2026-04-09  7:24   ` [PATCH v2 07/17] " Patrick Steinhardt
@ 2026-04-09 10:27     ` Karthik Nayak
  2026-04-09 10:41       ` Patrick Steinhardt
  0 siblings, 1 reply; 85+ messages in thread
From: Karthik Nayak @ 2026-04-09 10:27 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 1120 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> Implement the `write_object()` callback function for the in-memory
> source.
>

Rebase error? This seems to be the same commit message as the last commit.

> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  odb/source-inmemory.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> index d2fc4c4054..96e8efd327 100644
> --- a/odb/source-inmemory.c
> +++ b/odb/source-inmemory.c
> @@ -1,4 +1,5 @@
>  #include "git-compat-util.h"
> +#include "object-file.h"
>  #include "odb.h"
>  #include "odb/source-inmemory.h"
>  #include "odb/streaming.h"
> @@ -112,6 +113,8 @@ static int odb_source_inmemory_write_object(struct odb_source *source,
>  	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
>  	struct cached_object_entry *object;
>
> +	hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid);
> +
>  	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
>  		   inmemory->objects_alloc);
>  	object = &inmemory->objects[inmemory->objects_nr++];
>
> --
> 2.54.0.rc0.680.geaeac8ef83.dirty

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 01/17] odb: introduce "in-memory" source
  2026-04-09  9:26     ` Karthik Nayak
@ 2026-04-09 10:41       ` Patrick Steinhardt
  0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 10:41 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, Junio C Hamano, Justin Tobler

On Thu, Apr 09, 2026 at 05:26:50AM -0400, Karthik Nayak wrote:
> > diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
> > new file mode 100644
> > index 0000000000..95477bf36d
> > --- /dev/null
> > +++ b/odb/source-inmemory.h
> > @@ -0,0 +1,35 @@
> > +#ifndef ODB_SOURCE_INMEMORY_H
> > +#define ODB_SOURCE_INMEMORY_H
> > +
> > +#include "odb/source.h"
> > +
> > +struct cached_object_entry;
> > +
> > +/*
> > + * An inmemory source that you can write objects to that shall be made
> > + * available for reading, but that shouldn't ever be persisted to disk. Note
> > + * that any objects written to this source will be stored in memory, so the
> > + * number of objects you can store is limited by available system memory.
> > + */
> > +struct odb_source_inmemory {
> > +	struct odb_source base;
> > +
> > +	struct cached_object_entry *objects;
> > +	size_t objects_nr, objects_alloc;
> > +};
> > +
> > +/* Create a new in-memory object database source. */
> > +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb);
> > +
> > +/*
> > + * Cast the given object database source to the inmemory backend. This will
> > + * cause a BUG in case the source doesn't use this backend.
> > + */
> > +static inline struct odb_source_inmemory *odb_source_inmemory_downcast(struct odb_source *source)
> > +{
> > +	if (source->type != ODB_SOURCE_INMEMORY)
> > +		BUG("trying to downcast source of type '%d' to inmemory", source->type);
> > +	return container_of(source, struct odb_source_inmemory, base);
> > +}
> > +
> 
> Interesting, in the refs namespace the downcast functions are added to
> the source file (.c). This works too, is there any reason though?

By having it static inline over here we can basically ensure that the
compiler can inline this call everywhere. I doubt that it really matters
in the end, but I guess it doesn't hurt, either.
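
To illustrate the pattern outside of Git, here is a minimal standalone
sketch of a checked downcast that lives in a header as a static inline
function. All `toy_*` names are hypothetical stand-ins for
`struct odb_source`, `container_of()` and `BUG()`; this is not the code
from the patch.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical backend tags, standing in for the ODB source types. */
enum toy_source_type { TOY_SOURCE_FILES, TOY_SOURCE_INMEMORY };

/* Generic base structure that every backend embeds. */
struct toy_source {
	enum toy_source_type type;
};

/* Backend-specific structure that wraps the base structure. */
struct toy_source_inmemory {
	struct toy_source base;
	size_t objects_nr;
};

/* Simplified container_of(): recover the wrapper from a member pointer. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Checked downcast. Being static inline in a header, every caller sees
 * the body, so the compiler is free to inline the type check.
 */
static inline struct toy_source_inmemory *toy_downcast(struct toy_source *source)
{
	if (source->type != TOY_SOURCE_INMEMORY) {
		fprintf(stderr, "BUG: cannot downcast source of type '%d'\n",
			(int)source->type);
		abort();
	}
	return toy_container_of(source, struct toy_source_inmemory, base);
}
```

Passing a pointer to the `base` member of a `struct toy_source_inmemory`
yields the enclosing structure again, while any other source type
aborts.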

Patrick

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback
  2026-04-09  9:40     ` Karthik Nayak
@ 2026-04-09 10:41       ` Patrick Steinhardt
  2026-04-09 11:22         ` Karthik Nayak
  0 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 10:41 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, Junio C Hamano, Justin Tobler

On Thu, Apr 09, 2026 at 05:40:01AM -0400, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> > diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> > index ccbb622eae..12c80f9b34 100644
> > --- a/odb/source-inmemory.c
> > +++ b/odb/source-inmemory.c
> > @@ -1,5 +1,57 @@
> >  #include "git-compat-util.h"
> > +#include "odb.h"
> >  #include "odb/source-inmemory.h"
> > +#include "repository.h"
> > +
> > +static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
> > +						      const struct object_id *oid)
> > +{
> > +	static const struct cached_object empty_tree = {
> > +		.type = OBJ_TREE,
> > +		.buf = "",
> > +	};
> > +	const struct cached_object_entry *co = source->objects;
> > +
> > +	for (size_t i = 0; i < source->objects_nr; i++, co++)
> > +		if (oideq(&co->oid, oid))
> > +			return &co->value;
> > +
> > +	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
> > +		return &empty_tree;
> > +
> 
> Silly question: would it make more sense to check for empty_tree before
> iterating over all objects?
> 
> The rest looks good

Maybe? I guess for now reading the empty tree is the most important use
case we have for the in-memory backend, as we only write in-memory
objects in a single caller. On the other hand, `source->objects_nr`
would be zero in all the other cases, and jumping over the loop should
be fast enough to not matter in practice.
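
To make the ordering concrete, here is a simplified standalone sketch of
the lookup. The `toy_*` types and the 4-byte object ID are assumptions
made purely for illustration and do not match Git's real structures.

```c
#include <stddef.h>
#include <string.h>

#define TOY_ID_LEN 4 /* toy object ID length, for illustration only */

struct toy_object {
	const char *buf;
	size_t size;
};

struct toy_entry {
	unsigned char id[TOY_ID_LEN];
	struct toy_object value;
};

struct toy_source {
	struct toy_entry *objects;
	size_t objects_nr;
};

/* Hypothetical stand-in for the hash algorithm's empty tree ID. */
static const unsigned char toy_empty_tree_id[TOY_ID_LEN] = {
	0x4b, 0x82, 0x5d, 0xc6
};

static const struct toy_object *toy_find(const struct toy_source *source,
					 const unsigned char *id)
{
	static const struct toy_object empty_tree = { "", 0 };

	/* With objects_nr == 0 this loop body never runs. */
	for (size_t i = 0; i < source->objects_nr; i++)
		if (!memcmp(source->objects[i].id, id, TOY_ID_LEN))
			return &source->objects[i].value;

	/* Fall back to the built-in empty tree after the scan. */
	if (!memcmp(id, toy_empty_tree_id, TOY_ID_LEN))
		return &empty_tree;

	return NULL;
}
```

For a source without any written objects the scan costs a single failed
loop condition, so checking the empty tree first would only save that
one comparison.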

An alternative I was thinking about is to store the empty tree the same
way as we store all the other objects so that we don't have to special
case anything. That has the benefit that we can actually modify the tree
object, too, which may eventually become relevant with regards to an
object's mtime that we may want to update. The downside is that we have
another allocation here and need to eagerly initialize the data
structure that stores the objects.

Patrick

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 05/17] odb/source-inmemory: implement `read_object_stream()` callback
  2026-04-09  9:49     ` Karthik Nayak
@ 2026-04-09 10:41       ` Patrick Steinhardt
  0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 10:41 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, Junio C Hamano, Justin Tobler

On Thu, Apr 09, 2026 at 05:49:32AM -0400, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Implement the `read_object_stream()` callback function for the in-memory
> > source.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  odb/source-inmemory.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 50 insertions(+)
> >
> > diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> > index 12c80f9b34..4a68169430 100644
> > --- a/odb/source-inmemory.c
> > +++ b/odb/source-inmemory.c
> > @@ -53,6 +54,54 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
> >  	return 0;
> >  }
> >
> > +struct odb_read_stream_inmemory {
> > +	struct odb_read_stream base;
> > +	const void *buf;
> > +	size_t offset;
> > +};
> > +
> 
> To stream objects, we have a new structure which is used in the callback.
> 
> > +static ssize_t odb_read_stream_inmemory_read(struct odb_read_stream *stream,
> > +					     char *buf, size_t buf_len)
> > +{
> > +	struct odb_read_stream_inmemory *inmemory =
> > +		container_of(stream, struct odb_read_stream_inmemory, base);
> > +	size_t bytes = buf_len;
> 
> 
> 
> > +	if (buf_len > inmemory->base.size - inmemory->offset)
> > +		bytes = inmemory->base.size - inmemory->offset;
> > +	memcpy(buf, inmemory->buf, bytes);
> > +
> 
> Shouldn't the offset also be set and we only memcpy offset onwards?

Oh, good catch. We don't have any users of this API yet, which is why it
went undetected. Will fix, thanks.
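
Roughly, the read function needs to copy from the current offset and
then advance it. A minimal standalone sketch of that behaviour, using a
hypothetical `struct toy_stream` instead of the real
`struct odb_read_stream_inmemory`, and not necessarily what the re-roll
will contain:

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical, simplified stand-in for the in-memory read stream. */
struct toy_stream {
	const char *buf; /* object contents, not owned by the stream */
	size_t size;     /* total number of bytes in buf */
	size_t offset;   /* number of bytes already handed out */
};

/* Copy up to buf_len bytes starting at the offset, then advance it. */
static ssize_t toy_stream_read(struct toy_stream *stream,
			       char *buf, size_t buf_len)
{
	size_t remaining = stream->size - stream->offset;
	size_t bytes = buf_len < remaining ? buf_len : remaining;

	if (bytes)
		memcpy(buf, stream->buf + stream->offset, bytes);
	stream->offset += bytes;

	return (ssize_t)bytes;
}
```

Repeated calls then walk through the buffer and return 0 once
everything has been read, instead of handing out the first bytes over
and over again.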

Patrick

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 07/17] odb/source-inmemory: implement `write_object()` callback
  2026-04-09 10:27     ` Karthik Nayak
@ 2026-04-09 10:41       ` Patrick Steinhardt
  0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 10:41 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, Junio C Hamano, Justin Tobler

On Thu, Apr 09, 2026 at 06:27:27AM -0400, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Implement the `write_object()` callback function for the in-memory
> > source.
> >
> 
> Rebase error? This seems to be the same commit message as the last commit.

I saw the empty new commit in the range diff, but somehow didn't get
what was happening. But yes, this obviously needs to be squashed into
the preceding commit, thanks!

Patrick

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback
  2026-04-09 10:41       ` Patrick Steinhardt
@ 2026-04-09 11:22         ` Karthik Nayak
  0 siblings, 0 replies; 85+ messages in thread
From: Karthik Nayak @ 2026-04-09 11:22 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 2148 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> On Thu, Apr 09, 2026 at 05:40:01AM -0400, Karthik Nayak wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>> > diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
>> > index ccbb622eae..12c80f9b34 100644
>> > --- a/odb/source-inmemory.c
>> > +++ b/odb/source-inmemory.c
>> > @@ -1,5 +1,57 @@
>> >  #include "git-compat-util.h"
>> > +#include "odb.h"
>> >  #include "odb/source-inmemory.h"
>> > +#include "repository.h"
>> > +
>> > +static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
>> > +						      const struct object_id *oid)
>> > +{
>> > +	static const struct cached_object empty_tree = {
>> > +		.type = OBJ_TREE,
>> > +		.buf = "",
>> > +	};
>> > +	const struct cached_object_entry *co = source->objects;
>> > +
>> > +	for (size_t i = 0; i < source->objects_nr; i++, co++)
>> > +		if (oideq(&co->oid, oid))
>> > +			return &co->value;
>> > +
>> > +	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
>> > +		return &empty_tree;
>> > +
>>
>> Silly question: would it make more sense to check for empty_tree before
>> iterating over all objects?
>>
>> The rest looks good
>
> Maybe? I guess for now reading the empty tree is the most important use
> case we have for the in-memory backend, as we only write in-memory
> objects in a single caller. On the other hand, `source->objects_nr`
> would be zero in all the other cases, and jumping over the loop should
> be fast enough to not matter in practice.
>

That was what I understood, okay so it's fine as is.

> An alternative I was thinking about is to store the empty tree the same
> way as we store all the other objects so that we don't have to special
> case anything. That has the benefit that we can actually modify the tree
> object, too, which may eventually become relevant with regards to an
> object's mtime that we may want to update. The downside is that we have
> another allocation here and need to eagerly initialize the data
> structure that stores the objects.
>
> Patrick

That would be good too, but I also think maybe it is fine to just leave
it. This is simple enough.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 09/17] cbtree: allow using arbitrary wrapper structures for nodes
  2026-04-09  7:24   ` [PATCH v2 09/17] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
@ 2026-04-09 11:36     ` Karthik Nayak
  2026-04-09 11:46       ` Patrick Steinhardt
  0 siblings, 1 reply; 85+ messages in thread
From: Karthik Nayak @ 2026-04-09 11:36 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 428 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

[snip]

> diff --git a/cbtree.h b/cbtree.h
> index c374b1b3db..3ce0d6b287 100644
> --- a/cbtree.h
> +++ b/cbtree.h
> @@ -23,18 +23,19 @@ struct cb_node {
>  	 */
>  	uint32_t byte;
>  	uint8_t otherbits;
> -	uint8_t k[FLEX_ARRAY]; /* arbitrary data, unaligned */
>  };
>

Seems like we need to update the comments at the top of the header file
which still talks about this field.

[snip]

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 00/17] odb: introduce "in-memory" source
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
                     ` (16 preceding siblings ...)
  2026-04-09  7:24   ` [PATCH v2 17/17] odb: generic in-memory source Patrick Steinhardt
@ 2026-04-09 11:44   ` Karthik Nayak
  2026-04-09 11:48     ` Patrick Steinhardt
  17 siblings, 1 reply; 85+ messages in thread
From: Karthik Nayak @ 2026-04-09 11:44 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 1560 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> Hi,
>
> this patch series introduces the second object database source type,
> which is the "in-memory" source.
>
> This source may seem somewhat odd at first: it always starts out empty,
> and any object written into it will only exist in memory until the
> process exits. But the source already serves a purpose in our codebase,
> where some commands, for example git-blame(1), write an in-memory
> worktree commit.
>
> Furthermore, I think that going forward it can serve more purposes as we
> now have an easy way to write and read objects that will not get
> persisted. I could see that this may be useful when for example
> re-merging diffs. But eventually, once we have the object storage format
> extension wired up, callers might even want to manually set up an
> in-memory database as the primary ODB for write operations so that no
> data will be persisted in an arbitrary write.
>
> Last but not least, this patch series also serves the purpose of
> eventually getting rid of the `struct object_info::whence` member.
> Instead, we'll simply yield the ODB source a specific object has been
> read from, together with some backend-specific data, which gives
> strictly more information compared to the status quo.
>
> The series is based on b15384c06f (A bit more post -rc1, 2026-04-08)
> with jt/odb-transaction-write at ddf6aee9c6 (odb/transaction: make
> `write_object_stream()` pluggable, 2026-04-02) merged into it.
>

Was a nice read, only a few comments from me. Should be good with a
re-roll!

[snip]

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 09/17] cbtree: allow using arbitrary wrapper structures for nodes
  2026-04-09 11:36     ` Karthik Nayak
@ 2026-04-09 11:46       ` Patrick Steinhardt
  0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 11:46 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, Junio C Hamano, Justin Tobler

On Thu, Apr 09, 2026 at 07:36:50AM -0400, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> [snip]
> 
> > diff --git a/cbtree.h b/cbtree.h
> > index c374b1b3db..3ce0d6b287 100644
> > --- a/cbtree.h
> > +++ b/cbtree.h
> > @@ -23,18 +23,19 @@ struct cb_node {
> >  	 */
> >  	uint32_t byte;
> >  	uint8_t otherbits;
> > -	uint8_t k[FLEX_ARRAY]; /* arbitrary data, unaligned */
> >  };
> >
> 
> Seems like we need to update the comments at the top of the header file
> which still talks about this field.

Good eyes, will adapt.

Patrick

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 00/17] odb: introduce "in-memory" source
  2026-04-09 11:44   ` [PATCH v2 00/17] odb: introduce "in-memory" source Karthik Nayak
@ 2026-04-09 11:48     ` Patrick Steinhardt
  0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 11:48 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, Junio C Hamano, Justin Tobler

On Thu, Apr 09, 2026 at 07:44:17AM -0400, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Hi,
> >
> > this patch series introduces the second object database source type,
> > which is the "in-memory" source.
> >
> > This source may seem somewhat odd at first: it always starts out empty,
> > and any object written into it will only exist in memory until the
> > process exits. But the source already serves a purpose in our codebase,
> > where some commands, for example git-blame(1), write an in-memory
> > worktree commit.
> >
> > Furthermore, I think that going forward it can serve more purposes as we
> > now have an easy way to write and read objects that will not get
> > persisted. I could see that this may be useful when for example
> > re-merging diffs. But eventually, once we have the object storage format
> > extension wired up, callers might even want to manually set up an
> > in-memory database as the primary ODB for write operations so that no
> > data will be persisted in an arbitrary write.
> >
> > Last but not least, this patch series also serves the purpose of
> > eventually getting rid of the `struct object_info::whence` member.
> > Instead, we'll simply yield the ODB source a specific object has been
> > read from, together with some backend-specific data, which gives
> > strictly more information compared to the status quo.
> >
> > The series is based on b15384c06f (A bit more post -rc1, 2026-04-08)
> > with jt/odb-transaction-write at ddf6aee9c6 (odb/transaction: make
> > `write_object_stream()` pluggable, 2026-04-02) merged into it.
> >
> 
> Was a nice read, only a few comments from me. Should be good with a
> re-roll!

Thanks! Will send the new version tomorrow to wait for some more
feedback.

Patrick

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 00/16] odb: introduce "inmemory" source
  2026-04-09  5:22       ` Patrick Steinhardt
@ 2026-04-09 13:46         ` Junio C Hamano
  2026-04-10  4:53           ` Patrick Steinhardt
  0 siblings, 1 reply; 85+ messages in thread
From: Junio C Hamano @ 2026-04-09 13:46 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

>> But stepping back a bit, does this new "in memory" refer to a
>> concept that is different from what the rest of the system uses "in
>> core" to represent?
>
> No, in principle it's not any different. One of the reasons I decided to
> go with "in memory" though is that this backend may eventually be
> (power-)user-facing via the planned "objectStorage" extension.

Doesn't 

    git grep -E -e 'in[- ]?core' -- ':!Documentation/RelNotes' ':!t'

give many hits that we want to be in line with in the codebase
anyway, and even in some user-facing things?  I just noticed an
option "--no-kept-objects=in-core" (which I didn't know about ;-).

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 16/17] odb/source-inmemory: stub out remaining functions
  2026-04-09  7:24   ` [PATCH v2 16/17] odb/source-inmemory: stub out remaining functions Patrick Steinhardt
@ 2026-04-09 19:39     ` Junio C Hamano
  2026-04-10  4:53       ` Patrick Steinhardt
  0 siblings, 1 reply; 85+ messages in thread
From: Junio C Hamano @ 2026-04-09 19:39 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> Stub out remaining functions that we either don't need or that are
> basically no-ops.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  odb/source-inmemory.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
>
> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> index 15a6a5ae64..1140b1b916 100644
> --- a/odb/source-inmemory.c
> +++ b/odb/source-inmemory.c
> @@ -299,6 +299,32 @@ static int odb_source_inmemory_freshen_object(struct odb_source *source,
>  	return 0;
>  }
>  
> +static int odb_source_inmemory_begin_transaction(struct odb_source *source UNUSED,
> +						 struct odb_transaction **out UNUSED)
> +{
> +	return error("inmemory source does not support transactions");
> +}
> +
> +static int odb_source_inmemory_read_alternates(struct odb_source *source UNUSED,
> +					       struct strvec *out UNUSED)
> +{
> +	return 0;
> +}
> +
> +static int odb_source_inmemory_write_alternate(struct odb_source *source UNUSED,
> +					       const char *alternate UNUSED)
> +{
> +	return error("inmemory source does not support alternates");
> +}

OK, 00/17 said it only fixed log message, but the messages or
anything end-user facing should consistently say "in-memory".

Or, "in-core", if "incore" is chosen as part of identifiers to be
consistent with the rest of the system.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 00/16] odb: introduce "inmemory" source
  2026-04-09 13:46         ` Junio C Hamano
@ 2026-04-10  4:53           ` Patrick Steinhardt
  0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10  4:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Apr 09, 2026 at 06:46:30AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> >> But stepping back a bit, does this new "in memory" refer to a
> >> concept that is different from what the rest of the system uses "in
> >> core" to represent?
> >
> > No, in principle it's not any different. One of the reasons I decided to
> > go with "in memory" though is that this backend may eventually be
> > (power-)user-facing via the planned "objectStorage" extension.
> 
> Doesn't 
> 
>     git grep -E -e 'in[- ]?core' -- ':!Documentation/RelNotes' ':!t'
> 
> give many hits that we want to be in line with in the codebase
> anyway, and even in some user-facing things?  I just noticed an
> option "--no-kept-objects=in-core" (which I didn't know about ;-).

Most of the hits are in our code though, and end users wouldn't
typically see those. So what I think is more relevant is documentation
or options like the one you pointed out. But "--no-kept-objects=" is not
even documented, which basically leaves us with the following hits:

  Documentation/gitformat-pack.adoc:write a cruft pack. Crucially, the set of in-core kept packs is exactly the set
  Documentation/technical/parallel-checkout.adoc:parallelize the work of uncompressing the blobs, applying in-core
  Documentation/technical/racy-git.adoc:because in-core timestamps can have finer granularity than
  Documentation/technical/racy-git.adoc:([PATCH] Sync in core time granularity with filesystems,

I think that these hits are all related to what we're doing here, as
we're talking about object data that we handle in-core.

I initially said "it's not any different", but thinking a bit more about
it I think there is a slight difference: in-core could be any object
that we have parsed from the object database, even if it's backed by an
actual on-disk object. In-memory objects may not even have been parsed
at all, so technically speaking they may not even be in-core.

So in summary:

  - I think that end users have not really been exposed to the concept
    of "in-core".

  - The concepts of "in-core" and the ODB source here are slightly
    different, as any object that has been parsed is treated as
    "in-core".

  - The concept of "in-memory" is easier for the end user to understand
    in the context of the ODB, as it's a more general concept compared
    to the very Git-specific "in-core" term.

Hope that makes sense :)

Thanks!

Patrick

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH v2 16/17] odb/source-inmemory: stub out remaining functions
  2026-04-09 19:39     ` Junio C Hamano
@ 2026-04-10  4:53       ` Patrick Steinhardt
  0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10  4:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Justin Tobler

On Thu, Apr 09, 2026 at 12:39:00PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Stub out remaining functions that we either don't need or that are
> > basically no-ops.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  odb/source-inmemory.c | 31 +++++++++++++++++++++++++++++++
> >  1 file changed, 31 insertions(+)
> >
> > diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> > index 15a6a5ae64..1140b1b916 100644
> > --- a/odb/source-inmemory.c
> > +++ b/odb/source-inmemory.c
> > @@ -299,6 +299,32 @@ static int odb_source_inmemory_freshen_object(struct odb_source *source,
> >  	return 0;
> >  }
> >  
> > +static int odb_source_inmemory_begin_transaction(struct odb_source *source UNUSED,
> > +						 struct odb_transaction **out UNUSED)
> > +{
> > +	return error("inmemory source does not support transactions");
> > +}
> > +
> > +static int odb_source_inmemory_read_alternates(struct odb_source *source UNUSED,
> > +					       struct strvec *out UNUSED)
> > +{
> > +	return 0;
> > +}
> > +
> > +static int odb_source_inmemory_write_alternate(struct odb_source *source UNUSED,
> > +					       const char *alternate UNUSED)
> > +{
> > +	return error("inmemory source does not support alternates");
> > +}
> 
> OK, 00/17 said it only fixed log message, but the messages or
> anything end-user facing should consistently say "in-memory".
> 
> Or, "in-core", if "incore" is chosen as part of identifiers to be
> consistent with the rest of the system.

Good catch, will fix. Thanks!

Patrick

^ permalink raw reply	[flat|nested] 85+ messages in thread

* [PATCH v3 00/17] odb: introduce "in-memory" source
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (17 preceding siblings ...)
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
@ 2026-04-10 12:12 ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 01/17] " Patrick Steinhardt
                     ` (17 more replies)
  18 siblings, 18 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Hi,

this patch series introduces the second object database source type,
which is the "in-memory" source.

This source may seem somewhat odd at first: it always starts out empty,
and any object written into it will only exist in memory until the
process exits. But the source already serves a purpose in our codebase,
where some commands, for example git-blame(1), write an in-memory
worktree commit.

Furthermore, I think that going forward it can serve more purposes as we
now have an easy way to write and read objects that will not get
persisted. I could see that this may be useful when for example
re-merging diffs. But eventually, once we have the object storage format
extension wired up, callers might even want to manually set up an
in-memory database as the primary ODB for write operations so that no
data will be persisted in an arbitrary write.

Last but not least, this patch series also serves the purpose of
eventually getting rid of the `struct object_info::whence` member.
Instead, we'll simply yield the ODB source a specific object has been
read from, together with some backend-specific data, which gives
strictly more information compared to the status quo.

The series is based on b15384c06f (A bit more post -rc1, 2026-04-08)
with jt/odb-transaction-write at ddf6aee9c6 (odb/transaction: make
`write_object_stream()` pluggable, 2026-04-02) merged into it.

Changes in v2:
  - Fix handling of object IDs when writing objects.
  - I've changed the base of this series to include Justin's
    refactorings for the ODB write streams. I've updated the above
    paragraph detailing the merge base accordingly. @Junio: I'm fine to
    defer this patch series a bit until Justin's patch series has been
    merged to `next` in case this causes inconvenience.
  - Use "in-memory" instead of "inmemory" in commit messages.
  - Link to v1: https://patch.msgid.link/20260403-b4-pks-odb-source-inmemory-v1-0-8b8d1abaa25e@pks.im

Changes in v3:
  - Fix a couple more instances where we were saying "inmemory" in
    prose.
  - Fix streaming interface when reading an object.
  - Add unit tests to exercise full functionality of the new source.
    Some of the functionality isn't exercised in our code base yet, so
    this allows us to verify that things work as expected.
  - Link to v2: https://patch.msgid.link/20260409-b4-pks-odb-source-inmemory-v2-0-f02b4f1c0f13@pks.im

Thanks!

Patrick

---
Patrick Steinhardt (17):
      odb: introduce "in-memory" source
      odb/source-inmemory: implement `free()` callback
      odb: fix unnecessary call to `find_cached_object()`
      odb/source-inmemory: implement `read_object_info()` callback
      odb/source-inmemory: implement `read_object_stream()` callback
      odb/source-inmemory: implement `write_object()` callback
      odb/source-inmemory: implement `write_object_stream()` callback
      cbtree: allow using arbitrary wrapper structures for nodes
      oidtree: add ability to store data
      odb/source-inmemory: convert to use oidtree
      odb/source-inmemory: implement `for_each_object()` callback
      odb/source-inmemory: implement `find_abbrev_len()` callback
      odb/source-inmemory: implement `count_objects()` callback
      odb/source-inmemory: implement `freshen_object()` callback
      odb/source-inmemory: stub out remaining functions
      odb: generic in-memory source
      t/unit-tests: add tests for the in-memory object source

 Makefile                      |   2 +
 cbtree.c                      |  25 ++-
 cbtree.h                      |  17 +-
 loose.c                       |   2 +-
 meson.build                   |   1 +
 object-file.c                 |   3 +-
 odb.c                         |  82 ++-------
 odb.h                         |   4 +-
 odb/source-inmemory.c         | 382 ++++++++++++++++++++++++++++++++++++++++++
 odb/source-inmemory.h         |  33 ++++
 odb/source.h                  |   3 +
 oidtree.c                     |  66 +++++---
 oidtree.h                     |  12 +-
 t/meson.build                 |   1 +
 t/unit-tests/u-odb-inmemory.c | 313 ++++++++++++++++++++++++++++++++++
 t/unit-tests/u-oidtree.c      |  26 ++-
 16 files changed, 854 insertions(+), 118 deletions(-)

Range-diff versus v2:

 1:  b18e427c69 !  1:  155b2cdf81 odb: introduce "in-memory" source
    @@ odb/source-inmemory.h (new)
     +struct cached_object_entry;
     +
     +/*
    -+ * An inmemory source that you can write objects to that shall be made
    ++ * An in-memory source that you can write objects to that shall be made
     + * available for reading, but that shouldn't ever be persisted to disk. Note
     + * that any objects written to this source will be stored in memory, so the
     + * number of objects you can store is limited by available system memory.
    @@ odb/source-inmemory.h (new)
     +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb);
     +
     +/*
    -+ * Cast the given object database source to the inmemory backend. This will
    ++ * Cast the given object database source to the in-memory backend. This will
     + * cause a BUG in case the source doesn't use this backend.
     + */
     +static inline struct odb_source_inmemory *odb_source_inmemory_downcast(struct odb_source *source)
     +{
     +	if (source->type != ODB_SOURCE_INMEMORY)
    -+		BUG("trying to downcast source of type '%d' to inmemory", source->type);
    ++		BUG("trying to downcast source of type '%d' to in-memory", source->type);
     +	return container_of(source, struct odb_source_inmemory, base);
     +}
     +
    @@ odb/source.h: enum odb_source_type {
      	/* The "files" backend that uses loose objects and packfiles. */
      	ODB_SOURCE_FILES,
     +
    -+	/* The "inmemory" backend that stores objects in memory. */
    ++	/* The "in-memory" backend that stores objects in memory. */
     +	ODB_SOURCE_INMEMORY,
      };
      
 2:  8fd337da90 !  2:  c66edd10a8 odb/source-inmemory: implement `free()` callback
    @@ odb/source-inmemory.h
     +};
      
      /*
    -  * An inmemory source that you can write objects to that shall be made
    +  * An in-memory source that you can write objects to that shall be made
 3:  f4ae2a2bde =  3:  a86549f39c odb: fix unnecessary call to `find_cached_object()`
 4:  8600b88530 =  4:  49ac739dd2 odb/source-inmemory: implement `read_object_info()` callback
 5:  ab33c0b7ee !  5:  321ef11be3 odb/source-inmemory: implement `read_object_stream()` callback
    @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_info(struct od
      
     +struct odb_read_stream_inmemory {
     +	struct odb_read_stream base;
    -+	const void *buf;
    ++	const unsigned char *buf;
     +	size_t offset;
     +};
     +
    @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_info(struct od
     +
     +	if (buf_len > inmemory->base.size - inmemory->offset)
     +		bytes = inmemory->base.size - inmemory->offset;
    -+	memcpy(buf, inmemory->buf, bytes);
    ++
    ++	memcpy(buf, inmemory->buf + inmemory->offset, bytes);
    ++	inmemory->offset += bytes;
     +
     +	return bytes;
     +}
 6:  983f886eeb !  6:  506df5e488 odb/source-inmemory: implement `write_object()` callback
    @@ odb.c: int odb_pretend_object(struct object_database *odb,
      void *odb_read_object(struct object_database *odb,
     
      ## odb/source-inmemory.c ##
    +@@
    + #include "git-compat-util.h"
    ++#include "object-file.h"
    + #include "odb.h"
    + #include "odb/source-inmemory.h"
    + #include "odb/streaming.h"
     @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
      	return 0;
      }
    @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_stream(struct
     +	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
     +	struct cached_object_entry *object;
     +
    ++	hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid);
    ++
     +	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
     +		   inmemory->objects_alloc);
     +	object = &inmemory->objects[inmemory->objects_nr++];
 7:  68edefa269 <  -:  ---------- odb/source-inmemory: implement `write_object()` callback
 8:  18d451152b !  7:  21eef34c1b odb/source-inmemory: implement `write_object_stream()` callback
    @@ odb/source-inmemory.c: static int odb_source_inmemory_write_object(struct odb_so
     +			goto out;
     +		}
     +
    -+		memcpy(data, buf, bytes_read);
    ++		memcpy(data + total_read, buf, bytes_read);
     +		total_read += bytes_read;
     +	}
     +
 9:  cee53b9853 !  8:  504e34d116 cbtree: allow using arbitrary wrapper structures for nodes
    @@ cbtree.c: int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen,
      
     
      ## cbtree.h ##
    +@@
    +  *
    +  * This is adapted to store arbitrary data (not just NUL-terminated C strings
    +  * and allocates no memory internally.  The user needs to allocate
    +- * "struct cb_node" and fill cb_node.k[] with arbitrary match data
    +- * for memcmp.
    +- * If "klen" is variable, then it should be embedded into "c_node.k[]"
    ++ * "struct cb_node" and provide `key_offset` to indicate where the key can be
    ++ * found relative to the `struct cb_node` for memcmp.
    ++ * If "klen" is variable, then it should be embedded into the key.
    +  * Recursion is bound by the maximum value of "klen" used.
    +  */
    + #ifndef CBTREE_H
     @@ cbtree.h: struct cb_node {
      	 */
      	uint32_t byte;
10:  8ad5b81b13 =  9:  9bdd475a92 oidtree: add ability to store data
11:  1ed2d23137 ! 10:  956b989529 odb/source-inmemory: convert to use oidtree
    @@ odb/source-inmemory.h
     +struct oidtree;
      
      /*
    -  * An inmemory source that you can write objects to that shall be made
    +  * An in-memory source that you can write objects to that shall be made
     @@ odb/source-inmemory.h: struct cached_object_entry {
       */
      struct odb_source_inmemory {
12:  99fbb1cc35 ! 11:  bec1428116 odb/source-inmemory: implement `for_each_object()` callback
    @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_stream(struct
     +	if ((opts->flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY) ||
     +	    (opts->flags & ODB_FOR_EACH_OBJECT_LOCAL_ONLY && !source->local))
     +		return 0;
    ++	if (!inmemory->objects)
    ++		return 0;
     +
     +	return oidtree_each(inmemory->objects,
     +			    opts->prefix ? opts->prefix : &null_oid, opts->prefix_hex_len,
13:  c87a621f39 = 12:  32dada3c27 odb/source-inmemory: implement `find_abbrev_len()` callback
14:  9b88f0c07b = 13:  43127840c0 odb/source-inmemory: implement `count_objects()` callback
15:  3c9493f2bb = 14:  439acbd068 odb/source-inmemory: implement `freshen_object()` callback
16:  f2b6317104 ! 15:  12c1b6ffd2 odb/source-inmemory: stub out remaining functions
    @@ odb/source-inmemory.c: static int odb_source_inmemory_freshen_object(struct odb_
     +static int odb_source_inmemory_begin_transaction(struct odb_source *source UNUSED,
     +						 struct odb_transaction **out UNUSED)
     +{
    -+	return error("inmemory source does not support transactions");
    ++	return error("in-memory source does not support transactions");
     +}
     +
     +static int odb_source_inmemory_read_alternates(struct odb_source *source UNUSED,
    @@ odb/source-inmemory.c: static int odb_source_inmemory_freshen_object(struct odb_
     +static int odb_source_inmemory_write_alternate(struct odb_source *source UNUSED,
     +					       const char *alternate UNUSED)
     +{
    -+	return error("inmemory source does not support alternates");
    ++	return error("in-memory source does not support alternates");
     +}
     +
     +static void odb_source_inmemory_close(struct odb_source *source UNUSED)
17:  81da5d5048 = 16:  ef37a61e7f odb: generic in-memory source
 -:  ---------- > 17:  51b51e0382 t/unit-tests: add tests for the in-memory object source

---
base-commit: a3ebc5a08e67ccac4c915622049a968a31e48662
change-id: 20260401-b4-pks-odb-source-inmemory-7b17c83d9e43


* [PATCH v3 01/17] odb: introduce "in-memory" source
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 02/17] odb/source-inmemory: implement `free()` callback Patrick Steinhardt
                     ` (16 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Next to our typical object database sources, each object database also
has an implicit source of "cached" objects. These cached objects only
exist in memory and serve two use cases:

  - They contain evergreen objects that we expect to always exist, like
    for example the empty tree.

  - They can be used to store temporary objects that we don't want to
    persist to disk, which is used by git-blame(1) to create a fake
    worktree commit.

Overall, their use is somewhat restricted though. For example, we don't
provide the ability to use them as a temporary object database source
that lets the user write objects and then discards them once Git exits.
So while these cached objects behave almost like a source, they aren't
used as one.

This is about to change over the following commits, where we will turn
cached objects into a new "in-memory" source. This will allow us to use
it exactly the same as any other source by providing the same common
interface as the "files" source.

For now, the in-memory source only hosts the cached objects and doesn't
provide any logic yet. This will change with subsequent commits, where
we move respective functionality into the source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
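For illustration only, here is a minimal standalone sketch of the idea
described above: objects live in a growable array, get looked up with a
linear scan, and vanish once the store is released. All `toy_*` names
are hypothetical, object IDs are plain hex strings supplied by the
caller, and nothing here is Git's actual API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for `struct cached_object_entry`. */
struct toy_entry {
	char id[41];   /* NUL-terminated hex ID supplied by the caller */
	void *buf;     /* private copy of the object contents */
	size_t size;
};

/* Hypothetical stand-in for `struct odb_source_inmemory`. */
struct toy_inmemory_source {
	struct toy_entry *objects;
	size_t objects_nr, objects_alloc;
};

/* Look up an object by ID with a linear scan; returns NULL when missing. */
static const struct toy_entry *toy_find(const struct toy_inmemory_source *src,
					const char *id)
{
	for (size_t i = 0; i < src->objects_nr; i++)
		if (!strcmp(src->objects[i].id, id))
			return &src->objects[i];
	return NULL;
}

/* Store a copy of `buf`; returns 0 on success and -1 on failure. */
static int toy_write(struct toy_inmemory_source *src, const char *id,
		     const void *buf, size_t size)
{
	struct toy_entry *entry;
	size_t id_len = strlen(id);

	if (id_len >= sizeof(entry->id))
		return -1;
	if (toy_find(src, id))
		return 0; /* already present, nothing to do */

	/* Grow the array geometrically when it is full. */
	if (src->objects_nr == src->objects_alloc) {
		size_t alloc = src->objects_alloc ? 2 * src->objects_alloc : 4;
		struct toy_entry *grown = realloc(src->objects,
						  alloc * sizeof(*grown));
		if (!grown)
			return -1;
		src->objects = grown;
		src->objects_alloc = alloc;
	}

	entry = &src->objects[src->objects_nr];
	entry->buf = malloc(size ? size : 1);
	if (!entry->buf)
		return -1;
	if (size)
		memcpy(entry->buf, buf, size);
	entry->size = size;
	memcpy(entry->id, id, id_len + 1);
	src->objects_nr++;
	return 0;
}

/* Drop all objects; nothing was ever written to disk. */
static void toy_release(struct toy_inmemory_source *src)
{
	for (size_t i = 0; i < src->objects_nr; i++)
		free(src->objects[i].buf);
	free(src->objects);
	memset(src, 0, sizeof(*src));
}
```

A caller would zero-initialize a `struct toy_inmemory_source`, call
`toy_write()` and `toy_find()` as needed, and finish with
`toy_release()`.
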
 Makefile              |  1 +
 meson.build           |  1 +
 odb.c                 | 21 +++++++++++++--------
 odb.h                 |  4 ++--
 odb/source-inmemory.c | 12 ++++++++++++
 odb/source-inmemory.h | 35 +++++++++++++++++++++++++++++++++++
 odb/source.h          |  3 +++
 7 files changed, 67 insertions(+), 10 deletions(-)

diff --git a/Makefile b/Makefile
index 22a8993482..3cda12c455 100644
--- a/Makefile
+++ b/Makefile
@@ -1218,6 +1218,7 @@ LIB_OBJS += object.o
 LIB_OBJS += odb.o
 LIB_OBJS += odb/source.o
 LIB_OBJS += odb/source-files.o
+LIB_OBJS += odb/source-inmemory.o
 LIB_OBJS += odb/streaming.o
 LIB_OBJS += odb/transaction.o
 LIB_OBJS += oid-array.o
diff --git a/meson.build b/meson.build
index 6dc23b3af2..ffa73ce7ce 100644
--- a/meson.build
+++ b/meson.build
@@ -404,6 +404,7 @@ libgit_sources = [
   'odb.c',
   'odb/source.c',
   'odb/source-files.c',
+  'odb/source-inmemory.c',
   'odb/streaming.c',
   'odb/transaction.c',
   'oid-array.c',
diff --git a/odb.c b/odb.c
index 40a5e9c4e0..60e1eead25 100644
--- a/odb.c
+++ b/odb.c
@@ -14,6 +14,7 @@
 #include "object-file.h"
 #include "object-name.h"
 #include "odb.h"
+#include "odb/source-inmemory.h"
 #include "packfile.h"
 #include "path.h"
 #include "promisor-remote.h"
@@ -53,9 +54,9 @@ static const struct cached_object *find_cached_object(struct object_database *ob
 		.type = OBJ_TREE,
 		.buf = "",
 	};
-	const struct cached_object_entry *co = object_store->cached_objects;
+	const struct cached_object_entry *co = object_store->inmemory_objects->objects;
 
-	for (size_t i = 0; i < object_store->cached_object_nr; i++, co++)
+	for (size_t i = 0; i < object_store->inmemory_objects->objects_nr; i++, co++)
 		if (oideq(&co->oid, oid))
 			return &co->value;
 
@@ -792,9 +793,10 @@ int odb_pretend_object(struct object_database *odb,
 	    find_cached_object(odb, oid))
 		return 0;
 
-	ALLOC_GROW(odb->cached_objects,
-		   odb->cached_object_nr + 1, odb->cached_object_alloc);
-	co = &odb->cached_objects[odb->cached_object_nr++];
+	ALLOC_GROW(odb->inmemory_objects->objects,
+		   odb->inmemory_objects->objects_nr + 1,
+		   odb->inmemory_objects->objects_alloc);
+	co = &odb->inmemory_objects->objects[odb->inmemory_objects->objects_nr++];
 	co->value.size = len;
 	co->value.type = type;
 	co_buf = xmalloc(len);
@@ -1083,6 +1085,7 @@ struct object_database *odb_new(struct repository *repo,
 	o->sources = odb_source_new(o, primary_source, true);
 	o->sources_tail = &o->sources->next;
 	o->alternate_db = xstrdup_or_null(secondary_sources);
+	o->inmemory_objects = odb_source_inmemory_new(o);
 
 	free(to_free);
 
@@ -1123,9 +1126,11 @@ void odb_free(struct object_database *o)
 	odb_close(o);
 	odb_free_sources(o);
 
-	for (size_t i = 0; i < o->cached_object_nr; i++)
-		free((char *) o->cached_objects[i].value.buf);
-	free(o->cached_objects);
+	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
+		free((char *) o->inmemory_objects->objects[i].value.buf);
+	free(o->inmemory_objects->objects);
+	free(o->inmemory_objects->base.path);
+	free(o->inmemory_objects);
 
 	string_list_clear(&o->submodule_source_paths, 0);
 
diff --git a/odb.h b/odb.h
index 9eb8355aca..c3a7edf9c8 100644
--- a/odb.h
+++ b/odb.h
@@ -8,6 +8,7 @@
 #include "thread-utils.h"
 
 struct cached_object_entry;
+struct odb_source_inmemory;
 struct packed_git;
 struct repository;
 struct strbuf;
@@ -80,8 +81,7 @@ struct object_database {
 	 * to write them into the object store (e.g. a browse-only
 	 * application).
 	 */
-	struct cached_object_entry *cached_objects;
-	size_t cached_object_nr, cached_object_alloc;
+	struct odb_source_inmemory *inmemory_objects;
 
 	/*
 	 * A fast, rough count of the number of objects in the repository.
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
new file mode 100644
index 0000000000..c7ac5c24f0
--- /dev/null
+++ b/odb/source-inmemory.c
@@ -0,0 +1,12 @@
+#include "git-compat-util.h"
+#include "odb/source-inmemory.h"
+
+struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
+{
+	struct odb_source_inmemory *source;
+
+	CALLOC_ARRAY(source, 1);
+	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
+
+	return source;
+}
diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
new file mode 100644
index 0000000000..15db068ef7
--- /dev/null
+++ b/odb/source-inmemory.h
@@ -0,0 +1,35 @@
+#ifndef ODB_SOURCE_INMEMORY_H
+#define ODB_SOURCE_INMEMORY_H
+
+#include "odb/source.h"
+
+struct cached_object_entry;
+
+/*
+ * An in-memory source that you can write objects to that shall be made
+ * available for reading, but that shouldn't ever be persisted to disk. Note
+ * that any objects written to this source will be stored in memory, so the
+ * number of objects you can store is limited by available system memory.
+ */
+struct odb_source_inmemory {
+	struct odb_source base;
+
+	struct cached_object_entry *objects;
+	size_t objects_nr, objects_alloc;
+};
+
+/* Create a new in-memory object database source. */
+struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb);
+
+/*
+ * Cast the given object database source to the in-memory backend. This will
+ * cause a BUG in case the source doesn't use this backend.
+ */
+static inline struct odb_source_inmemory *odb_source_inmemory_downcast(struct odb_source *source)
+{
+	if (source->type != ODB_SOURCE_INMEMORY)
+		BUG("trying to downcast source of type '%d' to in-memory", source->type);
+	return container_of(source, struct odb_source_inmemory, base);
+}
+
+#endif
diff --git a/odb/source.h b/odb/source.h
index f706e0608a..0a440884e4 100644
--- a/odb/source.h
+++ b/odb/source.h
@@ -13,6 +13,9 @@ enum odb_source_type {
 
 	/* The "files" backend that uses loose objects and packfiles. */
 	ODB_SOURCE_FILES,
+
+	/* The "in-memory" backend that stores objects in memory. */
+	ODB_SOURCE_INMEMORY,
 };
 
 struct object_id;

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty


* [PATCH v3 02/17] odb/source-inmemory: implement `free()` callback
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 01/17] " Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 03/17] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt
                     ` (15 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `free()` callback function for the "in-memory" source.

Note that this requires us to define `struct cached_object_entry` in
"odb/source-inmemory.h", as it is accessed in both "odb.c" and
"odb/source-inmemory.c" now. This will be fixed in subsequent commits
though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
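As a rough illustration of the pattern used here, the standalone sketch
below shows a generic base struct with a `free` callback, a backend
struct that embeds it, and a type-checked downcast. The `toy_*` names
and the simplified `toy_container_of()` macro are assumptions made for
this example; Git's own `container_of()` and `BUG()` are not used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for the container_of() macro used in the patch. */
#define toy_container_of(ptr, type, member) \
	((type *) ((char *) (ptr) - offsetof(type, member)))

enum toy_source_type { TOY_SOURCE_FILES, TOY_SOURCE_INMEMORY };

/* Generic base: callers only ever see this struct and its callbacks. */
struct toy_source {
	enum toy_source_type type;
	void (*free)(struct toy_source *source);
};

/* Backend-specific struct embedding the base as a member. */
struct toy_source_inmemory {
	struct toy_source base;
	int *objects;
	size_t objects_nr;
};

static int toy_freed_sources; /* counts teardowns, for illustration only */

static struct toy_source_inmemory *toy_downcast(struct toy_source *source)
{
	/* Git would BUG() here; this sketch aborts via assert() instead. */
	assert(source->type == TOY_SOURCE_INMEMORY);
	return toy_container_of(source, struct toy_source_inmemory, base);
}

/* Backend teardown: releases backend state and the struct itself. */
static void toy_source_inmemory_free(struct toy_source *source)
{
	struct toy_source_inmemory *inmemory = toy_downcast(source);
	free(inmemory->objects);
	free(inmemory);
	toy_freed_sources++;
}

static struct toy_source_inmemory *toy_source_inmemory_new(void)
{
	struct toy_source_inmemory *source = calloc(1, sizeof(*source));
	if (!source)
		return NULL;
	source->base.type = TOY_SOURCE_INMEMORY;
	source->base.free = toy_source_inmemory_free;
	return source;
}

/* Generic entry point that dispatches to the backend's callback. */
static void toy_source_free(struct toy_source *source)
{
	if (source)
		source->free(source);
}
```

The point of the indirection is that the owner of the source only needs
the base pointer: it calls the generic function and never has to know
which fields the backend allocated.
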
 odb.c                 | 25 ++++---------------------
 odb/source-inmemory.c | 12 ++++++++++++
 odb/source-inmemory.h |  9 ++++++++-
 3 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/odb.c b/odb.c
index 60e1eead25..1d65825ed3 100644
--- a/odb.c
+++ b/odb.c
@@ -32,21 +32,6 @@
 KHASH_INIT(odb_path_map, const char * /* key: odb_path */,
 	struct odb_source *, 1, fspathhash, fspatheq)
 
-/*
- * This is meant to hold a *small* number of objects that you would
- * want odb_read_object() to be able to return, but yet you do not want
- * to write them into the object store (e.g. a browse-only
- * application).
- */
-struct cached_object_entry {
-	struct object_id oid;
-	struct cached_object {
-		enum object_type type;
-		const void *buf;
-		unsigned long size;
-	} value;
-};
-
 static const struct cached_object *find_cached_object(struct object_database *object_store,
 						      const struct object_id *oid)
 {
@@ -1109,6 +1094,10 @@ static void odb_free_sources(struct object_database *o)
 		odb_source_free(o->sources);
 		o->sources = next;
 	}
+
+	odb_source_free(&o->inmemory_objects->base);
+	o->inmemory_objects = NULL;
+
 	kh_destroy_odb_path_map(o->source_by_path);
 	o->source_by_path = NULL;
 }
@@ -1126,12 +1115,6 @@ void odb_free(struct object_database *o)
 	odb_close(o);
 	odb_free_sources(o);
 
-	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
-		free((char *) o->inmemory_objects->objects[i].value.buf);
-	free(o->inmemory_objects->objects);
-	free(o->inmemory_objects->base.path);
-	free(o->inmemory_objects);
-
 	string_list_clear(&o->submodule_source_paths, 0);
 
 	free(o);
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index c7ac5c24f0..ccbb622eae 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -1,6 +1,16 @@
 #include "git-compat-util.h"
 #include "odb/source-inmemory.h"
 
+static void odb_source_inmemory_free(struct odb_source *source)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	for (size_t i = 0; i < inmemory->objects_nr; i++)
+		free((char *) inmemory->objects[i].value.buf);
+	free(inmemory->objects);
+	free(inmemory->base.path);
+	free(inmemory);
+}
+
 struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 {
 	struct odb_source_inmemory *source;
@@ -8,5 +18,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	CALLOC_ARRAY(source, 1);
 	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
 
+	source->base.free = odb_source_inmemory_free;
+
 	return source;
 }
diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
index 15db068ef7..d1b05a3996 100644
--- a/odb/source-inmemory.h
+++ b/odb/source-inmemory.h
@@ -3,7 +3,14 @@
 
 #include "odb/source.h"
 
-struct cached_object_entry;
+struct cached_object_entry {
+	struct object_id oid;
+	struct cached_object {
+		enum object_type type;
+		const void *buf;
+		unsigned long size;
+	} value;
+};
 
 /*
  * An in-memory source that you can write objects to that shall be made

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty


* [PATCH v3 03/17] odb: fix unnecessary call to `find_cached_object()`
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 01/17] " Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 02/17] odb/source-inmemory: implement `free()` callback Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 04/17] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt
                     ` (14 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

The function `odb_pretend_object()` writes an object into the in-memory
object database source. The effect of this is that the object will now
become readable, but it won't ever be persisted to disk.

Before storing the object, we first verify whether the object already
exists. This is done by calling `odb_has_object()` to check all sources,
followed by `find_cached_object()` to check whether we have already
stored the object in our in-memory source.

This is unnecessary though, as `odb_has_object()` already checks the
in-memory source transitively via:

  - `odb_has_object()`
  - `odb_read_object_info_extended()`
  - `do_oid_object_info_extended()`
  - `find_cached_object()`

Drop the explicit call to `find_cached_object()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/odb.c b/odb.c
index 1d65825ed3..ea3fcf5e11 100644
--- a/odb.c
+++ b/odb.c
@@ -774,8 +774,7 @@ int odb_pretend_object(struct object_database *odb,
 	char *co_buf;
 
 	hash_object_file(odb->repo->hash_algo, buf, len, type, oid);
-	if (odb_has_object(odb, oid, 0) ||
-	    find_cached_object(odb, oid))
+	if (odb_has_object(odb, oid, 0))
 		return 0;
 
 	ALLOC_GROW(odb->inmemory_objects->objects,

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty


* [PATCH v3 04/17] odb/source-inmemory: implement `read_object_info()` callback
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 03/17] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 05/17] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
                     ` (13 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `read_object_info()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
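The callback follows the usual "optional out-pointer" convention: the
caller sets only the fields it cares about, and the callee fills in
exactly those. A minimal standalone sketch of that convention, using
hypothetical `toy_*` types rather than `struct object_info`, might look
like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request struct: every field is optional and may be NULL. */
struct toy_object_info {
	int *typep;
	size_t *sizep;
	void **contentp;
};

/* Hypothetical stored object, loosely modelled on `struct cached_object`. */
struct toy_object {
	int type;
	const void *buf;
	size_t size;
};

/* Fill only requested fields; returns 0 if found, -1 otherwise. */
static int toy_read_object_info(const struct toy_object *object,
				struct toy_object_info *oi)
{
	if (!object)
		return -1;
	if (!oi)
		return 0; /* pure existence check */

	if (oi->typep)
		*oi->typep = object->type;
	if (oi->sizep)
		*oi->sizep = object->size;
	if (oi->contentp) {
		/* Hand out a NUL-terminated copy that the caller owns. */
		char *copy = malloc(object->size + 1);
		if (!copy)
			return -1;
		memcpy(copy, object->buf, object->size);
		copy[object->size] = '\0';
		*oi->contentp = copy;
	}
	return 0;
}
```

Passing a NULL `oi` turns the call into an existence check, and a
caller that requests the content is responsible for freeing it.
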
 odb.c                 | 39 +------------------------------------
 odb/source-inmemory.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 38 deletions(-)

diff --git a/odb.c b/odb.c
index ea3fcf5e11..6a3912adac 100644
--- a/odb.c
+++ b/odb.c
@@ -32,25 +32,6 @@
 KHASH_INIT(odb_path_map, const char * /* key: odb_path */,
 	struct odb_source *, 1, fspathhash, fspatheq)
 
-static const struct cached_object *find_cached_object(struct object_database *object_store,
-						      const struct object_id *oid)
-{
-	static const struct cached_object empty_tree = {
-		.type = OBJ_TREE,
-		.buf = "",
-	};
-	const struct cached_object_entry *co = object_store->inmemory_objects->objects;
-
-	for (size_t i = 0; i < object_store->inmemory_objects->objects_nr; i++, co++)
-		if (oideq(&co->oid, oid))
-			return &co->value;
-
-	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
-		return &empty_tree;
-
-	return NULL;
-}
-
 int odb_mkstemp(struct object_database *odb,
 		struct strbuf *temp_filename, const char *pattern)
 {
@@ -570,7 +551,6 @@ static int do_oid_object_info_extended(struct object_database *odb,
 				       const struct object_id *oid,
 				       struct object_info *oi, unsigned flags)
 {
-	const struct cached_object *co;
 	const struct object_id *real = oid;
 	int already_retried = 0;
 
@@ -580,25 +560,8 @@ static int do_oid_object_info_extended(struct object_database *odb,
 	if (is_null_oid(real))
 		return -1;
 
-	co = find_cached_object(odb, real);
-	if (co) {
-		if (oi) {
-			if (oi->typep)
-				*(oi->typep) = co->type;
-			if (oi->sizep)
-				*(oi->sizep) = co->size;
-			if (oi->disk_sizep)
-				*(oi->disk_sizep) = 0;
-			if (oi->delta_base_oid)
-				oidclr(oi->delta_base_oid, odb->repo->hash_algo);
-			if (oi->contentp)
-				*oi->contentp = xmemdupz(co->buf, co->size);
-			if (oi->mtimep)
-				*oi->mtimep = 0;
-			oi->whence = OI_CACHED;
-		}
+	if (!odb_source_read_object_info(&odb->inmemory_objects->base, oid, oi, flags))
 		return 0;
-	}
 
 	odb_prepare_alternates(odb);
 
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index ccbb622eae..12c80f9b34 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -1,5 +1,57 @@
 #include "git-compat-util.h"
+#include "odb.h"
 #include "odb/source-inmemory.h"
+#include "repository.h"
+
+static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
+						      const struct object_id *oid)
+{
+	static const struct cached_object empty_tree = {
+		.type = OBJ_TREE,
+		.buf = "",
+	};
+	const struct cached_object_entry *co = source->objects;
+
+	for (size_t i = 0; i < source->objects_nr; i++, co++)
+		if (oideq(&co->oid, oid))
+			return &co->value;
+
+	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
+		return &empty_tree;
+
+	return NULL;
+}
+
+static int odb_source_inmemory_read_object_info(struct odb_source *source,
+						const struct object_id *oid,
+						struct object_info *oi,
+						enum object_info_flags flags UNUSED)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	const struct cached_object *object;
+
+	object = find_cached_object(inmemory, oid);
+	if (!object)
+		return -1;
+
+	if (oi) {
+		if (oi->typep)
+			*(oi->typep) = object->type;
+		if (oi->sizep)
+			*(oi->sizep) = object->size;
+		if (oi->disk_sizep)
+			*(oi->disk_sizep) = 0;
+		if (oi->delta_base_oid)
+			oidclr(oi->delta_base_oid, source->odb->repo->hash_algo);
+		if (oi->contentp)
+			*oi->contentp = xmemdupz(object->buf, object->size);
+		if (oi->mtimep)
+			*oi->mtimep = 0;
+		oi->whence = OI_CACHED;
+	}
+
+	return 0;
+}
 
 static void odb_source_inmemory_free(struct odb_source *source)
 {
@@ -19,6 +71,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
 
 	source->base.free = odb_source_inmemory_free;
+	source->base.read_object_info = odb_source_inmemory_read_object_info;
 
 	return source;
 }

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty


* [PATCH v3 05/17] odb/source-inmemory: implement `read_object_stream()` callback
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 04/17] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 06/17] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt
                     ` (12 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `read_object_stream()` callback function for the in-memory
source.
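
As a rough illustration of what reading from such a stream involves, here
is a minimal standalone sketch of a clamped read over an in-memory buffer.
The `demo_stream` type and `demo_stream_read()` function are hypothetical
stand-ins for this note and are not part of the patch or of Git's API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a read stream backed by a memory buffer. */
struct demo_stream {
	const unsigned char *buf;
	size_t size;   /* total number of bytes in buf */
	size_t offset; /* number of bytes already handed out */
};

/*
 * Copy at most out_len bytes into out, never reading past the end of
 * the buffer. Returns the number of bytes copied; 0 signals the end.
 */
static size_t demo_stream_read(struct demo_stream *s, char *out, size_t out_len)
{
	size_t remaining = s->size - s->offset;
	size_t bytes = out_len < remaining ? out_len : remaining;

	memcpy(out, s->buf + s->offset, bytes);
	s->offset += bytes;
	return bytes;
}
```

Clamping against the remaining byte count, rather than the total size,
is what keeps repeated short reads from running off the end of the buffer.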

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 12c80f9b34..39f0e799c7 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -1,6 +1,7 @@
 #include "git-compat-util.h"
 #include "odb.h"
 #include "odb/source-inmemory.h"
+#include "odb/streaming.h"
 #include "repository.h"
 
 static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
@@ -53,6 +54,56 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
 	return 0;
 }
 
+struct odb_read_stream_inmemory {
+	struct odb_read_stream base;
+	const unsigned char *buf;
+	size_t offset;
+};
+
+static ssize_t odb_read_stream_inmemory_read(struct odb_read_stream *stream,
+					     char *buf, size_t buf_len)
+{
+	struct odb_read_stream_inmemory *inmemory =
+		container_of(stream, struct odb_read_stream_inmemory, base);
+	size_t bytes = buf_len;
+
+	if (buf_len > inmemory->base.size - inmemory->offset)
+		bytes = inmemory->base.size - inmemory->offset;
+
+	memcpy(buf, inmemory->buf + inmemory->offset, bytes);
+	inmemory->offset += bytes;
+
+	return bytes;
+}
+
+static int odb_read_stream_inmemory_close(struct odb_read_stream *stream UNUSED)
+{
+	return 0;
+}
+
+static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
+						  struct odb_source *source,
+						  const struct object_id *oid)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	struct odb_read_stream_inmemory *stream;
+	const struct cached_object *object;
+
+	object = find_cached_object(inmemory, oid);
+	if (!object)
+		return -1;
+
+	CALLOC_ARRAY(stream, 1);
+	stream->base.read = odb_read_stream_inmemory_read;
+	stream->base.close = odb_read_stream_inmemory_close;
+	stream->base.size = object->size;
+	stream->base.type = object->type;
+	stream->buf = object->buf;
+
+	*out = &stream->base;
+	return 0;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
@@ -72,6 +123,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 
 	source->base.free = odb_source_inmemory_free;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
+	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 
 	return source;
 }

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty



* [PATCH v3 06/17] odb/source-inmemory: implement `write_object()` callback
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 05/17] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 07/17] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt
                     ` (11 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `write_object()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c                 | 16 ++--------------
 odb/source-inmemory.c | 25 +++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/odb.c b/odb.c
index 6a3912adac..24e929f03c 100644
--- a/odb.c
+++ b/odb.c
@@ -733,24 +733,12 @@ int odb_pretend_object(struct object_database *odb,
 		       void *buf, unsigned long len, enum object_type type,
 		       struct object_id *oid)
 {
-	struct cached_object_entry *co;
-	char *co_buf;
-
 	hash_object_file(odb->repo->hash_algo, buf, len, type, oid);
 	if (odb_has_object(odb, oid, 0))
 		return 0;
 
-	ALLOC_GROW(odb->inmemory_objects->objects,
-		   odb->inmemory_objects->objects_nr + 1,
-		   odb->inmemory_objects->objects_alloc);
-	co = &odb->inmemory_objects->objects[odb->inmemory_objects->objects_nr++];
-	co->value.size = len;
-	co->value.type = type;
-	co_buf = xmalloc(len);
-	memcpy(co_buf, buf, len);
-	co->value.buf = co_buf;
-	oidcpy(&co->oid, oid);
-	return 0;
+	return odb_source_write_object(&odb->inmemory_objects->base,
+				       buf, len, type, oid, NULL, 0);
 }
 
 void *odb_read_object(struct object_database *odb,
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 39f0e799c7..4848011df5 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -1,4 +1,5 @@
 #include "git-compat-util.h"
+#include "object-file.h"
 #include "odb.h"
 #include "odb/source-inmemory.h"
 #include "odb/streaming.h"
@@ -104,6 +105,29 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
 	return 0;
 }
 
+static int odb_source_inmemory_write_object(struct odb_source *source,
+					    const void *buf, unsigned long len,
+					    enum object_type type,
+					    struct object_id *oid,
+					    struct object_id *compat_oid UNUSED,
+					    enum odb_write_object_flags flags UNUSED)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	struct cached_object_entry *object;
+
+	hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid);
+
+	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
+		   inmemory->objects_alloc);
+	object = &inmemory->objects[inmemory->objects_nr++];
+	object->value.size = len;
+	object->value.type = type;
+	object->value.buf = xmemdupz(buf, len);
+	oidcpy(&object->oid, oid);
+
+	return 0;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
@@ -124,6 +148,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.free = odb_source_inmemory_free;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
+	source->base.write_object = odb_source_inmemory_write_object;
 
 	return source;
 }

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty



* [PATCH v3 07/17] odb/source-inmemory: implement `write_object_stream()` callback
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 06/17] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 08/17] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
                     ` (10 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `write_object_stream()` callback function for the in-memory
source.
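
The general idea of draining a chunked stream into one buffer while
checking the announced length can be sketched as follows. This is a
standalone illustration under stated assumptions: `demo_source`,
`demo_source_read()` and `demo_collect()` are hypothetical names invented
for this note, and the chunk size is assumed to be non-zero.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical chunk source: hands out src in pieces of at most chunk bytes. */
struct demo_source {
	const char *src;
	size_t len, pos;
	size_t chunk; /* assumed to be greater than zero */
};

static size_t demo_source_read(struct demo_source *s, char *buf, size_t buf_len)
{
	size_t n = s->len - s->pos;

	if (n > s->chunk)
		n = s->chunk;
	if (n > buf_len)
		n = buf_len;
	memcpy(buf, s->src + s->pos, n);
	s->pos += n;
	return n;
}

/* Returns 0 if exactly len bytes were collected into out, -1 otherwise. */
static int demo_collect(struct demo_source *s, char *out, size_t len)
{
	char buf[4];
	size_t total = 0;

	while (s->pos < s->len) {
		size_t n = demo_source_read(s, buf, sizeof(buf));

		if (n > len - total)
			return -1; /* stream yielded more bytes than announced */
		memcpy(out + total, buf, n);
		total += n;
	}

	return total == len ? 0 : -1; /* -1 here means too few bytes */
}
```

Comparing `n` against `len - total` instead of adding first avoids any
chance of the sum wrapping around before the bounds check happens.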

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 4848011df5..d05a13df45 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -128,6 +128,45 @@ static int odb_source_inmemory_write_object(struct odb_source *source,
 	return 0;
 }
 
+static int odb_source_inmemory_write_object_stream(struct odb_source *source,
+						   struct odb_write_stream *stream,
+						   size_t len,
+						   struct object_id *oid)
+{
+	char buf[16384];
+	size_t total_read = 0;
+	char *data;
+	int ret;
+
+	CALLOC_ARRAY(data, len);
+	while (!stream->is_finished) {
+		ssize_t bytes_read;
+
+		bytes_read = odb_write_stream_read(stream, buf, sizeof(buf));
+		if (total_read + bytes_read > len) {
+			ret = error("object stream yielded more bytes than expected");
+			goto out;
+		}
+
+		memcpy(data + total_read, buf, bytes_read);
+		total_read += bytes_read;
+	}
+
+	if (total_read != len) {
+		ret = error("object stream yielded less bytes than expected");
+		goto out;
+	}
+
+	ret = odb_source_inmemory_write_object(source, data, len, OBJ_BLOB, oid,
+					       NULL, 0);
+	if (ret < 0)
+		goto out;
+
+out:
+	free(data);
+	return ret;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
@@ -149,6 +188,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.write_object = odb_source_inmemory_write_object;
+	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 
 	return source;
 }

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty



* [PATCH v3 08/17] cbtree: allow using arbitrary wrapper structures for nodes
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 07/17] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 09/17] oidtree: add ability to store data Patrick Steinhardt
                     ` (9 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

The cbtree subsystem allows the user to store arbitrary data in a
prefix-free set of strings. This is used by us to store object IDs in a
way that we can easily iterate through them in lexicographic order, and so
that we can easily perform lookups with shortened object IDs.

In its current form, the tree offers no easy way to store arbitrary data
with its nodes. There are a couple of approaches a caller could try to
use, but none of them really works:

  - One may embed the `struct cb_node` in a custom structure. This does
    not work though as `struct cb_node` contains a flex array, and
    embedding such a struct in another struct is forbidden.

  - One may use a `union` over `struct cb_node` and one's own data type,
    which _is_ allowed even if the struct contains a flex array. This
    does not work though, as the compiler may align members of the
    struct so that the node key would not immediately start where the
    flex array starts.

  - One may allocate `struct cb_node` such that it has room for both its
    key and the custom data. This has the downside though that if the
    custom data is itself a pointer to allocated memory, then the leak
    checker will not consider the pointer to be alive anymore.

Refactor the cbtree to drop the flex array and instead take in an
explicit offset for where to find the key, which allows the caller to
embed `struct cb_node` in a wrapper struct.

Note that this change has the downside that we now have a bit of padding
in our structure, which grows the size from 60 to 64 bytes on a 64-bit
system. On the other hand though, it allows us to get rid of the memory
copies that we previously had to do to ensure proper alignment. This
seems like a reasonable tradeoff.
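
The wrapper-struct approach described above can be sketched in isolation
like this. The `demo_*` types below are simplified, hypothetical stand-ins
written for this note, not the real cbtree definitions; only the
offset arithmetic mirrors what the patch does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node without a flex array, so it can be embedded. */
struct demo_node {
	struct demo_node *child[2];
	uint32_t byte;
	uint8_t otherbits;
};

/* The tree remembers where the key lives relative to each node. */
struct demo_tree {
	struct demo_node *root;
	ptrdiff_t key_offset;
};

/* Caller-defined wrapper: node first, then the key, then any payload. */
struct demo_entry {
	struct demo_node base;
	uint8_t key[4];
	int payload;
};

/* Locate the key of a node using only the offset stored in the tree. */
static const uint8_t *demo_node_key(const struct demo_tree *t,
				    const struct demo_node *node)
{
	return (const uint8_t *)node + t->key_offset;
}

/* Recover the wrapper from a pointer to its embedded node. */
static struct demo_entry *demo_entry_of(struct demo_node *node)
{
	return (struct demo_entry *)((char *)node -
				     offsetof(struct demo_entry, base));
}
```

A tree set up with `offsetof(struct demo_entry, key)` as its key offset
never needs to know the wrapper's layout beyond that single number.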

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 cbtree.c  | 25 ++++++++++++++++++-------
 cbtree.h  | 17 +++++++++--------
 oidtree.c | 33 ++++++++++++++-------------------
 3 files changed, 41 insertions(+), 34 deletions(-)

diff --git a/cbtree.c b/cbtree.c
index 4ab794bddc..8f5edbb80a 100644
--- a/cbtree.c
+++ b/cbtree.c
@@ -7,6 +7,11 @@
 #include "git-compat-util.h"
 #include "cbtree.h"
 
+static inline uint8_t *cb_node_key(struct cb_tree *t, struct cb_node *node)
+{
+	return (uint8_t *) node + t->key_offset;
+}
+
 static struct cb_node *cb_node_of(const void *p)
 {
 	return (struct cb_node *)((uintptr_t)p - 1);
@@ -33,6 +38,7 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen)
 	uint8_t c;
 	int newdirection;
 	struct cb_node **wherep, *p;
+	uint8_t *node_key, *p_key;
 
 	assert(!((uintptr_t)node & 1)); /* allocations must be aligned */
 
@@ -41,23 +47,26 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen)
 		return NULL;	/* success */
 	}
 
+	node_key = cb_node_key(t, node);
+
 	/* see if a node already exists */
-	p = cb_internal_best_match(t->root, node->k, klen);
+	p = cb_internal_best_match(t->root, node_key, klen);
+	p_key = cb_node_key(t, p);
 
 	/* find first differing byte */
 	for (newbyte = 0; newbyte < klen; newbyte++) {
-		if (p->k[newbyte] != node->k[newbyte])
+		if (p_key[newbyte] != node_key[newbyte])
 			goto different_byte_found;
 	}
 	return p;	/* element exists, let user deal with it */
 
 different_byte_found:
-	newotherbits = p->k[newbyte] ^ node->k[newbyte];
+	newotherbits = p_key[newbyte] ^ node_key[newbyte];
 	newotherbits |= newotherbits >> 1;
 	newotherbits |= newotherbits >> 2;
 	newotherbits |= newotherbits >> 4;
 	newotherbits = (newotherbits & ~(newotherbits >> 1)) ^ 255;
-	c = p->k[newbyte];
+	c = p_key[newbyte];
 	newdirection = (1 + (newotherbits | c)) >> 8;
 
 	node->byte = newbyte;
@@ -78,7 +87,7 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen)
 			break;
 		if (q->byte == newbyte && q->otherbits > newotherbits)
 			break;
-		c = q->byte < klen ? node->k[q->byte] : 0;
+		c = q->byte < klen ? node_key[q->byte] : 0;
 		direction = (1 + (q->otherbits | c)) >> 8;
 		wherep = q->child + direction;
 	}
@@ -93,7 +102,7 @@ struct cb_node *cb_lookup(struct cb_tree *t, const uint8_t *k, size_t klen)
 {
 	struct cb_node *p = cb_internal_best_match(t->root, k, klen);
 
-	return p && !memcmp(p->k, k, klen) ? p : NULL;
+	return p && !memcmp(cb_node_key(t, p), k, klen) ? p : NULL;
 }
 
 static int cb_descend(struct cb_node *p, cb_iter fn, void *arg)
@@ -115,6 +124,7 @@ int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen,
 	struct cb_node *p = t->root;
 	struct cb_node *top = p;
 	size_t i = 0;
+	uint8_t *p_key;
 
 	if (!p)
 		return 0; /* empty tree */
@@ -130,8 +140,9 @@ int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen,
 			top = p;
 	}
 
+	p_key = cb_node_key(t, p);
 	for (i = 0; i < klen; i++) {
-		if (p->k[i] != kpfx[i])
+		if (p_key[i] != kpfx[i])
 			return 0; /* "best" match failed */
 	}
 
diff --git a/cbtree.h b/cbtree.h
index c374b1b3db..4647d4a32f 100644
--- a/cbtree.h
+++ b/cbtree.h
@@ -6,9 +6,9 @@
  *
  * This is adapted to store arbitrary data (not just NUL-terminated C strings
  * and allocates no memory internally.  The user needs to allocate
- * "struct cb_node" and fill cb_node.k[] with arbitrary match data
- * for memcmp.
- * If "klen" is variable, then it should be embedded into "c_node.k[]"
+ * "struct cb_node" and provide `key_offset` to indicate where the key can be
+ * found relative to the `struct cb_node` for memcmp.
+ * If "klen" is variable, then it should be embedded into the key.
  * Recursion is bound by the maximum value of "klen" used.
  */
 #ifndef CBTREE_H
@@ -23,18 +23,19 @@ struct cb_node {
 	 */
 	uint32_t byte;
 	uint8_t otherbits;
-	uint8_t k[FLEX_ARRAY]; /* arbitrary data, unaligned */
 };
 
 struct cb_tree {
 	struct cb_node *root;
+	ptrdiff_t key_offset;
 };
 
-#define CBTREE_INIT { 0 }
-
-static inline void cb_init(struct cb_tree *t)
+static inline void cb_init(struct cb_tree *t,
+			   ptrdiff_t key_offset)
 {
-	struct cb_tree blank = CBTREE_INIT;
+	struct cb_tree blank = {
+		.key_offset = key_offset,
+	};
 	memcpy(t, &blank, sizeof(*t));
 }
 
diff --git a/oidtree.c b/oidtree.c
index ab9fe7ec7a..117649753f 100644
--- a/oidtree.c
+++ b/oidtree.c
@@ -6,9 +6,14 @@
 #include "oidtree.h"
 #include "hash.h"
 
+struct oidtree_node {
+	struct cb_node base;
+	struct object_id key;
+};
+
 void oidtree_init(struct oidtree *ot)
 {
-	cb_init(&ot->tree);
+	cb_init(&ot->tree, offsetof(struct oidtree_node, key));
 	mem_pool_init(&ot->mem_pool, 0);
 }
 
@@ -22,20 +27,13 @@ void oidtree_clear(struct oidtree *ot)
 
 void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
 {
-	struct cb_node *on;
-	struct object_id k;
+	struct oidtree_node *on;
 
 	if (!oid->algo)
 		BUG("oidtree_insert requires oid->algo");
 
-	on = mem_pool_alloc(&ot->mem_pool, sizeof(*on) + sizeof(*oid));
-
-	/*
-	 * Clear the padding and copy the result in separate steps to
-	 * respect the 4-byte alignment needed by struct object_id.
-	 */
-	oidcpy(&k, oid);
-	memcpy(on->k, &k, sizeof(k));
+	on = mem_pool_alloc(&ot->mem_pool, sizeof(*on));
+	oidcpy(&on->key, oid);
 
 	/*
 	 * n.b. Current callers won't get us duplicates, here.  If a
@@ -43,7 +41,7 @@ void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
 	 * that won't be freed until oidtree_clear.  Currently it's not
 	 * worth maintaining a free list
 	 */
-	cb_insert(&ot->tree, on, sizeof(*oid));
+	cb_insert(&ot->tree, &on->base, sizeof(*oid));
 }
 
 bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
@@ -73,21 +71,18 @@ struct oidtree_each_data {
 
 static int iter(struct cb_node *n, void *cb_data)
 {
+	struct oidtree_node *node = container_of(n, struct oidtree_node, base);
 	struct oidtree_each_data *data = cb_data;
-	struct object_id k;
-
-	/* Copy to provide 4-byte alignment needed by struct object_id. */
-	memcpy(&k, n->k, sizeof(k));
 
-	if (data->algo != GIT_HASH_UNKNOWN && data->algo != k.algo)
+	if (data->algo != GIT_HASH_UNKNOWN && data->algo != node->key.algo)
 		return 0;
 
 	if (data->last_nibble_at) {
-		if ((k.hash[*data->last_nibble_at] ^ data->last_byte) & 0xf0)
+		if ((node->key.hash[*data->last_nibble_at] ^ data->last_byte) & 0xf0)
 			return 0;
 	}
 
-	return data->cb(&k, data->cb_data);
+	return data->cb(&node->key, data->cb_data);
 }
 
 int oidtree_each(struct oidtree *ot, const struct object_id *prefix,

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty



* [PATCH v3 09/17] oidtree: add ability to store data
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 08/17] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 10/17] odb/source-inmemory: convert to use oidtree Patrick Steinhardt
                     ` (8 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

The oidtree data structure is currently only used to store object IDs,
without any associated data. Consequently, it can only really be used
to track which object IDs exist, and we can use the tree structure to
efficiently operate on OID prefixes.

But there are valid use cases where we want to both:

  - Store object IDs in a sorted order.

  - Associate arbitrary data with them.

Refactor the oidtree interface so that it allows us to store arbitrary
payloads within the respective nodes. This will be used in the next
commit.
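
The intended contract (insert stores a pointer, a second insert for the
same ID overwrites it, and a lookup returns the stored pointer or NULL)
can be modelled with a deliberately naive container. Everything named
`demo_*` or `DEMO_*` below is hypothetical and written for this note; it
uses a flat array instead of a tree, so it shows only the contract and
not the sorted storage.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_ID_LEN 4
#define DEMO_MAX 8

/* Hypothetical slot pairing a fixed-size ID with a caller-owned pointer. */
struct demo_slot {
	unsigned char id[DEMO_ID_LEN];
	void *data;
};

struct demo_map {
	struct demo_slot slots[DEMO_MAX];
	size_t nr;
};

static struct demo_slot *demo_find(struct demo_map *m, const unsigned char *id)
{
	for (size_t i = 0; i < m->nr; i++)
		if (!memcmp(m->slots[i].id, id, DEMO_ID_LEN))
			return &m->slots[i];
	return NULL;
}

/* Insert or overwrite. Returns 0 on success, -1 if the map is full. */
static int demo_insert(struct demo_map *m, const unsigned char *id, void *data)
{
	struct demo_slot *slot = demo_find(m, id);

	if (slot) {
		slot->data = data; /* preexisting entry: overwrite the payload */
		return 0;
	}
	if (m->nr == DEMO_MAX)
		return -1;

	slot = &m->slots[m->nr++];
	memcpy(slot->id, id, DEMO_ID_LEN);
	slot->data = data;
	return 0;
}

/* Returns the stored payload, or NULL if the ID is absent. */
static void *demo_get(struct demo_map *m, const unsigned char *id)
{
	struct demo_slot *slot = demo_find(m, id);
	return slot ? slot->data : NULL;
}
```

One consequence of this contract is that a NULL result from the lookup
cannot distinguish "absent" from "present with a NULL payload", which is
why a separate containment check remains useful.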

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 loose.c                  |  2 +-
 object-file.c            |  3 ++-
 oidtree.c                | 37 ++++++++++++++++++++++++++++++++-----
 oidtree.h                | 12 ++++++++++--
 t/unit-tests/u-oidtree.c | 26 +++++++++++++++++++++++---
 5 files changed, 68 insertions(+), 12 deletions(-)

diff --git a/loose.c b/loose.c
index 07333be696..f7a3dd1a72 100644
--- a/loose.c
+++ b/loose.c
@@ -57,7 +57,7 @@ static int insert_loose_map(struct odb_source *source,
 	inserted |= insert_oid_pair(map->to_compat, oid, compat_oid);
 	inserted |= insert_oid_pair(map->to_storage, compat_oid, oid);
 	if (inserted)
-		oidtree_insert(files->loose->cache, compat_oid);
+		oidtree_insert(files->loose->cache, compat_oid, NULL);
 
 	return inserted;
 }
diff --git a/object-file.c b/object-file.c
index 3e70e5d668..d04ab57253 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1857,6 +1857,7 @@ static int for_each_object_wrapper_cb(const struct object_id *oid,
 }
 
 static int for_each_prefixed_object_wrapper_cb(const struct object_id *oid,
+					       void *node_data UNUSED,
 					       void *cb_data)
 {
 	struct for_each_object_wrapper_data *data = cb_data;
@@ -2002,7 +2003,7 @@ static int append_loose_object(const struct object_id *oid,
 			       const char *path UNUSED,
 			       void *data)
 {
-	oidtree_insert(data, oid);
+	oidtree_insert(data, oid, NULL);
 	return 0;
 }
 
diff --git a/oidtree.c b/oidtree.c
index 117649753f..e43f18026e 100644
--- a/oidtree.c
+++ b/oidtree.c
@@ -9,6 +9,7 @@
 struct oidtree_node {
 	struct cb_node base;
 	struct object_id key;
+	void *data;
 };
 
 void oidtree_init(struct oidtree *ot)
@@ -25,15 +26,22 @@ void oidtree_clear(struct oidtree *ot)
 	}
 }
 
-void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
+struct oidtree_data {
+	struct object_id oid;
+};
+
+void oidtree_insert(struct oidtree *ot, const struct object_id *oid,
+		    void *data)
 {
 	struct oidtree_node *on;
+	struct cb_node *node;
 
 	if (!oid->algo)
 		BUG("oidtree_insert requires oid->algo");
 
 	on = mem_pool_alloc(&ot->mem_pool, sizeof(*on));
 	oidcpy(&on->key, oid);
+	on->data = data;
 
 	/*
 	 * n.b. Current callers won't get us duplicates, here.  If a
@@ -41,13 +49,19 @@ void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
 	 * that won't be freed until oidtree_clear.  Currently it's not
 	 * worth maintaining a free list
 	 */
-	cb_insert(&ot->tree, &on->base, sizeof(*oid));
+	node = cb_insert(&ot->tree, &on->base, sizeof(*oid));
+	if (node) {
+		struct oidtree_node *preexisting = container_of(node, struct oidtree_node, base);
+		preexisting->data = data;
+	}
 }
 
-bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
+static struct oidtree_node *oidtree_lookup(struct oidtree *ot,
+					   const struct object_id *oid)
 {
 	struct object_id k;
 	size_t klen = sizeof(k);
+	struct cb_node *node;
 
 	oidcpy(&k, oid);
 
@@ -58,7 +72,20 @@ bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
 	klen += BUILD_ASSERT_OR_ZERO(offsetof(struct object_id, hash) <
 				offsetof(struct object_id, algo));
 
-	return !!cb_lookup(&ot->tree, (const uint8_t *)&k, klen);
+	node = cb_lookup(&ot->tree, (const uint8_t *)&k, klen);
+	return node ? container_of(node, struct oidtree_node, base) : NULL;
+}
+
+bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
+{
+	struct oidtree_node *node = oidtree_lookup(ot, oid);
+	return node ? 1 : 0;
+}
+
+void *oidtree_get(struct oidtree *ot, const struct object_id *oid)
+{
+	struct oidtree_node *node = oidtree_lookup(ot, oid);
+	return node ? node->data : NULL;
 }
 
 struct oidtree_each_data {
@@ -82,7 +109,7 @@ static int iter(struct cb_node *n, void *cb_data)
 			return 0;
 	}
 
-	return data->cb(&node->key, data->cb_data);
+	return data->cb(&node->key, node->data, data->cb_data);
 }
 
 int oidtree_each(struct oidtree *ot, const struct object_id *prefix,
diff --git a/oidtree.h b/oidtree.h
index 2b7bad2e60..baa5a436ea 100644
--- a/oidtree.h
+++ b/oidtree.h
@@ -29,18 +29,26 @@ void oidtree_init(struct oidtree *ot);
  */
 void oidtree_clear(struct oidtree *ot);
 
-/* Insert the object ID into the tree. */
-void oidtree_insert(struct oidtree *ot, const struct object_id *oid);
+/*
+ * Insert the object ID into the tree and store the given pointer alongside
+ * it. The data pointer of any preexisting entry will be overwritten.
+ */
+void oidtree_insert(struct oidtree *ot, const struct object_id *oid,
+		    void *data);
 
 /* Check whether the tree contains the given object ID. */
 bool oidtree_contains(struct oidtree *ot, const struct object_id *oid);
 
+/* Get the payload stored with the given object ID. */
+void *oidtree_get(struct oidtree *ot, const struct object_id *oid);
+
 /*
  * Callback function used for `oidtree_each()`. Returning a non-zero exit code
  * will cause iteration to stop. The exit code will be propagated to the caller
  * of `oidtree_each()`.
  */
 typedef int (*oidtree_each_cb)(const struct object_id *oid,
+			       void *node_data,
 			       void *cb_data);
 
 /*
diff --git a/t/unit-tests/u-oidtree.c b/t/unit-tests/u-oidtree.c
index d4d05c7dc3..f0d5ebb733 100644
--- a/t/unit-tests/u-oidtree.c
+++ b/t/unit-tests/u-oidtree.c
@@ -19,7 +19,7 @@ static int fill_tree_loc(struct oidtree *ot, const char *hexes[], size_t n)
 	for (size_t i = 0; i < n; i++) {
 		struct object_id oid;
 		cl_parse_any_oid(hexes[i], &oid);
-		oidtree_insert(ot, &oid);
+		oidtree_insert(ot, &oid, NULL);
 	}
 	return 0;
 }
@@ -38,9 +38,9 @@ struct expected_hex_iter {
 	const char *query;
 };
 
-static int check_each_cb(const struct object_id *oid, void *data)
+static int check_each_cb(const struct object_id *oid, void *node_data UNUSED, void *cb_data)
 {
-	struct expected_hex_iter *hex_iter = data;
+	struct expected_hex_iter *hex_iter = cb_data;
 	struct object_id expected;
 
 	cl_assert(hex_iter->i < hex_iter->expected_hexes.nr);
@@ -105,3 +105,23 @@ void test_oidtree__each(void)
 	check_each(&ot, "32100", "321", NULL);
 	check_each(&ot, "32", "320", "321", NULL);
 }
+
+void test_oidtree__insert_overwrites_data(void)
+{
+	struct object_id oid;
+	struct oidtree ot;
+	int a, b;
+
+	cl_parse_any_oid("1", &oid);
+
+	oidtree_init(&ot);
+
+	oidtree_insert(&ot, &oid, NULL);
+	cl_assert_equal_p(oidtree_get(&ot, &oid), NULL);
+	oidtree_insert(&ot, &oid, &a);
+	cl_assert_equal_p(oidtree_get(&ot, &oid), &a);
+	oidtree_insert(&ot, &oid, &b);
+	cl_assert_equal_p(oidtree_get(&ot, &oid), &b);
+
+	oidtree_clear(&ot);
+}

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty



* [PATCH v3 10/17] odb/source-inmemory: convert to use oidtree
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 09/17] oidtree: add ability to store data Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 11/17] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt
                     ` (7 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

The in-memory source stores its objects in a simple array that we grow as
needed. This has a couple of downsides:

  - The object lookup is O(n). This doesn't matter in practice because
    we only store a small number of objects.

  - We don't have an easy way to iterate over all objects in
    lexicographic order.

  - We don't have an easy way to compute unique object ID prefixes.

Refactor the code to use an oidtree instead. This is the same data
structure used by our loose object source, and thus it means we get a
bunch of functionality for free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 72 +++++++++++++++++++++++++++++++++++++--------------
 odb/source-inmemory.h | 13 ++--------
 2 files changed, 54 insertions(+), 31 deletions(-)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index d05a13df45..3b51cc7fef 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -3,20 +3,29 @@
 #include "odb.h"
 #include "odb/source-inmemory.h"
 #include "odb/streaming.h"
+#include "oidtree.h"
 #include "repository.h"
 
-static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
-						      const struct object_id *oid)
+struct inmemory_object {
+	enum object_type type;
+	const void *buf;
+	unsigned long size;
+};
+
+static const struct inmemory_object *find_cached_object(struct odb_source_inmemory *source,
+							const struct object_id *oid)
 {
-	static const struct cached_object empty_tree = {
+	static const struct inmemory_object empty_tree = {
 		.type = OBJ_TREE,
 		.buf = "",
 	};
-	const struct cached_object_entry *co = source->objects;
+	const struct inmemory_object *object;
 
-	for (size_t i = 0; i < source->objects_nr; i++, co++)
-		if (oideq(&co->oid, oid))
-			return &co->value;
+	if (source->objects) {
+		object = oidtree_get(source->objects, oid);
+		if (object)
+			return object;
+	}
 
 	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
 		return &empty_tree;
@@ -30,7 +39,7 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
 						enum object_info_flags flags UNUSED)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
-	const struct cached_object *object;
+	const struct inmemory_object *object;
 
 	object = find_cached_object(inmemory, oid);
 	if (!object)
@@ -88,7 +97,7 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
 	struct odb_read_stream_inmemory *stream;
-	const struct cached_object *object;
+	const struct inmemory_object *object;
 
 	object = find_cached_object(inmemory, oid);
 	if (!object)
@@ -113,17 +122,23 @@ static int odb_source_inmemory_write_object(struct odb_source *source,
 					    enum odb_write_object_flags flags UNUSED)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
-	struct cached_object_entry *object;
+	struct inmemory_object *object;
 
 	hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid);
 
-	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
-		   inmemory->objects_alloc);
-	object = &inmemory->objects[inmemory->objects_nr++];
-	object->value.size = len;
-	object->value.type = type;
-	object->value.buf = xmemdupz(buf, len);
-	oidcpy(&object->oid, oid);
+	if (!inmemory->objects) {
+		CALLOC_ARRAY(inmemory->objects, 1);
+		oidtree_init(inmemory->objects);
+	} else if (oidtree_contains(inmemory->objects, oid)) {
+		return 0;
+	}
+
+	CALLOC_ARRAY(object, 1);
+	object->size = len;
+	object->type = type;
+	object->buf = xmemdupz(buf, len);
+
+	oidtree_insert(inmemory->objects, oid, object);
 
 	return 0;
 }
@@ -167,12 +182,29 @@ static int odb_source_inmemory_write_object_stream(struct odb_source *source,
 	return ret;
 }
 
+static int inmemory_object_free(const struct object_id *oid UNUSED,
+				void *node_data,
+				void *cb_data UNUSED)
+{
+	struct inmemory_object *object = node_data;
+	free((void *) object->buf);
+	free(object);
+	return 0;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
-	for (size_t i = 0; i < inmemory->objects_nr; i++)
-		free((char *) inmemory->objects[i].value.buf);
-	free(inmemory->objects);
+
+	if (inmemory->objects) {
+		struct object_id null_oid = { 0 };
+
+		oidtree_each(inmemory->objects, &null_oid, 0,
+			     inmemory_object_free, NULL);
+		oidtree_clear(inmemory->objects);
+		free(inmemory->objects);
+	}
+
 	free(inmemory->base.path);
 	free(inmemory);
 }
diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
index d1b05a3996..a88fc2e320 100644
--- a/odb/source-inmemory.h
+++ b/odb/source-inmemory.h
@@ -3,14 +3,7 @@
 
 #include "odb/source.h"
 
-struct cached_object_entry {
-	struct object_id oid;
-	struct cached_object {
-		enum object_type type;
-		const void *buf;
-		unsigned long size;
-	} value;
-};
+struct oidtree;
 
 /*
  * An in-memory source that you can write objects to that shall be made
@@ -20,9 +13,7 @@ struct cached_object_entry {
  */
 struct odb_source_inmemory {
 	struct odb_source base;
-
-	struct cached_object_entry *objects;
-	size_t objects_nr, objects_alloc;
+	struct oidtree *objects;
 };
 
 /* Create a new in-memory object database source. */

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty



* [PATCH v3 11/17] odb/source-inmemory: implement `for_each_object()` callback
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (9 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 10/17] odb/source-inmemory: convert to use oidtree Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 12/17] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt
                     ` (6 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `for_each_object()` callback function for the in-memory
source.
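
For illustration only, and not part of this patch: the iteration
contract can be sketched with a self-contained toy that walks an array
of hex IDs instead of an oidtree. The names `each_id_with_prefix()` and
`stop_after_two()` are hypothetical. The sketch assumes the behaviour
that the unit test later in this series checks, namely that a non-zero
return value from the callback stops the iteration and is handed back
to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: a non-zero return value stops the iteration. */
typedef int (*each_id_fn)(const char *hex_id, void *cb_data);

/*
 * Visit every ID in `ids` that starts with `prefix` (NULL or "" matches
 * all). Returns 0 if every callback returned 0, otherwise the first
 * non-zero callback result.
 */
static int each_id_with_prefix(const char *const *ids, size_t nr,
			       const char *prefix,
			       each_id_fn fn, void *cb_data)
{
	size_t prefix_len = prefix ? strlen(prefix) : 0;

	assert(fn);
	assert(ids || !nr);

	for (size_t i = 0; i < nr; i++) {
		int ret;

		if (prefix_len && strncmp(ids[i], prefix, prefix_len))
			continue;
		ret = fn(ids[i], cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: count visits and abort with 123 on the second one. */
static int stop_after_two(const char *hex_id, void *cb_data)
{
	unsigned *counter = cb_data;

	(void) hex_id;
	if (++(*counter) == 2)
		return 123;
	return 0;
}
```

The toy scans linearly; the patch itself delegates prefix matching to
`oidtree_each()`.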

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 88 +++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 72 insertions(+), 16 deletions(-)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 3b51cc7fef..f60eecbdbb 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -33,6 +33,28 @@ static const struct inmemory_object *find_cached_object(struct odb_source_inmemo
 	return NULL;
 }
 
+static void populate_object_info(struct odb_source_inmemory *source,
+				 struct object_info *oi,
+				 const struct inmemory_object *object)
+{
+	if (!oi)
+		return;
+
+	if (oi->typep)
+		*(oi->typep) = object->type;
+	if (oi->sizep)
+		*(oi->sizep) = object->size;
+	if (oi->disk_sizep)
+		*(oi->disk_sizep) = 0;
+	if (oi->delta_base_oid)
+		oidclr(oi->delta_base_oid, source->base.odb->repo->hash_algo);
+	if (oi->contentp)
+		*oi->contentp = xmemdupz(object->buf, object->size);
+	if (oi->mtimep)
+		*oi->mtimep = 0;
+	oi->whence = OI_CACHED;
+}
+
 static int odb_source_inmemory_read_object_info(struct odb_source *source,
 						const struct object_id *oid,
 						struct object_info *oi,
@@ -45,22 +67,7 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
 	if (!object)
 		return -1;
 
-	if (oi) {
-		if (oi->typep)
-			*(oi->typep) = object->type;
-		if (oi->sizep)
-			*(oi->sizep) = object->size;
-		if (oi->disk_sizep)
-			*(oi->disk_sizep) = 0;
-		if (oi->delta_base_oid)
-			oidclr(oi->delta_base_oid, source->odb->repo->hash_algo);
-		if (oi->contentp)
-			*oi->contentp = xmemdupz(object->buf, object->size);
-		if (oi->mtimep)
-			*oi->mtimep = 0;
-		oi->whence = OI_CACHED;
-	}
-
+	populate_object_info(inmemory, oi, object);
 	return 0;
 }
 
@@ -114,6 +121,54 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
 	return 0;
 }
 
+struct odb_source_inmemory_for_each_object_data {
+	struct odb_source_inmemory *inmemory;
+	const struct object_info *request;
+	odb_for_each_object_cb cb;
+	void *cb_data;
+};
+
+static int odb_source_inmemory_for_each_object_cb(const struct object_id *oid,
+						  void *node_data, void *cb_data)
+{
+	struct odb_source_inmemory_for_each_object_data *data = cb_data;
+	struct inmemory_object *object = node_data;
+
+	if (data->request) {
+		struct object_info oi = *data->request;
+		populate_object_info(data->inmemory, &oi, object);
+		return data->cb(oid, &oi, data->cb_data);
+	} else {
+		return data->cb(oid, NULL, data->cb_data);
+	}
+}
+
+static int odb_source_inmemory_for_each_object(struct odb_source *source,
+					       const struct object_info *request,
+					       odb_for_each_object_cb cb,
+					       void *cb_data,
+					       const struct odb_for_each_object_options *opts)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	struct odb_source_inmemory_for_each_object_data payload = {
+		.inmemory = inmemory,
+		.request = request,
+		.cb = cb,
+		.cb_data = cb_data,
+	};
+	struct object_id null_oid = { 0 };
+
+	if ((opts->flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY) ||
+	    (opts->flags & ODB_FOR_EACH_OBJECT_LOCAL_ONLY && !source->local))
+		return 0;
+	if (!inmemory->objects)
+		return 0;
+
+	return oidtree_each(inmemory->objects,
+			    opts->prefix ? opts->prefix : &null_oid, opts->prefix_hex_len,
+			    odb_source_inmemory_for_each_object_cb, &payload);
+}
+
 static int odb_source_inmemory_write_object(struct odb_source *source,
 					    const void *buf, unsigned long len,
 					    enum object_type type,
@@ -219,6 +274,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.free = odb_source_inmemory_free;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
+	source->base.for_each_object = odb_source_inmemory_for_each_object;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty



* [PATCH v3 12/17] odb/source-inmemory: implement `find_abbrev_len()` callback
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (10 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 11/17] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 13/17] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt
                     ` (5 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `find_abbrev_len()` callback function for the in-memory
source.
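
For illustration only, and not part of this patch: the computation
boils down to "longest prefix shared with any other object, plus one".
The self-contained sketch below works on hex strings rather than on
`struct object_id`, and the helper names `common_prefix_hexlen()` and
`unique_abbrev_len()` are hypothetical. It scans all IDs, whereas the
patch only visits objects that already share the first `min_len`
characters; the result is the same because shorter matches never raise
the length.

```c
#include <assert.h>
#include <stddef.h>

/* Number of leading characters that two hex strings have in common. */
static size_t common_prefix_hexlen(const char *a, const char *b)
{
	size_t n = 0;

	assert(a && b);
	while (a[n] && a[n] == b[n])
		n++;
	return n;
}

/*
 * Shortest length >= min_len at which `id` cannot be confused with any
 * other entry in `ids`. Entries identical to `id` are ignored.
 */
static size_t unique_abbrev_len(const char *id, const char *const *ids,
				size_t nr, size_t min_len)
{
	size_t len = min_len;

	for (size_t i = 0; i < nr; i++) {
		size_t common = common_prefix_hexlen(id, ids[i]);

		if (!id[common] && !ids[i][common])
			continue; /* the very same ID */
		if (common >= len)
			len = common + 1;
	}
	return len;
}
```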

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index f60eecbdbb..44d9bbedec 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -169,6 +169,44 @@ static int odb_source_inmemory_for_each_object(struct odb_source *source,
 			    odb_source_inmemory_for_each_object_cb, &payload);
 }
 
+struct find_abbrev_len_data {
+	const struct object_id *oid;
+	unsigned len;
+};
+
+static int find_abbrev_len_cb(const struct object_id *oid,
+			      struct object_info *oi UNUSED,
+			      void *cb_data)
+{
+	struct find_abbrev_len_data *data = cb_data;
+	unsigned len = oid_common_prefix_hexlen(oid, data->oid);
+	if (len != hash_algos[oid->algo].hexsz && len >= data->len)
+		data->len = len + 1;
+	return 0;
+}
+
+static int odb_source_inmemory_find_abbrev_len(struct odb_source *source,
+					       const struct object_id *oid,
+					       unsigned min_len,
+					       unsigned *out)
+{
+	struct odb_for_each_object_options opts = {
+		.prefix = oid,
+		.prefix_hex_len = min_len,
+	};
+	struct find_abbrev_len_data data = {
+		.oid = oid,
+		.len = min_len,
+	};
+	int ret;
+
+	ret = odb_source_inmemory_for_each_object(source, NULL, find_abbrev_len_cb,
+						  &data, &opts);
+	*out = data.len;
+
+	return ret;
+}
+
 static int odb_source_inmemory_write_object(struct odb_source *source,
 					    const void *buf, unsigned long len,
 					    enum object_type type,
@@ -275,6 +313,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.for_each_object = odb_source_inmemory_for_each_object;
+	source->base.find_abbrev_len = odb_source_inmemory_find_abbrev_len;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty



* [PATCH v3 13/17] odb/source-inmemory: implement `count_objects()` callback
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (11 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 12/17] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 14/17] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt
                     ` (4 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `count_objects()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 44d9bbedec..674dbcad30 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -207,6 +207,25 @@ static int odb_source_inmemory_find_abbrev_len(struct odb_source *source,
 	return ret;
 }
 
+static int count_objects_cb(const struct object_id *oid UNUSED,
+			    struct object_info *oi UNUSED,
+			    void *cb_data)
+{
+	unsigned long *counter = cb_data;
+	(*counter)++;
+	return 0;
+}
+
+static int odb_source_inmemory_count_objects(struct odb_source *source,
+					     enum odb_count_objects_flags flags UNUSED,
+					     unsigned long *out)
+{
+	struct odb_for_each_object_options opts = { 0 };
+	*out = 0;
+	return odb_source_inmemory_for_each_object(source, NULL, count_objects_cb,
+						   out, &opts);
+}
+
 static int odb_source_inmemory_write_object(struct odb_source *source,
 					    const void *buf, unsigned long len,
 					    enum object_type type,
@@ -314,6 +333,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.for_each_object = odb_source_inmemory_for_each_object;
 	source->base.find_abbrev_len = odb_source_inmemory_find_abbrev_len;
+	source->base.count_objects = odb_source_inmemory_count_objects;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty



* [PATCH v3 14/17] odb/source-inmemory: implement `freshen_object()` callback
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (12 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 13/17] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 15/17] odb/source-inmemory: stub out remaining functions Patrick Steinhardt
                     ` (3 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Implement the `freshen_object()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 674dbcad30..8934e0f547 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -294,6 +294,15 @@ static int odb_source_inmemory_write_object_stream(struct odb_source *source,
 	return ret;
 }
 
+static int odb_source_inmemory_freshen_object(struct odb_source *source,
+					      const struct object_id *oid)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	if (find_cached_object(inmemory, oid))
+		return 1;
+	return 0;
+}
+
 static int inmemory_object_free(const struct object_id *oid UNUSED,
 				void *node_data,
 				void *cb_data UNUSED)
@@ -336,6 +345,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.count_objects = odb_source_inmemory_count_objects;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
+	source->base.freshen_object = odb_source_inmemory_freshen_object;
 
 	return source;
 }

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty



* [PATCH v3 15/17] odb/source-inmemory: stub out remaining functions
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (13 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 14/17] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 16/17] odb: generic in-memory source Patrick Steinhardt
                     ` (2 subsequent siblings)
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Stub out remaining functions that we either don't need or that are
basically no-ops.
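
For illustration only, and not part of this patch: the sketch below
shows the general pattern on a made-up, trimmed-down interface. The
`struct backend` type and the `mem_*()` functions are hypothetical.
Unsupported operations return -1 (the value that `error()` yields),
while trivial ones succeed without doing anything. Presumably the point
of filling in every slot is that callers can invoke any callback
without checking for NULL first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down backend interface. */
struct backend {
	int (*begin_transaction)(struct backend *b);
	int (*read_alternates)(struct backend *b, size_t *nr_out);
	void (*close)(struct backend *b);
};

/* Unsupported operation: report failure to the caller. */
static int mem_begin_transaction(struct backend *b)
{
	(void) b;
	return -1;
}

/* Supported but trivial: there are never any alternates. */
static int mem_read_alternates(struct backend *b, size_t *nr_out)
{
	(void) b;
	assert(nr_out);
	*nr_out = 0;
	return 0;
}

/* Nothing to release, so closing is a no-op. */
static void mem_close(struct backend *b)
{
	(void) b;
}

/* Fill in every slot so that no callback is left NULL. */
static void mem_backend_init(struct backend *b)
{
	assert(b);
	b->begin_transaction = mem_begin_transaction;
	b->read_alternates = mem_read_alternates;
	b->close = mem_close;
}
```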

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 8934e0f547..e004566d76 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -303,6 +303,32 @@ static int odb_source_inmemory_freshen_object(struct odb_source *source,
 	return 0;
 }
 
+static int odb_source_inmemory_begin_transaction(struct odb_source *source UNUSED,
+						 struct odb_transaction **out UNUSED)
+{
+	return error("in-memory source does not support transactions");
+}
+
+static int odb_source_inmemory_read_alternates(struct odb_source *source UNUSED,
+					       struct strvec *out UNUSED)
+{
+	return 0;
+}
+
+static int odb_source_inmemory_write_alternate(struct odb_source *source UNUSED,
+					       const char *alternate UNUSED)
+{
+	return error("in-memory source does not support alternates");
+}
+
+static void odb_source_inmemory_close(struct odb_source *source UNUSED)
+{
+}
+
+static void odb_source_inmemory_reprepare(struct odb_source *source UNUSED)
+{
+}
+
 static int inmemory_object_free(const struct object_id *oid UNUSED,
 				void *node_data,
 				void *cb_data UNUSED)
@@ -338,6 +364,8 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
 
 	source->base.free = odb_source_inmemory_free;
+	source->base.close = odb_source_inmemory_close;
+	source->base.reprepare = odb_source_inmemory_reprepare;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.for_each_object = odb_source_inmemory_for_each_object;
@@ -346,6 +374,9 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 	source->base.freshen_object = odb_source_inmemory_freshen_object;
+	source->base.begin_transaction = odb_source_inmemory_begin_transaction;
+	source->base.read_alternates = odb_source_inmemory_read_alternates;
+	source->base.write_alternate = odb_source_inmemory_write_alternate;
 
 	return source;
 }

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty



* [PATCH v3 16/17] odb: generic in-memory source
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (14 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 15/17] odb/source-inmemory: stub out remaining functions Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-10 12:12   ` [PATCH v3 17/17] t/unit-tests: add tests for the in-memory object source Patrick Steinhardt
  2026-04-14  8:27   ` [PATCH v3 00/17] odb: introduce "in-memory" source Karthik Nayak
  17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

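Store the in-memory source in `struct object_database` as a generic
`struct odb_source` pointer rather than as a `struct odb_source_inmemory`.
Callers can then pass the pointer straight to the `odb_source_*()`
functions instead of reaching into its `base` member.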
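
For illustration only, and not part of this patch: holding a pointer to
the embedded generic struct loses no information, because the backend
can recover its own struct from it. The self-contained sketch below uses
hypothetical, simplified types (`struct source`, `struct
source_inmemory`) and returns NULL on a type mismatch, whereas the
downcast helper in this series calls BUG() in that case.

```c
#include <assert.h>
#include <stddef.h>

enum source_type { SOURCE_FILES, SOURCE_INMEMORY };

/* Generic part that callers hold a pointer to. */
struct source {
	enum source_type type;
};

/* Backend-specific struct embedding the generic part. */
struct source_inmemory {
	struct source base;
	size_t nr_objects;
};

/* Recover the enclosing struct from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
	((type *) ((char *) (ptr) - offsetof(type, member)))

/* Downcast, checking the type tag first; NULL signals a mismatch. */
static struct source_inmemory *source_inmemory_downcast(struct source *s)
{
	assert(s);
	if (s->type != SOURCE_INMEMORY)
		return NULL;
	return container_of(s, struct source_inmemory, base);
}
```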

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c | 8 ++++----
 odb.h | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/odb.c b/odb.c
index 24e929f03c..965ef68e4e 100644
--- a/odb.c
+++ b/odb.c
@@ -560,7 +560,7 @@ static int do_oid_object_info_extended(struct object_database *odb,
 	if (is_null_oid(real))
 		return -1;
 
-	if (!odb_source_read_object_info(&odb->inmemory_objects->base, oid, oi, flags))
+	if (!odb_source_read_object_info(odb->inmemory_objects, oid, oi, flags))
 		return 0;
 
 	odb_prepare_alternates(odb);
@@ -737,7 +737,7 @@ int odb_pretend_object(struct object_database *odb,
 	if (odb_has_object(odb, oid, 0))
 		return 0;
 
-	return odb_source_write_object(&odb->inmemory_objects->base,
+	return odb_source_write_object(odb->inmemory_objects,
 				       buf, len, type, oid, NULL, 0);
 }
 
@@ -1020,7 +1020,7 @@ struct object_database *odb_new(struct repository *repo,
 	o->sources = odb_source_new(o, primary_source, true);
 	o->sources_tail = &o->sources->next;
 	o->alternate_db = xstrdup_or_null(secondary_sources);
-	o->inmemory_objects = odb_source_inmemory_new(o);
+	o->inmemory_objects = &odb_source_inmemory_new(o)->base;
 
 	free(to_free);
 
@@ -1045,7 +1045,7 @@ static void odb_free_sources(struct object_database *o)
 		o->sources = next;
 	}
 
-	odb_source_free(&o->inmemory_objects->base);
+	odb_source_free(o->inmemory_objects);
 	o->inmemory_objects = NULL;
 
 	kh_destroy_odb_path_map(o->source_by_path);
diff --git a/odb.h b/odb.h
index c3a7edf9c8..73553ed5a7 100644
--- a/odb.h
+++ b/odb.h
@@ -81,7 +81,7 @@ struct object_database {
 	 * to write them into the object store (e.g. a browse-only
 	 * application).
 	 */
-	struct odb_source_inmemory *inmemory_objects;
+	struct odb_source *inmemory_objects;
 
 	/*
 	 * A fast, rough count of the number of objects in the repository.

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty



* [PATCH v3 17/17] t/unit-tests: add tests for the in-memory object source
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (15 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 16/17] odb: generic in-memory source Patrick Steinhardt
@ 2026-04-10 12:12   ` Patrick Steinhardt
  2026-04-14  8:45     ` Karthik Nayak
  2026-04-14  8:27   ` [PATCH v3 00/17] odb: introduce "in-memory" source Karthik Nayak
  17 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

While the in-memory object source is a full-fledged source, our code
base only exercises parts of its functionality because we only use it in
git-blame(1). Implement unit tests to verify that the yet-unused
functionality of the backend works as expected.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Makefile                      |   1 +
 t/meson.build                 |   1 +
 t/unit-tests/u-odb-inmemory.c | 313 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 315 insertions(+)

diff --git a/Makefile b/Makefile
index 3cda12c455..68b4daa1ad 100644
--- a/Makefile
+++ b/Makefile
@@ -1529,6 +1529,7 @@ CLAR_TEST_SUITES += u-hash
 CLAR_TEST_SUITES += u-hashmap
 CLAR_TEST_SUITES += u-list-objects-filter-options
 CLAR_TEST_SUITES += u-mem-pool
+CLAR_TEST_SUITES += u-odb-inmemory
 CLAR_TEST_SUITES += u-oid-array
 CLAR_TEST_SUITES += u-oidmap
 CLAR_TEST_SUITES += u-oidtree
diff --git a/t/meson.build b/t/meson.build
index 7528e5cda5..db5e01c49b 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -6,6 +6,7 @@ clar_test_suites = [
   'unit-tests/u-hashmap.c',
   'unit-tests/u-list-objects-filter-options.c',
   'unit-tests/u-mem-pool.c',
+  'unit-tests/u-odb-inmemory.c',
   'unit-tests/u-oid-array.c',
   'unit-tests/u-oidmap.c',
   'unit-tests/u-oidtree.c',
diff --git a/t/unit-tests/u-odb-inmemory.c b/t/unit-tests/u-odb-inmemory.c
new file mode 100644
index 0000000000..482502ef4b
--- /dev/null
+++ b/t/unit-tests/u-odb-inmemory.c
@@ -0,0 +1,313 @@
+#include "unit-test.h"
+#include "hex.h"
+#include "odb/source-inmemory.h"
+#include "odb/streaming.h"
+#include "oidset.h"
+#include "repository.h"
+#include "strbuf.h"
+
+#define RANDOM_OID "da39a3ee5e6b4b0d3255bfef95601890afd80709"
+#define FOOBAR_OID "f6ea0495187600e7b2288c8ac19c5886383a4632"
+
+static struct repository repo = {
+	.hash_algo = &hash_algos[GIT_HASH_SHA1],
+};
+static struct object_database *odb;
+
+static void cl_assert_object_info(struct odb_source_inmemory *source,
+				  const struct object_id *oid,
+				  enum object_type expected_type,
+				  const char *expected_content)
+{
+	enum object_type actual_type;
+	unsigned long actual_size;
+	void *actual_content;
+	struct object_info oi = {
+		.typep = &actual_type,
+		.sizep = &actual_size,
+		.contentp = &actual_content,
+	};
+
+	cl_must_pass(odb_source_read_object_info(&source->base, oid, &oi, 0));
+	cl_assert_equal_u(actual_size, strlen(expected_content));
+	cl_assert_equal_u(actual_type, expected_type);
+	cl_assert_equal_s((char *) actual_content, expected_content);
+
+	free(actual_content);
+}
+
+void test_odb_inmemory__initialize(void)
+{
+	odb = odb_new(&repo, "", "");
+}
+
+void test_odb_inmemory__cleanup(void)
+{
+	odb_free(odb);
+}
+
+void test_odb_inmemory__new(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	cl_assert_equal_i(source->base.type, ODB_SOURCE_INMEMORY);
+	odb_source_free(&source->base);
+}
+
+void test_odb_inmemory__read_missing_object(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	struct object_id oid;
+	const char *end;
+
+	cl_must_pass(parse_oid_hex_algop(RANDOM_OID, &oid, &end, repo.hash_algo));
+	cl_must_fail(odb_source_read_object_info(&source->base, &oid, NULL, 0));
+
+	odb_source_free(&source->base);
+}
+
+void test_odb_inmemory__read_empty_tree(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	cl_assert_object_info(source, repo.hash_algo->empty_tree, OBJ_TREE, "");
+	odb_source_free(&source->base);
+}
+
+void test_odb_inmemory__read_written_object(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	const char data[] = "foobar";
+	struct object_id written_oid;
+
+	cl_must_pass(odb_source_write_object(&source->base, data, strlen(data),
+					     OBJ_BLOB, &written_oid, NULL, 0));
+	cl_assert_equal_s(oid_to_hex(&written_oid), FOOBAR_OID);
+	cl_assert_object_info(source, &written_oid, OBJ_BLOB, "foobar");
+
+	odb_source_free(&source->base);
+}
+
+void test_odb_inmemory__read_stream_object(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	struct odb_read_stream *stream;
+	struct object_id written_oid;
+	const char data[] = "foobar";
+	char buf[3] = { 0 };
+
+	cl_must_pass(odb_source_write_object(&source->base, data, strlen(data),
+					     OBJ_BLOB, &written_oid, NULL, 0));
+
+	cl_must_pass(odb_source_read_object_stream(&stream, &source->base,
+						   &written_oid));
+	cl_assert_equal_i(stream->type, OBJ_BLOB);
+	cl_assert_equal_u(stream->size, 6);
+
+	cl_assert_equal_i(odb_read_stream_read(stream, buf, 2), 2);
+	cl_assert_equal_s(buf, "fo");
+	cl_assert_equal_i(odb_read_stream_read(stream, buf, 2), 2);
+	cl_assert_equal_s(buf, "ob");
+	cl_assert_equal_i(odb_read_stream_read(stream, buf, 2), 2);
+	cl_assert_equal_s(buf, "ar");
+	cl_assert_equal_i(odb_read_stream_read(stream, buf, 2), 0);
+
+	odb_read_stream_close(stream);
+	odb_source_free(&source->base);
+}
+
+static int add_one_object(const struct object_id *oid,
+			  struct object_info *oi UNUSED,
+			  void *payload)
+{
+	struct oidset *actual_oids = payload;
+	cl_must_pass(oidset_insert(actual_oids, oid));
+	return 0;
+}
+
+void test_odb_inmemory__for_each_object(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	struct odb_for_each_object_options opts = { 0 };
+	struct oidset expected_oids = OIDSET_INIT;
+	struct oidset actual_oids = OIDSET_INIT;
+	struct strbuf buf = STRBUF_INIT;
+
+	cl_must_pass(odb_source_for_each_object(&source->base, NULL,
+						add_one_object, &actual_oids, &opts));
+	cl_assert_equal_u(oidset_size(&actual_oids), 0);
+
+	for (int i = 0; i < 10; i++) {
+		struct object_id written_oid;
+
+		strbuf_reset(&buf);
+		strbuf_addf(&buf, "%d", i);
+
+		cl_must_pass(odb_source_write_object(&source->base, buf.buf, buf.len,
+						     OBJ_BLOB, &written_oid, NULL, 0));
+		cl_must_pass(oidset_insert(&expected_oids, &written_oid));
+	}
+
+	cl_must_pass(odb_source_for_each_object(&source->base, NULL,
+						add_one_object, &actual_oids, &opts));
+	cl_assert_equal_b(oidset_equal(&expected_oids, &actual_oids), true);
+
+	odb_source_free(&source->base);
+	oidset_clear(&expected_oids);
+	oidset_clear(&actual_oids);
+	strbuf_release(&buf);
+}
+
+static int abort_after_two_objects(const struct object_id *oid UNUSED,
+				   struct object_info *oi UNUSED,
+				   void *payload)
+{
+	unsigned *counter = payload;
+	(*counter)++;
+	if (*counter == 2)
+		return 123;
+	return 0;
+}
+
+void test_odb_inmemory__for_each_object_can_abort_iteration(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	struct odb_for_each_object_options opts = { 0 };
+	struct object_id written_oid;
+	unsigned counter = 0;
+
+	cl_must_pass(odb_source_write_object(&source->base, "1", 1,
+					     OBJ_BLOB, &written_oid, NULL, 0));
+	cl_must_pass(odb_source_write_object(&source->base, "2", 1,
+					     OBJ_BLOB, &written_oid, NULL, 0));
+	cl_must_pass(odb_source_write_object(&source->base, "3", 1,
+					     OBJ_BLOB, &written_oid, NULL, 0));
+
+	cl_assert_equal_i(odb_source_for_each_object(&source->base, NULL,
+						     abort_after_two_objects,
+						     &counter, &opts),
+			  123);
+	cl_assert_equal_u(counter, 2);
+
+	odb_source_free(&source->base);
+}
+
+void test_odb_inmemory__count_objects(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	struct object_id written_oid;
+	unsigned long count;
+
+	cl_must_pass(odb_source_count_objects(&source->base, 0, &count));
+	cl_assert_equal_u(count, 0);
+
+	cl_must_pass(odb_source_write_object(&source->base, "1", 1,
+					     OBJ_BLOB, &written_oid, NULL, 0));
+	cl_must_pass(odb_source_write_object(&source->base, "2", 1,
+					     OBJ_BLOB, &written_oid, NULL, 0));
+	cl_must_pass(odb_source_write_object(&source->base, "3", 1,
+					     OBJ_BLOB, &written_oid, NULL, 0));
+
+	cl_must_pass(odb_source_count_objects(&source->base, 0, &count));
+	cl_assert_equal_u(count, 3);
+
+	odb_source_free(&source->base);
+}
+
+void test_odb_inmemory__find_abbrev_len(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	struct object_id oid1, oid2;
+	unsigned abbrev_len;
+
+	/*
+	 * The two blobs we're about to write share the first 10 hex characters
+	 * of their object IDs ("a09f43dc45"), so at least 11 characters are
+	 * needed to tell them apart:
+	 *
+	 *   "368317" -> a09f43dc4562d45115583f5094640ae237df55f7
+	 *   "514796" -> a09f43dc45fef837235eb7e6b1a6ca5e169a3981
+	 *
+	 * With only one blob written we expect a length of 4.
+	 */
+	cl_must_pass(odb_source_write_object(&source->base, "368317", strlen("368317"),
+					     OBJ_BLOB, &oid1, NULL, 0));
+	cl_must_pass(odb_source_find_abbrev_len(&source->base, &oid1, 4,
+						&abbrev_len));
+	cl_assert_equal_u(abbrev_len, 4);
+
+	/*
+	 * With both objects present, the shared 10-character prefix means we
+	 * need at least 11 characters to uniquely identify either object.
+	 */
+	cl_must_pass(odb_source_write_object(&source->base, "514796", strlen("514796"),
+					     OBJ_BLOB, &oid2, NULL, 0));
+	cl_must_pass(odb_source_find_abbrev_len(&source->base, &oid1, 4,
+						&abbrev_len));
+	cl_assert_equal_u(abbrev_len, 11);
+
+	odb_source_free(&source->base);
+}
+
+void test_odb_inmemory__freshen_object(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	struct object_id written_oid;
+	struct object_id oid;
+	const char *end;
+
+	cl_must_pass(parse_oid_hex_algop(RANDOM_OID, &oid, &end, repo.hash_algo));
+	cl_assert_equal_i(odb_source_freshen_object(&source->base, &oid), 0);
+
+	cl_must_pass(odb_source_write_object(&source->base, "foobar",
+					     strlen("foobar"), OBJ_BLOB,
+					     &written_oid, NULL, 0));
+	cl_assert_equal_i(odb_source_freshen_object(&source->base,
+						    &written_oid), 1);
+
+	odb_source_free(&source->base);
+}
+
+struct membuf_write_stream {
+	struct odb_write_stream base;
+	const char *buf;
+	size_t offset;
+	size_t size;
+};
+
+static ssize_t membuf_write_stream_read(struct odb_write_stream *stream,
+					unsigned char *buf, size_t len)
+{
+	struct membuf_write_stream *s = container_of(stream, struct membuf_write_stream, base);
+	size_t chunk_size = 2;
+
+	if (chunk_size > len)
+		chunk_size = len;
+	if (chunk_size > s->size - s->offset)
+		chunk_size = s->size - s->offset;
+
+	memcpy(buf, s->buf + s->offset, chunk_size);
+
+	s->offset += chunk_size;
+	if (s->offset == s->size)
+		s->base.is_finished = 1;
+
+	return chunk_size;
+}
+
+void test_odb_inmemory__write_object_stream(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	const char data[] = "foobar";
+	struct membuf_write_stream stream = {
+		.base.read = membuf_write_stream_read,
+		.buf = data,
+		.size = strlen(data),
+	};
+	struct object_id written_oid;
+
+	cl_must_pass(odb_source_write_object_stream(&source->base, &stream.base,
+						    strlen(data), &written_oid));
+	cl_assert_equal_s(oid_to_hex(&written_oid), FOOBAR_OID);
+	cl_assert_object_info(source, &written_oid, OBJ_BLOB, "foobar");
+
+	odb_source_free(&source->base);
+}

-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty



* Re: [PATCH v3 00/17] odb: introduce "in-memory" source
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
                     ` (16 preceding siblings ...)
  2026-04-10 12:12   ` [PATCH v3 17/17] t/unit-tests: add tests for the in-memory object source Patrick Steinhardt
@ 2026-04-14  8:27   ` Karthik Nayak
  17 siblings, 0 replies; 85+ messages in thread
From: Karthik Nayak @ 2026-04-14  8:27 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler


Patrick Steinhardt <ps@pks.im> writes:
[snip]

> Range-diff versus v2:
>
>  1:  b18e427c69 !  1:  155b2cdf81 odb: introduce "in-memory" source
>     @@ odb/source-inmemory.h (new)
>      +struct cached_object_entry;
>      +
>      +/*
>     -+ * An inmemory source that you can write objects to that shall be made
>     ++ * An in-memory source that you can write objects to that shall be made
>      + * available for reading, but that shouldn't ever be persisted to disk. Note
>      + * that any objects written to this source will be stored in memory, so the
>      + * number of objects you can store is limited by available system memory.
>     @@ odb/source-inmemory.h (new)
>      +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb);
>      +
>      +/*
>     -+ * Cast the given object database source to the inmemory backend. This will
>     ++ * Cast the given object database source to the in-memory backend. This will
>      + * cause a BUG in case the source doesn't use this backend.
>      + */
>      +static inline struct odb_source_inmemory *odb_source_inmemory_downcast(struct odb_source *source)
>      +{
>      +	if (source->type != ODB_SOURCE_INMEMORY)
>     -+		BUG("trying to downcast source of type '%d' to inmemory", source->type);
>     ++		BUG("trying to downcast source of type '%d' to in-memory", source->type);
>      +	return container_of(source, struct odb_source_inmemory, base);
>      +}
>      +
>     @@ odb/source.h: enum odb_source_type {
>       	/* The "files" backend that uses loose objects and packfiles. */
>       	ODB_SOURCE_FILES,
>      +
>     -+	/* The "inmemory" backend that stores objects in memory. */
>     ++	/* The "in-memory" backend that stores objects in memory. */
>      +	ODB_SOURCE_INMEMORY,
>       };
>
>  2:  8fd337da90 !  2:  c66edd10a8 odb/source-inmemory: implement `free()` callback
>     @@ odb/source-inmemory.h
>      +};
>
>       /*
>     -  * An inmemory source that you can write objects to that shall be made
>     +  * An in-memory source that you can write objects to that shall be made
>  3:  f4ae2a2bde =  3:  a86549f39c odb: fix unnecessary call to `find_cached_object()`
>  4:  8600b88530 =  4:  49ac739dd2 odb/source-inmemory: implement `read_object_info()` callback
>  5:  ab33c0b7ee !  5:  321ef11be3 odb/source-inmemory: implement `read_object_stream()` callback
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_info(struct od
>
>      +struct odb_read_stream_inmemory {
>      +	struct odb_read_stream base;
>     -+	const void *buf;
>     ++	const unsigned char *buf;

Okay, this does make more sense.

>      +	size_t offset;
>      +};
>      +
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_info(struct od
>      +
>      +	if (buf_len > inmemory->base.size - inmemory->offset)
>      +		bytes = inmemory->base.size - inmemory->offset;
>     -+	memcpy(buf, inmemory->buf, bytes);
>     ++
>     ++	memcpy(buf, inmemory->buf + inmemory->offset, bytes);
>     ++	inmemory->offset += bytes;

Now we also read from the current offset and advance it after each read.
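
For readers following along, here is a minimal standalone sketch of the
pattern the fixed hunk follows. It is not Git's code: `struct mem_stream`
and `mem_stream_read()` are hypothetical names made up for illustration,
and the sketch assumes the requested length is clamped to the bytes that
remain before copying.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the stream state; not Git's actual struct. */
struct mem_stream {
	const unsigned char *buf; /* backing object data */
	size_t size;              /* total number of bytes in buf */
	size_t offset;            /* number of bytes already handed out */
};

/* Copy up to buf_len bytes into out and return how many were copied. */
static size_t mem_stream_read(struct mem_stream *s, void *out, size_t buf_len)
{
	size_t remaining = s->size - s->offset;
	size_t bytes = buf_len < remaining ? buf_len : remaining;

	/* Read from the current position, not from the start of the buffer. */
	memcpy(out, s->buf + s->offset, bytes);
	/* Advance so that the next call continues where this one stopped. */
	s->offset += bytes;
	return bytes;
}
```

Without the last two steps every call would return the first bytes of
the object again, which is what the v2 hunk did.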

>      +
>      +	return bytes;
>      +}
>  6:  983f886eeb !  6:  506df5e488 odb/source-inmemory: implement `write_object()` callback
>     @@ odb.c: int odb_pretend_object(struct object_database *odb,
>       void *odb_read_object(struct object_database *odb,
>
>       ## odb/source-inmemory.c ##
>     +@@
>     + #include "git-compat-util.h"
>     ++#include "object-file.h"
>     + #include "odb.h"
>     + #include "odb/source-inmemory.h"
>     + #include "odb/streaming.h"
>      @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
>       	return 0;
>       }
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_stream(struct
>      +	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
>      +	struct cached_object_entry *object;
>      +
>     ++	hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid);
>     ++
>      +	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
>      +		   inmemory->objects_alloc);
>      +	object = &inmemory->objects[inmemory->objects_nr++];
>  7:  68edefa269 <  -:  ---------- odb/source-inmemory: implement `write_object()` callback
>  8:  18d451152b !  7:  21eef34c1b odb/source-inmemory: implement `write_object_stream()` callback
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_write_object(struct odb_so
>      +			goto out;
>      +		}
>      +
>     -+		memcpy(data, buf, bytes_read);
>     ++		memcpy(data + total_read, buf, bytes_read);
>      +		total_read += bytes_read;
>      +	}
>      +
>  9:  cee53b9853 !  8:  504e34d116 cbtree: allow using arbitrary wrapper structures for nodes
>     @@ cbtree.c: int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen,
>
>
>       ## cbtree.h ##
>     +@@
>     +  *
>     +  * This is adapted to store arbitrary data (not just NUL-terminated C strings
>     +  * and allocates no memory internally.  The user needs to allocate
>     +- * "struct cb_node" and fill cb_node.k[] with arbitrary match data
>     +- * for memcmp.
>     +- * If "klen" is variable, then it should be embedded into "c_node.k[]"
>     ++ * "struct cb_node" and provide `key_offset` to indicate where the key can be
>     ++ * found relative to the `struct cb_node` for memcmp.
>     ++ * If "klen" is variable, then it should be embedded into the key.
>     +  * Recursion is bound by the maximum value of "klen" used.
>     +  */

We also fix up the comments here.
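
As a rough illustration of what the updated comment describes, the
sketch below locates a key at a fixed offset relative to an embedded
node and compares it with memcmp. All names (`demo_node`, `demo_entry`,
`DEMO_KEY_OFFSET`, `demo_node_key()`, `demo_key_matches()`) are
hypothetical, and how the offset is computed is my assumption; the
actual cbtree interface in the patch may look different.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for `struct cb_node`; only serves as an anchor. */
struct demo_node {
	uint32_t byte;
};

/* Hypothetical wrapper: payload, embedded node, and the key kept elsewhere. */
struct demo_entry {
	int payload;
	struct demo_node node;
	uint8_t key[4];
};

/* Offset of the key relative to the embedded node (assumed for this sketch). */
#define DEMO_KEY_OFFSET \
	((ptrdiff_t)offsetof(struct demo_entry, key) - \
	 (ptrdiff_t)offsetof(struct demo_entry, node))

/* Locate the key bytes given only a pointer to the embedded node. */
static const uint8_t *demo_node_key(const struct demo_node *n, ptrdiff_t key_offset)
{
	return (const uint8_t *)n + key_offset;
}

/* Compare a node's key against a lookup key with memcmp. */
static int demo_key_matches(const struct demo_node *n, ptrdiff_t key_offset,
			    const uint8_t *k, size_t klen)
{
	return !memcmp(demo_node_key(n, key_offset), k, klen);
}
```

The point of the indirection is that the tree only ever sees the node
pointer, while the wrapper structure is free to place the key wherever
it likes.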

>     + #ifndef CBTREE_H
>      @@ cbtree.h: struct cb_node {
>       	 */
>       	uint32_t byte;
> 10:  8ad5b81b13 =  9:  9bdd475a92 oidtree: add ability to store data
> 11:  1ed2d23137 ! 10:  956b989529 odb/source-inmemory: convert to use oidtree
>     @@ odb/source-inmemory.h
>      +struct oidtree;
>
>       /*
>     -  * An inmemory source that you can write objects to that shall be made
>     +  * An in-memory source that you can write objects to that shall be made
>      @@ odb/source-inmemory.h: struct cached_object_entry {
>        */
>       struct odb_source_inmemory {
> 12:  99fbb1cc35 ! 11:  bec1428116 odb/source-inmemory: implement `for_each_object()` callback
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_stream(struct
>      +	if ((opts->flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY) ||
>      +	    (opts->flags & ODB_FOR_EACH_OBJECT_LOCAL_ONLY && !source->local))
>      +		return 0;
>     ++	if (!inmemory->objects)
>     ++		return 0;
>      +
>      +	return oidtree_each(inmemory->objects,
>      +			    opts->prefix ? opts->prefix : &null_oid, opts->prefix_hex_len,
> 13:  c87a621f39 = 12:  32dada3c27 odb/source-inmemory: implement `find_abbrev_len()` callback
> 14:  9b88f0c07b = 13:  43127840c0 odb/source-inmemory: implement `count_objects()` callback
> 15:  3c9493f2bb = 14:  439acbd068 odb/source-inmemory: implement `freshen_object()` callback
> 16:  f2b6317104 ! 15:  12c1b6ffd2 odb/source-inmemory: stub out remaining functions
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_freshen_object(struct odb_
>      +static int odb_source_inmemory_begin_transaction(struct odb_source *source UNUSED,
>      +						 struct odb_transaction **out UNUSED)
>      +{
>     -+	return error("inmemory source does not support transactions");
>     ++	return error("in-memory source does not support transactions");
>      +}
>      +
>      +static int odb_source_inmemory_read_alternates(struct odb_source *source UNUSED,
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_freshen_object(struct odb_
>      +static int odb_source_inmemory_write_alternate(struct odb_source *source UNUSED,
>      +					       const char *alternate UNUSED)
>      +{
>     -+	return error("inmemory source does not support alternates");
>     ++	return error("in-memory source does not support alternates");
>      +}
>      +
>      +static void odb_source_inmemory_close(struct odb_source *source UNUSED)
> 17:  81da5d5048 = 16:  ef37a61e7f odb: generic in-memory source
>  -:  ---------- > 17:  51b51e0382 t/unit-tests: add tests for the in-memory object source

The range-diff looks good. I'll have a look at the unit test patch
separately. Thanks.


* Re: [PATCH v3 17/17] t/unit-tests: add tests for the in-memory object source
  2026-04-10 12:12   ` [PATCH v3 17/17] t/unit-tests: add tests for the in-memory object source Patrick Steinhardt
@ 2026-04-14  8:45     ` Karthik Nayak
  0 siblings, 0 replies; 85+ messages in thread
From: Karthik Nayak @ 2026-04-14  8:45 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> While the in-memory object source is a full-fledged source, our code
> base only exercises parts of its functionality because we only use it in
> git-blame(1). Implement unit tests to verify that the yet-unused
> functionality of the backend works as expected.
>

This patch seems extensive and good!

Overall I'm happy with this version.


Thread overview: 85+ messages
2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
2026-04-03  6:01 ` [PATCH 01/16] " Patrick Steinhardt
2026-04-08 21:00   ` Justin Tobler
2026-04-09  5:22     ` Patrick Steinhardt
2026-04-03  6:01 ` [PATCH 02/16] odb/source-inmemory: implement `free()` callback Patrick Steinhardt
2026-04-08 21:05   ` Justin Tobler
2026-04-03  6:01 ` [PATCH 03/16] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt
2026-04-08 21:13   ` Justin Tobler
2026-04-09  5:22     ` Patrick Steinhardt
2026-04-03  6:01 ` [PATCH 04/16] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt
2026-04-03  6:01 ` [PATCH 05/16] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
2026-04-08 21:24   ` Justin Tobler
2026-04-09  5:22     ` Patrick Steinhardt
2026-04-03  6:01 ` [PATCH 06/16] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt
2026-04-03  6:01 ` [PATCH 07/16] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt
2026-04-03 22:11   ` Junio C Hamano
2026-04-08  8:22     ` Patrick Steinhardt
2026-04-03  6:01 ` [PATCH 08/16] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
2026-04-03  6:01 ` [PATCH 09/16] oidtree: add ability to store data Patrick Steinhardt
2026-04-03  6:01 ` [PATCH 10/16] odb/source-inmemory: convert to use oidtree Patrick Steinhardt
2026-04-03  6:01 ` [PATCH 11/16] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt
2026-04-03  6:01 ` [PATCH 12/16] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt
2026-04-03  6:02 ` [PATCH 13/16] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt
2026-04-03  6:02 ` [PATCH 14/16] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt
2026-04-03  6:02 ` [PATCH 15/16] odb/source-inmemory: stub out remaining functions Patrick Steinhardt
2026-04-03  6:02 ` [PATCH 16/16] odb: generic inmemory source Patrick Steinhardt
2026-04-03 15:41 ` [PATCH 00/16] odb: introduce "inmemory" source Junio C Hamano
2026-04-08  8:22   ` Patrick Steinhardt
2026-04-08 21:48     ` Junio C Hamano
2026-04-09  5:22       ` Patrick Steinhardt
2026-04-09 13:46         ` Junio C Hamano
2026-04-10  4:53           ` Patrick Steinhardt
2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 01/17] " Patrick Steinhardt
2026-04-09  9:26     ` Karthik Nayak
2026-04-09 10:41       ` Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 02/17] odb/source-inmemory: implement `free()` callback Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 03/17] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt
2026-04-09  9:40     ` Karthik Nayak
2026-04-09 10:41       ` Patrick Steinhardt
2026-04-09 11:22         ` Karthik Nayak
2026-04-09  7:24   ` [PATCH v2 05/17] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
2026-04-09  9:49     ` Karthik Nayak
2026-04-09 10:41       ` Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 06/17] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 07/17] " Patrick Steinhardt
2026-04-09 10:27     ` Karthik Nayak
2026-04-09 10:41       ` Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 08/17] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 09/17] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
2026-04-09 11:36     ` Karthik Nayak
2026-04-09 11:46       ` Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 10/17] oidtree: add ability to store data Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 11/17] odb/source-inmemory: convert to use oidtree Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 12/17] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 13/17] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 14/17] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 15/17] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 16/17] odb/source-inmemory: stub out remaining functions Patrick Steinhardt
2026-04-09 19:39     ` Junio C Hamano
2026-04-10  4:53       ` Patrick Steinhardt
2026-04-09  7:24   ` [PATCH v2 17/17] odb: generic in-memory source Patrick Steinhardt
2026-04-09 11:44   ` [PATCH v2 00/17] odb: introduce "in-memory" source Karthik Nayak
2026-04-09 11:48     ` Patrick Steinhardt
2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 01/17] " Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 02/17] odb/source-inmemory: implement `free()` callback Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 03/17] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 04/17] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 05/17] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 06/17] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 07/17] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 08/17] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 09/17] oidtree: add ability to store data Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 10/17] odb/source-inmemory: convert to use oidtree Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 11/17] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 12/17] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 13/17] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 14/17] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 15/17] odb/source-inmemory: stub out remaining functions Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 16/17] odb: generic in-memory source Patrick Steinhardt
2026-04-10 12:12   ` [PATCH v3 17/17] t/unit-tests: add tests for the in-memory object source Patrick Steinhardt
2026-04-14  8:45     ` Karthik Nayak
2026-04-14  8:27   ` [PATCH v3 00/17] odb: introduce "in-memory" source Karthik Nayak