* [PATCH 00/16] odb: introduce "inmemory" source
@ 2026-04-03 6:01 Patrick Steinhardt
2026-04-03 6:01 ` [PATCH 01/16] " Patrick Steinhardt
` (18 more replies)
0 siblings, 19 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03 6:01 UTC (permalink / raw)
To: git
Hi,
this patch series introduces the second object database source type,
which is the "inmemory" source.
This source may seem somewhat odd at first: it always starts out empty,
and any object written into it will only exist in memory until the
process exits. But the source already serves a purpose in our codebase,
where some commands, for example git-blame(1), write an in-memory
worktree commit.
Furthermore, I think that going forward it can serve more purposes as we
now have an easy way to write and read objects that will not get
persisted. I could see that this may be useful when for example
re-merging diffs. But eventually, once we have the object storage format
extension wired up, callers might even want to manually set up an
in-memory database as the primary ODB for write operations so that no
data will be persisted in an arbitrary write.
Last but not least, this patch series also serves the purpose of
eventually getting rid of the `struct object_info::whence` member.
Instead, we'll simply yield the ODB source a specific object has been
read from, together with some backend-specific data, which gives
strictly more information compared to the status quo.
The series is based on cf2139f8e1 (The 24th batch, 2026-04-01) with
ps/odb-cleanup at 109bcb7d1d (odb: drop unneeded headers and forward
decls, 2026-04-01) merged into it.
Thanks!
Patrick
---
Patrick Steinhardt (16):
odb: introduce "inmemory" source
odb/source-inmemory: implement `free()` callback
odb: fix unnecessary call to `find_cached_object()`
odb/source-inmemory: implement `read_object_info()` callback
odb/source-inmemory: implement `read_object_stream()` callback
odb/source-inmemory: implement `write_object()` callback
odb/source-inmemory: implement `write_object_stream()` callback
cbtree: allow using arbitrary wrapper structures for nodes
oidtree: add ability to store data
odb/source-inmemory: convert to use oidtree
odb/source-inmemory: implement `for_each_object()` callback
odb/source-inmemory: implement `find_abbrev_len()` callback
odb/source-inmemory: implement `count_objects()` callback
odb/source-inmemory: implement `freshen_object()` callback
odb/source-inmemory: stub out remaining functions
odb: generic inmemory source
Makefile | 1 +
cbtree.c | 25 +++-
cbtree.h | 11 +-
loose.c | 2 +-
meson.build | 1 +
object-file.c | 3 +-
odb.c | 82 ++---------
odb.h | 4 +-
odb/source-inmemory.c | 375 +++++++++++++++++++++++++++++++++++++++++++++++
odb/source-inmemory.h | 33 +++++
odb/source.h | 3 +
oidtree.c | 66 ++++++---
oidtree.h | 12 +-
t/unit-tests/u-oidtree.c | 26 +++-
14 files changed, 529 insertions(+), 115 deletions(-)
---
base-commit: 3d05c3e2906489caa9f12f0af18dc233a6b8032c
change-id: 20260401-b4-pks-odb-source-inmemory-7b17c83d9e43
^ permalink raw reply [flat|nested] 85+ messages in thread

* [PATCH 01/16] odb: introduce "inmemory" source
2026-04-03 6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
@ 2026-04-03 6:01 ` Patrick Steinhardt
2026-04-08 21:00 ` Justin Tobler
2026-04-03 6:01 ` [PATCH 02/16] odb/source-inmemory: implement `free()` callback Patrick Steinhardt
` (17 subsequent siblings)
18 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03 6:01 UTC (permalink / raw)
To: git

Next to our typical object database sources, each object database also
has an implicit source of "cached" objects. These cached objects only
exist in memory and have some use cases:

  - They contain evergreen objects that we expect to always exist, like
    for example the empty tree.

  - They can be used to store temporary objects that we don't want to
    persist to disk.

Overall, their use is somewhat restricted though. For example, we don't
provide the ability to use it as a temporary object database source that
allows the user to write objects, but discard them after Git exits. So
while these cached objects behave almost like a source, they aren't used
as one.

This is about to change over the following commits, where we will turn
cached objects into a new "inmemory" source. This will allow us to use
it exactly the same as any other source by providing the same common
interface as the "files" source.

For now, the inmemory source only hosts the cached objects and doesn't
provide any logic yet. This will change with subsequent commits, where
we move respective functionality into the source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Makefile              |  1 +
 meson.build           |  1 +
 odb.c                 | 21 +++++++++++++--------
 odb.h                 |  4 ++--
 odb/source-inmemory.c | 12 ++++++++++++
 odb/source-inmemory.h | 35 +++++++++++++++++++++++++++++++++++
 odb/source.h          |  3 +++
 7 files changed, 67 insertions(+), 10 deletions(-)

diff --git a/Makefile b/Makefile
index dbf0022054..175391e6f8 100644
--- a/Makefile
+++ b/Makefile
@@ -1218,6 +1218,7 @@ LIB_OBJS += object.o
 LIB_OBJS += odb.o
 LIB_OBJS += odb/source.o
 LIB_OBJS += odb/source-files.o
+LIB_OBJS += odb/source-inmemory.o
 LIB_OBJS += odb/streaming.o
 LIB_OBJS += oid-array.o
 LIB_OBJS += oidmap.o
diff --git a/meson.build b/meson.build
index 8309942d18..8f55d2650e 100644
--- a/meson.build
+++ b/meson.build
@@ -404,6 +404,7 @@ libgit_sources = [
   'odb.c',
   'odb/source.c',
   'odb/source-files.c',
+  'odb/source-inmemory.c',
   'odb/streaming.c',
   'oid-array.c',
   'oidmap.c',
diff --git a/odb.c b/odb.c
index 9b28fe25ef..95b21e2cfd 100644
--- a/odb.c
+++ b/odb.c
@@ -14,6 +14,7 @@
 #include "object-file.h"
 #include "object-name.h"
 #include "odb.h"
+#include "odb/source-inmemory.h"
 #include "packfile.h"
 #include "path.h"
 #include "promisor-remote.h"
@@ -53,9 +54,9 @@ static const struct cached_object *find_cached_object(struct object_database *ob
 		.type = OBJ_TREE,
 		.buf = "",
 	};
-	const struct cached_object_entry *co = object_store->cached_objects;
+	const struct cached_object_entry *co = object_store->inmemory_objects->objects;
 
-	for (size_t i = 0; i < object_store->cached_object_nr; i++, co++)
+	for (size_t i = 0; i < object_store->inmemory_objects->objects_nr; i++, co++)
 		if (oideq(&co->oid, oid))
 			return &co->value;
 
@@ -792,9 +793,10 @@ int odb_pretend_object(struct object_database *odb,
 	    find_cached_object(odb, oid))
 		return 0;
 
-	ALLOC_GROW(odb->cached_objects,
-		   odb->cached_object_nr + 1, odb->cached_object_alloc);
-	co = &odb->cached_objects[odb->cached_object_nr++];
+	ALLOC_GROW(odb->inmemory_objects->objects,
+		   odb->inmemory_objects->objects_nr + 1,
+		   odb->inmemory_objects->objects_alloc);
+	co = &odb->inmemory_objects->objects[odb->inmemory_objects->objects_nr++];
 	co->value.size = len;
 	co->value.type = type;
 	co_buf = xmalloc(len);
@@ -1083,6 +1085,7 @@ struct object_database *odb_new(struct repository *repo,
 	o->sources = odb_source_new(o, primary_source, true);
 	o->sources_tail = &o->sources->next;
 	o->alternate_db = xstrdup_or_null(secondary_sources);
+	o->inmemory_objects = odb_source_inmemory_new(o);
 
 	free(to_free);
 
@@ -1123,9 +1126,11 @@ void odb_free(struct object_database *o)
 	odb_close(o);
 	odb_free_sources(o);
 
-	for (size_t i = 0; i < o->cached_object_nr; i++)
-		free((char *) o->cached_objects[i].value.buf);
-	free(o->cached_objects);
+	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
+		free((char *) o->inmemory_objects->objects[i].value.buf);
+	free(o->inmemory_objects->objects);
+	free(o->inmemory_objects->base.path);
+	free(o->inmemory_objects);
 
 	string_list_clear(&o->submodule_source_paths, 0);
 
diff --git a/odb.h b/odb.h
index 3a711f6547..3d20270a05 100644
--- a/odb.h
+++ b/odb.h
@@ -8,6 +8,7 @@
 #include "thread-utils.h"
 
 struct cached_object_entry;
+struct odb_source_inmemory;
 struct packed_git;
 struct repository;
 struct strbuf;
@@ -98,8 +99,7 @@ struct object_database {
 	 * to write them into the object store (e.g. a browse-only
 	 * application).
 	 */
-	struct cached_object_entry *cached_objects;
-	size_t cached_object_nr, cached_object_alloc;
+	struct odb_source_inmemory *inmemory_objects;
 
 	/*
 	 * A fast, rough count of the number of objects in the repository.
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
new file mode 100644
index 0000000000..c7ac5c24f0
--- /dev/null
+++ b/odb/source-inmemory.c
@@ -0,0 +1,12 @@
+#include "git-compat-util.h"
+#include "odb/source-inmemory.h"
+
+struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
+{
+	struct odb_source_inmemory *source;
+
+	CALLOC_ARRAY(source, 1);
+	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
+
+	return source;
+}
diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
new file mode 100644
index 0000000000..95477bf36d
--- /dev/null
+++ b/odb/source-inmemory.h
@@ -0,0 +1,35 @@
+#ifndef ODB_SOURCE_INMEMORY_H
+#define ODB_SOURCE_INMEMORY_H
+
+#include "odb/source.h"
+
+struct cached_object_entry;
+
+/*
+ * An inmemory source that you can write objects to that shall be made
+ * available for reading, but that shouldn't ever be persisted to disk. Note
+ * that any objects written to this source will be stored in memory, so the
+ * number of objects you can store is limited by available system memory.
+ */
+struct odb_source_inmemory {
+	struct odb_source base;
+
+	struct cached_object_entry *objects;
+	size_t objects_nr, objects_alloc;
+};
+
+/* Create a new in-memory object database source. */
+struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb);
+
+/*
+ * Cast the given object database source to the inmemory backend. This will
+ * cause a BUG in case the source doesn't use this backend.
+ */
+static inline struct odb_source_inmemory *odb_source_inmemory_downcast(struct odb_source *source)
+{
+	if (source->type != ODB_SOURCE_INMEMORY)
+		BUG("trying to downcast source of type '%d' to inmemory", source->type);
+	return container_of(source, struct odb_source_inmemory, base);
+}
+
+#endif
diff --git a/odb/source.h b/odb/source.h
index f706e0608a..cd14f9e046 100644
--- a/odb/source.h
+++ b/odb/source.h
@@ -13,6 +13,9 @@ enum odb_source_type {
 
 	/* The "files" backend that uses loose objects and packfiles. */
 	ODB_SOURCE_FILES,
+
+	/* The "inmemory" backend that stores objects in memory. */
+	ODB_SOURCE_INMEMORY,
 };
 
 struct object_id;
-- 
2.53.0.1323.g189a785ab5.dirty

^ permalink raw reply related [flat|nested] 85+ messages in thread

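The downcast helper introduced in PATCH 01/16 combines a type tag on the
embedded base structure with `container_of()`. The following is only a
minimal standalone sketch of that pattern, not Git's code: `struct source`,
`struct source_inmemory`, the `SOURCE_*` values and `source_inmemory_new()`
are hypothetical stand-ins, and `abort()` takes the place of Git's `BUG()`.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the types in the patch; not Git's definitions. */
enum source_type {
	SOURCE_UNKNOWN,
	SOURCE_FILES,
	SOURCE_INMEMORY,
};

struct source {
	enum source_type type;
};

struct source_inmemory {
	struct source base;
	size_t objects_nr;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *) ((char *) (ptr) - offsetof(type, member)))

/* Checked downcast: abort instead of reinterpreting a foreign backend. */
struct source_inmemory *source_inmemory_downcast(struct source *source)
{
	if (source->type != SOURCE_INMEMORY) {
		fprintf(stderr, "BUG: cannot downcast source of type '%d'\n",
			source->type);
		abort();
	}
	return container_of(source, struct source_inmemory, base);
}

/* Helper for this sketch only: allocate a source and hand out its base. */
struct source *source_inmemory_new(size_t objects_nr)
{
	struct source_inmemory *inmemory = calloc(1, sizeof(*inmemory));

	if (!inmemory)
		abort();
	inmemory->base.type = SOURCE_INMEMORY;
	inmemory->objects_nr = objects_nr;
	return &inmemory->base;
}
```

Generic code would only ever hold the `struct source *`; a backend that
needs its private fields calls the downcast and gets a hard failure rather
than silent memory corruption if it was handed the wrong kind of source.
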
* Re: [PATCH 01/16] odb: introduce "inmemory" source
2026-04-03 6:01 ` [PATCH 01/16] " Patrick Steinhardt
@ 2026-04-08 21:00 ` Justin Tobler
2026-04-09 5:22 ` Patrick Steinhardt
0 siblings, 1 reply; 85+ messages in thread
From: Justin Tobler @ 2026-04-08 21:00 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git

On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> Next to our typical object database sources, each object database also
> has an implicit source of "cached" objects. These cached objects only
> exist in memory and have some use cases:
>
>   - They contain evergreen objects that we expect to always exist, like
>     for example the empty tree.
>
>   - They can be used to store temporary objects that we don't want to
>     persist to disk.
>
> Overall, their use is somewhat restricted though. For example, we don't
> provide the ability to use it as a temporary object database source that
> allows the user to write objects, but discard them after Git exits. So
> while these cached objects behave almost like a source, they aren't used
> as one.

I find the wording of the second bullet point and paragraph above a
little confusing. Are there existing uses where new objects are written
to only the cache?

> This is about to change over the following commits, where we will turn
> cached objects into a new "inmemory" source. This will allow us to use
> it exactly the same as any other source by providing the same common
> interface as the "files" source.

Treating the object cache just like any other ODB source seems like a
good direction.

> For now, the inmemory source only hosts the cached objects and doesn't
> provide any logic yet. This will change with subsequent commits, where
> we move respective functionality into the source.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Makefile              |  1 +
>  meson.build           |  1 +
>  odb.c                 | 21 +++++++++++++--------
>  odb.h                 |  4 ++--
>  odb/source-inmemory.c | 12 ++++++++++++
>  odb/source-inmemory.h | 35 +++++++++++++++++++++++++++++++++++
>  odb/source.h          |  3 +++
>  7 files changed, 67 insertions(+), 10 deletions(-)
>

[snip]

> @@ -1123,9 +1126,11 @@ void odb_free(struct object_database *o)
>  	odb_close(o);
>  	odb_free_sources(o);
>
> -	for (size_t i = 0; i < o->cached_object_nr; i++)
> -		free((char *) o->cached_objects[i].value.buf);
> -	free(o->cached_objects);
> +	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
> +		free((char *) o->inmemory_objects->objects[i].value.buf);
> +	free(o->inmemory_objects->objects);
> +	free(o->inmemory_objects->base.path);
> +	free(o->inmemory_objects);

Should we have some sort of `odb_source_inmemory_release()`?

>
>  	string_list_clear(&o->submodule_source_paths, 0);
>
> diff --git a/odb.h b/odb.h
> index 3a711f6547..3d20270a05 100644
> --- a/odb.h
> +++ b/odb.h
> @@ -8,6 +8,7 @@
>  #include "thread-utils.h"
>
>  struct cached_object_entry;
> +struct odb_source_inmemory;
>  struct packed_git;
>  struct repository;
>  struct strbuf;
> @@ -98,8 +99,7 @@ struct object_database {
>  	 * to write them into the object store (e.g. a browse-only
>  	 * application).
>  	 */
> -	struct cached_object_entry *cached_objects;
> -	size_t cached_object_nr, cached_object_alloc;
> +	struct odb_source_inmemory *inmemory_objects;

We store an inmemory ODB source instead of the cache object info
directly. Makes sense.

>
>  	/*
>  	 * A fast, rough count of the number of objects in the repository.
> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> new file mode 100644
> index 0000000000..c7ac5c24f0
> --- /dev/null
> +++ b/odb/source-inmemory.c
> @@ -0,0 +1,12 @@
> +#include "git-compat-util.h"
> +#include "odb/source-inmemory.h"
> +
> +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
> +{
> +	struct odb_source_inmemory *source;
> +
> +	CALLOC_ARRAY(source, 1);
> +	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);

Huh, so we set the path for the `struct odb_source` to "source". In the
context of an inmemory source, a path doesn't make much sense. I suspect
though that storing a path is likely only useful in the context of the
files ODB source. Is there a reason for us to still keep this around in
the generic ODB source?

> +
> +	return source;
> +}
> diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
> new file mode 100644
> index 0000000000..95477bf36d
> --- /dev/null
> +++ b/odb/source-inmemory.h
> @@ -0,0 +1,35 @@
> +#ifndef ODB_SOURCE_INMEMORY_H
> +#define ODB_SOURCE_INMEMORY_H
> +
> +#include "odb/source.h"
> +
> +struct cached_object_entry;
> +
> +/*
> + * An inmemory source that you can write objects to that shall be made
> + * available for reading, but that shouldn't ever be persisted to disk. Note
> + * that any objects written to this source will be stored in memory, so the
> + * number of objects you can store is limited by available system memory.
> + */
> +struct odb_source_inmemory {
> +	struct odb_source base;
> +
> +	struct cached_object_entry *objects;
> +	size_t objects_nr, objects_alloc;
> +};

This new ODB source now just contains the object cache info. Looks good.

-Justin

^ permalink raw reply [flat|nested] 85+ messages in thread

* Re: [PATCH 01/16] odb: introduce "inmemory" source
2026-04-08 21:00 ` Justin Tobler
@ 2026-04-09 5:22 ` Patrick Steinhardt
0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 5:22 UTC (permalink / raw)
To: Justin Tobler; +Cc: git

On Wed, Apr 08, 2026 at 04:00:48PM -0500, Justin Tobler wrote:
> On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> > Next to our typical object database sources, each object database also
> > has an implicit source of "cached" objects. These cached objects only
> > exist in memory and have some use cases:
> >
> >   - They contain evergreen objects that we expect to always exist, like
> >     for example the empty tree.
> >
> >   - They can be used to store temporary objects that we don't want to
> >     persist to disk.
> >
> > Overall, their use is somewhat restricted though. For example, we don't
> > provide the ability to use it as a temporary object database source that
> > allows the user to write objects, but discard them after Git exits. So
> > while these cached objects behave almost like a source, they aren't used
> > as one.
>
> I find the wording of the second bullet point and paragraph above a
> little confusing. Are there existing uses where new objects are written
> to only the cache?

Yes, there's a single user with git-blame(1). I'll mention that user
explicitly.

> > @@ -1123,9 +1126,11 @@ void odb_free(struct object_database *o)
> >  	odb_close(o);
> >  	odb_free_sources(o);
> >
> > -	for (size_t i = 0; i < o->cached_object_nr; i++)
> > -		free((char *) o->cached_objects[i].value.buf);
> > -	free(o->cached_objects);
> > +	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
> > +		free((char *) o->inmemory_objects->objects[i].value.buf);
> > +	free(o->inmemory_objects->objects);
> > +	free(o->inmemory_objects->base.path);
> > +	free(o->inmemory_objects);
>
> Should we have some sort of `odb_source_inmemory_release()`?

Yup, this is coming in subsequent commits.

> > diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> > new file mode 100644
> > index 0000000000..c7ac5c24f0
> > --- /dev/null
> > +++ b/odb/source-inmemory.c
> > @@ -0,0 +1,12 @@
> > +#include "git-compat-util.h"
> > +#include "odb/source-inmemory.h"
> > +
> > +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
> > +{
> > +	struct odb_source_inmemory *source;
> > +
> > +	CALLOC_ARRAY(source, 1);
> > +	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
>
> Huh, so we set the path for the `struct odb_source` to "source". In the
> context of an inmemory source, a path doesn't make much sense. I suspect
> though that storing a path is likely only useful in the context of the
> files ODB source. Is there a reason for us to still keep this around in
> the generic ODB source?

There are two reasons for the "path" field to exist:

  - It is used to compare sources with one another to figure out whether
    two sources are actually the same. This is used when reloading
    sources. This usage makes sense in principle, but it's wrong that we
    consider this to be a "path" -- it should rather be considered an
    opaque "payload".

  - The path field is used in a bunch of sites to actually figure out
    paths. This is plain wrong, as we cannot guarantee that the field
    even is a path for backends that don't store data on the filesystem.

It's one of the topics that we've got on our plate, to disentangle this.
The goal is ultimately to move the path into the files backend, fix up
callers to do the right thing (TM) and then convert the current path
field that we have into a payload.

Patrick

^ permalink raw reply [flat|nested] 85+ messages in thread

* [PATCH 02/16] odb/source-inmemory: implement `free()` callback
2026-04-03 6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
2026-04-03 6:01 ` [PATCH 01/16] " Patrick Steinhardt
@ 2026-04-03 6:01 ` Patrick Steinhardt
2026-04-08 21:05 ` Justin Tobler
2026-04-03 6:01 ` [PATCH 03/16] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt
` (16 subsequent siblings)
18 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03 6:01 UTC (permalink / raw)
To: git

Implement the `free()` callback function for the "inmemory" source.

Note that this requires us to define `struct cached_object_entry` in
"odb/source-inmemory.h", as it is accessed in both "odb.c" and
"odb/source-inmemory.c" now. This will be fixed in subsequent commits
though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c                 | 25 ++++---------------------
 odb/source-inmemory.c | 12 ++++++++++++
 odb/source-inmemory.h |  9 ++++++++-
 3 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/odb.c b/odb.c
index 95b21e2cfd..d321242353 100644
--- a/odb.c
+++ b/odb.c
@@ -32,21 +32,6 @@
 KHASH_INIT(odb_path_map, const char * /* key: odb_path */,
 	struct odb_source *, 1, fspathhash, fspatheq)
 
-/*
- * This is meant to hold a *small* number of objects that you would
- * want odb_read_object() to be able to return, but yet you do not want
- * to write them into the object store (e.g. a browse-only
- * application).
- */
-struct cached_object_entry {
-	struct object_id oid;
-	struct cached_object {
-		enum object_type type;
-		const void *buf;
-		unsigned long size;
-	} value;
-};
-
 static const struct cached_object *find_cached_object(struct object_database *object_store,
 						      const struct object_id *oid)
 {
@@ -1109,6 +1094,10 @@ static void odb_free_sources(struct object_database *o)
 		odb_source_free(o->sources);
 		o->sources = next;
 	}
+
+	odb_source_free(&o->inmemory_objects->base);
+	o->inmemory_objects = NULL;
+
 	kh_destroy_odb_path_map(o->source_by_path);
 	o->source_by_path = NULL;
 }
@@ -1126,12 +1115,6 @@ void odb_free(struct object_database *o)
 	odb_close(o);
 	odb_free_sources(o);
 
-	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
-		free((char *) o->inmemory_objects->objects[i].value.buf);
-	free(o->inmemory_objects->objects);
-	free(o->inmemory_objects->base.path);
-	free(o->inmemory_objects);
-
 	string_list_clear(&o->submodule_source_paths, 0);
 
 	free(o);
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index c7ac5c24f0..ccbb622eae 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -1,6 +1,16 @@
 #include "git-compat-util.h"
 #include "odb/source-inmemory.h"
 
+static void odb_source_inmemory_free(struct odb_source *source)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	for (size_t i = 0; i < inmemory->objects_nr; i++)
+		free((char *) inmemory->objects[i].value.buf);
+	free(inmemory->objects);
+	free(inmemory->base.path);
+	free(inmemory);
+}
+
 struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 {
 	struct odb_source_inmemory *source;
@@ -8,5 +18,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	CALLOC_ARRAY(source, 1);
 	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
 
+	source->base.free = odb_source_inmemory_free;
+
 	return source;
 }
diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
index 95477bf36d..14dc06f7c3 100644
--- a/odb/source-inmemory.h
+++ b/odb/source-inmemory.h
@@ -3,7 +3,14 @@
 
 #include "odb/source.h"
 
-struct cached_object_entry;
+struct cached_object_entry {
+	struct object_id oid;
+	struct cached_object {
+		enum object_type type;
+		const void *buf;
+		unsigned long size;
+	} value;
+};
 
 /*
  * An inmemory source that you can write objects to that shall be made
-- 
2.53.0.1323.g189a785ab5.dirty

^ permalink raw reply related [flat|nested] 85+ messages in thread

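PATCH 02/16 moves the cleanup of the in-memory objects behind the source's
own `free` callback, so that generic code only calls one entry point. As a
rough standalone illustration of that ownership pattern (all names here,
including `struct source`, `struct entry`, `source_free()` and
`source_inmemory_add()`, are hypothetical simplifications and not Git's
API), one could write:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; these are not Git's definitions. */
struct source {
	char *path;
	/* Backend-specific destructor, set by whoever creates the source. */
	void (*free)(struct source *source);
};

struct entry {
	void *buf;
	size_t size;
};

struct source_inmemory {
	struct source base; /* must stay the first member for the casts below */
	struct entry *objects;
	size_t objects_nr;
};

/* The backend owns its buffers, so it is also the one that frees them. */
void source_inmemory_free(struct source *source)
{
	struct source_inmemory *inmemory = (struct source_inmemory *) source;

	for (size_t i = 0; i < inmemory->objects_nr; i++)
		free(inmemory->objects[i].buf);
	free(inmemory->objects);
	free(inmemory->base.path);
	free(inmemory);
}

/* Generic entry point: callers never need to know the concrete backend. */
void source_free(struct source *source)
{
	if (source)
		source->free(source);
}

struct source *source_inmemory_new(void)
{
	struct source_inmemory *inmemory = calloc(1, sizeof(*inmemory));

	if (!inmemory)
		abort();
	inmemory->base.free = source_inmemory_free;
	return &inmemory->base;
}

/* Store a private copy of `buf`; returns 0 on success, -1 on failure. */
int source_inmemory_add(struct source *source, const void *buf, size_t size)
{
	struct source_inmemory *inmemory = (struct source_inmemory *) source;
	struct entry *grown;
	void *copy = malloc(size ? size : 1);

	if (!copy)
		return -1;
	memcpy(copy, buf, size);

	grown = realloc(inmemory->objects,
			(inmemory->objects_nr + 1) * sizeof(*grown));
	if (!grown) {
		free(copy);
		return -1;
	}
	grown[inmemory->objects_nr].buf = copy;
	grown[inmemory->objects_nr].size = size;
	inmemory->objects = grown;
	inmemory->objects_nr++;
	return 0;
}
```

With this shape, the owner of the database can release every source through
`source_free()` and does not have to reach into backend-private arrays,
which is the coupling the patch removes from `odb_free()`.
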
* Re: [PATCH 02/16] odb/source-inmemory: implement `free()` callback
2026-04-03 6:01 ` [PATCH 02/16] odb/source-inmemory: implement `free()` callback Patrick Steinhardt
@ 2026-04-08 21:05 ` Justin Tobler
0 siblings, 0 replies; 85+ messages in thread
From: Justin Tobler @ 2026-04-08 21:05 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git

On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> @@ -1126,12 +1115,6 @@ void odb_free(struct object_database *o)
>  	odb_close(o);
>  	odb_free_sources(o);
>
> -	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
> -		free((char *) o->inmemory_objects->objects[i].value.buf);
> -	free(o->inmemory_objects->objects);
> -	free(o->inmemory_objects->base.path);
> -	free(o->inmemory_objects);

Ah ok, this addresses a comment in the previous patch.

> -
>  	string_list_clear(&o->submodule_source_paths, 0);
>
>  	free(o);
> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> index c7ac5c24f0..ccbb622eae 100644
> --- a/odb/source-inmemory.c
> +++ b/odb/source-inmemory.c
> @@ -1,6 +1,16 @@
>  #include "git-compat-util.h"
>  #include "odb/source-inmemory.h"
>
> +static void odb_source_inmemory_free(struct odb_source *source)
> +{
> +	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
> +	for (size_t i = 0; i < inmemory->objects_nr; i++)
> +		free((char *) inmemory->objects[i].value.buf);
> +	free(inmemory->objects);
> +	free(inmemory->base.path);
> +	free(inmemory);
> +}
> +
>  struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
>  {
>  	struct odb_source_inmemory *source;
> @@ -8,5 +18,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
>  	CALLOC_ARRAY(source, 1);
>  	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
>
> +	source->base.free = odb_source_inmemory_free;

We wire up a function to specifically handle freeing the inmemory ODB
source. Looks good.

-Justin

^ permalink raw reply [flat|nested] 85+ messages in thread

* [PATCH 03/16] odb: fix unnecessary call to `find_cached_object()`
2026-04-03 6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
2026-04-03 6:01 ` [PATCH 01/16] " Patrick Steinhardt
2026-04-03 6:01 ` [PATCH 02/16] odb/source-inmemory: implement `free()` callback Patrick Steinhardt
@ 2026-04-03 6:01 ` Patrick Steinhardt
2026-04-08 21:13 ` Justin Tobler
2026-04-03 6:01 ` [PATCH 04/16] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt
` (15 subsequent siblings)
18 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03 6:01 UTC (permalink / raw)
To: git

The function `odb_pretend_object()` writes an object into the in-memory
object database source. The effect of this is that the object will now
become readable, but it won't ever be persisted to disk.

Before storing the object, we first verify whether the object already
exists. This is done by calling `odb_has_object()` to check all sources,
followed by `find_cached_object()` to check whether we have already
stored the object in our in-memory source.

This is unnecessary though, as `odb_has_object()` already checks the
in-memory source transitively via:

  - `odb_has_object()`
  - `odb_read_object_info_extended()`
  - `do_oid_object_info_extended()`
  - `find_cached_object()`

Drop the explicit call to `find_cached_object()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/odb.c b/odb.c
index d321242353..21cdedc31c 100644
--- a/odb.c
+++ b/odb.c
@@ -774,8 +774,7 @@ int odb_pretend_object(struct object_database *odb,
 	char *co_buf;
 
 	hash_object_file(odb->repo->hash_algo, buf, len, type, oid);
-	if (odb_has_object(odb, oid, 0) ||
-	    find_cached_object(odb, oid))
+	if (odb_has_object(odb, oid, 0))
 		return 0;
 
 	ALLOC_GROW(odb->inmemory_objects->objects,
-- 
2.53.0.1323.g189a785ab5.dirty

^ permalink raw reply related [flat|nested] 85+ messages in thread

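The argument in PATCH 03/16 is that a lookup which already consults the
in-memory source makes a second, explicit in-memory check redundant. The
toy model below only illustrates that reasoning: object IDs are plain
strings, sources are arrays, and every name (`struct store`,
`store_has_object()`, `must_skip_write_before()` and so on) is a
hypothetical stand-in rather than Git's implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: object IDs are plain strings, sources are arrays. */
struct store {
	const char **inmemory;
	size_t inmemory_nr;
	const char **persistent;
	size_t persistent_nr;
};

bool ids_contain(const char **ids, size_t nr, const char *id)
{
	for (size_t i = 0; i < nr; i++)
		if (!strcmp(ids[i], id))
			return true;
	return false;
}

/* Models odb_has_object(): the in-memory source is part of the lookup. */
bool store_has_object(const struct store *store, const char *id)
{
	return ids_contain(store->inmemory, store->inmemory_nr, id) ||
	       ids_contain(store->persistent, store->persistent_nr, id);
}

/*
 * Shape of the check before the patch: the second operand can only be
 * true when the first one already was, so it never changes the result.
 */
bool must_skip_write_before(const struct store *store, const char *id)
{
	return store_has_object(store, id) ||
	       ids_contain(store->inmemory, store->inmemory_nr, id);
}

/* Shape of the check after the patch. */
bool must_skip_write_after(const struct store *store, const char *id)
{
	return store_has_object(store, id);
}
```

Both variants agree for every input in this model, which is the property
the commit message relies on when dropping the explicit call.
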
* Re: [PATCH 03/16] odb: fix unnecessary call to `find_cached_object()`
2026-04-03 6:01 ` [PATCH 03/16] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt
@ 2026-04-08 21:13 ` Justin Tobler
2026-04-09 5:22 ` Patrick Steinhardt
0 siblings, 1 reply; 85+ messages in thread
From: Justin Tobler @ 2026-04-08 21:13 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git

On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> diff --git a/odb.c b/odb.c
> index d321242353..21cdedc31c 100644
> --- a/odb.c
> +++ b/odb.c
> @@ -774,8 +774,7 @@ int odb_pretend_object(struct object_database *odb,
>  	char *co_buf;
>
>  	hash_object_file(odb->repo->hash_algo, buf, len, type, oid);
> -	if (odb_has_object(odb, oid, 0) ||
> -	    find_cached_object(odb, oid))
> +	if (odb_has_object(odb, oid, 0))

Nice, odb_has_object() does indeed already check the object cache so
that makes the explicit find_cached_object() redundant.

In a future where temporary objects could be written to the inmemory ODB
source, would there ever be a reason for odb_has_object() to
differentiate between inmemory and real objects?

-Justin

^ permalink raw reply [flat|nested] 85+ messages in thread

* Re: [PATCH 03/16] odb: fix unnecessary call to `find_cached_object()`
2026-04-08 21:13 ` Justin Tobler
@ 2026-04-09 5:22 ` Patrick Steinhardt
0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 5:22 UTC (permalink / raw)
To: Justin Tobler; +Cc: git

On Wed, Apr 08, 2026 at 04:13:45PM -0500, Justin Tobler wrote:
> On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> > diff --git a/odb.c b/odb.c
> > index d321242353..21cdedc31c 100644
> > --- a/odb.c
> > +++ b/odb.c
> > @@ -774,8 +774,7 @@ int odb_pretend_object(struct object_database *odb,
> >  	char *co_buf;
> >
> >  	hash_object_file(odb->repo->hash_algo, buf, len, type, oid);
> > -	if (odb_has_object(odb, oid, 0) ||
> > -	    find_cached_object(odb, oid))
> > +	if (odb_has_object(odb, oid, 0))
>
> Nice, odb_has_object() does indeed already check the object cache so
> that makes the explicit find_cached_object() redundant.
>
> In a future where temporary objects could be written to the inmemory ODB
> source, would there ever be a reason for odb_has_object() to
> differentiate between inmemory and real objects?

We could in theory just append the in-memory source to the normal list
of sources, and that would ensure that all the usual operations would
know to also consider this source. But there's a couple of points that
speak against it, at least for now:

  - Callers that explicitly want to write temporary objects need to have
    a handle to the in-memory source. That handle would be hard to
    obtain if we were to only store the source in the list of sources.

  - It would be a change in behaviour if functions like
    `odb_for_each_object()` were to also enumerate in-memory objects.

The former one could be solved by having both the direct pointer and
keeping the source in the list. The latter can be solved by having a
separate flag for `odb_for_each_object()` that tells the ODB that we
want to exclude/include in-memory objects.

But overall it feels like this would only complicate things without much
of a tangible benefit.

Patrick

^ permalink raw reply [flat|nested] 85+ messages in thread

* [PATCH 04/16] odb/source-inmemory: implement `read_object_info()` callback
2026-04-03 6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
` (2 preceding siblings ...)
2026-04-03 6:01 ` [PATCH 03/16] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt
@ 2026-04-03 6:01 ` Patrick Steinhardt
2026-04-03 6:01 ` [PATCH 05/16] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
` (14 subsequent siblings)
18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03 6:01 UTC (permalink / raw)
To: git

Implement the `read_object_info()` callback function for the inmemory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c                 | 39 +------------------------------------
 odb/source-inmemory.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 38 deletions(-)

diff --git a/odb.c b/odb.c
index 21cdedc31c..b8e7356951 100644
--- a/odb.c
+++ b/odb.c
@@ -32,25 +32,6 @@
 KHASH_INIT(odb_path_map, const char * /* key: odb_path */,
 	struct odb_source *, 1, fspathhash, fspatheq)
 
-static const struct cached_object *find_cached_object(struct object_database *object_store,
-						      const struct object_id *oid)
-{
-	static const struct cached_object empty_tree = {
-		.type = OBJ_TREE,
-		.buf = "",
-	};
-	const struct cached_object_entry *co = object_store->inmemory_objects->objects;
-
-	for (size_t i = 0; i < object_store->inmemory_objects->objects_nr; i++, co++)
-		if (oideq(&co->oid, oid))
-			return &co->value;
-
-	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
-		return &empty_tree;
-
-	return NULL;
-}
-
 int odb_mkstemp(struct object_database *odb,
 		struct strbuf *temp_filename, const char *pattern)
 {
@@ -570,7 +551,6 @@ static int do_oid_object_info_extended(struct object_database *odb,
 				       const struct object_id *oid,
 				       struct object_info *oi, unsigned flags)
 {
-	const struct cached_object *co;
 	const struct object_id *real = oid;
 	int already_retried = 0;
 
@@ -580,25 +560,8 @@ static int do_oid_object_info_extended(struct object_database *odb,
 	if (is_null_oid(real))
 		return -1;
 
-	co = find_cached_object(odb, real);
-	if (co) {
-		if (oi) {
-			if (oi->typep)
-				*(oi->typep) = co->type;
-			if (oi->sizep)
-				*(oi->sizep) = co->size;
-			if (oi->disk_sizep)
-				*(oi->disk_sizep) = 0;
-			if (oi->delta_base_oid)
-				oidclr(oi->delta_base_oid, odb->repo->hash_algo);
-			if (oi->contentp)
-				*oi->contentp = xmemdupz(co->buf, co->size);
-			if (oi->mtimep)
-				*oi->mtimep = 0;
-			oi->whence = OI_CACHED;
-		}
+	if (!odb_source_read_object_info(&odb->inmemory_objects->base, oid, oi, flags))
 		return 0;
-	}
 
 	odb_prepare_alternates(odb);
 
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index ccbb622eae..12c80f9b34 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -1,5 +1,57 @@
 #include "git-compat-util.h"
+#include "odb.h"
 #include "odb/source-inmemory.h"
+#include "repository.h"
+
+static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
+						      const struct object_id *oid)
+{
+	static const struct cached_object empty_tree = {
+		.type = OBJ_TREE,
+		.buf = "",
+	};
+	const struct cached_object_entry *co = source->objects;
+
+	for (size_t i = 0; i < source->objects_nr; i++, co++)
+		if (oideq(&co->oid, oid))
+			return &co->value;
+
+	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
+		return &empty_tree;
+
+	return NULL;
+}
+
+static int odb_source_inmemory_read_object_info(struct odb_source *source,
+						const struct object_id *oid,
+						struct object_info *oi,
+						enum object_info_flags flags UNUSED)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	const struct cached_object *object;
+
+	object = find_cached_object(inmemory, oid);
+	if (!object)
+		return -1;
+
+	if (oi) {
+		if (oi->typep)
+			*(oi->typep) = object->type;
+		if (oi->sizep)
+			*(oi->sizep) = object->size;
+		if (oi->disk_sizep)
+			*(oi->disk_sizep) = 0;
+		if (oi->delta_base_oid)
+			oidclr(oi->delta_base_oid, source->odb->repo->hash_algo);
+		if (oi->contentp)
+			*oi->contentp = xmemdupz(object->buf, object->size);
+		if (oi->mtimep)
+			*oi->mtimep = 0;
+		oi->whence = OI_CACHED;
+	}
+
+	return 0;
+}
 
 static void odb_source_inmemory_free(struct odb_source *source)
 {
@@ -19,6 +71,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);
 
 	source->base.free = odb_source_inmemory_free;
+	source->base.read_object_info = odb_source_inmemory_read_object_info;
 
 	return source;
 }
-- 
2.53.0.1323.g189a785ab5.dirty

^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 05/16] odb/source-inmemory: implement `read_object_stream()` callback
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (3 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 04/16] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-08 21:24   ` Justin Tobler
  2026-04-03  6:01 ` [PATCH 06/16] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
To: git

Implement the `read_object_stream()` callback function for the inmemory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 12c80f9b34..4a68169430 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -1,6 +1,7 @@
 #include "git-compat-util.h"
 #include "odb.h"
 #include "odb/source-inmemory.h"
+#include "odb/streaming.h"
 #include "repository.h"
 
 static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
@@ -53,6 +54,54 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
 	return 0;
 }
 
+struct odb_read_stream_inmemory {
+	struct odb_read_stream base;
+	const void *buf;
+	size_t offset;
+};
+
+static ssize_t odb_read_stream_inmemory_read(struct odb_read_stream *stream,
+					     char *buf, size_t buf_len)
+{
+	struct odb_read_stream_inmemory *inmemory =
+		container_of(stream, struct odb_read_stream_inmemory, base);
+	size_t bytes = buf_len;
+
+	if (buf_len > inmemory->base.size - inmemory->offset)
+		bytes = inmemory->base.size - inmemory->offset;
+	memcpy(buf, inmemory->buf, bytes);
+
+	return bytes;
+}
+
+static int odb_read_stream_inmemory_close(struct odb_read_stream *stream UNUSED)
+{
+	return 0;
+}
+
+static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
+						  struct odb_source *source,
+						  const struct object_id *oid)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	struct odb_read_stream_inmemory *stream;
+	const struct cached_object *object;
+
+	object = find_cached_object(inmemory, oid);
+	if (!object)
+		return -1;
+
+	CALLOC_ARRAY(stream, 1);
+	stream->base.read = odb_read_stream_inmemory_read;
+	stream->base.close = odb_read_stream_inmemory_close;
+	stream->base.size = object->size;
+	stream->base.type = object->type;
+	stream->buf = object->buf;
+
+	*out = &stream->base;
+	return 0;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
@@ -72,6 +121,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 
 	source->base.free = odb_source_inmemory_free;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
+	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 
 	return source;
 }
-- 
2.53.0.1323.g189a785ab5.dirty
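[Editorial note] A reader over a fully in-memory object usually has to remember how far it has read, so that repeated calls hand out successive chunks. In the hunk as posted, `offset` is declared but is neither added to the source pointer in `memcpy()` nor advanced after the copy. The following is a minimal standalone sketch of an offset-tracking reader for comparison. The names `struct mem_stream` and `mem_stream_read()` are hypothetical and are not part of Git or of this series.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for an in-memory read stream; not Git's struct. */
struct mem_stream {
	const char *buf; /* borrowed object contents */
	size_t size;     /* total number of bytes in buf */
	size_t offset;   /* number of bytes already handed out */
};

/*
 * Copy up to buf_len bytes of the remaining data into buf and advance
 * the cursor. Returns the number of bytes copied; 0 means end of data.
 */
static ssize_t mem_stream_read(struct mem_stream *s, char *buf, size_t buf_len)
{
	size_t remaining = s->size - s->offset;
	size_t bytes = buf_len < remaining ? buf_len : remaining;

	/* Read from the current position, not from the start of the buffer. */
	memcpy(buf, s->buf + s->offset, bytes);
	s->offset += bytes;

	return (ssize_t)bytes;
}
```

With this shape, a caller can loop until the function returns 0, and each call sees only the bytes that earlier calls have not yet returned.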
* Re: [PATCH 05/16] odb/source-inmemory: implement `read_object_stream()` callback
  2026-04-03  6:01 ` [PATCH 05/16] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
@ 2026-04-08 21:24   ` Justin Tobler
  2026-04-09  5:22     ` Patrick Steinhardt
  0 siblings, 1 reply; 85+ messages in thread
From: Justin Tobler @ 2026-04-08 21:24 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git

On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> Implement the `read_object_stream()` callback function for the inmemory
> source.

Hmmm, if the whole object is already in memory, outside providing a
complete ODB source interface, is there really much reason for streaming
the object in practice?

The patch itself looks good though.

-Justin
* Re: [PATCH 05/16] odb/source-inmemory: implement `read_object_stream()` callback
  2026-04-08 21:24   ` Justin Tobler
@ 2026-04-09  5:22     ` Patrick Steinhardt
  0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09  5:22 UTC (permalink / raw)
To: Justin Tobler; +Cc: git

On Wed, Apr 08, 2026 at 04:24:13PM -0500, Justin Tobler wrote:
> On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> > Implement the `read_object_stream()` callback function for the inmemory
> > source.
>
> Hmmm, if the whole object is already in memory, outside providing a
> complete ODB source interface, is there really much reason for streaming
> the object in practice?

I cannot think of any, but wanted to provide this function anyway so that
the backend is complete.

Patrick
* [PATCH 06/16] odb/source-inmemory: implement `write_object()` callback
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (4 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 05/16] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-03  6:01 ` [PATCH 07/16] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
To: git

Implement the `write_object()` callback function for the inmemory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c                 | 16 ++--------------
 odb/source-inmemory.c | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/odb.c b/odb.c
index b8e7356951..34228c0cd5 100644
--- a/odb.c
+++ b/odb.c
@@ -733,24 +733,12 @@ int odb_pretend_object(struct object_database *odb,
 		       void *buf, unsigned long len, enum object_type type,
 		       struct object_id *oid)
 {
-	struct cached_object_entry *co;
-	char *co_buf;
-
 	hash_object_file(odb->repo->hash_algo, buf, len, type, oid);
 	if (odb_has_object(odb, oid, 0))
 		return 0;
 
-	ALLOC_GROW(odb->inmemory_objects->objects,
-		   odb->inmemory_objects->objects_nr + 1,
-		   odb->inmemory_objects->objects_alloc);
-	co = &odb->inmemory_objects->objects[odb->inmemory_objects->objects_nr++];
-	co->value.size = len;
-	co->value.type = type;
-	co_buf = xmalloc(len);
-	memcpy(co_buf, buf, len);
-	co->value.buf = co_buf;
-	oidcpy(&co->oid, oid);
-	return 0;
+	return odb_source_write_object(&odb->inmemory_objects->base,
+				       buf, len, type, oid, NULL, 0);
 }
 
 void *odb_read_object(struct object_database *odb,
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 4a68169430..d2fc4c4054 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -102,6 +102,27 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
 	return 0;
 }
 
+static int odb_source_inmemory_write_object(struct odb_source *source,
+					    const void *buf, unsigned long len,
+					    enum object_type type,
+					    struct object_id *oid,
+					    struct object_id *compat_oid UNUSED,
+					    enum odb_write_object_flags flags UNUSED)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	struct cached_object_entry *object;
+
+	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
+		   inmemory->objects_alloc);
+	object = &inmemory->objects[inmemory->objects_nr++];
+	object->value.size = len;
+	object->value.type = type;
+	object->value.buf = xmemdupz(buf, len);
+	oidcpy(&object->oid, oid);
+
+	return 0;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
@@ -122,6 +143,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.free = odb_source_inmemory_free;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
+	source->base.write_object = odb_source_inmemory_write_object;
 
 	return source;
 }
-- 
2.53.0.1323.g189a785ab5.dirty
* [PATCH 07/16] odb/source-inmemory: implement `write_object_stream()` callback
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (5 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 06/16] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-03 22:11   ` Junio C Hamano
  2026-04-03  6:01 ` [PATCH 08/16] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
To: git

Implement the `write_object_stream()` callback function for the inmemory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index d2fc4c4054..890e2a8c7c 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -123,6 +123,45 @@ static int odb_source_inmemory_write_object(struct odb_source *source,
 	return 0;
 }
 
+static int odb_source_inmemory_write_object_stream(struct odb_source *source,
+						   struct odb_write_stream *stream,
+						   size_t len,
+						   struct object_id *oid)
+{
+	size_t total_read = 0;
+	char *data;
+	int ret;
+
+	CALLOC_ARRAY(data, len);
+	while (!stream->is_finished) {
+		unsigned long bytes_read;
+		const void *in;
+
+		in = stream->read(stream, &bytes_read);
+		if (total_read + bytes_read > len) {
+			ret = error("object stream yielded more bytes than expected");
+			goto out;
+		}
+
+		memcpy(data, in, bytes_read);
+		total_read += bytes_read;
+	}
+
+	if (total_read != len) {
+		ret = error("object stream yielded less bytes than expected");
+		goto out;
+	}
+
+	ret = odb_source_inmemory_write_object(source, data, len, OBJ_BLOB, oid,
+					       NULL, 0);
+	if (ret < 0)
+		goto out;
+
+out:
+	free(data);
+	return ret;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
@@ -144,6 +183,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.write_object = odb_source_inmemory_write_object;
+	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 
 	return source;
 }
-- 
2.53.0.1323.g189a785ab5.dirty
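[Editorial note] The callback above drains a chunked stream into one buffer and checks that the total matches the announced length. As posted, each chunk is copied to the start of `data` (`memcpy(data, in, bytes_read)`), so a stream that yields more than one chunk would presumably overwrite the earlier bytes. The following is a minimal standalone sketch of the accumulate-and-verify pattern in which each chunk is appended at the running offset. `struct chunk_source`, `chunk_source_next()` and `drain_exact()` are hypothetical names for illustration; they are not Git's `struct odb_write_stream` API.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical chunk source: yields slices of at most `chunk` bytes. */
struct chunk_source {
	const char *data;
	size_t size;
	size_t pos;
	size_t chunk; /* must be greater than zero */
};

/* Return a pointer to the next slice and store its length in *len. */
static const void *chunk_source_next(struct chunk_source *src, size_t *len)
{
	size_t remaining = src->size - src->pos;
	const void *out = src->data + src->pos;

	*len = remaining < src->chunk ? remaining : src->chunk;
	src->pos += *len;
	return out;
}

/*
 * Drain src into a newly allocated buffer of exactly `expected` bytes.
 * Returns NULL if the source yields more or fewer bytes than expected,
 * or if the allocation fails. The caller frees the result.
 */
static char *drain_exact(struct chunk_source *src, size_t expected)
{
	char *data = malloc(expected ? expected : 1);
	size_t total = 0;

	if (!data)
		return NULL;

	while (src->pos < src->size) {
		size_t n;
		const void *in = chunk_source_next(src, &n);

		/* Written as a subtraction so the check cannot overflow. */
		if (n > expected - total) {
			free(data);
			return NULL;
		}

		/* Append after the bytes that were already copied. */
		memcpy(data + total, in, n);
		total += n;
	}

	if (total != expected) {
		free(data);
		return NULL;
	}

	return data;
}
```

The sketch reports errors by returning NULL rather than through `error()`, purely to stay self-contained.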
* Re: [PATCH 07/16] odb/source-inmemory: implement `write_object_stream()` callback 2026-04-03 6:01 ` [PATCH 07/16] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt @ 2026-04-03 22:11 ` Junio C Hamano 2026-04-08 8:22 ` Patrick Steinhardt 0 siblings, 1 reply; 85+ messages in thread From: Junio C Hamano @ 2026-04-03 22:11 UTC (permalink / raw) To: Patrick Steinhardt, Justin Tobler; +Cc: git Patrick Steinhardt <ps@pks.im> writes: > Implement the `write_object_stream()` callback function for the inmemory > source. > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > --- > odb/source-inmemory.c | 40 ++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 40 insertions(+) As the signature of the .read() method drastically changes in another topic in flight, https://lore.kernel.org/git/20260402213220.2651523-4-jltobler@gmail.com/ this needs a bit of inter-topic coordination. > diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c > index d2fc4c4054..890e2a8c7c 100644 > --- a/odb/source-inmemory.c > +++ b/odb/source-inmemory.c > @@ -123,6 +123,45 @@ static int odb_source_inmemory_write_object(struct odb_source *source, > return 0; > } > > +static int odb_source_inmemory_write_object_stream(struct odb_source *source, > + struct odb_write_stream *stream, > + size_t len, > + struct object_id *oid) > +{ > + size_t total_read = 0; > + char *data; > + int ret; > + > + CALLOC_ARRAY(data, len); > + while (!stream->is_finished) { > + unsigned long bytes_read; > + const void *in; > + > + in = stream->read(stream, &bytes_read); > + if (total_read + bytes_read > len) { > + ret = error("object stream yielded more bytes than expected"); > + goto out; > + } > + > + memcpy(data, in, bytes_read); > + total_read += bytes_read; > + } > + > + if (total_read != len) { > + ret = error("object stream yielded less bytes than expected"); > + goto out; > + } > + > + ret = odb_source_inmemory_write_object(source, data, len, OBJ_BLOB, oid, > + NULL, 0); > + if 
(ret < 0) > + goto out; > + > +out: > + free(data); > + return ret; > +} > + > static void odb_source_inmemory_free(struct odb_source *source) > { > struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); > @@ -144,6 +183,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) > source->base.read_object_info = odb_source_inmemory_read_object_info; > source->base.read_object_stream = odb_source_inmemory_read_object_stream; > source->base.write_object = odb_source_inmemory_write_object; > + source->base.write_object_stream = odb_source_inmemory_write_object_stream; > > return source; > } ^ permalink raw reply [flat|nested] 85+ messages in thread
* Re: [PATCH 07/16] odb/source-inmemory: implement `write_object_stream()` callback
  2026-04-03 22:11   ` Junio C Hamano
@ 2026-04-08  8:22     ` Patrick Steinhardt
  0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-08  8:22 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Justin Tobler, git

On Fri, Apr 03, 2026 at 03:11:16PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > Implement the `write_object_stream()` callback function for the inmemory
> > source.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  odb/source-inmemory.c | 40 ++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 40 insertions(+)
>
> As the signature of the .read() method drastically changes in
> another topic in flight,
>
>   https://lore.kernel.org/git/20260402213220.2651523-4-jltobler@gmail.com/
>
> this needs a bit of inter-topic coordination.

Fair. I think Justin's patch series is close to landing, so I'll rebase
my patch series on top of his. Thanks for flagging.

Patrick
* [PATCH 08/16] cbtree: allow using arbitrary wrapper structures for nodes
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (6 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 07/16] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-03  6:01 ` [PATCH 09/16] oidtree: add ability to store data Patrick Steinhardt
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
To: git

The cbtree subsystem allows the user to store arbitrary data in a
prefix-free set of strings. This is used by us to store object IDs in a
way that we can easily iterate through them in lexicographic order, and so
that we can easily perform lookups with shortened object IDs.

In its current form, it is not easily possible to store arbitrary data
with the tree nodes. There are a couple of approaches such a caller could
try to use, but none of them really work:

  - One may embed the `struct cb_node` in a custom structure. This does
    not work though as `struct cb_node` contains a flex array, and
    embedding such a struct in another struct is forbidden.

  - One may use a `union` over `struct cb_node` and one's own data type,
    which _is_ allowed even if the struct contains a flex array. This
    does not work though, as the compiler may align members of the
    struct so that the node key would not immediately start where the
    flex array starts.

  - One may allocate `struct cb_node` such that it has room for both its
    key and the custom data. This has the downside though that if the
    custom data is itself a pointer to allocated memory, then the leak
    checker will not consider the pointer to be alive anymore.

Refactor the cbtree to drop the flex array and instead take in an
explicit offset for where to find the key, which allows the caller to
embed `struct cb_node` in a wrapper struct.

Note that this change has the downside that we now have a bit of padding in our structure, which grows the size from 60 to 64 bytes on a 64 bit system. On the other hand though, it allows us to get rid of the memory copies that we previously had to do to ensure proper alignment. This seems like a reasonable tradeoff. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- cbtree.c | 25 ++++++++++++++++++------- cbtree.h | 11 ++++++----- oidtree.c | 33 ++++++++++++++------------------- 3 files changed, 38 insertions(+), 31 deletions(-) diff --git a/cbtree.c b/cbtree.c index 4ab794bddc..8f5edbb80a 100644 --- a/cbtree.c +++ b/cbtree.c @@ -7,6 +7,11 @@ #include "git-compat-util.h" #include "cbtree.h" +static inline uint8_t *cb_node_key(struct cb_tree *t, struct cb_node *node) +{ + return (uint8_t *) node + t->key_offset; +} + static struct cb_node *cb_node_of(const void *p) { return (struct cb_node *)((uintptr_t)p - 1); @@ -33,6 +38,7 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen) uint8_t c; int newdirection; struct cb_node **wherep, *p; + uint8_t *node_key, *p_key; assert(!((uintptr_t)node & 1)); /* allocations must be aligned */ @@ -41,23 +47,26 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen) return NULL; /* success */ } + node_key = cb_node_key(t, node); + /* see if a node already exists */ - p = cb_internal_best_match(t->root, node->k, klen); + p = cb_internal_best_match(t->root, node_key, klen); + p_key = cb_node_key(t, p); /* find first differing byte */ for (newbyte = 0; newbyte < klen; newbyte++) { - if (p->k[newbyte] != node->k[newbyte]) + if (p_key[newbyte] != node_key[newbyte]) goto different_byte_found; } return p; /* element exists, let user deal with it */ different_byte_found: - newotherbits = p->k[newbyte] ^ node->k[newbyte]; + newotherbits = p_key[newbyte] ^ node_key[newbyte]; newotherbits |= newotherbits >> 1; newotherbits |= newotherbits >> 2; newotherbits |= newotherbits >> 4; 
newotherbits = (newotherbits & ~(newotherbits >> 1)) ^ 255; - c = p->k[newbyte]; + c = p_key[newbyte]; newdirection = (1 + (newotherbits | c)) >> 8; node->byte = newbyte; @@ -78,7 +87,7 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen) break; if (q->byte == newbyte && q->otherbits > newotherbits) break; - c = q->byte < klen ? node->k[q->byte] : 0; + c = q->byte < klen ? node_key[q->byte] : 0; direction = (1 + (q->otherbits | c)) >> 8; wherep = q->child + direction; } @@ -93,7 +102,7 @@ struct cb_node *cb_lookup(struct cb_tree *t, const uint8_t *k, size_t klen) { struct cb_node *p = cb_internal_best_match(t->root, k, klen); - return p && !memcmp(p->k, k, klen) ? p : NULL; + return p && !memcmp(cb_node_key(t, p), k, klen) ? p : NULL; } static int cb_descend(struct cb_node *p, cb_iter fn, void *arg) @@ -115,6 +124,7 @@ int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen, struct cb_node *p = t->root; struct cb_node *top = p; size_t i = 0; + uint8_t *p_key; if (!p) return 0; /* empty tree */ @@ -130,8 +140,9 @@ int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen, top = p; } + p_key = cb_node_key(t, p); for (i = 0; i < klen; i++) { - if (p->k[i] != kpfx[i]) + if (p_key[i] != kpfx[i]) return 0; /* "best" match failed */ } diff --git a/cbtree.h b/cbtree.h index c374b1b3db..3ce0d6b287 100644 --- a/cbtree.h +++ b/cbtree.h @@ -23,18 +23,19 @@ struct cb_node { */ uint32_t byte; uint8_t otherbits; - uint8_t k[FLEX_ARRAY]; /* arbitrary data, unaligned */ }; struct cb_tree { struct cb_node *root; + ptrdiff_t key_offset; }; -#define CBTREE_INIT { 0 } - -static inline void cb_init(struct cb_tree *t) +static inline void cb_init(struct cb_tree *t, + ptrdiff_t key_offset) { - struct cb_tree blank = CBTREE_INIT; + struct cb_tree blank = { + .key_offset = key_offset, + }; memcpy(t, &blank, sizeof(*t)); } diff --git a/oidtree.c b/oidtree.c index ab9fe7ec7a..117649753f 100644 --- a/oidtree.c +++ b/oidtree.c @@ -6,9 +6,14 @@ #include 
"oidtree.h" #include "hash.h" +struct oidtree_node { + struct cb_node base; + struct object_id key; +}; + void oidtree_init(struct oidtree *ot) { - cb_init(&ot->tree); + cb_init(&ot->tree, offsetof(struct oidtree_node, key)); mem_pool_init(&ot->mem_pool, 0); } @@ -22,20 +27,13 @@ void oidtree_clear(struct oidtree *ot) void oidtree_insert(struct oidtree *ot, const struct object_id *oid) { - struct cb_node *on; - struct object_id k; + struct oidtree_node *on; if (!oid->algo) BUG("oidtree_insert requires oid->algo"); - on = mem_pool_alloc(&ot->mem_pool, sizeof(*on) + sizeof(*oid)); - - /* - * Clear the padding and copy the result in separate steps to - * respect the 4-byte alignment needed by struct object_id. - */ - oidcpy(&k, oid); - memcpy(on->k, &k, sizeof(k)); + on = mem_pool_alloc(&ot->mem_pool, sizeof(*on)); + oidcpy(&on->key, oid); /* * n.b. Current callers won't get us duplicates, here. If a @@ -43,7 +41,7 @@ void oidtree_insert(struct oidtree *ot, const struct object_id *oid) * that won't be freed until oidtree_clear. Currently it's not * worth maintaining a free list */ - cb_insert(&ot->tree, on, sizeof(*oid)); + cb_insert(&ot->tree, &on->base, sizeof(*oid)); } bool oidtree_contains(struct oidtree *ot, const struct object_id *oid) @@ -73,21 +71,18 @@ struct oidtree_each_data { static int iter(struct cb_node *n, void *cb_data) { + struct oidtree_node *node = container_of(n, struct oidtree_node, base); struct oidtree_each_data *data = cb_data; - struct object_id k; - - /* Copy to provide 4-byte alignment needed by struct object_id. 
*/ - memcpy(&k, n->k, sizeof(k)); - if (data->algo != GIT_HASH_UNKNOWN && data->algo != k.algo) + if (data->algo != GIT_HASH_UNKNOWN && data->algo != node->key.algo) return 0; if (data->last_nibble_at) { - if ((k.hash[*data->last_nibble_at] ^ data->last_byte) & 0xf0) + if ((node->key.hash[*data->last_nibble_at] ^ data->last_byte) & 0xf0) return 0; } - return data->cb(&k, data->cb_data); + return data->cb(&node->key, data->cb_data); } int oidtree_each(struct oidtree *ot, const struct object_id *prefix, -- 2.53.0.1323.g189a785ab5.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
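[Editorial note] The commit message for this patch describes replacing the flex-array key with an explicit key offset, so that a generic node can be embedded as the first member of a caller-defined wrapper. The following is a minimal standalone sketch of that layout idea. The types `struct node`, `struct tree` and `struct wrapped_node`, and the helpers `node_key()` and `wrapped_from_base()`, are hypothetical stand-ins and not the cbtree or oidtree definitions from the diff.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical generic node: it knows nothing about where its key lives. */
struct node {
	struct node *child[2];
	uint32_t byte;
	uint8_t otherbits;
};

/* The tree records how far the key sits from the start of each node. */
struct tree {
	struct node *root;
	ptrdiff_t key_offset;
};

/* Caller-defined wrapper embedding the generic node next to key and payload. */
struct wrapped_node {
	struct node base;
	uint8_t key[4];
	void *payload;
};

/* Locate the key of a node by using the offset stored in the tree. */
static const uint8_t *node_key(const struct tree *t, const struct node *n)
{
	return (const uint8_t *)n + t->key_offset;
}

/* Recover the wrapper from a pointer to its embedded base member. */
static struct wrapped_node *wrapped_from_base(struct node *n)
{
	return (struct wrapped_node *)((char *)n - offsetof(struct wrapped_node, base));
}
```

A tree for this wrapper would be set up with `key_offset = offsetof(struct wrapped_node, key)`. The generic code then only ever calls `node_key()`, while the caller converts nodes back to wrappers to reach `payload`.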
* [PATCH 09/16] oidtree: add ability to store data
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
                   ` (7 preceding siblings ...)
  2026-04-03  6:01 ` [PATCH 08/16] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
@ 2026-04-03  6:01 ` Patrick Steinhardt
  2026-04-03  6:01 ` [PATCH 10/16] odb/source-inmemory: convert to use oidtree Patrick Steinhardt
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03  6:01 UTC (permalink / raw)
To: git

The oidtree data structure is currently only used to store object IDs,
without any associated data. So consequently, it can only really be used
to track which object IDs exist, and we can use the tree structure to
efficiently operate on OID prefixes.

But there are valid use cases where we want to both:

  - Store object IDs in a sorted order.

  - Associate arbitrary data with them.

Refactor the oidtree interface so that it allows us to store arbitrary
payloads within the respective nodes. This will be used in the next
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im> --- loose.c | 2 +- object-file.c | 3 ++- oidtree.c | 37 ++++++++++++++++++++++++++++++++----- oidtree.h | 12 ++++++++++-- t/unit-tests/u-oidtree.c | 26 +++++++++++++++++++++++--- 5 files changed, 68 insertions(+), 12 deletions(-) diff --git a/loose.c b/loose.c index 07333be696..f7a3dd1a72 100644 --- a/loose.c +++ b/loose.c @@ -57,7 +57,7 @@ static int insert_loose_map(struct odb_source *source, inserted |= insert_oid_pair(map->to_compat, oid, compat_oid); inserted |= insert_oid_pair(map->to_storage, compat_oid, oid); if (inserted) - oidtree_insert(files->loose->cache, compat_oid); + oidtree_insert(files->loose->cache, compat_oid, NULL); return inserted; } diff --git a/object-file.c b/object-file.c index 98a4678ca4..c0805f0ebb 100644 --- a/object-file.c +++ b/object-file.c @@ -1850,6 +1850,7 @@ static int for_each_object_wrapper_cb(const struct object_id *oid, } static int for_each_prefixed_object_wrapper_cb(const struct object_id *oid, + void *node_data UNUSED, void *cb_data) { struct for_each_object_wrapper_data *data = cb_data; @@ -1995,7 +1996,7 @@ static int append_loose_object(const struct object_id *oid, const char *path UNUSED, void *data) { - oidtree_insert(data, oid); + oidtree_insert(data, oid, NULL); return 0; } diff --git a/oidtree.c b/oidtree.c index 117649753f..e43f18026e 100644 --- a/oidtree.c +++ b/oidtree.c @@ -9,6 +9,7 @@ struct oidtree_node { struct cb_node base; struct object_id key; + void *data; }; void oidtree_init(struct oidtree *ot) @@ -25,15 +26,22 @@ void oidtree_clear(struct oidtree *ot) } } -void oidtree_insert(struct oidtree *ot, const struct object_id *oid) +struct oidtree_data { + struct object_id oid; +}; + +void oidtree_insert(struct oidtree *ot, const struct object_id *oid, + void *data) { struct oidtree_node *on; + struct cb_node *node; if (!oid->algo) BUG("oidtree_insert requires oid->algo"); on = mem_pool_alloc(&ot->mem_pool, sizeof(*on)); oidcpy(&on->key, oid); + on->data = data; 
 	/*
 	 * n.b. Current callers won't get us duplicates, here. If a
@@ -41,13 +49,19 @@ void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
 	 * that won't be freed until oidtree_clear. Currently it's not
 	 * worth maintaining a free list
 	 */
-	cb_insert(&ot->tree, &on->base, sizeof(*oid));
+	node = cb_insert(&ot->tree, &on->base, sizeof(*oid));
+	if (node) {
+		struct oidtree_node *preexisting = container_of(node, struct oidtree_node, base);
+		preexisting->data = data;
+	}
 }

-bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
+static struct oidtree_node *oidtree_lookup(struct oidtree *ot,
+					   const struct object_id *oid)
 {
 	struct object_id k;
 	size_t klen = sizeof(k);
+	struct cb_node *node;

 	oidcpy(&k, oid);

@@ -58,7 +72,20 @@ bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
 	klen += BUILD_ASSERT_OR_ZERO(offsetof(struct object_id, hash) <
 				offsetof(struct object_id, algo));

-	return !!cb_lookup(&ot->tree, (const uint8_t *)&k, klen);
+	node = cb_lookup(&ot->tree, (const uint8_t *)&k, klen);
+	return node ? container_of(node, struct oidtree_node, base) : NULL;
+}
+
+bool oidtree_contains(struct oidtree *ot, const struct object_id *oid)
+{
+	struct oidtree_node *node = oidtree_lookup(ot, oid);
+	return node ? 1 : 0;
+}
+
+void *oidtree_get(struct oidtree *ot, const struct object_id *oid)
+{
+	struct oidtree_node *node = oidtree_lookup(ot, oid);
+	return node ? node->data : NULL;
 }

 struct oidtree_each_data {
@@ -82,7 +109,7 @@ static int iter(struct cb_node *n, void *cb_data)
 			return 0;
 	}

-	return data->cb(&node->key, data->cb_data);
+	return data->cb(&node->key, node->data, data->cb_data);
 }

 int oidtree_each(struct oidtree *ot, const struct object_id *prefix,
diff --git a/oidtree.h b/oidtree.h
index 2b7bad2e60..baa5a436ea 100644
--- a/oidtree.h
+++ b/oidtree.h
@@ -29,18 +29,26 @@ void oidtree_init(struct oidtree *ot);
  */
 void oidtree_clear(struct oidtree *ot);

-/* Insert the object ID into the tree. */
-void oidtree_insert(struct oidtree *ot, const struct object_id *oid);
+/*
+ * Insert the object ID into the tree and store the given pointer alongside
+ * with it. The data pointer of any preexisting entry will be overwritten.
+ */
+void oidtree_insert(struct oidtree *ot, const struct object_id *oid,
+		    void *data);

 /* Check whether the tree contains the given object ID. */
 bool oidtree_contains(struct oidtree *ot, const struct object_id *oid);

+/* Get the payload stored with the given object ID. */
+void *oidtree_get(struct oidtree *ot, const struct object_id *oid);
+
 /*
  * Callback function used for `oidtree_each()`. Returning a non-zero exit code
  * will cause iteration to stop. The exit code will be propagated to the caller
  * of `oidtree_each()`.
  */
 typedef int (*oidtree_each_cb)(const struct object_id *oid,
+			       void *node_data,
 			       void *cb_data);

 /*
diff --git a/t/unit-tests/u-oidtree.c b/t/unit-tests/u-oidtree.c
index d4d05c7dc3..f0d5ebb733 100644
--- a/t/unit-tests/u-oidtree.c
+++ b/t/unit-tests/u-oidtree.c
@@ -19,7 +19,7 @@ static int fill_tree_loc(struct oidtree *ot, const char *hexes[], size_t n)
 	for (size_t i = 0; i < n; i++) {
 		struct object_id oid;
 		cl_parse_any_oid(hexes[i], &oid);
-		oidtree_insert(ot, &oid);
+		oidtree_insert(ot, &oid, NULL);
 	}
 	return 0;
 }
@@ -38,9 +38,9 @@ struct expected_hex_iter {
 	const char *query;
 };

-static int check_each_cb(const struct object_id *oid, void *data)
+static int check_each_cb(const struct object_id *oid, void *node_data UNUSED, void *cb_data)
 {
-	struct expected_hex_iter *hex_iter = data;
+	struct expected_hex_iter *hex_iter = cb_data;
 	struct object_id expected;

 	cl_assert(hex_iter->i < hex_iter->expected_hexes.nr);
@@ -105,3 +105,23 @@ void test_oidtree__each(void)
 	check_each(&ot, "32100", "321", NULL);
 	check_each(&ot, "32", "320", "321", NULL);
 }
+
+void test_oidtree__insert_overwrites_data(void)
+{
+	struct object_id oid;
+	struct oidtree ot;
+	int a, b;
+
+	cl_parse_any_oid("1", &oid);
+
+	oidtree_init(&ot);
+
+	oidtree_insert(&ot, &oid, NULL);
+	cl_assert_equal_p(oidtree_get(&ot, &oid), NULL);
+	oidtree_insert(&ot, &oid, &a);
+	cl_assert_equal_p(oidtree_get(&ot, &oid), &a);
+	oidtree_insert(&ot, &oid, &b);
+	cl_assert_equal_p(oidtree_get(&ot, &oid), &b);
+
+	oidtree_clear(&ot);
+}
-- 
2.53.0.1323.g189a785ab5.dirty

^ permalink raw reply related [flat|nested] 85+ messages in thread
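The contract that the patch above gives `oidtree_insert()` and `oidtree_get()` (store a payload pointer next to a key, overwrite the payload on re-insert, return NULL for unknown keys) can be sketched without any of Git's internals. This is only an illustration under assumptions: the `toy_*` names, the fixed 4-byte key, and the linear-scan table are all hypothetical stand-ins, and the real oidtree is backed by a crit-bit tree rather than an array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_KEY_LEN 4      /* hypothetical key size, not a real hash size */
#define TOY_MAX_ENTRIES 16 /* hypothetical fixed capacity */

struct toy_entry {
	unsigned char key[TOY_KEY_LEN];
	void *data;
};

struct toy_map {
	struct toy_entry entries[TOY_MAX_ENTRIES];
	size_t nr;
};

/* Linear lookup; returns the entry for `key` or NULL if it is unknown. */
static struct toy_entry *toy_lookup(struct toy_map *map, const unsigned char *key)
{
	assert(map && key);
	for (size_t i = 0; i < map->nr; i++)
		if (!memcmp(map->entries[i].key, key, TOY_KEY_LEN))
			return &map->entries[i];
	return NULL;
}

/*
 * Insert `key` with `data`. If the key already exists, only its payload
 * pointer is overwritten. Returns 0 on success, -1 if the table is full.
 */
static int toy_insert(struct toy_map *map, const unsigned char *key, void *data)
{
	struct toy_entry *entry = toy_lookup(map, key);

	if (entry) {
		entry->data = data;
		return 0;
	}
	if (map->nr == TOY_MAX_ENTRIES)
		return -1;
	entry = &map->entries[map->nr++];
	memcpy(entry->key, key, TOY_KEY_LEN);
	entry->data = data;
	return 0;
}

/* Membership check that also works for entries whose payload is NULL. */
static int toy_contains(struct toy_map *map, const unsigned char *key)
{
	return toy_lookup(map, key) ? 1 : 0;
}

/* Returns the stored payload, or NULL for unknown keys. */
static void *toy_get(struct toy_map *map, const unsigned char *key)
{
	struct toy_entry *entry = toy_lookup(map, key);
	return entry ? entry->data : NULL;
}
```

Note that a lookup via `toy_get()` cannot tell a missing key from a key stored with a NULL payload, which is presumably why the patch keeps `oidtree_contains()` as a separate function next to `oidtree_get()`.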
* [PATCH 10/16] odb/source-inmemory: convert to use oidtree
2026-04-03 6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
` (8 preceding siblings ...)
2026-04-03 6:01 ` [PATCH 09/16] oidtree: add ability to store data Patrick Steinhardt
@ 2026-04-03 6:01 ` Patrick Steinhardt
2026-04-03 6:01 ` [PATCH 11/16] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt
` (8 subsequent siblings)
18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03 6:01 UTC (permalink / raw)
To: git

The inmemory source stores its objects in a simple array that we grow as
needed. This has a couple of downsides:

  - The object lookup is O(n). This doesn't matter in practice because
    we only store a small number of objects.

  - We don't have an easy way to iterate over all objects in
    lexicographic order.

  - We don't have an easy way to compute unique object ID prefixes.

Refactor the code to use an oidtree instead. This is the same data
structure used by our loose object source, and thus it means we get a
bunch of functionality for free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 72 +++++++++++++++++++++++++++++++++++++--------------
 odb/source-inmemory.h | 13 ++--------
 2 files changed, 54 insertions(+), 31 deletions(-)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 890e2a8c7c..22bae6927e 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -2,20 +2,29 @@
 #include "odb.h"
 #include "odb/source-inmemory.h"
 #include "odb/streaming.h"
+#include "oidtree.h"
 #include "repository.h"

-static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
-						      const struct object_id *oid)
+struct inmemory_object {
+	enum object_type type;
+	const void *buf;
+	unsigned long size;
+};
+
+static const struct inmemory_object *find_cached_object(struct odb_source_inmemory *source,
+							 const struct object_id *oid)
 {
-	static const struct cached_object empty_tree = {
+	static const struct inmemory_object empty_tree = {
 		.type = OBJ_TREE,
 		.buf = "",
 	};
-	const struct cached_object_entry *co = source->objects;
+	const struct inmemory_object *object;

-	for (size_t i = 0; i < source->objects_nr; i++, co++)
-		if (oideq(&co->oid, oid))
-			return &co->value;
+	if (source->objects) {
+		object = oidtree_get(source->objects, oid);
+		if (object)
+			return object;
+	}

 	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
 		return &empty_tree;
@@ -29,7 +38,7 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
 						 enum object_info_flags flags UNUSED)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
-	const struct cached_object *object;
+	const struct inmemory_object *object;

 	object = find_cached_object(inmemory, oid);
 	if (!object)
@@ -85,7 +94,7 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
 	struct odb_read_stream_inmemory *stream;
-	const struct cached_object *object;
+	const struct inmemory_object *object;

 	object = find_cached_object(inmemory, oid);
 	if (!object)
@@ -110,15 +119,21 @@ static int odb_source_inmemory_write_object(struct odb_source *source,
 					    enum odb_write_object_flags flags UNUSED)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
-	struct cached_object_entry *object;
+	struct inmemory_object *object;

-	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
-		   inmemory->objects_alloc);
-	object = &inmemory->objects[inmemory->objects_nr++];
-	object->value.size = len;
-	object->value.type = type;
-	object->value.buf = xmemdupz(buf, len);
-	oidcpy(&object->oid, oid);
+	if (!inmemory->objects) {
+		CALLOC_ARRAY(inmemory->objects, 1);
+		oidtree_init(inmemory->objects);
+	} else if (oidtree_contains(inmemory->objects, oid)) {
+		return 0;
+	}
+
+	CALLOC_ARRAY(object, 1);
+	object->size = len;
+	object->type = type;
+	object->buf = xmemdupz(buf, len);
+
+	oidtree_insert(inmemory->objects, oid, object);

 	return 0;
 }
@@ -162,12 +177,29 @@ static int odb_source_inmemory_write_object_stream(struct odb_source *source,
 	return ret;
 }

+static int inmemory_object_free(const struct object_id *oid UNUSED,
+				void *node_data,
+				void *cb_data UNUSED)
+{
+	struct inmemory_object *object = node_data;
+	free((void *) object->buf);
+	free(object);
+	return 0;
+}
+
 static void odb_source_inmemory_free(struct odb_source *source)
 {
 	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
-	for (size_t i = 0; i < inmemory->objects_nr; i++)
-		free((char *) inmemory->objects[i].value.buf);
-	free(inmemory->objects);
+
+	if (inmemory->objects) {
+		struct object_id null_oid = { 0 };
+
+		oidtree_each(inmemory->objects, &null_oid, 0,
+			     inmemory_object_free, NULL);
+		oidtree_clear(inmemory->objects);
+		free(inmemory->objects);
+	}
+
 	free(inmemory->base.path);
 	free(inmemory);
 }
diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
index 14dc06f7c3..02cf586b63 100644
--- a/odb/source-inmemory.h
+++ b/odb/source-inmemory.h
@@ -3,14 +3,7 @@

 #include "odb/source.h"

-struct cached_object_entry {
-	struct object_id oid;
-	struct cached_object {
-		enum object_type type;
-		const void *buf;
-		unsigned long size;
-	} value;
-};
+struct oidtree;

 /*
  * An inmemory source that you can write objects to that shall be made
@@ -20,9 +13,7 @@ struct cached_object_entry {
  */
 struct odb_source_inmemory {
 	struct odb_source base;
-
-	struct cached_object_entry *objects;
-	size_t objects_nr, objects_alloc;
+	struct oidtree *objects;
 };

 /* Create a new in-memory object database source. */
-- 
2.53.0.1323.g189a785ab5.dirty

^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 11/16] odb/source-inmemory: implement `for_each_object()` callback
2026-04-03 6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
` (9 preceding siblings ...)
2026-04-03 6:01 ` [PATCH 10/16] odb/source-inmemory: convert to use oidtree Patrick Steinhardt
@ 2026-04-03 6:01 ` Patrick Steinhardt
2026-04-03 6:01 ` [PATCH 12/16] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt
` (7 subsequent siblings)
18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03 6:01 UTC (permalink / raw)
To: git

Implement the `for_each_object()` callback function for the inmemory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 86 +++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 70 insertions(+), 16 deletions(-)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 22bae6927e..0ac20df323 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -32,6 +32,28 @@ static const struct inmemory_object *find_cached_object(struct odb_source_inmemo
 	return NULL;
 }

+static void populate_object_info(struct odb_source_inmemory *source,
+				 struct object_info *oi,
+				 const struct inmemory_object *object)
+{
+	if (!oi)
+		return;
+
+	if (oi->typep)
+		*(oi->typep) = object->type;
+	if (oi->sizep)
+		*(oi->sizep) = object->size;
+	if (oi->disk_sizep)
+		*(oi->disk_sizep) = 0;
+	if (oi->delta_base_oid)
+		oidclr(oi->delta_base_oid, source->base.odb->repo->hash_algo);
+	if (oi->contentp)
+		*oi->contentp = xmemdupz(object->buf, object->size);
+	if (oi->mtimep)
+		*oi->mtimep = 0;
+	oi->whence = OI_CACHED;
+}
+
 static int odb_source_inmemory_read_object_info(struct odb_source *source,
 						 const struct object_id *oid,
 						 struct object_info *oi,
@@ -44,22 +66,7 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
 	if (!object)
 		return -1;

-	if (oi) {
-		if (oi->typep)
-			*(oi->typep) = object->type;
-		if (oi->sizep)
-			*(oi->sizep) = object->size;
-		if (oi->disk_sizep)
-			*(oi->disk_sizep) = 0;
-		if (oi->delta_base_oid)
-			oidclr(oi->delta_base_oid, source->odb->repo->hash_algo);
-		if (oi->contentp)
-			*oi->contentp = xmemdupz(object->buf, object->size);
-		if (oi->mtimep)
-			*oi->mtimep = 0;
-		oi->whence = OI_CACHED;
-	}
-
+	populate_object_info(inmemory, oi, object);
 	return 0;
 }

@@ -111,6 +118,52 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
 	return 0;
 }

+struct odb_source_inmemory_for_each_object_data {
+	struct odb_source_inmemory *inmemory;
+	const struct object_info *request;
+	odb_for_each_object_cb cb;
+	void *cb_data;
+};
+
+static int odb_source_inmemory_for_each_object_cb(const struct object_id *oid,
+						  void *node_data, void *cb_data)
+{
+	struct odb_source_inmemory_for_each_object_data *data = cb_data;
+	struct inmemory_object *object = node_data;
+
+	if (data->request) {
+		struct object_info oi = *data->request;
+		populate_object_info(data->inmemory, &oi, object);
+		return data->cb(oid, &oi, data->cb_data);
+	} else {
+		return data->cb(oid, NULL, data->cb_data);
+	}
+}
+
+static int odb_source_inmemory_for_each_object(struct odb_source *source,
+					       const struct object_info *request,
+					       odb_for_each_object_cb cb,
+					       void *cb_data,
+					       const struct odb_for_each_object_options *opts)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	struct odb_source_inmemory_for_each_object_data payload = {
+		.inmemory = inmemory,
+		.request = request,
+		.cb = cb,
+		.cb_data = cb_data,
+	};
+	struct object_id null_oid = { 0 };
+
+	if ((opts->flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY) ||
+	    (opts->flags & ODB_FOR_EACH_OBJECT_LOCAL_ONLY && !source->local))
+		return 0;
+
+	return oidtree_each(inmemory->objects,
+			    opts->prefix ? opts->prefix : &null_oid, opts->prefix_hex_len,
+			    odb_source_inmemory_for_each_object_cb, &payload);
+}
+
 static int odb_source_inmemory_write_object(struct odb_source *source,
 					    const void *buf, unsigned long len,
 					    enum object_type type,
@@ -214,6 +267,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.free = odb_source_inmemory_free;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
+	source->base.for_each_object = odb_source_inmemory_for_each_object;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;

-- 
2.53.0.1323.g189a785ab5.dirty

^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 12/16] odb/source-inmemory: implement `find_abbrev_len()` callback
2026-04-03 6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
` (10 preceding siblings ...)
2026-04-03 6:01 ` [PATCH 11/16] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt
@ 2026-04-03 6:01 ` Patrick Steinhardt
2026-04-03 6:02 ` [PATCH 13/16] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt
` (6 subsequent siblings)
18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03 6:01 UTC (permalink / raw)
To: git

Implement the `find_abbrev_len()` callback function for the inmemory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 0ac20df323..16182bded3 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -164,6 +164,44 @@ static int odb_source_inmemory_for_each_object(struct odb_source *source,
 			    odb_source_inmemory_for_each_object_cb, &payload);
 }

+struct find_abbrev_len_data {
+	const struct object_id *oid;
+	unsigned len;
+};
+
+static int find_abbrev_len_cb(const struct object_id *oid,
+			      struct object_info *oi UNUSED,
+			      void *cb_data)
+{
+	struct find_abbrev_len_data *data = cb_data;
+	unsigned len = oid_common_prefix_hexlen(oid, data->oid);
+	if (len != hash_algos[oid->algo].hexsz && len >= data->len)
+		data->len = len + 1;
+	return 0;
+}
+
+static int odb_source_inmemory_find_abbrev_len(struct odb_source *source,
+					       const struct object_id *oid,
+					       unsigned min_len,
+					       unsigned *out)
+{
+	struct odb_for_each_object_options opts = {
+		.prefix = oid,
+		.prefix_hex_len = min_len,
+	};
+	struct find_abbrev_len_data data = {
+		.oid = oid,
+		.len = min_len,
+	};
+	int ret;
+
+	ret = odb_source_inmemory_for_each_object(source, NULL, find_abbrev_len_cb,
+						  &data, &opts);
+	*out = data.len;
+
+	return ret;
+}
+
 static int odb_source_inmemory_write_object(struct odb_source *source,
 					    const void *buf, unsigned long len,
 					    enum object_type type,
@@ -268,6 +306,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.for_each_object = odb_source_inmemory_for_each_object;
+	source->base.find_abbrev_len = odb_source_inmemory_find_abbrev_len;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;

-- 
2.53.0.1323.g189a785ab5.dirty

^ permalink raw reply related [flat|nested] 85+ messages in thread
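The rule applied by `find_abbrev_len_cb()` in the patch above (for every other object, an unambiguous abbreviation has to be one hex character longer than the prefix the two IDs share) can be tried in isolation. The following is a minimal sketch under assumptions: object IDs are plain hex strings of equal length, and the names `common_prefix_hexlen()` and `shortest_unique_len()` are hypothetical helpers, not Git functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of leading hex characters that two hex IDs have in common. */
static size_t common_prefix_hexlen(const char *a, const char *b)
{
	size_t i = 0;

	assert(a && b);
	while (a[i] && b[i] && a[i] == b[i])
		i++;
	return i;
}

/*
 * Return the shortest prefix length of `target`, at least `min_len`, that
 * no other ID in `ids` shares. An ID identical to `target` is skipped,
 * mirroring the `len != hexsz` check in the patch.
 */
static size_t shortest_unique_len(const char *const *ids, size_t nr,
				  const char *target, size_t min_len)
{
	size_t hexsz = strlen(target);
	size_t len = min_len;

	for (size_t i = 0; i < nr; i++) {
		size_t common = common_prefix_hexlen(ids[i], target);
		if (common != hexsz && common >= len)
			len = common + 1;
	}
	return len;
}
```

Unlike the patch, this sketch walks all IDs instead of only those matching the first `min_len` characters. The result is the same, because IDs that share fewer than `min_len` characters can never raise the length.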
* [PATCH 13/16] odb/source-inmemory: implement `count_objects()` callback
2026-04-03 6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
` (11 preceding siblings ...)
2026-04-03 6:01 ` [PATCH 12/16] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt
@ 2026-04-03 6:02 ` Patrick Steinhardt
2026-04-03 6:02 ` [PATCH 14/16] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt
` (5 subsequent siblings)
18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03 6:02 UTC (permalink / raw)
To: git

Implement the `count_objects()` callback function for the inmemory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index 16182bded3..bd89a7ef14 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -202,6 +202,25 @@ static int odb_source_inmemory_find_abbrev_len(struct odb_source *source,
 	return ret;
 }

+static int count_objects_cb(const struct object_id *oid UNUSED,
+			    struct object_info *oi UNUSED,
+			    void *cb_data)
+{
+	unsigned long *counter = cb_data;
+	(*counter)++;
+	return 0;
+}
+
+static int odb_source_inmemory_count_objects(struct odb_source *source,
+					     enum odb_count_objects_flags flags UNUSED,
+					     unsigned long *out)
+{
+	struct odb_for_each_object_options opts = { 0 };
+	*out = 0;
+	return odb_source_inmemory_for_each_object(source, NULL, count_objects_cb,
+						   out, &opts);
+}
+
 static int odb_source_inmemory_write_object(struct odb_source *source,
 					    const void *buf, unsigned long len,
 					    enum object_type type,
@@ -307,6 +326,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.for_each_object = odb_source_inmemory_for_each_object;
 	source->base.find_abbrev_len = odb_source_inmemory_find_abbrev_len;
+	source->base.count_objects = odb_source_inmemory_count_objects;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;

-- 
2.53.0.1323.g189a785ab5.dirty

^ permalink raw reply related [flat|nested] 85+ messages in thread
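The counting callback in the patch above builds on a generic "for each" iterator whose callback can stop the walk by returning non-zero, with that value propagated to the caller. A stand-alone sketch of the pattern might look as follows. The names `for_each_value()`, `count_cb()` and `count_values()` are hypothetical, and a plain `int` array stands in for the object store.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*each_cb)(int value, void *cb_data);

/*
 * Invoke `cb` once per element. A non-zero return value stops the
 * iteration and is propagated to the caller.
 */
static int for_each_value(const int *values, size_t nr, each_cb cb, void *cb_data)
{
	assert(cb);
	for (size_t i = 0; i < nr; i++) {
		int ret = cb(values[i], cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Callback that ignores the element and bumps the counter in `cb_data`. */
static int count_cb(int value, void *cb_data)
{
	unsigned long *counter = cb_data;
	(void)value;
	(*counter)++;
	return 0;
}

/* Count elements by reusing the iterator, resetting `*out` first. */
static int count_values(const int *values, size_t nr, unsigned long *out)
{
	*out = 0;
	return for_each_value(values, nr, count_cb, out);
}
```

Expressing the count in terms of the iterator costs one indirect call per element, but it means any filtering done by the iterator applies to the count as well, without being implemented twice.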
* [PATCH 14/16] odb/source-inmemory: implement `freshen_object()` callback
2026-04-03 6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
` (12 preceding siblings ...)
2026-04-03 6:02 ` [PATCH 13/16] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt
@ 2026-04-03 6:02 ` Patrick Steinhardt
2026-04-03 6:02 ` [PATCH 15/16] odb/source-inmemory: stub out remaining functions Patrick Steinhardt
` (4 subsequent siblings)
18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03 6:02 UTC (permalink / raw)
To: git

Implement the `freshen_object()` callback function for the inmemory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index bd89a7ef14..c5249d04bc 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -287,6 +287,15 @@ static int odb_source_inmemory_write_object_stream(struct odb_source *source,
 	return ret;
 }

+static int odb_source_inmemory_freshen_object(struct odb_source *source,
+					      const struct object_id *oid)
+{
+	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
+	if (find_cached_object(inmemory, oid))
+		return 1;
+	return 0;
+}
+
 static int inmemory_object_free(const struct object_id *oid UNUSED,
 				void *node_data,
 				void *cb_data UNUSED)
@@ -329,6 +338,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.count_objects = odb_source_inmemory_count_objects;
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
+	source->base.freshen_object = odb_source_inmemory_freshen_object;

 	return source;
 }
-- 
2.53.0.1323.g189a785ab5.dirty

^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 15/16] odb/source-inmemory: stub out remaining functions
2026-04-03 6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
` (13 preceding siblings ...)
2026-04-03 6:02 ` [PATCH 14/16] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt
@ 2026-04-03 6:02 ` Patrick Steinhardt
2026-04-03 6:02 ` [PATCH 16/16] odb: generic inmemory source Patrick Steinhardt
` (3 subsequent siblings)
18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03 6:02 UTC (permalink / raw)
To: git

Stub out remaining functions that we either don't need or that are
basically no-ops.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-inmemory.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index c5249d04bc..53009be032 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -296,6 +296,32 @@ static int odb_source_inmemory_freshen_object(struct odb_source *source,
 	return 0;
 }

+static int odb_source_inmemory_begin_transaction(struct odb_source *source UNUSED,
+						 struct odb_transaction **out UNUSED)
+{
+	return error("inmemory source does not support transactions");
+}
+
+static int odb_source_inmemory_read_alternates(struct odb_source *source UNUSED,
+					       struct strvec *out UNUSED)
+{
+	return 0;
+}
+
+static int odb_source_inmemory_write_alternate(struct odb_source *source UNUSED,
+					       const char *alternate UNUSED)
+{
+	return error("inmemory source does not support alternates");
+}
+
+static void odb_source_inmemory_close(struct odb_source *source UNUSED)
+{
+}
+
+static void odb_source_inmemory_reprepare(struct odb_source *source UNUSED)
+{
+}
+
 static int inmemory_object_free(const struct object_id *oid UNUSED,
 				void *node_data,
 				void *cb_data UNUSED)
@@ -331,6 +357,8 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);

 	source->base.free = odb_source_inmemory_free;
+	source->base.close = odb_source_inmemory_close;
+	source->base.reprepare = odb_source_inmemory_reprepare;
 	source->base.read_object_info = odb_source_inmemory_read_object_info;
 	source->base.read_object_stream = odb_source_inmemory_read_object_stream;
 	source->base.for_each_object = odb_source_inmemory_for_each_object;
@@ -339,6 +367,9 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
 	source->base.write_object = odb_source_inmemory_write_object;
 	source->base.write_object_stream = odb_source_inmemory_write_object_stream;
 	source->base.freshen_object = odb_source_inmemory_freshen_object;
+	source->base.begin_transaction = odb_source_inmemory_begin_transaction;
+	source->base.read_alternates = odb_source_inmemory_read_alternates;
+	source->base.write_alternate = odb_source_inmemory_write_alternate;

 	return source;
 }
-- 
2.53.0.1323.g189a785ab5.dirty

^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 16/16] odb: generic inmemory source
2026-04-03 6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
` (14 preceding siblings ...)
2026-04-03 6:02 ` [PATCH 15/16] odb/source-inmemory: stub out remaining functions Patrick Steinhardt
@ 2026-04-03 6:02 ` Patrick Steinhardt
2026-04-03 15:41 ` [PATCH 00/16] odb: introduce "inmemory" source Junio C Hamano
` (2 subsequent siblings)
18 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-03 6:02 UTC (permalink / raw)
To: git

Make the in-memory source generic.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c | 8 ++++----
 odb.h | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/odb.c b/odb.c
index 34228c0cd5..70c59fef91 100644
--- a/odb.c
+++ b/odb.c
@@ -560,7 +560,7 @@ static int do_oid_object_info_extended(struct object_database *odb,
 	if (is_null_oid(real))
 		return -1;

-	if (!odb_source_read_object_info(&odb->inmemory_objects->base, oid, oi, flags))
+	if (!odb_source_read_object_info(odb->inmemory_objects, oid, oi, flags))
 		return 0;

 	odb_prepare_alternates(odb);
@@ -737,7 +737,7 @@ int odb_pretend_object(struct object_database *odb,
 	if (odb_has_object(odb, oid, 0))
 		return 0;

-	return odb_source_write_object(&odb->inmemory_objects->base,
+	return odb_source_write_object(odb->inmemory_objects,
 				       buf, len, type, oid, NULL, 0);
 }

@@ -1020,7 +1020,7 @@ struct object_database *odb_new(struct repository *repo,
 	o->sources = odb_source_new(o, primary_source, true);
 	o->sources_tail = &o->sources->next;
 	o->alternate_db = xstrdup_or_null(secondary_sources);
-	o->inmemory_objects = odb_source_inmemory_new(o);
+	o->inmemory_objects = &odb_source_inmemory_new(o)->base;

 	free(to_free);

@@ -1045,7 +1045,7 @@ static void odb_free_sources(struct object_database *o)
 		o->sources = next;
 	}

-	odb_source_free(&o->inmemory_objects->base);
+	odb_source_free(o->inmemory_objects);
 	o->inmemory_objects = NULL;

 	kh_destroy_odb_path_map(o->source_by_path);
diff --git a/odb.h b/odb.h
index 3d20270a05..e3211ad8d4 100644
--- a/odb.h
+++ b/odb.h
@@ -99,7 +99,7 @@ struct object_database {
 	 * to write them into the object store (e.g. a browse-only
 	 * application).
 	 */
-	struct odb_source_inmemory *inmemory_objects;
+	struct odb_source *inmemory_objects;

 	/*
 	 * A fast, rough count of the number of objects in the repository.
-- 
2.53.0.1323.g189a785ab5.dirty

^ permalink raw reply related [flat|nested] 85+ messages in thread
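Storing the in-memory source as a plain `struct odb_source *` works because the concrete backend embeds the generic struct, and every callback can recover the concrete type from the generic pointer. The sketch below shows that pattern in isolation. All names in it (`struct source`, `struct source_inmemory`, `source_inmemory_downcast()` and so on) are hypothetical stand-ins, and whether Git's own downcast helper checks the type tag the way this one does is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum source_type { SOURCE_FILES, SOURCE_INMEMORY };

/* Generic interface: callers only ever hold a pointer to this struct. */
struct source {
	enum source_type type;
	size_t (*count_objects)(struct source *source);
	void (*free)(struct source *source);
};

/* Concrete backend that embeds the generic part. */
struct source_inmemory {
	struct source base;
	size_t nr; /* stand-in for the real object storage */
};

/* Recover the concrete struct from a pointer to its embedded base. */
static struct source_inmemory *source_inmemory_downcast(struct source *source)
{
	assert(source && source->type == SOURCE_INMEMORY);
	return (struct source_inmemory *)((char *)source -
					  offsetof(struct source_inmemory, base));
}

static size_t inmemory_count_objects(struct source *source)
{
	return source_inmemory_downcast(source)->nr;
}

static void inmemory_free(struct source *source)
{
	free(source_inmemory_downcast(source));
}

/* Hand out only the generic handle, as `&...new(o)->base` does in the patch. */
static struct source *source_inmemory_new(size_t nr)
{
	struct source_inmemory *inmemory = calloc(1, sizeof(*inmemory));

	if (!inmemory)
		return NULL;
	inmemory->base.type = SOURCE_INMEMORY;
	inmemory->base.count_objects = inmemory_count_objects;
	inmemory->base.free = inmemory_free;
	inmemory->nr = nr;
	return &inmemory->base;
}
```

A caller would then only go through the function pointers, for example `s->count_objects(s)` followed by `s->free(s)`, and never needs to know which backend sits behind the handle.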
* Re: [PATCH 00/16] odb: introduce "inmemory" source
2026-04-03 6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
` (15 preceding siblings ...)
2026-04-03 6:02 ` [PATCH 16/16] odb: generic inmemory source Patrick Steinhardt
@ 2026-04-03 15:41 ` Junio C Hamano
2026-04-08 8:22 ` Patrick Steinhardt
2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
18 siblings, 1 reply; 85+ messages in thread
From: Junio C Hamano @ 2026-04-03 15:41 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> this patch series introduces the second object database source type,
> which is the "inmemory" source.

I cannot read the word without a hyphen, i.e., "in-memory".

> This source may seem somewhat odd at first: it always starts out empty,
> and any object written into it will only exist in memory until the
> process exits. But the source already serves a purpose in our codebase,
> where some commands, for example git-blame(1), write an in-memory
> worktree commit.

Intermediate tree and blob objects you need while making an octopus
merge may also benefit from this feature, to stay only in-core
without having to get written out to the outside world.

I understand that this is not meant to be used as a "we create them
only for ourselves and they are available only to us while we work,
but once we are satisfied we make them available to others", which
is much better done by creating an on-disk ephemeral object store,
writing such objects into it, and then deciding at the end of the
process between discarding the ephemeral store and moving objects
from there to the main object store.

> Furthermore, I think that going forward it can serve more purposes as we
> now have an easy way to write and read objects that will not get
> persisted. I could see that this may be useful when for example
> re-merging diffs. But eventually, once we have the object storage format
> extension wired up, callers might even want to manually set up an
> in-memory database as the primary ODB for write operations so that no
> data will be persisted in an arbitrary write.

;-)

> Last but not least, this patch series also serves the purpose of
> eventually getting rid of the `struct object_info::whence` member.
> Instead, we'll simply yield the ODB source a specific object has been
> read from, together with some backend-specific data, which gives
> strictly more information compared to the status quo.
>
> The series is based on cf2139f8e1 (The 24th batch, 2026-04-01) with
> ps/odb-cleanup at 109bcb7d1d (odb: drop unneeded headers and forward
> decls, 2026-04-01) merged into it.
>
> Thanks!
>
> Patrick
>
> ---
> Patrick Steinhardt (16):
>       odb: introduce "inmemory" source
>       odb/source-inmemory: implement `free()` callback
>       odb: fix unnecessary call to `find_cached_object()`
>       odb/source-inmemory: implement `read_object_info()` callback
>       odb/source-inmemory: implement `read_object_stream()` callback
>       odb/source-inmemory: implement `write_object()` callback
>       odb/source-inmemory: implement `write_object_stream()` callback
>       cbtree: allow using arbitrary wrapper structures for nodes
>       oidtree: add ability to store data
>       odb/source-inmemory: convert to use oidtree
>       odb/source-inmemory: implement `for_each_object()` callback
>       odb/source-inmemory: implement `find_abbrev_len()` callback
>       odb/source-inmemory: implement `count_objects()` callback
>       odb/source-inmemory: implement `freshen_object()` callback
>       odb/source-inmemory: stub out remaining functions
>       odb: generic inmemory source
>
>  Makefile                 |   1 +
>  cbtree.c                 |  25 +++-
>  cbtree.h                 |  11 +-
>  loose.c                  |   2 +-
>  meson.build              |   1 +
>  object-file.c            |   3 +-
>  odb.c                    |  82 ++---------
>  odb.h                    |   4 +-
>  odb/source-inmemory.c    | 375 +++++++++++++++++++++++++++++++++++++++++++++++
>  odb/source-inmemory.h    |  33 +++++
>  odb/source.h             |   3 +
>  oidtree.c                |  66 ++++++---
>  oidtree.h                |  12 +-
>  t/unit-tests/u-oidtree.c |  26 +++-
>  14 files changed, 529 insertions(+), 115 deletions(-)
>
>
> ---
> base-commit: 3d05c3e2906489caa9f12f0af18dc233a6b8032c
> change-id: 20260401-b4-pks-odb-source-inmemory-7b17c83d9e43

^ permalink raw reply [flat|nested] 85+ messages in thread
* Re: [PATCH 00/16] odb: introduce "inmemory" source
2026-04-03 15:41 ` [PATCH 00/16] odb: introduce "inmemory" source Junio C Hamano
@ 2026-04-08 8:22 ` Patrick Steinhardt
2026-04-08 21:48 ` Junio C Hamano
0 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-08 8:22 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git

On Fri, Apr 03, 2026 at 08:41:16AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > this patch series introduces the second object database source type,
> > which is the "inmemory" source.
>
> I cannot read the word without a hyphen, i.e., "in-memory".

Fair. I think I'll keep it as `odb_source_inmemory` in the sources,
which I find easier to parse than `odb_source_in_memory`, but will adapt
to "in-memory" in prose. I already did this for the most part, but
not in the cover letter indeed.

> > This source may seem somewhat odd at first: it always starts out empty,
> > and any object written into it will only exist in memory until the
> > process exits. But the source already serves a purpose in our codebase,
> > where some commands, for example git-blame(1), write an in-memory
> > worktree commit.
>
> Intermediate tree and blob objects you need while making an octopus
> merge may also benefit from this feature, to stay only in-core
> without having to get written out to the outside world.
>
> I understand that this is not meant to be used as a "we create them
> only for ourselves and they are available only to us while we work,
> but once we are satisfied we make them available to others", which
> is much better done by creating an on-disk ephemeral object store,
> writing such objects into it, and then deciding at the end of the
> process between discarding the ephemeral store and moving objects
> from there to the main object store.

Yeah, I think there's a bunch of use cases where this could be useful
going forward.

Patrick

^ permalink raw reply [flat|nested] 85+ messages in thread
* Re: [PATCH 00/16] odb: introduce "inmemory" source
2026-04-08 8:22 ` Patrick Steinhardt
@ 2026-04-08 21:48 ` Junio C Hamano
2026-04-09 5:22 ` Patrick Steinhardt
0 siblings, 1 reply; 85+ messages in thread
From: Junio C Hamano @ 2026-04-08 21:48 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> On Fri, Apr 03, 2026 at 08:41:16AM -0700, Junio C Hamano wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>>
>> > this patch series introduces the second object database source type,
>> > which is the "inmemory" source.
>>
>> I cannot read the word without a hyphen, i.e., "in-memory".
>
> Fair. I think I'll keep it as `odb_source_inmemory` in the sources,
> which I find easier to parse than `odb_source_in_memory`, but will adapt
> to "in-memory" in prose. I already did this for the most part, but
> not in the cover letter indeed.

Fair.

FWIW, we do the same for "in core" or "in-core" in prose, and
"incore" in identifier names, so the above is an understandable
position to take.

But stepping back a bit, does this new "in memory" refer to a
concept that is different from what the rest of the system uses "in
core" to represent?

^ permalink raw reply [flat|nested] 85+ messages in thread
* Re: [PATCH 00/16] odb: introduce "inmemory" source 2026-04-08 21:48 ` Junio C Hamano @ 2026-04-09 5:22 ` Patrick Steinhardt 2026-04-09 13:46 ` Junio C Hamano 0 siblings, 1 reply; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 5:22 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Wed, Apr 08, 2026 at 02:48:52PM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > On Fri, Apr 03, 2026 at 08:41:16AM -0700, Junio C Hamano wrote: > >> Patrick Steinhardt <ps@pks.im> writes: > >> > >> > this patch series introduces the second object database source type, > >> > which is the "inmemory" source. > >> > >> I cannot read the word without a hyphen, i.e.e.g., "in-memory". > > > > Fair. I think I'll keep it as `odb_source_inmemory` in the sources, > > which I find easier ot parse than `odb_source_in_memory`, but will adapt > > to "in-memory" in prose. I already did this for most of the part, but > > not in the cover letter indeed. > > Fair. > > FWIW, we do the same for "in core" or "in-core" in prose, and > "incore" in identifier names, so the above is understandable > position to take. > > But stepping back a bit, does this new "in memory" refer to a > concept that is different from what the rest of the system uses "in > core" to represent? No, in principle it's not any different. One of the reasons I decided to go with "in memory" though is that this backend may eventually be (power-)user-facing via the planned "objectStorage" extension. This extension will work similar to how the "refStorage" extension works, where every backend has a schema followed by an optional payload. So for the files backend it would be "files://<path>", and if one wants to configure a temporary ODB source that doesn't store objects it would be "inmemory://". And overall, I think that "inmemory" is a lot easier to understand intuitively compared to "incore". 
The counterargument may be that this is really only for power users
anyway, as it's a rather risky thing to do (e.g. you must not update any
refs), and such power users may understand the concept of "in-core". But
even there I feel like it makes more sense to say "in-memory".

Patrick

^ permalink raw reply [flat|nested] 85+ messages in thread
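To illustrate the "<schema>://<payload>" format described in the message
above, here is a minimal sketch of how such a value could be split into
its two parts. It is not taken from the series: `struct storage_uri` and
`parse_storage_uri()` are hypothetical names, and the eventual
"objectStorage" parsing in Git may look quite different.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch; not part of Git. */
struct storage_uri {
	char schema[32];     /* e.g. "files" or "inmemory" */
	const char *payload; /* points into the input string; may be "" */
};

/*
 * Split "<schema>://<payload>" into its two parts. Returns 0 on success
 * and -1 if the separator is missing, the schema is empty, or the schema
 * does not fit into the fixed-size buffer.
 */
static int parse_storage_uri(const char *value, struct storage_uri *out)
{
	const char *sep = strstr(value, "://");
	size_t schema_len;

	if (!sep)
		return -1;

	schema_len = (size_t)(sep - value);
	if (!schema_len || schema_len >= sizeof(out->schema))
		return -1;

	memcpy(out->schema, value, schema_len);
	out->schema[schema_len] = '\0';
	out->payload = sep + 3; /* skip over "://" */
	return 0;
}
```

With this sketch, "files:///srv/repo/objects" yields the schema "files"
and the payload "/srv/repo/objects", while "inmemory://" yields the
schema "inmemory" and an empty payload.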
* Re: [PATCH 00/16] odb: introduce "inmemory" source 2026-04-09 5:22 ` Patrick Steinhardt @ 2026-04-09 13:46 ` Junio C Hamano 2026-04-10 4:53 ` Patrick Steinhardt 0 siblings, 1 reply; 85+ messages in thread From: Junio C Hamano @ 2026-04-09 13:46 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git Patrick Steinhardt <ps@pks.im> writes: >> But stepping back a bit, does this new "in memory" refer to a >> concept that is different from what the rest of the system uses "in >> core" to represent? > > No, in principle it's not any different. One of the reasons I decided to > go with "in memory" though is that this backend may eventually be > (power-)user-facing via the planned "objectStorage" extension. Doesn't git grep -E -e 'in[- ]?core' -- ':!Documentation/RelNotes' ':!t' give many hits that we want to be in line with in the codebase anyway, and even in some user-facing things? I just noticed an option "--no-kept-objects=in-core" (which I didn't know about ;-). ^ permalink raw reply [flat|nested] 85+ messages in thread
* Re: [PATCH 00/16] odb: introduce "inmemory" source 2026-04-09 13:46 ` Junio C Hamano @ 2026-04-10 4:53 ` Patrick Steinhardt 0 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-10 4:53 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Thu, Apr 09, 2026 at 06:46:30AM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > >> But stepping back a bit, does this new "in memory" refer to a > >> concept that is different from what the rest of the system uses "in > >> core" to represent? > > > > No, in principle it's not any different. One of the reasons I decided to > > go with "in memory" though is that this backend may eventually be > > (power-)user-facing via the planned "objectStorage" extension. > > Doesn't > > git grep -E -e 'in[- ]?core' -- ':!Documentation/RelNotes' ':!t' > > give many hits that we want to be in line with in the codebase > anyway, and even in some user-facing things? I just noticed an > option "--no-kept-objects=in-core" (which I didn't know about ;-). Most of the hits are in our code though, and end users wouldn't typically see those. So what I think is more relevant is documentation or options like the one you pointed out. But "--no-kept-objects=" is not even documented, which basically leaves us with the following hits: Documentation/gitformat-pack.adoc:write a cruft pack. Crucially, the set of in-core kept packs is exactly the set Documentation/technical/parallel-checkout.adoc:parallelize the work of uncompressing the blobs, applying in-core Documentation/technical/racy-git.adoc:because in-core timestamps can have finer granularity than Documentation/technical/racy-git.adoc:([PATCH] Sync in core time granularity with filesystems, I think that these hits are all related to what we're doing here, as we're talking about object data that we handle in-core. 
I initially said "it's not any different", but thinking a bit more about
it I think there is a slight difference: in-core could be any object
that we have parsed from the object database, even if it's backed by an
actual on-disk object. In-memory objects may not even have been parsed
at all, so technically speaking they may not even be in-core.

So in summary:

  - I think that end users have not really been exposed to the concept
    of "in-core".

  - The concepts of "in-core" and the ODB source here are slightly
    different, as any object that has been parsed is treated as
    "in-core".

  - The concept of "in-memory" is easier for the end user to understand
    in the context of the ODB, as it's a more general concept compared
    to the very Git-specific "in-core" term.

Hope that makes sense :) Thanks!

Patrick

^ permalink raw reply [flat|nested] 85+ messages in thread
* [PATCH v2 00/17] odb: introduce "in-memory" source
  2026-04-03  6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt
  ` (16 preceding siblings ...)
  2026-04-03 15:41 ` [PATCH 00/16] odb: introduce "inmemory" source Junio C Hamano
@ 2026-04-09  7:24 ` Patrick Steinhardt
  2026-04-09  7:24 ` [PATCH v2 01/17] " Patrick Steinhardt
  ` (17 more replies)
  2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
  18 siblings, 18 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Hi,

this patch series introduces the second object database source type,
which is the "in-memory" source.

This source may seem somewhat odd at first: it always starts out empty,
and any object written into it will only exist in memory until the
process exits. But the source already serves a purpose in our codebase,
where some commands, for example git-blame(1), write an in-memory
worktree commit.

Furthermore, I think that going forward it can serve more purposes as we
now have an easy way to write and read objects that will not get
persisted. I could see that this may be useful when for example
re-merging diffs. But eventually, once we have the object storage format
extension wired up, callers might even want to manually set up an
in-memory database as the primary ODB for write operations so that no
data will be persisted in an arbitrary write.

Last but not least, this patch series also serves the purpose of
eventually getting rid of the `struct object_info::whence` member.
Instead, we'll simply yield the ODB source a specific object has been
read from, together with some backend-specific data, which gives
strictly more information compared to the status quo.

The series is based on b15384c06f (A bit more post -rc1, 2026-04-08) with
jt/odb-transaction-write at ddf6aee9c6 (odb/transaction: make
`write_object_stream()` pluggable, 2026-04-02) merged into it.

Changes in v2: - Fix handling of object IDs when writing objects. - I've changed the base of this series to include Justin's refactorings for the ODB write streams. I've updated the above paragraph detailing the merge base accordingly. @Junio: I'm fine to defer this patch series a bit until Justin's patch series has been merged to `next` in case this causes inconvenience. - Use "in-memory" instead of "inmemory" in commit messages. - Link to v1: https://patch.msgid.link/20260403-b4-pks-odb-source-inmemory-v1-0-8b8d1abaa25e@pks.im Thanks! Patrick --- Patrick Steinhardt (17): odb: introduce "in-memory" source odb/source-inmemory: implement `free()` callback odb: fix unnecessary call to `find_cached_object()` odb/source-inmemory: implement `read_object_info()` callback odb/source-inmemory: implement `read_object_stream()` callback odb/source-inmemory: implement `write_object()` callback odb/source-inmemory: implement `write_object()` callback odb/source-inmemory: implement `write_object_stream()` callback cbtree: allow using arbitrary wrapper structures for nodes oidtree: add ability to store data odb/source-inmemory: convert to use oidtree odb/source-inmemory: implement `for_each_object()` callback odb/source-inmemory: implement `find_abbrev_len()` callback odb/source-inmemory: implement `count_objects()` callback odb/source-inmemory: implement `freshen_object()` callback odb/source-inmemory: stub out remaining functions odb: generic in-memory source Makefile | 1 + cbtree.c | 25 +++- cbtree.h | 11 +- loose.c | 2 +- meson.build | 1 + object-file.c | 3 +- odb.c | 82 ++-------- odb.h | 4 +- odb/source-inmemory.c | 378 +++++++++++++++++++++++++++++++++++++++++++++++ odb/source-inmemory.h | 33 +++++ odb/source.h | 3 + oidtree.c | 66 ++++++--- oidtree.h | 12 +- t/unit-tests/u-oidtree.c | 26 +++- 14 files changed, 532 insertions(+), 115 deletions(-) Range-diff versus v1: 1: b7cd1ae8d1 ! 
1: df8567d908 odb: introduce "inmemory" source @@ Metadata Author: Patrick Steinhardt <ps@pks.im> ## Commit message ## - odb: introduce "inmemory" source + odb: introduce "in-memory" source Next to our typical object database sources, each object database also has an implicit source of "cached" objects. These cached objects only @@ Commit message for example the empty tree. - They can be used to store temporary objects that we don't want to - persist to disk. + persist to disk, which is used by git-blame(1) to create a fake + worktree commit. Overall, their use is somewhat restricted though. For example, we don't provide the ability to use it as a temporary object database source that @@ Commit message as one. This is about to change over the following commits, where we will turn - cached objects into a new "inmemory" source. This will allow us to use + cached objects into a new "in-memory" source. This will allow us to use it exactly the same as any other source by providing the same common interface as the "files" source. - For now, the inmemory source only hosts the cached objects and doesn't + For now, the in-memory source only hosts the cached objects and doesn't provide any logic yet. This will change with subsequent commits, where we move respective functionality into the source. @@ Makefile: LIB_OBJS += object.o LIB_OBJS += odb/source-files.o +LIB_OBJS += odb/source-inmemory.o LIB_OBJS += odb/streaming.o + LIB_OBJS += odb/transaction.o LIB_OBJS += oid-array.o - LIB_OBJS += oidmap.o ## meson.build ## @@ meson.build: libgit_sources = [ @@ meson.build: libgit_sources = [ 'odb/source-files.c', + 'odb/source-inmemory.c', 'odb/streaming.c', + 'odb/transaction.c', 'oid-array.c', - 'oidmap.c', ## odb.c ## @@ 2: 298758b4d5 ! 2: e1ffe26ca9 odb/source-inmemory: implement `free()` callback @@ Metadata ## Commit message ## odb/source-inmemory: implement `free()` callback - Implement the `free()` callback function for the "inmemory" source. 
+ Implement the `free()` callback function for the "in-memory" source. Note that this requires us to define `struct cached_object_entry` in "odb/source-inmemory.h", as it is accessed in both "odb.c" and 3: b57997d027 = 3: f58424bb80 odb: fix unnecessary call to `find_cached_object()` 4: 9ae26b9aa1 ! 4: 786a240391 odb/source-inmemory: implement `read_object_info()` callback @@ Metadata ## Commit message ## odb/source-inmemory: implement `read_object_info()` callback - Implement the `read_object_info()` callback function for the inmemory + Implement the `read_object_info()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> 5: 5d9781009e ! 5: 22d3e7134b odb/source-inmemory: implement `read_object_stream()` callback @@ Metadata ## Commit message ## odb/source-inmemory: implement `read_object_stream()` callback - Implement the `read_object_stream()` callback function for the inmemory + Implement the `read_object_stream()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> 6: bc9620c608 ! 6: 139e7f2beb odb/source-inmemory: implement `write_object()` callback @@ Metadata ## Commit message ## odb/source-inmemory: implement `write_object()` callback - Implement the `write_object()` callback function for the inmemory + Implement the `write_object()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> -: ---------- > 7: 7f5ab16d1c odb/source-inmemory: implement `write_object()` callback 7: 6d9f8634e1 ! 8: 6006f5e782 odb/source-inmemory: implement `write_object_stream()` callback @@ Metadata ## Commit message ## odb/source-inmemory: implement `write_object_stream()` callback - Implement the `write_object_stream()` callback function for the inmemory + Implement the `write_object_stream()` callback function for the in-memory source. 
Signed-off-by: Patrick Steinhardt <ps@pks.im> @@ odb/source-inmemory.c: static int odb_source_inmemory_write_object(struct odb_so + size_t len, + struct object_id *oid) +{ ++ char buf[16384]; + size_t total_read = 0; + char *data; + int ret; + + CALLOC_ARRAY(data, len); + while (!stream->is_finished) { -+ unsigned long bytes_read; -+ const void *in; ++ ssize_t bytes_read; + -+ in = stream->read(stream, &bytes_read); ++ bytes_read = odb_write_stream_read(stream, buf, sizeof(buf)); + if (total_read + bytes_read > len) { + ret = error("object stream yielded more bytes than expected"); + goto out; + } + -+ memcpy(data, in, bytes_read); ++ memcpy(data, buf, bytes_read); + total_read += bytes_read; + } + 8: 45f9c761ce = 9: 392d9bf6ed cbtree: allow using arbitrary wrapper structures for nodes 9: 5eb7742886 = 10: 9fd88ffd16 oidtree: add ability to store data 10: 4f95cd0a51 ! 11: 6d4a77b47c odb/source-inmemory: convert to use oidtree @@ Metadata ## Commit message ## odb/source-inmemory: convert to use oidtree - The inmemory source stores its objects in a simple array that we grow as + The in-memory source stores its objects in a simple array that we grow as needed. This has a couple of downsides: - The object lookup is O(n). This doesn't matter in practice because @@ odb/source-inmemory.c: static int odb_source_inmemory_write_object(struct odb_so - struct cached_object_entry *object; + struct inmemory_object *object; + hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid); + - ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1, - inmemory->objects_alloc); - object = &inmemory->objects[inmemory->objects_nr++]; 11: fc231e22dc ! 12: 5f345d76ef odb/source-inmemory: implement `for_each_object()` callback @@ Metadata ## Commit message ## odb/source-inmemory: implement `for_each_object()` callback - Implement the `for_each_object()` callback function for the inmemory + Implement the `for_each_object()` callback function for the in-memory source. 
Signed-off-by: Patrick Steinhardt <ps@pks.im> 12: c2437b2ba5 ! 13: b428a1760b odb/source-inmemory: implement `find_abbrev_len()` callback @@ Metadata ## Commit message ## odb/source-inmemory: implement `find_abbrev_len()` callback - Implement the `find_abbrev_len()` callback function for the inmemory + Implement the `find_abbrev_len()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> 13: fee0586da7 ! 14: 564cc60392 odb/source-inmemory: implement `count_objects()` callback @@ Metadata ## Commit message ## odb/source-inmemory: implement `count_objects()` callback - Implement the `count_objects()` callback function for the inmemory + Implement the `count_objects()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> 14: 634392eaf9 ! 15: 9ddfb6f67b odb/source-inmemory: implement `freshen_object()` callback @@ Metadata ## Commit message ## odb/source-inmemory: implement `freshen_object()` callback - Implement the `freshen_object()` callback function for the inmemory + Implement the `freshen_object()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> 15: 3d1f08a849 = 16: d76329a424 odb/source-inmemory: stub out remaining functions 16: 29deff493d ! 17: 41cd562975 odb: generic inmemory source @@ Metadata Author: Patrick Steinhardt <ps@pks.im> ## Commit message ## - odb: generic inmemory source + odb: generic in-memory source Make the in-memory source generic. --- base-commit: a3ebc5a08e67ccac4c915622049a968a31e48662 change-id: 20260401-b4-pks-odb-source-inmemory-7b17c83d9e43 ^ permalink raw reply [flat|nested] 85+ messages in thread
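The chunked read loop shown in the range-diff for
`write_object_stream()` above can be sketched as a standalone program
fragment. This is only an illustration under stated assumptions:
`struct demo_stream`, `demo_stream_read()` and `demo_slurp()` are
hypothetical stand-ins, not Git's `struct odb_write_stream` API. Note
also that the sketch appends each chunk at `data + total`, whereas the
quoted hunk copies to `data`; treating the offset as the intended
behaviour is an assumption, not a transcription of the patch.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical stand-in for a write stream; not Git's API. */
struct demo_stream {
	const char *src;
	size_t len, pos;
};

/* Copy up to `size` bytes into `buf`; returns the count, 0 at the end. */
static ssize_t demo_stream_read(struct demo_stream *s, char *buf, size_t size)
{
	size_t n = s->len - s->pos;

	if (n > size)
		n = size;
	memcpy(buf, s->src + s->pos, n);
	s->pos += n;
	return (ssize_t)n;
}

/*
 * Drain the stream into a newly allocated buffer of exactly `len` bytes.
 * Returns NULL if allocation fails or if the stream yields more or fewer
 * bytes than announced.
 */
static char *demo_slurp(struct demo_stream *s, size_t len)
{
	char chunk[4]; /* tiny on purpose so that several iterations happen */
	size_t total = 0;
	char *data = calloc(len ? len : 1, 1);

	if (!data)
		return NULL;

	for (;;) {
		ssize_t n = demo_stream_read(s, chunk, sizeof(chunk));

		if (n < 0)
			goto err;
		if (!n)
			break;
		if ((size_t)n > len - total)
			goto err; /* more bytes than expected */

		/* Append at the current offset, not at the start of `data`. */
		memcpy(data + total, chunk, (size_t)n);
		total += (size_t)n;
	}

	if (total != len)
		goto err; /* fewer bytes than expected */
	return data;

err:
	free(data);
	return NULL;
}
```

Checking the remaining capacity as `len - total` rather than as
`total + n > len` avoids any overflow in the addition, and the explicit
check for a negative return value matters once the read function is
allowed to report errors.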
* [PATCH v2 01/17] odb: introduce "in-memory" source
  2026-04-09  7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
@ 2026-04-09  7:24 ` Patrick Steinhardt
  2026-04-09  9:26 ` Karthik Nayak
  2026-04-09  7:24 ` [PATCH v2 02/17] odb/source-inmemory: implement `free()` callback Patrick Steinhardt
  ` (16 subsequent siblings)
  17 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Justin Tobler

Next to our typical object database sources, each object database also
has an implicit source of "cached" objects. These cached objects only
exist in memory and serve a couple of use cases:

  - They contain evergreen objects that we expect to always exist, like
    for example the empty tree.

  - They can be used to store temporary objects that we don't want to
    persist to disk, which is used by git-blame(1) to create a fake
    worktree commit.

Overall, their use is somewhat restricted though. For example, we don't
provide the ability to use them as a temporary object database source
that allows the user to write objects, but discard them after Git exits.
So while these cached objects behave almost like a source, they aren't
used as one.

This is about to change over the following commits, where we will turn
cached objects into a new "in-memory" source. This will allow us to use
it exactly the same as any other source by providing the same common
interface as the "files" source.

For now, the in-memory source only hosts the cached objects and doesn't
provide any logic yet. This will change with subsequent commits, where
we move respective functionality into the source.

Signed-off-by: Patrick Steinhardt <ps@pks.im> --- Makefile | 1 + meson.build | 1 + odb.c | 21 +++++++++++++-------- odb.h | 4 ++-- odb/source-inmemory.c | 12 ++++++++++++ odb/source-inmemory.h | 35 +++++++++++++++++++++++++++++++++++ odb/source.h | 3 +++ 7 files changed, 67 insertions(+), 10 deletions(-) diff --git a/Makefile b/Makefile index 22a8993482..3cda12c455 100644 --- a/Makefile +++ b/Makefile @@ -1218,6 +1218,7 @@ LIB_OBJS += object.o LIB_OBJS += odb.o LIB_OBJS += odb/source.o LIB_OBJS += odb/source-files.o +LIB_OBJS += odb/source-inmemory.o LIB_OBJS += odb/streaming.o LIB_OBJS += odb/transaction.o LIB_OBJS += oid-array.o diff --git a/meson.build b/meson.build index 6dc23b3af2..ffa73ce7ce 100644 --- a/meson.build +++ b/meson.build @@ -404,6 +404,7 @@ libgit_sources = [ 'odb.c', 'odb/source.c', 'odb/source-files.c', + 'odb/source-inmemory.c', 'odb/streaming.c', 'odb/transaction.c', 'oid-array.c', diff --git a/odb.c b/odb.c index 40a5e9c4e0..60e1eead25 100644 --- a/odb.c +++ b/odb.c @@ -14,6 +14,7 @@ #include "object-file.h" #include "object-name.h" #include "odb.h" +#include "odb/source-inmemory.h" #include "packfile.h" #include "path.h" #include "promisor-remote.h" @@ -53,9 +54,9 @@ static const struct cached_object *find_cached_object(struct object_database *ob .type = OBJ_TREE, .buf = "", }; - const struct cached_object_entry *co = object_store->cached_objects; + const struct cached_object_entry *co = object_store->inmemory_objects->objects; - for (size_t i = 0; i < object_store->cached_object_nr; i++, co++) + for (size_t i = 0; i < object_store->inmemory_objects->objects_nr; i++, co++) if (oideq(&co->oid, oid)) return &co->value; @@ -792,9 +793,10 @@ int odb_pretend_object(struct object_database *odb, find_cached_object(odb, oid)) return 0; - ALLOC_GROW(odb->cached_objects, - odb->cached_object_nr + 1, odb->cached_object_alloc); - co = &odb->cached_objects[odb->cached_object_nr++]; + ALLOC_GROW(odb->inmemory_objects->objects, + 
odb->inmemory_objects->objects_nr + 1, + odb->inmemory_objects->objects_alloc); + co = &odb->inmemory_objects->objects[odb->inmemory_objects->objects_nr++]; co->value.size = len; co->value.type = type; co_buf = xmalloc(len); @@ -1083,6 +1085,7 @@ struct object_database *odb_new(struct repository *repo, o->sources = odb_source_new(o, primary_source, true); o->sources_tail = &o->sources->next; o->alternate_db = xstrdup_or_null(secondary_sources); + o->inmemory_objects = odb_source_inmemory_new(o); free(to_free); @@ -1123,9 +1126,11 @@ void odb_free(struct object_database *o) odb_close(o); odb_free_sources(o); - for (size_t i = 0; i < o->cached_object_nr; i++) - free((char *) o->cached_objects[i].value.buf); - free(o->cached_objects); + for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++) + free((char *) o->inmemory_objects->objects[i].value.buf); + free(o->inmemory_objects->objects); + free(o->inmemory_objects->base.path); + free(o->inmemory_objects); string_list_clear(&o->submodule_source_paths, 0); diff --git a/odb.h b/odb.h index 9eb8355aca..c3a7edf9c8 100644 --- a/odb.h +++ b/odb.h @@ -8,6 +8,7 @@ #include "thread-utils.h" struct cached_object_entry; +struct odb_source_inmemory; struct packed_git; struct repository; struct strbuf; @@ -80,8 +81,7 @@ struct object_database { * to write them into the object store (e.g. a browse-only * application). */ - struct cached_object_entry *cached_objects; - size_t cached_object_nr, cached_object_alloc; + struct odb_source_inmemory *inmemory_objects; /* * A fast, rough count of the number of objects in the repository. 
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c new file mode 100644 index 0000000000..c7ac5c24f0 --- /dev/null +++ b/odb/source-inmemory.c @@ -0,0 +1,12 @@ +#include "git-compat-util.h" +#include "odb/source-inmemory.h" + +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) +{ + struct odb_source_inmemory *source; + + CALLOC_ARRAY(source, 1); + odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false); + + return source; +} diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h new file mode 100644 index 0000000000..95477bf36d --- /dev/null +++ b/odb/source-inmemory.h @@ -0,0 +1,35 @@ +#ifndef ODB_SOURCE_INMEMORY_H +#define ODB_SOURCE_INMEMORY_H + +#include "odb/source.h" + +struct cached_object_entry; + +/* + * An inmemory source that you can write objects to that shall be made + * available for reading, but that shouldn't ever be persisted to disk. Note + * that any objects written to this source will be stored in memory, so the + * number of objects you can store is limited by available system memory. + */ +struct odb_source_inmemory { + struct odb_source base; + + struct cached_object_entry *objects; + size_t objects_nr, objects_alloc; +}; + +/* Create a new in-memory object database source. */ +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb); + +/* + * Cast the given object database source to the inmemory backend. This will + * cause a BUG in case the source doesn't use this backend. 
+ */ +static inline struct odb_source_inmemory *odb_source_inmemory_downcast(struct odb_source *source) +{ + if (source->type != ODB_SOURCE_INMEMORY) + BUG("trying to downcast source of type '%d' to inmemory", source->type); + return container_of(source, struct odb_source_inmemory, base); +} + +#endif diff --git a/odb/source.h b/odb/source.h index f706e0608a..cd14f9e046 100644 --- a/odb/source.h +++ b/odb/source.h @@ -13,6 +13,9 @@ enum odb_source_type { /* The "files" backend that uses loose objects and packfiles. */ ODB_SOURCE_FILES, + + /* The "inmemory" backend that stores objects in memory. */ + ODB_SOURCE_INMEMORY, }; struct object_id; -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
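The storage scheme that this first patch moves into the new source, a
growable array that is searched linearly, can be sketched in isolation
as follows. This is a simplified illustration, not the code from the
patch: `struct demo_entry`, `struct demo_store` and the `demo_*()`
functions are hypothetical, object IDs are plain strings here, and
plain `realloc()` stands in for Git's `ALLOC_GROW()`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified counterpart of a cached object entry. */
struct demo_entry {
	char id[41]; /* object ID as a NUL-terminated string */
	void *buf;
	size_t size;
};

struct demo_store {
	struct demo_entry *objects;
	size_t objects_nr, objects_alloc;
};

/* Linear lookup: O(n) in the number of stored objects. */
static const struct demo_entry *demo_find(const struct demo_store *s,
					  const char *id)
{
	for (size_t i = 0; i < s->objects_nr; i++)
		if (!strcmp(s->objects[i].id, id))
			return &s->objects[i];
	return NULL;
}

/* Store a copy of `buf` under `id`; adding a known ID is a no-op. */
static int demo_add(struct demo_store *s, const char *id,
		    const void *buf, size_t size)
{
	struct demo_entry *e;

	if (strlen(id) >= sizeof(e->id))
		return -1;
	if (demo_find(s, id))
		return 0;

	/* Grow the array geometrically once it is full. */
	if (s->objects_nr == s->objects_alloc) {
		size_t alloc = s->objects_alloc ? 2 * s->objects_alloc : 8;
		struct demo_entry *grown =
			realloc(s->objects, alloc * sizeof(*grown));
		if (!grown)
			return -1;
		s->objects = grown;
		s->objects_alloc = alloc;
	}

	e = &s->objects[s->objects_nr];
	e->buf = malloc(size ? size : 1);
	if (!e->buf)
		return -1;
	memcpy(e->buf, buf, size);
	strcpy(e->id, id);
	e->size = size;
	s->objects_nr++;
	return 0;
}

/* Free all object buffers and the array itself. */
static void demo_release(struct demo_store *s)
{
	for (size_t i = 0; i < s->objects_nr; i++)
		free(s->objects[i].buf);
	free(s->objects);
	memset(s, 0, sizeof(*s));
}
```

The linear lookup is acceptable as long as the store only holds a small
number of objects; the cover letter notes that a later patch in the
series converts the source to an oidtree instead.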
* Re: [PATCH v2 01/17] odb: introduce "in-memory" source 2026-04-09 7:24 ` [PATCH v2 01/17] " Patrick Steinhardt @ 2026-04-09 9:26 ` Karthik Nayak 2026-04-09 10:41 ` Patrick Steinhardt 0 siblings, 1 reply; 85+ messages in thread From: Karthik Nayak @ 2026-04-09 9:26 UTC (permalink / raw) To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler [-- Attachment #1: Type: text/plain, Size: 6426 bytes --] Patrick Steinhardt <ps@pks.im> writes: > Next to our typical object database sources, each object database also > has an implicit source of "cached" objects. These cached objects only > exist in memory and some use cases: > > - They contain evergreen objects that we expect to always exist, like > for example the empty tree. > > - They can be used to store temporary objects that we don't want to > persist to disk, which is used by git-blame(1) to create a fake > worktree commit. > > Overall, their use is somewhat restricted though. For example, we don't > provide the ability to use it as a temporary object database source that > allows the user to write objects, but discard them after Git exists. So > while these cached objects behave almost like a source, they aren't used > as one. > > This is about to change over the following commits, where we will turn > cached objects into a new "in-memory" source. This will allow us to use > it exactly the same as any other source by providing the same common > interface as the "files" source. > > For now, the in-memory source only hosts the cached objects and doesn't > provide any logic yet. This will change with subsequent commits, where > we move respective functionality into the source. 
[snip] > diff --git a/odb.c b/odb.c > index 40a5e9c4e0..60e1eead25 100644 > --- a/odb.c > +++ b/odb.c > @@ -14,6 +14,7 @@ > #include "object-file.h" > #include "object-name.h" > #include "odb.h" > +#include "odb/source-inmemory.h" > #include "packfile.h" > #include "path.h" > #include "promisor-remote.h" > @@ -53,9 +54,9 @@ static const struct cached_object *find_cached_object(struct object_database *ob > .type = OBJ_TREE, > .buf = "", > }; > - const struct cached_object_entry *co = object_store->cached_objects; > + const struct cached_object_entry *co = object_store->inmemory_objects->objects; > > - for (size_t i = 0; i < object_store->cached_object_nr; i++, co++) > + for (size_t i = 0; i < object_store->inmemory_objects->objects_nr; i++, co++) > if (oideq(&co->oid, oid)) > return &co->value; > > @@ -792,9 +793,10 @@ int odb_pretend_object(struct object_database *odb, > find_cached_object(odb, oid)) > return 0; > > - ALLOC_GROW(odb->cached_objects, > - odb->cached_object_nr + 1, odb->cached_object_alloc); > - co = &odb->cached_objects[odb->cached_object_nr++]; > + ALLOC_GROW(odb->inmemory_objects->objects, > + odb->inmemory_objects->objects_nr + 1, > + odb->inmemory_objects->objects_alloc); > + co = &odb->inmemory_objects->objects[odb->inmemory_objects->objects_nr++]; Okay so we introduce the inmemory object storage and directly write objects to it. I guess in the upcoming commits, we'll swap to using the API as we implement them. Makes sense for now. 
> co->value.size = len; > co->value.type = type; > co_buf = xmalloc(len); > @@ -1083,6 +1085,7 @@ struct object_database *odb_new(struct repository *repo, > o->sources = odb_source_new(o, primary_source, true); > o->sources_tail = &o->sources->next; > o->alternate_db = xstrdup_or_null(secondary_sources); > + o->inmemory_objects = odb_source_inmemory_new(o); > > free(to_free); > > @@ -1123,9 +1126,11 @@ void odb_free(struct object_database *o) > odb_close(o); > odb_free_sources(o); > > - for (size_t i = 0; i < o->cached_object_nr; i++) > - free((char *) o->cached_objects[i].value.buf); > - free(o->cached_objects); > + for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++) > + free((char *) o->inmemory_objects->objects[i].value.buf); > + free(o->inmemory_objects->objects); > + free(o->inmemory_objects->base.path); > + free(o->inmemory_objects); > > string_list_clear(&o->submodule_source_paths, 0); > > diff --git a/odb.h b/odb.h > index 9eb8355aca..c3a7edf9c8 100644 > --- a/odb.h > +++ b/odb.h > @@ -8,6 +8,7 @@ > #include "thread-utils.h" > > struct cached_object_entry; > +struct odb_source_inmemory; > struct packed_git; > struct repository; > struct strbuf; > @@ -80,8 +81,7 @@ struct object_database { > * to write them into the object store (e.g. a browse-only > * application). > */ > - struct cached_object_entry *cached_objects; > - size_t cached_object_nr, cached_object_alloc; > + struct odb_source_inmemory *inmemory_objects; > > /* > * A fast, rough count of the number of objects in the repository. 
> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c > new file mode 100644 > index 0000000000..c7ac5c24f0 > --- /dev/null > +++ b/odb/source-inmemory.c > @@ -0,0 +1,12 @@ > +#include "git-compat-util.h" > +#include "odb/source-inmemory.h" > + > +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) > +{ > + struct odb_source_inmemory *source; > + > + CALLOC_ARRAY(source, 1); > + odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false); > + > + return source; > +} > diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h > new file mode 100644 > index 0000000000..95477bf36d > --- /dev/null > +++ b/odb/source-inmemory.h > @@ -0,0 +1,35 @@ > +#ifndef ODB_SOURCE_INMEMORY_H > +#define ODB_SOURCE_INMEMORY_H > + > +#include "odb/source.h" > + > +struct cached_object_entry; > + > +/* > + * An inmemory source that you can write objects to that shall be made > + * available for reading, but that shouldn't ever be persisted to disk. Note > + * that any objects written to this source will be stored in memory, so the > + * number of objects you can store is limited by available system memory. > + */ > +struct odb_source_inmemory { > + struct odb_source base; > + > + struct cached_object_entry *objects; > + size_t objects_nr, objects_alloc; > +}; > + > +/* Create a new in-memory object database source. */ > +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb); > + > +/* > + * Cast the given object database source to the inmemory backend. This will > + * cause a BUG in case the source doesn't use this backend. 
> + */ > +static inline struct odb_source_inmemory *odb_source_inmemory_downcast(struct odb_source *source) > +{ > + if (source->type != ODB_SOURCE_INMEMORY) > + BUG("trying to downcast source of type '%d' to inmemory", source->type); > + return container_of(source, struct odb_source_inmemory, base); > +} > + Interesting, in the refs namespace the downcast functions are added to the source file (.c). This works too, is there any reason though? [snip] [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 85+ messages in thread
* Re: [PATCH v2 01/17] odb: introduce "in-memory" source 2026-04-09 9:26 ` Karthik Nayak @ 2026-04-09 10:41 ` Patrick Steinhardt 0 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 10:41 UTC (permalink / raw) To: Karthik Nayak; +Cc: git, Junio C Hamano, Justin Tobler On Thu, Apr 09, 2026 at 05:26:50AM -0400, Karthik Nayak wrote: > > diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h > > new file mode 100644 > > index 0000000000..95477bf36d > > --- /dev/null > > +++ b/odb/source-inmemory.h > > @@ -0,0 +1,35 @@ > > +#ifndef ODB_SOURCE_INMEMORY_H > > +#define ODB_SOURCE_INMEMORY_H > > + > > +#include "odb/source.h" > > + > > +struct cached_object_entry; > > + > > +/* > > + * An inmemory source that you can write objects to that shall be made > > + * available for reading, but that shouldn't ever be persisted to disk. Note > > + * that any objects written to this source will be stored in memory, so the > > + * number of objects you can store is limited by available system memory. > > + */ > > +struct odb_source_inmemory { > > + struct odb_source base; > > + > > + struct cached_object_entry *objects; > > + size_t objects_nr, objects_alloc; > > +}; > > + > > +/* Create a new in-memory object database source. */ > > +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb); > > + > > +/* > > + * Cast the given object database source to the inmemory backend. This will > > + * cause a BUG in case the source doesn't use this backend. > > + */ > > +static inline struct odb_source_inmemory *odb_source_inmemory_downcast(struct odb_source *source) > > +{ > > + if (source->type != ODB_SOURCE_INMEMORY) > > + BUG("trying to downcast source of type '%d' to inmemory", source->type); > > + return container_of(source, struct odb_source_inmemory, base); > > +} > > + > > Interesting, in the refs namespace the downcast functions are added to > the source file (.c). This works too, is there any reason though? 
By having it static inline over here we can basically ensure that the compiler can inline this call everywhere. I doubt that it really matters in the end, but I guess it doesn't hurt, either. Patrick ^ permalink raw reply [flat|nested] 85+ messages in thread
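For readers unfamiliar with the pattern under discussion, a checked
downcast via `container_of` can be sketched as follows. All names
prefixed with `demo_` are hypothetical and only mirror the shape of the
code quoted above; `abort()` stands in for Git's `BUG()`, and the macro
is a plain `offsetof`-based variant rather than Git's own definition.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Recover a pointer to the enclosing struct from a pointer to a member. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

enum demo_source_type {
	DEMO_SOURCE_FILES,
	DEMO_SOURCE_INMEMORY,
};

/* Generic base that every backend embeds. */
struct demo_source {
	enum demo_source_type type;
};

struct demo_source_inmemory {
	size_t objects_nr;
	/* Deliberately not the first member, so the offset is non-zero. */
	struct demo_source base;
};

/*
 * Checked downcast. Being `static inline` in a header gives every
 * translation unit that includes it the chance to inline the check.
 */
static inline struct demo_source_inmemory *
demo_inmemory_downcast(struct demo_source *source)
{
	if (source->type != DEMO_SOURCE_INMEMORY) {
		fprintf(stderr, "BUG: downcast of source type %d to inmemory\n",
			(int)source->type);
		abort();
	}
	return demo_container_of(source, struct demo_source_inmemory, base);
}
```

The type tag check is what turns a blind pointer adjustment into a
checked cast: passing a source of any other type aborts instead of
silently yielding a pointer into unrelated memory.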
* [PATCH v2 02/17] odb/source-inmemory: implement `free()` callback 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt 2026-04-09 7:24 ` [PATCH v2 01/17] " Patrick Steinhardt @ 2026-04-09 7:24 ` Patrick Steinhardt 2026-04-09 7:24 ` [PATCH v2 03/17] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt ` (15 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `free()` callback function for the "in-memory" source. Note that this requires us to define `struct cached_object_entry` in "odb/source-inmemory.h", as it is accessed in both "odb.c" and "odb/source-inmemory.c" now. This will be fixed in subsequent commits though. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb.c | 25 ++++--------------------- odb/source-inmemory.c | 12 ++++++++++++ odb/source-inmemory.h | 9 ++++++++- 3 files changed, 24 insertions(+), 22 deletions(-) diff --git a/odb.c b/odb.c index 60e1eead25..1d65825ed3 100644 --- a/odb.c +++ b/odb.c @@ -32,21 +32,6 @@ KHASH_INIT(odb_path_map, const char * /* key: odb_path */, struct odb_source *, 1, fspathhash, fspatheq) -/* - * This is meant to hold a *small* number of objects that you would - * want odb_read_object() to be able to return, but yet you do not want - * to write them into the object store (e.g. a browse-only - * application). 
- */ -struct cached_object_entry { - struct object_id oid; - struct cached_object { - enum object_type type; - const void *buf; - unsigned long size; - } value; -}; - static const struct cached_object *find_cached_object(struct object_database *object_store, const struct object_id *oid) { @@ -1109,6 +1094,10 @@ static void odb_free_sources(struct object_database *o) odb_source_free(o->sources); o->sources = next; } + + odb_source_free(&o->inmemory_objects->base); + o->inmemory_objects = NULL; + kh_destroy_odb_path_map(o->source_by_path); o->source_by_path = NULL; } @@ -1126,12 +1115,6 @@ void odb_free(struct object_database *o) odb_close(o); odb_free_sources(o); - for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++) - free((char *) o->inmemory_objects->objects[i].value.buf); - free(o->inmemory_objects->objects); - free(o->inmemory_objects->base.path); - free(o->inmemory_objects); - string_list_clear(&o->submodule_source_paths, 0); free(o); diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index c7ac5c24f0..ccbb622eae 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -1,6 +1,16 @@ #include "git-compat-util.h" #include "odb/source-inmemory.h" +static void odb_source_inmemory_free(struct odb_source *source) +{ + struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); + for (size_t i = 0; i < inmemory->objects_nr; i++) + free((char *) inmemory->objects[i].value.buf); + free(inmemory->objects); + free(inmemory->base.path); + free(inmemory); +} + struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) { struct odb_source_inmemory *source; @@ -8,5 +18,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) CALLOC_ARRAY(source, 1); odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false); + source->base.free = odb_source_inmemory_free; + return source; } diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h index 95477bf36d..14dc06f7c3 100644 
--- a/odb/source-inmemory.h +++ b/odb/source-inmemory.h @@ -3,7 +3,14 @@ #include "odb/source.h" -struct cached_object_entry; +struct cached_object_entry { + struct object_id oid; + struct cached_object { + enum object_type type; + const void *buf; + unsigned long size; + } value; +}; /* * An inmemory source that you can write objects to that shall be made -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH v2 03/17] odb: fix unnecessary call to `find_cached_object()` 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt 2026-04-09 7:24 ` [PATCH v2 01/17] " Patrick Steinhardt 2026-04-09 7:24 ` [PATCH v2 02/17] odb/source-inmemory: implement `free()` callback Patrick Steinhardt @ 2026-04-09 7:24 ` Patrick Steinhardt 2026-04-09 7:24 ` [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt ` (14 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler The function `odb_pretend_object()` writes an object into the in-memory object database source. The effect of this is that the object will now become readable, but it won't ever be persisted to disk. Before storing the object, we first verify whether the object already exists. This is done by calling `odb_has_object()` to check all sources, followed by `find_cached_object()` to check whether we have already stored the object in our in-memory source. This is unnecessary though, as `odb_has_object()` already checks the in-memory source transitively via: - `odb_has_object()` - `odb_read_object_info_extended()` - `do_oid_object_info_extended()` - `find_cached_object()` Drop the explicit call to `find_cached_object()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/odb.c b/odb.c index 1d65825ed3..ea3fcf5e11 100644 --- a/odb.c +++ b/odb.c @@ -774,8 +774,7 @@ int odb_pretend_object(struct object_database *odb, char *co_buf; hash_object_file(odb->repo->hash_algo, buf, len, type, oid); - if (odb_has_object(odb, oid, 0) || - find_cached_object(odb, oid)) + if (odb_has_object(odb, oid, 0)) return 0; ALLOC_GROW(odb->inmemory_objects->objects, -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt ` (2 preceding siblings ...) 2026-04-09 7:24 ` [PATCH v2 03/17] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt @ 2026-04-09 7:24 ` Patrick Steinhardt 2026-04-09 9:40 ` Karthik Nayak 2026-04-09 7:24 ` [PATCH v2 05/17] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt ` (13 subsequent siblings) 17 siblings, 1 reply; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `read_object_info()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb.c | 39 +------------------------------------ odb/source-inmemory.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 54 insertions(+), 38 deletions(-) diff --git a/odb.c b/odb.c index ea3fcf5e11..6a3912adac 100644 --- a/odb.c +++ b/odb.c @@ -32,25 +32,6 @@ KHASH_INIT(odb_path_map, const char * /* key: odb_path */, struct odb_source *, 1, fspathhash, fspatheq) -static const struct cached_object *find_cached_object(struct object_database *object_store, - const struct object_id *oid) -{ - static const struct cached_object empty_tree = { - .type = OBJ_TREE, - .buf = "", - }; - const struct cached_object_entry *co = object_store->inmemory_objects->objects; - - for (size_t i = 0; i < object_store->inmemory_objects->objects_nr; i++, co++) - if (oideq(&co->oid, oid)) - return &co->value; - - if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree)) - return &empty_tree; - - return NULL; -} - int odb_mkstemp(struct object_database *odb, struct strbuf *temp_filename, const char *pattern) { @@ -570,7 +551,6 @@ static int do_oid_object_info_extended(struct object_database *odb, const struct object_id *oid, struct object_info *oi, unsigned flags) { - const 
struct cached_object *co; const struct object_id *real = oid; int already_retried = 0; @@ -580,25 +560,8 @@ static int do_oid_object_info_extended(struct object_database *odb, if (is_null_oid(real)) return -1; - co = find_cached_object(odb, real); - if (co) { - if (oi) { - if (oi->typep) - *(oi->typep) = co->type; - if (oi->sizep) - *(oi->sizep) = co->size; - if (oi->disk_sizep) - *(oi->disk_sizep) = 0; - if (oi->delta_base_oid) - oidclr(oi->delta_base_oid, odb->repo->hash_algo); - if (oi->contentp) - *oi->contentp = xmemdupz(co->buf, co->size); - if (oi->mtimep) - *oi->mtimep = 0; - oi->whence = OI_CACHED; - } + if (!odb_source_read_object_info(&odb->inmemory_objects->base, oid, oi, flags)) return 0; - } odb_prepare_alternates(odb); diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index ccbb622eae..12c80f9b34 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -1,5 +1,57 @@ #include "git-compat-util.h" +#include "odb.h" #include "odb/source-inmemory.h" +#include "repository.h" + +static const struct cached_object *find_cached_object(struct odb_source_inmemory *source, + const struct object_id *oid) +{ + static const struct cached_object empty_tree = { + .type = OBJ_TREE, + .buf = "", + }; + const struct cached_object_entry *co = source->objects; + + for (size_t i = 0; i < source->objects_nr; i++, co++) + if (oideq(&co->oid, oid)) + return &co->value; + + if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree)) + return &empty_tree; + + return NULL; +} + +static int odb_source_inmemory_read_object_info(struct odb_source *source, + const struct object_id *oid, + struct object_info *oi, + enum object_info_flags flags UNUSED) +{ + struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); + const struct cached_object *object; + + object = find_cached_object(inmemory, oid); + if (!object) + return -1; + + if (oi) { + if (oi->typep) + *(oi->typep) = object->type; + if (oi->sizep) + *(oi->sizep) = object->size; + if 
(oi->disk_sizep) + *(oi->disk_sizep) = 0; + if (oi->delta_base_oid) + oidclr(oi->delta_base_oid, source->odb->repo->hash_algo); + if (oi->contentp) + *oi->contentp = xmemdupz(object->buf, object->size); + if (oi->mtimep) + *oi->mtimep = 0; + oi->whence = OI_CACHED; + } + + return 0; +} static void odb_source_inmemory_free(struct odb_source *source) { @@ -19,6 +71,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false); source->base.free = odb_source_inmemory_free; + source->base.read_object_info = odb_source_inmemory_read_object_info; return source; } -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* Re: [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback
2026-04-09 7:24 ` [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt
@ 2026-04-09 9:40 ` Karthik Nayak
2026-04-09 10:41 ` Patrick Steinhardt
0 siblings, 1 reply; 85+ messages in thread
From: Karthik Nayak @ 2026-04-09 9:40 UTC (permalink / raw)
To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

[snip]

> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> index ccbb622eae..12c80f9b34 100644
> --- a/odb/source-inmemory.c
> +++ b/odb/source-inmemory.c
> @@ -1,5 +1,57 @@
>  #include "git-compat-util.h"
> +#include "odb.h"
>  #include "odb/source-inmemory.h"
> +#include "repository.h"
> +
> +static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
> +						      const struct object_id *oid)
> +{
> +	static const struct cached_object empty_tree = {
> +		.type = OBJ_TREE,
> +		.buf = "",
> +	};
> +	const struct cached_object_entry *co = source->objects;
> +
> +	for (size_t i = 0; i < source->objects_nr; i++, co++)
> +		if (oideq(&co->oid, oid))
> +			return &co->value;
> +
> +	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
> +		return &empty_tree;
> +

Silly question, would it make more sense to check for empty_tree before
iterating over all objects?

The rest looks good.

[snip]
* Re: [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback
2026-04-09 9:40 ` Karthik Nayak
@ 2026-04-09 10:41 ` Patrick Steinhardt
2026-04-09 11:22 ` Karthik Nayak
0 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 10:41 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, Junio C Hamano, Justin Tobler

On Thu, Apr 09, 2026 at 05:40:01AM -0400, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> > diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> > index ccbb622eae..12c80f9b34 100644
> > --- a/odb/source-inmemory.c
> > +++ b/odb/source-inmemory.c
> > @@ -1,5 +1,57 @@
> >  #include "git-compat-util.h"
> > +#include "odb.h"
> >  #include "odb/source-inmemory.h"
> > +#include "repository.h"
> > +
> > +static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
> > +						      const struct object_id *oid)
> > +{
> > +	static const struct cached_object empty_tree = {
> > +		.type = OBJ_TREE,
> > +		.buf = "",
> > +	};
> > +	const struct cached_object_entry *co = source->objects;
> > +
> > +	for (size_t i = 0; i < source->objects_nr; i++, co++)
> > +		if (oideq(&co->oid, oid))
> > +			return &co->value;
> > +
> > +	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
> > +		return &empty_tree;
> > +
>
> Silly question, would it make more sense to check for empty_tree before
> iterating over all objects?
>
> The rest looks good.

Maybe? I guess for now reading the empty tree is the most important use
case we have for the in-memory backend, as we only write in-memory
objects in a single caller. On the other hand, `source->objects_nr`
would be zero in all the other cases, and jumping over the loop should
be fast enough to not matter in practice.

An alternative I was thinking about is to store the empty tree the same
way as we store all the other objects so that we don't have to special
case anything. That has the benefit that we can actually modify the tree
object, too, which may eventually become relevant with regard to an
object's mtime that we may want to update. The downside is that we have
another allocation here and need to eagerly initialize the data
structure that stores the objects.

Patrick
* Re: [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback
2026-04-09 10:41 ` Patrick Steinhardt
@ 2026-04-09 11:22 ` Karthik Nayak
0 siblings, 0 replies; 85+ messages in thread
From: Karthik Nayak @ 2026-04-09 11:22 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Junio C Hamano, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> On Thu, Apr 09, 2026 at 05:40:01AM -0400, Karthik Nayak wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>> > diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
>> > index ccbb622eae..12c80f9b34 100644
>> > --- a/odb/source-inmemory.c
>> > +++ b/odb/source-inmemory.c
>> > @@ -1,5 +1,57 @@
>> >  #include "git-compat-util.h"
>> > +#include "odb.h"
>> >  #include "odb/source-inmemory.h"
>> > +#include "repository.h"
>> > +
>> > +static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
>> > +						      const struct object_id *oid)
>> > +{
>> > +	static const struct cached_object empty_tree = {
>> > +		.type = OBJ_TREE,
>> > +		.buf = "",
>> > +	};
>> > +	const struct cached_object_entry *co = source->objects;
>> > +
>> > +	for (size_t i = 0; i < source->objects_nr; i++, co++)
>> > +		if (oideq(&co->oid, oid))
>> > +			return &co->value;
>> > +
>> > +	if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree))
>> > +		return &empty_tree;
>> > +
>>
>> Silly question, would it make more sense to check for empty_tree before
>> iterating over all objects?
>>
>> The rest looks good.
>
> Maybe? I guess for now reading the empty tree is the most important use
> case we have for the in-memory backend, as we only write in-memory
> objects in a single caller. On the other hand, `source->objects_nr`
> would be zero in all the other cases, and jumping over the loop should
> be fast enough to not matter in practice.
>

That was what I understood, okay so it's fine as is.

> An alternative I was thinking about is to store the empty tree the same
> way as we store all the other objects so that we don't have to special
> case anything. That has the benefit that we can actually modify the tree
> object, too, which may eventually become relevant with regard to an
> object's mtime that we may want to update. The downside is that we have
> another allocation here and need to eagerly initialize the data
> structure that stores the objects.
>
> Patrick

That would be good too, but I also think maybe it is fine to just leave
it. This is simple enough.
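Editor's note: the lookup order debated in this subthread can be sketched with toy types. Everything below is a hypothetical simplification (a 4-byte object ID, `struct toy_object`, `struct toy_source`, `toy_find()`), not git's `find_cached_object()`. It only shows why the trailing empty-tree check is cheap when nothing has been written: with `objects_nr == 0` the loop body never runs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OID_LEN 4 /* toy size; real object IDs are 20 or 32 bytes */

/* Hypothetical, simplified stand-ins for the cached object structures. */
struct toy_object {
	unsigned char oid[OID_LEN];
	const char *buf;
	size_t size;
};

struct toy_source {
	struct toy_object *objects;
	size_t objects_nr;
};

/*
 * Look up an object: scan explicitly stored objects first, then fall
 * back to a built-in empty tree that never needs to be written.
 */
static const struct toy_object *toy_find(const struct toy_source *source,
					 const unsigned char *oid)
{
	static const struct toy_object empty_tree = {
		.oid = { 0x4b, 0x82, 0x5d, 0xc6 },
		.buf = "",
		.size = 0,
	};

	assert(source && oid);

	/* With no stored objects this loop is skipped entirely. */
	for (size_t i = 0; i < source->objects_nr; i++)
		if (!memcmp(source->objects[i].oid, oid, OID_LEN))
			return &source->objects[i];

	if (!memcmp(oid, empty_tree.oid, OID_LEN))
		return &empty_tree;

	return NULL;
}
```

The alternative mentioned above, storing the empty tree as a regular entry, would drop the special case from the lookup at the cost of an eager allocation when the source is created.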
* [PATCH v2 05/17] odb/source-inmemory: implement `read_object_stream()` callback 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt ` (3 preceding siblings ...) 2026-04-09 7:24 ` [PATCH v2 04/17] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt @ 2026-04-09 7:24 ` Patrick Steinhardt 2026-04-09 9:49 ` Karthik Nayak 2026-04-09 7:24 ` [PATCH v2 06/17] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt ` (12 subsequent siblings) 17 siblings, 1 reply; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `read_object_stream()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index 12c80f9b34..4a68169430 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -1,6 +1,7 @@ #include "git-compat-util.h" #include "odb.h" #include "odb/source-inmemory.h" +#include "odb/streaming.h" #include "repository.h" static const struct cached_object *find_cached_object(struct odb_source_inmemory *source, @@ -53,6 +54,54 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source, return 0; } +struct odb_read_stream_inmemory { + struct odb_read_stream base; + const void *buf; + size_t offset; +}; + +static ssize_t odb_read_stream_inmemory_read(struct odb_read_stream *stream, + char *buf, size_t buf_len) +{ + struct odb_read_stream_inmemory *inmemory = + container_of(stream, struct odb_read_stream_inmemory, base); + size_t bytes = buf_len; + + if (buf_len > inmemory->base.size - inmemory->offset) + bytes = inmemory->base.size - inmemory->offset; + memcpy(buf, inmemory->buf, bytes); + + return bytes; +} + +static int odb_read_stream_inmemory_close(struct odb_read_stream 
*stream UNUSED) +{ + return 0; +} + +static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out, + struct odb_source *source, + const struct object_id *oid) +{ + struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); + struct odb_read_stream_inmemory *stream; + const struct cached_object *object; + + object = find_cached_object(inmemory, oid); + if (!object) + return -1; + + CALLOC_ARRAY(stream, 1); + stream->base.read = odb_read_stream_inmemory_read; + stream->base.close = odb_read_stream_inmemory_close; + stream->base.size = object->size; + stream->base.type = object->type; + stream->buf = object->buf; + + *out = &stream->base; + return 0; +} + static void odb_source_inmemory_free(struct odb_source *source) { struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); @@ -72,6 +121,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.free = odb_source_inmemory_free; source->base.read_object_info = odb_source_inmemory_read_object_info; + source->base.read_object_stream = odb_source_inmemory_read_object_stream; return source; } -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* Re: [PATCH v2 05/17] odb/source-inmemory: implement `read_object_stream()` callback
2026-04-09 7:24 ` [PATCH v2 05/17] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt
@ 2026-04-09 9:49 ` Karthik Nayak
2026-04-09 10:41 ` Patrick Steinhardt
0 siblings, 1 reply; 85+ messages in thread
From: Karthik Nayak @ 2026-04-09 9:49 UTC (permalink / raw)
To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> Implement the `read_object_stream()` callback function for the in-memory
> source.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  odb/source-inmemory.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 50 insertions(+)
>
> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> index 12c80f9b34..4a68169430 100644
> --- a/odb/source-inmemory.c
> +++ b/odb/source-inmemory.c
> @@ -1,6 +1,7 @@
>  #include "git-compat-util.h"
>  #include "odb.h"
>  #include "odb/source-inmemory.h"
> +#include "odb/streaming.h"
>  #include "repository.h"
>
>  static const struct cached_object *find_cached_object(struct odb_source_inmemory *source,
> @@ -53,6 +54,54 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
>  	return 0;
>  }
>
> +struct odb_read_stream_inmemory {
> +	struct odb_read_stream base;
> +	const void *buf;
> +	size_t offset;
> +};
> +

To stream objects, we have a new structure which is used in the callback.

> +static ssize_t odb_read_stream_inmemory_read(struct odb_read_stream *stream,
> +					     char *buf, size_t buf_len)
> +{
> +	struct odb_read_stream_inmemory *inmemory =
> +		container_of(stream, struct odb_read_stream_inmemory, base);
> +	size_t bytes = buf_len;

> +	if (buf_len > inmemory->base.size - inmemory->offset)
> +		bytes = inmemory->base.size - inmemory->offset;
> +	memcpy(buf, inmemory->buf, bytes);
> +

Shouldn't the offset also be advanced, so that we only memcpy from the
offset onwards?

> +	return bytes;
> +}
> +
> +static int odb_read_stream_inmemory_close(struct odb_read_stream *stream UNUSED)
> +{
> +	return 0;
> +}
> +
> +static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
> +						  struct odb_source *source,
> +						  const struct object_id *oid)
> +{
> +	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
> +	struct odb_read_stream_inmemory *stream;
> +	const struct cached_object *object;
> +
> +	object = find_cached_object(inmemory, oid);
> +	if (!object)
> +		return -1;
> +
> +	CALLOC_ARRAY(stream, 1);
> +	stream->base.read = odb_read_stream_inmemory_read;
> +	stream->base.close = odb_read_stream_inmemory_close;
> +	stream->base.size = object->size;
> +	stream->base.type = object->type;
> +	stream->buf = object->buf;
> +

So the object is simply mapped to the structure which is propagated in
`read()`. Since we don't copy any new data over, `close()` has nothing
to do.

[snip]
* Re: [PATCH v2 05/17] odb/source-inmemory: implement `read_object_stream()` callback
2026-04-09 9:49 ` Karthik Nayak
@ 2026-04-09 10:41 ` Patrick Steinhardt
0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 10:41 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, Junio C Hamano, Justin Tobler

On Thu, Apr 09, 2026 at 05:49:32AM -0400, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > Implement the `read_object_stream()` callback function for the in-memory
> > source.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  odb/source-inmemory.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 50 insertions(+)
> >
> > diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> > index 12c80f9b34..4a68169430 100644
> > --- a/odb/source-inmemory.c
> > +++ b/odb/source-inmemory.c
> > @@ -53,6 +54,54 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source,
> >  	return 0;
> >  }
> >
> > +struct odb_read_stream_inmemory {
> > +	struct odb_read_stream base;
> > +	const void *buf;
> > +	size_t offset;
> > +};
> > +
>
> To stream objects, we have a new structure which is used in the callback.
>
> > +static ssize_t odb_read_stream_inmemory_read(struct odb_read_stream *stream,
> > +					     char *buf, size_t buf_len)
> > +{
> > +	struct odb_read_stream_inmemory *inmemory =
> > +		container_of(stream, struct odb_read_stream_inmemory, base);
> > +	size_t bytes = buf_len;
>
> > +	if (buf_len > inmemory->base.size - inmemory->offset)
> > +		bytes = inmemory->base.size - inmemory->offset;
> > +	memcpy(buf, inmemory->buf, bytes);
> > +
>
> Shouldn't the offset also be advanced, so that we only memcpy from the
> offset onwards?

Oh, good catch. We don't have any users of this API yet, which is why it
went undetected. Will fix, thanks.

Patrick
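Editor's note: the fix agreed on in this subthread, copying from the current offset and then advancing it, might look roughly like the sketch below. It is not the follow-up patch itself: `struct toy_read_stream` and `toy_stream_read()` are hypothetical names that collapse the base and in-memory stream structs into one for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical flattened version of the in-memory read stream. */
struct toy_read_stream {
	const char *buf; /* borrowed pointer to the object's data */
	size_t size;     /* total object size */
	size_t offset;   /* number of bytes already handed out */
};

/*
 * Copy up to buf_len bytes starting at the current offset, then advance
 * the offset so that the next call continues where this one stopped.
 * Returns the number of bytes copied; 0 signals end of stream.
 */
static ssize_t toy_stream_read(struct toy_read_stream *stream,
			       char *buf, size_t buf_len)
{
	size_t remaining, bytes;

	assert(stream->offset <= stream->size);
	remaining = stream->size - stream->offset;
	bytes = buf_len < remaining ? buf_len : remaining;

	if (bytes)
		memcpy(buf, stream->buf + stream->offset, bytes);
	stream->offset += bytes;

	return (ssize_t)bytes;
}
```

Reading an 11-byte object through a 5-byte and then an 8-byte buffer yields 5 bytes, then the remaining 6, then 0. Without the `+ stream->offset` and the offset update, every call would return the start of the object again and the stream would never end.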
* [PATCH v2 06/17] odb/source-inmemory: implement `write_object()` callback 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt ` (4 preceding siblings ...) 2026-04-09 7:24 ` [PATCH v2 05/17] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt @ 2026-04-09 7:24 ` Patrick Steinhardt 2026-04-09 7:24 ` [PATCH v2 07/17] " Patrick Steinhardt ` (11 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `write_object()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb.c | 16 ++-------------- odb/source-inmemory.c | 22 ++++++++++++++++++++++ 2 files changed, 24 insertions(+), 14 deletions(-) diff --git a/odb.c b/odb.c index 6a3912adac..24e929f03c 100644 --- a/odb.c +++ b/odb.c @@ -733,24 +733,12 @@ int odb_pretend_object(struct object_database *odb, void *buf, unsigned long len, enum object_type type, struct object_id *oid) { - struct cached_object_entry *co; - char *co_buf; - hash_object_file(odb->repo->hash_algo, buf, len, type, oid); if (odb_has_object(odb, oid, 0)) return 0; - ALLOC_GROW(odb->inmemory_objects->objects, - odb->inmemory_objects->objects_nr + 1, - odb->inmemory_objects->objects_alloc); - co = &odb->inmemory_objects->objects[odb->inmemory_objects->objects_nr++]; - co->value.size = len; - co->value.type = type; - co_buf = xmalloc(len); - memcpy(co_buf, buf, len); - co->value.buf = co_buf; - oidcpy(&co->oid, oid); - return 0; + return odb_source_write_object(&odb->inmemory_objects->base, + buf, len, type, oid, NULL, 0); } void *odb_read_object(struct object_database *odb, diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index 4a68169430..d2fc4c4054 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -102,6 +102,27 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream 
**out, return 0; } +static int odb_source_inmemory_write_object(struct odb_source *source, + const void *buf, unsigned long len, + enum object_type type, + struct object_id *oid, + struct object_id *compat_oid UNUSED, + enum odb_write_object_flags flags UNUSED) +{ + struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); + struct cached_object_entry *object; + + ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1, + inmemory->objects_alloc); + object = &inmemory->objects[inmemory->objects_nr++]; + object->value.size = len; + object->value.type = type; + object->value.buf = xmemdupz(buf, len); + oidcpy(&object->oid, oid); + + return 0; +} + static void odb_source_inmemory_free(struct odb_source *source) { struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); @@ -122,6 +143,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.free = odb_source_inmemory_free; source->base.read_object_info = odb_source_inmemory_read_object_info; source->base.read_object_stream = odb_source_inmemory_read_object_stream; + source->base.write_object = odb_source_inmemory_write_object; return source; } -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH v2 07/17] odb/source-inmemory: implement `write_object()` callback 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt ` (5 preceding siblings ...) 2026-04-09 7:24 ` [PATCH v2 06/17] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt @ 2026-04-09 7:24 ` Patrick Steinhardt 2026-04-09 10:27 ` Karthik Nayak 2026-04-09 7:24 ` [PATCH v2 08/17] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt ` (10 subsequent siblings) 17 siblings, 1 reply; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `write_object()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index d2fc4c4054..96e8efd327 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -1,4 +1,5 @@ #include "git-compat-util.h" +#include "object-file.h" #include "odb.h" #include "odb/source-inmemory.h" #include "odb/streaming.h" @@ -112,6 +113,8 @@ static int odb_source_inmemory_write_object(struct odb_source *source, struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); struct cached_object_entry *object; + hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid); + ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1, inmemory->objects_alloc); object = &inmemory->objects[inmemory->objects_nr++]; -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* Re: [PATCH v2 07/17] odb/source-inmemory: implement `write_object()` callback
2026-04-09 7:24 ` [PATCH v2 07/17] " Patrick Steinhardt
@ 2026-04-09 10:27 ` Karthik Nayak
2026-04-09 10:41 ` Patrick Steinhardt
0 siblings, 1 reply; 85+ messages in thread
From: Karthik Nayak @ 2026-04-09 10:27 UTC (permalink / raw)
To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> Implement the `write_object()` callback function for the in-memory
> source.
>

Rebase error? This seems to be the same commit message as the previous
commit.

> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  odb/source-inmemory.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> index d2fc4c4054..96e8efd327 100644
> --- a/odb/source-inmemory.c
> +++ b/odb/source-inmemory.c
> @@ -1,4 +1,5 @@
>  #include "git-compat-util.h"
> +#include "object-file.h"
>  #include "odb.h"
>  #include "odb/source-inmemory.h"
>  #include "odb/streaming.h"
> @@ -112,6 +113,8 @@ static int odb_source_inmemory_write_object(struct odb_source *source,
>  	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
>  	struct cached_object_entry *object;
>
> +	hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid);
> +
>  	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
>  		   inmemory->objects_alloc);
>  	object = &inmemory->objects[inmemory->objects_nr++];
>
> --
> 2.54.0.rc0.680.geaeac8ef83.dirty
* Re: [PATCH v2 07/17] odb/source-inmemory: implement `write_object()` callback
2026-04-09 10:27 ` Karthik Nayak
@ 2026-04-09 10:41 ` Patrick Steinhardt
0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 10:41 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, Junio C Hamano, Justin Tobler
On Thu, Apr 09, 2026 at 06:27:27AM -0400, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > Implement the `write_object()` callback function for the in-memory
> > source.
> >
>
> rebase error? Seems like the commit message as the last commit.
I saw the empty new commit in the range diff, but somehow didn't get
what was happening. But yes, this obviously needs to be squashed into
the preceding commit, thanks!
Patrick
^ permalink raw reply [flat|nested] 85+ messages in thread
* [PATCH v2 08/17] odb/source-inmemory: implement `write_object_stream()` callback 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt ` (6 preceding siblings ...) 2026-04-09 7:24 ` [PATCH v2 07/17] " Patrick Steinhardt @ 2026-04-09 7:24 ` Patrick Steinhardt 2026-04-09 7:24 ` [PATCH v2 09/17] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt ` (9 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `write_object_stream()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index 96e8efd327..578ceea550 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -126,6 +126,45 @@ static int odb_source_inmemory_write_object(struct odb_source *source, return 0; } +static int odb_source_inmemory_write_object_stream(struct odb_source *source, + struct odb_write_stream *stream, + size_t len, + struct object_id *oid) +{ + char buf[16384]; + size_t total_read = 0; + char *data; + int ret; + + CALLOC_ARRAY(data, len); + while (!stream->is_finished) { + ssize_t bytes_read; + + bytes_read = odb_write_stream_read(stream, buf, sizeof(buf)); + if (total_read + bytes_read > len) { + ret = error("object stream yielded more bytes than expected"); + goto out; + } + + memcpy(data, buf, bytes_read); + total_read += bytes_read; + } + + if (total_read != len) { + ret = error("object stream yielded less bytes than expected"); + goto out; + } + + ret = odb_source_inmemory_write_object(source, data, len, OBJ_BLOB, oid, + NULL, 0); + if (ret < 0) + goto out; + +out: + free(data); + return ret; +} + static void odb_source_inmemory_free(struct odb_source *source) { struct odb_source_inmemory 
*inmemory = odb_source_inmemory_downcast(source); @@ -147,6 +186,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.read_object_info = odb_source_inmemory_read_object_info; source->base.read_object_stream = odb_source_inmemory_read_object_stream; source->base.write_object = odb_source_inmemory_write_object; + source->base.write_object_stream = odb_source_inmemory_write_object_stream; return source; } -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
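The chunked read loop in the patch above can be sketched in isolation. This is only a minimal standalone sketch, not the patch's code: `struct chunk_stream`, `chunk_stream_read()` and `read_whole_stream()` are hypothetical stand-ins for `struct odb_write_stream`, `odb_write_stream_read()` and the callback body. The sketch assumes that each chunk should be appended at the current offset (`data + total_read`) and that a negative read count should be treated as an error, which may be worth comparing against the `memcpy()` call in the loop above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical stand-in for `struct odb_write_stream`. */
struct chunk_stream {
	const char *src;   /* bytes still to be handed out */
	size_t remaining;  /* number of bytes left in `src` */
	int is_finished;   /* set once the last byte has been read */
};

/* Hypothetical stand-in for `odb_write_stream_read()`. */
static ssize_t chunk_stream_read(struct chunk_stream *s, char *buf, size_t len)
{
	size_t n = s->remaining < len ? s->remaining : len;

	memcpy(buf, s->src, n);
	s->src += n;
	s->remaining -= n;
	if (!s->remaining)
		s->is_finished = 1;
	return (ssize_t)n;
}

/*
 * Read exactly `len` bytes from the stream into a freshly allocated
 * buffer. Returns NULL if the stream yields more or fewer bytes than
 * announced, or if a read fails.
 */
static char *read_whole_stream(struct chunk_stream *s, size_t len)
{
	char chunk[4]; /* deliberately tiny so that several reads are needed */
	size_t total_read = 0;
	char *data = calloc(len ? len : 1, 1);

	if (!data)
		return NULL;

	while (!s->is_finished) {
		ssize_t n = chunk_stream_read(s, chunk, sizeof(chunk));

		if (n < 0 || (size_t)n > len - total_read)
			goto err;
		/* Append at the current offset, not at the start of `data`. */
		memcpy(data + total_read, chunk, (size_t)n);
		total_read += (size_t)n;
	}

	if (total_read != len)
		goto err;
	return data;

err:
	free(data);
	return NULL;
}
```

With a 4-byte chunk buffer an 11-byte object needs three reads, so a test along these lines would notice if later chunks overwrote earlier ones.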
* [PATCH v2 09/17] cbtree: allow using arbitrary wrapper structures for nodes
2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
` (7 preceding siblings ...)
2026-04-09 7:24 ` [PATCH v2 08/17] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt
@ 2026-04-09 7:24 ` Patrick Steinhardt
2026-04-09 11:36 ` Karthik Nayak
2026-04-09 7:24 ` [PATCH v2 10/17] oidtree: add ability to store data Patrick Steinhardt
` (8 subsequent siblings)
17 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Justin Tobler
The cbtree subsystem allows the user to store arbitrary data in a
prefix-free set of strings. This is used by us to store object IDs in a
way that we can easily iterate through them in lexicographic order, and
so that we can easily perform lookups with shortened object IDs.
In its current form, it is not easily possible to store arbitrary data
with the tree nodes. There are a couple of approaches a caller could
try to use, but none of them really work:
- One may embed the `struct cb_node` in a custom structure. This does
not work though as `struct cb_node` contains a flex array, and
embedding such a struct in another struct is forbidden.
- One may use a `union` over `struct cb_node` and one's own data type,
which _is_ allowed even if the struct contains a flex array. This does
not work though, as the compiler may align members of the struct so
that the node key would not immediately start where the flex array
starts.
- One may allocate `struct cb_node` such that it has room for both its
key and the custom data. This has the downside though that if the
custom data is itself a pointer to allocated memory, then the leak
checker will not consider the pointer to be alive anymore.
Refactor the cbtree to drop the flex array and instead take in an
explicit offset for where to find the key, which allows the caller to
embed `struct cb_node` in a wrapper struct.
Note that this change has the downside that we now have a bit of padding
in our structure, which grows the size from 60 to 64 bytes on a 64-bit
system. On the other hand though, it allows us to get rid of the memory
copies that we previously had to do to ensure proper alignment. This
seems like a reasonable tradeoff.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
cbtree.c | 25 ++++++++++++++++++-------
cbtree.h | 11 ++++++-----
oidtree.c | 33 ++++++++++++++-------------------
3 files changed, 38 insertions(+), 31 deletions(-)
diff --git a/cbtree.c b/cbtree.c index 4ab794bddc..8f5edbb80a 100644 --- a/cbtree.c +++ b/cbtree.c @@ -7,6 +7,11 @@ #include "git-compat-util.h" #include "cbtree.h" +static inline uint8_t *cb_node_key(struct cb_tree *t, struct cb_node *node) +{ + return (uint8_t *) node + t->key_offset; +} + static struct cb_node *cb_node_of(const void *p) { return (struct cb_node *)((uintptr_t)p - 1); @@ -33,6 +38,7 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen) uint8_t c; int newdirection; struct cb_node **wherep, *p; + uint8_t *node_key, *p_key; assert(!((uintptr_t)node & 1)); /* allocations must be aligned */ @@ -41,23 +47,26 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen) return NULL; /* success */ } + node_key = cb_node_key(t, node); + /* see if a node already exists */ - p = cb_internal_best_match(t->root, node->k, klen); + p = cb_internal_best_match(t->root, node_key, klen); + p_key = cb_node_key(t, p); /* find first differing byte */ for (newbyte = 0; newbyte < klen; newbyte++) { - if (p->k[newbyte] != node->k[newbyte]) + if (p_key[newbyte] != node_key[newbyte]) goto different_byte_found; } return p; /* element exists, let user deal with it */ different_byte_found: - newotherbits = p->k[newbyte] ^ 
node->k[newbyte]; + newotherbits = p_key[newbyte] ^ node_key[newbyte]; newotherbits |= newotherbits >> 1; newotherbits |= newotherbits >> 2; newotherbits |= newotherbits >> 4; newotherbits = (newotherbits & ~(newotherbits >> 1)) ^ 255; - c = p->k[newbyte]; + c = p_key[newbyte]; newdirection = (1 + (newotherbits | c)) >> 8; node->byte = newbyte; @@ -78,7 +87,7 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen) break; if (q->byte == newbyte && q->otherbits > newotherbits) break; - c = q->byte < klen ? node->k[q->byte] : 0; + c = q->byte < klen ? node_key[q->byte] : 0; direction = (1 + (q->otherbits | c)) >> 8; wherep = q->child + direction; } @@ -93,7 +102,7 @@ struct cb_node *cb_lookup(struct cb_tree *t, const uint8_t *k, size_t klen) { struct cb_node *p = cb_internal_best_match(t->root, k, klen); - return p && !memcmp(p->k, k, klen) ? p : NULL; + return p && !memcmp(cb_node_key(t, p), k, klen) ? p : NULL; } static int cb_descend(struct cb_node *p, cb_iter fn, void *arg) @@ -115,6 +124,7 @@ int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen, struct cb_node *p = t->root; struct cb_node *top = p; size_t i = 0; + uint8_t *p_key; if (!p) return 0; /* empty tree */ @@ -130,8 +140,9 @@ int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen, top = p; } + p_key = cb_node_key(t, p); for (i = 0; i < klen; i++) { - if (p->k[i] != kpfx[i]) + if (p_key[i] != kpfx[i]) return 0; /* "best" match failed */ } diff --git a/cbtree.h b/cbtree.h index c374b1b3db..3ce0d6b287 100644 --- a/cbtree.h +++ b/cbtree.h @@ -23,18 +23,19 @@ struct cb_node { */ uint32_t byte; uint8_t otherbits; - uint8_t k[FLEX_ARRAY]; /* arbitrary data, unaligned */ }; struct cb_tree { struct cb_node *root; + ptrdiff_t key_offset; }; -#define CBTREE_INIT { 0 } - -static inline void cb_init(struct cb_tree *t) +static inline void cb_init(struct cb_tree *t, + ptrdiff_t key_offset) { - struct cb_tree blank = CBTREE_INIT; + struct cb_tree blank = { + .key_offset = 
key_offset, + }; memcpy(t, &blank, sizeof(*t)); } diff --git a/oidtree.c b/oidtree.c index ab9fe7ec7a..117649753f 100644 --- a/oidtree.c +++ b/oidtree.c @@ -6,9 +6,14 @@ #include "oidtree.h" #include "hash.h" +struct oidtree_node { + struct cb_node base; + struct object_id key; +}; + void oidtree_init(struct oidtree *ot) { - cb_init(&ot->tree); + cb_init(&ot->tree, offsetof(struct oidtree_node, key)); mem_pool_init(&ot->mem_pool, 0); } @@ -22,20 +27,13 @@ void oidtree_clear(struct oidtree *ot) void oidtree_insert(struct oidtree *ot, const struct object_id *oid) { - struct cb_node *on; - struct object_id k; + struct oidtree_node *on; if (!oid->algo) BUG("oidtree_insert requires oid->algo"); - on = mem_pool_alloc(&ot->mem_pool, sizeof(*on) + sizeof(*oid)); - - /* - * Clear the padding and copy the result in separate steps to - * respect the 4-byte alignment needed by struct object_id. - */ - oidcpy(&k, oid); - memcpy(on->k, &k, sizeof(k)); + on = mem_pool_alloc(&ot->mem_pool, sizeof(*on)); + oidcpy(&on->key, oid); /* * n.b. Current callers won't get us duplicates, here. If a @@ -43,7 +41,7 @@ void oidtree_insert(struct oidtree *ot, const struct object_id *oid) * that won't be freed until oidtree_clear. Currently it's not * worth maintaining a free list */ - cb_insert(&ot->tree, on, sizeof(*oid)); + cb_insert(&ot->tree, &on->base, sizeof(*oid)); } bool oidtree_contains(struct oidtree *ot, const struct object_id *oid) @@ -73,21 +71,18 @@ struct oidtree_each_data { static int iter(struct cb_node *n, void *cb_data) { + struct oidtree_node *node = container_of(n, struct oidtree_node, base); struct oidtree_each_data *data = cb_data; - struct object_id k; - - /* Copy to provide 4-byte alignment needed by struct object_id. 
*/ - memcpy(&k, n->k, sizeof(k)); - if (data->algo != GIT_HASH_UNKNOWN && data->algo != k.algo) + if (data->algo != GIT_HASH_UNKNOWN && data->algo != node->key.algo) return 0; if (data->last_nibble_at) { - if ((k.hash[*data->last_nibble_at] ^ data->last_byte) & 0xf0) + if ((node->key.hash[*data->last_nibble_at] ^ data->last_byte) & 0xf0) return 0; } - return data->cb(&k, data->cb_data); + return data->cb(&node->key, data->cb_data); } int oidtree_each(struct oidtree *ot, const struct object_id *prefix, -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
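The key-offset approach from the commit message above can be illustrated with a small standalone sketch. The types here are simplified stand-ins rather than the real ones: `struct node` and `struct tree` mimic `struct cb_node` and `struct cb_tree`, while `struct wrapper`, `node_key()` and `wrapper_of()` are hypothetical names chosen for illustration only. The idea shown is that the tree never needs to know the wrapper type: it only adds a stored byte offset to the node pointer, and the caller recovers its wrapper with `offsetof()` arithmetic in the spirit of `container_of()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for `struct cb_node`: no trailing flex array. */
struct node {
	struct node *child[2];
	uint32_t byte;
	uint8_t otherbits;
};

/* Simplified stand-in for `struct cb_tree`. */
struct tree {
	struct node *root;
	ptrdiff_t key_offset; /* distance in bytes from the node to its key */
};

/* Hypothetical wrapper that embeds the node next to a key and a payload. */
struct wrapper {
	struct node base;
	uint8_t key[4];
	int payload;
};

/* Locate the key purely from the node pointer and the tree's offset. */
static const uint8_t *node_key(const struct tree *t, const struct node *n)
{
	return (const uint8_t *)n + t->key_offset;
}

/* Recover the wrapper from the embedded node, similar to `container_of()`. */
static struct wrapper *wrapper_of(struct node *n)
{
	return (struct wrapper *)((char *)n - offsetof(struct wrapper, base));
}
```

Because the key is an ordinary struct member in this layout, the compiler takes care of its alignment, which is the property the commit message relies on to drop the extra memory copies.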
* Re: [PATCH v2 09/17] cbtree: allow using arbitrary wrapper structures for nodes
2026-04-09 7:24 ` [PATCH v2 09/17] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
@ 2026-04-09 11:36 ` Karthik Nayak
2026-04-09 11:46 ` Patrick Steinhardt
0 siblings, 1 reply; 85+ messages in thread
From: Karthik Nayak @ 2026-04-09 11:36 UTC (permalink / raw)
To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 428 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
[snip]
> diff --git a/cbtree.h b/cbtree.h
> index c374b1b3db..3ce0d6b287 100644
> --- a/cbtree.h
> +++ b/cbtree.h
> @@ -23,18 +23,19 @@ struct cb_node {
> */
> uint32_t byte;
> uint8_t otherbits;
> - uint8_t k[FLEX_ARRAY]; /* arbitrary data, unaligned */
> };
>
Seems like we need to update the comments at the top of the header file
which still talks about this field.
[snip]
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 85+ messages in thread
* Re: [PATCH v2 09/17] cbtree: allow using arbitrary wrapper structures for nodes
2026-04-09 11:36 ` Karthik Nayak
@ 2026-04-09 11:46 ` Patrick Steinhardt
0 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 11:46 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, Junio C Hamano, Justin Tobler
On Thu, Apr 09, 2026 at 07:36:50AM -0400, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> [snip]
>
> > diff --git a/cbtree.h b/cbtree.h
> > index c374b1b3db..3ce0d6b287 100644
> > --- a/cbtree.h
> > +++ b/cbtree.h
> > @@ -23,18 +23,19 @@ struct cb_node {
> > */
> > uint32_t byte;
> > uint8_t otherbits;
> > - uint8_t k[FLEX_ARRAY]; /* arbitrary data, unaligned */
> > };
> >
>
> Seems like we need to update the comments at the top of the header file
> which still talks about this field.
Good eyes, will adapt.
Patrick
^ permalink raw reply [flat|nested] 85+ messages in thread
* [PATCH v2 10/17] oidtree: add ability to store data
2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
` (8 preceding siblings ...)
2026-04-09 7:24 ` [PATCH v2 09/17] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
@ 2026-04-09 7:24 ` Patrick Steinhardt
2026-04-09 7:24 ` [PATCH v2 11/17] odb/source-inmemory: convert to use oidtree Patrick Steinhardt
` (7 subsequent siblings)
17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Justin Tobler
The oidtree data structure is currently only used to store object IDs,
without any associated data. Consequently, it can only really be used
to track which object IDs exist, and we can use the tree structure to
efficiently operate on OID prefixes.
But there are valid use cases where we want to both:
- Store object IDs in a sorted order.
- Associate arbitrary data with them.
Refactor the oidtree interface so that it allows us to store arbitrary
payloads within the respective nodes. This will be used in the next
commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> --- loose.c | 2 +- object-file.c | 3 ++- oidtree.c | 37 ++++++++++++++++++++++++++++++++----- oidtree.h | 12 ++++++++++-- t/unit-tests/u-oidtree.c | 26 +++++++++++++++++++++++--- 5 files changed, 68 insertions(+), 12 deletions(-) diff --git a/loose.c b/loose.c index 07333be696..f7a3dd1a72 100644 --- a/loose.c +++ b/loose.c @@ -57,7 +57,7 @@ static int insert_loose_map(struct odb_source *source, inserted |= insert_oid_pair(map->to_compat, oid, compat_oid); inserted |= insert_oid_pair(map->to_storage, compat_oid, oid); if (inserted) - oidtree_insert(files->loose->cache, compat_oid); + oidtree_insert(files->loose->cache, compat_oid, NULL); return inserted; } diff --git a/object-file.c b/object-file.c index 3e70e5d668..d04ab57253 100644 --- a/object-file.c +++ b/object-file.c @@ -1857,6 +1857,7 @@ static int for_each_object_wrapper_cb(const struct object_id *oid, } static int for_each_prefixed_object_wrapper_cb(const struct object_id *oid, + void *node_data UNUSED, void *cb_data) { struct for_each_object_wrapper_data *data = cb_data; @@ -2002,7 +2003,7 @@ static int append_loose_object(const struct object_id *oid, const char *path UNUSED, void *data) { - oidtree_insert(data, oid); + oidtree_insert(data, oid, NULL); return 0; } diff --git a/oidtree.c b/oidtree.c index 117649753f..e43f18026e 100644 --- a/oidtree.c +++ b/oidtree.c @@ -9,6 +9,7 @@ struct oidtree_node { struct cb_node base; struct object_id key; + void *data; }; void oidtree_init(struct oidtree *ot) @@ -25,15 +26,22 @@ void oidtree_clear(struct oidtree *ot) } } -void oidtree_insert(struct oidtree *ot, const struct object_id *oid) +struct oidtree_data { + struct object_id oid; +}; + +void oidtree_insert(struct oidtree *ot, const struct object_id *oid, + void *data) { struct oidtree_node *on; + struct cb_node *node; if (!oid->algo) BUG("oidtree_insert requires oid->algo"); on = mem_pool_alloc(&ot->mem_pool, sizeof(*on)); oidcpy(&on->key, oid); + on->data = data; 
/* * n.b. Current callers won't get us duplicates, here. If a @@ -41,13 +49,19 @@ void oidtree_insert(struct oidtree *ot, const struct object_id *oid) * that won't be freed until oidtree_clear. Currently it's not * worth maintaining a free list */ - cb_insert(&ot->tree, &on->base, sizeof(*oid)); + node = cb_insert(&ot->tree, &on->base, sizeof(*oid)); + if (node) { + struct oidtree_node *preexisting = container_of(node, struct oidtree_node, base); + preexisting->data = data; + } } -bool oidtree_contains(struct oidtree *ot, const struct object_id *oid) +static struct oidtree_node *oidtree_lookup(struct oidtree *ot, + const struct object_id *oid) { struct object_id k; size_t klen = sizeof(k); + struct cb_node *node; oidcpy(&k, oid); @@ -58,7 +72,20 @@ bool oidtree_contains(struct oidtree *ot, const struct object_id *oid) klen += BUILD_ASSERT_OR_ZERO(offsetof(struct object_id, hash) < offsetof(struct object_id, algo)); - return !!cb_lookup(&ot->tree, (const uint8_t *)&k, klen); + node = cb_lookup(&ot->tree, (const uint8_t *)&k, klen); + return node ? container_of(node, struct oidtree_node, base) : NULL; +} + +bool oidtree_contains(struct oidtree *ot, const struct object_id *oid) +{ + struct oidtree_node *node = oidtree_lookup(ot, oid); + return node ? 1 : 0; +} + +void *oidtree_get(struct oidtree *ot, const struct object_id *oid) +{ + struct oidtree_node *node = oidtree_lookup(ot, oid); + return node ? node->data : NULL; } struct oidtree_each_data { @@ -82,7 +109,7 @@ static int iter(struct cb_node *n, void *cb_data) return 0; } - return data->cb(&node->key, data->cb_data); + return data->cb(&node->key, node->data, data->cb_data); } int oidtree_each(struct oidtree *ot, const struct object_id *prefix, diff --git a/oidtree.h b/oidtree.h index 2b7bad2e60..baa5a436ea 100644 --- a/oidtree.h +++ b/oidtree.h @@ -29,18 +29,26 @@ void oidtree_init(struct oidtree *ot); */ void oidtree_clear(struct oidtree *ot); -/* Insert the object ID into the tree. 
*/ -void oidtree_insert(struct oidtree *ot, const struct object_id *oid); +/* + * Insert the object ID into the tree and store the given pointer alongside + * with it. The data pointer of any preexisting entry will be overwritten. + */ +void oidtree_insert(struct oidtree *ot, const struct object_id *oid, + void *data); /* Check whether the tree contains the given object ID. */ bool oidtree_contains(struct oidtree *ot, const struct object_id *oid); +/* Get the payload stored with the given object ID. */ +void *oidtree_get(struct oidtree *ot, const struct object_id *oid); + /* * Callback function used for `oidtree_each()`. Returning a non-zero exit code * will cause iteration to stop. The exit code will be propagated to the caller * of `oidtree_each()`. */ typedef int (*oidtree_each_cb)(const struct object_id *oid, + void *node_data, void *cb_data); /* diff --git a/t/unit-tests/u-oidtree.c b/t/unit-tests/u-oidtree.c index d4d05c7dc3..f0d5ebb733 100644 --- a/t/unit-tests/u-oidtree.c +++ b/t/unit-tests/u-oidtree.c @@ -19,7 +19,7 @@ static int fill_tree_loc(struct oidtree *ot, const char *hexes[], size_t n) for (size_t i = 0; i < n; i++) { struct object_id oid; cl_parse_any_oid(hexes[i], &oid); - oidtree_insert(ot, &oid); + oidtree_insert(ot, &oid, NULL); } return 0; } @@ -38,9 +38,9 @@ struct expected_hex_iter { const char *query; }; -static int check_each_cb(const struct object_id *oid, void *data) +static int check_each_cb(const struct object_id *oid, void *node_data UNUSED, void *cb_data) { - struct expected_hex_iter *hex_iter = data; + struct expected_hex_iter *hex_iter = cb_data; struct object_id expected; cl_assert(hex_iter->i < hex_iter->expected_hexes.nr); @@ -105,3 +105,23 @@ void test_oidtree__each(void) check_each(&ot, "32100", "321", NULL); check_each(&ot, "32", "320", "321", NULL); } + +void test_oidtree__insert_overwrites_data(void) +{ + struct object_id oid; + struct oidtree ot; + int a, b; + + cl_parse_any_oid("1", &oid); + + oidtree_init(&ot); + + 
oidtree_insert(&ot, &oid, NULL); + cl_assert_equal_p(oidtree_get(&ot, &oid), NULL); + oidtree_insert(&ot, &oid, &a); + cl_assert_equal_p(oidtree_get(&ot, &oid), &a); + oidtree_insert(&ot, &oid, &b); + cl_assert_equal_p(oidtree_get(&ot, &oid), &b); + + oidtree_clear(&ot); +} -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
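The insert-or-overwrite semantics and the extended callback signature from the patch above can be sketched with a toy container. None of this is the oidtree implementation: `struct toy_store`, `toy_insert()`, `toy_get()`, `toy_each()` and `count_with_data()` are hypothetical names, and the store is a fixed-size unsorted array with string keys instead of a crit-bit tree keyed by object IDs. It only mirrors the documented behaviour: inserting an existing key replaces its data pointer, and the iteration callback receives the per-node payload in addition to the caller's own data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8

/* Toy stand-in for an oidtree: fixed-size, unsorted, string keys. */
struct toy_store {
	const char *keys[TOY_MAX];
	void *data[TOY_MAX];
	size_t nr;
};

/* Insert `key`; if it already exists, only its data pointer is replaced. */
static int toy_insert(struct toy_store *s, const char *key, void *data)
{
	for (size_t i = 0; i < s->nr; i++) {
		if (!strcmp(s->keys[i], key)) {
			s->data[i] = data;
			return 0;
		}
	}
	if (s->nr == TOY_MAX)
		return -1;
	s->keys[s->nr] = key;
	s->data[s->nr++] = data;
	return 0;
}

/* Return the payload stored for `key`, or NULL if there is none. */
static void *toy_get(const struct toy_store *s, const char *key)
{
	for (size_t i = 0; i < s->nr; i++)
		if (!strcmp(s->keys[i], key))
			return s->data[i];
	return NULL;
}

typedef int (*toy_each_cb)(const char *key, void *node_data, void *cb_data);

/* Call `cb` for each entry; a non-zero return stops and is propagated. */
static int toy_each(const struct toy_store *s, toy_each_cb cb, void *cb_data)
{
	for (size_t i = 0; i < s->nr; i++) {
		int ret = cb(s->keys[i], s->data[i], cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: count the entries that carry a non-NULL payload. */
static int count_with_data(const char *key, void *node_data, void *cb_data)
{
	size_t *counter = cb_data;

	(void)key;
	if (node_data)
		(*counter)++;
	return 0;
}
```

As with `oidtree_get()` in the patch, a NULL result from `toy_get()` cannot distinguish a missing key from a key stored with a NULL payload, so a separate membership check along the lines of `oidtree_contains()` remains useful.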
* [PATCH v2 11/17] odb/source-inmemory: convert to use oidtree
2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt
` (9 preceding siblings ...)
2026-04-09 7:24 ` [PATCH v2 10/17] oidtree: add ability to store data Patrick Steinhardt
@ 2026-04-09 7:24 ` Patrick Steinhardt
2026-04-09 7:24 ` [PATCH v2 12/17] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt
` (6 subsequent siblings)
17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Justin Tobler
The in-memory source stores its objects in a simple array that we grow
as needed. This has a couple of downsides:
- The object lookup is O(n). This doesn't matter in practice because
we only store a small number of objects.
- We don't have an easy way to iterate over all objects in
lexicographic order.
- We don't have an easy way to compute unique object ID prefixes.
Refactor the code to use an oidtree instead. This is the same data
structure used by our loose object source, and thus it means we get a
bunch of functionality for free.
Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 72 +++++++++++++++++++++++++++++++++++++-------------- odb/source-inmemory.h | 13 ++-------- 2 files changed, 54 insertions(+), 31 deletions(-) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index 578ceea550..0420b98d00 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -3,20 +3,29 @@ #include "odb.h" #include "odb/source-inmemory.h" #include "odb/streaming.h" +#include "oidtree.h" #include "repository.h" -static const struct cached_object *find_cached_object(struct odb_source_inmemory *source, - const struct object_id *oid) +struct inmemory_object { + enum object_type type; + const void *buf; + unsigned long size; +}; + +static const struct inmemory_object *find_cached_object(struct odb_source_inmemory *source, + const struct object_id *oid) { - static const struct cached_object empty_tree = { + static const struct inmemory_object empty_tree = { .type = OBJ_TREE, .buf = "", }; - const struct cached_object_entry *co = source->objects; + const struct inmemory_object *object; - for (size_t i = 0; i < source->objects_nr; i++, co++) - if (oideq(&co->oid, oid)) - return &co->value; + if (source->objects) { + object = oidtree_get(source->objects, oid); + if (object) + return object; + } if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree)) return &empty_tree; @@ -30,7 +39,7 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source, enum object_info_flags flags UNUSED) { struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); - const struct cached_object *object; + const struct inmemory_object *object; object = find_cached_object(inmemory, oid); if (!object) @@ -86,7 +95,7 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out, { struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); struct odb_read_stream_inmemory *stream; - const struct cached_object *object; + const struct 
inmemory_object *object; object = find_cached_object(inmemory, oid); if (!object) @@ -111,17 +120,23 @@ static int odb_source_inmemory_write_object(struct odb_source *source, enum odb_write_object_flags flags UNUSED) { struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); - struct cached_object_entry *object; + struct inmemory_object *object; hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid); - ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1, - inmemory->objects_alloc); - object = &inmemory->objects[inmemory->objects_nr++]; - object->value.size = len; - object->value.type = type; - object->value.buf = xmemdupz(buf, len); - oidcpy(&object->oid, oid); + if (!inmemory->objects) { + CALLOC_ARRAY(inmemory->objects, 1); + oidtree_init(inmemory->objects); + } else if (oidtree_contains(inmemory->objects, oid)) { + return 0; + } + + CALLOC_ARRAY(object, 1); + object->size = len; + object->type = type; + object->buf = xmemdupz(buf, len); + + oidtree_insert(inmemory->objects, oid, object); return 0; } @@ -165,12 +180,29 @@ static int odb_source_inmemory_write_object_stream(struct odb_source *source, return ret; } +static int inmemory_object_free(const struct object_id *oid UNUSED, + void *node_data, + void *cb_data UNUSED) +{ + struct inmemory_object *object = node_data; + free((void *) object->buf); + free(object); + return 0; +} + static void odb_source_inmemory_free(struct odb_source *source) { struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); - for (size_t i = 0; i < inmemory->objects_nr; i++) - free((char *) inmemory->objects[i].value.buf); - free(inmemory->objects); + + if (inmemory->objects) { + struct object_id null_oid = { 0 }; + + oidtree_each(inmemory->objects, &null_oid, 0, + inmemory_object_free, NULL); + oidtree_clear(inmemory->objects); + free(inmemory->objects); + } + free(inmemory->base.path); free(inmemory); } diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h index 
14dc06f7c3..02cf586b63 100644 --- a/odb/source-inmemory.h +++ b/odb/source-inmemory.h @@ -3,14 +3,7 @@ #include "odb/source.h" -struct cached_object_entry { - struct object_id oid; - struct cached_object { - enum object_type type; - const void *buf; - unsigned long size; - } value; -}; +struct oidtree; /* * An inmemory source that you can write objects to that shall be made @@ -20,9 +13,7 @@ struct cached_object_entry { */ struct odb_source_inmemory { struct odb_source base; - - struct cached_object_entry *objects; - size_t objects_nr, objects_alloc; + struct oidtree *objects; }; /* Create a new in-memory object database source. */ -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH v2 12/17] odb/source-inmemory: implement `for_each_object()` callback 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt ` (10 preceding siblings ...) 2026-04-09 7:24 ` [PATCH v2 11/17] odb/source-inmemory: convert to use oidtree Patrick Steinhardt @ 2026-04-09 7:24 ` Patrick Steinhardt 2026-04-09 7:24 ` [PATCH v2 13/17] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt ` (5 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `for_each_object()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 86 +++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 70 insertions(+), 16 deletions(-) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index 0420b98d00..d1674836cc 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -33,6 +33,28 @@ static const struct inmemory_object *find_cached_object(struct odb_source_inmemo return NULL; } +static void populate_object_info(struct odb_source_inmemory *source, + struct object_info *oi, + const struct inmemory_object *object) +{ + if (!oi) + return; + + if (oi->typep) + *(oi->typep) = object->type; + if (oi->sizep) + *(oi->sizep) = object->size; + if (oi->disk_sizep) + *(oi->disk_sizep) = 0; + if (oi->delta_base_oid) + oidclr(oi->delta_base_oid, source->base.odb->repo->hash_algo); + if (oi->contentp) + *oi->contentp = xmemdupz(object->buf, object->size); + if (oi->mtimep) + *oi->mtimep = 0; + oi->whence = OI_CACHED; +} + static int odb_source_inmemory_read_object_info(struct odb_source *source, const struct object_id *oid, struct object_info *oi, @@ -45,22 +67,7 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source, if (!object) return -1; - if (oi) { - if (oi->typep) - *(oi->typep) = object->type; - 
if (oi->sizep) - *(oi->sizep) = object->size; - if (oi->disk_sizep) - *(oi->disk_sizep) = 0; - if (oi->delta_base_oid) - oidclr(oi->delta_base_oid, source->odb->repo->hash_algo); - if (oi->contentp) - *oi->contentp = xmemdupz(object->buf, object->size); - if (oi->mtimep) - *oi->mtimep = 0; - oi->whence = OI_CACHED; - } - + populate_object_info(inmemory, oi, object); return 0; } @@ -112,6 +119,52 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out, return 0; } +struct odb_source_inmemory_for_each_object_data { + struct odb_source_inmemory *inmemory; + const struct object_info *request; + odb_for_each_object_cb cb; + void *cb_data; +}; + +static int odb_source_inmemory_for_each_object_cb(const struct object_id *oid, + void *node_data, void *cb_data) +{ + struct odb_source_inmemory_for_each_object_data *data = cb_data; + struct inmemory_object *object = node_data; + + if (data->request) { + struct object_info oi = *data->request; + populate_object_info(data->inmemory, &oi, object); + return data->cb(oid, &oi, data->cb_data); + } else { + return data->cb(oid, NULL, data->cb_data); + } +} + +static int odb_source_inmemory_for_each_object(struct odb_source *source, + const struct object_info *request, + odb_for_each_object_cb cb, + void *cb_data, + const struct odb_for_each_object_options *opts) +{ + struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); + struct odb_source_inmemory_for_each_object_data payload = { + .inmemory = inmemory, + .request = request, + .cb = cb, + .cb_data = cb_data, + }; + struct object_id null_oid = { 0 }; + + if ((opts->flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY) || + (opts->flags & ODB_FOR_EACH_OBJECT_LOCAL_ONLY && !source->local)) + return 0; + + return oidtree_each(inmemory->objects, + opts->prefix ? 
opts->prefix : &null_oid, opts->prefix_hex_len, + odb_source_inmemory_for_each_object_cb, &payload); +} + static int odb_source_inmemory_write_object(struct odb_source *source, const void *buf, unsigned long len, enum object_type type, @@ -217,6 +270,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.free = odb_source_inmemory_free; source->base.read_object_info = odb_source_inmemory_read_object_info; source->base.read_object_stream = odb_source_inmemory_read_object_stream; + source->base.for_each_object = odb_source_inmemory_for_each_object; source->base.write_object = odb_source_inmemory_write_object; source->base.write_object_stream = odb_source_inmemory_write_object_stream; -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH v2 13/17] odb/source-inmemory: implement `find_abbrev_len()` callback 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt ` (11 preceding siblings ...) 2026-04-09 7:24 ` [PATCH v2 12/17] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt @ 2026-04-09 7:24 ` Patrick Steinhardt 2026-04-09 7:24 ` [PATCH v2 14/17] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt ` (4 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `find_abbrev_len()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index d1674836cc..a8eba373ee 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -165,6 +165,44 @@ static int odb_source_inmemory_for_each_object(struct odb_source *source, odb_source_inmemory_for_each_object_cb, &payload); } +struct find_abbrev_len_data { + const struct object_id *oid; + unsigned len; +}; + +static int find_abbrev_len_cb(const struct object_id *oid, + struct object_info *oi UNUSED, + void *cb_data) +{ + struct find_abbrev_len_data *data = cb_data; + unsigned len = oid_common_prefix_hexlen(oid, data->oid); + if (len != hash_algos[oid->algo].hexsz && len >= data->len) + data->len = len + 1; + return 0; +} + +static int odb_source_inmemory_find_abbrev_len(struct odb_source *source, + const struct object_id *oid, + unsigned min_len, + unsigned *out) +{ + struct odb_for_each_object_options opts = { + .prefix = oid, + .prefix_hex_len = min_len, + }; + struct find_abbrev_len_data data = { + .oid = oid, + .len = min_len, + }; + int ret; + + ret = odb_source_inmemory_for_each_object(source, NULL, find_abbrev_len_cb, + &data, 
&opts); + *out = data.len; + + return ret; +} + static int odb_source_inmemory_write_object(struct odb_source *source, const void *buf, unsigned long len, enum object_type type, @@ -271,6 +309,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.read_object_info = odb_source_inmemory_read_object_info; source->base.read_object_stream = odb_source_inmemory_read_object_stream; source->base.for_each_object = odb_source_inmemory_for_each_object; + source->base.find_abbrev_len = odb_source_inmemory_find_abbrev_len; source->base.write_object = odb_source_inmemory_write_object; source->base.write_object_stream = odb_source_inmemory_write_object_stream; -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
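Editorial sketch, not part of the patch: the callback above raises the abbreviation length to one more than the longest hex prefix shared with any other object. The standalone version below assumes plain hex strings in an array instead of `struct object_id` and an oidtree. The names `common_prefix_hexlen` and `shortest_unique_len` are hypothetical and do not exist in Git. Unlike the patch, it walks every entry rather than restricting the walk to the `min_len` prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of leading hex characters that two hex strings share. */
static size_t common_prefix_hexlen(const char *a, const char *b)
{
	size_t i = 0;
	while (a[i] && b[i] && a[i] == b[i])
		i++;
	return i;
}

/*
 * Return the shortest prefix length of `needle`, at least `min_len`, that
 * no other entry in `ids` shares. An entry identical to `needle` is
 * skipped, mirroring the `len != hexsz` check in the patch.
 */
static size_t shortest_unique_len(const char *needle, const char *const *ids,
				  size_t nr, size_t min_len)
{
	size_t hexsz = strlen(needle);
	size_t len = min_len;

	assert(min_len <= hexsz);

	for (size_t i = 0; i < nr; i++) {
		size_t common = common_prefix_hexlen(needle, ids[i]);
		/* A differing entry sharing `len` or more characters forces a longer prefix. */
		if (common != hexsz && common >= len)
			len = common + 1;
	}
	return len;
}
```

With the entries "abcd1234", "abce0000" and "ff000000" and a minimum length of 2, the first entry needs four characters because it shares "abc" with the second.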
* [PATCH v2 14/17] odb/source-inmemory: implement `count_objects()` callback 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt ` (12 preceding siblings ...) 2026-04-09 7:24 ` [PATCH v2 13/17] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt @ 2026-04-09 7:24 ` Patrick Steinhardt 2026-04-09 7:24 ` [PATCH v2 15/17] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt ` (3 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `count_objects()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index a8eba373ee..f038debaa3 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -203,6 +203,25 @@ static int odb_source_inmemory_find_abbrev_len(struct odb_source *source, return ret; } +static int count_objects_cb(const struct object_id *oid UNUSED, + struct object_info *oi UNUSED, + void *cb_data) +{ + unsigned long *counter = cb_data; + (*counter)++; + return 0; +} + +static int odb_source_inmemory_count_objects(struct odb_source *source, + enum odb_count_objects_flags flags UNUSED, + unsigned long *out) +{ + struct odb_for_each_object_options opts = { 0 }; + *out = 0; + return odb_source_inmemory_for_each_object(source, NULL, count_objects_cb, + out, &opts); +} + static int odb_source_inmemory_write_object(struct odb_source *source, const void *buf, unsigned long len, enum object_type type, @@ -310,6 +329,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.read_object_stream = odb_source_inmemory_read_object_stream; source->base.for_each_object = odb_source_inmemory_for_each_object; 
source->base.find_abbrev_len = odb_source_inmemory_find_abbrev_len; + source->base.count_objects = odb_source_inmemory_count_objects; source->base.write_object = odb_source_inmemory_write_object; source->base.write_object_stream = odb_source_inmemory_write_object_stream; -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
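Editorial sketch, not part of the patch: `count_objects()` above is a thin wrapper around the iteration callback, passing the output counter as callback data. The standalone version below shows the same pattern over an array of strings. `for_each_id`, `count_cb` and `count_ids` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*each_fn)(const char *id, void *cb_data);

/* Invoke `fn` for every entry and stop early on a non-zero return value. */
static int for_each_id(const char *const *ids, size_t nr,
		       each_fn fn, void *cb_data)
{
	for (size_t i = 0; i < nr; i++) {
		int ret = fn(ids[i], cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* The callback ignores the entry and only bumps the counter. */
static int count_cb(const char *id, void *cb_data)
{
	unsigned long *counter = cb_data;
	(void)id;
	(*counter)++;
	return 0;
}

/* Reset the counter, then let the iterator do the counting. */
static int count_ids(const char *const *ids, size_t nr, unsigned long *out)
{
	assert(out);
	*out = 0;
	return for_each_id(ids, nr, count_cb, out);
}
```

Reusing the iterator keeps the counting logic trivial. The cost is one indirect call per object, which is likely acceptable for a store that only holds what fits in memory.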
* [PATCH v2 15/17] odb/source-inmemory: implement `freshen_object()` callback 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt ` (13 preceding siblings ...) 2026-04-09 7:24 ` [PATCH v2 14/17] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt @ 2026-04-09 7:24 ` Patrick Steinhardt 2026-04-09 7:24 ` [PATCH v2 16/17] odb/source-inmemory: stub out remaining functions Patrick Steinhardt ` (2 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `freshen_object()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index f038debaa3..15a6a5ae64 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -290,6 +290,15 @@ static int odb_source_inmemory_write_object_stream(struct odb_source *source, return ret; } +static int odb_source_inmemory_freshen_object(struct odb_source *source, + const struct object_id *oid) +{ + struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); + if (find_cached_object(inmemory, oid)) + return 1; + return 0; +} + static int inmemory_object_free(const struct object_id *oid UNUSED, void *node_data, void *cb_data UNUSED) @@ -332,6 +341,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.count_objects = odb_source_inmemory_count_objects; source->base.write_object = odb_source_inmemory_write_object; source->base.write_object_stream = odb_source_inmemory_write_object_stream; + source->base.freshen_object = odb_source_inmemory_freshen_object; return source; } -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH v2 16/17] odb/source-inmemory: stub out remaining functions 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt ` (14 preceding siblings ...) 2026-04-09 7:24 ` [PATCH v2 15/17] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt @ 2026-04-09 7:24 ` Patrick Steinhardt 2026-04-09 19:39 ` Junio C Hamano 2026-04-09 7:24 ` [PATCH v2 17/17] odb: generic in-memory source Patrick Steinhardt 2026-04-09 11:44 ` [PATCH v2 00/17] odb: introduce "in-memory" source Karthik Nayak 17 siblings, 1 reply; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Stub out remaining functions that we either don't need or that are basically no-ops. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index 15a6a5ae64..1140b1b916 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -299,6 +299,32 @@ static int odb_source_inmemory_freshen_object(struct odb_source *source, return 0; } +static int odb_source_inmemory_begin_transaction(struct odb_source *source UNUSED, + struct odb_transaction **out UNUSED) +{ + return error("inmemory source does not support transactions"); +} + +static int odb_source_inmemory_read_alternates(struct odb_source *source UNUSED, + struct strvec *out UNUSED) +{ + return 0; +} + +static int odb_source_inmemory_write_alternate(struct odb_source *source UNUSED, + const char *alternate UNUSED) +{ + return error("inmemory source does not support alternates"); +} + +static void odb_source_inmemory_close(struct odb_source *source UNUSED) +{ +} + +static void odb_source_inmemory_reprepare(struct odb_source *source UNUSED) +{ +} + static int inmemory_object_free(const struct object_id *oid UNUSED, void *node_data, void *cb_data UNUSED) @@ -334,6 +360,8 @@ struct 
odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false); source->base.free = odb_source_inmemory_free; + source->base.close = odb_source_inmemory_close; + source->base.reprepare = odb_source_inmemory_reprepare; source->base.read_object_info = odb_source_inmemory_read_object_info; source->base.read_object_stream = odb_source_inmemory_read_object_stream; source->base.for_each_object = odb_source_inmemory_for_each_object; @@ -342,6 +370,9 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.write_object = odb_source_inmemory_write_object; source->base.write_object_stream = odb_source_inmemory_write_object_stream; source->base.freshen_object = odb_source_inmemory_freshen_object; + source->base.begin_transaction = odb_source_inmemory_begin_transaction; + source->base.read_alternates = odb_source_inmemory_read_alternates; + source->base.write_alternate = odb_source_inmemory_write_alternate; return source; } -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
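Editorial sketch, not part of the patch: the stubs above fill every remaining slot of the source's function table, presumably so that generic code can call any callback without first checking for NULL. The standalone version below illustrates that idea with a reduced, hypothetical `struct demo_source`. The real `struct odb_source` has more callbacks and different signatures. Setting `*nr_out` to zero in the no-op is a choice made here, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, reduced function table. */
struct demo_source {
	int (*begin_transaction)(struct demo_source *src);
	int (*read_alternates)(struct demo_source *src, size_t *nr_out);
	void (*close)(struct demo_source *src);
};

/* Unsupported operation: report an error and return -1. */
static int demo_begin_transaction(struct demo_source *src)
{
	(void)src;
	fprintf(stderr, "error: in-memory source does not support transactions\n");
	return -1;
}

/* Supported but trivial: an in-memory source never has alternates. */
static int demo_read_alternates(struct demo_source *src, size_t *nr_out)
{
	(void)src;
	*nr_out = 0;
	return 0;
}

/* Nothing to release on close. */
static void demo_close(struct demo_source *src)
{
	(void)src;
}

/* Fill every slot so that callers never hit a NULL function pointer. */
static void demo_source_init(struct demo_source *src)
{
	assert(src);
	src->begin_transaction = demo_begin_transaction;
	src->read_alternates = demo_read_alternates;
	src->close = demo_close;
}
```

The split between "error out" and "succeed as a no-op" follows the patch: operations that cannot be honored fail loudly, while operations that are vacuously true succeed quietly.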
* Re: [PATCH v2 16/17] odb/source-inmemory: stub out remaining functions 2026-04-09 7:24 ` [PATCH v2 16/17] odb/source-inmemory: stub out remaining functions Patrick Steinhardt @ 2026-04-09 19:39 ` Junio C Hamano 2026-04-10 4:53 ` Patrick Steinhardt 0 siblings, 1 reply; 85+ messages in thread From: Junio C Hamano @ 2026-04-09 19:39 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git, Justin Tobler Patrick Steinhardt <ps@pks.im> writes: > Stub out remaining functions that we either don't need or that are > basically no-ops. > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > --- > odb/source-inmemory.c | 31 +++++++++++++++++++++++++++++++ > 1 file changed, 31 insertions(+) > > diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c > index 15a6a5ae64..1140b1b916 100644 > --- a/odb/source-inmemory.c > +++ b/odb/source-inmemory.c > @@ -299,6 +299,32 @@ static int odb_source_inmemory_freshen_object(struct odb_source *source, > return 0; > } > > +static int odb_source_inmemory_begin_transaction(struct odb_source *source UNUSED, > + struct odb_transaction **out UNUSED) > +{ > + return error("inmemory source does not support transactions"); > +} > + > +static int odb_source_inmemory_read_alternates(struct odb_source *source UNUSED, > + struct strvec *out UNUSED) > +{ > + return 0; > +} > + > +static int odb_source_inmemory_write_alternate(struct odb_source *source UNUSED, > + const char *alternate UNUSED) > +{ > + return error("inmemory source does not support alternates"); > +} OK, 00/17 said it only fixed log message, but the messages or anything end-user facing should consistently say "in-memory". Or, "in-core", if "incore" is chosen as part of identifiers to be consistent with the rest of the system. ^ permalink raw reply [flat|nested] 85+ messages in thread
* Re: [PATCH v2 16/17] odb/source-inmemory: stub out remaining functions 2026-04-09 19:39 ` Junio C Hamano @ 2026-04-10 4:53 ` Patrick Steinhardt 0 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-10 4:53 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Justin Tobler On Thu, Apr 09, 2026 at 12:39:00PM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > Stub out remaining functions that we either don't need or that are > > basically no-ops. > > > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > > --- > > odb/source-inmemory.c | 31 +++++++++++++++++++++++++++++++ > > 1 file changed, 31 insertions(+) > > > > diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c > > index 15a6a5ae64..1140b1b916 100644 > > --- a/odb/source-inmemory.c > > +++ b/odb/source-inmemory.c > > @@ -299,6 +299,32 @@ static int odb_source_inmemory_freshen_object(struct odb_source *source, > > return 0; > > } > > > > +static int odb_source_inmemory_begin_transaction(struct odb_source *source UNUSED, > > + struct odb_transaction **out UNUSED) > > +{ > > + return error("inmemory source does not support transactions"); > > +} > > + > > +static int odb_source_inmemory_read_alternates(struct odb_source *source UNUSED, > > + struct strvec *out UNUSED) > > +{ > > + return 0; > > +} > > + > > +static int odb_source_inmemory_write_alternate(struct odb_source *source UNUSED, > > + const char *alternate UNUSED) > > +{ > > + return error("inmemory source does not support alternates"); > > +} > > OK, 00/17 said it only fixed log message, but the messages or > anything end-user facing should consistently say "in-memory". > > Or, "in-core", if "incore" is chosen as part of identifiers to be > consistent with the rest of the system. Good catch, will fix. Thanks! Patrick ^ permalink raw reply [flat|nested] 85+ messages in thread
* [PATCH v2 17/17] odb: generic in-memory source 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt ` (15 preceding siblings ...) 2026-04-09 7:24 ` [PATCH v2 16/17] odb/source-inmemory: stub out remaining functions Patrick Steinhardt @ 2026-04-09 7:24 ` Patrick Steinhardt 2026-04-09 11:44 ` [PATCH v2 00/17] odb: introduce "in-memory" source Karthik Nayak 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 7:24 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Make the in-memory source generic. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb.c | 8 ++++---- odb.h | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/odb.c b/odb.c index 24e929f03c..965ef68e4e 100644 --- a/odb.c +++ b/odb.c @@ -560,7 +560,7 @@ static int do_oid_object_info_extended(struct object_database *odb, if (is_null_oid(real)) return -1; - if (!odb_source_read_object_info(&odb->inmemory_objects->base, oid, oi, flags)) + if (!odb_source_read_object_info(odb->inmemory_objects, oid, oi, flags)) return 0; odb_prepare_alternates(odb); @@ -737,7 +737,7 @@ int odb_pretend_object(struct object_database *odb, if (odb_has_object(odb, oid, 0)) return 0; - return odb_source_write_object(&odb->inmemory_objects->base, + return odb_source_write_object(odb->inmemory_objects, buf, len, type, oid, NULL, 0); } @@ -1020,7 +1020,7 @@ struct object_database *odb_new(struct repository *repo, o->sources = odb_source_new(o, primary_source, true); o->sources_tail = &o->sources->next; o->alternate_db = xstrdup_or_null(secondary_sources); - o->inmemory_objects = odb_source_inmemory_new(o); + o->inmemory_objects = &odb_source_inmemory_new(o)->base; free(to_free); @@ -1045,7 +1045,7 @@ static void odb_free_sources(struct object_database *o) o->sources = next; } - odb_source_free(&o->inmemory_objects->base); + odb_source_free(o->inmemory_objects); o->inmemory_objects = NULL; kh_destroy_odb_path_map(o->source_by_path); 
diff --git a/odb.h b/odb.h index c3a7edf9c8..73553ed5a7 100644 --- a/odb.h +++ b/odb.h @@ -81,7 +81,7 @@ struct object_database { * to write them into the object store (e.g. a browse-only * application). */ - struct odb_source_inmemory *inmemory_objects; + struct odb_source *inmemory_objects; /* * A fast, rough count of the number of objects in the repository. -- 2.54.0.rc0.680.geaeac8ef83.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
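Editorial sketch, not part of the patch: once `inmemory_objects` is stored as a plain `struct odb_source *`, backend code gets back to its own structure through the type-checked downcast introduced earlier in the series. The standalone version below shows that pattern with hypothetical `demo_` types and a local `demo_container_of` macro. It returns NULL on a type mismatch, whereas the series calls `BUG()`.

```c
#include <assert.h>
#include <stddef.h>

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

enum demo_source_type {
	DEMO_SOURCE_FILES,
	DEMO_SOURCE_INMEMORY,
};

/* Generic part that callers hold a pointer to. */
struct demo_source {
	enum demo_source_type type;
};

/* Backend-specific part that embeds the generic one. */
struct demo_source_inmemory {
	struct demo_source base;
	size_t objects_nr;
};

/* Downcast only when the type tag matches. */
static struct demo_source_inmemory *demo_downcast(struct demo_source *source)
{
	assert(source);
	if (source->type != DEMO_SOURCE_INMEMORY)
		return NULL; /* the series calls BUG() here instead */
	return demo_container_of(source, struct demo_source_inmemory, base);
}
```

Callers that only need the common interface never see the backend structure, which is what allows `odb.c` to drop the `->base` accesses in the diff above.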
* Re: [PATCH v2 00/17] odb: introduce "in-memory" source 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt ` (16 preceding siblings ...) 2026-04-09 7:24 ` [PATCH v2 17/17] odb: generic in-memory source Patrick Steinhardt @ 2026-04-09 11:44 ` Karthik Nayak 2026-04-09 11:48 ` Patrick Steinhardt 17 siblings, 1 reply; 85+ messages in thread From: Karthik Nayak @ 2026-04-09 11:44 UTC (permalink / raw) To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler [-- Attachment #1: Type: text/plain, Size: 1560 bytes --] Patrick Steinhardt <ps@pks.im> writes: > Hi, > > this patch series introduces the second object database source type, > which is the "in-memory" source. > > This source may seem somewhat odd at first: it always starts out empty, > and any object written into it will only exist in memory until the > process exits. But the source already serves a purpose in our codebase, > where some commands, for example git-blame(1), write an in-memory > worktree commit. > > Furthermore, I think that going forward it can serve more purposes as we > now have an easy way to write and read objects that will not get > persisted. I could see that this may be useful when for example > re-merging diffs. But eventually, once we have the object storage format > extension wired up, callers might even want to manually set up an > in-memory database as the primary ODB for write operations so that no > data will be persisted in an arbitrary write. > > Last but not least, this patch series also serves the purpose of > eventually getting rid of the `struct object_info::whence` member. > Instead, we'll simply yield the ODB source a specific object has been > read from, together with some backend-specific data, which gives > strictly more information compared to the status quo. 
> > The series is based on b15384c06f (A bit more post -rc1, 2026-04-08) > with jt/odb-transaction-write at ddf6aee9c6 (odb/transaction: make > `write_object_stream()` pluggable, 2026-04-02) merged into it. > Was a nice read, only a few comments from me. Should be good with a re-roll! [snip] [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 85+ messages in thread
* Re: [PATCH v2 00/17] odb: introduce "in-memory" source 2026-04-09 11:44 ` [PATCH v2 00/17] odb: introduce "in-memory" source Karthik Nayak @ 2026-04-09 11:48 ` Patrick Steinhardt 0 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-09 11:48 UTC (permalink / raw) To: Karthik Nayak; +Cc: git, Junio C Hamano, Justin Tobler On Thu, Apr 09, 2026 at 07:44:17AM -0400, Karthik Nayak wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > Hi, > > > > this patch series introduces the second object database source type, > > which is the "in-memory" source. > > > > This source may seem somewhat odd at first: it always starts out empty, > > and any object written into it will only exist in memory until the > > process exits. But the source already serves a purpose in our codebase, > > where some commands, for example git-blame(1), write an in-memory > > worktree commit. > > > > Furthermore, I think that going forward it can serve more purposes as we > > now have an easy way to write and read objects that will not get > > persisted. I could see that this may be useful when for example > > re-merging diffs. But eventually, once we have the object storage format > > extension wired up, callers might even want to manually set up an > > in-memory database as the primary ODB for write operations so that no > > data will be persisted in an arbitrary write. > > > > Last but not least, this patch series also serves the purpose of > > eventually getting rid of the `struct object_info::whence` member. > > Instead, we'll simply yield the ODB source a specific object has been > > read from, together with some backend-specific data, which gives > > strictly more information compared to the status quo. > > > > The series is based on b15384c06f (A bit more post -rc1, 2026-04-08) > > with jt/odb-transaction-write at ddf6aee9c6 (odb/transaction: make > > `write_object_stream()` pluggable, 2026-04-02) merged into it. > > > > Was a nice read, only a few comments from me. 
Should be good with a > re-roll! Thanks! Will send the new version tomorrow to wait for some more feedback. Patrick ^ permalink raw reply [flat|nested] 85+ messages in thread
* [PATCH v3 00/17] odb: introduce "in-memory" source 2026-04-03 6:01 [PATCH 00/16] odb: introduce "inmemory" source Patrick Steinhardt ` (17 preceding siblings ...) 2026-04-09 7:24 ` [PATCH v2 00/17] odb: introduce "in-memory" source Patrick Steinhardt @ 2026-04-10 12:12 ` Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 01/17] " Patrick Steinhardt ` (17 more replies) 18 siblings, 18 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Justin Tobler

Hi,

this patch series introduces the second object database source type,
which is the "in-memory" source.

This source may seem somewhat odd at first: it always starts out empty,
and any object written into it will only exist in memory until the
process exits. But the source already serves a purpose in our codebase,
where some commands, for example git-blame(1), write an in-memory
worktree commit.

Furthermore, I think that going forward it can serve more purposes as we
now have an easy way to write and read objects that will not get
persisted. I could see that this may be useful when for example
re-merging diffs. But eventually, once we have the object storage format
extension wired up, callers might even want to manually set up an
in-memory database as the primary ODB for write operations so that no
data will be persisted in an arbitrary write.

Last but not least, this patch series also serves the purpose of
eventually getting rid of the `struct object_info::whence` member.
Instead, we'll simply yield the ODB source a specific object has been
read from, together with some backend-specific data, which gives
strictly more information compared to the status quo.

The series is based on b15384c06f (A bit more post -rc1, 2026-04-08)
with jt/odb-transaction-write at ddf6aee9c6 (odb/transaction: make
`write_object_stream()` pluggable, 2026-04-02) merged into it.

Changes in v2:
- Fix handling of object IDs when writing objects.
- I've changed the base of this series to include Justin's refactorings for the ODB write streams. I've updated the above paragraph detailing the merge base accordingly. @Junio: I'm fine to defer this patch series a bit until Justin's patch series has been merged to `next` in case this causes inconvenience. - Use "in-memory" instead of "inmemory" in commit messages. - Link to v1: https://patch.msgid.link/20260403-b4-pks-odb-source-inmemory-v1-0-8b8d1abaa25e@pks.im Changes in v3: - Fix a couple more instances where we were saying "inmemory" in prose. - Fix streaming interface when reading an object. - Add unit tests to exercise full functionality of the new source. Some of the functionality isn't exercised in our code base yet, so this allows us to verify that things work as expected. - Link to v2: https://patch.msgid.link/20260409-b4-pks-odb-source-inmemory-v2-0-f02b4f1c0f13@pks.im Thanks! Patrick --- Patrick Steinhardt (17): odb: introduce "in-memory" source odb/source-inmemory: implement `free()` callback odb: fix unnecessary call to `find_cached_object()` odb/source-inmemory: implement `read_object_info()` callback odb/source-inmemory: implement `read_object_stream()` callback odb/source-inmemory: implement `write_object()` callback odb/source-inmemory: implement `write_object_stream()` callback cbtree: allow using arbitrary wrapper structures for nodes oidtree: add ability to store data odb/source-inmemory: convert to use oidtree odb/source-inmemory: implement `for_each_object()` callback odb/source-inmemory: implement `find_abbrev_len()` callback odb/source-inmemory: implement `count_objects()` callback odb/source-inmemory: implement `freshen_object()` callback odb/source-inmemory: stub out remaining functions odb: generic in-memory source t/unit-tests: add tests for the in-memory object source Makefile | 2 + cbtree.c | 25 ++- cbtree.h | 17 +- loose.c | 2 +- meson.build | 1 + object-file.c | 3 +- odb.c | 82 ++------- odb.h | 4 +- odb/source-inmemory.c | 382 
++++++++++++++++++++++++++++++++++++++++++ odb/source-inmemory.h | 33 ++++ odb/source.h | 3 + oidtree.c | 66 +++++--- oidtree.h | 12 +- t/meson.build | 1 + t/unit-tests/u-odb-inmemory.c | 313 ++++++++++++++++++++++++++++++++++ t/unit-tests/u-oidtree.c | 26 ++- 16 files changed, 854 insertions(+), 118 deletions(-) Range-diff versus v2: 1: b18e427c69 ! 1: 155b2cdf81 odb: introduce "in-memory" source @@ odb/source-inmemory.h (new) +struct cached_object_entry; + +/* -+ * An inmemory source that you can write objects to that shall be made ++ * An in-memory source that you can write objects to that shall be made + * available for reading, but that shouldn't ever be persisted to disk. Note + * that any objects written to this source will be stored in memory, so the + * number of objects you can store is limited by available system memory. @@ odb/source-inmemory.h (new) +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb); + +/* -+ * Cast the given object database source to the inmemory backend. This will ++ * Cast the given object database source to the in-memory backend. This will + * cause a BUG in case the source doesn't use this backend. + */ +static inline struct odb_source_inmemory *odb_source_inmemory_downcast(struct odb_source *source) +{ + if (source->type != ODB_SOURCE_INMEMORY) -+ BUG("trying to downcast source of type '%d' to inmemory", source->type); ++ BUG("trying to downcast source of type '%d' to in-memory", source->type); + return container_of(source, struct odb_source_inmemory, base); +} + @@ odb/source.h: enum odb_source_type { /* The "files" backend that uses loose objects and packfiles. */ ODB_SOURCE_FILES, + -+ /* The "inmemory" backend that stores objects in memory. */ ++ /* The "in-memory" backend that stores objects in memory. */ + ODB_SOURCE_INMEMORY, }; 2: 8fd337da90 ! 
2: c66edd10a8 odb/source-inmemory: implement `free()` callback @@ odb/source-inmemory.h +}; /* - * An inmemory source that you can write objects to that shall be made + * An in-memory source that you can write objects to that shall be made 3: f4ae2a2bde = 3: a86549f39c odb: fix unnecessary call to `find_cached_object()` 4: 8600b88530 = 4: 49ac739dd2 odb/source-inmemory: implement `read_object_info()` callback 5: ab33c0b7ee ! 5: 321ef11be3 odb/source-inmemory: implement `read_object_stream()` callback @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_info(struct od +struct odb_read_stream_inmemory { + struct odb_read_stream base; -+ const void *buf; ++ const unsigned char *buf; + size_t offset; +}; + @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_info(struct od + + if (buf_len > inmemory->base.size - inmemory->offset) + bytes = inmemory->base.size - inmemory->offset; -+ memcpy(buf, inmemory->buf, bytes); ++ ++ memcpy(buf, inmemory->buf + inmemory->offset, bytes); ++ inmemory->offset += bytes; + + return bytes; +} 6: 983f886eeb ! 
6: 506df5e488 odb/source-inmemory: implement `write_object()` callback @@ odb.c: int odb_pretend_object(struct object_database *odb, void *odb_read_object(struct object_database *odb, ## odb/source-inmemory.c ## +@@ + #include "git-compat-util.h" ++#include "object-file.h" + #include "odb.h" + #include "odb/source-inmemory.h" + #include "odb/streaming.h" @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out, return 0; } @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_stream(struct + struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); + struct cached_object_entry *object; + ++ hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid); ++ + ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1, + inmemory->objects_alloc); + object = &inmemory->objects[inmemory->objects_nr++]; 7: 68edefa269 < -: ---------- odb/source-inmemory: implement `write_object()` callback 8: 18d451152b ! 7: 21eef34c1b odb/source-inmemory: implement `write_object_stream()` callback @@ odb/source-inmemory.c: static int odb_source_inmemory_write_object(struct odb_so + goto out; + } + -+ memcpy(data, buf, bytes_read); ++ memcpy(data + total_read, buf, bytes_read); + total_read += bytes_read; + } + 9: cee53b9853 ! 8: 504e34d116 cbtree: allow using arbitrary wrapper structures for nodes @@ cbtree.c: int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen, ## cbtree.h ## +@@ + * + * This is adapted to store arbitrary data (not just NUL-terminated C strings + * and allocates no memory internally. The user needs to allocate +- * "struct cb_node" and fill cb_node.k[] with arbitrary match data +- * for memcmp. +- * If "klen" is variable, then it should be embedded into "c_node.k[]" ++ * "struct cb_node" and provide `key_offset` to indicate where the key can be ++ * found relative to the `struct cb_node` for memcmp. ++ * If "klen" is variable, then it should be embedded into the key. 
+ * Recursion is bound by the maximum value of "klen" used. + */ + #ifndef CBTREE_H @@ cbtree.h: struct cb_node { */ uint32_t byte; 10: 8ad5b81b13 = 9: 9bdd475a92 oidtree: add ability to store data 11: 1ed2d23137 ! 10: 956b989529 odb/source-inmemory: convert to use oidtree @@ odb/source-inmemory.h +struct oidtree; /* - * An inmemory source that you can write objects to that shall be made + * An in-memory source that you can write objects to that shall be made @@ odb/source-inmemory.h: struct cached_object_entry { */ struct odb_source_inmemory { 12: 99fbb1cc35 ! 11: bec1428116 odb/source-inmemory: implement `for_each_object()` callback @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_stream(struct + if ((opts->flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY) || + (opts->flags & ODB_FOR_EACH_OBJECT_LOCAL_ONLY && !source->local)) + return 0; ++ if (!inmemory->objects) ++ return 0; + + return oidtree_each(inmemory->objects, + opts->prefix ? opts->prefix : &null_oid, opts->prefix_hex_len, 13: c87a621f39 = 12: 32dada3c27 odb/source-inmemory: implement `find_abbrev_len()` callback 14: 9b88f0c07b = 13: 43127840c0 odb/source-inmemory: implement `count_objects()` callback 15: 3c9493f2bb = 14: 439acbd068 odb/source-inmemory: implement `freshen_object()` callback 16: f2b6317104 ! 
15: 12c1b6ffd2 odb/source-inmemory: stub out remaining functions @@ odb/source-inmemory.c: static int odb_source_inmemory_freshen_object(struct odb_ +static int odb_source_inmemory_begin_transaction(struct odb_source *source UNUSED, + struct odb_transaction **out UNUSED) +{ -+ return error("inmemory source does not support transactions"); ++ return error("in-memory source does not support transactions"); +} + +static int odb_source_inmemory_read_alternates(struct odb_source *source UNUSED, @@ odb/source-inmemory.c: static int odb_source_inmemory_freshen_object(struct odb_ +static int odb_source_inmemory_write_alternate(struct odb_source *source UNUSED, + const char *alternate UNUSED) +{ -+ return error("inmemory source does not support alternates"); ++ return error("in-memory source does not support alternates"); +} + +static void odb_source_inmemory_close(struct odb_source *source UNUSED) 17: 81da5d5048 = 16: ef37a61e7f odb: generic in-memory source -: ---------- > 17: 51b51e0382 t/unit-tests: add tests for the in-memory object source --- base-commit: a3ebc5a08e67ccac4c915622049a968a31e48662 change-id: 20260401-b4-pks-odb-source-inmemory-7b17c83d9e43 ^ permalink raw reply [flat|nested] 85+ messages in thread
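Editorial sketch, not part of the series: the range-diff entry for patch 5 fixes the read stream so that each read copies from `buf + offset` and then advances `offset`. Before that fix, every call copied from the start of the buffer again. The standalone version below shows the corrected shape with a hypothetical `struct mem_stream`. The real `struct odb_read_stream_inmemory` keeps the object size in its embedded base structure instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reader over a fixed memory buffer. */
struct mem_stream {
	const unsigned char *buf;
	size_t size;   /* total number of bytes in `buf` */
	size_t offset; /* number of bytes already handed out */
};

/*
 * Copy up to `out_len` bytes into `out` and return how many were copied.
 * A return value of 0 signals that the stream is exhausted.
 */
static size_t mem_stream_read(struct mem_stream *st, void *out, size_t out_len)
{
	size_t bytes = out_len;

	assert(st->offset <= st->size);

	/* Clamp the request to what is left in the buffer. */
	if (bytes > st->size - st->offset)
		bytes = st->size - st->offset;

	/* Copy from the current position, then advance it. */
	memcpy(out, st->buf + st->offset, bytes);
	st->offset += bytes;

	return bytes;
}
```

Switching the buffer type from `const void *` to `const unsigned char *`, as the range-diff does, is what makes the pointer arithmetic well defined in standard C.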
* [PATCH v3 01/17] odb: introduce "in-memory" source 2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt @ 2026-04-10 12:12 ` Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 02/17] odb/source-inmemory: implement `free()` callback Patrick Steinhardt ` (16 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Justin Tobler

Next to our typical object database sources, each object database also
has an implicit source of "cached" objects. These cached objects only
exist in memory and serve two use cases:

- They contain evergreen objects that we expect to always exist, like
  for example the empty tree.

- They can be used to store temporary objects that we don't want to
  persist to disk, which is used by git-blame(1) to create a fake
  worktree commit.

Overall, their use is somewhat restricted though. For example, we don't
provide the ability to use them as a temporary object database source
that allows the user to write objects, but discard them after Git exits.
So while these cached objects behave almost like a source, they aren't
used as one.

This is about to change over the following commits, where we will turn
cached objects into a new "in-memory" source. This will allow us to use
it exactly the same as any other source by providing the same common
interface as the "files" source.

For now, the in-memory source only hosts the cached objects and doesn't
provide any logic yet. This will change with subsequent commits, where
we move respective functionality into the source.

Signed-off-by: Patrick Steinhardt <ps@pks.im> --- Makefile | 1 + meson.build | 1 + odb.c | 21 +++++++++++++-------- odb.h | 4 ++-- odb/source-inmemory.c | 12 ++++++++++++ odb/source-inmemory.h | 35 +++++++++++++++++++++++++++++++++++ odb/source.h | 3 +++ 7 files changed, 67 insertions(+), 10 deletions(-) diff --git a/Makefile b/Makefile index 22a8993482..3cda12c455 100644 --- a/Makefile +++ b/Makefile @@ -1218,6 +1218,7 @@ LIB_OBJS += object.o LIB_OBJS += odb.o LIB_OBJS += odb/source.o LIB_OBJS += odb/source-files.o +LIB_OBJS += odb/source-inmemory.o LIB_OBJS += odb/streaming.o LIB_OBJS += odb/transaction.o LIB_OBJS += oid-array.o diff --git a/meson.build b/meson.build index 6dc23b3af2..ffa73ce7ce 100644 --- a/meson.build +++ b/meson.build @@ -404,6 +404,7 @@ libgit_sources = [ 'odb.c', 'odb/source.c', 'odb/source-files.c', + 'odb/source-inmemory.c', 'odb/streaming.c', 'odb/transaction.c', 'oid-array.c', diff --git a/odb.c b/odb.c index 40a5e9c4e0..60e1eead25 100644 --- a/odb.c +++ b/odb.c @@ -14,6 +14,7 @@ #include "object-file.h" #include "object-name.h" #include "odb.h" +#include "odb/source-inmemory.h" #include "packfile.h" #include "path.h" #include "promisor-remote.h" @@ -53,9 +54,9 @@ static const struct cached_object *find_cached_object(struct object_database *ob .type = OBJ_TREE, .buf = "", }; - const struct cached_object_entry *co = object_store->cached_objects; + const struct cached_object_entry *co = object_store->inmemory_objects->objects; - for (size_t i = 0; i < object_store->cached_object_nr; i++, co++) + for (size_t i = 0; i < object_store->inmemory_objects->objects_nr; i++, co++) if (oideq(&co->oid, oid)) return &co->value; @@ -792,9 +793,10 @@ int odb_pretend_object(struct object_database *odb, find_cached_object(odb, oid)) return 0; - ALLOC_GROW(odb->cached_objects, - odb->cached_object_nr + 1, odb->cached_object_alloc); - co = &odb->cached_objects[odb->cached_object_nr++]; + ALLOC_GROW(odb->inmemory_objects->objects, + 
odb->inmemory_objects->objects_nr + 1, + odb->inmemory_objects->objects_alloc); + co = &odb->inmemory_objects->objects[odb->inmemory_objects->objects_nr++]; co->value.size = len; co->value.type = type; co_buf = xmalloc(len); @@ -1083,6 +1085,7 @@ struct object_database *odb_new(struct repository *repo, o->sources = odb_source_new(o, primary_source, true); o->sources_tail = &o->sources->next; o->alternate_db = xstrdup_or_null(secondary_sources); + o->inmemory_objects = odb_source_inmemory_new(o); free(to_free); @@ -1123,9 +1126,11 @@ void odb_free(struct object_database *o) odb_close(o); odb_free_sources(o); - for (size_t i = 0; i < o->cached_object_nr; i++) - free((char *) o->cached_objects[i].value.buf); - free(o->cached_objects); + for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++) + free((char *) o->inmemory_objects->objects[i].value.buf); + free(o->inmemory_objects->objects); + free(o->inmemory_objects->base.path); + free(o->inmemory_objects); string_list_clear(&o->submodule_source_paths, 0); diff --git a/odb.h b/odb.h index 9eb8355aca..c3a7edf9c8 100644 --- a/odb.h +++ b/odb.h @@ -8,6 +8,7 @@ #include "thread-utils.h" struct cached_object_entry; +struct odb_source_inmemory; struct packed_git; struct repository; struct strbuf; @@ -80,8 +81,7 @@ struct object_database { * to write them into the object store (e.g. a browse-only * application). */ - struct cached_object_entry *cached_objects; - size_t cached_object_nr, cached_object_alloc; + struct odb_source_inmemory *inmemory_objects; /* * A fast, rough count of the number of objects in the repository. 
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c new file mode 100644 index 0000000000..c7ac5c24f0 --- /dev/null +++ b/odb/source-inmemory.c @@ -0,0 +1,12 @@ +#include "git-compat-util.h" +#include "odb/source-inmemory.h" + +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) +{ + struct odb_source_inmemory *source; + + CALLOC_ARRAY(source, 1); + odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false); + + return source; +} diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h new file mode 100644 index 0000000000..15db068ef7 --- /dev/null +++ b/odb/source-inmemory.h @@ -0,0 +1,35 @@ +#ifndef ODB_SOURCE_INMEMORY_H +#define ODB_SOURCE_INMEMORY_H + +#include "odb/source.h" + +struct cached_object_entry; + +/* + * An in-memory source that you can write objects to that shall be made + * available for reading, but that shouldn't ever be persisted to disk. Note + * that any objects written to this source will be stored in memory, so the + * number of objects you can store is limited by available system memory. + */ +struct odb_source_inmemory { + struct odb_source base; + + struct cached_object_entry *objects; + size_t objects_nr, objects_alloc; +}; + +/* Create a new in-memory object database source. */ +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb); + +/* + * Cast the given object database source to the in-memory backend. This will + * cause a BUG in case the source doesn't use this backend. 
+ */ +static inline struct odb_source_inmemory *odb_source_inmemory_downcast(struct odb_source *source) +{ + if (source->type != ODB_SOURCE_INMEMORY) + BUG("trying to downcast source of type '%d' to in-memory", source->type); + return container_of(source, struct odb_source_inmemory, base); +} + +#endif diff --git a/odb/source.h b/odb/source.h index f706e0608a..0a440884e4 100644 --- a/odb/source.h +++ b/odb/source.h @@ -13,6 +13,9 @@ enum odb_source_type { /* The "files" backend that uses loose objects and packfiles. */ ODB_SOURCE_FILES, + + /* The "in-memory" backend that stores objects in memory. */ + ODB_SOURCE_INMEMORY, }; struct object_id; -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
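The checked downcast in "odb/source-inmemory.h" can be illustrated outside of Git with a minimal sketch. Everything below is a hypothetical stand-in: `struct source`, `struct source_inmemory` and `source_inmemory_downcast()` only mimic the shape of the real types, and the sketch returns NULL on a type mismatch where the patch calls BUG().

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real Git types. */
enum source_type { SOURCE_FILES, SOURCE_INMEMORY };

struct source {
	enum source_type type;
};

struct source_inmemory {
	struct source base; /* embedded base structure */
	size_t objects_nr;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *) ((char *) (ptr) - offsetof(type, member)))

/*
 * Check the type tag before casting. Returns NULL on a mismatch; the
 * patch aborts via BUG() in that case instead.
 */
static struct source_inmemory *source_inmemory_downcast(struct source *source)
{
	if (source->type != SOURCE_INMEMORY)
		return NULL;
	return container_of(source, struct source_inmemory, base);
}
```

Checking the tag first is what keeps a generic `struct source *` from being reinterpreted as the wrong backend.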
* [PATCH v3 02/17] odb/source-inmemory: implement `free()` callback 2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 01/17] " Patrick Steinhardt @ 2026-04-10 12:12 ` Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 03/17] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt ` (15 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `free()` callback function for the "in-memory" source. Note that this requires us to define `struct cached_object_entry` in "odb/source-inmemory.h", as it is accessed in both "odb.c" and "odb/source-inmemory.c" now. This will be fixed in subsequent commits though. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb.c | 25 ++++--------------------- odb/source-inmemory.c | 12 ++++++++++++ odb/source-inmemory.h | 9 ++++++++- 3 files changed, 24 insertions(+), 22 deletions(-) diff --git a/odb.c b/odb.c index 60e1eead25..1d65825ed3 100644 --- a/odb.c +++ b/odb.c @@ -32,21 +32,6 @@ KHASH_INIT(odb_path_map, const char * /* key: odb_path */, struct odb_source *, 1, fspathhash, fspatheq) -/* - * This is meant to hold a *small* number of objects that you would - * want odb_read_object() to be able to return, but yet you do not want - * to write them into the object store (e.g. a browse-only - * application). 
- */ -struct cached_object_entry { - struct object_id oid; - struct cached_object { - enum object_type type; - const void *buf; - unsigned long size; - } value; -}; - static const struct cached_object *find_cached_object(struct object_database *object_store, const struct object_id *oid) { @@ -1109,6 +1094,10 @@ static void odb_free_sources(struct object_database *o) odb_source_free(o->sources); o->sources = next; } + + odb_source_free(&o->inmemory_objects->base); + o->inmemory_objects = NULL; + kh_destroy_odb_path_map(o->source_by_path); o->source_by_path = NULL; } @@ -1126,12 +1115,6 @@ void odb_free(struct object_database *o) odb_close(o); odb_free_sources(o); - for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++) - free((char *) o->inmemory_objects->objects[i].value.buf); - free(o->inmemory_objects->objects); - free(o->inmemory_objects->base.path); - free(o->inmemory_objects); - string_list_clear(&o->submodule_source_paths, 0); free(o); diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index c7ac5c24f0..ccbb622eae 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -1,6 +1,16 @@ #include "git-compat-util.h" #include "odb/source-inmemory.h" +static void odb_source_inmemory_free(struct odb_source *source) +{ + struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); + for (size_t i = 0; i < inmemory->objects_nr; i++) + free((char *) inmemory->objects[i].value.buf); + free(inmemory->objects); + free(inmemory->base.path); + free(inmemory); +} + struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) { struct odb_source_inmemory *source; @@ -8,5 +18,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) CALLOC_ARRAY(source, 1); odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false); + source->base.free = odb_source_inmemory_free; + return source; } diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h index 15db068ef7..d1b05a3996 100644 
--- a/odb/source-inmemory.h +++ b/odb/source-inmemory.h @@ -3,7 +3,14 @@ #include "odb/source.h" -struct cached_object_entry; +struct cached_object_entry { + struct object_id oid; + struct cached_object { + enum object_type type; + const void *buf; + unsigned long size; + } value; +}; /* * An in-memory source that you can write objects to that shall be made -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH v3 03/17] odb: fix unnecessary call to `find_cached_object()` 2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 01/17] " Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 02/17] odb/source-inmemory: implement `free()` callback Patrick Steinhardt @ 2026-04-10 12:12 ` Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 04/17] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt ` (14 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler The function `odb_pretend_object()` writes an object into the in-memory object database source. The effect of this is that the object will now become readable, but it won't ever be persisted to disk. Before storing the object, we first verify whether the object already exists. This is done by calling `odb_has_object()` to check all sources, followed by `find_cached_object()` to check whether we have already stored the object in our in-memory source. This is unnecessary though, as `odb_has_object()` already checks the in-memory source transitively via: - `odb_has_object()` - `odb_read_object_info_extended()` - `do_oid_object_info_extended()` - `find_cached_object()` Drop the explicit call to `find_cached_object()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/odb.c b/odb.c index 1d65825ed3..ea3fcf5e11 100644 --- a/odb.c +++ b/odb.c @@ -774,8 +774,7 @@ int odb_pretend_object(struct object_database *odb, char *co_buf; hash_object_file(odb->repo->hash_algo, buf, len, type, oid); - if (odb_has_object(odb, oid, 0) || - find_cached_object(odb, oid)) + if (odb_has_object(odb, oid, 0)) return 0; ALLOC_GROW(odb->inmemory_objects->objects, -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH v3 04/17] odb/source-inmemory: implement `read_object_info()` callback 2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt ` (2 preceding siblings ...) 2026-04-10 12:12 ` [PATCH v3 03/17] odb: fix unnecessary call to `find_cached_object()` Patrick Steinhardt @ 2026-04-10 12:12 ` Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 05/17] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt ` (13 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `read_object_info()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb.c | 39 +------------------------------------ odb/source-inmemory.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 54 insertions(+), 38 deletions(-) diff --git a/odb.c b/odb.c index ea3fcf5e11..6a3912adac 100644 --- a/odb.c +++ b/odb.c @@ -32,25 +32,6 @@ KHASH_INIT(odb_path_map, const char * /* key: odb_path */, struct odb_source *, 1, fspathhash, fspatheq) -static const struct cached_object *find_cached_object(struct object_database *object_store, - const struct object_id *oid) -{ - static const struct cached_object empty_tree = { - .type = OBJ_TREE, - .buf = "", - }; - const struct cached_object_entry *co = object_store->inmemory_objects->objects; - - for (size_t i = 0; i < object_store->inmemory_objects->objects_nr; i++, co++) - if (oideq(&co->oid, oid)) - return &co->value; - - if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree)) - return &empty_tree; - - return NULL; -} - int odb_mkstemp(struct object_database *odb, struct strbuf *temp_filename, const char *pattern) { @@ -570,7 +551,6 @@ static int do_oid_object_info_extended(struct object_database *odb, const struct object_id *oid, struct object_info *oi, unsigned flags) { - const struct cached_object *co; const struct object_id *real = oid; int 
already_retried = 0; @@ -580,25 +560,8 @@ static int do_oid_object_info_extended(struct object_database *odb, if (is_null_oid(real)) return -1; - co = find_cached_object(odb, real); - if (co) { - if (oi) { - if (oi->typep) - *(oi->typep) = co->type; - if (oi->sizep) - *(oi->sizep) = co->size; - if (oi->disk_sizep) - *(oi->disk_sizep) = 0; - if (oi->delta_base_oid) - oidclr(oi->delta_base_oid, odb->repo->hash_algo); - if (oi->contentp) - *oi->contentp = xmemdupz(co->buf, co->size); - if (oi->mtimep) - *oi->mtimep = 0; - oi->whence = OI_CACHED; - } + if (!odb_source_read_object_info(&odb->inmemory_objects->base, oid, oi, flags)) return 0; - } odb_prepare_alternates(odb); diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index ccbb622eae..12c80f9b34 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -1,5 +1,57 @@ #include "git-compat-util.h" +#include "odb.h" #include "odb/source-inmemory.h" +#include "repository.h" + +static const struct cached_object *find_cached_object(struct odb_source_inmemory *source, + const struct object_id *oid) +{ + static const struct cached_object empty_tree = { + .type = OBJ_TREE, + .buf = "", + }; + const struct cached_object_entry *co = source->objects; + + for (size_t i = 0; i < source->objects_nr; i++, co++) + if (oideq(&co->oid, oid)) + return &co->value; + + if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree)) + return &empty_tree; + + return NULL; +} + +static int odb_source_inmemory_read_object_info(struct odb_source *source, + const struct object_id *oid, + struct object_info *oi, + enum object_info_flags flags UNUSED) +{ + struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); + const struct cached_object *object; + + object = find_cached_object(inmemory, oid); + if (!object) + return -1; + + if (oi) { + if (oi->typep) + *(oi->typep) = object->type; + if (oi->sizep) + *(oi->sizep) = object->size; + if (oi->disk_sizep) + *(oi->disk_sizep) = 0; + if (oi->delta_base_oid) + 
oidclr(oi->delta_base_oid, source->odb->repo->hash_algo); + if (oi->contentp) + *oi->contentp = xmemdupz(object->buf, object->size); + if (oi->mtimep) + *oi->mtimep = 0; + oi->whence = OI_CACHED; + } + + return 0; +} static void odb_source_inmemory_free(struct odb_source *source) { @@ -19,6 +71,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false); source->base.free = odb_source_inmemory_free; + source->base.read_object_info = odb_source_inmemory_read_object_info; return source; } -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
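The lookup-with-fallback and optional out-pointer pattern used by the `read_object_info()` callback can be sketched in isolation. This is not Git code: `struct cached`, `struct lookup_info`, `find_cached()` and `read_info()` are hypothetical names, and objects are keyed by a plain string instead of an object ID.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory entry, keyed by name rather than object ID. */
struct cached {
	const char *name;
	const char *buf;
	size_t size;
};

/* Each member is optional: only non-NULL pointers get filled in. */
struct lookup_info {
	size_t *sizep;
	const char **bufp;
};

/* Search the caller's entries first, then fall back to a built-in one. */
static const struct cached *find_cached(const struct cached *objs, size_t nr,
					const char *name)
{
	static const struct cached builtin = { "empty", "", 0 };

	for (size_t i = 0; i < nr; i++)
		if (!strcmp(objs[i].name, name))
			return &objs[i];
	if (!strcmp(name, builtin.name))
		return &builtin;
	return NULL;
}

/* Returns 0 when found and -1 otherwise; `info` itself may be NULL. */
static int read_info(const struct cached *objs, size_t nr, const char *name,
		     struct lookup_info *info)
{
	const struct cached *obj = find_cached(objs, nr, name);

	if (!obj)
		return -1;
	if (info) {
		if (info->sizep)
			*info->sizep = obj->size;
		if (info->bufp)
			*info->bufp = obj->buf;
	}
	return 0;
}
```

Passing a NULL `info` turns the call into a pure existence check.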
* [PATCH v3 05/17] odb/source-inmemory: implement `read_object_stream()` callback 2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt ` (3 preceding siblings ...) 2026-04-10 12:12 ` [PATCH v3 04/17] odb/source-inmemory: implement `read_object_info()` callback Patrick Steinhardt @ 2026-04-10 12:12 ` Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 06/17] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt ` (12 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `read_object_stream()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index 12c80f9b34..39f0e799c7 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -1,6 +1,7 @@ #include "git-compat-util.h" #include "odb.h" #include "odb/source-inmemory.h" +#include "odb/streaming.h" #include "repository.h" static const struct cached_object *find_cached_object(struct odb_source_inmemory *source, @@ -53,6 +54,56 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source, return 0; } +struct odb_read_stream_inmemory { + struct odb_read_stream base; + const unsigned char *buf; + size_t offset; +}; + +static ssize_t odb_read_stream_inmemory_read(struct odb_read_stream *stream, + char *buf, size_t buf_len) +{ + struct odb_read_stream_inmemory *inmemory = + container_of(stream, struct odb_read_stream_inmemory, base); + size_t bytes = buf_len; + + if (buf_len > inmemory->base.size - inmemory->offset) + bytes = inmemory->base.size - inmemory->offset; + + memcpy(buf, inmemory->buf + inmemory->offset, bytes); + inmemory->offset += bytes; + + return bytes; +} + +static int odb_read_stream_inmemory_close(struct odb_read_stream 
*stream UNUSED) +{ + return 0; +} + +static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out, + struct odb_source *source, + const struct object_id *oid) +{ + struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); + struct odb_read_stream_inmemory *stream; + const struct cached_object *object; + + object = find_cached_object(inmemory, oid); + if (!object) + return -1; + + CALLOC_ARRAY(stream, 1); + stream->base.read = odb_read_stream_inmemory_read; + stream->base.close = odb_read_stream_inmemory_close; + stream->base.size = object->size; + stream->base.type = object->type; + stream->buf = object->buf; + + *out = &stream->base; + return 0; +} + static void odb_source_inmemory_free(struct odb_source *source) { struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); @@ -72,6 +123,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.free = odb_source_inmemory_free; source->base.read_object_info = odb_source_inmemory_read_object_info; + source->base.read_object_stream = odb_source_inmemory_read_object_stream; return source; } -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
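The clamping done by the read callback, which never copies more than what remains in the buffer, can be shown with a self-contained sketch. `struct mem_stream` and `mem_stream_read()` are hypothetical simplifications, not the `odb_read_stream` interface.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stream over a borrowed, caller-owned buffer. */
struct mem_stream {
	const unsigned char *buf;
	size_t size;   /* total number of bytes in buf */
	size_t offset; /* number of bytes already handed out */
};

/* Copy at most out_len bytes; returns 0 once the buffer is exhausted. */
static size_t mem_stream_read(struct mem_stream *s, void *out, size_t out_len)
{
	size_t remaining = s->size - s->offset;
	size_t bytes = out_len < remaining ? out_len : remaining;

	memcpy(out, s->buf + s->offset, bytes);
	s->offset += bytes;
	return bytes;
}
```

Because the stream only borrows the buffer, closing it has nothing to release, which matches the no-op close callback in the patch.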
* [PATCH v3 06/17] odb/source-inmemory: implement `write_object()` callback 2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt ` (4 preceding siblings ...) 2026-04-10 12:12 ` [PATCH v3 05/17] odb/source-inmemory: implement `read_object_stream()` callback Patrick Steinhardt @ 2026-04-10 12:12 ` Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 07/17] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt ` (11 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `write_object()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb.c | 16 ++-------------- odb/source-inmemory.c | 25 +++++++++++++++++++++++++ 2 files changed, 27 insertions(+), 14 deletions(-) diff --git a/odb.c b/odb.c index 6a3912adac..24e929f03c 100644 --- a/odb.c +++ b/odb.c @@ -733,24 +733,12 @@ int odb_pretend_object(struct object_database *odb, void *buf, unsigned long len, enum object_type type, struct object_id *oid) { - struct cached_object_entry *co; - char *co_buf; - hash_object_file(odb->repo->hash_algo, buf, len, type, oid); if (odb_has_object(odb, oid, 0)) return 0; - ALLOC_GROW(odb->inmemory_objects->objects, - odb->inmemory_objects->objects_nr + 1, - odb->inmemory_objects->objects_alloc); - co = &odb->inmemory_objects->objects[odb->inmemory_objects->objects_nr++]; - co->value.size = len; - co->value.type = type; - co_buf = xmalloc(len); - memcpy(co_buf, buf, len); - co->value.buf = co_buf; - oidcpy(&co->oid, oid); - return 0; + return odb_source_write_object(&odb->inmemory_objects->base, + buf, len, type, oid, NULL, 0); } void *odb_read_object(struct object_database *odb, diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index 39f0e799c7..4848011df5 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -1,4 +1,5 @@ #include "git-compat-util.h" +#include 
"object-file.h" #include "odb.h" #include "odb/source-inmemory.h" #include "odb/streaming.h" @@ -104,6 +105,29 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out, return 0; } +static int odb_source_inmemory_write_object(struct odb_source *source, + const void *buf, unsigned long len, + enum object_type type, + struct object_id *oid, + struct object_id *compat_oid UNUSED, + enum odb_write_object_flags flags UNUSED) +{ + struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); + struct cached_object_entry *object; + + hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid); + + ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1, + inmemory->objects_alloc); + object = &inmemory->objects[inmemory->objects_nr++]; + object->value.size = len; + object->value.type = type; + object->value.buf = xmemdupz(buf, len); + oidcpy(&object->oid, oid); + + return 0; +} + static void odb_source_inmemory_free(struct odb_source *source) { struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); @@ -124,6 +148,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.free = odb_source_inmemory_free; source->base.read_object_info = odb_source_inmemory_read_object_info; source->base.read_object_stream = odb_source_inmemory_read_object_stream; + source->base.write_object = odb_source_inmemory_write_object; return source; } -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH v3 07/17] odb/source-inmemory: implement `write_object_stream()` callback 2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt ` (5 preceding siblings ...) 2026-04-10 12:12 ` [PATCH v3 06/17] odb/source-inmemory: implement `write_object()` callback Patrick Steinhardt @ 2026-04-10 12:12 ` Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 08/17] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt ` (10 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `write_object_stream()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index 4848011df5..d05a13df45 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -128,6 +128,45 @@ static int odb_source_inmemory_write_object(struct odb_source *source, return 0; } +static int odb_source_inmemory_write_object_stream(struct odb_source *source, + struct odb_write_stream *stream, + size_t len, + struct object_id *oid) +{ + char buf[16384]; + size_t total_read = 0; + char *data; + int ret; + + CALLOC_ARRAY(data, len); + while (!stream->is_finished) { + ssize_t bytes_read; + + bytes_read = odb_write_stream_read(stream, buf, sizeof(buf)); + if (total_read + bytes_read > len) { + ret = error("object stream yielded more bytes than expected"); + goto out; + } + + memcpy(data + total_read, buf, bytes_read); + total_read += bytes_read; + } + + if (total_read != len) { + ret = error("object stream yielded less bytes than expected"); + goto out; + } + + ret = odb_source_inmemory_write_object(source, data, len, OBJ_BLOB, oid, + NULL, 0); + if (ret < 0) + goto out; + +out: + free(data); + return ret; +} + static void odb_source_inmemory_free(struct odb_source 
*source) { struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); @@ -149,6 +188,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.read_object_info = odb_source_inmemory_read_object_info; source->base.read_object_stream = odb_source_inmemory_read_object_stream; source->base.write_object = odb_source_inmemory_write_object; + source->base.write_object_stream = odb_source_inmemory_write_object_stream; return source; } -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
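The accumulate-and-verify loop of the stream writer can be sketched independently of Git. The names `chunk_read_fn`, `read_exact()` and `str_reader_read()` are hypothetical, and end of input is signalled by a zero-length read rather than an `is_finished` flag. The sketch also treats a negative read result as an error; that is an assumption about how a chunk reader might report failure, not a statement about `odb_write_stream_read()`.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical chunk reader: returns bytes read, 0 at end, < 0 on error. */
typedef ssize_t (*chunk_read_fn)(void *ctx, char *buf, size_t len);

/*
 * Read exactly `len` bytes through `read_chunk`. Returns a malloc'd
 * buffer on success, or NULL if the reader fails or yields more or
 * fewer bytes than announced.
 */
static char *read_exact(chunk_read_fn read_chunk, void *ctx, size_t len)
{
	char chunk[8];
	size_t total = 0;
	char *data = malloc(len ? len : 1);

	if (!data)
		return NULL;

	for (;;) {
		ssize_t n = read_chunk(ctx, chunk, sizeof(chunk));

		if (n < 0)
			goto fail; /* reader reported an error */
		if (n == 0)
			break; /* end of input */
		if ((size_t) n > len - total)
			goto fail; /* more bytes than announced */
		memcpy(data + total, chunk, (size_t) n);
		total += (size_t) n;
	}

	if (total != len)
		goto fail; /* fewer bytes than announced */
	return data;

fail:
	free(data);
	return NULL;
}

/* Hypothetical reader that serves a NUL-terminated string in chunks. */
struct str_reader {
	const char *s;
	size_t pos;
};

static ssize_t str_reader_read(void *ctx, char *buf, size_t len)
{
	struct str_reader *r = ctx;
	size_t left = strlen(r->s) - r->pos;
	size_t n = len < left ? len : left;

	memcpy(buf, r->s + r->pos, n);
	r->pos += n;
	return (ssize_t) n;
}
```

Comparing against `len - total` instead of computing `total + n` keeps the bounds check free of unsigned overflow.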
* [PATCH v3 08/17] cbtree: allow using arbitrary wrapper structures for nodes
2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
` (6 preceding siblings ...)
2026-04-10 12:12 ` [PATCH v3 07/17] odb/source-inmemory: implement `write_object_stream()` callback Patrick Steinhardt
@ 2026-04-10 12:12 ` Patrick Steinhardt
2026-04-10 12:12 ` [PATCH v3 09/17] oidtree: add ability to store data Patrick Steinhardt
` (9 subsequent siblings)
17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Justin Tobler
The cbtree subsystem allows the user to store arbitrary data in a
prefix-free set of strings. This is used by us to store object IDs in a
way that we can easily iterate through them in lexicographic order, and
so that we can easily perform lookups with shortened object IDs.
In its current form, it is not easily possible to store arbitrary data
with the tree nodes. There are a couple of approaches such a caller
could try to use, but none of them really work:
- One may embed the `struct cb_node` in a custom structure. This does
not work though as `struct cb_node` contains a flex array, and
embedding such a struct in another struct is forbidden.
- One may use a `union` over `struct cb_node` and one's own data type,
which _is_ allowed even if the struct contains a flex array. This
does not work though, as the compiler may align members of the
struct so that the node key would not immediately start where the
flex array starts.
- One may allocate `struct cb_node` such that it has room for both its
key and the custom data. This has the downside though that if the
custom data is itself a pointer to allocated memory, then the leak
checker will not consider the pointer to be alive anymore.
Refactor the cbtree to drop the flex array and instead take in an
explicit offset for where to find the key, which allows the caller to
embed `struct cb_node` in a wrapper struct.
Note that this change has the downside that we now have a bit of padding in our structure, which grows the size from 60 to 64 bytes on a 64 bit system. On the other hand though, it allows us to get rid of the memory copies that we previously had to do to ensure proper alignment. This seems like a reasonable tradeoff. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- cbtree.c | 25 ++++++++++++++++++------- cbtree.h | 17 +++++++++-------- oidtree.c | 33 ++++++++++++++------------------- 3 files changed, 41 insertions(+), 34 deletions(-) diff --git a/cbtree.c b/cbtree.c index 4ab794bddc..8f5edbb80a 100644 --- a/cbtree.c +++ b/cbtree.c @@ -7,6 +7,11 @@ #include "git-compat-util.h" #include "cbtree.h" +static inline uint8_t *cb_node_key(struct cb_tree *t, struct cb_node *node) +{ + return (uint8_t *) node + t->key_offset; +} + static struct cb_node *cb_node_of(const void *p) { return (struct cb_node *)((uintptr_t)p - 1); @@ -33,6 +38,7 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen) uint8_t c; int newdirection; struct cb_node **wherep, *p; + uint8_t *node_key, *p_key; assert(!((uintptr_t)node & 1)); /* allocations must be aligned */ @@ -41,23 +47,26 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen) return NULL; /* success */ } + node_key = cb_node_key(t, node); + /* see if a node already exists */ - p = cb_internal_best_match(t->root, node->k, klen); + p = cb_internal_best_match(t->root, node_key, klen); + p_key = cb_node_key(t, p); /* find first differing byte */ for (newbyte = 0; newbyte < klen; newbyte++) { - if (p->k[newbyte] != node->k[newbyte]) + if (p_key[newbyte] != node_key[newbyte]) goto different_byte_found; } return p; /* element exists, let user deal with it */ different_byte_found: - newotherbits = p->k[newbyte] ^ node->k[newbyte]; + newotherbits = p_key[newbyte] ^ node_key[newbyte]; newotherbits |= newotherbits >> 1; newotherbits |= newotherbits >> 2; newotherbits |= newotherbits >> 4; 
newotherbits = (newotherbits & ~(newotherbits >> 1)) ^ 255; - c = p->k[newbyte]; + c = p_key[newbyte]; newdirection = (1 + (newotherbits | c)) >> 8; node->byte = newbyte; @@ -78,7 +87,7 @@ struct cb_node *cb_insert(struct cb_tree *t, struct cb_node *node, size_t klen) break; if (q->byte == newbyte && q->otherbits > newotherbits) break; - c = q->byte < klen ? node->k[q->byte] : 0; + c = q->byte < klen ? node_key[q->byte] : 0; direction = (1 + (q->otherbits | c)) >> 8; wherep = q->child + direction; } @@ -93,7 +102,7 @@ struct cb_node *cb_lookup(struct cb_tree *t, const uint8_t *k, size_t klen) { struct cb_node *p = cb_internal_best_match(t->root, k, klen); - return p && !memcmp(p->k, k, klen) ? p : NULL; + return p && !memcmp(cb_node_key(t, p), k, klen) ? p : NULL; } static int cb_descend(struct cb_node *p, cb_iter fn, void *arg) @@ -115,6 +124,7 @@ int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen, struct cb_node *p = t->root; struct cb_node *top = p; size_t i = 0; + uint8_t *p_key; if (!p) return 0; /* empty tree */ @@ -130,8 +140,9 @@ int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen, top = p; } + p_key = cb_node_key(t, p); for (i = 0; i < klen; i++) { - if (p->k[i] != kpfx[i]) + if (p_key[i] != kpfx[i]) return 0; /* "best" match failed */ } diff --git a/cbtree.h b/cbtree.h index c374b1b3db..4647d4a32f 100644 --- a/cbtree.h +++ b/cbtree.h @@ -6,9 +6,9 @@ * * This is adapted to store arbitrary data (not just NUL-terminated C strings * and allocates no memory internally. The user needs to allocate - * "struct cb_node" and fill cb_node.k[] with arbitrary match data - * for memcmp. - * If "klen" is variable, then it should be embedded into "c_node.k[]" + * "struct cb_node" and provide `key_offset` to indicate where the key can be + * found relative to the `struct cb_node` for memcmp. + * If "klen" is variable, then it should be embedded into the key. * Recursion is bound by the maximum value of "klen" used. 
*/ #ifndef CBTREE_H @@ -23,18 +23,19 @@ struct cb_node { */ uint32_t byte; uint8_t otherbits; - uint8_t k[FLEX_ARRAY]; /* arbitrary data, unaligned */ }; struct cb_tree { struct cb_node *root; + ptrdiff_t key_offset; }; -#define CBTREE_INIT { 0 } - -static inline void cb_init(struct cb_tree *t) +static inline void cb_init(struct cb_tree *t, + ptrdiff_t key_offset) { - struct cb_tree blank = CBTREE_INIT; + struct cb_tree blank = { + .key_offset = key_offset, + }; memcpy(t, &blank, sizeof(*t)); } diff --git a/oidtree.c b/oidtree.c index ab9fe7ec7a..117649753f 100644 --- a/oidtree.c +++ b/oidtree.c @@ -6,9 +6,14 @@ #include "oidtree.h" #include "hash.h" +struct oidtree_node { + struct cb_node base; + struct object_id key; +}; + void oidtree_init(struct oidtree *ot) { - cb_init(&ot->tree); + cb_init(&ot->tree, offsetof(struct oidtree_node, key)); mem_pool_init(&ot->mem_pool, 0); } @@ -22,20 +27,13 @@ void oidtree_clear(struct oidtree *ot) void oidtree_insert(struct oidtree *ot, const struct object_id *oid) { - struct cb_node *on; - struct object_id k; + struct oidtree_node *on; if (!oid->algo) BUG("oidtree_insert requires oid->algo"); - on = mem_pool_alloc(&ot->mem_pool, sizeof(*on) + sizeof(*oid)); - - /* - * Clear the padding and copy the result in separate steps to - * respect the 4-byte alignment needed by struct object_id. - */ - oidcpy(&k, oid); - memcpy(on->k, &k, sizeof(k)); + on = mem_pool_alloc(&ot->mem_pool, sizeof(*on)); + oidcpy(&on->key, oid); /* * n.b. Current callers won't get us duplicates, here. If a @@ -43,7 +41,7 @@ void oidtree_insert(struct oidtree *ot, const struct object_id *oid) * that won't be freed until oidtree_clear. 
Currently it's not * worth maintaining a free list */ - cb_insert(&ot->tree, on, sizeof(*oid)); + cb_insert(&ot->tree, &on->base, sizeof(*oid)); } bool oidtree_contains(struct oidtree *ot, const struct object_id *oid) @@ -73,21 +71,18 @@ struct oidtree_each_data { static int iter(struct cb_node *n, void *cb_data) { + struct oidtree_node *node = container_of(n, struct oidtree_node, base); struct oidtree_each_data *data = cb_data; - struct object_id k; - - /* Copy to provide 4-byte alignment needed by struct object_id. */ - memcpy(&k, n->k, sizeof(k)); - if (data->algo != GIT_HASH_UNKNOWN && data->algo != k.algo) + if (data->algo != GIT_HASH_UNKNOWN && data->algo != node->key.algo) return 0; if (data->last_nibble_at) { - if ((k.hash[*data->last_nibble_at] ^ data->last_byte) & 0xf0) + if ((node->key.hash[*data->last_nibble_at] ^ data->last_byte) & 0xf0) return 0; } - return data->cb(&k, data->cb_data); + return data->cb(&node->key, data->cb_data); } int oidtree_each(struct oidtree *ot, const struct object_id *prefix, -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
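The key-offset approach from the cbtree patch can be reduced to a small sketch. The types below (`struct node`, `struct tree`, `struct id_node`) are hypothetical simplifications of `struct cb_node`, `struct cb_tree` and `struct oidtree_node`, and the sketch assumes, like the patch, that the base node is the first member of the wrapper so that offsets relative to the wrapper and to the node coincide.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified node without a flex array, so it can be embedded. */
struct node {
	uint32_t byte;
	uint8_t otherbits;
};

/* The tree remembers where the key lives relative to each node. */
struct tree {
	struct node *root;
	ptrdiff_t key_offset;
};

/* Locate a node's key via the tree-wide offset. */
static inline uint8_t *node_key(const struct tree *t, struct node *n)
{
	return (uint8_t *) n + t->key_offset;
}

/* Hypothetical wrapper: base node first, then key and payload. */
struct id_node {
	struct node base;
	unsigned char key[20];
	void *data;
};

/* Recover the wrapper from a pointer to its embedded base node. */
#define container_of(ptr, type, member) \
	((type *) ((char *) (ptr) - offsetof(type, member)))
```

A tree set up with `offsetof(struct id_node, key)` as its `key_offset` then finds the key of any embedded node without the node type having to know about the wrapper.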
* [PATCH v3 09/17] oidtree: add ability to store data
2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
` (7 preceding siblings ...)
2026-04-10 12:12 ` [PATCH v3 08/17] cbtree: allow using arbitrary wrapper structures for nodes Patrick Steinhardt
@ 2026-04-10 12:12 ` Patrick Steinhardt
2026-04-10 12:12 ` [PATCH v3 10/17] odb/source-inmemory: convert to use oidtree Patrick Steinhardt
` (8 subsequent siblings)
17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Justin Tobler

The oidtree data structure is currently only used to store object IDs,
without any associated data. Consequently, it can only be used to track
which object IDs exist, and we can use the tree structure to efficiently
operate on OID prefixes.

But there are valid use cases where we want to both:

  - Store object IDs in a sorted order.

  - Associate arbitrary data with them.

Refactor the oidtree interface so that it allows us to store arbitrary
payloads within the respective nodes. This will be used in the next
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im> --- loose.c | 2 +- object-file.c | 3 ++- oidtree.c | 37 ++++++++++++++++++++++++++++++++----- oidtree.h | 12 ++++++++++-- t/unit-tests/u-oidtree.c | 26 +++++++++++++++++++++++--- 5 files changed, 68 insertions(+), 12 deletions(-) diff --git a/loose.c b/loose.c index 07333be696..f7a3dd1a72 100644 --- a/loose.c +++ b/loose.c @@ -57,7 +57,7 @@ static int insert_loose_map(struct odb_source *source, inserted |= insert_oid_pair(map->to_compat, oid, compat_oid); inserted |= insert_oid_pair(map->to_storage, compat_oid, oid); if (inserted) - oidtree_insert(files->loose->cache, compat_oid); + oidtree_insert(files->loose->cache, compat_oid, NULL); return inserted; } diff --git a/object-file.c b/object-file.c index 3e70e5d668..d04ab57253 100644 --- a/object-file.c +++ b/object-file.c @@ -1857,6 +1857,7 @@ static int for_each_object_wrapper_cb(const struct object_id *oid, } static int for_each_prefixed_object_wrapper_cb(const struct object_id *oid, + void *node_data UNUSED, void *cb_data) { struct for_each_object_wrapper_data *data = cb_data; @@ -2002,7 +2003,7 @@ static int append_loose_object(const struct object_id *oid, const char *path UNUSED, void *data) { - oidtree_insert(data, oid); + oidtree_insert(data, oid, NULL); return 0; } diff --git a/oidtree.c b/oidtree.c index 117649753f..e43f18026e 100644 --- a/oidtree.c +++ b/oidtree.c @@ -9,6 +9,7 @@ struct oidtree_node { struct cb_node base; struct object_id key; + void *data; }; void oidtree_init(struct oidtree *ot) @@ -25,15 +26,22 @@ void oidtree_clear(struct oidtree *ot) } } -void oidtree_insert(struct oidtree *ot, const struct object_id *oid) +struct oidtree_data { + struct object_id oid; +}; + +void oidtree_insert(struct oidtree *ot, const struct object_id *oid, + void *data) { struct oidtree_node *on; + struct cb_node *node; if (!oid->algo) BUG("oidtree_insert requires oid->algo"); on = mem_pool_alloc(&ot->mem_pool, sizeof(*on)); oidcpy(&on->key, oid); + on->data = data; 
/* * n.b. Current callers won't get us duplicates, here. If a @@ -41,13 +49,19 @@ void oidtree_insert(struct oidtree *ot, const struct object_id *oid) * that won't be freed until oidtree_clear. Currently it's not * worth maintaining a free list */ - cb_insert(&ot->tree, &on->base, sizeof(*oid)); + node = cb_insert(&ot->tree, &on->base, sizeof(*oid)); + if (node) { + struct oidtree_node *preexisting = container_of(node, struct oidtree_node, base); + preexisting->data = data; + } } -bool oidtree_contains(struct oidtree *ot, const struct object_id *oid) +static struct oidtree_node *oidtree_lookup(struct oidtree *ot, + const struct object_id *oid) { struct object_id k; size_t klen = sizeof(k); + struct cb_node *node; oidcpy(&k, oid); @@ -58,7 +72,20 @@ bool oidtree_contains(struct oidtree *ot, const struct object_id *oid) klen += BUILD_ASSERT_OR_ZERO(offsetof(struct object_id, hash) < offsetof(struct object_id, algo)); - return !!cb_lookup(&ot->tree, (const uint8_t *)&k, klen); + node = cb_lookup(&ot->tree, (const uint8_t *)&k, klen); + return node ? container_of(node, struct oidtree_node, base) : NULL; +} + +bool oidtree_contains(struct oidtree *ot, const struct object_id *oid) +{ + struct oidtree_node *node = oidtree_lookup(ot, oid); + return node ? 1 : 0; +} + +void *oidtree_get(struct oidtree *ot, const struct object_id *oid) +{ + struct oidtree_node *node = oidtree_lookup(ot, oid); + return node ? node->data : NULL; } struct oidtree_each_data { @@ -82,7 +109,7 @@ static int iter(struct cb_node *n, void *cb_data) return 0; } - return data->cb(&node->key, data->cb_data); + return data->cb(&node->key, node->data, data->cb_data); } int oidtree_each(struct oidtree *ot, const struct object_id *prefix, diff --git a/oidtree.h b/oidtree.h index 2b7bad2e60..baa5a436ea 100644 --- a/oidtree.h +++ b/oidtree.h @@ -29,18 +29,26 @@ void oidtree_init(struct oidtree *ot); */ void oidtree_clear(struct oidtree *ot); -/* Insert the object ID into the tree. 
*/ -void oidtree_insert(struct oidtree *ot, const struct object_id *oid); +/* + * Insert the object ID into the tree and store the given pointer alongside + * with it. The data pointer of any preexisting entry will be overwritten. + */ +void oidtree_insert(struct oidtree *ot, const struct object_id *oid, + void *data); /* Check whether the tree contains the given object ID. */ bool oidtree_contains(struct oidtree *ot, const struct object_id *oid); +/* Get the payload stored with the given object ID. */ +void *oidtree_get(struct oidtree *ot, const struct object_id *oid); + /* * Callback function used for `oidtree_each()`. Returning a non-zero exit code * will cause iteration to stop. The exit code will be propagated to the caller * of `oidtree_each()`. */ typedef int (*oidtree_each_cb)(const struct object_id *oid, + void *node_data, void *cb_data); /* diff --git a/t/unit-tests/u-oidtree.c b/t/unit-tests/u-oidtree.c index d4d05c7dc3..f0d5ebb733 100644 --- a/t/unit-tests/u-oidtree.c +++ b/t/unit-tests/u-oidtree.c @@ -19,7 +19,7 @@ static int fill_tree_loc(struct oidtree *ot, const char *hexes[], size_t n) for (size_t i = 0; i < n; i++) { struct object_id oid; cl_parse_any_oid(hexes[i], &oid); - oidtree_insert(ot, &oid); + oidtree_insert(ot, &oid, NULL); } return 0; } @@ -38,9 +38,9 @@ struct expected_hex_iter { const char *query; }; -static int check_each_cb(const struct object_id *oid, void *data) +static int check_each_cb(const struct object_id *oid, void *node_data UNUSED, void *cb_data) { - struct expected_hex_iter *hex_iter = data; + struct expected_hex_iter *hex_iter = cb_data; struct object_id expected; cl_assert(hex_iter->i < hex_iter->expected_hexes.nr); @@ -105,3 +105,23 @@ void test_oidtree__each(void) check_each(&ot, "32100", "321", NULL); check_each(&ot, "32", "320", "321", NULL); } + +void test_oidtree__insert_overwrites_data(void) +{ + struct object_id oid; + struct oidtree ot; + int a, b; + + cl_parse_any_oid("1", &oid); + + oidtree_init(&ot); + + 
oidtree_insert(&ot, &oid, NULL); + cl_assert_equal_p(oidtree_get(&ot, &oid), NULL); + oidtree_insert(&ot, &oid, &a); + cl_assert_equal_p(oidtree_get(&ot, &oid), &a); + oidtree_insert(&ot, &oid, &b); + cl_assert_equal_p(oidtree_get(&ot, &oid), &b); + + oidtree_clear(&ot); +} -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
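The insert-overwrites-payload contract exercised by the new unit test can be sketched with a self-contained stand-in. This is not the oidtree implementation: it assumes a plain linked list instead of a crit-bit tree, a fixed 20-byte ID, and hypothetical names (`id_store`, `id_store_insert()`, `id_store_get()`, `id_store_contains()`, `id_store_clear()`). Only the observable semantics are meant to match the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define ID_LEN 20 /* assumed key length, for illustration only */

struct id_entry {
	unsigned char id[ID_LEN];
	void *data; /* caller-owned payload, may be NULL */
	struct id_entry *next;
};

struct id_store {
	struct id_entry *head;
};

static struct id_entry *id_store_lookup(const struct id_store *s,
					const unsigned char *id)
{
	for (struct id_entry *e = s->head; e; e = e->next)
		if (!memcmp(e->id, id, ID_LEN))
			return e;
	return NULL;
}

/* Insert the ID; the payload of any preexisting entry is overwritten. */
static void id_store_insert(struct id_store *s, const unsigned char *id,
			    void *data)
{
	struct id_entry *e = id_store_lookup(s, id);
	if (!e) {
		e = calloc(1, sizeof(*e));
		assert(e);
		memcpy(e->id, id, ID_LEN);
		e->next = s->head;
		s->head = e;
	}
	e->data = data;
}

static bool id_store_contains(const struct id_store *s,
			      const unsigned char *id)
{
	return id_store_lookup(s, id) != NULL;
}

/* Returns NULL both for a missing ID and for an entry with a NULL payload. */
static void *id_store_get(const struct id_store *s, const unsigned char *id)
{
	struct id_entry *e = id_store_lookup(s, id);
	return e ? e->data : NULL;
}

/* Frees the entries only; payloads stay owned by the caller. */
static void id_store_clear(struct id_store *s)
{
	while (s->head) {
		struct id_entry *next = s->head->next;
		free(s->head);
		s->head = next;
	}
}
```

As in the patch, the getter cannot tell a missing ID apart from an entry whose payload is NULL, so callers that store NULL payloads need the separate contains check. The store also never frees payloads, which is why a caller holding heap-allocated payloads has to walk the entries and release them before clearing.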
* [PATCH v3 10/17] odb/source-inmemory: convert to use oidtree
2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
` (8 preceding siblings ...)
2026-04-10 12:12 ` [PATCH v3 09/17] oidtree: add ability to store data Patrick Steinhardt
@ 2026-04-10 12:12 ` Patrick Steinhardt
2026-04-10 12:12 ` [PATCH v3 11/17] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt
` (7 subsequent siblings)
17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Justin Tobler

The in-memory source stores its objects in a simple array that we grow
as needed. This has a couple of downsides:

  - The object lookup is O(n). This doesn't matter in practice because
    we only store a small number of objects.

  - We don't have an easy way to iterate over all objects in
    lexicographic order.

  - We don't have an easy way to compute unique object ID prefixes.

Refactor the code to use an oidtree instead. This is the same data
structure used by our loose object source, and thus it means we get a
bunch of functionality for free.

Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 72 +++++++++++++++++++++++++++++++++++++-------------- odb/source-inmemory.h | 13 ++-------- 2 files changed, 54 insertions(+), 31 deletions(-) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index d05a13df45..3b51cc7fef 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -3,20 +3,29 @@ #include "odb.h" #include "odb/source-inmemory.h" #include "odb/streaming.h" +#include "oidtree.h" #include "repository.h" -static const struct cached_object *find_cached_object(struct odb_source_inmemory *source, - const struct object_id *oid) +struct inmemory_object { + enum object_type type; + const void *buf; + unsigned long size; +}; + +static const struct inmemory_object *find_cached_object(struct odb_source_inmemory *source, + const struct object_id *oid) { - static const struct cached_object empty_tree = { + static const struct inmemory_object empty_tree = { .type = OBJ_TREE, .buf = "", }; - const struct cached_object_entry *co = source->objects; + const struct inmemory_object *object; - for (size_t i = 0; i < source->objects_nr; i++, co++) - if (oideq(&co->oid, oid)) - return &co->value; + if (source->objects) { + object = oidtree_get(source->objects, oid); + if (object) + return object; + } if (oid->algo && oideq(oid, hash_algos[oid->algo].empty_tree)) return &empty_tree; @@ -30,7 +39,7 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source, enum object_info_flags flags UNUSED) { struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); - const struct cached_object *object; + const struct inmemory_object *object; object = find_cached_object(inmemory, oid); if (!object) @@ -88,7 +97,7 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out, { struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); struct odb_read_stream_inmemory *stream; - const struct cached_object *object; + const struct 
inmemory_object *object; object = find_cached_object(inmemory, oid); if (!object) @@ -113,17 +122,23 @@ static int odb_source_inmemory_write_object(struct odb_source *source, enum odb_write_object_flags flags UNUSED) { struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); - struct cached_object_entry *object; + struct inmemory_object *object; hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid); - ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1, - inmemory->objects_alloc); - object = &inmemory->objects[inmemory->objects_nr++]; - object->value.size = len; - object->value.type = type; - object->value.buf = xmemdupz(buf, len); - oidcpy(&object->oid, oid); + if (!inmemory->objects) { + CALLOC_ARRAY(inmemory->objects, 1); + oidtree_init(inmemory->objects); + } else if (oidtree_contains(inmemory->objects, oid)) { + return 0; + } + + CALLOC_ARRAY(object, 1); + object->size = len; + object->type = type; + object->buf = xmemdupz(buf, len); + + oidtree_insert(inmemory->objects, oid, object); return 0; } @@ -167,12 +182,29 @@ static int odb_source_inmemory_write_object_stream(struct odb_source *source, return ret; } +static int inmemory_object_free(const struct object_id *oid UNUSED, + void *node_data, + void *cb_data UNUSED) +{ + struct inmemory_object *object = node_data; + free((void *) object->buf); + free(object); + return 0; +} + static void odb_source_inmemory_free(struct odb_source *source) { struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); - for (size_t i = 0; i < inmemory->objects_nr; i++) - free((char *) inmemory->objects[i].value.buf); - free(inmemory->objects); + + if (inmemory->objects) { + struct object_id null_oid = { 0 }; + + oidtree_each(inmemory->objects, &null_oid, 0, + inmemory_object_free, NULL); + oidtree_clear(inmemory->objects); + free(inmemory->objects); + } + free(inmemory->base.path); free(inmemory); } diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h index 
d1b05a3996..a88fc2e320 100644 --- a/odb/source-inmemory.h +++ b/odb/source-inmemory.h @@ -3,14 +3,7 @@ #include "odb/source.h" -struct cached_object_entry { - struct object_id oid; - struct cached_object { - enum object_type type; - const void *buf; - unsigned long size; - } value; -}; +struct oidtree; /* * An in-memory source that you can write objects to that shall be made @@ -20,9 +13,7 @@ struct cached_object_entry { */ struct odb_source_inmemory { struct odb_source base; - - struct cached_object_entry *objects; - size_t objects_nr, objects_alloc; + struct oidtree *objects; }; /* Create a new in-memory object database source. */ -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH v3 11/17] odb/source-inmemory: implement `for_each_object()` callback 2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt ` (9 preceding siblings ...) 2026-04-10 12:12 ` [PATCH v3 10/17] odb/source-inmemory: convert to use oidtree Patrick Steinhardt @ 2026-04-10 12:12 ` Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 12/17] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt ` (6 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `for_each_object()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 88 +++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 72 insertions(+), 16 deletions(-) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index 3b51cc7fef..f60eecbdbb 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -33,6 +33,28 @@ static const struct inmemory_object *find_cached_object(struct odb_source_inmemo return NULL; } +static void populate_object_info(struct odb_source_inmemory *source, + struct object_info *oi, + const struct inmemory_object *object) +{ + if (!oi) + return; + + if (oi->typep) + *(oi->typep) = object->type; + if (oi->sizep) + *(oi->sizep) = object->size; + if (oi->disk_sizep) + *(oi->disk_sizep) = 0; + if (oi->delta_base_oid) + oidclr(oi->delta_base_oid, source->base.odb->repo->hash_algo); + if (oi->contentp) + *oi->contentp = xmemdupz(object->buf, object->size); + if (oi->mtimep) + *oi->mtimep = 0; + oi->whence = OI_CACHED; +} + static int odb_source_inmemory_read_object_info(struct odb_source *source, const struct object_id *oid, struct object_info *oi, @@ -45,22 +67,7 @@ static int odb_source_inmemory_read_object_info(struct odb_source *source, if (!object) return -1; - if (oi) { - if (oi->typep) - *(oi->typep) = object->type; - if (oi->sizep) - *(oi->sizep) = 
object->size; - if (oi->disk_sizep) - *(oi->disk_sizep) = 0; - if (oi->delta_base_oid) - oidclr(oi->delta_base_oid, source->odb->repo->hash_algo); - if (oi->contentp) - *oi->contentp = xmemdupz(object->buf, object->size); - if (oi->mtimep) - *oi->mtimep = 0; - oi->whence = OI_CACHED; - } - + populate_object_info(inmemory, oi, object); return 0; } @@ -114,6 +121,54 @@ static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out, return 0; } +struct odb_source_inmemory_for_each_object_data { + struct odb_source_inmemory *inmemory; + const struct object_info *request; + odb_for_each_object_cb cb; + void *cb_data; +}; + +static int odb_source_inmemory_for_each_object_cb(const struct object_id *oid, + void *node_data, void *cb_data) +{ + struct odb_source_inmemory_for_each_object_data *data = cb_data; + struct inmemory_object *object = node_data; + + if (data->request) { + struct object_info oi = *data->request; + populate_object_info(data->inmemory, &oi, object); + return data->cb(oid, &oi, data->cb_data); + } else { + return data->cb(oid, NULL, data->cb_data); + } +} + +static int odb_source_inmemory_for_each_object(struct odb_source *source, + const struct object_info *request, + odb_for_each_object_cb cb, + void *cb_data, + const struct odb_for_each_object_options *opts) +{ + struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); + struct odb_source_inmemory_for_each_object_data payload = { + .inmemory = inmemory, + .request = request, + .cb = cb, + .cb_data = cb_data, + }; + struct object_id null_oid = { 0 }; + + if ((opts->flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY) || + (opts->flags & ODB_FOR_EACH_OBJECT_LOCAL_ONLY && !source->local)) + return 0; + if (!inmemory->objects) + return 0; + + return oidtree_each(inmemory->objects, + opts->prefix ? 
opts->prefix : &null_oid, opts->prefix_hex_len, + odb_source_inmemory_for_each_object_cb, &payload); +} + static int odb_source_inmemory_write_object(struct odb_source *source, const void *buf, unsigned long len, enum object_type type, @@ -219,6 +274,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.free = odb_source_inmemory_free; source->base.read_object_info = odb_source_inmemory_read_object_info; source->base.read_object_stream = odb_source_inmemory_read_object_stream; + source->base.for_each_object = odb_source_inmemory_for_each_object; source->base.write_object = odb_source_inmemory_write_object; source->base.write_object_stream = odb_source_inmemory_write_object_stream; -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH v3 12/17] odb/source-inmemory: implement `find_abbrev_len()` callback 2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt ` (10 preceding siblings ...) 2026-04-10 12:12 ` [PATCH v3 11/17] odb/source-inmemory: implement `for_each_object()` callback Patrick Steinhardt @ 2026-04-10 12:12 ` Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 13/17] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt ` (5 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `find_abbrev_len()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index f60eecbdbb..44d9bbedec 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -169,6 +169,44 @@ static int odb_source_inmemory_for_each_object(struct odb_source *source, odb_source_inmemory_for_each_object_cb, &payload); } +struct find_abbrev_len_data { + const struct object_id *oid; + unsigned len; +}; + +static int find_abbrev_len_cb(const struct object_id *oid, + struct object_info *oi UNUSED, + void *cb_data) +{ + struct find_abbrev_len_data *data = cb_data; + unsigned len = oid_common_prefix_hexlen(oid, data->oid); + if (len != hash_algos[oid->algo].hexsz && len >= data->len) + data->len = len + 1; + return 0; +} + +static int odb_source_inmemory_find_abbrev_len(struct odb_source *source, + const struct object_id *oid, + unsigned min_len, + unsigned *out) +{ + struct odb_for_each_object_options opts = { + .prefix = oid, + .prefix_hex_len = min_len, + }; + struct find_abbrev_len_data data = { + .oid = oid, + .len = min_len, + }; + int ret; + + ret = odb_source_inmemory_for_each_object(source, NULL, find_abbrev_len_cb, + &data, &opts); + *out = data.len; + + return 
ret; +} + static int odb_source_inmemory_write_object(struct odb_source *source, const void *buf, unsigned long len, enum object_type type, @@ -275,6 +313,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.read_object_info = odb_source_inmemory_read_object_info; source->base.read_object_stream = odb_source_inmemory_read_object_stream; source->base.for_each_object = odb_source_inmemory_for_each_object; + source->base.find_abbrev_len = odb_source_inmemory_find_abbrev_len; source->base.write_object = odb_source_inmemory_write_object; source->base.write_object_stream = odb_source_inmemory_write_object_stream; -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
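The abbreviation-length computation in the callback above can be sketched without any Git types. This is a simplified illustration under stated assumptions: IDs are equal-length hex strings, `common_prefix_hexlen()` and `unique_abbrev_len()` are hypothetical helpers, and every stored ID is visited rather than only those sharing the minimum-length prefix. The update rule is the one from the patch: skip the ID itself, and otherwise grow the length to one more than the longest shared prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of leading hex characters shared by two hex strings. */
static size_t common_prefix_hexlen(const char *a, const char *b)
{
	size_t n = 0;
	while (a[n] && b[n] && a[n] == b[n])
		n++;
	return n;
}

/*
 * Shortest prefix length of `target`, at least `min_len`, that no other
 * ID in `others` shares. Assumes all IDs have the same length.
 */
static size_t unique_abbrev_len(const char *target,
				const char *const *others, size_t nr,
				size_t min_len)
{
	size_t hexsz;
	size_t len = min_len;

	assert(target);
	hexsz = strlen(target);

	for (size_t i = 0; i < nr; i++) {
		size_t common = common_prefix_hexlen(target, others[i]);
		/* A full-length match is the target itself and is skipped. */
		if (common != hexsz && common >= len)
			len = common + 1;
	}
	return len;
}
```

Visiting all IDs gives the same answer as a prefix-restricted walk, because an ID that shares fewer than `min_len` characters can never raise the result. The prefix restriction in the patch only saves work.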
* [PATCH v3 13/17] odb/source-inmemory: implement `count_objects()` callback 2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt ` (11 preceding siblings ...) 2026-04-10 12:12 ` [PATCH v3 12/17] odb/source-inmemory: implement `find_abbrev_len()` callback Patrick Steinhardt @ 2026-04-10 12:12 ` Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 14/17] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt ` (4 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `count_objects()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index 44d9bbedec..674dbcad30 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -207,6 +207,25 @@ static int odb_source_inmemory_find_abbrev_len(struct odb_source *source, return ret; } +static int count_objects_cb(const struct object_id *oid UNUSED, + struct object_info *oi UNUSED, + void *cb_data) +{ + unsigned long *counter = cb_data; + (*counter)++; + return 0; +} + +static int odb_source_inmemory_count_objects(struct odb_source *source, + enum odb_count_objects_flags flags UNUSED, + unsigned long *out) +{ + struct odb_for_each_object_options opts = { 0 }; + *out = 0; + return odb_source_inmemory_for_each_object(source, NULL, count_objects_cb, + out, &opts); +} + static int odb_source_inmemory_write_object(struct odb_source *source, const void *buf, unsigned long len, enum object_type type, @@ -314,6 +333,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.read_object_stream = odb_source_inmemory_read_object_stream; source->base.for_each_object = odb_source_inmemory_for_each_object; source->base.find_abbrev_len = 
odb_source_inmemory_find_abbrev_len; + source->base.count_objects = odb_source_inmemory_count_objects; source->base.write_object = odb_source_inmemory_write_object; source->base.write_object_stream = odb_source_inmemory_write_object_stream; -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
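Counting by reusing the iteration callback, as the patch above does, can be sketched as follows. The iterator here is a hypothetical array walk (`for_each_id()`), not the oidtree or ODB API, and `count_cb()` and `count_ids()` are illustrative names. What it is meant to show is the shape: the counter travels through the opaque callback payload, and a non-zero callback result stops iteration and is propagated.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*each_cb)(const char *id, void *cb_data);

/* Hypothetical iterator: calls `cb` for each ID until it returns non-zero. */
static int for_each_id(const char *const *ids, size_t nr,
		       each_cb cb, void *cb_data)
{
	for (size_t i = 0; i < nr; i++) {
		int ret = cb(ids[i], cb_data);
		if (ret)
			return ret; /* stop and propagate to the caller */
	}
	return 0;
}

/* The payload is the counter itself; the ID is not needed. */
static int count_cb(const char *id, void *cb_data)
{
	unsigned long *counter = cb_data;
	(void)id;
	(*counter)++;
	return 0;
}

static int count_ids(const char *const *ids, size_t nr, unsigned long *out)
{
	assert(out);
	*out = 0;
	return for_each_id(ids, nr, count_cb, out);
}
```

The trade-off is an O(n) walk per count instead of maintaining a counter on insert, which seems acceptable for a source that is expected to hold only a small number of objects.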
* [PATCH v3 14/17] odb/source-inmemory: implement `freshen_object()` callback 2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt ` (12 preceding siblings ...) 2026-04-10 12:12 ` [PATCH v3 13/17] odb/source-inmemory: implement `count_objects()` callback Patrick Steinhardt @ 2026-04-10 12:12 ` Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 15/17] odb/source-inmemory: stub out remaining functions Patrick Steinhardt ` (3 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Implement the `freshen_object()` callback function for the in-memory source. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index 674dbcad30..8934e0f547 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -294,6 +294,15 @@ static int odb_source_inmemory_write_object_stream(struct odb_source *source, return ret; } +static int odb_source_inmemory_freshen_object(struct odb_source *source, + const struct object_id *oid) +{ + struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source); + if (find_cached_object(inmemory, oid)) + return 1; + return 0; +} + static int inmemory_object_free(const struct object_id *oid UNUSED, void *node_data, void *cb_data UNUSED) @@ -336,6 +345,7 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.count_objects = odb_source_inmemory_count_objects; source->base.write_object = odb_source_inmemory_write_object; source->base.write_object_stream = odb_source_inmemory_write_object_stream; + source->base.freshen_object = odb_source_inmemory_freshen_object; return source; } -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH v3 15/17] odb/source-inmemory: stub out remaining functions 2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt ` (13 preceding siblings ...) 2026-04-10 12:12 ` [PATCH v3 14/17] odb/source-inmemory: implement `freshen_object()` callback Patrick Steinhardt @ 2026-04-10 12:12 ` Patrick Steinhardt 2026-04-10 12:12 ` [PATCH v3 16/17] odb: generic in-memory source Patrick Steinhardt ` (2 subsequent siblings) 17 siblings, 0 replies; 85+ messages in thread From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Justin Tobler Stub out remaining functions that we either don't need or that are basically no-ops. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- odb/source-inmemory.c | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c index 8934e0f547..e004566d76 100644 --- a/odb/source-inmemory.c +++ b/odb/source-inmemory.c @@ -303,6 +303,32 @@ static int odb_source_inmemory_freshen_object(struct odb_source *source, return 0; } +static int odb_source_inmemory_begin_transaction(struct odb_source *source UNUSED, + struct odb_transaction **out UNUSED) +{ + return error("in-memory source does not support transactions"); +} + +static int odb_source_inmemory_read_alternates(struct odb_source *source UNUSED, + struct strvec *out UNUSED) +{ + return 0; +} + +static int odb_source_inmemory_write_alternate(struct odb_source *source UNUSED, + const char *alternate UNUSED) +{ + return error("in-memory source does not support alternates"); +} + +static void odb_source_inmemory_close(struct odb_source *source UNUSED) +{ +} + +static void odb_source_inmemory_reprepare(struct odb_source *source UNUSED) +{ +} + static int inmemory_object_free(const struct object_id *oid UNUSED, void *node_data, void *cb_data UNUSED) @@ -338,6 +364,8 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, 
"source", false); source->base.free = odb_source_inmemory_free; + source->base.close = odb_source_inmemory_close; + source->base.reprepare = odb_source_inmemory_reprepare; source->base.read_object_info = odb_source_inmemory_read_object_info; source->base.read_object_stream = odb_source_inmemory_read_object_stream; source->base.for_each_object = odb_source_inmemory_for_each_object; @@ -346,6 +374,9 @@ struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb) source->base.write_object = odb_source_inmemory_write_object; source->base.write_object_stream = odb_source_inmemory_write_object_stream; source->base.freshen_object = odb_source_inmemory_freshen_object; + source->base.begin_transaction = odb_source_inmemory_begin_transaction; + source->base.read_alternates = odb_source_inmemory_read_alternates; + source->base.write_alternate = odb_source_inmemory_write_alternate; return source; } -- 2.54.0.rc0.707.g0fbf48f4d6.dirty ^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH v3 16/17] odb: generic in-memory source
2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
` (14 preceding siblings ...)
2026-04-10 12:12 ` [PATCH v3 15/17] odb/source-inmemory: stub out remaining functions Patrick Steinhardt
@ 2026-04-10 12:12 ` Patrick Steinhardt
2026-04-10 12:12 ` [PATCH v3 17/17] t/unit-tests: add tests for the in-memory object source Patrick Steinhardt
2026-04-14 8:27 ` [PATCH v3 00/17] odb: introduce "in-memory" source Karthik Nayak
17 siblings, 0 replies; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Justin Tobler

Make the in-memory source generic: `struct object_database` now stores
it as a plain `struct odb_source` pointer instead of a
`struct odb_source_inmemory` pointer, and all callers are adjusted so
that they no longer go through its `base` member.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c | 8 ++++----
 odb.h | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/odb.c b/odb.c
index 24e929f03c..965ef68e4e 100644
--- a/odb.c
+++ b/odb.c
@@ -560,7 +560,7 @@ static int do_oid_object_info_extended(struct object_database *odb,
 	if (is_null_oid(real))
 		return -1;
 
-	if (!odb_source_read_object_info(&odb->inmemory_objects->base, oid, oi, flags))
+	if (!odb_source_read_object_info(odb->inmemory_objects, oid, oi, flags))
 		return 0;
 
 	odb_prepare_alternates(odb);
@@ -737,7 +737,7 @@ int odb_pretend_object(struct object_database *odb,
 	if (odb_has_object(odb, oid, 0))
 		return 0;
 
-	return odb_source_write_object(&odb->inmemory_objects->base,
+	return odb_source_write_object(odb->inmemory_objects,
 				       buf, len, type, oid, NULL, 0);
 }
 
@@ -1020,7 +1020,7 @@ struct object_database *odb_new(struct repository *repo,
 	o->sources = odb_source_new(o, primary_source, true);
 	o->sources_tail = &o->sources->next;
 	o->alternate_db = xstrdup_or_null(secondary_sources);
-	o->inmemory_objects = odb_source_inmemory_new(o);
+	o->inmemory_objects = &odb_source_inmemory_new(o)->base;
 
 	free(to_free);
 
@@ -1045,7 +1045,7 @@ static void odb_free_sources(struct object_database *o)
 		o->sources = next;
 	}
 
-	odb_source_free(&o->inmemory_objects->base);
+	odb_source_free(o->inmemory_objects);
 	o->inmemory_objects = NULL;
 
 	kh_destroy_odb_path_map(o->source_by_path);
diff --git a/odb.h b/odb.h
index c3a7edf9c8..73553ed5a7 100644
--- a/odb.h
+++ b/odb.h
@@ -81,7 +81,7 @@ struct object_database {
 	 * to write them into the object store (e.g. a browse-only
 	 * application).
 	 */
-	struct odb_source_inmemory *inmemory_objects;
+	struct odb_source *inmemory_objects;
 
 	/*
 	 * A fast, rough count of the number of objects in the repository.
-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty

^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH v3 17/17] t/unit-tests: add tests for the in-memory object source
2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
` (15 preceding siblings ...)
2026-04-10 12:12 ` [PATCH v3 16/17] odb: generic in-memory source Patrick Steinhardt
@ 2026-04-10 12:12 ` Patrick Steinhardt
2026-04-14 8:45 ` Karthik Nayak
2026-04-14 8:27 ` [PATCH v3 00/17] odb: introduce "in-memory" source Karthik Nayak
17 siblings, 1 reply; 85+ messages in thread
From: Patrick Steinhardt @ 2026-04-10 12:12 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Justin Tobler

While the in-memory object source is a full-fledged source, our code
base only exercises parts of its functionality because we only use it in
git-blame(1). Implement unit tests to verify that the yet-unused
functionality of the backend works as expected.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Makefile                      |   1 +
 t/meson.build                 |   1 +
 t/unit-tests/u-odb-inmemory.c | 313 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 315 insertions(+)

diff --git a/Makefile b/Makefile
index 3cda12c455..68b4daa1ad 100644
--- a/Makefile
+++ b/Makefile
@@ -1529,6 +1529,7 @@ CLAR_TEST_SUITES += u-hash
 CLAR_TEST_SUITES += u-hashmap
 CLAR_TEST_SUITES += u-list-objects-filter-options
 CLAR_TEST_SUITES += u-mem-pool
+CLAR_TEST_SUITES += u-odb-inmemory
 CLAR_TEST_SUITES += u-oid-array
 CLAR_TEST_SUITES += u-oidmap
 CLAR_TEST_SUITES += u-oidtree
diff --git a/t/meson.build b/t/meson.build
index 7528e5cda5..db5e01c49b 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -6,6 +6,7 @@ clar_test_suites = [
   'unit-tests/u-hashmap.c',
   'unit-tests/u-list-objects-filter-options.c',
   'unit-tests/u-mem-pool.c',
+  'unit-tests/u-odb-inmemory.c',
   'unit-tests/u-oid-array.c',
   'unit-tests/u-oidmap.c',
   'unit-tests/u-oidtree.c',
diff --git a/t/unit-tests/u-odb-inmemory.c b/t/unit-tests/u-odb-inmemory.c
new file mode 100644
index 0000000000..482502ef4b
--- /dev/null
+++ b/t/unit-tests/u-odb-inmemory.c
@@ -0,0 +1,313 @@
+#include "unit-test.h"
+#include "hex.h"
+#include "odb/source-inmemory.h"
+#include "odb/streaming.h"
+#include "oidset.h"
+#include "repository.h"
+#include "strbuf.h"
+
+#define RANDOM_OID "da39a3ee5e6b4b0d3255bfef95601890afd80709"
+#define FOOBAR_OID "f6ea0495187600e7b2288c8ac19c5886383a4632"
+
+static struct repository repo = {
+	.hash_algo = &hash_algos[GIT_HASH_SHA1],
+};
+static struct object_database *odb;
+
+static void cl_assert_object_info(struct odb_source_inmemory *source,
+				  const struct object_id *oid,
+				  enum object_type expected_type,
+				  const char *expected_content)
+{
+	enum object_type actual_type;
+	unsigned long actual_size;
+	void *actual_content;
+	struct object_info oi = {
+		.typep = &actual_type,
+		.sizep = &actual_size,
+		.contentp = &actual_content,
+	};
+
+	cl_must_pass(odb_source_read_object_info(&source->base, oid, &oi, 0));
+	cl_assert_equal_u(actual_size, strlen(expected_content));
+	cl_assert_equal_u(actual_type, expected_type);
+	cl_assert_equal_s((char *) actual_content, expected_content);
+
+	free(actual_content);
+}
+
+void test_odb_inmemory__initialize(void)
+{
+	odb = odb_new(&repo, "", "");
+}
+
+void test_odb_inmemory__cleanup(void)
+{
+	odb_free(odb);
+}
+
+void test_odb_inmemory__new(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	cl_assert_equal_i(source->base.type, ODB_SOURCE_INMEMORY);
+	odb_source_free(&source->base);
+}
+
+void test_odb_inmemory__read_missing_object(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	struct object_id oid;
+	const char *end;
+
+	cl_must_pass(parse_oid_hex_algop(RANDOM_OID, &oid, &end, repo.hash_algo));
+	cl_must_fail(odb_source_read_object_info(&source->base, &oid, NULL, 0));
+
+	odb_source_free(&source->base);
+}
+
+void test_odb_inmemory__read_empty_tree(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	cl_assert_object_info(source, repo.hash_algo->empty_tree, OBJ_TREE, "");
+	odb_source_free(&source->base);
+}
+
+void test_odb_inmemory__read_written_object(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	const char data[] = "foobar";
+	struct object_id written_oid;
+
+	cl_must_pass(odb_source_write_object(&source->base, data, strlen(data),
+					     OBJ_BLOB, &written_oid, NULL, 0));
+	cl_assert_equal_s(oid_to_hex(&written_oid), FOOBAR_OID);
+	cl_assert_object_info(source, &written_oid, OBJ_BLOB, "foobar");
+
+	odb_source_free(&source->base);
+}
+
+void test_odb_inmemory__read_stream_object(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	struct odb_read_stream *stream;
+	struct object_id written_oid;
+	const char data[] = "foobar";
+	char buf[3] = { 0 };
+
+	cl_must_pass(odb_source_write_object(&source->base, data, strlen(data),
+					     OBJ_BLOB, &written_oid, NULL, 0));
+
+	cl_must_pass(odb_source_read_object_stream(&stream, &source->base,
+						   &written_oid));
+	cl_assert_equal_i(stream->type, OBJ_BLOB);
+	cl_assert_equal_u(stream->size, 6);
+
+	cl_assert_equal_i(odb_read_stream_read(stream, buf, 2), 2);
+	cl_assert_equal_s(buf, "fo");
+	cl_assert_equal_i(odb_read_stream_read(stream, buf, 2), 2);
+	cl_assert_equal_s(buf, "ob");
+	cl_assert_equal_i(odb_read_stream_read(stream, buf, 2), 2);
+	cl_assert_equal_s(buf, "ar");
+	cl_assert_equal_i(odb_read_stream_read(stream, buf, 2), 0);
+
+	odb_read_stream_close(stream);
+	odb_source_free(&source->base);
+}
+
+static int add_one_object(const struct object_id *oid,
+			  struct object_info *oi UNUSED,
+			  void *payload)
+{
+	struct oidset *actual_oids = payload;
+	cl_must_pass(oidset_insert(actual_oids, oid));
+	return 0;
+}
+
+void test_odb_inmemory__for_each_object(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	struct odb_for_each_object_options opts = { 0 };
+	struct oidset expected_oids = OIDSET_INIT;
+	struct oidset actual_oids = OIDSET_INIT;
+	struct strbuf buf = STRBUF_INIT;
+
+	cl_must_pass(odb_source_for_each_object(&source->base, NULL,
+						add_one_object, &actual_oids, &opts));
+	cl_assert_equal_u(oidset_size(&actual_oids), 0);
+
+	for (int i = 0; i < 10; i++) {
+		struct object_id written_oid;
+
+		strbuf_reset(&buf);
+		strbuf_addf(&buf, "%d", i);
+
+		cl_must_pass(odb_source_write_object(&source->base, buf.buf, buf.len,
+						     OBJ_BLOB, &written_oid, NULL, 0));
+		cl_must_pass(oidset_insert(&expected_oids, &written_oid));
+	}
+
+	cl_must_pass(odb_source_for_each_object(&source->base, NULL,
+						add_one_object, &actual_oids, &opts));
+	cl_assert_equal_b(oidset_equal(&expected_oids, &actual_oids), true);
+
+	odb_source_free(&source->base);
+	oidset_clear(&expected_oids);
+	oidset_clear(&actual_oids);
+	strbuf_release(&buf);
+}
+
+static int abort_after_two_objects(const struct object_id *oid UNUSED,
+				   struct object_info *oi UNUSED,
+				   void *payload)
+{
+	unsigned *counter = payload;
+	(*counter)++;
+	if (*counter == 2)
+		return 123;
+	return 0;
+}
+
+void test_odb_inmemory__for_each_object_can_abort_iteration(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	struct odb_for_each_object_options opts = { 0 };
+	struct object_id written_oid;
+	unsigned counter = 0;
+
+	cl_must_pass(odb_source_write_object(&source->base, "1", 1,
+					     OBJ_BLOB, &written_oid, NULL, 0));
+	cl_must_pass(odb_source_write_object(&source->base, "2", 1,
+					     OBJ_BLOB, &written_oid, NULL, 0));
+	cl_must_pass(odb_source_write_object(&source->base, "3", 1,
+					     OBJ_BLOB, &written_oid, NULL, 0));
+
+	cl_assert_equal_i(odb_source_for_each_object(&source->base, NULL,
+						     abort_after_two_objects,
+						     &counter, &opts),
+			  123);
+	cl_assert_equal_u(counter, 2);
+
+	odb_source_free(&source->base);
+}
+
+void test_odb_inmemory__count_objects(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	struct object_id written_oid;
+	unsigned long count;
+
+	cl_must_pass(odb_source_count_objects(&source->base, 0, &count));
+	cl_assert_equal_u(count, 0);
+
+	cl_must_pass(odb_source_write_object(&source->base, "1", 1,
+					     OBJ_BLOB, &written_oid, NULL, 0));
+	cl_must_pass(odb_source_write_object(&source->base, "2", 1,
+					     OBJ_BLOB, &written_oid, NULL, 0));
+	cl_must_pass(odb_source_write_object(&source->base, "3", 1,
+					     OBJ_BLOB, &written_oid, NULL, 0));
+
+	cl_must_pass(odb_source_count_objects(&source->base, 0, &count));
+	cl_assert_equal_u(count, 3);
+
+	odb_source_free(&source->base);
+}
+
+void test_odb_inmemory__find_abbrev_len(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	struct object_id oid1, oid2;
+	unsigned abbrev_len;
+
+	/*
+	 * The two blobs we're about to write share the first 10 hex characters
+	 * of their object IDs ("a09f43dc45"), so at least 11 characters are
+	 * needed to tell them apart:
+	 *
+	 *   "368317" -> a09f43dc4562d45115583f5094640ae237df55f7
+	 *   "514796" -> a09f43dc45fef837235eb7e6b1a6ca5e169a3981
+	 *
+	 * With only one blob written we expect a length of 4.
+	 */
+	cl_must_pass(odb_source_write_object(&source->base, "368317", strlen("368317"),
+					     OBJ_BLOB, &oid1, NULL, 0));
+	cl_must_pass(odb_source_find_abbrev_len(&source->base, &oid1, 4,
+						&abbrev_len));
+	cl_assert_equal_u(abbrev_len, 4);
+
+	/*
+	 * With both objects present, the shared 10-character prefix means we
+	 * need at least 11 characters to uniquely identify either object.
+	 */
+	cl_must_pass(odb_source_write_object(&source->base, "514796", strlen("514796"),
+					     OBJ_BLOB, &oid2, NULL, 0));
+	cl_must_pass(odb_source_find_abbrev_len(&source->base, &oid1, 4,
+						&abbrev_len));
+	cl_assert_equal_u(abbrev_len, 11);
+
+	odb_source_free(&source->base);
+}
+
+void test_odb_inmemory__freshen_object(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	struct object_id written_oid;
+	struct object_id oid;
+	const char *end;
+
+	cl_must_pass(parse_oid_hex_algop(RANDOM_OID, &oid, &end, repo.hash_algo));
+	cl_assert_equal_i(odb_source_freshen_object(&source->base, &oid), 0);
+
+	cl_must_pass(odb_source_write_object(&source->base, "foobar",
+					     strlen("foobar"), OBJ_BLOB,
+					     &written_oid, NULL, 0));
+	cl_assert_equal_i(odb_source_freshen_object(&source->base,
+						    &written_oid), 1);
+
+	odb_source_free(&source->base);
+}
+
+struct membuf_write_stream {
+	struct odb_write_stream base;
+	const char *buf;
+	size_t offset;
+	size_t size;
+};
+
+static ssize_t membuf_write_stream_read(struct odb_write_stream *stream,
+					unsigned char *buf, size_t len)
+{
+	struct membuf_write_stream *s = container_of(stream, struct membuf_write_stream, base);
+	size_t chunk_size = 2;
+
+	if (chunk_size > len)
+		chunk_size = len;
+	if (chunk_size > s->size - s->offset)
+		chunk_size = s->size - s->offset;
+
+	memcpy(buf, s->buf + s->offset, chunk_size);
+
+	s->offset += chunk_size;
+	if (s->offset == s->size)
+		s->base.is_finished = 1;
+
+	return chunk_size;
+}
+
+void test_odb_inmemory__write_object_stream(void)
+{
+	struct odb_source_inmemory *source = odb_source_inmemory_new(odb);
+	const char data[] = "foobar";
+	struct membuf_write_stream stream = {
+		.base.read = membuf_write_stream_read,
+		.buf = data,
+		.size = strlen(data),
+	};
+	struct object_id written_oid;
+
+	cl_must_pass(odb_source_write_object_stream(&source->base, &stream.base,
+						    strlen(data), &written_oid));
+	cl_assert_equal_s(oid_to_hex(&written_oid), FOOBAR_OID);
+	cl_assert_object_info(source, &written_oid, OBJ_BLOB, "foobar");
+
+	odb_source_free(&source->base);
+}
-- 
2.54.0.rc0.707.g0fbf48f4d6.dirty
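The expectation in the `find_abbrev_len` test can be checked by hand from the two object IDs quoted in its comment. The following is only a minimal sketch: `min_unique_abbrev_len()` is a hypothetical helper that is not part of Git, and it compares just two hex strings, whereas the backend in this series looks the prefix up in its oidtree. It assumes both IDs have the same length and differ somewhere.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a Git API): return the number of leading hex
 * characters needed to tell `oid` apart from `other`, but never less
 * than `min_len`. Assumes both strings are equally long and not equal.
 */
static size_t min_unique_abbrev_len(const char *oid, const char *other,
				    size_t min_len)
{
	size_t shared = 0;

	/* Count the hex characters that both object IDs have in common. */
	while (oid[shared] && oid[shared] == other[shared])
		shared++;

	/* Identical IDs cannot be disambiguated by any prefix. */
	assert(oid[shared] != other[shared]);

	/* One character past the shared prefix is the first unique length. */
	return shared + 1 > min_len ? shared + 1 : min_len;
}
```

For the two blobs in the test, the shared prefix is "a09f43dc45" (10 characters), so the sketch yields 11. For IDs that already differ in the first character, the requested minimum of 4 is returned instead.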
* Re: [PATCH v3 17/17] t/unit-tests: add tests for the in-memory object source
2026-04-10 12:12 ` [PATCH v3 17/17] t/unit-tests: add tests for the in-memory object source Patrick Steinhardt
@ 2026-04-14 8:45 ` Karthik Nayak
0 siblings, 0 replies; 85+ messages in thread
From: Karthik Nayak @ 2026-04-14 8:45 UTC (permalink / raw)
To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> While the in-memory object source is a full-fledged source, our code
> base only exercises parts of its functionality because we only use it in
> git-blame(1). Implement unit tests to verify that the yet-unused
> functionality of the backend works as expected.
>

This patch seems extensive and good! Overall I'm happy with this
version.
* Re: [PATCH v3 00/17] odb: introduce "in-memory" source
2026-04-10 12:12 ` [PATCH v3 " Patrick Steinhardt
` (16 preceding siblings ...)
2026-04-10 12:12 ` [PATCH v3 17/17] t/unit-tests: add tests for the in-memory object source Patrick Steinhardt
@ 2026-04-14 8:27 ` Karthik Nayak
17 siblings, 0 replies; 85+ messages in thread
From: Karthik Nayak @ 2026-04-14 8:27 UTC (permalink / raw)
To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

[snip]

> Range-diff versus v2:
>
>  1:  b18e427c69 !  1:  155b2cdf81 odb: introduce "in-memory" source
>     @@ odb/source-inmemory.h (new)
>      +struct cached_object_entry;
>      +
>      +/*
>     -+ * An inmemory source that you can write objects to that shall be made
>     ++ * An in-memory source that you can write objects to that shall be made
>      + * available for reading, but that shouldn't ever be persisted to disk. Note
>      + * that any objects written to this source will be stored in memory, so the
>      + * number of objects you can store is limited by available system memory.
>     @@ odb/source-inmemory.h (new)
>      +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb);
>      +
>      +/*
>     -+ * Cast the given object database source to the inmemory backend. This will
>     ++ * Cast the given object database source to the in-memory backend. This will
>      + * cause a BUG in case the source doesn't use this backend.
>      + */
>      +static inline struct odb_source_inmemory *odb_source_inmemory_downcast(struct odb_source *source)
>      +{
>      +	if (source->type != ODB_SOURCE_INMEMORY)
>     -+		BUG("trying to downcast source of type '%d' to inmemory", source->type);
>     ++		BUG("trying to downcast source of type '%d' to in-memory", source->type);
>      +	return container_of(source, struct odb_source_inmemory, base);
>      +}
>      +
>     @@ odb/source.h: enum odb_source_type {
>       	/* The "files" backend that uses loose objects and packfiles. */
>       	ODB_SOURCE_FILES,
>      +
>     -+	/* The "inmemory" backend that stores objects in memory. */
>     ++	/* The "in-memory" backend that stores objects in memory. */
>      +	ODB_SOURCE_INMEMORY,
>       };
>
>  2:  8fd337da90 !  2:  c66edd10a8 odb/source-inmemory: implement `free()` callback
>     @@ odb/source-inmemory.h
>      +};
>
>       /*
>     - * An inmemory source that you can write objects to that shall be made
>     + * An in-memory source that you can write objects to that shall be made
>  3:  f4ae2a2bde =  3:  a86549f39c odb: fix unnecessary call to `find_cached_object()`
>  4:  8600b88530 =  4:  49ac739dd2 odb/source-inmemory: implement `read_object_info()` callback
>  5:  ab33c0b7ee !  5:  321ef11be3 odb/source-inmemory: implement `read_object_stream()` callback
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_info(struct od
>
>      +struct odb_read_stream_inmemory {
>      +	struct odb_read_stream base;
>     -+	const void *buf;
>     ++	const unsigned char *buf;

Okay this does make more sense.

>      +	size_t offset;
>      +};
>      +
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_info(struct od
>      +
>      +	if (buf_len > inmemory->base.size - inmemory->offset)
>      +		bytes = inmemory->base.size - inmemory->offset;
>     -+	memcpy(buf, inmemory->buf, bytes);
>     ++
>     ++	memcpy(buf, inmemory->buf + inmemory->offset, bytes);
>     ++	inmemory->offset += bytes;

Now, we also use the offset correctly.

>      +
>      +	return bytes;
>      +}
>  6:  983f886eeb !  6:  506df5e488 odb/source-inmemory: implement `write_object()` callback
>     @@ odb.c: int odb_pretend_object(struct object_database *odb,
>       void *odb_read_object(struct object_database *odb,
>
>       ## odb/source-inmemory.c ##
>     +@@
>     + #include "git-compat-util.h"
>     ++#include "object-file.h"
>     + #include "odb.h"
>     + #include "odb/source-inmemory.h"
>     + #include "odb/streaming.h"
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_stream(struct odb_read_stream **out,
>       	return 0;
>       }
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_stream(struct
>      +	struct odb_source_inmemory *inmemory = odb_source_inmemory_downcast(source);
>      +	struct cached_object_entry *object;
>      +
>     ++	hash_object_file(source->odb->repo->hash_algo, buf, len, type, oid);
>     ++
>      +	ALLOC_GROW(inmemory->objects, inmemory->objects_nr + 1,
>      +		   inmemory->objects_alloc);
>      +	object = &inmemory->objects[inmemory->objects_nr++];
>  7:  68edefa269 <  -:  ---------- odb/source-inmemory: implement `write_object()` callback
>  8:  18d451152b !  7:  21eef34c1b odb/source-inmemory: implement `write_object_stream()` callback
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_write_object(struct odb_so
>      +			goto out;
>      +		}
>      +
>     -+		memcpy(data, buf, bytes_read);
>     ++		memcpy(data + total_read, buf, bytes_read);
>      +		total_read += bytes_read;
>      +	}
>      +
>  9:  cee53b9853 !  8:  504e34d116 cbtree: allow using arbitrary wrapper structures for nodes
>     @@ cbtree.c: int cb_each(struct cb_tree *t, const uint8_t *kpfx, size_t klen,
>
>
>       ## cbtree.h ##
>     +@@
>     +  *
>     +  * This is adapted to store arbitrary data (not just NUL-terminated C strings
>     +  * and allocates no memory internally. The user needs to allocate
>     +- * "struct cb_node" and fill cb_node.k[] with arbitrary match data
>     +- * for memcmp.
>     +- * If "klen" is variable, then it should be embedded into "c_node.k[]"
>     ++ * "struct cb_node" and provide `key_offset` to indicate where the key can be
>     ++ * found relative to the `struct cb_node` for memcmp.
>     ++ * If "klen" is variable, then it should be embedded into the key.
>     +  * Recursion is bound by the maximum value of "klen" used.
>     +  */

We fix up the comments here also.

>     + #ifndef CBTREE_H
>     @@ cbtree.h: struct cb_node {
>       	 */
>       	uint32_t byte;
> 10:  8ad5b81b13 =  9:  9bdd475a92 oidtree: add ability to store data
> 11:  1ed2d23137 ! 10:  956b989529 odb/source-inmemory: convert to use oidtree
>     @@ odb/source-inmemory.h
>      +struct oidtree;
>
>       /*
>     - * An inmemory source that you can write objects to that shall be made
>     + * An in-memory source that you can write objects to that shall be made
>     @@ odb/source-inmemory.h: struct cached_object_entry {
>        */
>       struct odb_source_inmemory {
> 12:  99fbb1cc35 ! 11:  bec1428116 odb/source-inmemory: implement `for_each_object()` callback
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_read_object_stream(struct
>      +	if ((opts->flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY) ||
>      +	    (opts->flags & ODB_FOR_EACH_OBJECT_LOCAL_ONLY && !source->local))
>      +		return 0;
>     ++	if (!inmemory->objects)
>     ++		return 0;
>      +
>      +	return oidtree_each(inmemory->objects,
>      +			    opts->prefix ? opts->prefix : &null_oid, opts->prefix_hex_len,
> 13:  c87a621f39 = 12:  32dada3c27 odb/source-inmemory: implement `find_abbrev_len()` callback
> 14:  9b88f0c07b = 13:  43127840c0 odb/source-inmemory: implement `count_objects()` callback
> 15:  3c9493f2bb = 14:  439acbd068 odb/source-inmemory: implement `freshen_object()` callback
> 16:  f2b6317104 ! 15:  12c1b6ffd2 odb/source-inmemory: stub out remaining functions
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_freshen_object(struct odb_
>      +static int odb_source_inmemory_begin_transaction(struct odb_source *source UNUSED,
>      +						 struct odb_transaction **out UNUSED)
>      +{
>     -+	return error("inmemory source does not support transactions");
>     ++	return error("in-memory source does not support transactions");
>      +}
>      +
>      +static int odb_source_inmemory_read_alternates(struct odb_source *source UNUSED,
>     @@ odb/source-inmemory.c: static int odb_source_inmemory_freshen_object(struct odb_
>      +static int odb_source_inmemory_write_alternate(struct odb_source *source UNUSED,
>      +					       const char *alternate UNUSED)
>      +{
>     -+	return error("inmemory source does not support alternates");
>     ++	return error("in-memory source does not support alternates");
>      +}
>      +
>      +static void odb_source_inmemory_close(struct odb_source *source UNUSED)
> 17:  81da5d5048 = 16:  ef37a61e7f odb: generic in-memory source
>  -:  ---------- > 17:  51b51e0382 t/unit-tests: add tests for the in-memory object source

The range diff looks good. I'll have a look at the unit test patch
independently.

Thanks
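The offset fix acknowledged in the review above (copy from `buf + offset`, then advance `offset`) can be illustrated in isolation. This is a rough sketch under stated assumptions rather than the code from the series: `struct mem_reader` and `mem_reader_read()` are hypothetical names standing in for the stream state and the read callback.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the in-memory read stream state. */
struct mem_reader {
	const unsigned char *buf; /* start of the object contents */
	size_t size;              /* total number of bytes in buf */
	size_t offset;            /* bytes already handed out */
};

/*
 * Copy up to `out_len` bytes into `out`, continuing where the previous
 * call stopped. Returns the number of bytes copied; 0 means end of data.
 */
static size_t mem_reader_read(struct mem_reader *r, void *out, size_t out_len)
{
	size_t bytes = out_len;

	assert(r->offset <= r->size);

	/* Clamp the request to what is left in the buffer. */
	if (bytes > r->size - r->offset)
		bytes = r->size - r->offset;

	/*
	 * Read from the current position, not from the start of the buffer.
	 * Without "+ r->offset" every call would return the same leading
	 * bytes, and without advancing the offset the stream would never end.
	 */
	memcpy(out, r->buf + r->offset, bytes);
	r->offset += bytes;

	return bytes;
}
```

Reading "foobar" in chunks of two bytes with this sketch yields "fo", "ob", "ar" and then 0, which is the same sequence the `read_stream_object` unit test in patch 17 expects from the real backend.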