From: Justin Tobler <jltobler@gmail.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 01/16] odb: introduce "inmemory" source
Date: Wed, 8 Apr 2026 16:00:48 -0500
Message-ID: <ada_W-IWfNKUKnVK@denethor>
In-Reply-To: <20260403-b4-pks-odb-source-inmemory-v1-1-8b8d1abaa25e@pks.im>

On 26/04/03 08:01AM, Patrick Steinhardt wrote:
> Next to our typical object database sources, each object database also
> has an implicit source of "cached" objects. These cached objects only
> exist in memory and serve some use cases:
> 
>   - They contain evergreen objects that we expect to always exist, like
>     for example the empty tree.
> 
>   - They can be used to store temporary objects that we don't want to
>     persist to disk.
> 
> Overall, their use is somewhat restricted though. For example, we don't
> provide the ability to use it as a temporary object database source that
> allows the user to write objects, but discard them after Git exits. So
> while these cached objects behave almost like a source, they aren't used
> as one.

I find the wording of the second bullet point and paragraph above a
little confusing. Are there existing uses where new objects are written
to only the cache?

> This is about to change over the following commits, where we will turn
> cached objects into a new "inmemory" source. This will allow us to use
> it exactly the same as any other source by providing the same common
> interface as the "files" source.

Treating the object cache just like any other ODB source seems like a
good direction.

> For now, the inmemory source only hosts the cached objects and doesn't
> provide any logic yet. This will change with subsequent commits, where
> we move respective functionality into the source.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Makefile              |  1 +
>  meson.build           |  1 +
>  odb.c                 | 21 +++++++++++++--------
>  odb.h                 |  4 ++--
>  odb/source-inmemory.c | 12 ++++++++++++
>  odb/source-inmemory.h | 35 +++++++++++++++++++++++++++++++++++
>  odb/source.h          |  3 +++
>  7 files changed, 67 insertions(+), 10 deletions(-)
> 
[snip]
> @@ -1123,9 +1126,11 @@ void odb_free(struct object_database *o)
>  	odb_close(o);
>  	odb_free_sources(o);
>  
> -	for (size_t i = 0; i < o->cached_object_nr; i++)
> -		free((char *) o->cached_objects[i].value.buf);
> -	free(o->cached_objects);
> +	for (size_t i = 0; i < o->inmemory_objects->objects_nr; i++)
> +		free((char *) o->inmemory_objects->objects[i].value.buf);
> +	free(o->inmemory_objects->objects);
> +	free(o->inmemory_objects->base.path);
> +	free(o->inmemory_objects);

Should we have some sort of `odb_source_inmemory_release()`?
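
To illustrate the idea, here is a minimal, self-contained sketch. The
`stub_*` types only mirror the fields freed in this hunk and are not the
real Git structs. The function name and signature are hypothetical, not
something this series defines.

```c
#include <stdlib.h>

/* Stand-ins for the real Git structs: only the fields freed in this hunk. */
struct stub_value { void *buf; };
struct stub_cached_object_entry { struct stub_value value; };
struct stub_odb_source { char *path; };

struct stub_odb_source_inmemory {
	struct stub_odb_source base;
	struct stub_cached_object_entry *objects;
	size_t objects_nr, objects_alloc;
};

/*
 * Hypothetical helper: free everything the source owns and reset its
 * fields, so a caller like odb_free() only needs this call followed by
 * free(source). Safe to call with NULL or on an already released source.
 */
void stub_odb_source_inmemory_release(struct stub_odb_source_inmemory *source)
{
	if (!source)
		return;

	for (size_t i = 0; i < source->objects_nr; i++)
		free(source->objects[i].value.buf);
	free(source->objects);
	free(source->base.path);

	source->objects = NULL;
	source->objects_nr = source->objects_alloc = 0;
	source->base.path = NULL;
}
```

With something like this, the loop in odb_free() would shrink to a
release call plus a free() of the source itself.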

>  
>  	string_list_clear(&o->submodule_source_paths, 0);
>  
> diff --git a/odb.h b/odb.h
> index 3a711f6547..3d20270a05 100644
> --- a/odb.h
> +++ b/odb.h
> @@ -8,6 +8,7 @@
>  #include "thread-utils.h"
>  
>  struct cached_object_entry;
> +struct odb_source_inmemory;
>  struct packed_git;
>  struct repository;
>  struct strbuf;
> @@ -98,8 +99,7 @@ struct object_database {
>  	 * to write them into the object store (e.g. a browse-only
>  	 * application).
>  	 */
> -	struct cached_object_entry *cached_objects;
> -	size_t cached_object_nr, cached_object_alloc;
> +	struct odb_source_inmemory *inmemory_objects;

We now store an inmemory ODB source instead of the cached object info
directly. Makes sense.

>  
>  	/*
>  	 * A fast, rough count of the number of objects in the repository.
> diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
> new file mode 100644
> index 0000000000..c7ac5c24f0
> --- /dev/null
> +++ b/odb/source-inmemory.c
> @@ -0,0 +1,12 @@
> +#include "git-compat-util.h"
> +#include "odb/source-inmemory.h"
> +
> +struct odb_source_inmemory *odb_source_inmemory_new(struct object_database *odb)
> +{
> +	struct odb_source_inmemory *source;
> +
> +	CALLOC_ARRAY(source, 1);
> +	odb_source_init(&source->base, odb, ODB_SOURCE_INMEMORY, "source", false);

Huh, so we set the path for the `struct odb_source` to "source". In the
context of an inmemory source, a path doesn't make much sense. I suspect
that storing a path is only really useful in the context of the files
ODB source. Is there a reason for us to keep it in the generic ODB
source?

> +
> +	return source;
> +}
> diff --git a/odb/source-inmemory.h b/odb/source-inmemory.h
> new file mode 100644
> index 0000000000..95477bf36d
> --- /dev/null
> +++ b/odb/source-inmemory.h
> @@ -0,0 +1,35 @@
> +#ifndef ODB_SOURCE_INMEMORY_H
> +#define ODB_SOURCE_INMEMORY_H
> +
> +#include "odb/source.h"
> +
> +struct cached_object_entry;
> +
> +/*
> + * An inmemory source that you can write objects to that shall be made
> + * available for reading, but that shouldn't ever be persisted to disk. Note
> + * that any objects written to this source will be stored in memory, so the
> + * number of objects you can store is limited by available system memory.
> + */
> +struct odb_source_inmemory {
> +	struct odb_source base;
> +
> +	struct cached_object_entry *objects;
> +	size_t objects_nr, objects_alloc;
> +};

This new ODB source now just contains the object cache info. Looks good.
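
Since `base` is embedded in the struct, generic code can pass around a
pointer to the base while the inmemory code recovers its own struct from
it. A rough sketch of that pattern, using stub types rather than the real
Git structs. The type tag and the downcast helper are hypothetical names
for illustration only.

```c
#include <stddef.h>

/* Stand-ins for the real Git types; names are for illustration only. */
enum stub_source_type { STUB_SOURCE_FILES, STUB_SOURCE_INMEMORY };

struct stub_odb_source { enum stub_source_type type; };

struct stub_odb_source_inmemory {
	struct stub_odb_source base;
	size_t objects_nr, objects_alloc;
};

/*
 * Hypothetical helper: recover the inmemory source from its embedded
 * base. Returns NULL when the base belongs to another kind of source.
 */
struct stub_odb_source_inmemory *stub_inmemory_downcast(struct stub_odb_source *source)
{
	if (!source || source->type != STUB_SOURCE_INMEMORY)
		return NULL;
	/* Step back from the member to the start of the containing struct. */
	return (struct stub_odb_source_inmemory *)
		((char *)source - offsetof(struct stub_odb_source_inmemory, base));
}
```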

-Justin
