From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Karthik Nayak <karthik.188@gmail.com>,
	 Justin Tobler <jltobler@gmail.com>
Subject: [PATCH v2 00/12] Stop depending on `the_repository` in object-related subsystems
Date: Thu, 06 Mar 2025 16:10:24 +0100
Message-ID: <20250306-b4-pks-objects-without-the-repository-v2-0-f3465327be69@pks.im>
In-Reply-To: <20250303-b4-pks-objects-without-the-repository-v1-0-c5dd43f2476e@pks.im>

Hi,

this patch series is another step to remove our dependency on the global
`the_repository` variable. The series focusses on subsystems related to
objects.
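
As a minimal sketch of the pattern applied throughout the series, the
snippet below contrasts an accessor that reads the global with one that
takes the repository as a parameter. The struct definitions are heavily
simplified stand-ins, and `legacy_get_max_object_index()` and
`count_parsed_objects()` are hypothetical helpers added for
illustration; only the shape of `get_max_object_index()` and
`get_indexed_object()` follows the range-diff for patch 2.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; not Git's real definitions. */
struct object {
	int type;
};

struct parsed_object_pool {
	struct object **obj_hash;
	unsigned int obj_hash_size;
};

struct repository {
	struct parsed_object_pool *parsed_objects;
};

/* Process-wide global that callers used to depend on implicitly. */
struct repository *the_repository;

/* Old shape (hypothetical name): only works for the global repository. */
unsigned int legacy_get_max_object_index(void)
{
	return the_repository->parsed_objects->obj_hash_size;
}

/* New shape: the caller decides which repository is inspected. */
unsigned int get_max_object_index(const struct repository *repo)
{
	assert(repo && repo->parsed_objects);
	return repo->parsed_objects->obj_hash_size;
}

struct object *get_indexed_object(const struct repository *repo,
				  unsigned int idx)
{
	assert(repo && repo->parsed_objects);
	return repo->parsed_objects->obj_hash[idx];
}

/* Hypothetical caller: counts the non-empty buckets of any repository. */
unsigned int count_parsed_objects(const struct repository *repo)
{
	unsigned int i, n = 0, max = get_max_object_index(repo);

	for (i = 0; i < max; i++)
		if (get_indexed_object(repo, i))
			n++;
	return n;
}
```

A caller that only ever handles one repository can keep passing
`the_repository` explicitly, which is what most converted callsites in
the range-diff do.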

The intent here is to work towards libification of the whole subsystem
so that we can start splitting out something like an object "backend".
It is thus part of a set of refactorings aimed at allowing pluggable
object databases eventually. I'm not discussing that bigger effort yet,
mostly because it's still taking shape. So this patch series contains
things that make sense standalone, even if pluggable ODBs never get to
be a thing.

Note that this patch series stops short of dropping `the_repository` in
"object-file.c". This is a bigger undertaking, so I'm pushing that into
the next patch series.

The series is built on top of cb0ae672aea (A bit more post -rc0,
2025-02-27) with ps/path-sans-the-repository at 028f618658e (path:
adjust last remaining users of `the_repository`, 2025-02-07) merged into
it.

Changes in v2:
  - Point out why t1050 had to be adapted.
  - Drop the rename of `get_max_object_index()` and
    `get_indexed_object()`.
  - Fix a couple of commit message typos.
  - Link to v1: https://lore.kernel.org/r/20250303-b4-pks-objects-without-the-repository-v1-0-c5dd43f2476e@pks.im

Thanks!

Patrick

---
Patrick Steinhardt (12):
      csum-file: stop depending on `the_repository`
      object: stop depending on `the_repository`
      pack-write: stop depending on `the_repository` and `the_hash_algo`
      environment: move access to "core.bigFileThreshold" into repo settings
      pack-check: stop depending on `the_repository`
      pack-revindex: stop depending on `the_repository`
      pack-bitmap-write: stop depending on `the_repository`
      object-file-convert: stop depending on `the_repository`
      delta-islands: stop depending on `the_repository`
      object-file: split out logic regarding hash algorithms
      hash: fix "-Wsign-compare" warnings
      hash: stop depending on `the_repository` in `null_oid()`

 Makefile                                     |   1 +
 archive.c                                    |   4 +-
 blame.c                                      |   2 +-
 branch.c                                     |   2 +-
 builtin/checkout.c                           |   6 +-
 builtin/clone.c                              |   2 +-
 builtin/describe.c                           |   2 +-
 builtin/diff.c                               |   5 +-
 builtin/fast-export.c                        |  10 +-
 builtin/fast-import.c                        |   8 +-
 builtin/fsck.c                               |   6 +-
 builtin/grep.c                               |   4 +-
 builtin/index-pack.c                         |  16 +-
 builtin/log.c                                |   2 +-
 builtin/ls-files.c                           |   2 +-
 builtin/name-rev.c                           |   4 +-
 builtin/pack-objects.c                       |  17 +-
 builtin/prune.c                              |   2 +-
 builtin/rebase.c                             |   2 +-
 builtin/receive-pack.c                       |   2 +-
 builtin/submodule--helper.c                  |  36 ++--
 builtin/tag.c                                |   2 +-
 builtin/unpack-objects.c                     |   5 +-
 builtin/update-ref.c                         |   2 +-
 builtin/worktree.c                           |   2 +-
 bulk-checkin.c                               |   4 +-
 combine-diff.c                               |   2 +-
 commit-graph.c                               |   9 +-
 commit.c                                     |   2 +-
 config.c                                     |   5 -
 csum-file.c                                  |  28 +--
 csum-file.h                                  |  12 +-
 delta-islands.c                              |  14 +-
 delta-islands.h                              |   2 +-
 diff-lib.c                                   |  10 +-
 diff-no-index.c                              |  28 +--
 diff.c                                       |  14 +-
 diff.h                                       |   2 +-
 dir.c                                        |   2 +-
 environment.c                                |   1 -
 environment.h                                |   1 -
 grep.c                                       |   2 +-
 hash.c                                       | 277 +++++++++++++++++++++++++
 hash.h                                       |   4 +-
 log-tree.c                                   |   2 +-
 merge-ort.c                                  |  26 +--
 merge-recursive.c                            |  12 +-
 meson.build                                  |   1 +
 midx-write.c                                 |  12 +-
 midx.c                                       |   3 +-
 notes-merge.c                                |   2 +-
 notes.c                                      |   2 +-
 object-file-convert.c                        |  29 +--
 object-file-convert.h                        |   3 +-
 object-file.c                                | 292 +--------------------------
 object.c                                     |  21 +-
 object.h                                     |  10 +-
 pack-bitmap-write.c                          |  36 ++--
 pack-bitmap.c                                |  15 +-
 pack-bitmap.h                                |   1 +
 pack-check.c                                 |  12 +-
 pack-revindex.c                              |  35 ++--
 pack-write.c                                 |  55 +++--
 pack.h                                       |  11 +-
 parse-options-cb.c                           |   2 +-
 range-diff.c                                 |   2 +-
 reachable.c                                  |   6 +-
 read-cache.c                                 |   4 +-
 refs.c                                       |  12 +-
 refs/debug.c                                 |   2 +-
 refs/files-backend.c                         |   2 +-
 repo-settings.c                              |  20 ++
 repo-settings.h                              |   5 +
 reset.c                                      |   2 +-
 revision.c                                   |   3 +-
 sequencer.c                                  |  10 +-
 shallow.c                                    |  10 +-
 streaming.c                                  |   3 +-
 submodule-config.c                           |   2 +-
 submodule.c                                  |  28 +--
 t/helper/test-ref-store.c                    |   2 +-
 t/helper/test-submodule-nested-repo-config.c |   2 +-
 t/t1050-large.sh                             |   3 +-
 tree-diff.c                                  |   4 +-
 upload-pack.c                                |  14 +-
 wt-status.c                                  |   4 +-
 xdiff-interface.c                            |   2 +-
 87 files changed, 676 insertions(+), 613 deletions(-)

Range-diff versus v1:

 1:  48ad4678dd2 =  1:  3fefb9537f1 csum-file: stop depending on `the_repository`
 2:  91843ef439a !  2:  11cf55dfa1d object: stop depending on `the_repository`
    @@ builtin/fsck.c: static void check_connectivity(void)
      
      	/* Look up all the requirements, warn about missing objects.. */
     -	max = get_max_object_index();
    -+	max = repo_get_max_object_index(the_repository);
    ++	max = get_max_object_index(the_repository);
      	if (verbose)
      		fprintf_ln(stderr, _("Checking connectivity (%d objects)"), max);
      
      	for (i = 0; i < max; i++) {
     -		struct object *obj = get_indexed_object(i);
    -+		struct object *obj = repo_get_indexed_object(the_repository, i);
    ++		struct object *obj = get_indexed_object(the_repository, i);
      
      		if (obj)
      			check_object(obj);
    @@ builtin/index-pack.c: static unsigned check_objects(void)
      	unsigned i, max, foreign_nr = 0;
      
     -	max = get_max_object_index();
    -+	max = repo_get_max_object_index(the_repository);
    ++	max = get_max_object_index(the_repository);
      
      	if (verbose)
      		progress = start_delayed_progress(the_repository,
    @@ builtin/index-pack.c: static unsigned check_objects(void)
      
      	for (i = 0; i < max; i++) {
     -		foreign_nr += check_object(get_indexed_object(i));
    -+		foreign_nr += check_object(repo_get_indexed_object(the_repository, i));
    ++		foreign_nr += check_object(get_indexed_object(the_repository, i));
      		display_progress(progress, i + 1);
      	}
      
    @@ builtin/name-rev.c: int cmd_name_rev(int argc,
      		int i, max;
      
     -		max = get_max_object_index();
    -+		max = repo_get_max_object_index(the_repository);
    ++		max = get_max_object_index(the_repository);
      		for (i = 0; i < max; i++) {
     -			struct object *obj = get_indexed_object(i);
    -+			struct object *obj = repo_get_indexed_object(the_repository, i);
    ++			struct object *obj = get_indexed_object(the_repository, i);
      			if (!obj || obj->type != OBJ_COMMIT)
      				continue;
      			show_name(obj, NULL,
    @@ object.c
      #include "loose.h"
      
     -unsigned int get_max_object_index(void)
    -+unsigned int repo_get_max_object_index(const struct repository *repo)
    ++unsigned int get_max_object_index(const struct repository *repo)
      {
     -	return the_repository->parsed_objects->obj_hash_size;
     +	return repo->parsed_objects->obj_hash_size;
      }
      
     -struct object *get_indexed_object(unsigned int idx)
    -+struct object *repo_get_indexed_object(const struct repository *repo,
    ++struct object *get_indexed_object(const struct repository *repo,
     +				       unsigned int idx)
      {
     -	return the_repository->parsed_objects->obj_hash[idx];
    @@ object.h: int type_from_string_gently(const char *str, ssize_t, int gentle);
       * Return the current number of buckets in the object hashmap.
       */
     -unsigned int get_max_object_index(void);
    -+unsigned int repo_get_max_object_index(const struct repository *repo);
    ++unsigned int get_max_object_index(const struct repository *repo);
      
      /*
       * Return the object from the specified bucket in the object hashmap.
       */
     -struct object *get_indexed_object(unsigned int);
    -+struct object *repo_get_indexed_object(const struct repository *repo,
    ++struct object *get_indexed_object(const struct repository *repo,
     +				       unsigned int);
      
      /*
    @@ shallow.c: static void paint_down(struct paint_info *info, const struct object_i
      	}
      
     -	nr = get_max_object_index();
    -+	nr = repo_get_max_object_index(the_repository);
    ++	nr = get_max_object_index(the_repository);
      	for (i = 0; i < nr; i++) {
     -		struct object *o = get_indexed_object(i);
    -+		struct object *o = repo_get_indexed_object(the_repository, i);
    ++		struct object *o = get_indexed_object(the_repository, i);
      		if (o && o->type == OBJ_COMMIT)
      			o->flags &= ~SEEN;
      	}
    @@ shallow.c: void assign_shallow_commits_to_refs(struct shallow_info *info,
      	 * (new) shallow commits.
      	 */
     -	nr = get_max_object_index();
    -+	nr = repo_get_max_object_index(the_repository);
    ++	nr = get_max_object_index(the_repository);
      	for (i = 0; i < nr; i++) {
     -		struct object *o = get_indexed_object(i);
    -+		struct object *o = repo_get_indexed_object(the_repository, i);
    ++		struct object *o = get_indexed_object(the_repository, i);
      		if (!o || o->type != OBJ_COMMIT)
      			continue;
      
    @@ upload-pack.c: static int do_reachable_revlist(struct child_process *cmd,
      
     -	for (i = get_max_object_index(); 0 < i; ) {
     -		o = get_indexed_object(--i);
    -+	for (i = repo_get_max_object_index(the_repository); 0 < i; ) {
    -+		o = repo_get_indexed_object(the_repository, --i);
    ++	for (i = get_max_object_index(the_repository); 0 < i; ) {
    ++		o = get_indexed_object(the_repository, --i);
      		if (!o)
      			continue;
      		if (reachable && o->type == OBJ_COMMIT)
    @@ upload-pack.c: static int get_reachable_list(struct upload_pack_data *data,
      	}
     -	for (i = get_max_object_index(); 0 < i; i--) {
     -		o = get_indexed_object(i - 1);
    -+	for (i = repo_get_max_object_index(the_repository); 0 < i; i--) {
    -+		o = repo_get_indexed_object(the_repository, i - 1);
    ++	for (i = get_max_object_index(the_repository); 0 < i; i--) {
    ++		o = get_indexed_object(the_repository, i - 1);
      		if (o && o->type == OBJ_COMMIT &&
      		    (o->flags & TMP_MARK)) {
      			add_object_array(o, NULL, reachable);
 3:  d88c3aa6fc5 =  3:  5f2dcc39b7d pack-write: stop depending on `the_repository` and `the_hash_algo`
 4:  be7bd50c73c !  4:  21677355fed environment: move access to "core.bigFileThreshold" into repo settings
    @@ Commit message
         Refactor the code so that we instead store the value in `struct
         repo_settings`, where the value is computed as-needed and cached.
     
    +    Note that this change requires us to adapt one test in t1050 that
    +    verifies that we die when parsing an invalid "core.bigFileThreshold"
    +    value. The exercised Git command doesn't use the value at all, and thus
    +    it won't hit the new code path that parses the value. This is addressed
    +    by using git-hash-object(1) instead, which does read the value.
    +
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## archive.c ##
 5:  349fd4a74f4 =  5:  def4ca73269 pack-check: stop depending on `the_repository`
 6:  dcc76a793a3 =  6:  cb8c7246af0 pack-revindex: stop depending on `the_repository`
 7:  5964138dad8 =  7:  d97ea8590af pack-bitmap-write: stop depending on `the_repository`
 8:  a35396c8981 !  8:  244d21ba448 object-file-convert: stop depending on `the_repository`
    @@ Commit message
         using `the_hash_algo`. All of these callsites are transitively called
         from `convert_object_file()`, which indeed has no repo as input.
     
    -    Refactor the function so that it receives a repository as parameter and
    -    pass it through to all internal functions to get rid of the dependency.
    -    Remove the `USE_THE_REPOSITORY_VARIABLE` define.
    +    Refactor the function so that it receives a repository as a parameter
    +    and pass it through to all internal functions to get rid of the
    +    dependency. Remove the `USE_THE_REPOSITORY_VARIABLE` define.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
 9:  fdb9aebb23d !  9:  7f44a1ee7d6 delta-islands: stop depending on `the_repository`
    @@ Commit message
         using `the_hash_algo`.
     
         Refactor the code to stop using `the_repository`. In most cases this is
    -    trivial because we already had a repository availabe in the calling
    +    trivial because we already had a repository available in the calling
         context, with the only exception being `propagate_island_marks()`. Adapt
         it so that the repository gets passed in via a parameter.
     
10:  0db58f487f9 = 10:  df2a72c7a16 object-file: split out logic regarding hash algorithms
11:  0ce33b057d3 = 11:  b5cdfff5719 hash: fix "-Wsign-compare" warnings
12:  e5644afa940 = 12:  36d09bc2707 hash: stop depending on `the_repository` in `null_oid()`
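
The note added to the commit message of patch 4 says the value is now
computed as-needed and cached, so a command that never asks for it
never parses it. The sketch below illustrates that behaviour under
stated assumptions: `struct repo_settings_sketch`, its fields and
`get_big_file_threshold()` are hypothetical names, the raw string stands
in for a config lookup, and an error code is returned where Git would
die. Unlike Git's config parser, the sketch does not accept unit
suffixes.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Documented default of core.bigFileThreshold: 512 MiB. */
#define DEFAULT_BIG_FILE_THRESHOLD (512UL * 1024 * 1024)

/* Hypothetical stand-in for the cached per-repository settings. */
struct repo_settings_sketch {
	const char *raw_big_file_threshold; /* NULL when unset */
	unsigned long big_file_threshold;   /* valid once parsed */
	int big_file_threshold_parsed;
	int parse_count;                    /* bookkeeping for the demo */
};

/*
 * Parse the setting on first use and cache the result.
 * Returns 0 on success and -1 if the configured value is invalid.
 */
int get_big_file_threshold(struct repo_settings_sketch *s,
			   unsigned long *out)
{
	assert(s && out);

	if (!s->big_file_threshold_parsed) {
		unsigned long value = DEFAULT_BIG_FILE_THRESHOLD;

		if (s->raw_big_file_threshold) {
			char *end;

			errno = 0;
			value = strtoul(s->raw_big_file_threshold, &end, 10);
			if (errno || end == s->raw_big_file_threshold ||
			    *end != '\0')
				return -1;
		}

		s->big_file_threshold = value;
		s->big_file_threshold_parsed = 1;
		s->parse_count++;
	}

	*out = s->big_file_threshold;
	return 0;
}
```

With this shape, an invalid value goes unnoticed until some code path
calls the getter, which is why the t1050 test had to switch to a
command that actually reads the setting.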

---
base-commit: e2cb568e11f4ceb427ba4205e6b8a4426d26be12
change-id: 20250210-b4-pks-objects-without-the-repository-6ba8398f7cc0

