Git development
 help / color / mirror / Atom feed
* [PATCH v2 7/8] reftable/merged: really reuse buffers to compute record keys
From: Patrick Steinhardt @ 2023-12-28  6:28 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1703743174.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 1486 bytes --]

In 829231dc20 (reftable/merged: reuse buffer to compute record keys,
2023-12-11), we have refactored the merged iterator to reuse a set of
buffers that it would otherwise have to reallocate on every single
iteration. Unfortunately, there was a brown-paper-bag-style bug here as
we continue to release these buffers after the iteration, and thus we
have essentially gained nothing.

Fix this performance issue by not releasing those buffers on iteration
anymore, where we instead rely on `merged_iter_close()` to release the
buffers for us.

Using `git show-ref --quiet` in a repository with ~350k refs this leads
to a significant drop in allocations. Before:

    HEAP SUMMARY:
        in use at exit: 21,163 bytes in 193 blocks
      total heap usage: 1,410,148 allocs, 1,409,955 frees, 61,976,068 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 21,163 bytes in 193 blocks
      total heap usage: 708,058 allocs, 707,865 frees, 36,783,255 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/merged.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/reftable/merged.c b/reftable/merged.c
index 556bb5c556..a28bb99aaf 100644
--- a/reftable/merged.c
+++ b/reftable/merged.c
@@ -128,8 +128,6 @@ static int merged_iter_next_entry(struct merged_iter *mi,
 
 done:
 	reftable_record_release(&entry.rec);
-	strbuf_release(&mi->entry_key);
-	strbuf_release(&mi->key);
 	return err;
 }
 
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 8/8] reftable/merged: transfer ownership of records when iterating
From: Patrick Steinhardt @ 2023-12-28  6:28 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1703743174.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 2155 bytes --]

When iterating over recods with the merged iterator we put the records
into a priority queue before yielding them to the caller. This means
that we need to allocate the contents of these records before we can
pass them over to the caller.

The handover to the caller is quite inefficient though because we first
deallocate the record passed in by the caller and then copy over the new
record, which requires us to reallocate memory.

Refactor the code to instead transfer ownership of the new record to the
caller. So instead of reallocating all contents, we now release the old
record and then copy contents of the new record into place.

The following benchmark of `git show-ref --quiet` in a repository with
around 350k refs shows a clear improvement. Before:

    HEAP SUMMARY:
        in use at exit: 21,163 bytes in 193 blocks
      total heap usage: 708,058 allocs, 707,865 frees, 36,783,255 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 21,163 bytes in 193 blocks
      total heap usage: 357,007 allocs, 356,814 frees, 24,193,602 bytes allocated

This shows that we now have roundabout a single allocation per record
that we're yielding from the iterator. Ideally, we'd also get rid of
this allocation so that the number of allocations doesn't scale with the
number of refs anymore. This would require some larger surgery though
because the memory is owned by the priority queue before transferring it
over to the caller.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/merged.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/reftable/merged.c b/reftable/merged.c
index a28bb99aaf..a52914d667 100644
--- a/reftable/merged.c
+++ b/reftable/merged.c
@@ -124,10 +124,12 @@ static int merged_iter_next_entry(struct merged_iter *mi,
 		reftable_record_release(&top.rec);
 	}
 
-	reftable_record_copy_from(rec, &entry.rec, hash_size(mi->hash_id));
+	reftable_record_release(rec);
+	*rec = entry.rec;
 
 done:
-	reftable_record_release(&entry.rec);
+	if (err)
+		reftable_record_release(&entry.rec);
 	return err;
 }
 
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v4] sideband.c: remove redundant 'NEEDSWORK' tag
From: Chandra Pratap via GitGitGadget @ 2023-12-28  8:01 UTC (permalink / raw)
  To: git; +Cc: Torsten Bögershausen, Chandra Pratap, Chandra Pratap
In-Reply-To: <pull.1625.v3.git.1703672407895.gitgitgadget@gmail.com>

From: Chandra Pratap <chandrapratap3519@gmail.com>

Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
---
    sideband.c: replace int with size_t for clarity

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1625%2FChand-ra%2Fdusra-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1625/Chand-ra/dusra-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1625

Range-diff vs v3:

 1:  273415aa6a4 ! 1:  8c003256e5b sideband.c: remove redundant 'NEEDSWORK' tag
     @@ sideband.c: void list_config_color_sideband_slots(struct string_list *list, cons
        *
      - * NEEDSWORK: use "size_t n" instead for clarity.
      + * It is fine to use "int n" here instead of "size_t n" as all calls to this
     -+ * function pass an 'int' parameter.
     ++ * function pass an 'int' parameter. Additionally, the buffer involved in
     ++ * storing these 'int' values takes input from a packet via the pkt-line
     ++ * interface, which is capable of transferring only 64kB at a time.
        */
       static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
       {


 sideband.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/sideband.c b/sideband.c
index 6cbfd391c47..266a67342be 100644
--- a/sideband.c
+++ b/sideband.c
@@ -69,7 +69,10 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
  * of the line. This should be called for a single line only, which is
  * passed as the first N characters of the SRC array.
  *
- * NEEDSWORK: use "size_t n" instead for clarity.
+ * It is fine to use "int n" here instead of "size_t n" as all calls to this
+ * function pass an 'int' parameter. Additionally, the buffer involved in
+ * storing these 'int' values takes input from a packet via the pkt-line
+ * interface, which is capable of transferring only 64kB at a time.
  */
 static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
 {

base-commit: 1a87c842ece327d03d08096395969aca5e0a6996
-- 
gitgitgadget

^ permalink raw reply related

* Re: reftable: How to represent deleted referees of symrefs in the reflog?
From: Patrick Steinhardt @ 2023-12-28  8:14 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: git, Jonathan Nieder, Terry Parker, Luca Milanesio
In-Reply-To: <CAOw_e7aDPTjDj_K3yyyatWkw3R-oT+_u1LgZyzQEE=6LnH+pmg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3802 bytes --]

On Wed, Dec 20, 2023 at 08:03:00PM +0100, Han-Wen Nienhuys wrote:
> On Tue, Nov 21, 2023 at 2:46 PM Patrick Steinhardt <ps@pks.im> wrote:
> > To me it seems like deletions in this case only delete a particular log
> > entry instead of the complete log for a particular reference. And some
> > older discussion [1] seems to confirm my hunch that a complete reflog is
> > deleted not with `log_type = 0x0`, but instead by writing the null
> > object ID as new ID.
> 
> No, writing a null OID (more precisely a transition from $SHA1 =>
> $ZERO) means that a branch was at $SHA1, and then was deleted. The
> reflog continues to exist, and new entries may be added by reviving
> the branch. That would add a $ZERO => $NEWSHA transition, but the
> history of the branch prior to its deletion is retained.
> 
> >  # This behaviour is a bit more on the weird side. We delete the
> >  # referee, and that causes the files backend to claim that the reflog
> >  # for HEAD is gone, as well. The reflog still exists though, as
> >  # demonstrated in the next command.
> >  $ git update-ref -m delete-main -d refs/heads/main
> >  $ git reflog show HEAD
> > fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
> 
> This looks wrong to me. HEAD has a history, and that history didn't go
> away because the current branch happened to be deleted.
> 
> > It kind of feels like the second step in the files backend where the
> > reflog is claimed to not exist is buggy -- I'd have expected to still
> 
> right, I agree.
> 
> > see the reflog, as the HEAD reference exists quite alright and has never
> > stopped to exist. And in the third step, I'd have expected to see three
> > reflog entries, including the deletion that is indeed still present in
> > the on-disk logfile.
> >
> > But with the reftable backend the problem becomes worse: we cannot even
> > represent the fact that the reflog still exists, but that the deletion
> > of the referee has caused the HEAD to point to the null OID, because the
> > null OID indicates complete deletion of the reflog.
> 
> This doesn't match my recollection. See
> https://github.com/git/git/pull/1215, or more precisely
> https://github.com/git/git/blob/3b2c5ddb28fa42a8c944373bea2ca756b1723908/refs/reftable-backend.c#L1582
> 
> Removing the entire reflog means removing all the individual entries
> using tombstones.

Yeah, indeed. I was misinterpreting older discussions here, and now use
individual tombstones which does work as expected. It's a bit
unfortunate that deletion of the reflog thus scales with its size, but
for now I think that's okay. We can still iterate on this if this ever
proves to become a problem.

> > Consequentially, if
> > we wrote the null OID, we'd only be able to see the last log entry here.
> >
> > It may totally be that I'm missing something obvious here. But if not,
> > it leaves us with a couple of options for how to fix it:
> >
> >     1. Disregard this edge case and accept that the reftable backend
> >        does not do exactly the same thing as the files backend in very
> >        specific situations like this.
> 
> I remember discussing with Jun that it would be acceptable to have
> slight semantic differences if unavoidable for the reflogs, and there
> should be a record of this in the list.  I think there will always be
> some differences: for example, dropping reflogs because of a dir/file
> conflict seems like a bug rather than a feature.

Initially I'm aiming for matching semantics, because this will make it a
ton easier to land the initial implementation of the backend. But I do
agree that eventually we may want to iterate and allow for diverging
behaviour between the backends.

Thanks!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 01/12] t: introduce DEFAULT_REPO_FORMAT prereq
From: Patrick Steinhardt @ 2023-12-28  8:55 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git
In-Reply-To: <CAOLa=ZSY-FBUMOx9SHkJqf7XteQC=hVoYxo73q=E1wL3QOBYtA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 724 bytes --]

On Fri, Dec 22, 2023 at 03:41:10AM -0800, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
[snip]
> > -test_expect_success SHA1 'git branch -m q q2 without config should succeed' '
> > +test_expect_success DEFAULT_REPO_FORMAT 'git branch -m q q2 without config should succeed' '
> >  	git branch -m q q2 &&
> >  	git branch -m q2 q
> >  '
> 
> My only concern is whether we'll end up blindly adding
> DEFAULT_REPO_FORMAT for tests where only SHA1 is a prereq or only where
> the new format extension is a prereq.

Yeah, this is of course a possibility with this new prereq. I don't
really know what to do about it though, and think that it does serve a
sensible purpose. So... dunno.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 03/12] refs: refactor logic to look up storage backends
From: Patrick Steinhardt @ 2023-12-28  8:56 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git
In-Reply-To: <CAOLa=ZR1Bc+Acwjma9OM0SvyLg4-HB-S3Pxr66VCKTb0d_tdnw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2044 bytes --]

On Fri, Dec 22, 2023 at 04:38:30AM -0800, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > +
> > +const char *ref_storage_format_to_name(int ref_storage_format)
> > +{
> > +	const struct ref_storage_be *be = find_ref_storage_backend(ref_storage_format);
> > +	if (!be)
> > +		return "unknown";
> > +	return be->name;
> > +}
> >
> 
> Would it make more sense to return NULL here?

The only purpose of this function will be to print error messages, so
"unknown" seems like the better choice.

> >  /*
> >   * How to handle various characters in refnames:
> >   * 0: An acceptable character for refs
> > @@ -2029,12 +2045,12 @@ static struct ref_store *ref_store_init(struct repository *repo,
> >  					const char *gitdir,
> >  					unsigned int flags)
> >  {
> > -	const char *be_name = "files";
> > -	struct ref_storage_be *be = find_ref_storage_backend(be_name);
> > +	int format = REF_STORAGE_FORMAT_FILES;
> > +	const struct ref_storage_be *be = find_ref_storage_backend(format);
> >  	struct ref_store *refs;
> >
> >  	if (!be)
> > -		BUG("reference backend %s is unknown", be_name);
> > +		BUG("reference backend %s is unknown", ref_storage_format_to_name(format));
> >
> 
> This doesn't really get us more information, since be == NULL, calling
> `ref_storage_format_to_name` will again call `find_ref_storage_backend`,
> which will be NULL. I wonder if its better to:
> 
> @@ -2029,12 +2045,12 @@ static struct ref_store *ref_store_init(struct
> repository *repo,
>                      const char *gitdir,
>                      unsigned int flags)
>   {
>  -	const char *be_name = "files";
>  -	struct ref_storage_be *be = find_ref_storage_backend(be_name);
>  +	int format = REF_STORAGE_FORMAT_FILES;
>  +	const struct ref_storage_be *be = find_ref_storage_backend(format);
>      struct ref_store *refs;
> 
>      if (!be)
>  -		BUG("reference backend %s is unknown", be_name);
>  +		BUG("reference backend is unknown");

Good point, will change.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 04/12] setup: start tracking ref storage format when
From: Patrick Steinhardt @ 2023-12-28  8:56 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git
In-Reply-To: <CAOLa=ZQLLqj=78dJN0y51EFf3_XsHnTBYNjkB0cAAGf3dXJ9KA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1230 bytes --]

On Fri, Dec 22, 2023 at 05:09:37AM -0800, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> > diff --git a/refs.c b/refs.c
> > index e87c85897d..d289a5e175 100644
> > --- a/refs.c
> > +++ b/refs.c
> > @@ -2045,12 +2045,12 @@ static struct ref_store *ref_store_init(struct repository *repo,
> >  					const char *gitdir,
> >  					unsigned int flags)
> >  {
> > -	int format = REF_STORAGE_FORMAT_FILES;
> > -	const struct ref_storage_be *be = find_ref_storage_backend(format);
> > +	const struct ref_storage_be *be;
> >  	struct ref_store *refs;
> >
> > +	be = find_ref_storage_backend(repo->ref_storage_format);
> >  	if (!be)
> > -		BUG("reference backend %s is unknown", ref_storage_format_to_name(format));
> > +		BUG("reference backend is unknown");
> >
> >  	refs = be->init(repo, gitdir, flags);
> >  	return refs;
> > diff --git a/refs.h b/refs.h
> > index c6571bcf6c..944a67ac1b 100644
> > --- a/refs.h
> > +++ b/refs.h
> > @@ -11,6 +11,7 @@ struct string_list;
> >  struct string_list_item;
> >  struct worktree;
> >
> > +int default_ref_storage_format(void);
> >
> 
> Is this used/defined somewhere?

No, it's a leftover from a previous iteration. Will drop.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 04/12] setup: start tracking ref storage format when
From: Patrick Steinhardt @ 2023-12-28  8:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqv88ssp4r.fsf@gitster.g>

[-- Attachment #1: Type: text/plain, Size: 1629 bytes --]

On Wed, Dec 20, 2023 at 10:30:28AM -0800, Junio C Hamano wrote:
> There is a topic in-flight that introduces the .compat_hash_algo
> member to the repo_fmt structure.  Seeing a conflict resolution like
> the attached (there are many others that are similar in spirit), I
> have to wonder if we want to add repo_set_ref_storage_format()
> helper function.  There are many assignments to .ref_storage_format
> member after this series is applied.
> 
> Note that I haven't read the series in full, so take the above with
> a grain of salt---it might turn out to be that direct assignment is
> more desirable, I dunno.
> 
> Thanks.

Wrapping this in a `repo_set_ref_storage_format()` doesn't add much for
now, but on the other hand it also doesn't hurt and is more in line with
how we set the other formats. This could also be useful in the future to
have a central place for additional sanity checks, if required. So yeah,
let's do it.

Makes me wonder whether we should then also add the following diff to
"setup: set repository's format on init" when both topics are being
merged together:

diff --git a/setup.c b/setup.c
index 3d980814bc..3d35c78c68 100644
--- a/setup.c
+++ b/setup.c
@@ -2210,6 +2210,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
 	 * format we can update the repository's settings accordingly.
 	 */
 	repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+	repo_set_compat_hash_algo(the_repository, repo_fmt.compat_hash_algo);
 	repo_set_ref_storage_format(the_repository, repo_fmt.ref_storage_format);
 
 	if (!(flags & INIT_DB_SKIP_REFDB))

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 00/12] Introduce `refStorage` extension
From: Patrick Steinhardt @ 2023-12-28  9:57 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano
In-Reply-To: <cover.1703067989.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 12827 bytes --]

Hi,

this is the second version of my patch series that introduces the new
`refStorage` extension. This extension will be used for the upcoming
reftable backend.

Changes compared to v2:

  - Fixed various typos in commit messages.

  - Fixed redundant information when the refstorage extension's value
    isn't understood.

  - Introduced `repo_set_ref_storage_format()`.

Thanks for the feedback so far!

Patrick

Patrick Steinhardt (12):
  t: introduce DEFAULT_REPO_FORMAT prereq
  worktree: skip reading HEAD when repairing worktrees
  refs: refactor logic to look up storage backends
  setup: start tracking ref storage format
  setup: set repository's formats on init
  setup: introduce "extensions.refStorage" extension
  setup: introduce GIT_DEFAULT_REF_FORMAT envvar
  t: introduce GIT_TEST_DEFAULT_REF_FORMAT envvar
  builtin/rev-parse: introduce `--show-ref-format` flag
  builtin/init: introduce `--ref-format=` value flag
  builtin/clone: introduce `--ref-format=` value flag
  t9500: write "extensions.refstorage" into config

 Documentation/config/extensions.txt           | 11 +++
 Documentation/git-clone.txt                   |  6 ++
 Documentation/git-init.txt                    |  7 ++
 Documentation/git-rev-parse.txt               |  3 +
 Documentation/git.txt                         |  5 ++
 Documentation/ref-storage-format.txt          |  1 +
 .../technical/repository-version.txt          |  5 ++
 builtin/clone.c                               | 17 ++++-
 builtin/init-db.c                             | 15 +++-
 builtin/rev-parse.c                           |  4 ++
 refs.c                                        | 34 ++++++---
 refs.h                                        |  3 +
 refs/debug.c                                  |  1 -
 refs/files-backend.c                          |  1 -
 refs/packed-backend.c                         |  1 -
 refs/refs-internal.h                          |  1 -
 repository.c                                  |  6 ++
 repository.h                                  |  7 ++
 setup.c                                       | 63 +++++++++++++++--
 setup.h                                       |  9 ++-
 t/README                                      |  3 +
 t/t0001-init.sh                               | 70 +++++++++++++++++++
 t/t1500-rev-parse.sh                          | 17 +++++
 t/t3200-branch.sh                             |  2 +-
 t/t5601-clone.sh                              | 17 +++++
 t/t9500-gitweb-standalone-no-errors.sh        |  5 ++
 t/test-lib-functions.sh                       |  5 ++
 t/test-lib.sh                                 | 15 +++-
 worktree.c                                    | 24 ++++---
 29 files changed, 323 insertions(+), 35 deletions(-)
 create mode 100644 Documentation/ref-storage-format.txt

Range-diff against v1:
 1:  239ca38efd !  1:  3613439cb7 t: introduce DEFAULT_REPO_FORMAT prereq
    @@ Commit message
         repository format or otherwise they would fail to run, e.g. because they
         fail to detect the correct hash function. While the hash function is the
         only extension right now that creates problems like this, we are about
    -    to add a second extensions for the ref format.
    +    to add a second extension for the ref format.
     
         Introduce a new DEFAULT_REPO_FORMAT prereq that can easily be amended
         whenever we add new format extensions. Next to making any such changes
 2:  e895091025 !  2:  ecf4f1ddee worktree: skip reading HEAD when repairing worktrees
    @@ Commit message
         worktree: skip reading HEAD when repairing worktrees
     
         When calling `git init --separate-git-dir=<new-path>` on a preexisting
    -    repository, then we move the Git directory of that repository to the new
    -    path specified by the user. If there are worktrees present in the Git
    -    repository, we need to repair the worktrees so that their gitlinks point
    -    to the new location of the repository.
    +    repository, we move the Git directory of that repository to the new path
    +    specified by the user. If there are worktrees present in the repository,
    +    we need to repair the worktrees so that their gitlinks point to the new
    +    location of the repository.
     
         This repair logic will load repositories via `get_worktrees()`, which
         will enumerate up and initialize all worktrees. Part of initialization
    @@ Commit message
         format, which does not work.
     
         We do not require the worktree HEADs at all to repair worktrees. So
    -    let's fix issue this by skipping over the step that reads them.
    +    let's fix this issue by skipping over the step that reads them.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
 3:  f712d5ef5b !  3:  12329b99b7 refs: refactor logic to look up storage backends
    @@ refs.c
     -	for (be = refs_backends; be; be = be->next)
     -		if (!strcmp(be->name, name))
     -			return be;
    -+	if (ref_storage_format && ref_storage_format < ARRAY_SIZE(refs_backends))
    ++	if (ref_storage_format < ARRAY_SIZE(refs_backends))
     +		return refs_backends[ref_storage_format];
      	return NULL;
      }
    @@ refs.c: static struct ref_store *ref_store_init(struct repository *repo,
      
      	if (!be)
     -		BUG("reference backend %s is unknown", be_name);
    -+		BUG("reference backend %s is unknown", ref_storage_format_to_name(format));
    ++		BUG("reference backend is unknown");
      
      	refs = be->init(repo, gitdir, flags);
      	return refs;
 4:  6564659d40 !  4:  ddd099fbaf setup: start tracking ref storage format when
    @@ Metadata
     Author: Patrick Steinhardt <ps@pks.im>
     
      ## Commit message ##
    -    setup: start tracking ref storage format when
    +    setup: start tracking ref storage format
     
         In order to discern which ref storage format a repository is supposed to
         use we need to start setting up and/or discovering the format. This
    @@ Commit message
           - The first path is when we create a repository via `init_db()`. When
             we are re-initializing a preexisting repository we need to retain
             the previously used ref storage format -- if the user asked for a
    -        different format then this indicates an erorr and we error out.
    +        different format then this indicates an error and we error out.
             Otherwise we either initialize the repository with the format asked
             for by the user or the default format, which currently is the
             "files" backend.
    @@ refs.c: static struct ref_store *ref_store_init(struct repository *repo,
      
     +	be = find_ref_storage_backend(repo->ref_storage_format);
      	if (!be)
    --		BUG("reference backend %s is unknown", ref_storage_format_to_name(format));
    -+		BUG("reference backend is unknown");
    - 
    - 	refs = be->init(repo, gitdir, flags);
    - 	return refs;
    -
    - ## refs.h ##
    -@@ refs.h: struct string_list;
    - struct string_list_item;
    - struct worktree;
    - 
    -+int default_ref_storage_format(void);
    - int ref_storage_format_by_name(const char *name);
    - const char *ref_storage_format_to_name(int ref_storage_format);
    + 		BUG("reference backend is unknown");
      
     
      ## repository.c ##
    +@@ repository.c: void repo_set_hash_algo(struct repository *repo, int hash_algo)
    + 	repo->hash_algo = &hash_algos[hash_algo];
    + }
    + 
    ++void repo_set_ref_storage_format(struct repository *repo, int format)
    ++{
    ++	repo->ref_storage_format = format;
    ++}
    ++
    + /*
    +  * Attempt to resolve and set the provided 'gitdir' for repository 'repo'.
    +  * Return 0 upon success and a non-zero value upon failure.
     @@ repository.c: int repo_init(struct repository *repo,
      		goto error;
      
      	repo_set_hash_algo(repo, format.hash_algo);
    -+	repo->ref_storage_format = format.ref_storage_format;
    ++	repo_set_ref_storage_format(repo, format.ref_storage_format);
      	repo->repository_format_worktree_config = format.worktree_config;
      
      	/* take ownership of format.partial_clone */
    @@ repository.h: struct repository {
      	/* A unique-id for tracing purposes. */
      	int trace2_repo_id;
      
    +@@ repository.h: void repo_set_gitdir(struct repository *repo, const char *root,
    + 		     const struct set_gitdir_args *extra_args);
    + void repo_set_worktree(struct repository *repo, const char *path);
    + void repo_set_hash_algo(struct repository *repo, int algo);
    ++void repo_set_ref_storage_format(struct repository *repo, int format);
    + void initialize_the_repository(void);
    + RESULT_MUST_BE_USED
    + int repo_init(struct repository *r, const char *gitdir, const char *worktree);
     
      ## setup.c ##
     @@ setup.c: const char *setup_git_directory_gently(int *nongit_ok)
      		}
      		if (startup_info->have_repository) {
      			repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
    -+			the_repository->ref_storage_format =
    -+				repo_fmt.ref_storage_format;
    ++			repo_set_ref_storage_format(the_repository,
    ++						    repo_fmt.ref_storage_format);
      			the_repository->repository_format_worktree_config =
      				repo_fmt.worktree_config;
      			/* take ownership of repo_fmt.partial_clone */
    @@ setup.c: void check_repository_format(struct repository_format *fmt)
      	check_repository_format_gently(get_git_dir(), fmt, NULL);
      	startup_info->have_repository = 1;
      	repo_set_hash_algo(the_repository, fmt->hash_algo);
    -+	the_repository->ref_storage_format =
    -+		fmt->ref_storage_format;
    ++	repo_set_ref_storage_format(the_repository,
    ++				    fmt->ref_storage_format);
      	the_repository->repository_format_worktree_config =
      		fmt->worktree_config;
      	the_repository->repository_format_partial_clone =
    @@ setup.c: void create_reference_database(const char *initial_branch, int quiet)
      	safe_create_dir(git_path("refs"), 1);
      	adjust_shared_perm(git_path("refs"));
      
    -+	the_repository->ref_storage_format = ref_storage_format;
    ++	repo_set_ref_storage_format(the_repository, ref_storage_format);
      	if (refs_init_db(&err))
      		die("failed to set up refs db: %s", err.buf);
      
 5:  f90a63d63c !  5:  01a1e58a97 setup: set repository's formats on init
    @@ setup.c: int init_db(const char *git_dir, const char *real_git_dir,
     +	 * Now that we have set up both the hash algorithm and the ref storage
     +	 * format we can update the repository's settings accordingly.
     +	 */
    -+	the_repository->hash_algo = &hash_algos[repo_fmt.hash_algo];
    -+	the_repository->ref_storage_format = repo_fmt.ref_storage_format;
    ++	repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
    ++	repo_set_ref_storage_format(the_repository, repo_fmt.ref_storage_format);
     +
      	if (!(flags & INIT_DB_SKIP_REFDB))
      		create_reference_database(repo_fmt.ref_storage_format,
 6:  beeb182f28 =  6:  0a586fa648 setup: introduce "extensions.refStorage" extension
 7:  dd91a75da4 =  7:  6d8754f73a setup: introduce GIT_DEFAULT_REF_FORMAT envvar
 8:  ed3bf008cd =  8:  c645932f3d t: introduce GIT_TEST_DEFAULT_REF_FORMAT envvar
 9:  8a3d950d69 =  9:  761d647770 builtin/rev-parse: introduce `--show-ref-format` flag
10:  4d98b53553 = 10:  e382b5bf08 builtin/init: introduce `--ref-format=` value flag
11:  71cf0ce827 = 11:  257233658d builtin/clone: introduce `--ref-format=` value flag
12:  bbe2fbb154 ! 12:  b8cd06ec53 t9500: write "extensions.refstorage" into config
    @@ Commit message
         t9500: write "extensions.refstorage" into config
     
         In t9500 we're writing a custom configuration that sets up gitweb. This
    -    requires us manually ensure that the repository format is configured as
    -    required, including both the repository format version and extensions.
    -    With the introduction of the "extensions.refStorage" extension we need
    -    to update the test to also write this new one.
    +    requires us to manually ensure that the repository format is configured
    +    as required, including both the repository format version and
    +    extensions. With the introduction of the "extensions.refStorage"
    +    extension we need to update the test to also write this new one.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     

base-commit: e79552d19784ee7f4bbce278fe25f93fbda196fa
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH v2 01/12] t: introduce DEFAULT_REPO_FORMAT prereq
From: Patrick Steinhardt @ 2023-12-28  9:57 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano
In-Reply-To: <cover.1703753910.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 1587 bytes --]

A limited number of tests require repositories to have the default
repository format or otherwise they would fail to run, e.g. because they
fail to detect the correct hash function. While the hash function is the
only extension right now that creates problems like this, we are about
to add a second extension for the ref format.

Introduce a new DEFAULT_REPO_FORMAT prereq that can easily be amended
whenever we add new format extensions. Next to making any such changes
easier on us, the prerequisite's name should also help to clarify the
intent better.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t3200-branch.sh | 2 +-
 t/test-lib.sh     | 4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
index 6a316f081e..de7d3014e4 100755
--- a/t/t3200-branch.sh
+++ b/t/t3200-branch.sh
@@ -519,7 +519,7 @@ EOF
 
 mv .git/config .git/config-saved
 
-test_expect_success SHA1 'git branch -m q q2 without config should succeed' '
+test_expect_success DEFAULT_REPO_FORMAT 'git branch -m q q2 without config should succeed' '
 	git branch -m q q2 &&
 	git branch -m q2 q
 '
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 876b99562a..dc03f06b8e 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1936,6 +1936,10 @@ test_lazy_prereq SHA1 '
 	esac
 '
 
+test_lazy_prereq DEFAULT_REPO_FORMAT '
+	test_have_prereq SHA1
+'
+
 # Ensure that no test accidentally triggers a Git command
 # that runs the actual maintenance scheduler, affecting a user's
 # system permanently.
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 02/12] worktree: skip reading HEAD when repairing worktrees
From: Patrick Steinhardt @ 2023-12-28  9:57 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano
In-Reply-To: <cover.1703753910.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 4298 bytes --]

When calling `git init --separate-git-dir=<new-path>` on a preexisting
repository, we move the Git directory of that repository to the new path
specified by the user. If there are worktrees present in the repository,
we need to repair the worktrees so that their gitlinks point to the new
location of the repository.

This repair logic will load repositories via `get_worktrees()`, which
will enumerate up and initialize all worktrees. Part of initialization
is logic that we resolve their respective worktree HEADs, even though
that information may not actually be needed in the end by all callers.

In the context of git-init(1) this is about to become a problem, because
we do not have a repository that was set up via `setup_git_directory()`
or friends. Consequentially, it is not yet fully initialized at the time
of calling `repair_worktrees()`, and properly setting up all parts of
the repository in `init_db()` before we repair worktrees is not an easy
thing to do. While this is okay right now where we only have a single
reference backend in Git, once we gain a second one we would be trying
to look up the worktree HEADs before we have figured out the reference
format, which does not work.

We do not require the worktree HEADs at all to repair worktrees. So
let's fix this issue by skipping over the step that reads them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 worktree.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/worktree.c b/worktree.c
index a56a6c2a3d..9702ed0308 100644
--- a/worktree.c
+++ b/worktree.c
@@ -51,7 +51,7 @@ static void add_head_info(struct worktree *wt)
 /**
  * get the main worktree
  */
-static struct worktree *get_main_worktree(void)
+static struct worktree *get_main_worktree(int skip_reading_head)
 {
 	struct worktree *worktree = NULL;
 	struct strbuf worktree_path = STRBUF_INIT;
@@ -70,11 +70,13 @@ static struct worktree *get_main_worktree(void)
 	 */
 	worktree->is_bare = (is_bare_repository_cfg == 1) ||
 		is_bare_repository();
-	add_head_info(worktree);
+	if (!skip_reading_head)
+		add_head_info(worktree);
 	return worktree;
 }
 
-static struct worktree *get_linked_worktree(const char *id)
+static struct worktree *get_linked_worktree(const char *id,
+					    int skip_reading_head)
 {
 	struct worktree *worktree = NULL;
 	struct strbuf path = STRBUF_INIT;
@@ -93,7 +95,8 @@ static struct worktree *get_linked_worktree(const char *id)
 	CALLOC_ARRAY(worktree, 1);
 	worktree->path = strbuf_detach(&worktree_path, NULL);
 	worktree->id = xstrdup(id);
-	add_head_info(worktree);
+	if (!skip_reading_head)
+		add_head_info(worktree);
 
 done:
 	strbuf_release(&path);
@@ -118,7 +121,7 @@ static void mark_current_worktree(struct worktree **worktrees)
 	free(git_dir);
 }
 
-struct worktree **get_worktrees(void)
+static struct worktree **get_worktrees_internal(int skip_reading_head)
 {
 	struct worktree **list = NULL;
 	struct strbuf path = STRBUF_INIT;
@@ -128,7 +131,7 @@ struct worktree **get_worktrees(void)
 
 	ALLOC_ARRAY(list, alloc);
 
-	list[counter++] = get_main_worktree();
+	list[counter++] = get_main_worktree(skip_reading_head);
 
 	strbuf_addf(&path, "%s/worktrees", get_git_common_dir());
 	dir = opendir(path.buf);
@@ -137,7 +140,7 @@ struct worktree **get_worktrees(void)
 		while ((d = readdir_skip_dot_and_dotdot(dir)) != NULL) {
 			struct worktree *linked = NULL;
 
-			if ((linked = get_linked_worktree(d->d_name))) {
+			if ((linked = get_linked_worktree(d->d_name, skip_reading_head))) {
 				ALLOC_GROW(list, counter + 1, alloc);
 				list[counter++] = linked;
 			}
@@ -151,6 +154,11 @@ struct worktree **get_worktrees(void)
 	return list;
 }
 
+struct worktree **get_worktrees(void)
+{
+	return get_worktrees_internal(0);
+}
+
 const char *get_worktree_git_dir(const struct worktree *wt)
 {
 	if (!wt)
@@ -591,7 +599,7 @@ static void repair_noop(int iserr UNUSED,
 
 void repair_worktrees(worktree_repair_fn fn, void *cb_data)
 {
-	struct worktree **worktrees = get_worktrees();
+	struct worktree **worktrees = get_worktrees_internal(1);
 	struct worktree **wt = worktrees + 1; /* +1 skips main worktree */
 
 	if (!fn)
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 03/12] refs: refactor logic to look up storage backends
From: Patrick Steinhardt @ 2023-12-28  9:57 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano
In-Reply-To: <cover.1703753910.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 5129 bytes --]

In order to look up ref storage backends, we're currently using a linked
list of backends, where each backend is expected to set up its `next`
pointer to the next ref storage backend. This is kind of a weird setup
as backends need to be aware of other backends without much of a reason.

Refactor the code so that the array of backends is centrally defined in
"refs.c", where each backend is now identified by an integer constant.
Expose functions to translate from those integer constants to the name
and vice versa, which will be required by subsequent patches.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                | 34 +++++++++++++++++++++++++---------
 refs.h                |  3 +++
 refs/debug.c          |  1 -
 refs/files-backend.c  |  1 -
 refs/packed-backend.c |  1 -
 refs/refs-internal.h  |  1 -
 repository.h          |  3 +++
 7 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/refs.c b/refs.c
index 16bfa21df7..ab14634731 100644
--- a/refs.c
+++ b/refs.c
@@ -33,17 +33,33 @@
 /*
  * List of all available backends
  */
-static struct ref_storage_be *refs_backends = &refs_be_files;
+static const struct ref_storage_be *refs_backends[] = {
+	[REF_STORAGE_FORMAT_FILES] = &refs_be_files,
+};
 
-static struct ref_storage_be *find_ref_storage_backend(const char *name)
+static const struct ref_storage_be *find_ref_storage_backend(int ref_storage_format)
 {
-	struct ref_storage_be *be;
-	for (be = refs_backends; be; be = be->next)
-		if (!strcmp(be->name, name))
-			return be;
+	if (ref_storage_format < ARRAY_SIZE(refs_backends))
+		return refs_backends[ref_storage_format];
 	return NULL;
 }
 
+int ref_storage_format_by_name(const char *name)
+{
+	for (int i = 0; i < ARRAY_SIZE(refs_backends); i++)
+		if (refs_backends[i] && !strcmp(refs_backends[i]->name, name))
+			return i;
+	return REF_STORAGE_FORMAT_UNKNOWN;
+}
+
+const char *ref_storage_format_to_name(int ref_storage_format)
+{
+	const struct ref_storage_be *be = find_ref_storage_backend(ref_storage_format);
+	if (!be)
+		return "unknown";
+	return be->name;
+}
+
 /*
  * How to handle various characters in refnames:
  * 0: An acceptable character for refs
@@ -2029,12 +2045,12 @@ static struct ref_store *ref_store_init(struct repository *repo,
 					const char *gitdir,
 					unsigned int flags)
 {
-	const char *be_name = "files";
-	struct ref_storage_be *be = find_ref_storage_backend(be_name);
+	int format = REF_STORAGE_FORMAT_FILES;
+	const struct ref_storage_be *be = find_ref_storage_backend(format);
 	struct ref_store *refs;
 
 	if (!be)
-		BUG("reference backend %s is unknown", be_name);
+		BUG("reference backend is unknown");
 
 	refs = be->init(repo, gitdir, flags);
 	return refs;
diff --git a/refs.h b/refs.h
index ff113bb12a..e2098ea51b 100644
--- a/refs.h
+++ b/refs.h
@@ -11,6 +11,9 @@ struct string_list;
 struct string_list_item;
 struct worktree;
 
+int ref_storage_format_by_name(const char *name);
+const char *ref_storage_format_to_name(int ref_storage_format);
+
 /*
  * Resolve a reference, recursively following symbolic refererences.
  *
diff --git a/refs/debug.c b/refs/debug.c
index 83b7a0ba65..b9775f2c37 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -426,7 +426,6 @@ static int debug_reflog_expire(struct ref_store *ref_store, const char *refname,
 }
 
 struct ref_storage_be refs_be_debug = {
-	.next = NULL,
 	.name = "debug",
 	.init = NULL,
 	.init_db = debug_init_db,
diff --git a/refs/files-backend.c b/refs/files-backend.c
index ad8b1d143f..43fd0ac760 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3241,7 +3241,6 @@ static int files_init_db(struct ref_store *ref_store, struct strbuf *err UNUSED)
 }
 
 struct ref_storage_be refs_be_files = {
-	.next = NULL,
 	.name = "files",
 	.init = files_ref_store_create,
 	.init_db = files_init_db,
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index b9fa097a29..8d1090e284 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1705,7 +1705,6 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
 }
 
 struct ref_storage_be refs_be_packed = {
-	.next = NULL,
 	.name = "packed",
 	.init = packed_ref_store_create,
 	.init_db = packed_init_db,
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 4af83bf9a5..8e9f04cc67 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -663,7 +663,6 @@ typedef int read_symbolic_ref_fn(struct ref_store *ref_store, const char *refnam
 				 struct strbuf *referent);
 
 struct ref_storage_be {
-	struct ref_storage_be *next;
 	const char *name;
 	ref_store_init_fn *init;
 	ref_init_db_fn *init_db;
diff --git a/repository.h b/repository.h
index 5f18486f64..ea4c488b81 100644
--- a/repository.h
+++ b/repository.h
@@ -24,6 +24,9 @@ enum fetch_negotiation_setting {
 	FETCH_NEGOTIATION_NOOP,
 };
 
+#define REF_STORAGE_FORMAT_UNKNOWN 0
+#define REF_STORAGE_FORMAT_FILES   1
+
 struct repo_settings {
 	int initialized;
 
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 04/12] setup: start tracking ref storage format
From: Patrick Steinhardt @ 2023-12-28  9:57 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano
In-Reply-To: <cover.1703753910.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 10025 bytes --]

In order to discern which ref storage format a repository is supposed to
use we need to start setting up and/or discovering the format. This
needs to happen in two separate code paths.

  - The first path is when we create a repository via `init_db()`. When
    we are re-initializing a preexisting repository we need to retain
    the previously used ref storage format -- if the user asked for a
    different format then this indicates an error and we error out.
    Otherwise we either initialize the repository with the format asked
    for by the user or the default format, which currently is the
    "files" backend.

  - The second path is when discovering repositories, where we need to
    read the config of that repository. There is not yet any way to
    configure something other than the "files" backend, so we can just
    blindly set the ref storage format to this backend.

Wire up this logic so that we have the ref storage format always readily
available when needed. As there is only a single backend and because it
is not configurable we cannot yet verify that this tracking works as
expected via tests, but tests will be added in subsequent commits. To
countermand this ommission now though, raise a BUG() in case the ref
storage format is not set up properly in `ref_store_init()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/clone.c   |  5 +++--
 builtin/init-db.c |  4 +++-
 refs.c            |  4 ++--
 repository.c      |  6 ++++++
 repository.h      |  4 ++++
 setup.c           | 26 +++++++++++++++++++++++---
 setup.h           |  6 +++++-
 7 files changed, 46 insertions(+), 9 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 343f536cf8..48aeb1b90b 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1107,7 +1107,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	 * repository, and reference backends may persist that information into
 	 * their on-disk data structures.
 	 */
-	init_db(git_dir, real_git_dir, option_template, GIT_HASH_UNKNOWN, NULL,
+	init_db(git_dir, real_git_dir, option_template, GIT_HASH_UNKNOWN,
+		REF_STORAGE_FORMAT_UNKNOWN, NULL,
 		do_not_override_repo_unix_permissions, INIT_DB_QUIET | INIT_DB_SKIP_REFDB);
 
 	if (real_git_dir) {
@@ -1292,7 +1293,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 	initialize_repository_version(hash_algo, 1);
 	repo_set_hash_algo(the_repository, hash_algo);
-	create_reference_database(NULL, 1);
+	create_reference_database(the_repository->ref_storage_format, NULL, 1);
 
 	/*
 	 * Before fetching from the remote, download and install bundle
diff --git a/builtin/init-db.c b/builtin/init-db.c
index cb727c826f..b6e80feab6 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -11,6 +11,7 @@
 #include "object-file.h"
 #include "parse-options.h"
 #include "path.h"
+#include "refs.h"
 #include "setup.h"
 #include "strbuf.h"
 
@@ -236,5 +237,6 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 
 	flags |= INIT_DB_EXIST_OK;
 	return init_db(git_dir, real_git_dir, template_dir, hash_algo,
-		       initial_branch, init_shared_repository, flags);
+		       REF_STORAGE_FORMAT_UNKNOWN, initial_branch,
+		       init_shared_repository, flags);
 }
diff --git a/refs.c b/refs.c
index ab14634731..be8fcabde0 100644
--- a/refs.c
+++ b/refs.c
@@ -2045,10 +2045,10 @@ static struct ref_store *ref_store_init(struct repository *repo,
 					const char *gitdir,
 					unsigned int flags)
 {
-	int format = REF_STORAGE_FORMAT_FILES;
-	const struct ref_storage_be *be = find_ref_storage_backend(format);
+	const struct ref_storage_be *be;
 	struct ref_store *refs;
 
+	be = find_ref_storage_backend(repo->ref_storage_format);
 	if (!be)
 		BUG("reference backend is unknown");
 
diff --git a/repository.c b/repository.c
index a7679ceeaa..691cabf45b 100644
--- a/repository.c
+++ b/repository.c
@@ -104,6 +104,11 @@ void repo_set_hash_algo(struct repository *repo, int hash_algo)
 	repo->hash_algo = &hash_algos[hash_algo];
 }
 
+void repo_set_ref_storage_format(struct repository *repo, int format)
+{
+	repo->ref_storage_format = format;
+}
+
 /*
  * Attempt to resolve and set the provided 'gitdir' for repository 'repo'.
  * Return 0 upon success and a non-zero value upon failure.
@@ -184,6 +189,7 @@ int repo_init(struct repository *repo,
 		goto error;
 
 	repo_set_hash_algo(repo, format.hash_algo);
+	repo_set_ref_storage_format(repo, format.ref_storage_format);
 	repo->repository_format_worktree_config = format.worktree_config;
 
 	/* take ownership of format.partial_clone */
diff --git a/repository.h b/repository.h
index ea4c488b81..d3a24da4d6 100644
--- a/repository.h
+++ b/repository.h
@@ -163,6 +163,9 @@ struct repository {
 	/* Repository's current hash algorithm, as serialized on disk. */
 	const struct git_hash_algo *hash_algo;
 
+	/* Repository's reference storage format, as serialized on disk. */
+	int ref_storage_format;
+
 	/* A unique-id for tracing purposes. */
 	int trace2_repo_id;
 
@@ -202,6 +205,7 @@ void repo_set_gitdir(struct repository *repo, const char *root,
 		     const struct set_gitdir_args *extra_args);
 void repo_set_worktree(struct repository *repo, const char *path);
 void repo_set_hash_algo(struct repository *repo, int algo);
+void repo_set_ref_storage_format(struct repository *repo, int format);
 void initialize_the_repository(void);
 RESULT_MUST_BE_USED
 int repo_init(struct repository *r, const char *gitdir, const char *worktree);
diff --git a/setup.c b/setup.c
index bc90bbd033..e58ab7e786 100644
--- a/setup.c
+++ b/setup.c
@@ -1566,6 +1566,8 @@ const char *setup_git_directory_gently(int *nongit_ok)
 		}
 		if (startup_info->have_repository) {
 			repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+			repo_set_ref_storage_format(the_repository,
+						    repo_fmt.ref_storage_format);
 			the_repository->repository_format_worktree_config =
 				repo_fmt.worktree_config;
 			/* take ownership of repo_fmt.partial_clone */
@@ -1659,6 +1661,8 @@ void check_repository_format(struct repository_format *fmt)
 	check_repository_format_gently(get_git_dir(), fmt, NULL);
 	startup_info->have_repository = 1;
 	repo_set_hash_algo(the_repository, fmt->hash_algo);
+	repo_set_ref_storage_format(the_repository,
+				    fmt->ref_storage_format);
 	the_repository->repository_format_worktree_config =
 		fmt->worktree_config;
 	the_repository->repository_format_partial_clone =
@@ -1899,7 +1903,8 @@ static int is_reinit(void)
 	return ret;
 }
 
-void create_reference_database(const char *initial_branch, int quiet)
+void create_reference_database(int ref_storage_format,
+			       const char *initial_branch, int quiet)
 {
 	struct strbuf err = STRBUF_INIT;
 	int reinit = is_reinit();
@@ -1919,6 +1924,7 @@ void create_reference_database(const char *initial_branch, int quiet)
 	safe_create_dir(git_path("refs"), 1);
 	adjust_shared_perm(git_path("refs"));
 
+	repo_set_ref_storage_format(the_repository, ref_storage_format);
 	if (refs_init_db(&err))
 		die("failed to set up refs db: %s", err.buf);
 
@@ -2137,8 +2143,20 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
 	}
 }
 
+static void validate_ref_storage_format(struct repository_format *repo_fmt, int format)
+{
+	if (repo_fmt->version >= 0 &&
+	    format != REF_STORAGE_FORMAT_UNKNOWN &&
+	    format != repo_fmt->ref_storage_format) {
+		die(_("attempt to reinitialize repository with different reference storage format"));
+	} else if (format != REF_STORAGE_FORMAT_UNKNOWN) {
+		repo_fmt->ref_storage_format = format;
+	}
+}
+
 int init_db(const char *git_dir, const char *real_git_dir,
-	    const char *template_dir, int hash, const char *initial_branch,
+	    const char *template_dir, int hash,
+	    int ref_storage_format, const char *initial_branch,
 	    int init_shared_repository, unsigned int flags)
 {
 	int reinit;
@@ -2181,13 +2199,15 @@ int init_db(const char *git_dir, const char *real_git_dir,
 	check_repository_format(&repo_fmt);
 
 	validate_hash_algorithm(&repo_fmt, hash);
+	validate_ref_storage_format(&repo_fmt, ref_storage_format);
 
 	reinit = create_default_files(template_dir, original_git_dir,
 				      &repo_fmt, prev_bare_repository,
 				      init_shared_repository);
 
 	if (!(flags & INIT_DB_SKIP_REFDB))
-		create_reference_database(initial_branch, flags & INIT_DB_QUIET);
+		create_reference_database(repo_fmt.ref_storage_format,
+					  initial_branch, flags & INIT_DB_QUIET);
 	create_object_directory();
 
 	if (get_shared_repository()) {
diff --git a/setup.h b/setup.h
index 3f0f17c351..4927abf2cf 100644
--- a/setup.h
+++ b/setup.h
@@ -115,6 +115,7 @@ struct repository_format {
 	int worktree_config;
 	int is_bare;
 	int hash_algo;
+	int ref_storage_format;
 	int sparse_index;
 	char *work_tree;
 	struct string_list unknown_extensions;
@@ -131,6 +132,7 @@ struct repository_format {
 	.version = -1, \
 	.is_bare = -1, \
 	.hash_algo = GIT_HASH_SHA1, \
+	.ref_storage_format = REF_STORAGE_FORMAT_FILES, \
 	.unknown_extensions = STRING_LIST_INIT_DUP, \
 	.v1_only_extensions = STRING_LIST_INIT_DUP, \
 }
@@ -175,10 +177,12 @@ void check_repository_format(struct repository_format *fmt);
 
 int init_db(const char *git_dir, const char *real_git_dir,
 	    const char *template_dir, int hash_algo,
+	    int ref_storage_format,
 	    const char *initial_branch, int init_shared_repository,
 	    unsigned int flags);
 void initialize_repository_version(int hash_algo, int reinit);
-void create_reference_database(const char *initial_branch, int quiet);
+void create_reference_database(int ref_storage_format,
+			       const char *initial_branch, int quiet);
 
 /*
  * NOTE NOTE NOTE!!
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 05/12] setup: set repository's formats on init
From: Patrick Steinhardt @ 2023-12-28  9:57 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano
In-Reply-To: <cover.1703753910.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 2231 bytes --]

The proper hash algorithm and ref storage format that will be used for a
newly initialized repository will be figured out in `init_db()` via
`validate_hash_algorithm()` and `validate_ref_storage_format()`. Until
now though, we never set up the hash algorithm or ref storage format of
`the_repository` accordingly.

There are only two callsites of `init_db()`, one in git-init(1) and one
in git-clone(1). The former function doesn't care for the formats to be
set up properly because it never access the repository after calling the
function in the first place.

For git-clone(1) it's a different story though, as we call `init_db()`
before listing remote refs. While we do indeed have the wrong hash
function in `the_repository` when `init_db()` sets up a non-default
object format for the repository, it never mattered because we adjust
the hash after learning about the remote's hash function via the listed
refs.

So the current state is correct for the hash algo, but it's not for the
ref storage format because git-clone(1) wouldn't know to set it up
properly. But instead of adjusting only the `ref_storage_format`, set
both the hash algo and the ref storage format so that `the_repository`
is in the correct state when `init_db()` exits. This is fine as we will
adjust the hash later on anyway and makes it easier to reason about the
end state of `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 setup.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/setup.c b/setup.c
index e58ab7e786..3d980814bc 100644
--- a/setup.c
+++ b/setup.c
@@ -2205,6 +2205,13 @@ int init_db(const char *git_dir, const char *real_git_dir,
 				      &repo_fmt, prev_bare_repository,
 				      init_shared_repository);
 
+	/*
+	 * Now that we have set up both the hash algorithm and the ref storage
+	 * format we can update the repository's settings accordingly.
+	 */
+	repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+	repo_set_ref_storage_format(the_repository, repo_fmt.ref_storage_format);
+
 	if (!(flags & INIT_DB_SKIP_REFDB))
 		create_reference_database(repo_fmt.ref_storage_format,
 					  initial_branch, flags & INIT_DB_QUIET);
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 06/12] setup: introduce "extensions.refStorage" extension
From: Patrick Steinhardt @ 2023-12-28  9:57 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano
In-Reply-To: <cover.1703753910.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 9372 bytes --]

Introduce a new "extensions.refStorage" extension that allows us to
specify the ref storage format used by a repository. For now, the only
supported format is the "files" format, but this list will likely soon
be extended to also support the upcoming "reftable" format.

There have been discussions on the Git mailing list in the past around
how exactly this extension should look like. One alternative [1] that
was discussed was whether it would make sense to model the extension in
such a way that backends are arbitrarily stackable. This would allow for
a combined value of e.g. "loose,packed-refs" or "loose,reftable", which
indicates that new refs would be written via "loose" files backend and
compressed into "packed-refs" or "reftable" backends, respectively.

It is arguable though whether this flexibility and the complexity that
it brings with it is really required for now. It is not foreseeable that
there will be a proliferation of backends in the near-term future, and
the current set of existing formats and formats which are on the horizon
can easily be configured with the much simpler proposal where we have a
single value, only.

Furthermore, if we ever see that we indeed want to gain the ability to
arbitrarily stack the ref formats, then we can adapt the current
extension rather easily. Given that Git clients will refuse any unknown
value for the "extensions.refStorage" extension they would also know to
ignore a stacked "loose,packed-refs" in the future.

So let's stick with the easy proposal for the time being and wire up the
extension.

[1]: <pull.1408.git.1667846164.gitgitgadget@gmail.com>

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config/extensions.txt           | 11 ++++++++
 Documentation/ref-storage-format.txt          |  1 +
 .../technical/repository-version.txt          |  5 ++++
 builtin/clone.c                               |  2 +-
 setup.c                                       | 23 +++++++++++++---
 setup.h                                       |  3 ++-
 t/t0001-init.sh                               | 26 +++++++++++++++++++
 t/test-lib.sh                                 |  2 +-
 8 files changed, 67 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/ref-storage-format.txt

diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt
index bccaec7a96..66db0e15da 100644
--- a/Documentation/config/extensions.txt
+++ b/Documentation/config/extensions.txt
@@ -7,6 +7,17 @@ Note that this setting should only be set by linkgit:git-init[1] or
 linkgit:git-clone[1].  Trying to change it after initialization will not
 work and will produce hard-to-diagnose issues.
 
+extensions.refStorage::
+	Specify the ref storage format to use. The acceptable values are:
++
+include::../ref-storage-format.txt[]
++
+It is an error to specify this key unless `core.repositoryFormatVersion` is 1.
++
+Note that this setting should only be set by linkgit:git-init[1] or
+linkgit:git-clone[1]. Trying to change it after initialization will not
+work and will produce hard-to-diagnose issues.
+
 extensions.worktreeConfig::
 	If enabled, then worktrees will load config settings from the
 	`$GIT_DIR/config.worktree` file in addition to the
diff --git a/Documentation/ref-storage-format.txt b/Documentation/ref-storage-format.txt
new file mode 100644
index 0000000000..1a65cac468
--- /dev/null
+++ b/Documentation/ref-storage-format.txt
@@ -0,0 +1 @@
+* `files` for loose files with packed-refs. This is the default.
diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
index 045a76756f..27be3741e6 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -100,3 +100,8 @@ If set, by default "git config" reads from both "config" and
 multiple working directory mode, "config" file is shared while
 "config.worktree" is per-working directory (i.e., it's in
 GIT_COMMON_DIR/worktrees/<id>/config.worktree)
+
+==== `refStorage`
+
+Specifies the file format for the ref database. The only valid value
+is `files` (loose references with a packed-refs file).
diff --git a/builtin/clone.c b/builtin/clone.c
index 48aeb1b90b..0fb3816d0c 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1291,7 +1291,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	 * ours to the same thing.
 	 */
 	hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
-	initialize_repository_version(hash_algo, 1);
+	initialize_repository_version(hash_algo, the_repository->ref_storage_format, 1);
 	repo_set_hash_algo(the_repository, hash_algo);
 	create_reference_database(the_repository->ref_storage_format, NULL, 1);
 
diff --git a/setup.c b/setup.c
index 3d980814bc..3aac8c01cd 100644
--- a/setup.c
+++ b/setup.c
@@ -592,6 +592,17 @@ static enum extension_result handle_extension(const char *var,
 				     "extensions.objectformat", value);
 		data->hash_algo = format;
 		return EXTENSION_OK;
+	} else if (!strcmp(ext, "refstorage")) {
+		int format;
+
+		if (!value)
+			return config_error_nonbool(var);
+		format = ref_storage_format_by_name(value);
+		if (format == REF_STORAGE_FORMAT_UNKNOWN)
+			return error(_("invalid value for '%s': '%s'"),
+				     "extensions.refstorage", value);
+		data->ref_storage_format = format;
+		return EXTENSION_OK;
 	}
 	return EXTENSION_UNKNOWN;
 }
@@ -1871,12 +1882,14 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree)
 	return 1;
 }
 
-void initialize_repository_version(int hash_algo, int reinit)
+void initialize_repository_version(int hash_algo, int ref_storage_format,
+				   int reinit)
 {
 	char repo_version_string[10];
 	int repo_version = GIT_REPO_VERSION;
 
-	if (hash_algo != GIT_HASH_SHA1)
+	if (hash_algo != GIT_HASH_SHA1 ||
+	    ref_storage_format != REF_STORAGE_FORMAT_FILES)
 		repo_version = GIT_REPO_VERSION_READ;
 
 	/* This forces creation of new config file */
@@ -1889,6 +1902,10 @@ void initialize_repository_version(int hash_algo, int reinit)
 			       hash_algos[hash_algo].name);
 	else if (reinit)
 		git_config_set_gently("extensions.objectformat", NULL);
+
+	if (ref_storage_format != REF_STORAGE_FORMAT_FILES)
+		git_config_set("extensions.refstorage",
+			       ref_storage_format_to_name(ref_storage_format));
 }
 
 static int is_reinit(void)
@@ -2030,7 +2047,7 @@ static int create_default_files(const char *template_path,
 		adjust_shared_perm(get_git_dir());
 	}
 
-	initialize_repository_version(fmt->hash_algo, 0);
+	initialize_repository_version(fmt->hash_algo, fmt->ref_storage_format, 0);
 
 	/* Check filemode trustability */
 	path = git_path_buf(&buf, "config");
diff --git a/setup.h b/setup.h
index 4927abf2cf..89033ae1d9 100644
--- a/setup.h
+++ b/setup.h
@@ -180,7 +180,8 @@ int init_db(const char *git_dir, const char *real_git_dir,
 	    int ref_storage_format,
 	    const char *initial_branch, int init_shared_repository,
 	    unsigned int flags);
-void initialize_repository_version(int hash_algo, int reinit);
+void initialize_repository_version(int hash_algo, int ref_storage_format,
+				   int reinit);
 void create_reference_database(int ref_storage_format,
 			       const char *initial_branch, int quiet);
 
diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index 2b78e3be47..38b3e4c39e 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -532,6 +532,32 @@ test_expect_success 'init rejects attempts to initialize with different hash' '
 	test_must_fail git -C sha256 init --object-format=sha1
 '
 
+test_expect_success DEFAULT_REPO_FORMAT 'extensions.refStorage is not allowed with repo version 0' '
+	test_when_finished "rm -rf refstorage" &&
+	git init refstorage &&
+	git -C refstorage config extensions.refStorage files &&
+	test_must_fail git -C refstorage rev-parse 2>err &&
+	grep "repo version is 0, but v1-only extension found" err
+'
+
+test_expect_success DEFAULT_REPO_FORMAT 'extensions.refStorage with files backend' '
+	test_when_finished "rm -rf refstorage" &&
+	git init refstorage &&
+	git -C refstorage config core.repositoryformatversion 1 &&
+	git -C refstorage config extensions.refStorage files &&
+	test_commit -C refstorage A &&
+	git -C refstorage rev-parse --verify HEAD
+'
+
+test_expect_success DEFAULT_REPO_FORMAT 'extensions.refStorage with unknown backend' '
+	test_when_finished "rm -rf refstorage" &&
+	git init refstorage &&
+	git -C refstorage config core.repositoryformatversion 1 &&
+	git -C refstorage config extensions.refStorage garbage &&
+	test_must_fail git -C refstorage rev-parse 2>err &&
+	grep "invalid value for ${SQ}extensions.refstorage${SQ}: ${SQ}garbage${SQ}" err
+'
+
 test_expect_success MINGW 'core.hidedotfiles = false' '
 	git config --global core.hidedotfiles false &&
 	rm -rf newdir &&
diff --git a/t/test-lib.sh b/t/test-lib.sh
index dc03f06b8e..4685cc3d48 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1937,7 +1937,7 @@ test_lazy_prereq SHA1 '
 '
 
 test_lazy_prereq DEFAULT_REPO_FORMAT '
-	test_have_prereq SHA1
+	test_have_prereq SHA1,REFFILES
 '
 
 # Ensure that no test accidentally triggers a Git command
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 07/12] setup: introduce GIT_DEFAULT_REF_FORMAT envvar
From: Patrick Steinhardt @ 2023-12-28  9:57 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano
In-Reply-To: <cover.1703753910.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 3130 bytes --]

Introduce a new GIT_DEFAULT_REF_FORMAT environment variable that lets
users control the default ref format used by both git-init(1) and
git-clone(1). This is modeled after GIT_DEFAULT_OBJECT_FORMAT, which
does the same thing for the repository's object format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git.txt |  5 +++++
 setup.c               |  7 +++++++
 t/t0001-init.sh       | 18 ++++++++++++++++++
 3 files changed, 30 insertions(+)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index bf9e6af695..88e4ed4bd6 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -556,6 +556,11 @@ double-quotes and respecting backslash escapes. E.g., the value
 	is always used. The default is "sha1".
 	See `--object-format` in linkgit:git-init[1].
 
+`GIT_DEFAULT_REF_FORMAT`::
+	If this variable is set, the default reference backend format for new
+	repositories will be set to this value. The default is "files".
+	See `--ref-format` in linkgit:git-init[1].
+
 Git Commits
 ~~~~~~~~~~~
 `GIT_AUTHOR_NAME`::
diff --git a/setup.c b/setup.c
index 3aac8c01cd..4712bba6f8 100644
--- a/setup.c
+++ b/setup.c
@@ -2162,12 +2162,19 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
 
 static void validate_ref_storage_format(struct repository_format *repo_fmt, int format)
 {
+	const char *name = getenv("GIT_DEFAULT_REF_FORMAT");
+
 	if (repo_fmt->version >= 0 &&
 	    format != REF_STORAGE_FORMAT_UNKNOWN &&
 	    format != repo_fmt->ref_storage_format) {
 		die(_("attempt to reinitialize repository with different reference storage format"));
 	} else if (format != REF_STORAGE_FORMAT_UNKNOWN) {
 		repo_fmt->ref_storage_format = format;
+	} else if (name) {
+		format = ref_storage_format_by_name(name);
+		if (format == REF_STORAGE_FORMAT_UNKNOWN)
+			die(_("unknown ref storage format '%s'"), name);
+		repo_fmt->ref_storage_format = format;
 	}
 }
 
diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index 38b3e4c39e..30ce752cc1 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -558,6 +558,24 @@ test_expect_success DEFAULT_REPO_FORMAT 'extensions.refStorage with unknown back
 	grep "invalid value for ${SQ}extensions.refstorage${SQ}: ${SQ}garbage${SQ}" err
 '
 
+test_expect_success DEFAULT_REPO_FORMAT 'init with GIT_DEFAULT_REF_FORMAT=files' '
+	test_when_finished "rm -rf refformat" &&
+	GIT_DEFAULT_REF_FORMAT=files git init refformat &&
+	echo 0 >expect &&
+	git -C refformat config core.repositoryformatversion >actual &&
+	test_cmp expect actual &&
+	test_must_fail git -C refformat config extensions.refstorage
+'
+
+test_expect_success 'init with GIT_DEFAULT_REF_FORMAT=garbage' '
+	test_when_finished "rm -rf refformat" &&
+	cat >expect <<-EOF &&
+	fatal: unknown ref storage format ${SQ}garbage${SQ}
+	EOF
+	test_must_fail env GIT_DEFAULT_REF_FORMAT=garbage git init refformat 2>err &&
+	test_cmp expect err
+'
+
 test_expect_success MINGW 'core.hidedotfiles = false' '
 	git config --global core.hidedotfiles false &&
 	rm -rf newdir &&
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 08/12] t: introduce GIT_TEST_DEFAULT_REF_FORMAT envvar
From: Patrick Steinhardt @ 2023-12-28  9:57 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano
In-Reply-To: <cover.1703753910.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 2531 bytes --]

Introduce a new GIT_TEST_DEFAULT_REF_FORMAT environment variable that
lets developers run the test suite with a different default ref format
without impacting the ref format used by non-test Git invocations. This
is modeled after GIT_TEST_DEFAULT_OBJECT_FORMAT, which does the same
thing for the repository's object format.

Adapt the setup of the `REFFILES` test prerequisite to be conditionally
set based on the default ref format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/README                |  3 +++
 t/test-lib-functions.sh |  5 +++++
 t/test-lib.sh           | 11 ++++++++++-
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/t/README b/t/README
index 36463d0742..621d3b8c09 100644
--- a/t/README
+++ b/t/README
@@ -479,6 +479,9 @@ GIT_TEST_DEFAULT_HASH=<hash-algo> specifies which hash algorithm to
 use in the test scripts. Recognized values for <hash-algo> are "sha1"
 and "sha256".
 
+GIT_TEST_DEFAULT_REF_FORMAT=<format> specifies which ref storage format
+to use in the test scripts. Recognized values for <format> are "files".
+
 GIT_TEST_NO_WRITE_REV_INDEX=<boolean>, when true disables the
 'pack.writeReverseIndex' setting.
 
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 5eb57914ab..a3a51ea9e8 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1659,6 +1659,11 @@ test_detect_hash () {
 	test_hash_algo="${GIT_TEST_DEFAULT_HASH:-sha1}"
 }
 
+# Detect the hash algorithm in use.
+test_detect_ref_format () {
+	echo "${GIT_TEST_DEFAULT_REF_FORMAT:-files}"
+}
+
 # Load common hash metadata and common placeholder object IDs for use with
 # test_oid.
 test_oid_init () {
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 4685cc3d48..fc93aa57e6 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -542,6 +542,8 @@ export EDITOR
 
 GIT_DEFAULT_HASH="${GIT_TEST_DEFAULT_HASH:-sha1}"
 export GIT_DEFAULT_HASH
+GIT_DEFAULT_REF_FORMAT="${GIT_TEST_DEFAULT_REF_FORMAT:-files}"
+export GIT_DEFAULT_REF_FORMAT
 GIT_TEST_MERGE_ALGORITHM="${GIT_TEST_MERGE_ALGORITHM:-ort}"
 export GIT_TEST_MERGE_ALGORITHM
 
@@ -1745,7 +1747,14 @@ parisc* | hppa*)
 	;;
 esac
 
-test_set_prereq REFFILES
+case "$GIT_DEFAULT_REF_FORMAT" in
+files)
+	test_set_prereq REFFILES;;
+*)
+	echo 2>&1 "error: unknown ref format $GIT_DEFAULT_REF_FORMAT"
+	exit 1
+	;;
+esac
 
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_CURL" && test_set_prereq LIBCURL
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 09/12] builtin/rev-parse: introduce `--show-ref-format` flag
From: Patrick Steinhardt @ 2023-12-28  9:58 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano
In-Reply-To: <cover.1703753910.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 2428 bytes --]

Introduce a new `--show-ref-format` to git-rev-parse(1) that causes it
to print the ref format used by a repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-rev-parse.txt |  3 +++
 builtin/rev-parse.c             |  4 ++++
 t/t1500-rev-parse.sh            | 17 +++++++++++++++++
 3 files changed, 24 insertions(+)

diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
index 912fab9f5e..546faf9017 100644
--- a/Documentation/git-rev-parse.txt
+++ b/Documentation/git-rev-parse.txt
@@ -307,6 +307,9 @@ The following options are unaffected by `--path-format`:
 	input, multiple algorithms may be printed, space-separated.
 	If not specified, the default is "storage".
 
+--show-ref-format::
+	Show the reference storage format used for the repository.
+
 
 Other Options
 ~~~~~~~~~~~~~
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 917f122440..d08987646a 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -1062,6 +1062,10 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				puts(the_hash_algo->name);
 				continue;
 			}
+			if (!strcmp(arg, "--show-ref-format")) {
+				puts(ref_storage_format_to_name(the_repository->ref_storage_format));
+				continue;
+			}
 			if (!strcmp(arg, "--end-of-options")) {
 				seen_end_of_options = 1;
 				if (filter & (DO_FLAGS | DO_REVS))
diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
index 3f9e7f62e4..a669e592f1 100755
--- a/t/t1500-rev-parse.sh
+++ b/t/t1500-rev-parse.sh
@@ -208,6 +208,23 @@ test_expect_success 'rev-parse --show-object-format in repo' '
 	grep "unknown mode for --show-object-format: squeamish-ossifrage" err
 '
 
+test_expect_success 'rev-parse --show-ref-format' '
+	test_detect_ref_format >expect &&
+	git rev-parse --show-ref-format >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'rev-parse --show-ref-format with invalid storage' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		git config extensions.refstorage broken &&
+		test_must_fail git rev-parse --show-ref-format 2>err &&
+		grep "error: invalid value for ${SQ}extensions.refstorage${SQ}: ${SQ}broken${SQ}" err
+	)
+'
+
 test_expect_success '--show-toplevel from subdir of working tree' '
 	pwd >expect &&
 	git -C sub/dir rev-parse --show-toplevel >actual &&
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 10/12] builtin/init: introduce `--ref-format=` value flag
From: Patrick Steinhardt @ 2023-12-28  9:58 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano
In-Reply-To: <cover.1703753910.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 4852 bytes --]

Introduce a new `--ref-format` value flag for git-init(1) that allows
the user to specify the ref format that is to be used for a newly
initialized repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-init.txt |  7 +++++++
 builtin/init-db.c          | 13 ++++++++++++-
 t/t0001-init.sh            | 26 ++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-init.txt b/Documentation/git-init.txt
index 6f0d2973bf..e8dc645bb5 100644
--- a/Documentation/git-init.txt
+++ b/Documentation/git-init.txt
@@ -11,6 +11,7 @@ SYNOPSIS
 [verse]
 'git init' [-q | --quiet] [--bare] [--template=<template-directory>]
 	  [--separate-git-dir <git-dir>] [--object-format=<format>]
+	  [--ref-format=<format>]
 	  [-b <branch-name> | --initial-branch=<branch-name>]
 	  [--shared[=<permissions>]] [<directory>]
 
@@ -57,6 +58,12 @@ values are 'sha1' and (if enabled) 'sha256'.  'sha1' is the default.
 +
 include::object-format-disclaimer.txt[]
 
+--ref-format=<format>::
+
+Specify the given ref storage format for the repository. The valid values are:
++
+include::ref-storage-format.txt[]
+
 --template=<template-directory>::
 
 Specify the directory from which templates will be used.  (See the "TEMPLATE
diff --git a/builtin/init-db.c b/builtin/init-db.c
index b6e80feab6..770b2822fe 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -58,6 +58,7 @@ static int shared_callback(const struct option *opt, const char *arg, int unset)
 static const char *const init_db_usage[] = {
 	N_("git init [-q | --quiet] [--bare] [--template=<template-directory>]\n"
 	   "         [--separate-git-dir <git-dir>] [--object-format=<format>]\n"
+	   "         [--ref-format=<format>]\n"
 	   "         [-b <branch-name> | --initial-branch=<branch-name>]\n"
 	   "         [--shared[=<permissions>]] [<directory>]"),
 	NULL
@@ -77,8 +78,10 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 	const char *template_dir = NULL;
 	unsigned int flags = 0;
 	const char *object_format = NULL;
+	const char *ref_format = NULL;
 	const char *initial_branch = NULL;
 	int hash_algo = GIT_HASH_UNKNOWN;
+	int ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
 	int init_shared_repository = -1;
 	const struct option init_db_options[] = {
 		OPT_STRING(0, "template", &template_dir, N_("template-directory"),
@@ -96,6 +99,8 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 			   N_("override the name of the initial branch")),
 		OPT_STRING(0, "object-format", &object_format, N_("hash"),
 			   N_("specify the hash algorithm to use")),
+		OPT_STRING(0, "ref-format", &ref_format, N_("format"),
+			   N_("specify the reference format to use")),
 		OPT_END()
 	};
 
@@ -159,6 +164,12 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 			die(_("unknown hash algorithm '%s'"), object_format);
 	}
 
+	if (ref_format) {
+		ref_storage_format = ref_storage_format_by_name(ref_format);
+		if (ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN)
+			die(_("unknown ref storage format '%s'"), ref_format);
+	}
+
 	if (init_shared_repository != -1)
 		set_shared_repository(init_shared_repository);
 
@@ -237,6 +248,6 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 
 	flags |= INIT_DB_EXIST_OK;
 	return init_db(git_dir, real_git_dir, template_dir, hash_algo,
-		       REF_STORAGE_FORMAT_UNKNOWN, initial_branch,
+		       ref_storage_format, initial_branch,
 		       init_shared_repository, flags);
 }
diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index 30ce752cc1..b131d665db 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -576,6 +576,32 @@ test_expect_success 'init with GIT_DEFAULT_REF_FORMAT=garbage' '
 	test_cmp expect err
 '
 
+test_expect_success 'init with --ref-format=files' '
+	test_when_finished "rm -rf refformat" &&
+	git init --ref-format=files refformat &&
+	echo files >expect &&
+	git -C refformat rev-parse --show-ref-format >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 're-init with same format' '
+	test_when_finished "rm -rf refformat" &&
+	git init --ref-format=files refformat &&
+	git init --ref-format=files refformat &&
+	echo files >expect &&
+	git -C refformat rev-parse --show-ref-format >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'init with --ref-format=garbage' '
+	test_when_finished "rm -rf refformat" &&
+	cat >expect <<-EOF &&
+	fatal: unknown ref storage format ${SQ}garbage${SQ}
+	EOF
+	test_must_fail git init --ref-format=garbage refformat 2>err &&
+	test_cmp expect err
+'
+
 test_expect_success MINGW 'core.hidedotfiles = false' '
 	git config --global core.hidedotfiles false &&
 	rm -rf newdir &&
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 11/12] builtin/clone: introduce `--ref-format=` value flag
From: Patrick Steinhardt @ 2023-12-28  9:58 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano
In-Reply-To: <cover.1703753910.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 3995 bytes --]

Introduce a new `--ref-format` value flag for git-clone(1) that allows
the user to specify the ref format that is to be used for a newly
initialized repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-clone.txt |  6 ++++++
 builtin/clone.c             | 12 +++++++++++-
 t/t5601-clone.sh            | 17 +++++++++++++++++
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index c37c4a37f7..6e43eb9c20 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -311,6 +311,12 @@ or `--mirror` is given)
 	The result is Git repository can be separated from working
 	tree.
 
+--ref-format=<ref-format::
+
+Specify the given ref storage format for the repository. The valid values are:
++
+include::ref-storage-format.txt[]
+
 -j <n>::
 --jobs <n>::
 	The number of submodules fetched at the same time.
diff --git a/builtin/clone.c b/builtin/clone.c
index 0fb3816d0c..18093d5d16 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -72,6 +72,7 @@ static char *remote_name = NULL;
 static char *option_branch = NULL;
 static struct string_list option_not = STRING_LIST_INIT_NODUP;
 static const char *real_git_dir;
+static const char *ref_format;
 static char *option_upload_pack = "git-upload-pack";
 static int option_verbosity;
 static int option_progress = -1;
@@ -157,6 +158,8 @@ static struct option builtin_clone_options[] = {
 		    N_("any cloned submodules will be shallow")),
 	OPT_STRING(0, "separate-git-dir", &real_git_dir, N_("gitdir"),
 		   N_("separate git dir from working tree")),
+	OPT_STRING(0, "ref-format", &ref_format, N_("format"),
+		   N_("specify the reference format to use")),
 	OPT_STRING_LIST('c', "config", &option_config, N_("key=value"),
 			N_("set config inside the new repository")),
 	OPT_STRING_LIST(0, "server-option", &server_options,
@@ -932,6 +935,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	int submodule_progress;
 	int filter_submodules = 0;
 	int hash_algo;
+	int ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
 	const int do_not_override_repo_unix_permissions = -1;
 
 	struct transport_ls_refs_options transport_ls_refs_options =
@@ -957,6 +961,12 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (option_single_branch == -1)
 		option_single_branch = deepen ? 1 : 0;
 
+	if (ref_format) {
+		ref_storage_format = ref_storage_format_by_name(ref_format);
+		if (ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN)
+			die(_("unknown ref storage format '%s'"), ref_format);
+	}
+
 	if (option_mirror)
 		option_bare = 1;
 
@@ -1108,7 +1118,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	 * their on-disk data structures.
 	 */
 	init_db(git_dir, real_git_dir, option_template, GIT_HASH_UNKNOWN,
-		REF_STORAGE_FORMAT_UNKNOWN, NULL,
+		ref_storage_format, NULL,
 		do_not_override_repo_unix_permissions, INIT_DB_QUIET | INIT_DB_SKIP_REFDB);
 
 	if (real_git_dir) {
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 47eae641f0..fb1b9c686d 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -157,6 +157,23 @@ test_expect_success 'clone --mirror does not repeat tags' '
 
 '
 
+test_expect_success 'clone with files ref format' '
+	test_when_finished "rm -rf ref-storage" &&
+	git clone --ref-format=files --mirror src ref-storage &&
+	echo files >expect &&
+	git -C ref-storage rev-parse --show-ref-format >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'clone with garbage ref format' '
+	cat >expect <<-EOF &&
+	fatal: unknown ref storage format ${SQ}garbage${SQ}
+	EOF
+	test_must_fail git clone --ref-format=garbage --mirror src ref-storage 2>err &&
+	test_cmp expect err &&
+	test_path_is_missing ref-storage
+'
+
 test_expect_success 'clone to destination with trailing /' '
 
 	git clone src target-1/ &&
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 12/12] t9500: write "extensions.refstorage" into config
From: Patrick Steinhardt @ 2023-12-28  9:58 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano
In-Reply-To: <cover.1703753910.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 1312 bytes --]

In t9500 we're writing a custom configuration that sets up gitweb. This
requires us to manually ensure that the repository format is configured
as required, including both the repository format version and
extensions. With the introduction of the "extensions.refStorage"
extension we need to update the test to also write this new one.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t9500-gitweb-standalone-no-errors.sh | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/t/t9500-gitweb-standalone-no-errors.sh b/t/t9500-gitweb-standalone-no-errors.sh
index 0333065d4d..7679780fb8 100755
--- a/t/t9500-gitweb-standalone-no-errors.sh
+++ b/t/t9500-gitweb-standalone-no-errors.sh
@@ -627,6 +627,7 @@ test_expect_success \
 test_expect_success 'setup' '
 	version=$(git config core.repositoryformatversion) &&
 	algo=$(test_might_fail git config extensions.objectformat) &&
+	refstorage=$(test_might_fail git config extensions.refstorage) &&
 	cat >.git/config <<-\EOF &&
 	# testing noval and alternate separator
 	[gitweb]
@@ -637,6 +638,10 @@ test_expect_success 'setup' '
 	if test -n "$algo"
 	then
 		git config extensions.objectformat "$algo"
+	fi &&
+	if test -n "$refstorage"
+	then
+		git config extensions.refstorage "$refstorage"
 	fi
 '
 
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH 0/6] worktree: initialize refdb via ref backends
From: Patrick Steinhardt @ 2023-12-28  9:59 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 4262 bytes --]

Hi,

when initializing worktrees we manually create the on-disk data
structures required for the ref backend in "worktree.c". This works just
fine right now where we only have a single user-exposed ref backend, but
it will become unwieldy once we have multiple ref backends. This patch
series thus refactors how we initialize worktrees so that we can use
`refs_init_db()` to initialize required files for us.

This patch series conflicts with ps/refstorage-extension. The conflict
can be solved as shown below. I'm happy to defer this patch series
though until the topic has landed on `master` in case this causes
issues.

Patrick

diff --cc setup.c
index f2d55994e2,4712bba6f8..0000000000
--- a/setup.c
+++ b/setup.c
@@@ -1904,7 -1926,23 +1926,8 @@@ void create_reference_database(int ref_
  	struct strbuf err = STRBUF_INIT;
  	int reinit = is_reinit();
  
 -	/*
 -	 * We need to create a "refs" dir in any case so that older versions of
 -	 * Git can tell that this is a repository. This serves two main purposes:
 -	 *
 -	 * - Clients will know to stop walking the parent-directory chain when
 -	 *   detecting the Git repository. Otherwise they may end up detecting
 -	 *   a Git repository in a parent directory instead.
 -	 *
 -	 * - Instead of failing to detect a repository with unknown reference
 -	 *   format altogether, old clients will print an error saying that
 -	 *   they do not understand the reference format extension.
 -	 */
 -	safe_create_dir(git_path("refs"), 1);
 -	adjust_shared_perm(git_path("refs"));
 -
+ 	repo_set_ref_storage_format(the_repository, ref_storage_format);
 -	if (refs_init_db(&err))
 +	if (refs_init_db(get_main_ref_store(the_repository), 0, &err))
  		die("failed to set up refs db: %s", err.buf);
  
  	/*
diff --cc worktree.c
index 085f2cc41a,9702ed0308..0000000000
--- a/worktree.c
+++ b/worktree.c
@@@ -79,7 -75,8 +80,8 @@@ static struct worktree *get_main_worktr
  	return worktree;
  }
  
- struct worktree *get_linked_worktree(const char *id)
 -static struct worktree *get_linked_worktree(const char *id,
 -					    int skip_reading_head)
++struct worktree *get_linked_worktree(const char *id,
++				     int skip_reading_head)
  {
  	struct worktree *worktree = NULL;
  	struct strbuf path = STRBUF_INIT;
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 9d935bee84..558c5537f5 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -502,7 +502,7 @@ static int add_worktree(const char *path, const char *refname,
 	/*
 	 * Set up the ref store of the worktree and create the HEAD reference.
 	 */
-	wt = get_linked_worktree(name);
+	wt = get_linked_worktree(name, 1);
 	if (!wt) {
 		ret = error(_("could not find created worktree '%s'"), name);
 		goto done;
diff --git a/worktree.h b/worktree.h
index 8a75691eac..f14784a2ff 100644
--- a/worktree.h
+++ b/worktree.h
@@ -61,7 +61,8 @@ struct worktree *find_worktree(struct worktree **list,
  * Look up the worktree corresponding to `id`, or NULL of no such worktree
  * exists.
  */
-struct worktree *get_linked_worktree(const char *id);
+struct worktree *get_linked_worktree(const char *id,
+				     int skip_reading_head);
 
 /*
  * Return the worktree corresponding to `path`, or NULL if no such worktree

Patrick Steinhardt (6):
  refs: prepare `refs_init_db()` for initializing worktree refs
  setup: move creation of "refs/" into the files backend
  refs/files: skip creation of "refs/{heads,tags}" for worktrees
  builtin/worktree: move setup of commondir file earlier
  worktree: expose interface to look up worktree by name
  builtin/worktree: create refdb via ref backend

 builtin/worktree.c    | 53 ++++++++++++++++++++-----------------------
 refs.c                |  6 ++---
 refs.h                |  4 +++-
 refs/debug.c          |  4 ++--
 refs/files-backend.c  | 37 +++++++++++++++++++++++++-----
 refs/packed-backend.c |  1 +
 refs/refs-internal.h  |  4 +++-
 setup.c               | 17 +-------------
 worktree.c            | 25 ++++++++++++--------
 worktree.h            | 11 +++++++++
 10 files changed, 94 insertions(+), 68 deletions(-)


base-commit: e79552d19784ee7f4bbce278fe25f93fbda196fa
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH 1/6] refs: prepare `refs_init_db()` for initializing worktree refs
From: Patrick Steinhardt @ 2023-12-28  9:59 UTC (permalink / raw)
  To: git
In-Reply-To: <cover.1703754513.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 4448 bytes --]

The purpose of `refs_init_db()` is to initialize the on-disk files of a
new ref database. The function is quite inflexible right now though, as
callers can neither specify the `struct ref_store` nor can they pass any
flags.

Refactor the interface to accept both of these. This will be required so
that we can start initializing per-worktree ref databases via the ref
backend instead of open-coding the initialization in "worktree.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                | 6 ++----
 refs.h                | 2 +-
 refs/debug.c          | 4 ++--
 refs/files-backend.c  | 4 +++-
 refs/packed-backend.c | 1 +
 refs/refs-internal.h  | 4 +++-
 setup.c               | 2 +-
 7 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/refs.c b/refs.c
index 16bfa21df7..87e2659db7 100644
--- a/refs.c
+++ b/refs.c
@@ -1928,11 +1928,9 @@ const char *refs_resolve_ref_unsafe(struct ref_store *refs,
 }
 
 /* backend functions */
-int refs_init_db(struct strbuf *err)
+int refs_init_db(struct ref_store *refs, int flags, struct strbuf *err)
 {
-	struct ref_store *refs = get_main_ref_store(the_repository);
-
-	return refs->be->init_db(refs, err);
+	return refs->be->init_db(refs, flags, err);
 }
 
 const char *resolve_ref_unsafe(const char *refname, int resolve_flags,
diff --git a/refs.h b/refs.h
index ff113bb12a..6090e92578 100644
--- a/refs.h
+++ b/refs.h
@@ -123,7 +123,7 @@ int should_autocreate_reflog(const char *refname);
 
 int is_branch(const char *refname);
 
-int refs_init_db(struct strbuf *err);
+int refs_init_db(struct ref_store *refs, int flags, struct strbuf *err);
 
 /*
  * Return the peeled value of the oid currently being iterated via
diff --git a/refs/debug.c b/refs/debug.c
index 83b7a0ba65..c061def8b5 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -33,10 +33,10 @@ struct ref_store *maybe_debug_wrap_ref_store(const char *gitdir, struct ref_stor
 	return (struct ref_store *)res;
 }
 
-static int debug_init_db(struct ref_store *refs, struct strbuf *err)
+static int debug_init_db(struct ref_store *refs, int flags, struct strbuf *err)
 {
 	struct debug_ref_store *drefs = (struct debug_ref_store *)refs;
-	int res = drefs->refs->be->init_db(drefs->refs, err);
+	int res = drefs->refs->be->init_db(drefs->refs, flags, err);
 	trace_printf_key(&trace_refs, "init_db: %d\n", res);
 	return res;
 }
diff --git a/refs/files-backend.c b/refs/files-backend.c
index ad8b1d143f..387eeb5037 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3220,7 +3220,9 @@ static int files_reflog_expire(struct ref_store *ref_store,
 	return -1;
 }
 
-static int files_init_db(struct ref_store *ref_store, struct strbuf *err UNUSED)
+static int files_init_db(struct ref_store *ref_store,
+			 int flags UNUSED,
+			 struct strbuf *err UNUSED)
 {
 	struct files_ref_store *refs =
 		files_downcast(ref_store, REF_STORE_WRITE, "init_db");
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index b9fa097a29..3f10bccf44 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1246,6 +1246,7 @@ static const char PACKED_REFS_HEADER[] =
 	"# pack-refs with: peeled fully-peeled sorted \n";
 
 static int packed_init_db(struct ref_store *ref_store UNUSED,
+			  int flags UNUSED,
 			  struct strbuf *err UNUSED)
 {
 	/* Nothing to do. */
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 4af83bf9a5..a33b28c19d 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -529,7 +529,9 @@ typedef struct ref_store *ref_store_init_fn(struct repository *repo,
 					    const char *gitdir,
 					    unsigned int flags);
 
-typedef int ref_init_db_fn(struct ref_store *refs, struct strbuf *err);
+typedef int ref_init_db_fn(struct ref_store *refs,
+			   int flags,
+			   struct strbuf *err);
 
 typedef int ref_transaction_prepare_fn(struct ref_store *refs,
 				       struct ref_transaction *transaction,
diff --git a/setup.c b/setup.c
index bc90bbd033..a4eb2a38ac 100644
--- a/setup.c
+++ b/setup.c
@@ -1919,7 +1919,7 @@ void create_reference_database(const char *initial_branch, int quiet)
 	safe_create_dir(git_path("refs"), 1);
 	adjust_shared_perm(git_path("refs"));
 
-	if (refs_init_db(&err))
+	if (refs_init_db(get_main_ref_store(the_repository), 0, &err))
 		die("failed to set up refs db: %s", err.buf);
 
 	/*
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH 2/6] setup: move creation of "refs/" into the files backend
From: Patrick Steinhardt @ 2023-12-28  9:59 UTC (permalink / raw)
  To: git
In-Reply-To: <cover.1703754513.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 3400 bytes --]

When creating the ref database we unconditionally create the "refs/"
directory in "setup.c". This is a mandatory prerequisite for all Git
repositories regardless of the ref backend in use, because Git will be
unable to detect the directory as a repository if "refs/" doesn't exist.

We are about to add another new caller that will want to create a ref
database when creating worktrees. We would require the same logic to
create the "refs/" directory even though the caller really should not
care about such low-level details. Ideally, the ref database should be
fully initialized after calling `refs_init_db()`.

Move the code to create the directory into the files backend itself to
make it so. This means that future ref backends will also need to have
equivalent logic around to ensure that the directory exists, but it
seems a lot more sensible to have it this way round than to require
callers to create the directory themselves.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs/files-backend.c | 17 +++++++++++++++++
 setup.c              | 15 ---------------
 2 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 387eeb5037..ed47c5dc08 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3228,9 +3228,26 @@ static int files_init_db(struct ref_store *ref_store,
 		files_downcast(ref_store, REF_STORE_WRITE, "init_db");
 	struct strbuf sb = STRBUF_INIT;
 
+	/*
+	 * We need to create a "refs" dir in any case so that older versions of
+	 * Git can tell that this is a repository. This serves two main purposes:
+	 *
+	 * - Clients will know to stop walking the parent-directory chain when
+	 *   detecting the Git repository. Otherwise they may end up detecting
+	 *   a Git repository in a parent directory instead.
+	 *
+	 * - Instead of failing to detect a repository with unknown reference
+	 *   format altogether, old clients will print an error saying that
+	 *   they do not understand the reference format extension.
+	 */
+	strbuf_addf(&sb, "%s/refs", ref_store->gitdir);
+	safe_create_dir(sb.buf, 1);
+	adjust_shared_perm(sb.buf);
+
 	/*
 	 * Create .git/refs/{heads,tags}
 	 */
+	strbuf_reset(&sb);
 	files_ref_path(refs, &sb, "refs/heads");
 	safe_create_dir(sb.buf, 1);
 
diff --git a/setup.c b/setup.c
index a4eb2a38ac..f2d55994e2 100644
--- a/setup.c
+++ b/setup.c
@@ -1904,21 +1904,6 @@ void create_reference_database(const char *initial_branch, int quiet)
 	struct strbuf err = STRBUF_INIT;
 	int reinit = is_reinit();
 
-	/*
-	 * We need to create a "refs" dir in any case so that older versions of
-	 * Git can tell that this is a repository. This serves two main purposes:
-	 *
-	 * - Clients will know to stop walking the parent-directory chain when
-	 *   detecting the Git repository. Otherwise they may end up detecting
-	 *   a Git repository in a parent directory instead.
-	 *
-	 * - Instead of failing to detect a repository with unknown reference
-	 *   format altogether, old clients will print an error saying that
-	 *   they do not understand the reference format extension.
-	 */
-	safe_create_dir(git_path("refs"), 1);
-	adjust_shared_perm(git_path("refs"));
-
 	if (refs_init_db(get_main_ref_store(the_repository), 0, &err))
 		die("failed to set up refs db: %s", err.buf);
 
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH 3/6] refs/files: skip creation of "refs/{heads,tags}" for worktrees
From: Patrick Steinhardt @ 2023-12-28 10:00 UTC (permalink / raw)
  To: git
In-Reply-To: <cover.1703754513.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 2384 bytes --]

The files ref backend will create both "refs/heads" and "refs/tags" in
the Git directory. While this logic makes sense for normal repositories,
it does not fo worktrees because those refs are "common" refs that would
always be contained in the main repository's ref database.

Introduce a new flag telling the backend that it is expected to create a
per-worktree ref database and skip creation of these dirs in the files
backend when the flag is set. No other backends (currently) need
worktree-specific logic, so this is the only required change to start
creating per-worktree ref databases via `refs_init_db()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.h               |  2 ++
 refs/files-backend.c | 22 ++++++++++++++--------
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/refs.h b/refs.h
index 6090e92578..8ed890841b 100644
--- a/refs.h
+++ b/refs.h
@@ -123,6 +123,8 @@ int should_autocreate_reflog(const char *refname);
 
 int is_branch(const char *refname);
 
+#define REFS_INIT_DB_IS_WORKTREE (1 << 0)
+
 int refs_init_db(struct ref_store *refs, int flags, struct strbuf *err);
 
 /*
diff --git a/refs/files-backend.c b/refs/files-backend.c
index ed47c5dc08..a82d6c453f 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3221,7 +3221,7 @@ static int files_reflog_expire(struct ref_store *ref_store,
 }
 
 static int files_init_db(struct ref_store *ref_store,
-			 int flags UNUSED,
+			 int flags,
 			 struct strbuf *err UNUSED)
 {
 	struct files_ref_store *refs =
@@ -3245,15 +3245,21 @@ static int files_init_db(struct ref_store *ref_store,
 	adjust_shared_perm(sb.buf);
 
 	/*
-	 * Create .git/refs/{heads,tags}
+	 * There is no need to create directories for common refs when creating
+	 * a worktree ref store.
 	 */
-	strbuf_reset(&sb);
-	files_ref_path(refs, &sb, "refs/heads");
-	safe_create_dir(sb.buf, 1);
+	if (!(flags & REFS_INIT_DB_IS_WORKTREE)) {
+		/*
+		 * Create .git/refs/{heads,tags}
+		 */
+		strbuf_reset(&sb);
+		files_ref_path(refs, &sb, "refs/heads");
+		safe_create_dir(sb.buf, 1);
 
-	strbuf_reset(&sb);
-	files_ref_path(refs, &sb, "refs/tags");
-	safe_create_dir(sb.buf, 1);
+		strbuf_reset(&sb);
+		files_ref_path(refs, &sb, "refs/tags");
+		safe_create_dir(sb.buf, 1);
+	}
 
 	strbuf_release(&sb);
 	return 0;
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox