git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/9] refs: ref storage format migrations
@ 2024-05-23  8:25 Patrick Steinhardt
  2024-05-23  8:25 ` [PATCH 1/9] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
                   ` (13 more replies)
  0 siblings, 14 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-23  8:25 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 5689 bytes --]

Hi,

this patch series implements migration of ref storage formats such that
it is for example possible to migrate a repository with "files" format
to the "reftable" format. The migration logic comes in the form of a new
`git refs migrate` subcommand. As mentioned in [1], I do have plans to
extend the new git-refs(1) over time.

In the current form, the migration logic has three major limitations:

  - It does not work with repositories that have worktrees as we have to
    migrate multiple ref stores here, one for every worktree. I wanted
    to avoid making this series too complex right from the start.

  - It does not migrate reflogs, because we have no interfaces to do so.
    I want to eventually address this by adding log-only updates to
    transactions.

  - It is not safe with concurrent writers. This is the limitation that
    is most critical in my eyes. The root cause here is that it is
    inherently impossible to lock the "files" backend for writes. I have
    been thinking about this issue a lot and have not found any solution
    that works. There are partial solutions here:

      - Create a central lockfile for the "files" backend -- if there,
        the backend will refuse to write. If that lock needs to be
        acquired during the "commit" phase of transaction though then we
        would essentially start to sequentialize all writes in the
        "files" backend, which is a non-starter. If not, then processes
        which are running already may not have seen it, and thus the
        issue still exists.

     - Pack all loose refs, remove "refs/" and create a "refs.lock"
       file. This isn't safe though because root refs can still be
       written.

     - Create a log of concurrent writes and apply that to the migrated
       refs once done. This is a lot of complexity, and it's unclear
       whether it even solves the issue with already-running writers.

     - Create a temporary "extensions.refMigration" extension that is
       unhandled by Git. New processes will refuse to run in such a
       repo and thus cannot write to it. Again, unsafe with running
       writers.

    Another alternative is that we could just make this a best effort.
    The "reftable" backend supports locking, and for the "files" backend
    we could just lock "HEAD" and call it a day. I'm not sure whether it
    is preferable though to have a "partial" solution compared to having
    none at all, as it may cause users to be less mindful. That's why I
    decided to just have no solution at all and document the limitation
    accordingly.

    If anybody has ideas here I'd be very happy to hear them.

Anyway. The current state of this patch series is sufficient to migrate
repos without reflogs and worktrees, and thus mostly applies to bare
repositories, only. This is somewhat intentional -- as a server operator
this is of course our primary usecase at GitLab. We do plan to also
upstream support for writing reflogs though, but in a later step. We do
not plan to implement support for migrating repositories with worktrees,
but I'd be happy to help out with the effort in case somebody else wants
to.

The series is built on top of 4365c6fcf9 (The sixth batch, 2024-05-20).
It pulls in the following two dependencies:

  - ps/refs-without-the-repository-updates at 00892786b8 (refs/packed:
    remove references to `the_hash_algo`, 2024-05-17). This is mostly to
    avoid conflicts.

  - ps/pseudo-ref-terminology at 8e4f5c2dc2 (refs: refuse to write
    pseudorefs, 2024-05-15). This is a functional prerequisite because
    the migration logic relies on the new definition of pseudorefs.

There are two minor textual conflicts when merged to "next" or "seen".
One is a conflicting added header in "refs/reftable-backend.c", and one
is a conflict with added functions in "refs.c". Both of these conflicts
are trivial to solve by accepting both sides.

Thanks!

Patrick

[1]: https://lore.kernel.org/git/ZkNJaaKTTKbns8wo@tanuki/#t

Patrick Steinhardt (9):
  setup: unset ref storage when reinitializing repository version
  refs: convert ref storage format to an enum
  refs: pass storage format to `ref_store_init()` explicitly
  refs: allow to skip creation of reflog entries
  refs/files: refactor `add_pseudoref_and_head_entries()`
  refs/files: extract function to iterate through root refs
  refs: implement removal of ref storages
  refs: implement logic to migrate between ref storage formats
  builtin/refs: new command to migrate ref storage formats

 .gitignore                 |   1 +
 Documentation/git-refs.txt |  59 +++++++
 Makefile                   |   1 +
 builtin.h                  |   1 +
 builtin/clone.c            |   2 +-
 builtin/init-db.c          |   2 +-
 builtin/refs.c             |  75 +++++++++
 git.c                      |   1 +
 refs.c                     | 319 +++++++++++++++++++++++++++++++++++--
 refs.h                     |  41 ++++-
 refs/files-backend.c       | 121 ++++++++++++--
 refs/packed-backend.c      |  15 ++
 refs/refs-internal.h       |   7 +
 refs/reftable-backend.c    |  37 ++++-
 repository.c               |   3 +-
 repository.h               |  10 +-
 setup.c                    |  10 +-
 setup.h                    |   9 +-
 t/helper/test-ref-store.c  |   1 +
 t/t1460-refs-migrate.sh    | 243 ++++++++++++++++++++++++++++
 20 files changed, 913 insertions(+), 45 deletions(-)
 create mode 100644 Documentation/git-refs.txt
 create mode 100644 builtin/refs.c
 create mode 100755 t/t1460-refs-migrate.sh

-- 
2.45.1.216.g4365c6fcf9.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 103+ messages in thread

end of thread, other threads:[~2024-06-08 19:06 UTC | newest]

Thread overview: 103+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-23  8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
2024-05-23  8:25 ` [PATCH 1/9] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
2024-05-23  8:25 ` [PATCH 2/9] refs: convert ref storage format to an enum Patrick Steinhardt
2024-05-23  8:25 ` [PATCH 3/9] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
2024-05-23  8:25 ` [PATCH 4/9] refs: allow to skip creation of reflog entries Patrick Steinhardt
2024-05-23  8:25 ` [PATCH 5/9] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
2024-05-23  8:25 ` [PATCH 6/9] refs/files: extract function to iterate through root refs Patrick Steinhardt
2024-05-23  8:25 ` [PATCH 7/9] refs: implement removal of ref storages Patrick Steinhardt
2024-05-23  8:25 ` [PATCH 8/9] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
2024-05-23 17:31   ` Eric Sunshine
2024-05-24  7:35     ` Patrick Steinhardt
2024-05-24  9:01       ` Eric Sunshine
2024-05-23  8:25 ` [PATCH 9/9] builtin/refs: new command to migrate " Patrick Steinhardt
2024-05-23 17:40   ` Eric Sunshine
2024-05-24  7:35     ` Patrick Steinhardt
2024-05-23 16:09 ` [PATCH 0/9] refs: ref storage format migrations Junio C Hamano
2024-05-24  7:33   ` Patrick Steinhardt
2024-05-24 16:28     ` Junio C Hamano
2024-05-28  5:13       ` Patrick Steinhardt
2024-05-28 16:16         ` Junio C Hamano
2024-05-24 10:14 ` [PATCH v2 " Patrick Steinhardt
2024-05-24 10:14   ` [PATCH v2 1/9] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
2024-05-24 21:33     ` Justin Tobler
2024-05-28  5:13       ` Patrick Steinhardt
2024-05-24 10:14   ` [PATCH v2 2/9] refs: convert ref storage format to an enum Patrick Steinhardt
2024-05-24 10:14   ` [PATCH v2 3/9] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
2024-05-24 10:14   ` [PATCH v2 4/9] refs: allow to skip creation of reflog entries Patrick Steinhardt
2024-05-24 10:14   ` [PATCH v2 5/9] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
2024-05-24 10:14   ` [PATCH v2 6/9] refs/files: extract function to iterate through root refs Patrick Steinhardt
2024-05-24 10:15   ` [PATCH v2 7/9] refs: implement removal of ref storages Patrick Steinhardt
2024-05-24 10:15   ` [PATCH v2 8/9] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
2024-05-24 22:32     ` Justin Tobler
2024-05-28  5:14       ` Patrick Steinhardt
2024-05-24 10:15   ` [PATCH v2 9/9] builtin/refs: new command to migrate " Patrick Steinhardt
2024-05-24 18:24     ` Ramsay Jones
2024-05-24 19:29       ` Eric Sunshine
2024-05-28  5:14         ` Patrick Steinhardt
2024-05-28  6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
2024-05-28  6:31   ` [PATCH v3 01/12] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
2024-05-28  6:31   ` [PATCH v3 02/12] refs: convert ref storage format to an enum Patrick Steinhardt
2024-05-28  6:31   ` [PATCH v3 03/12] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
2024-05-28  6:31   ` [PATCH v3 04/12] refs: allow to skip creation of reflog entries Patrick Steinhardt
2024-05-28  6:31   ` [PATCH v3 05/12] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
2024-05-28  6:31   ` [PATCH v3 06/12] refs/files: extract function to iterate through root refs Patrick Steinhardt
2024-05-28  6:31   ` [PATCH v3 07/12] refs/files: fix NULL pointer deref when releasing ref store Patrick Steinhardt
2024-05-28  6:31   ` [PATCH v3 08/12] reftable: inline `merged_table_release()` Patrick Steinhardt
2024-05-28  6:31   ` [PATCH v3 09/12] worktree: don't store main worktree twice Patrick Steinhardt
2024-05-28  6:31   ` [PATCH v3 10/12] refs: implement removal of ref storages Patrick Steinhardt
2024-05-28  6:32   ` [PATCH v3 11/12] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
2024-05-28  6:32   ` [PATCH v3 12/12] builtin/refs: new command to migrate " Patrick Steinhardt
2024-05-31 23:46     ` Junio C Hamano
2024-06-02  1:03       ` Junio C Hamano
2024-06-03  7:37         ` Patrick Steinhardt
2024-05-28 18:16   ` [PATCH v3 00/12] refs: ref storage format migrations Junio C Hamano
2024-05-28 18:26     ` Junio C Hamano
2024-06-03  9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
2024-06-03  9:30   ` [PATCH v4 01/12] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
2024-06-03  9:30   ` [PATCH v4 02/12] refs: convert ref storage format to an enum Patrick Steinhardt
2024-06-03  9:30   ` [PATCH v4 03/12] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
2024-06-04  8:23     ` Karthik Nayak
2024-06-03  9:30   ` [PATCH v4 04/12] refs: allow to skip creation of reflog entries Patrick Steinhardt
2024-06-04 11:04     ` Karthik Nayak
2024-06-03  9:30   ` [PATCH v4 05/12] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
2024-06-03  9:30   ` [PATCH v4 06/12] refs/files: extract function to iterate through root refs Patrick Steinhardt
2024-06-05 10:07     ` Jeff King
2024-06-06  4:50       ` Patrick Steinhardt
2024-06-06  5:15         ` Patrick Steinhardt
2024-06-06  6:32           ` Patrick Steinhardt
2024-06-03  9:30   ` [PATCH v4 07/12] refs/files: fix NULL pointer deref when releasing ref store Patrick Steinhardt
2024-06-03  9:30   ` [PATCH v4 08/12] reftable: inline `merged_table_release()` Patrick Steinhardt
2024-06-03  9:30   ` [PATCH v4 09/12] worktree: don't store main worktree twice Patrick Steinhardt
2024-06-03  9:30   ` [PATCH v4 10/12] refs: implement removal of ref storages Patrick Steinhardt
2024-06-04 11:17     ` Karthik Nayak
2024-06-05 10:12     ` Jeff King
2024-06-05 16:54       ` Junio C Hamano
2024-06-06  4:51       ` Patrick Steinhardt
2024-06-03  9:31   ` [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
2024-06-04 15:28     ` Karthik Nayak
2024-06-05  5:52       ` Patrick Steinhardt
2024-06-05 10:03     ` Jeff King
2024-06-05 16:59       ` Junio C Hamano
2024-06-06  4:51         ` Patrick Steinhardt
2024-06-06  7:01           ` Jeff King
2024-06-06 15:41             ` Junio C Hamano
2024-06-08 11:36               ` Jeff King
2024-06-08 19:06                 ` Junio C Hamano
2024-06-06  4:51       ` Patrick Steinhardt
2024-06-03  9:31   ` [PATCH v4 12/12] builtin/refs: new command to migrate " Patrick Steinhardt
2024-06-06  5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
2024-06-06  5:28   ` [PATCH v5 01/12] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
2024-06-06  5:29   ` [PATCH v5 02/12] refs: convert ref storage format to an enum Patrick Steinhardt
2024-06-06  5:29   ` [PATCH v5 03/12] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
2024-06-06  5:29   ` [PATCH v5 04/12] refs: allow to skip creation of reflog entries Patrick Steinhardt
2024-06-06  5:29   ` [PATCH v5 05/12] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
2024-06-06  5:29   ` [PATCH v5 06/12] refs/files: extract function to iterate through root refs Patrick Steinhardt
2024-06-06  5:29   ` [PATCH v5 07/12] refs/files: fix NULL pointer deref when releasing ref store Patrick Steinhardt
2024-06-06  5:29   ` [PATCH v5 08/12] reftable: inline `merged_table_release()` Patrick Steinhardt
2024-06-06  5:29   ` [PATCH v5 09/12] worktree: don't store main worktree twice Patrick Steinhardt
2024-06-06  5:29   ` [PATCH v5 10/12] refs: implement removal of ref storages Patrick Steinhardt
2024-06-06  5:29   ` [PATCH v5 11/12] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
2024-06-06  5:29   ` [PATCH v5 12/12] builtin/refs: new command to migrate " Patrick Steinhardt
2024-06-06  7:06   ` [PATCH v5 00/12] refs: ref storage migrations Jeff King
2024-06-06 16:18   ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).