* [PATCH 1/9] setup: unset ref storage when reinitializing repository version
2024-05-23 8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
@ 2024-05-23 8:25 ` Patrick Steinhardt
2024-05-23 8:25 ` [PATCH 2/9] refs: convert ref storage format to an enum Patrick Steinhardt
` (12 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-23 8:25 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 1259 bytes --]
When reinitializing a repository's version we may end up unsetting the
hash algorithm when it matches the default hash algorithm. If we didn't
do that then the previously configured value might remain intact.
While the same issue exists for the ref storage extension, we don't do
this here. This has been fine for most of the part because it is not
supported to re-initialize a repository with a different ref storage
format anyway. We're about to introduce a new command to migrate ref
storages though, so this is about to become an issue there.
Prepare for this and unset the ref storage format when reinitializing a
repoistory with the "files" format.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
setup.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/setup.c b/setup.c
index 7975230ffb..8c84ec9d4b 100644
--- a/setup.c
+++ b/setup.c
@@ -2028,6 +2028,8 @@ void initialize_repository_version(int hash_algo,
if (ref_storage_format != REF_STORAGE_FORMAT_FILES)
git_config_set("extensions.refstorage",
ref_storage_format_to_name(ref_storage_format));
+ else if (reinit)
+ git_config_set_gently("extensions.refstorage", NULL);
}
static int is_reinit(void)
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH 2/9] refs: convert ref storage format to an enum
2024-05-23 8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
2024-05-23 8:25 ` [PATCH 1/9] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
@ 2024-05-23 8:25 ` Patrick Steinhardt
2024-05-23 8:25 ` [PATCH 3/9] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
` (11 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-23 8:25 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 8337 bytes --]
The ref storage format is tracked as a simple unsigned integer, which
makes it harder than necessary to discover what that integer actually is
or where its values are defined.
Convert the ref storage format to instead be an enum.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/clone.c | 2 +-
builtin/init-db.c | 2 +-
refs.c | 7 ++++---
refs.h | 10 ++++++++--
repository.c | 3 ++-
repository.h | 10 ++++------
setup.c | 8 ++++----
setup.h | 9 +++++----
8 files changed, 29 insertions(+), 22 deletions(-)
diff --git a/builtin/clone.c b/builtin/clone.c
index 1e07524c53..e808e02017 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -970,7 +970,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
int submodule_progress;
int filter_submodules = 0;
int hash_algo;
- unsigned int ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
+ enum ref_storage_format ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
const int do_not_override_repo_unix_permissions = -1;
const char *template_dir;
char *template_dir_dup = NULL;
diff --git a/builtin/init-db.c b/builtin/init-db.c
index 0170469b84..582dcf20f8 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -81,7 +81,7 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
const char *ref_format = NULL;
const char *initial_branch = NULL;
int hash_algo = GIT_HASH_UNKNOWN;
- unsigned int ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
+ enum ref_storage_format ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
int init_shared_repository = -1;
const struct option init_db_options[] = {
OPT_STRING(0, "template", &template_dir, N_("template-directory"),
diff --git a/refs.c b/refs.c
index 31032588e0..e6db85a165 100644
--- a/refs.c
+++ b/refs.c
@@ -37,14 +37,15 @@ static const struct ref_storage_be *refs_backends[] = {
[REF_STORAGE_FORMAT_REFTABLE] = &refs_be_reftable,
};
-static const struct ref_storage_be *find_ref_storage_backend(unsigned int ref_storage_format)
+static const struct ref_storage_be *find_ref_storage_backend(
+ enum ref_storage_format ref_storage_format)
{
if (ref_storage_format < ARRAY_SIZE(refs_backends))
return refs_backends[ref_storage_format];
return NULL;
}
-unsigned int ref_storage_format_by_name(const char *name)
+enum ref_storage_format ref_storage_format_by_name(const char *name)
{
for (unsigned int i = 0; i < ARRAY_SIZE(refs_backends); i++)
if (refs_backends[i] && !strcmp(refs_backends[i]->name, name))
@@ -52,7 +53,7 @@ unsigned int ref_storage_format_by_name(const char *name)
return REF_STORAGE_FORMAT_UNKNOWN;
}
-const char *ref_storage_format_to_name(unsigned int ref_storage_format)
+const char *ref_storage_format_to_name(enum ref_storage_format ref_storage_format)
{
const struct ref_storage_be *be = find_ref_storage_backend(ref_storage_format);
if (!be)
diff --git a/refs.h b/refs.h
index fe7f0db35e..a7afa9bede 100644
--- a/refs.h
+++ b/refs.h
@@ -11,8 +11,14 @@ struct string_list;
struct string_list_item;
struct worktree;
-unsigned int ref_storage_format_by_name(const char *name);
-const char *ref_storage_format_to_name(unsigned int ref_storage_format);
+enum ref_storage_format {
+ REF_STORAGE_FORMAT_UNKNOWN,
+ REF_STORAGE_FORMAT_FILES,
+ REF_STORAGE_FORMAT_REFTABLE,
+};
+
+enum ref_storage_format ref_storage_format_by_name(const char *name);
+const char *ref_storage_format_to_name(enum ref_storage_format ref_storage_format);
/*
* Resolve a reference, recursively following symbolic refererences.
diff --git a/repository.c b/repository.c
index d29b0304fb..166863f852 100644
--- a/repository.c
+++ b/repository.c
@@ -124,7 +124,8 @@ void repo_set_compat_hash_algo(struct repository *repo, int algo)
repo_read_loose_object_map(repo);
}
-void repo_set_ref_storage_format(struct repository *repo, unsigned int format)
+void repo_set_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format)
{
repo->ref_storage_format = format;
}
diff --git a/repository.h b/repository.h
index 4bd8969005..a35cd77c35 100644
--- a/repository.h
+++ b/repository.h
@@ -1,6 +1,7 @@
#ifndef REPOSITORY_H
#define REPOSITORY_H
+#include "refs.h"
#include "strmap.h"
struct config_set;
@@ -26,10 +27,6 @@ enum fetch_negotiation_setting {
FETCH_NEGOTIATION_NOOP,
};
-#define REF_STORAGE_FORMAT_UNKNOWN 0
-#define REF_STORAGE_FORMAT_FILES 1
-#define REF_STORAGE_FORMAT_REFTABLE 2
-
struct repo_settings {
int initialized;
@@ -181,7 +178,7 @@ struct repository {
const struct git_hash_algo *compat_hash_algo;
/* Repository's reference storage format, as serialized on disk. */
- unsigned int ref_storage_format;
+ enum ref_storage_format ref_storage_format;
/* A unique-id for tracing purposes. */
int trace2_repo_id;
@@ -220,7 +217,8 @@ void repo_set_gitdir(struct repository *repo, const char *root,
void repo_set_worktree(struct repository *repo, const char *path);
void repo_set_hash_algo(struct repository *repo, int algo);
void repo_set_compat_hash_algo(struct repository *repo, int compat_algo);
-void repo_set_ref_storage_format(struct repository *repo, unsigned int format);
+void repo_set_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format);
void initialize_repository(struct repository *repo);
RESULT_MUST_BE_USED
int repo_init(struct repository *r, const char *gitdir, const char *worktree);
diff --git a/setup.c b/setup.c
index 8c84ec9d4b..b49ee3e95f 100644
--- a/setup.c
+++ b/setup.c
@@ -1997,7 +1997,7 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree)
}
void initialize_repository_version(int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
int reinit)
{
char repo_version_string[10];
@@ -2044,7 +2044,7 @@ static int is_reinit(void)
return ret;
}
-void create_reference_database(unsigned int ref_storage_format,
+void create_reference_database(enum ref_storage_format ref_storage_format,
const char *initial_branch, int quiet)
{
struct strbuf err = STRBUF_INIT;
@@ -2243,7 +2243,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
}
static void validate_ref_storage_format(struct repository_format *repo_fmt,
- unsigned int format)
+ enum ref_storage_format format)
{
const char *name = getenv("GIT_DEFAULT_REF_FORMAT");
@@ -2263,7 +2263,7 @@ static void validate_ref_storage_format(struct repository_format *repo_fmt,
int init_db(const char *git_dir, const char *real_git_dir,
const char *template_dir, int hash,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
const char *initial_branch,
int init_shared_repository, unsigned int flags)
{
diff --git a/setup.h b/setup.h
index b3fd3bf45a..cd8dbc2497 100644
--- a/setup.h
+++ b/setup.h
@@ -1,6 +1,7 @@
#ifndef SETUP_H
#define SETUP_H
+#include "refs.h"
#include "string-list.h"
int is_inside_git_dir(void);
@@ -128,7 +129,7 @@ struct repository_format {
int is_bare;
int hash_algo;
int compat_hash_algo;
- unsigned int ref_storage_format;
+ enum ref_storage_format ref_storage_format;
int sparse_index;
char *work_tree;
struct string_list unknown_extensions;
@@ -192,13 +193,13 @@ const char *get_template_dir(const char *option_template);
int init_db(const char *git_dir, const char *real_git_dir,
const char *template_dir, int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
const char *initial_branch, int init_shared_repository,
unsigned int flags);
void initialize_repository_version(int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
int reinit);
-void create_reference_database(unsigned int ref_storage_format,
+void create_reference_database(enum ref_storage_format ref_storage_format,
const char *initial_branch, int quiet);
/*
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH 3/9] refs: pass storage format to `ref_store_init()` explicitly
2024-05-23 8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
2024-05-23 8:25 ` [PATCH 1/9] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
2024-05-23 8:25 ` [PATCH 2/9] refs: convert ref storage format to an enum Patrick Steinhardt
@ 2024-05-23 8:25 ` Patrick Steinhardt
2024-05-23 8:25 ` [PATCH 4/9] refs: allow to skip creation of reflog entries Patrick Steinhardt
` (10 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-23 8:25 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 2646 bytes --]
We're about to introduce logic to migrate refs from one storage format
to another one. This will require us to initialize a ref store with a
different format than the one used by the passed-in repository.
Prepare for this by accepting the desired ref storage format as
parameter.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/refs.c b/refs.c
index e6db85a165..7c3f4df457 100644
--- a/refs.c
+++ b/refs.c
@@ -1894,13 +1894,14 @@ static struct ref_store *lookup_ref_store_map(struct strmap *map,
* gitdir.
*/
static struct ref_store *ref_store_init(struct repository *repo,
+ enum ref_storage_format format,
const char *gitdir,
unsigned int flags)
{
const struct ref_storage_be *be;
struct ref_store *refs;
- be = find_ref_storage_backend(repo->ref_storage_format);
+ be = find_ref_storage_backend(format);
if (!be)
BUG("reference backend is unknown");
@@ -1922,7 +1923,8 @@ struct ref_store *get_main_ref_store(struct repository *r)
if (!r->gitdir)
BUG("attempting to get main_ref_store outside of repository");
- r->refs_private = ref_store_init(r, r->gitdir, REF_STORE_ALL_CAPS);
+ r->refs_private = ref_store_init(r, r->ref_storage_format,
+ r->gitdir, REF_STORE_ALL_CAPS);
r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
return r->refs_private;
}
@@ -1982,7 +1984,8 @@ struct ref_store *repo_get_submodule_ref_store(struct repository *repo,
free(subrepo);
goto done;
}
- refs = ref_store_init(subrepo, submodule_sb.buf,
+ refs = ref_store_init(subrepo, the_repository->ref_storage_format,
+ submodule_sb.buf,
REF_STORE_READ | REF_STORE_ODB);
register_ref_store_map(&repo->submodule_ref_stores, "submodule",
refs, submodule);
@@ -2011,12 +2014,12 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt)
struct strbuf common_path = STRBUF_INIT;
strbuf_git_common_path(&common_path, wt->repo,
"worktrees/%s", wt->id);
- refs = ref_store_init(wt->repo, common_path.buf,
- REF_STORE_ALL_CAPS);
+ refs = ref_store_init(wt->repo, wt->repo->ref_storage_format,
+ common_path.buf, REF_STORE_ALL_CAPS);
strbuf_release(&common_path);
} else {
- refs = ref_store_init(wt->repo, wt->repo->commondir,
- REF_STORE_ALL_CAPS);
+ refs = ref_store_init(wt->repo, the_repository->ref_storage_format,
+ wt->repo->commondir, REF_STORE_ALL_CAPS);
}
if (refs)
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH 4/9] refs: allow to skip creation of reflog entries
2024-05-23 8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
` (2 preceding siblings ...)
2024-05-23 8:25 ` [PATCH 3/9] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
@ 2024-05-23 8:25 ` Patrick Steinhardt
2024-05-23 8:25 ` [PATCH 5/9] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
` (9 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-23 8:25 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 3845 bytes --]
The ref backends do not have any way to disable the creation of reflog
entries. This will be required for upcoming ref format migration logic
so that we do not create any entries that didn't exist in the original
ref database.
Provide a new `REF_SKIP_CREATE_REFLOG` flag that allows the caller to
disable reflog entry creation.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 6 ++++++
refs.h | 8 +++++++-
refs/files-backend.c | 4 ++++
refs/reftable-backend.c | 3 ++-
t/helper/test-ref-store.c | 1 +
5 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/refs.c b/refs.c
index 7c3f4df457..66e9585767 100644
--- a/refs.c
+++ b/refs.c
@@ -1194,6 +1194,12 @@ int ref_transaction_update(struct ref_transaction *transaction,
{
assert(err);
+ if ((flags & REF_FORCE_CREATE_REFLOG) &&
+ (flags & REF_SKIP_CREATE_REFLOG)) {
+ strbuf_addstr(err, _("refusing to force and skip creation of reflog"));
+ return -1;
+ }
+
if (!(flags & REF_SKIP_REFNAME_VERIFICATION) &&
((new_oid && !is_null_oid(new_oid)) ?
check_refname_format(refname, REFNAME_ALLOW_ONELEVEL) :
diff --git a/refs.h b/refs.h
index a7afa9bede..50a2b3ab09 100644
--- a/refs.h
+++ b/refs.h
@@ -659,13 +659,19 @@ struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
*/
#define REF_SKIP_REFNAME_VERIFICATION (1 << 11)
+/*
+ * Skip creation of a reflog entry, even if it would have otherwise been
+ * created.
+ */
+#define REF_SKIP_CREATE_REFLOG (1 << 12)
+
/*
* Bitmask of all of the flags that are allowed to be passed in to
* ref_transaction_update() and friends:
*/
#define REF_TRANSACTION_UPDATE_ALLOWED_FLAGS \
(REF_NO_DEREF | REF_FORCE_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION | \
- REF_SKIP_REFNAME_VERIFICATION)
+ REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG)
/*
* Add a reference update to transaction. `new_oid` is the value that
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 73380d7e99..bd0d63bcba 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1750,6 +1750,9 @@ static int files_log_ref_write(struct files_ref_store *refs,
{
int logfd, result;
+ if (flags & REF_SKIP_CREATE_REFLOG)
+ return 0;
+
if (log_all_ref_updates == LOG_REFS_UNSET)
log_all_ref_updates = is_bare_repository() ? LOG_REFS_NONE : LOG_REFS_NORMAL;
@@ -2251,6 +2254,7 @@ static int split_head_update(struct ref_update *update,
struct ref_update *new_update;
if ((update->flags & REF_LOG_ONLY) ||
+ (update->flags & REF_SKIP_CREATE_REFLOG) ||
(update->flags & REF_IS_PRUNING) ||
(update->flags & REF_UPDATE_VIA_HEAD))
return 0;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index f6edfdf5b3..bffed9257f 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1103,7 +1103,8 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data
if (ret)
goto done;
- } else if (u->flags & REF_HAVE_NEW &&
+ } else if (!(u->flags & REF_SKIP_CREATE_REFLOG) &&
+ (u->flags & REF_HAVE_NEW) &&
(u->flags & REF_FORCE_CREATE_REFLOG ||
should_write_log(&arg->refs->base, u->refname))) {
struct reftable_log_record *log;
diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index c9efd74c2b..ad24300170 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -126,6 +126,7 @@ static struct flag_definition transaction_flags[] = {
FLAG_DEF(REF_FORCE_CREATE_REFLOG),
FLAG_DEF(REF_SKIP_OID_VERIFICATION),
FLAG_DEF(REF_SKIP_REFNAME_VERIFICATION),
+ FLAG_DEF(REF_SKIP_CREATE_REFLOG),
{ NULL, 0 }
};
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH 5/9] refs/files: refactor `add_pseudoref_and_head_entries()`
2024-05-23 8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
` (3 preceding siblings ...)
2024-05-23 8:25 ` [PATCH 4/9] refs: allow to skip creation of reflog entries Patrick Steinhardt
@ 2024-05-23 8:25 ` Patrick Steinhardt
2024-05-23 8:25 ` [PATCH 6/9] refs/files: extract function to iterate through root refs Patrick Steinhardt
` (8 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-23 8:25 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 1937 bytes --]
The `add_pseudoref_and_head_entries()` function accepts both the ref
store as well as a directory name as input. This is unnecessary though
as the ref store already uniquely identifies the root directory of the
ref store anyway.
Furthermore, the function is misnamed now that we have clarified the
meaning of pseudorefs as it doesn't add pseudorefs, but root refs.
Rename it accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs/files-backend.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index bd0d63bcba..b4e5437ffe 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -324,16 +324,14 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
}
/*
- * Add pseudorefs to the ref dir by parsing the directory for any files
- * which follow the pseudoref syntax.
+ * Add root refs to the ref dir by parsing the directory for any files which
+ * follow the root ref syntax.
*/
-static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
- struct ref_dir *dir,
- const char *dirname)
+static void add_root_refs(struct files_ref_store *refs,
+ struct ref_dir *dir)
{
- struct files_ref_store *refs =
- files_downcast(ref_store, REF_STORE_READ, "fill_ref_dir");
struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
+ const char *dirname = refs->loose->root->name;
struct dirent *de;
size_t dirnamelen;
DIR *d;
@@ -388,8 +386,7 @@ static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs,
dir = get_ref_dir(refs->loose->root);
if (flags & DO_FOR_EACH_INCLUDE_ROOT_REFS)
- add_pseudoref_and_head_entries(dir->cache->ref_store, dir,
- refs->loose->root->name);
+ add_root_refs(refs, dir);
/*
* Add an incomplete entry for "refs/" (to be filled
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH 6/9] refs/files: extract function to iterate through root refs
2024-05-23 8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
` (4 preceding siblings ...)
2024-05-23 8:25 ` [PATCH 5/9] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
@ 2024-05-23 8:25 ` Patrick Steinhardt
2024-05-23 8:25 ` [PATCH 7/9] refs: implement removal of ref storages Patrick Steinhardt
` (7 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-23 8:25 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 2782 bytes --]
Extract a new function that can be used to iterate through all root refs
known to the "files" backend. This will be used in the next commit,
where we start to teach ref backends to remove themselves.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs/files-backend.c | 49 ++++++++++++++++++++++++++++++++++++--------
1 file changed, 40 insertions(+), 9 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b4e5437ffe..b7268b26c8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -323,17 +323,15 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
add_per_worktree_entries_to_dir(dir, dirname);
}
-/*
- * Add root refs to the ref dir by parsing the directory for any files which
- * follow the root ref syntax.
- */
-static void add_root_refs(struct files_ref_store *refs,
- struct ref_dir *dir)
+static int for_each_root_ref(struct files_ref_store *refs,
+ int (*cb)(const char *refname, void *cb_data),
+ void *cb_data)
{
struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
const char *dirname = refs->loose->root->name;
struct dirent *de;
size_t dirnamelen;
+ int ret;
DIR *d;
files_ref_path(refs, &path, dirname);
@@ -341,7 +339,7 @@ static void add_root_refs(struct files_ref_store *refs,
d = opendir(path.buf);
if (!d) {
strbuf_release(&path);
- return;
+ return -1;
}
strbuf_addstr(&refname, dirname);
@@ -357,14 +355,47 @@ static void add_root_refs(struct files_ref_store *refs,
strbuf_addstr(&refname, de->d_name);
dtype = get_dtype(de, &path, 1);
- if (dtype == DT_REG && is_root_ref(de->d_name))
- loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
+ if (dtype == DT_REG && is_root_ref(de->d_name)) {
+ ret = cb(refname.buf, cb_data);
+ if (ret)
+ goto done;
+ }
strbuf_setlen(&refname, dirnamelen);
}
+
+done:
strbuf_release(&refname);
strbuf_release(&path);
closedir(d);
+ return ret;
+}
+
+struct fill_root_ref_data {
+ struct files_ref_store *refs;
+ struct ref_dir *dir;
+};
+
+static int fill_root_ref(const char *refname, void *cb_data)
+{
+ struct fill_root_ref_data *data = cb_data;
+ loose_fill_ref_dir_regular_file(data->refs, refname, data->dir);
+ return 0;
+}
+
+/*
+ * Add root refs to the ref dir by parsing the directory for any files which
+ * follow the root ref syntax.
+ */
+static void add_root_refs(struct files_ref_store *refs,
+ struct ref_dir *dir)
+{
+ struct fill_root_ref_data data = {
+ .refs = refs,
+ .dir = dir,
+ };
+
+ for_each_root_ref(refs, fill_root_ref, &data);
}
static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs,
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH 7/9] refs: implement removal of ref storages
2024-05-23 8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
` (5 preceding siblings ...)
2024-05-23 8:25 ` [PATCH 6/9] refs/files: extract function to iterate through root refs Patrick Steinhardt
@ 2024-05-23 8:25 ` Patrick Steinhardt
2024-05-23 8:25 ` [PATCH 8/9] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
` (6 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-23 8:25 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 7882 bytes --]
We're about to introduce logic to migrate ref storages. One part of the
migration will be to delete the files that are part of the old ref
storage format. We don't yet have a way to delete such data generically
across ref backends though.
Implement a new `delete` callback and expose it via a new
`ref_storage_delete()` function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 5 ++++
refs.h | 5 ++++
refs/files-backend.c | 61 +++++++++++++++++++++++++++++++++++++++++
refs/packed-backend.c | 15 ++++++++++
refs/refs-internal.h | 7 +++++
refs/reftable-backend.c | 34 +++++++++++++++++++++++
6 files changed, 127 insertions(+)
diff --git a/refs.c b/refs.c
index 66e9585767..9b112b0527 100644
--- a/refs.c
+++ b/refs.c
@@ -1861,6 +1861,11 @@ int ref_store_create_on_disk(struct ref_store *refs, int flags, struct strbuf *e
return refs->be->create_on_disk(refs, flags, err);
}
+int ref_store_remove_on_disk(struct ref_store *refs, struct strbuf *err)
+{
+ return refs->be->remove_on_disk(refs, err);
+}
+
int repo_resolve_gitlink_ref(struct repository *r,
const char *submodule, const char *refname,
struct object_id *oid)
diff --git a/refs.h b/refs.h
index 50a2b3ab09..61ee7b7a15 100644
--- a/refs.h
+++ b/refs.h
@@ -129,6 +129,11 @@ int ref_store_create_on_disk(struct ref_store *refs, int flags, struct strbuf *e
*/
void ref_store_release(struct ref_store *ref_store);
+/*
+ * Remove the ref store from disk. This deletes all associated data.
+ */
+int ref_store_remove_on_disk(struct ref_store *refs, struct strbuf *err);
+
/*
* Return the peeled value of the oid currently being iterated via
* for_each_ref(), etc. This is equivalent to calling:
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b7268b26c8..8b74518022 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3340,11 +3340,72 @@ static int files_ref_store_create_on_disk(struct ref_store *ref_store,
return 0;
}
+struct remove_one_root_ref_data {
+ const char *gitdir;
+ struct strbuf *err;
+};
+
+static int remove_one_root_ref(const char *refname,
+ void *cb_data)
+{
+ struct remove_one_root_ref_data *data = cb_data;
+ struct strbuf buf = STRBUF_INIT;
+ int ret = 0;
+
+ strbuf_addf(&buf, "%s/%s", data->gitdir, refname);
+
+ ret = remove_path(buf.buf);
+ if (ret < 0)
+ strbuf_addf(data->err, "could not delete %s: %s\n",
+ refname, strerror(errno));
+
+ strbuf_release(&buf);
+ return ret;
+}
+
+static int files_ref_store_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct files_ref_store *refs =
+ files_downcast(ref_store, REF_STORE_WRITE, "remove");
+ struct remove_one_root_ref_data data = {
+ .gitdir = refs->base.gitdir,
+ .err = err,
+ };
+ struct strbuf sb = STRBUF_INIT;
+ int ret = 0;
+
+ strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addstr(err, "could not delete refs");
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/logs", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addstr(err, "could not delete logs\n");
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ ret = for_each_root_ref(refs, remove_one_root_ref, &data);
+ if (ret < 0)
+ ret = -1;
+
+ if (ref_store_remove_on_disk(refs->packed_ref_store, err) < 0)
+ ret = -1;
+
+ strbuf_release(&sb);
+ return ret;
+}
+
struct ref_storage_be refs_be_files = {
.name = "files",
.init = files_ref_store_init,
.release = files_ref_store_release,
.create_on_disk = files_ref_store_create_on_disk,
+ .remove_on_disk = files_ref_store_remove_on_disk,
.transaction_prepare = files_transaction_prepare,
.transaction_finish = files_transaction_finish,
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 2789fd92f5..c4c1e36aa2 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1,5 +1,6 @@
#include "../git-compat-util.h"
#include "../config.h"
+#include "../dir.h"
#include "../gettext.h"
#include "../hash.h"
#include "../hex.h"
@@ -1266,6 +1267,19 @@ static int packed_ref_store_create_on_disk(struct ref_store *ref_store UNUSED,
return 0;
}
+static int packed_ref_store_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct packed_ref_store *refs = packed_downcast(ref_store, 0, "remove");
+
+ if (remove_path(refs->path) < 0) {
+ strbuf_addstr(err, "could not delete packed-refs");
+ return -1;
+ }
+
+ return 0;
+}
+
/*
* Write the packed refs from the current snapshot to the packed-refs
* tempfile, incorporating any changes from `updates`. `updates` must
@@ -1724,6 +1738,7 @@ struct ref_storage_be refs_be_packed = {
.init = packed_ref_store_init,
.release = packed_ref_store_release,
.create_on_disk = packed_ref_store_create_on_disk,
+ .remove_on_disk = packed_ref_store_remove_on_disk,
.transaction_prepare = packed_transaction_prepare,
.transaction_finish = packed_transaction_finish,
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 33749fbd83..cbcb6f9c36 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -517,6 +517,12 @@ typedef int ref_store_create_on_disk_fn(struct ref_store *refs,
int flags,
struct strbuf *err);
+/*
+ * Remove the reference store from disk.
+ */
+typedef int ref_store_remove_on_disk_fn(struct ref_store *refs,
+ struct strbuf *err);
+
typedef int ref_transaction_prepare_fn(struct ref_store *refs,
struct ref_transaction *transaction,
struct strbuf *err);
@@ -649,6 +655,7 @@ struct ref_storage_be {
ref_store_init_fn *init;
ref_store_release_fn *release;
ref_store_create_on_disk_fn *create_on_disk;
+ ref_store_remove_on_disk_fn *remove_on_disk;
ref_transaction_prepare_fn *transaction_prepare;
ref_transaction_finish_fn *transaction_finish;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index bffed9257f..62992a67ee 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1,6 +1,7 @@
#include "../git-compat-util.h"
#include "../abspath.h"
#include "../chdir-notify.h"
+#include "../dir.h"
#include "../environment.h"
#include "../gettext.h"
#include "../hash.h"
@@ -343,6 +344,38 @@ static int reftable_be_create_on_disk(struct ref_store *ref_store,
return 0;
}
+static int reftable_be_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct reftable_ref_store *refs =
+ reftable_be_downcast(ref_store, REF_STORE_WRITE, "remove");
+ struct strbuf sb = STRBUF_INIT;
+ int ret = 0;
+
+ strbuf_addf(&sb, "%s/reftable", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addstr(err, "could not delete reftables");
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/HEAD", refs->base.gitdir);
+ if (remove_path(sb.buf) < 0) {
+ strbuf_addstr(err, "could not delete stub HEAD");
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/refs/heads", refs->base.gitdir);
+ if (remove_path(sb.buf) < 0) {
+ strbuf_addstr(err, "could not delete stub heads");
+ ret = -1;
+ }
+
+ strbuf_release(&sb);
+ return ret;
+}
+
struct reftable_ref_iterator {
struct ref_iterator base;
struct reftable_ref_store *refs;
@@ -2196,6 +2229,7 @@ struct ref_storage_be refs_be_reftable = {
.init = reftable_be_init,
.release = reftable_be_release,
.create_on_disk = reftable_be_create_on_disk,
+ .remove_on_disk = reftable_be_remove_on_disk,
.transaction_prepare = reftable_be_transaction_prepare,
.transaction_finish = reftable_be_transaction_finish,
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH 8/9] refs: implement logic to migrate between ref storage formats
2024-05-23 8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
` (6 preceding siblings ...)
2024-05-23 8:25 ` [PATCH 7/9] refs: implement removal of ref storages Patrick Steinhardt
@ 2024-05-23 8:25 ` Patrick Steinhardt
2024-05-23 17:31 ` Eric Sunshine
2024-05-23 8:25 ` [PATCH 9/9] builtin/refs: new command to migrate " Patrick Steinhardt
` (5 subsequent siblings)
13 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-23 8:25 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 11145 bytes --]
With the introduction of the new "reftable" backend, users may want to
migrate repositories between the backends without having to recreate the
whole repository. Add the logic to do so.
The implementation is generic and works with arbitrary ref storage
formats because we only use. It does have a few limitations though:
- We do not migrate repositories with worktrees, because worktrees
have separate ref storages. It makes the overall affair more complex
if we have to migrate multiple storages at once.
- We do not migrate reflogs, because we have no interfaces to write
many reflog entries.
- We do not lock the repository for concurrent access, and thus
concurrent writes may make use end up with weird in-between states.
There is no way to fully lock the "files" backend for writes due to
its format, and thus we punt on this topic altogether and defer to
the user to avoid those from happening.
In other words, this version is a minimum viable product for migrating a
repository's ref storage format. It works alright for bare repos, which
typically have neither worktrees nor reflogs. But it will not work for
many other repositories without some preparations. These limitations are
not set into stone though, and ideally we will eventually address them
over time.
The logic is not yet used by anything, and thus there are no tests for
it. Those will be added in the next commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 284 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
refs.h | 18 ++++
2 files changed, 302 insertions(+)
diff --git a/refs.c b/refs.c
index 9b112b0527..4ef2dae5e1 100644
--- a/refs.c
+++ b/refs.c
@@ -2570,3 +2570,287 @@ int ref_update_check_old_target(const char *referent, struct ref_update *update,
referent, update->old_target);
return -1;
}
+
+struct migration_data {
+ struct ref_store *old_refs;
+ struct ref_store *new_refs;
+ struct ref_transaction *transaction;
+ struct strbuf *errbuf;
+ const char *refname;
+};
+
+static int migrate_one_ref(const char *refname, const struct object_id *oid,
+ int flags, void *cb_data)
+{
+ struct migration_data *data = cb_data;
+ struct strbuf symref_target = STRBUF_INIT;
+ int ret;
+
+ if (flags & REF_ISSYMREF) {
+ ret = refs_read_symbolic_ref(data->old_refs, refname, &symref_target);
+ if (ret < 0)
+ goto done;
+
+ ret = ref_transaction_update(data->transaction, refname, NULL, null_oid(),
+ symref_target.buf, NULL,
+ REF_SKIP_CREATE_REFLOG | REF_NO_DEREF, NULL, data->errbuf);
+ if (ret < 0)
+ goto done;
+ } else {
+ ret = ref_transaction_create(data->transaction, refname, oid,
+ REF_SKIP_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION,
+ NULL, data->errbuf);
+ if (ret < 0)
+ goto done;
+ }
+
+done:
+ strbuf_release(&symref_target);
+ return ret;
+}
+
+static int move_files(const char *from_path, const char *to_path, struct strbuf *errbuf)
+{
+ struct strbuf from_buf = STRBUF_INIT, to_buf = STRBUF_INIT;
+ size_t from_len, to_len;
+ DIR *from_dir;
+ int ret;
+
+ from_dir = opendir(from_path);
+ if (!from_dir) {
+ strbuf_addf(errbuf, "could not open source directory: '%s'", from_path);
+ ret = -1;
+ goto done;
+ }
+
+ strbuf_addstr(&from_buf, from_path);
+ strbuf_complete(&from_buf, '/');
+ from_len = from_buf.len;
+
+ strbuf_addstr(&to_buf, to_path);
+ strbuf_complete(&to_buf, '/');
+ to_len = to_buf.len;
+
+ while (1) {
+ struct dirent *ent;
+
+ errno = 0;
+ ent = readdir(from_dir);
+ if (!ent)
+ break;
+
+ if (!strcmp(ent->d_name, ".") ||
+ !strcmp(ent->d_name, ".."))
+ continue;
+
+ strbuf_setlen(&from_buf, from_len);
+ strbuf_addstr(&from_buf, ent->d_name);
+
+ strbuf_setlen(&to_buf, to_len);
+ strbuf_addstr(&to_buf, ent->d_name);
+
+ ret = rename(from_buf.buf, to_buf.buf);
+ if (ret < 0) {
+ strbuf_addf(errbuf, "could not link file '%s' to '%s': %s",
+ from_buf.buf, to_buf.buf, strerror(errno));
+ goto done;
+ }
+ }
+
+ if (errno) {
+ strbuf_addf(errbuf, "could not read entry from directory '%s': %s",
+ from_path, strerror(errno));
+ ret = -1;
+ goto done;
+ }
+
+ ret = 0;
+
+done:
+ strbuf_release(&from_buf);
+ strbuf_release(&to_buf);
+ closedir(from_dir);
+ return ret;
+}
+
+static int count_reflogs(const char *reflog, void *payload)
+{
+ size_t *reflog_count = payload;
+ (*reflog_count)++;
+ return 0;
+}
+
+static int has_worktrees(void)
+{
+ struct worktree **worktrees = get_worktrees();
+ int ret = 0;
+ size_t i;
+
+ for (i = 0; worktrees[i]; i++) {
+ if (is_main_worktree(worktrees[i]))
+ continue;
+ ret = 1;
+ }
+
+ free_worktrees(worktrees);
+ return ret;
+}
+
+int repo_migrate_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format,
+ unsigned int flags,
+ struct strbuf *errbuf)
+{
+ struct ref_store *old_refs = NULL, *new_refs = NULL;
+ struct ref_transaction *transaction = NULL;
+ struct strbuf buf = STRBUF_INIT;
+ struct migration_data data;
+ size_t reflog_count = 0;
+ char *new_gitdir;
+ int ret;
+
+ old_refs = get_main_ref_store(repo);
+
+ /*
+ * The overall logic looks like this:
+ *
+ * 1. Set up a new temporary directory and initialize it with the new
+ * format. This is where all refs will be migrated into.
+ *
+ * 2. Enumerate all refs and write them into the new ref storage.
+ * This operation is safe as we do not yet modify the main
+ * repository.
+ *
+ * 3. If we're in dry-run mode then we are done and can hand over the
+ * directory to the caller for inspection. If not, we now start
+ * with the destructive part.
+ *
+ * 4. Delete the old ref storage from disk. As we have a copy of refs
+ * in the new ref storage it's okay(ish) if we now get interrupted
+ * as there is an equivalent copy of all refs available.
+ *
+ * 5. Move the new ref storage files into place.
+ *
+ * 6. Change the repository format to the new ref format.
+ */
+ strbuf_addf(&buf, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
+ new_gitdir = mkdtemp(buf.buf);
+ if (!new_gitdir) {
+ strbuf_addf(errbuf, "cannot create migration directory: %s",
+ strerror(errno));
+ ret = -1;
+ goto done;
+ }
+
+ if (refs_for_each_reflog(old_refs, count_reflogs, &reflog_count) < 0) {
+ strbuf_addstr(errbuf, "cannot count reflogs");
+ ret = -1;
+ goto done;
+ }
+ if (reflog_count) {
+ strbuf_addstr(errbuf, "migrating reflogs is not supported yet");
+ ret = -1;
+ goto done;
+ }
+
+ /*
+ * TODO: we should really be passing the caller-provided repository to
+ * `has_worktrees()`, but our worktree subsystem doesn't yet support
+ * that.
+ */
+ if (has_worktrees()) {
+ strbuf_addstr(errbuf, "migrating repositories with worktrees is not supported yet");
+ ret = -1;
+ goto done;
+ }
+
+ new_refs = ref_store_init(repo, format, new_gitdir,
+ REF_STORE_ALL_CAPS);
+ ret = ref_store_create_on_disk(new_refs, 0, errbuf);
+ if (ret < 0)
+ goto done;
+
+ transaction = ref_store_transaction_begin(new_refs, errbuf);
+ if (!transaction)
+ goto done;
+
+ data.old_refs = old_refs;
+ data.new_refs = new_refs;
+ data.transaction = transaction;
+ data.errbuf = errbuf;
+
+ /*
+ * We need to use the internal `do_for_each_ref()` here so that we can
+ * also include broken refs and symrefs. These would otherwise be
+ * skipped silently.
+ *
+ * Ideally, we would do this call while locking the old ref storage
+ * such that there cannot be any concurrent modifications. We do not
+ * have the infra for that though, and the "files" backend does not
+ * allow for a central lock due to its design. It's thus on the user to
+ * ensure that there are no concurrent writes.
+ */
+ ret = do_for_each_ref(old_refs, "", NULL, migrate_one_ref, 0,
+ DO_FOR_EACH_INCLUDE_ROOT_REFS | DO_FOR_EACH_INCLUDE_BROKEN,
+ &data);
+ if (ret < 0)
+ goto done;
+
+ /*
+ * TODO: we might want to migrate to `initial_ref_transaction_commit()`
+ * here, which is more efficient for the files backend because it would
+ * write new refs into the packed-refs file directly. At this point,
+ * the files backend doesn't handle pseudo-refs and symrefs correctly
+ * though, so this requires some more work.
+ */
+ ret = ref_transaction_commit(transaction, errbuf);
+ if (ret < 0)
+ goto done;
+
+ if (flags & REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN) {
+ printf(_("Finished dry-run migration of refs, "
+ "the result can be found at '%s'\n"), new_gitdir);
+ ret = 0;
+ goto done;
+ }
+
+ /*
+ * Until now we were in the non-destructive phase, where we only
+ * populated the new ref store. From hereon though we are about
+ * to get hands by deleting the old ref store and then moving
+ * the new one into place.
+ *
+ * Assuming that there were no concurrent writes, the new ref
+ * store should have all information. So if we fail from hereon
+ * we may be in an in-between state, but it would still be able
+ * to recover by manually moving remaining files from the
+ * temporary migration directory into place.
+ */
+ ret = ref_store_remove_on_disk(old_refs, errbuf);
+ if (ret < 0)
+ goto done;
+
+ ret = move_files(new_gitdir, old_refs->gitdir, errbuf);
+ if (ret < 0)
+ goto done;
+ rmdir(new_gitdir);
+
+ /*
+ * We have migrated the repository, so we now need to adjust the
+ * repository format so that clients will use the new ref store.
+ * We also need to swap out the repository's main ref store.
+ */
+ initialize_repository_version(hash_algo_by_ptr(repo->hash_algo), format, 1);
+
+ repo->refs_private = new_refs;
+ ref_store_release(old_refs);
+
+ ret = 0;
+
+done:
+ if (ret && new_refs)
+ ref_store_release(new_refs);
+ ref_transaction_free(transaction);
+ strbuf_release(&buf);
+ return ret;
+}
diff --git a/refs.h b/refs.h
index 61ee7b7a15..76d25df4de 100644
--- a/refs.h
+++ b/refs.h
@@ -1070,6 +1070,24 @@ int is_root_ref(const char *refname);
*/
int is_pseudo_ref(const char *refname);
+/*
+ * The following flags can be passed to `repo_migrate_ref_storage_format()`:
+ *
+ * - REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN: perform a dry-run migration
+ * without touching the main repository. The result will be written into a
+ * temporary ref storage directory.
+ */
+#define REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN (1 << 0)
+
+/*
+ * Migrate the ref storage format used by the repository to the
+ * specified one.
+ */
+int repo_migrate_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format,
+ unsigned int flags,
+ struct strbuf *err);
+
/*
* The following functions have been removed in Git v2.45 in favor of functions
* that receive a `ref_store` as parameter. The intent of this section is
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH 8/9] refs: implement logic to migrate between ref storage formats
2024-05-23 8:25 ` [PATCH 8/9] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
@ 2024-05-23 17:31 ` Eric Sunshine
2024-05-24 7:35 ` Patrick Steinhardt
0 siblings, 1 reply; 103+ messages in thread
From: Eric Sunshine @ 2024-05-23 17:31 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Thu, May 23, 2024 at 4:26 AM Patrick Steinhardt <ps@pks.im> wrote:
> With the introduction of the new "reftable" backend, users may want to
> migrate repositories between the backends without having to recreate the
> whole repository. Add the logic to do so.
>
> The implementation is generic and works with arbitrary ref storage
> formats because we only use.
ECANNOTPARSE: This sentence seems to be broken grammatically.
> It does have a few limitations though:
>
> - We do not migrate repositories with worktrees, because worktrees
> have separate ref storages. It makes the overall affair more complex
> if we have to migrate multiple storages at once.
>
> - We do not migrate reflogs, because we have no interfaces to write
> many reflog entries.
>
> - We do not lock the repository for concurrent access, and thus
> concurrent writes may make use end up with weird in-between states.
> There is no way to fully lock the "files" backend for writes due to
> its format, and thus we punt on this topic altogether and defer to
> the user to avoid those from happening.
>
> In other words, this version is a minimum viable product for migrating a
> repository's ref storage format. It works alright for bare repos, which
> typically have neither worktrees nor reflogs.
Worktrees hanging off a bare repository is an explicitly supported
use-case, and there are people who use and promote such an
organization, so I'm not sure if "typically" is accurate these days.
Anyhow, just a minor observation, probably not worth rewording, and
certainly not worth a reroll.
> But it will not work for
> many other repositories without some preparations. These limitations are
> not set into stone though, and ideally we will eventually address them
> over time.
>
> The logic is not yet used by anything, and thus there are no tests for
> it. Those will be added in the next commit.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 8/9] refs: implement logic to migrate between ref storage formats
2024-05-23 17:31 ` Eric Sunshine
@ 2024-05-24 7:35 ` Patrick Steinhardt
2024-05-24 9:01 ` Eric Sunshine
0 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-24 7:35 UTC (permalink / raw)
To: Eric Sunshine; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 2067 bytes --]
On Thu, May 23, 2024 at 01:31:03PM -0400, Eric Sunshine wrote:
> On Thu, May 23, 2024 at 4:26 AM Patrick Steinhardt <ps@pks.im> wrote:
> > With the introduction of the new "reftable" backend, users may want to
> > migrate repositories between the backends without having to recreate the
> > whole repository. Add the logic to do so.
> >
> > The implementation is generic and works with arbitrary ref storage
> > formats because we only use.
>
> ECANNOTPARSE: This sentence seems to be broken grammatically.
Will fix.
> > It does have a few limitations though:
> >
> > - We do not migrate repositories with worktrees, because worktrees
> > have separate ref storages. It makes the overall affair more complex
> > if we have to migrate multiple storages at once.
> >
> > - We do not migrate reflogs, because we have no interfaces to write
> > many reflog entries.
> >
> > - We do not lock the repository for concurrent access, and thus
> > concurrent writes may make use end up with weird in-between states.
> > There is no way to fully lock the "files" backend for writes due to
> > its format, and thus we punt on this topic altogether and defer to
> > the user to avoid those from happening.
> >
> > In other words, this version is a minimum viable product for migrating a
> > repository's ref storage format. It works alright for bare repos, which
> > typically have neither worktrees nor reflogs.
>
> Worktrees hanging off a bare repository is an explicitly supported
> use-case, and there are people who use and promote such an
> organization, so I'm not sure if "typically" is accurate these days.
> Anyhow, just a minor observation, probably not worth rewording, and
> certainly not worth a reroll.
True enough. I would claim that most bare repositories out in the wild
do not have worktrees, mostly because they are used on the server side.
But in the end, quantity is rather irrelevant. I'll s/typically/often/
to relax the statement a bit. Does that work for you?
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 8/9] refs: implement logic to migrate between ref storage formats
2024-05-24 7:35 ` Patrick Steinhardt
@ 2024-05-24 9:01 ` Eric Sunshine
0 siblings, 0 replies; 103+ messages in thread
From: Eric Sunshine @ 2024-05-24 9:01 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Fri, May 24, 2024 at 3:35 AM Patrick Steinhardt <ps@pks.im> wrote:
> On Thu, May 23, 2024 at 01:31:03PM -0400, Eric Sunshine wrote:
> > On Thu, May 23, 2024 at 4:26 AM Patrick Steinhardt <ps@pks.im> wrote:
> > > In other words, this version is a minimum viable product for migrating a
> > > repository's ref storage format. It works alright for bare repos, which
> > > typically have neither worktrees nor reflogs.
> >
> > Worktrees hanging off a bare repository is an explicitly supported
> > use-case, and there are people who use and promote such an
> > organization, so I'm not sure if "typically" is accurate these days.
> > Anyhow, just a minor observation, probably not worth rewording, and
> > certainly not worth a reroll.
>
> True enough. I would claim that most bare repositories out in the wild
> do not have worktrees, mostly because they are used on the server side.
> But in the end, quantity is rather irrelevant. I'll s/typically/often/
> to relax the statement a bit. Does that work for you?
Yes, "often" works just fine.
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH 9/9] builtin/refs: new command to migrate ref storage formats
2024-05-23 8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
` (7 preceding siblings ...)
2024-05-23 8:25 ` [PATCH 8/9] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
@ 2024-05-23 8:25 ` Patrick Steinhardt
2024-05-23 17:40 ` Eric Sunshine
2024-05-23 16:09 ` [PATCH 0/9] refs: ref storage format migrations Junio C Hamano
` (4 subsequent siblings)
13 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-23 8:25 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 15417 bytes --]
Introduce a new command that allows the user to migrate a repository
between ref storage formats. This new command is implemented as part of
a new git-refs(1) executable. This is due to two reasons:
- There is no good place to put the migration logic in existing
commands. git-maintenance(1) felt unwieldy, and git-pack-refs(1) is
not the correct place to put it, either.
- I had it in my mind to create a new low-level command for accessing
refs for quite a while already. git-refs(1) is that command and can
over time grow more functionality relating to refs. This should help
discoverability by consolidating low-level access to refs into a
single executable.
As mentioned in the preceding commit that introduces the ref storage
format migration logic, the new `git refs migrate` command still has a
bunch of restrictions. These restrictions are documented accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
.gitignore | 1 +
Documentation/git-refs.txt | 59 +++++++++
Makefile | 1 +
builtin.h | 1 +
builtin/refs.c | 75 ++++++++++++
git.c | 1 +
t/t1460-refs-migrate.sh | 243 +++++++++++++++++++++++++++++++++++++
7 files changed, 381 insertions(+)
create mode 100644 Documentation/git-refs.txt
create mode 100644 builtin/refs.c
create mode 100755 t/t1460-refs-migrate.sh
diff --git a/.gitignore b/.gitignore
index 612c0f6a0f..8caf3700c2 100644
--- a/.gitignore
+++ b/.gitignore
@@ -126,6 +126,7 @@
/git-rebase
/git-receive-pack
/git-reflog
+/git-refs
/git-remote
/git-remote-http
/git-remote-https
diff --git a/Documentation/git-refs.txt b/Documentation/git-refs.txt
new file mode 100644
index 0000000000..53cb30d9fb
--- /dev/null
+++ b/Documentation/git-refs.txt
@@ -0,0 +1,59 @@
+git-refs(1)
+===========
+
+NAME
+----
+
+git-refs - Low-level access to refs
+
+SYNOPSIS
+--------
+
+[verse]
+'git refs migrate' --ref-format=<format> [--dry-run]
+
+DESCRIPTION
+-----------
+
+This command provides low-level access to refs.
+
+COMMANDS
+--------
+
+migrate::
+ Migrate ref store between different formats.
+
+OPTIONS
+-------
+
+The following options are specific to 'git refs migrate':
+
+--ref-format=<format>::
+ The ref format to migrate the ref store to. Can be one of:
++
+include::ref-storage-format.txt[]
+
+--dry-run::
+ Perform the migration, but do not modify the repository. The migrated
+ refs will be written into a separate directory that can be inspected
+ separately. This can be used to double check that the migration works
+ as expected before doing performing the actual migration.
+
+KNOWN LIMITATIONS
+-----------------
+
+The ref format migration has several known limitations in its current form:
+
+* It is not possible to migrate repositories that have reflogs.
+
+* It is not possible to migrate repositories that have worktrees.
+
+* There is no way to block concurrent writes to the repository during an
+ ongoing migration. Concurrent writes can lead to an inconsistent migrated
+ state. Users are expected to block writes on a higher level.
+
+These limitations may eventually be lifted.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index cf504963c2..2d702b552c 100644
--- a/Makefile
+++ b/Makefile
@@ -1283,6 +1283,7 @@ BUILTIN_OBJS += builtin/read-tree.o
BUILTIN_OBJS += builtin/rebase.o
BUILTIN_OBJS += builtin/receive-pack.o
BUILTIN_OBJS += builtin/reflog.o
+BUILTIN_OBJS += builtin/refs.o
BUILTIN_OBJS += builtin/remote-ext.o
BUILTIN_OBJS += builtin/remote-fd.o
BUILTIN_OBJS += builtin/remote.o
diff --git a/builtin.h b/builtin.h
index 28280636da..7eda9b2486 100644
--- a/builtin.h
+++ b/builtin.h
@@ -207,6 +207,7 @@ int cmd_rebase(int argc, const char **argv, const char *prefix);
int cmd_rebase__interactive(int argc, const char **argv, const char *prefix);
int cmd_receive_pack(int argc, const char **argv, const char *prefix);
int cmd_reflog(int argc, const char **argv, const char *prefix);
+int cmd_refs(int argc, const char **argv, const char *prefix);
int cmd_remote(int argc, const char **argv, const char *prefix);
int cmd_remote_ext(int argc, const char **argv, const char *prefix);
int cmd_remote_fd(int argc, const char **argv, const char *prefix);
diff --git a/builtin/refs.c b/builtin/refs.c
new file mode 100644
index 0000000000..02401afa4e
--- /dev/null
+++ b/builtin/refs.c
@@ -0,0 +1,75 @@
+#include "builtin.h"
+#include "parse-options.h"
+#include "refs.h"
+#include "repository.h"
+#include "strbuf.h"
+
+#define REFS_MIGRATE_USAGE \
+ N_("git refs migrate --ref-format=<format> [--dry-run]")
+
+static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
+{
+ const char * const migrate_usage[] = {
+ REFS_MIGRATE_USAGE,
+ NULL
+ };
+ const char *format_str = NULL;
+ enum ref_storage_format format;
+ unsigned int flags = 0;
+ struct option options[] = {
+ OPT_STRING_F(0, "ref-format", &format_str, N_("format"),
+ N_("specify the reference format to convert to"),
+ PARSE_OPT_NONEG),
+ OPT_BIT(0, "dry-run", &flags,
+ N_("perform a non-destructive dry-run"),
+ REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN),
+ OPT_END(),
+ };
+ struct strbuf errbuf = STRBUF_INIT;
+ int err;
+
+ argc = parse_options(argc, argv, prefix, options, migrate_usage, 0);
+ if (argc)
+ usage(_("too many arguments"));
+ if (!format_str)
+ usage(_("missing --ref-format=<format>"));
+
+ format = ref_storage_format_by_name(format_str);
+ if (format == REF_STORAGE_FORMAT_UNKNOWN) {
+ err = error(_("unknown ref storage format '%s'"), format_str);
+ goto out;
+ }
+
+ if (the_repository->ref_storage_format == format) {
+ err = error(_("repository already uses '%s' format"),
+ ref_storage_format_to_name(format));
+ goto out;
+ }
+
+ if (repo_migrate_ref_storage_format(the_repository, format, flags, &errbuf) < 0) {
+ err = error("%s", errbuf.buf);
+ goto out;
+ }
+
+ err = 0;
+
+out:
+ strbuf_release(&errbuf);
+ return err;
+}
+
+int cmd_refs(int argc, const char **argv, const char *prefix)
+{
+ const char * const refs_usage[] = {
+ REFS_MIGRATE_USAGE,
+ NULL,
+ };
+ parse_opt_subcommand_fn *fn = NULL;
+ struct option opts[] = {
+ OPT_SUBCOMMAND("migrate", &fn, cmd_refs_migrate),
+ OPT_END(),
+ };
+
+ argc = parse_options(argc, argv, prefix, opts, refs_usage, 0);
+ return fn(argc, argv, prefix);
+}
diff --git a/git.c b/git.c
index 637c61ca9c..683bb69194 100644
--- a/git.c
+++ b/git.c
@@ -594,6 +594,7 @@ static struct cmd_struct commands[] = {
{ "rebase", cmd_rebase, RUN_SETUP | NEED_WORK_TREE },
{ "receive-pack", cmd_receive_pack },
{ "reflog", cmd_reflog, RUN_SETUP },
+ { "refs", cmd_refs, RUN_SETUP },
{ "remote", cmd_remote, RUN_SETUP },
{ "remote-ext", cmd_remote_ext, NO_PARSEOPT },
{ "remote-fd", cmd_remote_fd, NO_PARSEOPT },
diff --git a/t/t1460-refs-migrate.sh b/t/t1460-refs-migrate.sh
new file mode 100755
index 0000000000..f7c0783d30
--- /dev/null
+++ b/t/t1460-refs-migrate.sh
@@ -0,0 +1,243 @@
+#!/bin/sh
+
+test_description='migration of ref storage backends'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_migration () {
+ git -C "$1" for-each-ref --include-root-refs \
+ --format='%(refname) %(objectname) %(symref)' >expect &&
+ git -C "$1" refs migrate --ref-format="$2" &&
+ git -C "$1" for-each-ref --include-root-refs \
+ --format='%(refname) %(objectname) %(symref)' >actual &&
+ test_cmp expect actual &&
+
+ git -C "$1" rev-parse --show-ref-format >actual &&
+ echo "$2" >expect &&
+ test_cmp expect actual
+}
+
+test_expect_success 'setup' '
+ rm -rf .git &&
+ # The migration does not yet support reflogs.
+ git config --global core.logAllRefUpdates false
+'
+
+test_expect_success "superfluous arguments" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate foo 2>err &&
+ cat >expect <<-EOF &&
+ usage: too many arguments
+ EOF
+ test_cmp expect err
+'
+
+test_expect_success "missing ref storage format" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate 2>err &&
+ cat >expect <<-EOF &&
+ usage: missing --ref-format=<format>
+ EOF
+ test_cmp expect err
+'
+
+test_expect_success "unknown ref storage format" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=unknown 2>err &&
+ cat >expect <<-EOF &&
+ error: unknown ref storage format ${SQ}unknown${SQ}
+ EOF
+ test_cmp expect err
+'
+
+ref_formats="files reftable"
+for from_format in $ref_formats
+do
+ for to_format in $ref_formats
+ do
+ if test "$from_format" = "$to_format"
+ then
+ continue
+ fi
+
+ test_expect_success "$from_format: migration to same format fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$from_format 2>err &&
+ cat >expect <<-EOF &&
+ error: repository already uses ${SQ}$from_format${SQ} format
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: migration with reflog fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_config -C repo core.logAllRefUpdates true &&
+ test_commit -C repo logged &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$to_format 2>err &&
+ cat >expect <<-EOF &&
+ error: migrating reflogs is not supported yet
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: migration with worktree fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ git -C repo worktree add wt &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$to_format 2>err &&
+ cat >expect <<-EOF &&
+ error: migrating repositories with worktrees is not supported yet
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: unborn HEAD" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: single ref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: bare repository" '
+ test_when_finished "rm -rf repo repo.git" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git clone --ref-format=$from_format --mirror repo repo.git &&
+ test_migration repo.git "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: dangling symref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo symbolic-ref BROKEN_HEAD refs/heads/nonexistent &&
+ test_migration repo "$to_format" &&
+ echo refs/heads/nonexistent >expect &&
+ git -C repo symbolic-ref BROKEN_HEAD >actual &&
+ test_cmp expect actual
+ '
+
+ test_expect_success "$from_format -> $to_format: broken ref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ test-tool -C repo ref-store main update-ref "" refs/heads/broken \
+ "$(test_oid 001)" "$ZERO_OID" REF_SKIP_CREATE_REFLOG,REF_SKIP_OID_VERIFICATION &&
+ test_migration repo "$to_format" &&
+ test_oid 001 >expect &&
+ git -C repo rev-parse refs/heads/broken >actual &&
+ test_cmp expect actual
+ '
+
+ test_expect_success "$from_format -> $to_format: pseudo-refs" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo update-ref FOO_HEAD HEAD &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: special refs are left alone" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo rev-parse HEAD >repo/.git/MERGE_HEAD &&
+ git -C repo rev-parse MERGE_HEAD &&
+ test_migration repo "$to_format" &&
+ test_path_is_file repo/.git/MERGE_HEAD
+ '
+
+ test_expect_success "$from_format -> $to_format: a bunch of refs" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+
+ test_commit -C repo initial &&
+ cat >input <<-EOF &&
+ create FOO_HEAD HEAD
+ create refs/heads/branch-1 HEAD
+ create refs/heads/branch-2 HEAD
+ create refs/heads/branch-3 HEAD
+ create refs/heads/branch-4 HEAD
+ create refs/tags/tag-1 HEAD
+ create refs/tags/tag-2 HEAD
+ EOF
+ git -C repo update-ref --stdin <input &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: dry-run migration does not modify repository" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo refs migrate --dry-run \
+ --ref-format=$to_format >output &&
+ grep "Finished dry-run migration of refs" output &&
+ test_path_is_dir repo/.git/ref_migration.* &&
+ echo $from_format >expect &&
+ git -C repo rev-parse --show-ref-format >actual &&
+ test_cmp expect actual
+ '
+ done
+done
+
+test_expect_success 'migrating from files format deletes backend files' '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=files repo &&
+ test_commit -C repo first &&
+ git -C repo pack-refs --all &&
+ test_commit -C repo second &&
+ git -C repo update-ref ORIG_HEAD HEAD &&
+ git -C repo rev-parse HEAD >repo/.git/FETCH_HEAD &&
+
+ test_path_is_file repo/.git/HEAD &&
+ test_path_is_file repo/.git/ORIG_HEAD &&
+ test_path_is_file repo/.git/refs/heads/main &&
+ test_path_is_file repo/.git/packed-refs &&
+
+ test_migration repo reftable &&
+
+ echo "ref: refs/heads/.invalid" >expect &&
+ test_cmp expect repo/.git/HEAD &&
+ echo "this repository uses the reftable format" >expect &&
+ test_cmp expect repo/.git/refs/heads &&
+ test_path_is_file repo/.git/FETCH_HEAD &&
+ test_path_is_missing repo/.git/ORIG_HEAD &&
+ test_path_is_missing repo/.git/refs/heads/main &&
+ test_path_is_missing repo/.git/logs &&
+ test_path_is_missing repo/.git/packed-refs
+'
+
+test_expect_success 'migrating from reftable format deletes backend files' '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=reftable repo &&
+ test_commit -C repo first &&
+
+ test_path_is_dir repo/.git/reftable &&
+ test_migration repo files &&
+
+ test_path_is_missing repo/.git/reftable &&
+ echo "ref: refs/heads/main" >expect &&
+ test_cmp expect repo/.git/HEAD &&
+ test_path_is_file repo/.git/refs/heads/main
+'
+
+test_done
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH 9/9] builtin/refs: new command to migrate ref storage formats
2024-05-23 8:25 ` [PATCH 9/9] builtin/refs: new command to migrate " Patrick Steinhardt
@ 2024-05-23 17:40 ` Eric Sunshine
2024-05-24 7:35 ` Patrick Steinhardt
0 siblings, 1 reply; 103+ messages in thread
From: Eric Sunshine @ 2024-05-23 17:40 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Thu, May 23, 2024 at 4:26 AM Patrick Steinhardt <ps@pks.im> wrote:
> Introduce a new command that allows the user to migrate a repository
> between ref storage formats. This new command is implemented as part of
> a new git-refs(1) executable. This is due to two reasons:
>
> - There is no good place to put the migration logic in existing
> commands. git-maintenance(1) felt unwieldy, and git-pack-refs(1) is
> not the correct place to put it, either.
>
> - I had it in my mind to create a new low-level command for accessing
> refs for quite a while already. git-refs(1) is that command and can
> over time grow more functionality relating to refs. This should help
> discoverability by consolidating low-level access to refs into a
> single executable.
>
> As mentioned in the preceding commit that introduces the ref storage
> format migration logic, the new `git refs migrate` command still has a
> bunch of restrictions. These restrictions are documented accordingly.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> diff --git a/Documentation/git-refs.txt b/Documentation/git-refs.txt
> @@ -0,0 +1,59 @@
> +--dry-run::
> + Perform the migration, but do not modify the repository. The migrated
> + refs will be written into a separate directory that can be inspected
> + separately. This can be used to double check that the migration works
> + as expected before doing performing the actual migration.
s/doing performing/performing/
The mysterious "into a separate directory" is never made concrete. Can
this provide more information so the reader can know where this
directory is and how to double-check that it worked "as expected"?
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 9/9] builtin/refs: new command to migrate ref storage formats
2024-05-23 17:40 ` Eric Sunshine
@ 2024-05-24 7:35 ` Patrick Steinhardt
0 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-24 7:35 UTC (permalink / raw)
To: Eric Sunshine; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 2060 bytes --]
On Thu, May 23, 2024 at 01:40:50PM -0400, Eric Sunshine wrote:
> On Thu, May 23, 2024 at 4:26 AM Patrick Steinhardt <ps@pks.im> wrote:
> > Introduce a new command that allows the user to migrate a repository
> > between ref storage formats. This new command is implemented as part of
> > a new git-refs(1) executable. This is due to two reasons:
> >
> > - There is no good place to put the migration logic in existing
> > commands. git-maintenance(1) felt unwieldy, and git-pack-refs(1) is
> > not the correct place to put it, either.
> >
> > - I had it in my mind to create a new low-level command for accessing
> > refs for quite a while already. git-refs(1) is that command and can
> > over time grow more functionality relating to refs. This should help
> > discoverability by consolidating low-level access to refs into a
> > single executable.
> >
> > As mentioned in the preceding commit that introduces the ref storage
> > format migration logic, the new `git refs migrate` command still has a
> > bunch of restrictions. These restrictions are documented accordingly.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> > diff --git a/Documentation/git-refs.txt b/Documentation/git-refs.txt
> > @@ -0,0 +1,59 @@
> > +--dry-run::
> > + Perform the migration, but do not modify the repository. The migrated
> > + refs will be written into a separate directory that can be inspected
> > + separately. This can be used to double check that the migration works
> > + as expected before doing performing the actual migration.
>
> s/doing performing/performing/
>
> The mysterious "into a separate directory" is never made concrete. Can
> this provide more information so the reader can know where this
> directory is and how to double-check that it worked "as expected"?
Good point. I'll add a sentence that "The name of the directory will be
reported on stdout". As we use a temporary directory name we cannot
mention a static name here.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 0/9] refs: ref storage format migrations
2024-05-23 8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
` (8 preceding siblings ...)
2024-05-23 8:25 ` [PATCH 9/9] builtin/refs: new command to migrate " Patrick Steinhardt
@ 2024-05-23 16:09 ` Junio C Hamano
2024-05-24 7:33 ` Patrick Steinhardt
2024-05-24 10:14 ` [PATCH v2 " Patrick Steinhardt
` (3 subsequent siblings)
13 siblings, 1 reply; 103+ messages in thread
From: Junio C Hamano @ 2024-05-23 16:09 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
Patrick Steinhardt <ps@pks.im> writes:
> - It is not safe with concurrent writers. This is the limitation that
> ...
> none at all, as it may cause users to be less mindful. That's why I
> decided to just have no solution at all and document the limitation
> accordingly.
Documenting the limitation is a good place to start. For normal
users, would it be sufficient to
(1) tell your colleagues that this repository is currently closed
and do not push into it;
(2) configure "git gc --auto" to never kick in;
(3) delist the repository from "git maintenance" schedule.
before they try this feature out?
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 0/9] refs: ref storage format migrations
2024-05-23 16:09 ` [PATCH 0/9] refs: ref storage format migrations Junio C Hamano
@ 2024-05-24 7:33 ` Patrick Steinhardt
2024-05-24 16:28 ` Junio C Hamano
0 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-24 7:33 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 981 bytes --]
On Thu, May 23, 2024 at 09:09:41AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > - It is not safe with concurrent writers. This is the limitation that
> > ...
> > none at all, as it may cause users to be less mindful. That's why I
> > decided to just have no solution at all and document the limitation
> > accordingly.
>
> Documenting the limitation is a good place to start. For normal
> users, would it be sufficient to
>
> (1) tell your colleagues that this repository is currently closed
> and do not push into it;
>
> (2) configure "git gc --auto" to never kick in;
>
> (3) delist the repository from "git maintenance" schedule.
>
> before they try this feature out?
I think (2) wouldn't even be needed. Auto-GC only kicks in when there is
a write in the repository, and if both (1) and (3) are true then there
are none. But other than that yes, (1) and (3) should be sufficient.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 0/9] refs: ref storage format migrations
2024-05-24 7:33 ` Patrick Steinhardt
@ 2024-05-24 16:28 ` Junio C Hamano
2024-05-28 5:13 ` Patrick Steinhardt
0 siblings, 1 reply; 103+ messages in thread
From: Junio C Hamano @ 2024-05-24 16:28 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
Patrick Steinhardt <ps@pks.im> writes:
>> Documenting the limitation is a good place to start. For normal
>> users, would it be sufficient to
>>
>> (1) tell your colleagues that this repository is currently closed
>> and do not push into it;
>>
>> (2) configure "git gc --auto" to never kick in;
>>
>> (3) delist the repository from "git maintenance" schedule.
>>
>> before they try this feature out?
>
> I think (2) wouldn't even be needed. Auto-GC only kicks in when there is
> a write in the repository, and if both (1) and (3) are true then there
> are none. But other than that yes, (1) and (3) should be sufficient.
So it may make sense to document something like that at least?
Thanks.
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 0/9] refs: ref storage format migrations
2024-05-24 16:28 ` Junio C Hamano
@ 2024-05-28 5:13 ` Patrick Steinhardt
2024-05-28 16:16 ` Junio C Hamano
0 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 5:13 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 1366 bytes --]
On Fri, May 24, 2024 at 09:28:13AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> >> Documenting the limitation is a good place to start. For normal
> >> users, would it be sufficient to
> >>
> >> (1) tell your colleagues that this repository is currently closed
> >> and do not push into it;
> >>
> >> (2) configure "git gc --auto" to never kick in;
> >>
> >> (3) delist the repository from "git maintenance" schedule.
> >>
> >> before they try this feature out?
> >
> > I think (2) wouldn't even be needed. Auto-GC only kicks in when there is
> > a write in the repository, and if both (1) and (3) are true then there
> > are none. But other than that yes, (1) and (3) should be sufficient.
>
> So it may make sense to document something like that at least?
It is documented as part of the new git-refs(1) man page, in the "Known
limitations" section:
* There is no way to block concurrent writes to the repository during an
ongoing migration. Concurrent writes can lead to an inconsistent migrated
state. Users are expected to block writes on a higher level. If your
repository is registered for scheduled maintenance, it is recommended to
unregister it first with git-maintenance(1).
Is that sufficient, or do you think we need to expand on this?
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 0/9] refs: ref storage format migrations
2024-05-28 5:13 ` Patrick Steinhardt
@ 2024-05-28 16:16 ` Junio C Hamano
0 siblings, 0 replies; 103+ messages in thread
From: Junio C Hamano @ 2024-05-28 16:16 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
Patrick Steinhardt <ps@pks.im> writes:
> It is documented as part of the new git-refs(1) man page, in the "Known
> limitations" section:
>
> * There is no way to block concurrent writes to the repository during an
> ongoing migration. Concurrent writes can lead to an inconsistent migrated
> state. Users are expected to block writes on a higher level. If your
> repository is registered for scheduled maintenance, it is recommended to
> unregister it first with git-maintenance(1).
Good. Thanks.
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v2 0/9] refs: ref storage format migrations
2024-05-23 8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
` (9 preceding siblings ...)
2024-05-23 16:09 ` [PATCH 0/9] refs: ref storage format migrations Junio C Hamano
@ 2024-05-24 10:14 ` Patrick Steinhardt
2024-05-24 10:14 ` [PATCH v2 1/9] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
` (8 more replies)
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
` (2 subsequent siblings)
13 siblings, 9 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-24 10:14 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 5625 bytes --]
Hi,
this patch series implements support for migrating between ref storage
formats in a repository via a new `git refs migrate` command. The scope
of this command is currently limited to repositories without worktrees
and reflogs. Furthermore, the user needs to make sure that there are no
concurrent writes.
Changes compared to v1:
- Improve commit messages.
- Mention that `--dry-run` mode will print the path to the migrated
directory to stdout.
- Mention that the repository should be deregistered from maintenance
before running the migration.
Thanks!
Patrick
Patrick Steinhardt (9):
setup: unset ref storage when reinitializing repository version
refs: convert ref storage format to an enum
refs: pass storage format to `ref_store_init()` explicitly
refs: allow to skip creation of reflog entries
refs/files: refactor `add_pseudoref_and_head_entries()`
refs/files: extract function to iterate through root refs
refs: implement removal of ref storages
refs: implement logic to migrate between ref storage formats
builtin/refs: new command to migrate ref storage formats
.gitignore | 1 +
Documentation/git-refs.txt | 62 +++++++
Makefile | 1 +
builtin.h | 1 +
builtin/clone.c | 2 +-
builtin/init-db.c | 2 +-
builtin/refs.c | 75 +++++++++
git.c | 1 +
refs.c | 319 +++++++++++++++++++++++++++++++++++--
refs.h | 41 ++++-
refs/files-backend.c | 121 ++++++++++++--
refs/packed-backend.c | 15 ++
refs/refs-internal.h | 7 +
refs/reftable-backend.c | 37 ++++-
repository.c | 3 +-
repository.h | 10 +-
setup.c | 10 +-
setup.h | 9 +-
t/helper/test-ref-store.c | 1 +
t/t1460-refs-migrate.sh | 243 ++++++++++++++++++++++++++++
20 files changed, 916 insertions(+), 45 deletions(-)
create mode 100644 Documentation/git-refs.txt
create mode 100644 builtin/refs.c
create mode 100755 t/t1460-refs-migrate.sh
Range-diff against v1:
1: 8b11127daf = 1: 8b11127daf setup: unset ref storage when reinitializing repository version
2: 25f740f395 = 2: 25f740f395 refs: convert ref storage format to an enum
3: 6e7b9764f6 = 3: 6e7b9764f6 refs: pass storage format to `ref_store_init()` explicitly
4: 03f4ac6ee7 = 4: 03f4ac6ee7 refs: allow to skip creation of reflog entries
5: 71f31fe66c = 5: 71f31fe66c refs/files: refactor `add_pseudoref_and_head_entries()`
6: 6b696690ca = 6: 6b696690ca refs/files: extract function to iterate through root refs
7: b758c419c6 = 7: b758c419c6 refs: implement removal of ref storages
8: 4e0edda6d3 ! 8: 4d3eb5ea89 refs: implement logic to migrate between ref storage formats
@@ Commit message
whole repository. Add the logic to do so.
The implementation is generic and works with arbitrary ref storage
- formats because we only use. It does have a few limitations though:
+ formats so that a backend does not need to implement any migration
+ logic. It does have a few limitations though:
- We do not migrate repositories with worktrees, because worktrees
have separate ref storages. It makes the overall affair more complex
@@ Commit message
In other words, this version is a minimum viable product for migrating a
repository's ref storage format. It works alright for bare repos, which
- typically have neither worktrees nor reflogs. But it will not work for
- many other repositories without some preparations. These limitations are
- not set into stone though, and ideally we will eventually address them
- over time.
+ often have neither worktrees nor reflogs. But it will not work for many
+ other repositories without some preparations. These limitations are not
+ set into stone though, and ideally we will eventually address them over
+ time.
The logic is not yet used by anything, and thus there are no tests for
it. Those will be added in the next commit.
9: 2ebcc0db65 ! 9: 0df17a51b4 builtin/refs: new command to migrate ref storage formats
@@ Documentation/git-refs.txt (new)
+--dry-run::
+ Perform the migration, but do not modify the repository. The migrated
+ refs will be written into a separate directory that can be inspected
-+ separately. This can be used to double check that the migration works
-+ as expected before doing performing the actual migration.
++ separately. The name of the directory will be reported on stdout. This
++ can be used to double check that the migration works as expected doing
++ performing the actual migration.
+
+KNOWN LIMITATIONS
+-----------------
@@ Documentation/git-refs.txt (new)
+
+* There is no way to block concurrent writes to the repository during an
+ ongoing migration. Concurrent writes can lead to an inconsistent migrated
-+ state. Users are expected to block writes on a higher level.
++ state. Users are expected to block writes on a higher level. If your
++ repository is registered for scheduled maintenance, it is recommended to
++ unregister it first with git-maintenance(1).
+
+These limitations may eventually be lifted.
+
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v2 1/9] setup: unset ref storage when reinitializing repository version
2024-05-24 10:14 ` [PATCH v2 " Patrick Steinhardt
@ 2024-05-24 10:14 ` Patrick Steinhardt
2024-05-24 21:33 ` Justin Tobler
2024-05-24 10:14 ` [PATCH v2 2/9] refs: convert ref storage format to an enum Patrick Steinhardt
` (7 subsequent siblings)
8 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-24 10:14 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 1259 bytes --]
When reinitializing a repository's version we may end up unsetting the
hash algorithm when it matches the default hash algorithm. If we didn't
do that then the previously configured value might remain intact.
While the same issue exists for the ref storage extension, we don't do
this here. This has been fine for most of the part because it is not
supported to re-initialize a repository with a different ref storage
format anyway. We're about to introduce a new command to migrate ref
storages though, so this is about to become an issue there.
Prepare for this and unset the ref storage format when reinitializing a
repoistory with the "files" format.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
setup.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/setup.c b/setup.c
index 7975230ffb..8c84ec9d4b 100644
--- a/setup.c
+++ b/setup.c
@@ -2028,6 +2028,8 @@ void initialize_repository_version(int hash_algo,
if (ref_storage_format != REF_STORAGE_FORMAT_FILES)
git_config_set("extensions.refstorage",
ref_storage_format_to_name(ref_storage_format));
+ else if (reinit)
+ git_config_set_gently("extensions.refstorage", NULL);
}
static int is_reinit(void)
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH v2 1/9] setup: unset ref storage when reinitializing repository version
2024-05-24 10:14 ` [PATCH v2 1/9] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
@ 2024-05-24 21:33 ` Justin Tobler
2024-05-28 5:13 ` Patrick Steinhardt
0 siblings, 1 reply; 103+ messages in thread
From: Justin Tobler @ 2024-05-24 21:33 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Eric Sunshine, Junio C Hamano
On 24/05/24 12:14PM, Patrick Steinhardt wrote:
> When reinitializing a repository's version we may end up unsetting the
> hash algorithm when it matches the default hash algorithm. If we didn't
> do that then the previously configured value might remain intact.
>
> While the same issue exists for the ref storage extension, we don't do
> this here. This has been fine for most of the part because it is not
> supported to re-initialize a repository with a different ref storage
> format anyway. We're about to introduce a new command to migrate ref
> storages though, so this is about to become an issue there.
Ah, so this would be important in the context of migrating a repository
from "reftable" to "files".
> Prepare for this and unset the ref storage format when reinitializing a
> repoistory with the "files" format.
s/repoistory/repository/
-Justin
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v2 1/9] setup: unset ref storage when reinitializing repository version
2024-05-24 21:33 ` Justin Tobler
@ 2024-05-28 5:13 ` Patrick Steinhardt
0 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 5:13 UTC (permalink / raw)
To: Justin Tobler; +Cc: git, Eric Sunshine, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 1010 bytes --]
On Fri, May 24, 2024 at 04:33:51PM -0500, Justin Tobler wrote:
> On 24/05/24 12:14PM, Patrick Steinhardt wrote:
> > When reinitializing a repository's version we may end up unsetting the
> > hash algorithm when it matches the default hash algorithm. If we didn't
> > do that then the previously configured value might remain intact.
> >
> > While the same issue exists for the ref storage extension, we don't do
> > this here. This has been fine for most of the part because it is not
> > supported to re-initialize a repository with a different ref storage
> > format anyway. We're about to introduce a new command to migrate ref
> > storages though, so this is about to become an issue there.
>
> Ah, so this would be important in the context of migrating a repository
> from "reftable" to "files".
Exactly.
> > Prepare for this and unset the ref storage format when reinitializing a
> > repoistory with the "files" format.
>
> s/repoistory/repository/
Thanks, fixed.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v2 2/9] refs: convert ref storage format to an enum
2024-05-24 10:14 ` [PATCH v2 " Patrick Steinhardt
2024-05-24 10:14 ` [PATCH v2 1/9] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
@ 2024-05-24 10:14 ` Patrick Steinhardt
2024-05-24 10:14 ` [PATCH v2 3/9] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
` (6 subsequent siblings)
8 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-24 10:14 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 8337 bytes --]
The ref storage format is tracked as a simple unsigned integer, which
makes it harder than necessary to discover what that integer actually is
or where its values are defined.
Convert the ref storage format to instead be an enum.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/clone.c | 2 +-
builtin/init-db.c | 2 +-
refs.c | 7 ++++---
refs.h | 10 ++++++++--
repository.c | 3 ++-
repository.h | 10 ++++------
setup.c | 8 ++++----
setup.h | 9 +++++----
8 files changed, 29 insertions(+), 22 deletions(-)
diff --git a/builtin/clone.c b/builtin/clone.c
index 1e07524c53..e808e02017 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -970,7 +970,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
int submodule_progress;
int filter_submodules = 0;
int hash_algo;
- unsigned int ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
+ enum ref_storage_format ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
const int do_not_override_repo_unix_permissions = -1;
const char *template_dir;
char *template_dir_dup = NULL;
diff --git a/builtin/init-db.c b/builtin/init-db.c
index 0170469b84..582dcf20f8 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -81,7 +81,7 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
const char *ref_format = NULL;
const char *initial_branch = NULL;
int hash_algo = GIT_HASH_UNKNOWN;
- unsigned int ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
+ enum ref_storage_format ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
int init_shared_repository = -1;
const struct option init_db_options[] = {
OPT_STRING(0, "template", &template_dir, N_("template-directory"),
diff --git a/refs.c b/refs.c
index 31032588e0..e6db85a165 100644
--- a/refs.c
+++ b/refs.c
@@ -37,14 +37,15 @@ static const struct ref_storage_be *refs_backends[] = {
[REF_STORAGE_FORMAT_REFTABLE] = &refs_be_reftable,
};
-static const struct ref_storage_be *find_ref_storage_backend(unsigned int ref_storage_format)
+static const struct ref_storage_be *find_ref_storage_backend(
+ enum ref_storage_format ref_storage_format)
{
if (ref_storage_format < ARRAY_SIZE(refs_backends))
return refs_backends[ref_storage_format];
return NULL;
}
-unsigned int ref_storage_format_by_name(const char *name)
+enum ref_storage_format ref_storage_format_by_name(const char *name)
{
for (unsigned int i = 0; i < ARRAY_SIZE(refs_backends); i++)
if (refs_backends[i] && !strcmp(refs_backends[i]->name, name))
@@ -52,7 +53,7 @@ unsigned int ref_storage_format_by_name(const char *name)
return REF_STORAGE_FORMAT_UNKNOWN;
}
-const char *ref_storage_format_to_name(unsigned int ref_storage_format)
+const char *ref_storage_format_to_name(enum ref_storage_format ref_storage_format)
{
const struct ref_storage_be *be = find_ref_storage_backend(ref_storage_format);
if (!be)
diff --git a/refs.h b/refs.h
index fe7f0db35e..a7afa9bede 100644
--- a/refs.h
+++ b/refs.h
@@ -11,8 +11,14 @@ struct string_list;
struct string_list_item;
struct worktree;
-unsigned int ref_storage_format_by_name(const char *name);
-const char *ref_storage_format_to_name(unsigned int ref_storage_format);
+enum ref_storage_format {
+ REF_STORAGE_FORMAT_UNKNOWN,
+ REF_STORAGE_FORMAT_FILES,
+ REF_STORAGE_FORMAT_REFTABLE,
+};
+
+enum ref_storage_format ref_storage_format_by_name(const char *name);
+const char *ref_storage_format_to_name(enum ref_storage_format ref_storage_format);
/*
* Resolve a reference, recursively following symbolic refererences.
diff --git a/repository.c b/repository.c
index d29b0304fb..166863f852 100644
--- a/repository.c
+++ b/repository.c
@@ -124,7 +124,8 @@ void repo_set_compat_hash_algo(struct repository *repo, int algo)
repo_read_loose_object_map(repo);
}
-void repo_set_ref_storage_format(struct repository *repo, unsigned int format)
+void repo_set_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format)
{
repo->ref_storage_format = format;
}
diff --git a/repository.h b/repository.h
index 4bd8969005..a35cd77c35 100644
--- a/repository.h
+++ b/repository.h
@@ -1,6 +1,7 @@
#ifndef REPOSITORY_H
#define REPOSITORY_H
+#include "refs.h"
#include "strmap.h"
struct config_set;
@@ -26,10 +27,6 @@ enum fetch_negotiation_setting {
FETCH_NEGOTIATION_NOOP,
};
-#define REF_STORAGE_FORMAT_UNKNOWN 0
-#define REF_STORAGE_FORMAT_FILES 1
-#define REF_STORAGE_FORMAT_REFTABLE 2
-
struct repo_settings {
int initialized;
@@ -181,7 +178,7 @@ struct repository {
const struct git_hash_algo *compat_hash_algo;
/* Repository's reference storage format, as serialized on disk. */
- unsigned int ref_storage_format;
+ enum ref_storage_format ref_storage_format;
/* A unique-id for tracing purposes. */
int trace2_repo_id;
@@ -220,7 +217,8 @@ void repo_set_gitdir(struct repository *repo, const char *root,
void repo_set_worktree(struct repository *repo, const char *path);
void repo_set_hash_algo(struct repository *repo, int algo);
void repo_set_compat_hash_algo(struct repository *repo, int compat_algo);
-void repo_set_ref_storage_format(struct repository *repo, unsigned int format);
+void repo_set_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format);
void initialize_repository(struct repository *repo);
RESULT_MUST_BE_USED
int repo_init(struct repository *r, const char *gitdir, const char *worktree);
diff --git a/setup.c b/setup.c
index 8c84ec9d4b..b49ee3e95f 100644
--- a/setup.c
+++ b/setup.c
@@ -1997,7 +1997,7 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree)
}
void initialize_repository_version(int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
int reinit)
{
char repo_version_string[10];
@@ -2044,7 +2044,7 @@ static int is_reinit(void)
return ret;
}
-void create_reference_database(unsigned int ref_storage_format,
+void create_reference_database(enum ref_storage_format ref_storage_format,
const char *initial_branch, int quiet)
{
struct strbuf err = STRBUF_INIT;
@@ -2243,7 +2243,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
}
static void validate_ref_storage_format(struct repository_format *repo_fmt,
- unsigned int format)
+ enum ref_storage_format format)
{
const char *name = getenv("GIT_DEFAULT_REF_FORMAT");
@@ -2263,7 +2263,7 @@ static void validate_ref_storage_format(struct repository_format *repo_fmt,
int init_db(const char *git_dir, const char *real_git_dir,
const char *template_dir, int hash,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
const char *initial_branch,
int init_shared_repository, unsigned int flags)
{
diff --git a/setup.h b/setup.h
index b3fd3bf45a..cd8dbc2497 100644
--- a/setup.h
+++ b/setup.h
@@ -1,6 +1,7 @@
#ifndef SETUP_H
#define SETUP_H
+#include "refs.h"
#include "string-list.h"
int is_inside_git_dir(void);
@@ -128,7 +129,7 @@ struct repository_format {
int is_bare;
int hash_algo;
int compat_hash_algo;
- unsigned int ref_storage_format;
+ enum ref_storage_format ref_storage_format;
int sparse_index;
char *work_tree;
struct string_list unknown_extensions;
@@ -192,13 +193,13 @@ const char *get_template_dir(const char *option_template);
int init_db(const char *git_dir, const char *real_git_dir,
const char *template_dir, int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
const char *initial_branch, int init_shared_repository,
unsigned int flags);
void initialize_repository_version(int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
int reinit);
-void create_reference_database(unsigned int ref_storage_format,
+void create_reference_database(enum ref_storage_format ref_storage_format,
const char *initial_branch, int quiet);
/*
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v2 3/9] refs: pass storage format to `ref_store_init()` explicitly
2024-05-24 10:14 ` [PATCH v2 " Patrick Steinhardt
2024-05-24 10:14 ` [PATCH v2 1/9] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
2024-05-24 10:14 ` [PATCH v2 2/9] refs: convert ref storage format to an enum Patrick Steinhardt
@ 2024-05-24 10:14 ` Patrick Steinhardt
2024-05-24 10:14 ` [PATCH v2 4/9] refs: allow to skip creation of reflog entries Patrick Steinhardt
` (5 subsequent siblings)
8 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-24 10:14 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 2646 bytes --]
We're about to introduce logic to migrate refs from one storage format
to another one. This will require us to initialize a ref store with a
different format than the one used by the passed-in repository.
Prepare for this by accepting the desired ref storage format as
parameter.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/refs.c b/refs.c
index e6db85a165..7c3f4df457 100644
--- a/refs.c
+++ b/refs.c
@@ -1894,13 +1894,14 @@ static struct ref_store *lookup_ref_store_map(struct strmap *map,
* gitdir.
*/
static struct ref_store *ref_store_init(struct repository *repo,
+ enum ref_storage_format format,
const char *gitdir,
unsigned int flags)
{
const struct ref_storage_be *be;
struct ref_store *refs;
- be = find_ref_storage_backend(repo->ref_storage_format);
+ be = find_ref_storage_backend(format);
if (!be)
BUG("reference backend is unknown");
@@ -1922,7 +1923,8 @@ struct ref_store *get_main_ref_store(struct repository *r)
if (!r->gitdir)
BUG("attempting to get main_ref_store outside of repository");
- r->refs_private = ref_store_init(r, r->gitdir, REF_STORE_ALL_CAPS);
+ r->refs_private = ref_store_init(r, r->ref_storage_format,
+ r->gitdir, REF_STORE_ALL_CAPS);
r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
return r->refs_private;
}
@@ -1982,7 +1984,8 @@ struct ref_store *repo_get_submodule_ref_store(struct repository *repo,
free(subrepo);
goto done;
}
- refs = ref_store_init(subrepo, submodule_sb.buf,
+ refs = ref_store_init(subrepo, the_repository->ref_storage_format,
+ submodule_sb.buf,
REF_STORE_READ | REF_STORE_ODB);
register_ref_store_map(&repo->submodule_ref_stores, "submodule",
refs, submodule);
@@ -2011,12 +2014,12 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt)
struct strbuf common_path = STRBUF_INIT;
strbuf_git_common_path(&common_path, wt->repo,
"worktrees/%s", wt->id);
- refs = ref_store_init(wt->repo, common_path.buf,
- REF_STORE_ALL_CAPS);
+ refs = ref_store_init(wt->repo, wt->repo->ref_storage_format,
+ common_path.buf, REF_STORE_ALL_CAPS);
strbuf_release(&common_path);
} else {
- refs = ref_store_init(wt->repo, wt->repo->commondir,
- REF_STORE_ALL_CAPS);
+ refs = ref_store_init(wt->repo, the_repository->ref_storage_format,
+ wt->repo->commondir, REF_STORE_ALL_CAPS);
}
if (refs)
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v2 4/9] refs: allow to skip creation of reflog entries
2024-05-24 10:14 ` [PATCH v2 " Patrick Steinhardt
` (2 preceding siblings ...)
2024-05-24 10:14 ` [PATCH v2 3/9] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
@ 2024-05-24 10:14 ` Patrick Steinhardt
2024-05-24 10:14 ` [PATCH v2 5/9] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
` (4 subsequent siblings)
8 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-24 10:14 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 3845 bytes --]
The ref backends do not have any way to disable the creation of reflog
entries. This will be required for upcoming ref format migration logic
so that we do not create any entries that didn't exist in the original
ref database.
Provide a new `REF_SKIP_CREATE_REFLOG` flag that allows the caller to
disable reflog entry creation.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 6 ++++++
refs.h | 8 +++++++-
refs/files-backend.c | 4 ++++
refs/reftable-backend.c | 3 ++-
t/helper/test-ref-store.c | 1 +
5 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/refs.c b/refs.c
index 7c3f4df457..66e9585767 100644
--- a/refs.c
+++ b/refs.c
@@ -1194,6 +1194,12 @@ int ref_transaction_update(struct ref_transaction *transaction,
{
assert(err);
+ if ((flags & REF_FORCE_CREATE_REFLOG) &&
+ (flags & REF_SKIP_CREATE_REFLOG)) {
+ strbuf_addstr(err, _("refusing to force and skip creation of reflog"));
+ return -1;
+ }
+
if (!(flags & REF_SKIP_REFNAME_VERIFICATION) &&
((new_oid && !is_null_oid(new_oid)) ?
check_refname_format(refname, REFNAME_ALLOW_ONELEVEL) :
diff --git a/refs.h b/refs.h
index a7afa9bede..50a2b3ab09 100644
--- a/refs.h
+++ b/refs.h
@@ -659,13 +659,19 @@ struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
*/
#define REF_SKIP_REFNAME_VERIFICATION (1 << 11)
+/*
+ * Skip creation of a reflog entry, even if it would have otherwise been
+ * created.
+ */
+#define REF_SKIP_CREATE_REFLOG (1 << 12)
+
/*
* Bitmask of all of the flags that are allowed to be passed in to
* ref_transaction_update() and friends:
*/
#define REF_TRANSACTION_UPDATE_ALLOWED_FLAGS \
(REF_NO_DEREF | REF_FORCE_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION | \
- REF_SKIP_REFNAME_VERIFICATION)
+ REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG)
/*
* Add a reference update to transaction. `new_oid` is the value that
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 73380d7e99..bd0d63bcba 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1750,6 +1750,9 @@ static int files_log_ref_write(struct files_ref_store *refs,
{
int logfd, result;
+ if (flags & REF_SKIP_CREATE_REFLOG)
+ return 0;
+
if (log_all_ref_updates == LOG_REFS_UNSET)
log_all_ref_updates = is_bare_repository() ? LOG_REFS_NONE : LOG_REFS_NORMAL;
@@ -2251,6 +2254,7 @@ static int split_head_update(struct ref_update *update,
struct ref_update *new_update;
if ((update->flags & REF_LOG_ONLY) ||
+ (update->flags & REF_SKIP_CREATE_REFLOG) ||
(update->flags & REF_IS_PRUNING) ||
(update->flags & REF_UPDATE_VIA_HEAD))
return 0;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index f6edfdf5b3..bffed9257f 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1103,7 +1103,8 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data
if (ret)
goto done;
- } else if (u->flags & REF_HAVE_NEW &&
+ } else if (!(u->flags & REF_SKIP_CREATE_REFLOG) &&
+ (u->flags & REF_HAVE_NEW) &&
(u->flags & REF_FORCE_CREATE_REFLOG ||
should_write_log(&arg->refs->base, u->refname))) {
struct reftable_log_record *log;
diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index c9efd74c2b..ad24300170 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -126,6 +126,7 @@ static struct flag_definition transaction_flags[] = {
FLAG_DEF(REF_FORCE_CREATE_REFLOG),
FLAG_DEF(REF_SKIP_OID_VERIFICATION),
FLAG_DEF(REF_SKIP_REFNAME_VERIFICATION),
+ FLAG_DEF(REF_SKIP_CREATE_REFLOG),
{ NULL, 0 }
};
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v2 5/9] refs/files: refactor `add_pseudoref_and_head_entries()`
2024-05-24 10:14 ` [PATCH v2 " Patrick Steinhardt
` (3 preceding siblings ...)
2024-05-24 10:14 ` [PATCH v2 4/9] refs: allow to skip creation of reflog entries Patrick Steinhardt
@ 2024-05-24 10:14 ` Patrick Steinhardt
2024-05-24 10:14 ` [PATCH v2 6/9] refs/files: extract function to iterate through root refs Patrick Steinhardt
` (3 subsequent siblings)
8 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-24 10:14 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 1937 bytes --]
The `add_pseudoref_and_head_entries()` function accepts both the ref
store as well as a directory name as input. This is unnecessary though
as the ref store already uniquely identifies the root directory of the
ref store anyway.
Furthermore, the function is misnamed now that we have clarified the
meaning of pseudorefs as it doesn't add pseudorefs, but root refs.
Rename it accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs/files-backend.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index bd0d63bcba..b4e5437ffe 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -324,16 +324,14 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
}
/*
- * Add pseudorefs to the ref dir by parsing the directory for any files
- * which follow the pseudoref syntax.
+ * Add root refs to the ref dir by parsing the directory for any files which
+ * follow the root ref syntax.
*/
-static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
- struct ref_dir *dir,
- const char *dirname)
+static void add_root_refs(struct files_ref_store *refs,
+ struct ref_dir *dir)
{
- struct files_ref_store *refs =
- files_downcast(ref_store, REF_STORE_READ, "fill_ref_dir");
struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
+ const char *dirname = refs->loose->root->name;
struct dirent *de;
size_t dirnamelen;
DIR *d;
@@ -388,8 +386,7 @@ static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs,
dir = get_ref_dir(refs->loose->root);
if (flags & DO_FOR_EACH_INCLUDE_ROOT_REFS)
- add_pseudoref_and_head_entries(dir->cache->ref_store, dir,
- refs->loose->root->name);
+ add_root_refs(refs, dir);
/*
* Add an incomplete entry for "refs/" (to be filled
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v2 6/9] refs/files: extract function to iterate through root refs
2024-05-24 10:14 ` [PATCH v2 " Patrick Steinhardt
` (4 preceding siblings ...)
2024-05-24 10:14 ` [PATCH v2 5/9] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
@ 2024-05-24 10:14 ` Patrick Steinhardt
2024-05-24 10:15 ` [PATCH v2 7/9] refs: implement removal of ref storages Patrick Steinhardt
` (2 subsequent siblings)
8 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-24 10:14 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 2782 bytes --]
Extract a new function that can be used to iterate through all root refs
known to the "files" backend. This will be used in the next commit,
where we start to teach ref backends to remove themselves.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs/files-backend.c | 49 ++++++++++++++++++++++++++++++++++++--------
1 file changed, 40 insertions(+), 9 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b4e5437ffe..b7268b26c8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -323,17 +323,15 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
add_per_worktree_entries_to_dir(dir, dirname);
}
-/*
- * Add root refs to the ref dir by parsing the directory for any files which
- * follow the root ref syntax.
- */
-static void add_root_refs(struct files_ref_store *refs,
- struct ref_dir *dir)
+static int for_each_root_ref(struct files_ref_store *refs,
+ int (*cb)(const char *refname, void *cb_data),
+ void *cb_data)
{
struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
const char *dirname = refs->loose->root->name;
struct dirent *de;
size_t dirnamelen;
+ int ret;
DIR *d;
files_ref_path(refs, &path, dirname);
@@ -341,7 +339,7 @@ static void add_root_refs(struct files_ref_store *refs,
d = opendir(path.buf);
if (!d) {
strbuf_release(&path);
- return;
+ return -1;
}
strbuf_addstr(&refname, dirname);
@@ -357,14 +355,47 @@ static void add_root_refs(struct files_ref_store *refs,
strbuf_addstr(&refname, de->d_name);
dtype = get_dtype(de, &path, 1);
- if (dtype == DT_REG && is_root_ref(de->d_name))
- loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
+ if (dtype == DT_REG && is_root_ref(de->d_name)) {
+ ret = cb(refname.buf, cb_data);
+ if (ret)
+ goto done;
+ }
strbuf_setlen(&refname, dirnamelen);
}
+
+done:
strbuf_release(&refname);
strbuf_release(&path);
closedir(d);
+ return ret;
+}
+
+struct fill_root_ref_data {
+ struct files_ref_store *refs;
+ struct ref_dir *dir;
+};
+
+static int fill_root_ref(const char *refname, void *cb_data)
+{
+ struct fill_root_ref_data *data = cb_data;
+ loose_fill_ref_dir_regular_file(data->refs, refname, data->dir);
+ return 0;
+}
+
+/*
+ * Add root refs to the ref dir by parsing the directory for any files which
+ * follow the root ref syntax.
+ */
+static void add_root_refs(struct files_ref_store *refs,
+ struct ref_dir *dir)
+{
+ struct fill_root_ref_data data = {
+ .refs = refs,
+ .dir = dir,
+ };
+
+ for_each_root_ref(refs, fill_root_ref, &data);
}
static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs,
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v2 7/9] refs: implement removal of ref storages
2024-05-24 10:14 ` [PATCH v2 " Patrick Steinhardt
` (5 preceding siblings ...)
2024-05-24 10:14 ` [PATCH v2 6/9] refs/files: extract function to iterate through root refs Patrick Steinhardt
@ 2024-05-24 10:15 ` Patrick Steinhardt
2024-05-24 10:15 ` [PATCH v2 8/9] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
2024-05-24 10:15 ` [PATCH v2 9/9] builtin/refs: new command to migrate " Patrick Steinhardt
8 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-24 10:15 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 7882 bytes --]
We're about to introduce logic to migrate ref storages. One part of the
migration will be to delete the files that are part of the old ref
storage format. We don't yet have a way to delete such data generically
across ref backends though.
Implement a new `delete` callback and expose it via a new
`ref_storage_delete()` function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 5 ++++
refs.h | 5 ++++
refs/files-backend.c | 61 +++++++++++++++++++++++++++++++++++++++++
refs/packed-backend.c | 15 ++++++++++
refs/refs-internal.h | 7 +++++
refs/reftable-backend.c | 34 +++++++++++++++++++++++
6 files changed, 127 insertions(+)
diff --git a/refs.c b/refs.c
index 66e9585767..9b112b0527 100644
--- a/refs.c
+++ b/refs.c
@@ -1861,6 +1861,11 @@ int ref_store_create_on_disk(struct ref_store *refs, int flags, struct strbuf *e
return refs->be->create_on_disk(refs, flags, err);
}
+int ref_store_remove_on_disk(struct ref_store *refs, struct strbuf *err)
+{
+ return refs->be->remove_on_disk(refs, err);
+}
+
int repo_resolve_gitlink_ref(struct repository *r,
const char *submodule, const char *refname,
struct object_id *oid)
diff --git a/refs.h b/refs.h
index 50a2b3ab09..61ee7b7a15 100644
--- a/refs.h
+++ b/refs.h
@@ -129,6 +129,11 @@ int ref_store_create_on_disk(struct ref_store *refs, int flags, struct strbuf *e
*/
void ref_store_release(struct ref_store *ref_store);
+/*
+ * Remove the ref store from disk. This deletes all associated data.
+ */
+int ref_store_remove_on_disk(struct ref_store *refs, struct strbuf *err);
+
/*
* Return the peeled value of the oid currently being iterated via
* for_each_ref(), etc. This is equivalent to calling:
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b7268b26c8..8b74518022 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3340,11 +3340,72 @@ static int files_ref_store_create_on_disk(struct ref_store *ref_store,
return 0;
}
+struct remove_one_root_ref_data {
+ const char *gitdir;
+ struct strbuf *err;
+};
+
+static int remove_one_root_ref(const char *refname,
+ void *cb_data)
+{
+ struct remove_one_root_ref_data *data = cb_data;
+ struct strbuf buf = STRBUF_INIT;
+ int ret = 0;
+
+ strbuf_addf(&buf, "%s/%s", data->gitdir, refname);
+
+ ret = remove_path(buf.buf);
+ if (ret < 0)
+ strbuf_addf(data->err, "could not delete %s: %s\n",
+ refname, strerror(errno));
+
+ strbuf_release(&buf);
+ return ret;
+}
+
+static int files_ref_store_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct files_ref_store *refs =
+ files_downcast(ref_store, REF_STORE_WRITE, "remove");
+ struct remove_one_root_ref_data data = {
+ .gitdir = refs->base.gitdir,
+ .err = err,
+ };
+ struct strbuf sb = STRBUF_INIT;
+ int ret = 0;
+
+ strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addstr(err, "could not delete refs");
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/logs", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addstr(err, "could not delete logs\n");
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ ret = for_each_root_ref(refs, remove_one_root_ref, &data);
+ if (ret < 0)
+ ret = -1;
+
+ if (ref_store_remove_on_disk(refs->packed_ref_store, err) < 0)
+ ret = -1;
+
+ strbuf_release(&sb);
+ return ret;
+}
+
struct ref_storage_be refs_be_files = {
.name = "files",
.init = files_ref_store_init,
.release = files_ref_store_release,
.create_on_disk = files_ref_store_create_on_disk,
+ .remove_on_disk = files_ref_store_remove_on_disk,
.transaction_prepare = files_transaction_prepare,
.transaction_finish = files_transaction_finish,
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 2789fd92f5..c4c1e36aa2 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1,5 +1,6 @@
#include "../git-compat-util.h"
#include "../config.h"
+#include "../dir.h"
#include "../gettext.h"
#include "../hash.h"
#include "../hex.h"
@@ -1266,6 +1267,19 @@ static int packed_ref_store_create_on_disk(struct ref_store *ref_store UNUSED,
return 0;
}
+static int packed_ref_store_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct packed_ref_store *refs = packed_downcast(ref_store, 0, "remove");
+
+ if (remove_path(refs->path) < 0) {
+ strbuf_addstr(err, "could not delete packed-refs");
+ return -1;
+ }
+
+ return 0;
+}
+
/*
* Write the packed refs from the current snapshot to the packed-refs
* tempfile, incorporating any changes from `updates`. `updates` must
@@ -1724,6 +1738,7 @@ struct ref_storage_be refs_be_packed = {
.init = packed_ref_store_init,
.release = packed_ref_store_release,
.create_on_disk = packed_ref_store_create_on_disk,
+ .remove_on_disk = packed_ref_store_remove_on_disk,
.transaction_prepare = packed_transaction_prepare,
.transaction_finish = packed_transaction_finish,
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 33749fbd83..cbcb6f9c36 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -517,6 +517,12 @@ typedef int ref_store_create_on_disk_fn(struct ref_store *refs,
int flags,
struct strbuf *err);
+/*
+ * Remove the reference store from disk.
+ */
+typedef int ref_store_remove_on_disk_fn(struct ref_store *refs,
+ struct strbuf *err);
+
typedef int ref_transaction_prepare_fn(struct ref_store *refs,
struct ref_transaction *transaction,
struct strbuf *err);
@@ -649,6 +655,7 @@ struct ref_storage_be {
ref_store_init_fn *init;
ref_store_release_fn *release;
ref_store_create_on_disk_fn *create_on_disk;
+ ref_store_remove_on_disk_fn *remove_on_disk;
ref_transaction_prepare_fn *transaction_prepare;
ref_transaction_finish_fn *transaction_finish;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index bffed9257f..62992a67ee 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1,6 +1,7 @@
#include "../git-compat-util.h"
#include "../abspath.h"
#include "../chdir-notify.h"
+#include "../dir.h"
#include "../environment.h"
#include "../gettext.h"
#include "../hash.h"
@@ -343,6 +344,38 @@ static int reftable_be_create_on_disk(struct ref_store *ref_store,
return 0;
}
+static int reftable_be_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct reftable_ref_store *refs =
+ reftable_be_downcast(ref_store, REF_STORE_WRITE, "remove");
+ struct strbuf sb = STRBUF_INIT;
+ int ret = 0;
+
+ strbuf_addf(&sb, "%s/reftable", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addstr(err, "could not delete reftables");
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/HEAD", refs->base.gitdir);
+ if (remove_path(sb.buf) < 0) {
+ strbuf_addstr(err, "could not delete stub HEAD");
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/refs/heads", refs->base.gitdir);
+ if (remove_path(sb.buf) < 0) {
+ strbuf_addstr(err, "could not delete stub heads");
+ ret = -1;
+ }
+
+ strbuf_release(&sb);
+ return ret;
+}
+
struct reftable_ref_iterator {
struct ref_iterator base;
struct reftable_ref_store *refs;
@@ -2196,6 +2229,7 @@ struct ref_storage_be refs_be_reftable = {
.init = reftable_be_init,
.release = reftable_be_release,
.create_on_disk = reftable_be_create_on_disk,
+ .remove_on_disk = reftable_be_remove_on_disk,
.transaction_prepare = reftable_be_transaction_prepare,
.transaction_finish = reftable_be_transaction_finish,
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v2 8/9] refs: implement logic to migrate between ref storage formats
2024-05-24 10:14 ` [PATCH v2 " Patrick Steinhardt
` (6 preceding siblings ...)
2024-05-24 10:15 ` [PATCH v2 7/9] refs: implement removal of ref storages Patrick Steinhardt
@ 2024-05-24 10:15 ` Patrick Steinhardt
2024-05-24 22:32 ` Justin Tobler
2024-05-24 10:15 ` [PATCH v2 9/9] builtin/refs: new command to migrate " Patrick Steinhardt
8 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-24 10:15 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 11187 bytes --]
With the introduction of the new "reftable" backend, users may want to
migrate repositories between the backends without having to recreate the
whole repository. Add the logic to do so.
The implementation is generic and works with arbitrary ref storage
formats so that a backend does not need to implement any migration
logic. It does have a few limitations though:
- We do not migrate repositories with worktrees, because worktrees
have separate ref storages. It makes the overall affair more complex
if we have to migrate multiple storages at once.
- We do not migrate reflogs, because we have no interfaces to write
many reflog entries.
- We do not lock the repository for concurrent access, and thus
concurrent writes may make use end up with weird in-between states.
There is no way to fully lock the "files" backend for writes due to
its format, and thus we punt on this topic altogether and defer to
the user to avoid those from happening.
In other words, this version is a minimum viable product for migrating a
repository's ref storage format. It works alright for bare repos, which
often have neither worktrees nor reflogs. But it will not work for many
other repositories without some preparations. These limitations are not
set into stone though, and ideally we will eventually address them over
time.
The logic is not yet used by anything, and thus there are no tests for
it. Those will be added in the next commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 284 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
refs.h | 18 ++++
2 files changed, 302 insertions(+)
diff --git a/refs.c b/refs.c
index 9b112b0527..4ef2dae5e1 100644
--- a/refs.c
+++ b/refs.c
@@ -2570,3 +2570,287 @@ int ref_update_check_old_target(const char *referent, struct ref_update *update,
referent, update->old_target);
return -1;
}
+
+struct migration_data {
+ struct ref_store *old_refs;
+ struct ref_store *new_refs;
+ struct ref_transaction *transaction;
+ struct strbuf *errbuf;
+ const char *refname;
+};
+
+static int migrate_one_ref(const char *refname, const struct object_id *oid,
+ int flags, void *cb_data)
+{
+ struct migration_data *data = cb_data;
+ struct strbuf symref_target = STRBUF_INIT;
+ int ret;
+
+ if (flags & REF_ISSYMREF) {
+ ret = refs_read_symbolic_ref(data->old_refs, refname, &symref_target);
+ if (ret < 0)
+ goto done;
+
+ ret = ref_transaction_update(data->transaction, refname, NULL, null_oid(),
+ symref_target.buf, NULL,
+ REF_SKIP_CREATE_REFLOG | REF_NO_DEREF, NULL, data->errbuf);
+ if (ret < 0)
+ goto done;
+ } else {
+ ret = ref_transaction_create(data->transaction, refname, oid,
+ REF_SKIP_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION,
+ NULL, data->errbuf);
+ if (ret < 0)
+ goto done;
+ }
+
+done:
+ strbuf_release(&symref_target);
+ return ret;
+}
+
+static int move_files(const char *from_path, const char *to_path, struct strbuf *errbuf)
+{
+ struct strbuf from_buf = STRBUF_INIT, to_buf = STRBUF_INIT;
+ size_t from_len, to_len;
+ DIR *from_dir;
+ int ret;
+
+ from_dir = opendir(from_path);
+ if (!from_dir) {
+ strbuf_addf(errbuf, "could not open source directory: '%s'", from_path);
+ ret = -1;
+ goto done;
+ }
+
+ strbuf_addstr(&from_buf, from_path);
+ strbuf_complete(&from_buf, '/');
+ from_len = from_buf.len;
+
+ strbuf_addstr(&to_buf, to_path);
+ strbuf_complete(&to_buf, '/');
+ to_len = to_buf.len;
+
+ while (1) {
+ struct dirent *ent;
+
+ errno = 0;
+ ent = readdir(from_dir);
+ if (!ent)
+ break;
+
+ if (!strcmp(ent->d_name, ".") ||
+ !strcmp(ent->d_name, ".."))
+ continue;
+
+ strbuf_setlen(&from_buf, from_len);
+ strbuf_addstr(&from_buf, ent->d_name);
+
+ strbuf_setlen(&to_buf, to_len);
+ strbuf_addstr(&to_buf, ent->d_name);
+
+ ret = rename(from_buf.buf, to_buf.buf);
+ if (ret < 0) {
+ strbuf_addf(errbuf, "could not link file '%s' to '%s': %s",
+ from_buf.buf, to_buf.buf, strerror(errno));
+ goto done;
+ }
+ }
+
+ if (errno) {
+ strbuf_addf(errbuf, "could not read entry from directory '%s': %s",
+ from_path, strerror(errno));
+ ret = -1;
+ goto done;
+ }
+
+ ret = 0;
+
+done:
+ strbuf_release(&from_buf);
+ strbuf_release(&to_buf);
+ closedir(from_dir);
+ return ret;
+}
+
+static int count_reflogs(const char *reflog, void *payload)
+{
+ size_t *reflog_count = payload;
+ (*reflog_count)++;
+ return 0;
+}
+
+static int has_worktrees(void)
+{
+ struct worktree **worktrees = get_worktrees();
+ int ret = 0;
+ size_t i;
+
+ for (i = 0; worktrees[i]; i++) {
+ if (is_main_worktree(worktrees[i]))
+ continue;
+ ret = 1;
+ }
+
+ free_worktrees(worktrees);
+ return ret;
+}
+
+int repo_migrate_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format,
+ unsigned int flags,
+ struct strbuf *errbuf)
+{
+ struct ref_store *old_refs = NULL, *new_refs = NULL;
+ struct ref_transaction *transaction = NULL;
+ struct strbuf buf = STRBUF_INIT;
+ struct migration_data data;
+ size_t reflog_count = 0;
+ char *new_gitdir;
+ int ret;
+
+ old_refs = get_main_ref_store(repo);
+
+ /*
+ * The overall logic looks like this:
+ *
+ * 1. Set up a new temporary directory and initialize it with the new
+ * format. This is where all refs will be migrated into.
+ *
+ * 2. Enumerate all refs and write them into the new ref storage.
+ * This operation is safe as we do not yet modify the main
+ * repository.
+ *
+ * 3. If we're in dry-run mode then we are done and can hand over the
+ * directory to the caller for inspection. If not, we now start
+ * with the destructive part.
+ *
+ * 4. Delete the old ref storage from disk. As we have a copy of refs
+ * in the new ref storage it's okay(ish) if we now get interrupted
+ * as there is an equivalent copy of all refs available.
+ *
+ * 5. Move the new ref storage files into place.
+ *
+ * 6. Change the repository format to the new ref format.
+ */
+ strbuf_addf(&buf, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
+ new_gitdir = mkdtemp(buf.buf);
+ if (!new_gitdir) {
+ strbuf_addf(errbuf, "cannot create migration directory: %s",
+ strerror(errno));
+ ret = -1;
+ goto done;
+ }
+
+ if (refs_for_each_reflog(old_refs, count_reflogs, &reflog_count) < 0) {
+ strbuf_addstr(errbuf, "cannot count reflogs");
+ ret = -1;
+ goto done;
+ }
+ if (reflog_count) {
+ strbuf_addstr(errbuf, "migrating reflogs is not supported yet");
+ ret = -1;
+ goto done;
+ }
+
+ /*
+ * TODO: we should really be passing the caller-provided repository to
+ * `has_worktrees()`, but our worktree subsystem doesn't yet support
+ * that.
+ */
+ if (has_worktrees()) {
+ strbuf_addstr(errbuf, "migrating repositories with worktrees is not supported yet");
+ ret = -1;
+ goto done;
+ }
+
+ new_refs = ref_store_init(repo, format, new_gitdir,
+ REF_STORE_ALL_CAPS);
+ ret = ref_store_create_on_disk(new_refs, 0, errbuf);
+ if (ret < 0)
+ goto done;
+
+ transaction = ref_store_transaction_begin(new_refs, errbuf);
+ if (!transaction)
+ goto done;
+
+ data.old_refs = old_refs;
+ data.new_refs = new_refs;
+ data.transaction = transaction;
+ data.errbuf = errbuf;
+
+ /*
+ * We need to use the internal `do_for_each_ref()` here so that we can
+ * also include broken refs and symrefs. These would otherwise be
+ * skipped silently.
+ *
+ * Ideally, we would do this call while locking the old ref storage
+ * such that there cannot be any concurrent modifications. We do not
+ * have the infra for that though, and the "files" backend does not
+ * allow for a central lock due to its design. It's thus on the user to
+ * ensure that there are no concurrent writes.
+ */
+ ret = do_for_each_ref(old_refs, "", NULL, migrate_one_ref, 0,
+ DO_FOR_EACH_INCLUDE_ROOT_REFS | DO_FOR_EACH_INCLUDE_BROKEN,
+ &data);
+ if (ret < 0)
+ goto done;
+
+ /*
+ * TODO: we might want to migrate to `initial_ref_transaction_commit()`
+ * here, which is more efficient for the files backend because it would
+ * write new refs into the packed-refs file directly. At this point,
+ * the files backend doesn't handle pseudo-refs and symrefs correctly
+ * though, so this requires some more work.
+ */
+ ret = ref_transaction_commit(transaction, errbuf);
+ if (ret < 0)
+ goto done;
+
+ if (flags & REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN) {
+ printf(_("Finished dry-run migration of refs, "
+ "the result can be found at '%s'\n"), new_gitdir);
+ ret = 0;
+ goto done;
+ }
+
+ /*
+ * Until now we were in the non-destructive phase, where we only
+ * populated the new ref store. From hereon though we are about
+ * to get hands by deleting the old ref store and then moving
+ * the new one into place.
+ *
+ * Assuming that there were no concurrent writes, the new ref
+ * store should have all information. So if we fail from hereon
+ * we may be in an in-between state, but it would still be able
+ * to recover by manually moving remaining files from the
+ * temporary migration directory into place.
+ */
+ ret = ref_store_remove_on_disk(old_refs, errbuf);
+ if (ret < 0)
+ goto done;
+
+ ret = move_files(new_gitdir, old_refs->gitdir, errbuf);
+ if (ret < 0)
+ goto done;
+ rmdir(new_gitdir);
+
+ /*
+ * We have migrated the repository, so we now need to adjust the
+ * repository format so that clients will use the new ref store.
+ * We also need to swap out the repository's main ref store.
+ */
+ initialize_repository_version(hash_algo_by_ptr(repo->hash_algo), format, 1);
+
+ repo->refs_private = new_refs;
+ ref_store_release(old_refs);
+
+ ret = 0;
+
+done:
+ if (ret && new_refs)
+ ref_store_release(new_refs);
+ ref_transaction_free(transaction);
+ strbuf_release(&buf);
+ return ret;
+}
diff --git a/refs.h b/refs.h
index 61ee7b7a15..76d25df4de 100644
--- a/refs.h
+++ b/refs.h
@@ -1070,6 +1070,24 @@ int is_root_ref(const char *refname);
*/
int is_pseudo_ref(const char *refname);
+/*
+ * The following flags can be passed to `repo_migrate_ref_storage_format()`:
+ *
+ * - REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN: perform a dry-run migration
+ * without touching the main repository. The result will be written into a
+ * temporary ref storage directory.
+ */
+#define REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN (1 << 0)
+
+/*
+ * Migrate the ref storage format used by the repository to the
+ * specified one.
+ */
+int repo_migrate_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format,
+ unsigned int flags,
+ struct strbuf *err);
+
/*
* The following functions have been removed in Git v2.45 in favor of functions
* that receive a `ref_store` as parameter. The intent of this section is
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH v2 8/9] refs: implement logic to migrate between ref storage formats
2024-05-24 10:15 ` [PATCH v2 8/9] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
@ 2024-05-24 22:32 ` Justin Tobler
2024-05-28 5:14 ` Patrick Steinhardt
0 siblings, 1 reply; 103+ messages in thread
From: Justin Tobler @ 2024-05-24 22:32 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Eric Sunshine, Junio C Hamano
On 24/05/24 12:15PM, Patrick Steinhardt wrote:
> With the introduction of the new "reftable" backend, users may want to
> migrate repositories between the backends without having to recreate the
> whole repository. Add the logic to do so.
>
> The implementation is generic and works with arbitrary ref storage
> formats so that a backend does not need to implement any migration
> logic. It does have a few limitations though:
>
> - We do not migrate repositories with worktrees, because worktrees
> have separate ref storages. It makes the overall affair more complex
> if we have to migrate multiple storages at once.
>
> - We do not migrate reflogs, because we have no interfaces to write
> many reflog entries.
>
> - We do not lock the repository for concurrent access, and thus
> concurrent writes may make use end up with weird in-between states.
Let's drop the "make use" in this line.
> There is no way to fully lock the "files" backend for writes due to
> its format, and thus we punt on this topic altogether and defer to
> the user to avoid those from happening.
>
> In other words, this version is a minimum viable product for migrating a
> repository's ref storage format. It works alright for bare repos, which
> often have neither worktrees nor reflogs. But it will not work for many
> other repositories without some preparations. These limitations are not
> set into stone though, and ideally we will eventually address them over
> time.
>
> The logic is not yet used by anything, and thus there are no tests for
> it. Those will be added in the next commit.
[snip]
> +int repo_migrate_ref_storage_format(struct repository *repo,
> + enum ref_storage_format format,
> + unsigned int flags,
> + struct strbuf *errbuf)
> +{
> + struct ref_store *old_refs = NULL, *new_refs = NULL;
> + struct ref_transaction *transaction = NULL;
> + struct strbuf buf = STRBUF_INIT;
> + struct migration_data data;
> + size_t reflog_count = 0;
> + char *new_gitdir;
> + int ret;
> +
> + old_refs = get_main_ref_store(repo);
> +
> + /*
> + * The overall logic looks like this:
> + *
> + * 1. Set up a new temporary directory and initialize it with the new
> + * format. This is where all refs will be migrated into.
> + *
> + * 2. Enumerate all refs and write them into the new ref storage.
> + * This operation is safe as we do not yet modify the main
> + * repository.
> + *
> + * 3. If we're in dry-run mode then we are done and can hand over the
> + * directory to the caller for inspection. If not, we now start
> + * with the destructive part.
> + *
> + * 4. Delete the old ref storage from disk. As we have a copy of refs
> + * in the new ref storage it's okay(ish) if we now get interrupted
> + * as there is an equivalent copy of all refs available.
> + *
> + * 5. Move the new ref storage files into place.
> + *
> + * 6. Change the repository format to the new ref format.
> + */
> + strbuf_addf(&buf, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
> + new_gitdir = mkdtemp(buf.buf);
> + if (!new_gitdir) {
> + strbuf_addf(errbuf, "cannot create migration directory: %s",
> + strerror(errno));
> + ret = -1;
> + goto done;
> + }
If the repository contains reflogs or has worktrees the migration does
not proceed. This means that the created tempdir gets left behind with
no indication and would be left to the user clean it up.
Instead, we could move tempdir creation to after these checks so it is
not needlessly created.
> +
> + if (refs_for_each_reflog(old_refs, count_reflogs, &reflog_count) < 0) {
> + strbuf_addstr(errbuf, "cannot count reflogs");
> + ret = -1;
> + goto done;
> + }
> + if (reflog_count) {
> + strbuf_addstr(errbuf, "migrating reflogs is not supported yet");
> + ret = -1;
> + goto done;
> + }
> +
> + /*
> + * TODO: we should really be passing the caller-provided repository to
> + * `has_worktrees()`, but our worktree subsystem doesn't yet support
> + * that.
> + */
> + if (has_worktrees()) {
> + strbuf_addstr(errbuf, "migrating repositories with worktrees is not supported yet");
> + ret = -1;
> + goto done;
> + }
[snip]
> + /*
> + * Until now we were in the non-destructive phase, where we only
> + * populated the new ref store. From hereon though we are about
> + * to get hands by deleting the old ref store and then moving
> + * the new one into place.
> + *
> + * Assuming that there were no concurrent writes, the new ref
> + * store should have all information. So if we fail from hereon
> + * we may be in an in-between state, but it would still be able
> + * to recover by manually moving remaining files from the
> + * temporary migration directory into place.
> + */
If there a failure after this point, should we provide a hint to user
that the refernces exist in the tempdir?
-Justin
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v2 8/9] refs: implement logic to migrate between ref storage formats
2024-05-24 22:32 ` Justin Tobler
@ 2024-05-28 5:14 ` Patrick Steinhardt
0 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 5:14 UTC (permalink / raw)
To: Justin Tobler; +Cc: git, Eric Sunshine, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 3193 bytes --]
On Fri, May 24, 2024 at 05:32:20PM -0500, Justin Tobler wrote:
> On 24/05/24 12:15PM, Patrick Steinhardt wrote:
[snip]
> > + /*
> > + * The overall logic looks like this:
> > + *
> > + * 1. Set up a new temporary directory and initialize it with the new
> > + * format. This is where all refs will be migrated into.
> > + *
> > + * 2. Enumerate all refs and write them into the new ref storage.
> > + * This operation is safe as we do not yet modify the main
> > + * repository.
> > + *
> > + * 3. If we're in dry-run mode then we are done and can hand over the
> > + * directory to the caller for inspection. If not, we now start
> > + * with the destructive part.
> > + *
> > + * 4. Delete the old ref storage from disk. As we have a copy of refs
> > + * in the new ref storage it's okay(ish) if we now get interrupted
> > + * as there is an equivalent copy of all refs available.
> > + *
> > + * 5. Move the new ref storage files into place.
> > + *
> > + * 6. Change the repository format to the new ref format.
> > + */
> > + strbuf_addf(&buf, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
> > + new_gitdir = mkdtemp(buf.buf);
> > + if (!new_gitdir) {
> > + strbuf_addf(errbuf, "cannot create migration directory: %s",
> > + strerror(errno));
> > + ret = -1;
> > + goto done;
> > + }
>
> If the repository contains reflogs or has worktrees the migration does
> not proceed. This means that the created tempdir gets left behind with
> no indication and would be left to the user clean it up.
>
> Instead, we could move tempdir creation to after these checks so it is
> not needlessly created.
True, done.
> > + if (refs_for_each_reflog(old_refs, count_reflogs, &reflog_count) < 0) {
> > + strbuf_addstr(errbuf, "cannot count reflogs");
> > + ret = -1;
> > + goto done;
> > + }
> > + if (reflog_count) {
> > + strbuf_addstr(errbuf, "migrating reflogs is not supported yet");
> > + ret = -1;
> > + goto done;
> > + }
> > +
> > + /*
> > + * TODO: we should really be passing the caller-provided repository to
> > + * `has_worktrees()`, but our worktree subsystem doesn't yet support
> > + * that.
> > + */
> > + if (has_worktrees()) {
> > + strbuf_addstr(errbuf, "migrating repositories with worktrees is not supported yet");
> > + ret = -1;
> > + goto done;
> > + }
> [snip]
> > + /*
> > + * Until now we were in the non-destructive phase, where we only
> > + * populated the new ref store. From hereon though we are about
> > + * to get hands by deleting the old ref store and then moving
> > + * the new one into place.
> > + *
> > + * Assuming that there were no concurrent writes, the new ref
> > + * store should have all information. So if we fail from hereon
> > + * we may be in an in-between state, but it would still be able
> > + * to recover by manually moving remaining files from the
> > + * temporary migration directory into place.
> > + */
>
> If there a failure after this point, should we provide a hint to user
> that the refernces exist in the tempdir?
Good idea.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v2 9/9] builtin/refs: new command to migrate ref storage formats
2024-05-24 10:14 ` [PATCH v2 " Patrick Steinhardt
` (7 preceding siblings ...)
2024-05-24 10:15 ` [PATCH v2 8/9] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
@ 2024-05-24 10:15 ` Patrick Steinhardt
2024-05-24 18:24 ` Ramsay Jones
8 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-24 10:15 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 15602 bytes --]
Introduce a new command that allows the user to migrate a repository
between ref storage formats. This new command is implemented as part of
a new git-refs(1) executable. This is due to two reasons:
- There is no good place to put the migration logic in existing
commands. git-maintenance(1) felt unwieldy, and git-pack-refs(1) is
not the correct place to put it, either.
- I had it in my mind to create a new low-level command for accessing
refs for quite a while already. git-refs(1) is that command and can
over time grow more functionality relating to refs. This should help
discoverability by consolidating low-level access to refs into a
single executable.
As mentioned in the preceding commit that introduces the ref storage
format migration logic, the new `git refs migrate` command still has a
bunch of restrictions. These restrictions are documented accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
.gitignore | 1 +
Documentation/git-refs.txt | 62 ++++++++++
Makefile | 1 +
builtin.h | 1 +
builtin/refs.c | 75 ++++++++++++
git.c | 1 +
t/t1460-refs-migrate.sh | 243 +++++++++++++++++++++++++++++++++++++
7 files changed, 384 insertions(+)
create mode 100644 Documentation/git-refs.txt
create mode 100644 builtin/refs.c
create mode 100755 t/t1460-refs-migrate.sh
diff --git a/.gitignore b/.gitignore
index 612c0f6a0f..8caf3700c2 100644
--- a/.gitignore
+++ b/.gitignore
@@ -126,6 +126,7 @@
/git-rebase
/git-receive-pack
/git-reflog
+/git-refs
/git-remote
/git-remote-http
/git-remote-https
diff --git a/Documentation/git-refs.txt b/Documentation/git-refs.txt
new file mode 100644
index 0000000000..3f73ad6aa6
--- /dev/null
+++ b/Documentation/git-refs.txt
@@ -0,0 +1,62 @@
+git-refs(1)
+===========
+
+NAME
+----
+
+git-refs - Low-level access to refs
+
+SYNOPSIS
+--------
+
+[verse]
+'git refs migrate' --ref-format=<format> [--dry-run]
+
+DESCRIPTION
+-----------
+
+This command provides low-level access to refs.
+
+COMMANDS
+--------
+
+migrate::
+ Migrate ref store between different formats.
+
+OPTIONS
+-------
+
+The following options are specific to 'git refs migrate':
+
+--ref-format=<format>::
+ The ref format to migrate the ref store to. Can be one of:
++
+include::ref-storage-format.txt[]
+
+--dry-run::
+ Perform the migration, but do not modify the repository. The migrated
+ refs will be written into a separate directory that can be inspected
+ separately. The name of the directory will be reported on stdout. This
+ can be used to double check that the migration works as expected doing
+ performing the actual migration.
+
+KNOWN LIMITATIONS
+-----------------
+
+The ref format migration has several known limitations in its current form:
+
+* It is not possible to migrate repositories that have reflogs.
+
+* It is not possible to migrate repositories that have worktrees.
+
+* There is no way to block concurrent writes to the repository during an
+ ongoing migration. Concurrent writes can lead to an inconsistent migrated
+ state. Users are expected to block writes on a higher level. If your
+ repository is registered for scheduled maintenance, it is recommended to
+ unregister it first with git-maintenance(1).
+
+These limitations may eventually be lifted.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index cf504963c2..2d702b552c 100644
--- a/Makefile
+++ b/Makefile
@@ -1283,6 +1283,7 @@ BUILTIN_OBJS += builtin/read-tree.o
BUILTIN_OBJS += builtin/rebase.o
BUILTIN_OBJS += builtin/receive-pack.o
BUILTIN_OBJS += builtin/reflog.o
+BUILTIN_OBJS += builtin/refs.o
BUILTIN_OBJS += builtin/remote-ext.o
BUILTIN_OBJS += builtin/remote-fd.o
BUILTIN_OBJS += builtin/remote.o
diff --git a/builtin.h b/builtin.h
index 28280636da..7eda9b2486 100644
--- a/builtin.h
+++ b/builtin.h
@@ -207,6 +207,7 @@ int cmd_rebase(int argc, const char **argv, const char *prefix);
int cmd_rebase__interactive(int argc, const char **argv, const char *prefix);
int cmd_receive_pack(int argc, const char **argv, const char *prefix);
int cmd_reflog(int argc, const char **argv, const char *prefix);
+int cmd_refs(int argc, const char **argv, const char *prefix);
int cmd_remote(int argc, const char **argv, const char *prefix);
int cmd_remote_ext(int argc, const char **argv, const char *prefix);
int cmd_remote_fd(int argc, const char **argv, const char *prefix);
diff --git a/builtin/refs.c b/builtin/refs.c
new file mode 100644
index 0000000000..02401afa4e
--- /dev/null
+++ b/builtin/refs.c
@@ -0,0 +1,75 @@
+#include "builtin.h"
+#include "parse-options.h"
+#include "refs.h"
+#include "repository.h"
+#include "strbuf.h"
+
+#define REFS_MIGRATE_USAGE \
+ N_("git refs migrate --ref-format=<format> [--dry-run]")
+
+static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
+{
+ const char * const migrate_usage[] = {
+ REFS_MIGRATE_USAGE,
+ NULL
+ };
+ const char *format_str = NULL;
+ enum ref_storage_format format;
+ unsigned int flags = 0;
+ struct option options[] = {
+ OPT_STRING_F(0, "ref-format", &format_str, N_("format"),
+ N_("specify the reference format to convert to"),
+ PARSE_OPT_NONEG),
+ OPT_BIT(0, "dry-run", &flags,
+ N_("perform a non-destructive dry-run"),
+ REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN),
+ OPT_END(),
+ };
+ struct strbuf errbuf = STRBUF_INIT;
+ int err;
+
+ argc = parse_options(argc, argv, prefix, options, migrate_usage, 0);
+ if (argc)
+ usage(_("too many arguments"));
+ if (!format_str)
+ usage(_("missing --ref-format=<format>"));
+
+ format = ref_storage_format_by_name(format_str);
+ if (format == REF_STORAGE_FORMAT_UNKNOWN) {
+ err = error(_("unknown ref storage format '%s'"), format_str);
+ goto out;
+ }
+
+ if (the_repository->ref_storage_format == format) {
+ err = error(_("repository already uses '%s' format"),
+ ref_storage_format_to_name(format));
+ goto out;
+ }
+
+ if (repo_migrate_ref_storage_format(the_repository, format, flags, &errbuf) < 0) {
+ err = error("%s", errbuf.buf);
+ goto out;
+ }
+
+ err = 0;
+
+out:
+ strbuf_release(&errbuf);
+ return err;
+}
+
+int cmd_refs(int argc, const char **argv, const char *prefix)
+{
+ const char * const refs_usage[] = {
+ REFS_MIGRATE_USAGE,
+ NULL,
+ };
+ parse_opt_subcommand_fn *fn = NULL;
+ struct option opts[] = {
+ OPT_SUBCOMMAND("migrate", &fn, cmd_refs_migrate),
+ OPT_END(),
+ };
+
+ argc = parse_options(argc, argv, prefix, opts, refs_usage, 0);
+ return fn(argc, argv, prefix);
+}
diff --git a/git.c b/git.c
index 637c61ca9c..683bb69194 100644
--- a/git.c
+++ b/git.c
@@ -594,6 +594,7 @@ static struct cmd_struct commands[] = {
{ "rebase", cmd_rebase, RUN_SETUP | NEED_WORK_TREE },
{ "receive-pack", cmd_receive_pack },
{ "reflog", cmd_reflog, RUN_SETUP },
+ { "refs", cmd_refs, RUN_SETUP },
{ "remote", cmd_remote, RUN_SETUP },
{ "remote-ext", cmd_remote_ext, NO_PARSEOPT },
{ "remote-fd", cmd_remote_fd, NO_PARSEOPT },
diff --git a/t/t1460-refs-migrate.sh b/t/t1460-refs-migrate.sh
new file mode 100755
index 0000000000..f7c0783d30
--- /dev/null
+++ b/t/t1460-refs-migrate.sh
@@ -0,0 +1,243 @@
+#!/bin/sh
+
+test_description='migration of ref storage backends'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_migration () {
+ git -C "$1" for-each-ref --include-root-refs \
+ --format='%(refname) %(objectname) %(symref)' >expect &&
+ git -C "$1" refs migrate --ref-format="$2" &&
+ git -C "$1" for-each-ref --include-root-refs \
+ --format='%(refname) %(objectname) %(symref)' >actual &&
+ test_cmp expect actual &&
+
+ git -C "$1" rev-parse --show-ref-format >actual &&
+ echo "$2" >expect &&
+ test_cmp expect actual
+}
+
+test_expect_success 'setup' '
+ rm -rf .git &&
+ # The migration does not yet support reflogs.
+ git config --global core.logAllRefUpdates false
+'
+
+test_expect_success "superfluous arguments" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate foo 2>err &&
+ cat >expect <<-EOF &&
+ usage: too many arguments
+ EOF
+ test_cmp expect err
+'
+
+test_expect_success "missing ref storage format" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate 2>err &&
+ cat >expect <<-EOF &&
+ usage: missing --ref-format=<format>
+ EOF
+ test_cmp expect err
+'
+
+test_expect_success "unknown ref storage format" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=unknown 2>err &&
+ cat >expect <<-EOF &&
+ error: unknown ref storage format ${SQ}unknown${SQ}
+ EOF
+ test_cmp expect err
+'
+
+ref_formats="files reftable"
+for from_format in $ref_formats
+do
+ for to_format in $ref_formats
+ do
+ if test "$from_format" = "$to_format"
+ then
+ continue
+ fi
+
+ test_expect_success "$from_format: migration to same format fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$from_format 2>err &&
+ cat >expect <<-EOF &&
+ error: repository already uses ${SQ}$from_format${SQ} format
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: migration with reflog fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_config -C repo core.logAllRefUpdates true &&
+ test_commit -C repo logged &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$to_format 2>err &&
+ cat >expect <<-EOF &&
+ error: migrating reflogs is not supported yet
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: migration with worktree fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ git -C repo worktree add wt &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$to_format 2>err &&
+ cat >expect <<-EOF &&
+ error: migrating repositories with worktrees is not supported yet
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: unborn HEAD" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: single ref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: bare repository" '
+ test_when_finished "rm -rf repo repo.git" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git clone --ref-format=$from_format --mirror repo repo.git &&
+ test_migration repo.git "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: dangling symref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo symbolic-ref BROKEN_HEAD refs/heads/nonexistent &&
+ test_migration repo "$to_format" &&
+ echo refs/heads/nonexistent >expect &&
+ git -C repo symbolic-ref BROKEN_HEAD >actual &&
+ test_cmp expect actual
+ '
+
+ test_expect_success "$from_format -> $to_format: broken ref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ test-tool -C repo ref-store main update-ref "" refs/heads/broken \
+ "$(test_oid 001)" "$ZERO_OID" REF_SKIP_CREATE_REFLOG,REF_SKIP_OID_VERIFICATION &&
+ test_migration repo "$to_format" &&
+ test_oid 001 >expect &&
+ git -C repo rev-parse refs/heads/broken >actual &&
+ test_cmp expect actual
+ '
+
+ test_expect_success "$from_format -> $to_format: pseudo-refs" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo update-ref FOO_HEAD HEAD &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: special refs are left alone" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo rev-parse HEAD >repo/.git/MERGE_HEAD &&
+ git -C repo rev-parse MERGE_HEAD &&
+ test_migration repo "$to_format" &&
+ test_path_is_file repo/.git/MERGE_HEAD
+ '
+
+ test_expect_success "$from_format -> $to_format: a bunch of refs" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+
+ test_commit -C repo initial &&
+ cat >input <<-EOF &&
+ create FOO_HEAD HEAD
+ create refs/heads/branch-1 HEAD
+ create refs/heads/branch-2 HEAD
+ create refs/heads/branch-3 HEAD
+ create refs/heads/branch-4 HEAD
+ create refs/tags/tag-1 HEAD
+ create refs/tags/tag-2 HEAD
+ EOF
+ git -C repo update-ref --stdin <input &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: dry-run migration does not modify repository" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo refs migrate --dry-run \
+ --ref-format=$to_format >output &&
+ grep "Finished dry-run migration of refs" output &&
+ test_path_is_dir repo/.git/ref_migration.* &&
+ echo $from_format >expect &&
+ git -C repo rev-parse --show-ref-format >actual &&
+ test_cmp expect actual
+ '
+ done
+done
+
+test_expect_success 'migrating from files format deletes backend files' '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=files repo &&
+ test_commit -C repo first &&
+ git -C repo pack-refs --all &&
+ test_commit -C repo second &&
+ git -C repo update-ref ORIG_HEAD HEAD &&
+ git -C repo rev-parse HEAD >repo/.git/FETCH_HEAD &&
+
+ test_path_is_file repo/.git/HEAD &&
+ test_path_is_file repo/.git/ORIG_HEAD &&
+ test_path_is_file repo/.git/refs/heads/main &&
+ test_path_is_file repo/.git/packed-refs &&
+
+ test_migration repo reftable &&
+
+ echo "ref: refs/heads/.invalid" >expect &&
+ test_cmp expect repo/.git/HEAD &&
+ echo "this repository uses the reftable format" >expect &&
+ test_cmp expect repo/.git/refs/heads &&
+ test_path_is_file repo/.git/FETCH_HEAD &&
+ test_path_is_missing repo/.git/ORIG_HEAD &&
+ test_path_is_missing repo/.git/refs/heads/main &&
+ test_path_is_missing repo/.git/logs &&
+ test_path_is_missing repo/.git/packed-refs
+'
+
+test_expect_success 'migrating from reftable format deletes backend files' '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=reftable repo &&
+ test_commit -C repo first &&
+
+ test_path_is_dir repo/.git/reftable &&
+ test_migration repo files &&
+
+ test_path_is_missing repo/.git/reftable &&
+ echo "ref: refs/heads/main" >expect &&
+ test_cmp expect repo/.git/HEAD &&
+ test_path_is_file repo/.git/refs/heads/main
+'
+
+test_done
--
2.45.1.216.g4365c6fcf9.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH v2 9/9] builtin/refs: new command to migrate ref storage formats
2024-05-24 10:15 ` [PATCH v2 9/9] builtin/refs: new command to migrate " Patrick Steinhardt
@ 2024-05-24 18:24 ` Ramsay Jones
2024-05-24 19:29 ` Eric Sunshine
0 siblings, 1 reply; 103+ messages in thread
From: Ramsay Jones @ 2024-05-24 18:24 UTC (permalink / raw)
To: Patrick Steinhardt, git; +Cc: Eric Sunshine, Junio C Hamano
On 24/05/2024 11:15, Patrick Steinhardt wrote:
[snip]
> diff --git a/Documentation/git-refs.txt b/Documentation/git-refs.txt
> new file mode 100644
> index 0000000000..3f73ad6aa6
> --- /dev/null
> +++ b/Documentation/git-refs.txt
> @@ -0,0 +1,62 @@
> +git-refs(1)
> +===========
> +
> +NAME
> +----
> +
> +git-refs - Low-level access to refs
> +
> +SYNOPSIS
> +--------
> +
> +[verse]
> +'git refs migrate' --ref-format=<format> [--dry-run]
> +
> +DESCRIPTION
> +-----------
> +
> +This command provides low-level access to refs.
> +
> +COMMANDS
> +--------
> +
> +migrate::
> + Migrate ref store between different formats.
> +
> +OPTIONS
> +-------
> +
> +The following options are specific to 'git refs migrate':
> +
> +--ref-format=<format>::
> + The ref format to migrate the ref store to. Can be one of:
> ++
> +include::ref-storage-format.txt[]
> +
> +--dry-run::
> + Perform the migration, but do not modify the repository. The migrated
> + refs will be written into a separate directory that can be inspected
> + separately. The name of the directory will be reported on stdout. This
> + can be used to double check that the migration works as expected doing
> + performing the actual migration.
s/expected doing performing/expected before performing/ ?
ATB,
Ramsay Jones
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v2 9/9] builtin/refs: new command to migrate ref storage formats
2024-05-24 18:24 ` Ramsay Jones
@ 2024-05-24 19:29 ` Eric Sunshine
2024-05-28 5:14 ` Patrick Steinhardt
0 siblings, 1 reply; 103+ messages in thread
From: Eric Sunshine @ 2024-05-24 19:29 UTC (permalink / raw)
To: Ramsay Jones; +Cc: Patrick Steinhardt, git, Junio C Hamano
On Fri, May 24, 2024 at 2:24 PM Ramsay Jones
<ramsay@ramsayjones.plus.com> wrote:
> On 24/05/2024 11:15, Patrick Steinhardt wrote:
> > +--dry-run::
> > + Perform the migration, but do not modify the repository. The migrated
> > + refs will be written into a separate directory that can be inspected
> > + separately. The name of the directory will be reported on stdout. This
> > + can be used to double check that the migration works as expected doing
> > + performing the actual migration.
>
> s/expected doing performing/expected before performing/ ?
The "doing performing" bit was noticed earlier[1]. I suppose in trying
to fix it, Patrick accidentally removed "before" rather than removing
either "doing" or "performing".
[1] https://lore.kernel.org/git/xmqqv833maxu.fsf@gitster.g/T/#m2c3eced90c6cd61bf3acda1acc354b4ab76011d3
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v2 9/9] builtin/refs: new command to migrate ref storage formats
2024-05-24 19:29 ` Eric Sunshine
@ 2024-05-28 5:14 ` Patrick Steinhardt
0 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 5:14 UTC (permalink / raw)
To: Eric Sunshine; +Cc: Ramsay Jones, git, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 1034 bytes --]
On Fri, May 24, 2024 at 03:29:06PM -0400, Eric Sunshine wrote:
> On Fri, May 24, 2024 at 2:24 PM Ramsay Jones
> <ramsay@ramsayjones.plus.com> wrote:
> > On 24/05/2024 11:15, Patrick Steinhardt wrote:
> > > +--dry-run::
> > > + Perform the migration, but do not modify the repository. The migrated
> > > + refs will be written into a separate directory that can be inspected
> > > + separately. The name of the directory will be reported on stdout. This
> > > + can be used to double check that the migration works as expected doing
> > > + performing the actual migration.
> >
> > s/expected doing performing/expected before performing/ ?
>
> The "doing performing" bit was noticed earlier[1]. I suppose in trying
> to fix it, Patrick accidentally removed "before" rather than removing
> either "doing" or "performing".
>
> [1] https://lore.kernel.org/git/xmqqv833maxu.fsf@gitster.g/T/#m2c3eced90c6cd61bf3acda1acc354b4ab76011d3
Ugh, yeah, that's probably what happened. Thanks!
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v3 00/12] refs: ref storage format migrations
2024-05-23 8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
` (10 preceding siblings ...)
2024-05-24 10:14 ` [PATCH v2 " Patrick Steinhardt
@ 2024-05-28 6:31 ` Patrick Steinhardt
2024-05-28 6:31 ` [PATCH v3 01/12] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
` (12 more replies)
2024-06-03 9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
13 siblings, 13 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 6:31 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 15390 bytes --]
Hi,
this is the third version of my patch series that implements ref storage
format migrations.
Changes compared to v2:
- Perform sanity checks for worktrees and reflogs before we create the
temporary refdb directory.
- Swapped out calls to `remove_path()` to `unlink()`. We do not want
to walk up and remove empty parent directories, even though this is
harmless in practice.
- Release the reftable refdb before removing it. This closes the
cached "tables.list" file descriptor, which would otherwise break
removal of this file on Windows.
- Fix a bug with worktrees where we store the current worktree refdb
twice. This caused us to keep file descriptors open, which breaks
removal of the refdb on Windows.
- Simplify freeing reftable's merged tables. This isn't really needed
by this series, but I stumbled over this while investigating why
things break on Windows.
- Improve error messages to add `strerror(errno)`, which helped me in
debugging those errors.
- Print path to the migrated refs if things fail after we have
populated them such that users can recover.
- Fix segfault when releasing a partially initialized "files" ref
store.
- Some smallish improvements littered across the patches.
Thanks!
Patrick
Patrick Steinhardt (12):
setup: unset ref storage when reinitializing repository version
refs: convert ref storage format to an enum
refs: pass storage format to `ref_store_init()` explicitly
refs: allow to skip creation of reflog entries
refs/files: refactor `add_pseudoref_and_head_entries()`
refs/files: extract function to iterate through root refs
refs/files: fix NULL pointer deref when releasing ref store
reftable: inline `merged_table_release()`
worktree: don't store main worktree twice
refs: implement removal of ref storages
refs: implement logic to migrate between ref storage formats
builtin/refs: new command to migrate ref storage formats
.gitignore | 1 +
Documentation/git-refs.txt | 62 +++++++
Makefile | 1 +
builtin.h | 1 +
builtin/clone.c | 2 +-
builtin/init-db.c | 2 +-
builtin/refs.c | 75 ++++++++
git.c | 1 +
refs.c | 340 +++++++++++++++++++++++++++++++++++--
refs.h | 41 ++++-
refs/files-backend.c | 123 ++++++++++++--
refs/packed-backend.c | 15 ++
refs/ref-cache.c | 2 +
refs/refs-internal.h | 7 +
refs/reftable-backend.c | 55 +++++-
reftable/merged.c | 12 +-
reftable/merged.h | 2 -
reftable/stack.c | 8 +-
repository.c | 3 +-
repository.h | 10 +-
setup.c | 10 +-
setup.h | 9 +-
t/helper/test-ref-store.c | 1 +
t/t1460-refs-migrate.sh | 243 ++++++++++++++++++++++++++
worktree.c | 29 ++--
25 files changed, 974 insertions(+), 81 deletions(-)
create mode 100644 Documentation/git-refs.txt
create mode 100644 builtin/refs.c
create mode 100755 t/t1460-refs-migrate.sh
Range-diff against v2:
1: 8b11127daf ! 1: afb705f6a0 setup: unset ref storage when reinitializing repository version
@@ Commit message
storages though, so this is about to become an issue there.
Prepare for this and unset the ref storage format when reinitializing a
- repoistory with the "files" format.
+ repository with the "files" format.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
2: 25f740f395 = 2: 7989e82dcd refs: convert ref storage format to an enum
3: 6e7b9764f6 = 3: 7d1a86292c refs: pass storage format to `ref_store_init()` explicitly
4: 03f4ac6ee7 = 4: d0539b7456 refs: allow to skip creation of reflog entries
5: 71f31fe66c = 5: 7f9ce5af2e refs/files: refactor `add_pseudoref_and_head_entries()`
6: 6b696690ca = 6: f7577a0ab3 refs/files: extract function to iterate through root refs
-: ---------- > 7: 56baa798fb refs/files: fix NULL pointer deref when releasing ref store
-: ---------- > 8: c7e8ab40b5 reftable: inline `merged_table_release()`
-: ---------- > 9: 7a89aae515 worktree: don't store main worktree twice
7: b758c419c6 ! 10: f9d9420cf9 refs: implement removal of ref storages
@@ refs/files-backend.c: static int files_ref_store_create_on_disk(struct ref_store
+
+ strbuf_addf(&buf, "%s/%s", data->gitdir, refname);
+
-+ ret = remove_path(buf.buf);
++ ret = unlink(buf.buf);
+ if (ret < 0)
+ strbuf_addf(data->err, "could not delete %s: %s\n",
+ refname, strerror(errno));
@@ refs/files-backend.c: static int files_ref_store_create_on_disk(struct ref_store
+
+ strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
-+ strbuf_addstr(err, "could not delete refs");
++ strbuf_addf(err, "could not delete refs: %s",
++ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/logs", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
-+ strbuf_addstr(err, "could not delete logs\n");
++ strbuf_addf(err, "could not delete logs: %s",
++ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
@@ refs/reftable-backend.c: static int reftable_be_create_on_disk(struct ref_store
+ struct strbuf sb = STRBUF_INIT;
+ int ret = 0;
+
++ /*
++ * Release the ref store such that all stacks are closed. This is
++ * required so that the "tables.list" file is not open anymore, which
++ * would otherwise make it impossible to remove the file on Windows.
++ */
++ reftable_be_release(ref_store);
++
+ strbuf_addf(&sb, "%s/reftable", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
-+ strbuf_addstr(err, "could not delete reftables");
++ strbuf_addf(err, "could not delete reftables: %s",
++ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/HEAD", refs->base.gitdir);
-+ if (remove_path(sb.buf) < 0) {
-+ strbuf_addstr(err, "could not delete stub HEAD");
++ if (unlink(sb.buf) < 0) {
++ strbuf_addf(err, "could not delete stub HEAD: %s",
++ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/refs/heads", refs->base.gitdir);
-+ if (remove_path(sb.buf) < 0) {
-+ strbuf_addstr(err, "could not delete stub heads");
++ if (unlink(sb.buf) < 0) {
++ strbuf_addf(err, "could not delete stub heads: %s",
++ strerror(errno));
++ ret = -1;
++ }
++ strbuf_reset(&sb);
++
++ strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
++ if (rmdir(sb.buf) < 0) {
++ strbuf_addf(err, "could not delete stub heads: %s",
++ strerror(errno));
+ ret = -1;
+ }
+
8: 4d3eb5ea89 ! 11: 1f26051eff refs: implement logic to migrate between ref storage formats
@@ Commit message
many reflog entries.
- We do not lock the repository for concurrent access, and thus
- concurrent writes may make use end up with weird in-between states.
- There is no way to fully lock the "files" backend for writes due to
- its format, and thus we punt on this topic altogether and defer to
- the user to avoid those from happening.
+ concurrent writes may end up with weird in-between states. There is
+ no way to fully lock the "files" backend for writes due to its
+ format, and thus we punt on this topic altogether and defer to the
+ user to avoid those from happening.
In other words, this version is a minimum viable product for migrating a
repository's ref storage format. It works alright for bare repos, which
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+
+struct migration_data {
+ struct ref_store *old_refs;
-+ struct ref_store *new_refs;
+ struct ref_transaction *transaction;
+ struct strbuf *errbuf;
-+ const char *refname;
+};
+
+static int migrate_one_ref(const char *refname, const struct object_id *oid,
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+
+ from_dir = opendir(from_path);
+ if (!from_dir) {
-+ strbuf_addf(errbuf, "could not open source directory: '%s'", from_path);
++ strbuf_addf(errbuf, "could not open source directory '%s': %s",
++ from_path, strerror(errno));
+ ret = -1;
+ goto done;
+ }
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+done:
+ strbuf_release(&from_buf);
+ strbuf_release(&to_buf);
-+ closedir(from_dir);
++ if (from_dir)
++ closedir(from_dir);
+ return ret;
+}
+
-+static int count_reflogs(const char *reflog, void *payload)
++static int count_reflogs(const char *reflog UNUSED, void *payload)
+{
+ size_t *reflog_count = payload;
+ (*reflog_count)++;
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+ struct strbuf buf = STRBUF_INIT;
+ struct migration_data data;
+ size_t reflog_count = 0;
-+ char *new_gitdir;
++ char *new_gitdir = NULL;
++ int did_migrate_refs = 0;
+ int ret;
+
+ old_refs = get_main_ref_store(repo);
+
+ /*
++ * We do not have any interfaces that would allow us to write many
++ * reflog entries. Once we have them we can remove this restriction.
++ */
++ if (refs_for_each_reflog(old_refs, count_reflogs, &reflog_count) < 0) {
++ strbuf_addstr(errbuf, "cannot count reflogs");
++ ret = -1;
++ goto done;
++ }
++ if (reflog_count) {
++ strbuf_addstr(errbuf, "migrating reflogs is not supported yet");
++ ret = -1;
++ goto done;
++ }
++
++ /*
++ * Worktrees complicate the migration because every worktree has a
++ * separate ref storage. While it should be feasible to implement, this
++ * is pushed out to a future iteration.
++ *
++ * TODO: we should really be passing the caller-provided repository to
++ * `has_worktrees()`, but our worktree subsystem doesn't yet support
++ * that.
++ */
++ if (has_worktrees()) {
++ strbuf_addstr(errbuf, "migrating repositories with worktrees is not supported yet");
++ ret = -1;
++ goto done;
++ }
++
++ /*
+ * The overall logic looks like this:
+ *
+ * 1. Set up a new temporary directory and initialize it with the new
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+ * 6. Change the repository format to the new ref format.
+ */
+ strbuf_addf(&buf, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
-+ new_gitdir = mkdtemp(buf.buf);
++ new_gitdir = mkdtemp(xstrdup(buf.buf));
+ if (!new_gitdir) {
+ strbuf_addf(errbuf, "cannot create migration directory: %s",
+ strerror(errno));
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+ goto done;
+ }
+
-+ if (refs_for_each_reflog(old_refs, count_reflogs, &reflog_count) < 0) {
-+ strbuf_addstr(errbuf, "cannot count reflogs");
-+ ret = -1;
-+ goto done;
-+ }
-+ if (reflog_count) {
-+ strbuf_addstr(errbuf, "migrating reflogs is not supported yet");
-+ ret = -1;
-+ goto done;
-+ }
-+
-+ /*
-+ * TODO: we should really be passing the caller-provided repository to
-+ * `has_worktrees()`, but our worktree subsystem doesn't yet support
-+ * that.
-+ */
-+ if (has_worktrees()) {
-+ strbuf_addstr(errbuf, "migrating repositories with worktrees is not supported yet");
-+ ret = -1;
-+ goto done;
-+ }
-+
+ new_refs = ref_store_init(repo, format, new_gitdir,
+ REF_STORE_ALL_CAPS);
+ ret = ref_store_create_on_disk(new_refs, 0, errbuf);
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+ goto done;
+
+ data.old_refs = old_refs;
-+ data.new_refs = new_refs;
+ data.transaction = transaction;
+ data.errbuf = errbuf;
+
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+ ret = ref_transaction_commit(transaction, errbuf);
+ if (ret < 0)
+ goto done;
++ did_migrate_refs = 1;
+
+ if (flags & REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN) {
+ printf(_("Finished dry-run migration of refs, "
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+ ret = move_files(new_gitdir, old_refs->gitdir, errbuf);
+ if (ret < 0)
+ goto done;
-+ rmdir(new_gitdir);
++
++ if (rmdir(new_gitdir) < 0)
++ warning_errno(_("could not remove temporary migration directory '%s'"),
++ new_gitdir);
+
+ /*
+ * We have migrated the repository, so we now need to adjust the
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+ */
+ initialize_repository_version(hash_algo_by_ptr(repo->hash_algo), format, 1);
+
++ free(new_refs->gitdir);
++ new_refs->gitdir = xstrdup(old_refs->gitdir);
+ repo->refs_private = new_refs;
+ ref_store_release(old_refs);
+
+ ret = 0;
+
+done:
++ if (ret && did_migrate_refs) {
++ strbuf_complete(errbuf, '\n');
++ strbuf_addf(errbuf, _("migrated refs can be found at '%s'"),
++ new_gitdir);
++ }
++
+ if (ret && new_refs)
+ ref_store_release(new_refs);
+ ref_transaction_free(transaction);
+ strbuf_release(&buf);
++ free(new_gitdir);
+ return ret;
+}
9: 0df17a51b4 ! 12: d832414d1f builtin/refs: new command to migrate ref storage formats
@@ Documentation/git-refs.txt (new)
+ Perform the migration, but do not modify the repository. The migrated
+ refs will be written into a separate directory that can be inspected
+ separately. The name of the directory will be reported on stdout. This
-+ can be used to double check that the migration works as expected doing
++ can be used to double check that the migration works as expected before
+ performing the actual migration.
+
+KNOWN LIMITATIONS
@@ builtin/refs.c (new)
+{
+ const char * const migrate_usage[] = {
+ REFS_MIGRATE_USAGE,
-+ NULL
++ NULL,
+ };
+ const char *format_str = NULL;
+ enum ref_storage_format format;
--
2.45.1.246.gb9cfe4845c.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v3 01/12] setup: unset ref storage when reinitializing repository version
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
@ 2024-05-28 6:31 ` Patrick Steinhardt
2024-05-28 6:31 ` [PATCH v3 02/12] refs: convert ref storage format to an enum Patrick Steinhardt
` (11 subsequent siblings)
12 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 6:31 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 1259 bytes --]
When reinitializing a repository's version we may end up unsetting the
hash algorithm when it matches the default hash algorithm. If we didn't
do that then the previously configured value might remain intact.
While the same issue exists for the ref storage extension, we don't do
this here. This has been fine for most of the part because it is not
supported to re-initialize a repository with a different ref storage
format anyway. We're about to introduce a new command to migrate ref
storages though, so this is about to become an issue there.
Prepare for this and unset the ref storage format when reinitializing a
repository with the "files" format.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
setup.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/setup.c b/setup.c
index 7975230ffb..8c84ec9d4b 100644
--- a/setup.c
+++ b/setup.c
@@ -2028,6 +2028,8 @@ void initialize_repository_version(int hash_algo,
if (ref_storage_format != REF_STORAGE_FORMAT_FILES)
git_config_set("extensions.refstorage",
ref_storage_format_to_name(ref_storage_format));
+ else if (reinit)
+ git_config_set_gently("extensions.refstorage", NULL);
}
static int is_reinit(void)
--
2.45.1.246.gb9cfe4845c.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 02/12] refs: convert ref storage format to an enum
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
2024-05-28 6:31 ` [PATCH v3 01/12] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
@ 2024-05-28 6:31 ` Patrick Steinhardt
2024-05-28 6:31 ` [PATCH v3 03/12] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
` (10 subsequent siblings)
12 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 6:31 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 8337 bytes --]
The ref storage format is tracked as a simple unsigned integer, which
makes it harder than necessary to discover what that integer actually is
or where its values are defined.
Convert the ref storage format to instead be an enum.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/clone.c | 2 +-
builtin/init-db.c | 2 +-
refs.c | 7 ++++---
refs.h | 10 ++++++++--
repository.c | 3 ++-
repository.h | 10 ++++------
setup.c | 8 ++++----
setup.h | 9 +++++----
8 files changed, 29 insertions(+), 22 deletions(-)
diff --git a/builtin/clone.c b/builtin/clone.c
index 1e07524c53..e808e02017 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -970,7 +970,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
int submodule_progress;
int filter_submodules = 0;
int hash_algo;
- unsigned int ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
+ enum ref_storage_format ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
const int do_not_override_repo_unix_permissions = -1;
const char *template_dir;
char *template_dir_dup = NULL;
diff --git a/builtin/init-db.c b/builtin/init-db.c
index 0170469b84..582dcf20f8 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -81,7 +81,7 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
const char *ref_format = NULL;
const char *initial_branch = NULL;
int hash_algo = GIT_HASH_UNKNOWN;
- unsigned int ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
+ enum ref_storage_format ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
int init_shared_repository = -1;
const struct option init_db_options[] = {
OPT_STRING(0, "template", &template_dir, N_("template-directory"),
diff --git a/refs.c b/refs.c
index 31032588e0..e6db85a165 100644
--- a/refs.c
+++ b/refs.c
@@ -37,14 +37,15 @@ static const struct ref_storage_be *refs_backends[] = {
[REF_STORAGE_FORMAT_REFTABLE] = &refs_be_reftable,
};
-static const struct ref_storage_be *find_ref_storage_backend(unsigned int ref_storage_format)
+static const struct ref_storage_be *find_ref_storage_backend(
+ enum ref_storage_format ref_storage_format)
{
if (ref_storage_format < ARRAY_SIZE(refs_backends))
return refs_backends[ref_storage_format];
return NULL;
}
-unsigned int ref_storage_format_by_name(const char *name)
+enum ref_storage_format ref_storage_format_by_name(const char *name)
{
for (unsigned int i = 0; i < ARRAY_SIZE(refs_backends); i++)
if (refs_backends[i] && !strcmp(refs_backends[i]->name, name))
@@ -52,7 +53,7 @@ unsigned int ref_storage_format_by_name(const char *name)
return REF_STORAGE_FORMAT_UNKNOWN;
}
-const char *ref_storage_format_to_name(unsigned int ref_storage_format)
+const char *ref_storage_format_to_name(enum ref_storage_format ref_storage_format)
{
const struct ref_storage_be *be = find_ref_storage_backend(ref_storage_format);
if (!be)
diff --git a/refs.h b/refs.h
index fe7f0db35e..a7afa9bede 100644
--- a/refs.h
+++ b/refs.h
@@ -11,8 +11,14 @@ struct string_list;
struct string_list_item;
struct worktree;
-unsigned int ref_storage_format_by_name(const char *name);
-const char *ref_storage_format_to_name(unsigned int ref_storage_format);
+enum ref_storage_format {
+ REF_STORAGE_FORMAT_UNKNOWN,
+ REF_STORAGE_FORMAT_FILES,
+ REF_STORAGE_FORMAT_REFTABLE,
+};
+
+enum ref_storage_format ref_storage_format_by_name(const char *name);
+const char *ref_storage_format_to_name(enum ref_storage_format ref_storage_format);
/*
* Resolve a reference, recursively following symbolic refererences.
diff --git a/repository.c b/repository.c
index d29b0304fb..166863f852 100644
--- a/repository.c
+++ b/repository.c
@@ -124,7 +124,8 @@ void repo_set_compat_hash_algo(struct repository *repo, int algo)
repo_read_loose_object_map(repo);
}
-void repo_set_ref_storage_format(struct repository *repo, unsigned int format)
+void repo_set_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format)
{
repo->ref_storage_format = format;
}
diff --git a/repository.h b/repository.h
index 4bd8969005..a35cd77c35 100644
--- a/repository.h
+++ b/repository.h
@@ -1,6 +1,7 @@
#ifndef REPOSITORY_H
#define REPOSITORY_H
+#include "refs.h"
#include "strmap.h"
struct config_set;
@@ -26,10 +27,6 @@ enum fetch_negotiation_setting {
FETCH_NEGOTIATION_NOOP,
};
-#define REF_STORAGE_FORMAT_UNKNOWN 0
-#define REF_STORAGE_FORMAT_FILES 1
-#define REF_STORAGE_FORMAT_REFTABLE 2
-
struct repo_settings {
int initialized;
@@ -181,7 +178,7 @@ struct repository {
const struct git_hash_algo *compat_hash_algo;
/* Repository's reference storage format, as serialized on disk. */
- unsigned int ref_storage_format;
+ enum ref_storage_format ref_storage_format;
/* A unique-id for tracing purposes. */
int trace2_repo_id;
@@ -220,7 +217,8 @@ void repo_set_gitdir(struct repository *repo, const char *root,
void repo_set_worktree(struct repository *repo, const char *path);
void repo_set_hash_algo(struct repository *repo, int algo);
void repo_set_compat_hash_algo(struct repository *repo, int compat_algo);
-void repo_set_ref_storage_format(struct repository *repo, unsigned int format);
+void repo_set_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format);
void initialize_repository(struct repository *repo);
RESULT_MUST_BE_USED
int repo_init(struct repository *r, const char *gitdir, const char *worktree);
diff --git a/setup.c b/setup.c
index 8c84ec9d4b..b49ee3e95f 100644
--- a/setup.c
+++ b/setup.c
@@ -1997,7 +1997,7 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree)
}
void initialize_repository_version(int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
int reinit)
{
char repo_version_string[10];
@@ -2044,7 +2044,7 @@ static int is_reinit(void)
return ret;
}
-void create_reference_database(unsigned int ref_storage_format,
+void create_reference_database(enum ref_storage_format ref_storage_format,
const char *initial_branch, int quiet)
{
struct strbuf err = STRBUF_INIT;
@@ -2243,7 +2243,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
}
static void validate_ref_storage_format(struct repository_format *repo_fmt,
- unsigned int format)
+ enum ref_storage_format format)
{
const char *name = getenv("GIT_DEFAULT_REF_FORMAT");
@@ -2263,7 +2263,7 @@ static void validate_ref_storage_format(struct repository_format *repo_fmt,
int init_db(const char *git_dir, const char *real_git_dir,
const char *template_dir, int hash,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
const char *initial_branch,
int init_shared_repository, unsigned int flags)
{
diff --git a/setup.h b/setup.h
index b3fd3bf45a..cd8dbc2497 100644
--- a/setup.h
+++ b/setup.h
@@ -1,6 +1,7 @@
#ifndef SETUP_H
#define SETUP_H
+#include "refs.h"
#include "string-list.h"
int is_inside_git_dir(void);
@@ -128,7 +129,7 @@ struct repository_format {
int is_bare;
int hash_algo;
int compat_hash_algo;
- unsigned int ref_storage_format;
+ enum ref_storage_format ref_storage_format;
int sparse_index;
char *work_tree;
struct string_list unknown_extensions;
@@ -192,13 +193,13 @@ const char *get_template_dir(const char *option_template);
int init_db(const char *git_dir, const char *real_git_dir,
const char *template_dir, int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
const char *initial_branch, int init_shared_repository,
unsigned int flags);
void initialize_repository_version(int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
int reinit);
-void create_reference_database(unsigned int ref_storage_format,
+void create_reference_database(enum ref_storage_format ref_storage_format,
const char *initial_branch, int quiet);
/*
--
2.45.1.246.gb9cfe4845c.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 03/12] refs: pass storage format to `ref_store_init()` explicitly
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
2024-05-28 6:31 ` [PATCH v3 01/12] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
2024-05-28 6:31 ` [PATCH v3 02/12] refs: convert ref storage format to an enum Patrick Steinhardt
@ 2024-05-28 6:31 ` Patrick Steinhardt
2024-05-28 6:31 ` [PATCH v3 04/12] refs: allow to skip creation of reflog entries Patrick Steinhardt
` (9 subsequent siblings)
12 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 6:31 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 2646 bytes --]
We're about to introduce logic to migrate refs from one storage format
to another one. This will require us to initialize a ref store with a
different format than the one used by the passed-in repository.
Prepare for this by accepting the desired ref storage format as
parameter.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/refs.c b/refs.c
index e6db85a165..7c3f4df457 100644
--- a/refs.c
+++ b/refs.c
@@ -1894,13 +1894,14 @@ static struct ref_store *lookup_ref_store_map(struct strmap *map,
* gitdir.
*/
static struct ref_store *ref_store_init(struct repository *repo,
+ enum ref_storage_format format,
const char *gitdir,
unsigned int flags)
{
const struct ref_storage_be *be;
struct ref_store *refs;
- be = find_ref_storage_backend(repo->ref_storage_format);
+ be = find_ref_storage_backend(format);
if (!be)
BUG("reference backend is unknown");
@@ -1922,7 +1923,8 @@ struct ref_store *get_main_ref_store(struct repository *r)
if (!r->gitdir)
BUG("attempting to get main_ref_store outside of repository");
- r->refs_private = ref_store_init(r, r->gitdir, REF_STORE_ALL_CAPS);
+ r->refs_private = ref_store_init(r, r->ref_storage_format,
+ r->gitdir, REF_STORE_ALL_CAPS);
r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
return r->refs_private;
}
@@ -1982,7 +1984,8 @@ struct ref_store *repo_get_submodule_ref_store(struct repository *repo,
free(subrepo);
goto done;
}
- refs = ref_store_init(subrepo, submodule_sb.buf,
+ refs = ref_store_init(subrepo, the_repository->ref_storage_format,
+ submodule_sb.buf,
REF_STORE_READ | REF_STORE_ODB);
register_ref_store_map(&repo->submodule_ref_stores, "submodule",
refs, submodule);
@@ -2011,12 +2014,12 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt)
struct strbuf common_path = STRBUF_INIT;
strbuf_git_common_path(&common_path, wt->repo,
"worktrees/%s", wt->id);
- refs = ref_store_init(wt->repo, common_path.buf,
- REF_STORE_ALL_CAPS);
+ refs = ref_store_init(wt->repo, wt->repo->ref_storage_format,
+ common_path.buf, REF_STORE_ALL_CAPS);
strbuf_release(&common_path);
} else {
- refs = ref_store_init(wt->repo, wt->repo->commondir,
- REF_STORE_ALL_CAPS);
+ refs = ref_store_init(wt->repo, the_repository->ref_storage_format,
+ wt->repo->commondir, REF_STORE_ALL_CAPS);
}
if (refs)
--
2.45.1.246.gb9cfe4845c.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 04/12] refs: allow to skip creation of reflog entries
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
` (2 preceding siblings ...)
2024-05-28 6:31 ` [PATCH v3 03/12] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
@ 2024-05-28 6:31 ` Patrick Steinhardt
2024-05-28 6:31 ` [PATCH v3 05/12] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
` (8 subsequent siblings)
12 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 6:31 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 3845 bytes --]
The ref backends do not have any way to disable the creation of reflog
entries. This will be required for upcoming ref format migration logic
so that we do not create any entries that didn't exist in the original
ref database.
Provide a new `REF_SKIP_CREATE_REFLOG` flag that allows the caller to
disable reflog entry creation.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 6 ++++++
refs.h | 8 +++++++-
refs/files-backend.c | 4 ++++
refs/reftable-backend.c | 3 ++-
t/helper/test-ref-store.c | 1 +
5 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/refs.c b/refs.c
index 7c3f4df457..66e9585767 100644
--- a/refs.c
+++ b/refs.c
@@ -1194,6 +1194,12 @@ int ref_transaction_update(struct ref_transaction *transaction,
{
assert(err);
+ if ((flags & REF_FORCE_CREATE_REFLOG) &&
+ (flags & REF_SKIP_CREATE_REFLOG)) {
+ strbuf_addstr(err, _("refusing to force and skip creation of reflog"));
+ return -1;
+ }
+
if (!(flags & REF_SKIP_REFNAME_VERIFICATION) &&
((new_oid && !is_null_oid(new_oid)) ?
check_refname_format(refname, REFNAME_ALLOW_ONELEVEL) :
diff --git a/refs.h b/refs.h
index a7afa9bede..50a2b3ab09 100644
--- a/refs.h
+++ b/refs.h
@@ -659,13 +659,19 @@ struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
*/
#define REF_SKIP_REFNAME_VERIFICATION (1 << 11)
+/*
+ * Skip creation of a reflog entry, even if it would have otherwise been
+ * created.
+ */
+#define REF_SKIP_CREATE_REFLOG (1 << 12)
+
/*
* Bitmask of all of the flags that are allowed to be passed in to
* ref_transaction_update() and friends:
*/
#define REF_TRANSACTION_UPDATE_ALLOWED_FLAGS \
(REF_NO_DEREF | REF_FORCE_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION | \
- REF_SKIP_REFNAME_VERIFICATION)
+ REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG)
/*
* Add a reference update to transaction. `new_oid` is the value that
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 73380d7e99..bd0d63bcba 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1750,6 +1750,9 @@ static int files_log_ref_write(struct files_ref_store *refs,
{
int logfd, result;
+ if (flags & REF_SKIP_CREATE_REFLOG)
+ return 0;
+
if (log_all_ref_updates == LOG_REFS_UNSET)
log_all_ref_updates = is_bare_repository() ? LOG_REFS_NONE : LOG_REFS_NORMAL;
@@ -2251,6 +2254,7 @@ static int split_head_update(struct ref_update *update,
struct ref_update *new_update;
if ((update->flags & REF_LOG_ONLY) ||
+ (update->flags & REF_SKIP_CREATE_REFLOG) ||
(update->flags & REF_IS_PRUNING) ||
(update->flags & REF_UPDATE_VIA_HEAD))
return 0;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index f6edfdf5b3..bffed9257f 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1103,7 +1103,8 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data
if (ret)
goto done;
- } else if (u->flags & REF_HAVE_NEW &&
+ } else if (!(u->flags & REF_SKIP_CREATE_REFLOG) &&
+ (u->flags & REF_HAVE_NEW) &&
(u->flags & REF_FORCE_CREATE_REFLOG ||
should_write_log(&arg->refs->base, u->refname))) {
struct reftable_log_record *log;
diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index c9efd74c2b..ad24300170 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -126,6 +126,7 @@ static struct flag_definition transaction_flags[] = {
FLAG_DEF(REF_FORCE_CREATE_REFLOG),
FLAG_DEF(REF_SKIP_OID_VERIFICATION),
FLAG_DEF(REF_SKIP_REFNAME_VERIFICATION),
+ FLAG_DEF(REF_SKIP_CREATE_REFLOG),
{ NULL, 0 }
};
--
2.45.1.246.gb9cfe4845c.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 05/12] refs/files: refactor `add_pseudoref_and_head_entries()`
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
` (3 preceding siblings ...)
2024-05-28 6:31 ` [PATCH v3 04/12] refs: allow to skip creation of reflog entries Patrick Steinhardt
@ 2024-05-28 6:31 ` Patrick Steinhardt
2024-05-28 6:31 ` [PATCH v3 06/12] refs/files: extract function to iterate through root refs Patrick Steinhardt
` (7 subsequent siblings)
12 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 6:31 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 1937 bytes --]
The `add_pseudoref_and_head_entries()` function accepts both the ref
store as well as a directory name as input. This is unnecessary though
as the ref store already uniquely identifies the root directory of the
ref store anyway.
Furthermore, the function is misnamed now that we have clarified the
meaning of pseudorefs as it doesn't add pseudorefs, but root refs.
Rename it accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs/files-backend.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index bd0d63bcba..b4e5437ffe 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -324,16 +324,14 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
}
/*
- * Add pseudorefs to the ref dir by parsing the directory for any files
- * which follow the pseudoref syntax.
+ * Add root refs to the ref dir by parsing the directory for any files which
+ * follow the root ref syntax.
*/
-static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
- struct ref_dir *dir,
- const char *dirname)
+static void add_root_refs(struct files_ref_store *refs,
+ struct ref_dir *dir)
{
- struct files_ref_store *refs =
- files_downcast(ref_store, REF_STORE_READ, "fill_ref_dir");
struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
+ const char *dirname = refs->loose->root->name;
struct dirent *de;
size_t dirnamelen;
DIR *d;
@@ -388,8 +386,7 @@ static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs,
dir = get_ref_dir(refs->loose->root);
if (flags & DO_FOR_EACH_INCLUDE_ROOT_REFS)
- add_pseudoref_and_head_entries(dir->cache->ref_store, dir,
- refs->loose->root->name);
+ add_root_refs(refs, dir);
/*
* Add an incomplete entry for "refs/" (to be filled
--
2.45.1.246.gb9cfe4845c.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 06/12] refs/files: extract function to iterate through root refs
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
` (4 preceding siblings ...)
2024-05-28 6:31 ` [PATCH v3 05/12] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
@ 2024-05-28 6:31 ` Patrick Steinhardt
2024-05-28 6:31 ` [PATCH v3 07/12] refs/files: fix NULL pointer deref when releasing ref store Patrick Steinhardt
` (6 subsequent siblings)
12 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 6:31 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 2782 bytes --]
Extract a new function that can be used to iterate through all root refs
known to the "files" backend. This will be used in the next commit,
where we start to teach ref backends to remove themselves.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs/files-backend.c | 49 ++++++++++++++++++++++++++++++++++++--------
1 file changed, 40 insertions(+), 9 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b4e5437ffe..b7268b26c8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -323,17 +323,15 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
add_per_worktree_entries_to_dir(dir, dirname);
}
-/*
- * Add root refs to the ref dir by parsing the directory for any files which
- * follow the root ref syntax.
- */
-static void add_root_refs(struct files_ref_store *refs,
- struct ref_dir *dir)
+static int for_each_root_ref(struct files_ref_store *refs,
+ int (*cb)(const char *refname, void *cb_data),
+ void *cb_data)
{
struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
const char *dirname = refs->loose->root->name;
struct dirent *de;
size_t dirnamelen;
+ int ret;
DIR *d;
files_ref_path(refs, &path, dirname);
@@ -341,7 +339,7 @@ static void add_root_refs(struct files_ref_store *refs,
d = opendir(path.buf);
if (!d) {
strbuf_release(&path);
- return;
+ return -1;
}
strbuf_addstr(&refname, dirname);
@@ -357,14 +355,47 @@ static void add_root_refs(struct files_ref_store *refs,
strbuf_addstr(&refname, de->d_name);
dtype = get_dtype(de, &path, 1);
- if (dtype == DT_REG && is_root_ref(de->d_name))
- loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
+ if (dtype == DT_REG && is_root_ref(de->d_name)) {
+ ret = cb(refname.buf, cb_data);
+ if (ret)
+ goto done;
+ }
strbuf_setlen(&refname, dirnamelen);
}
+
+done:
strbuf_release(&refname);
strbuf_release(&path);
closedir(d);
+ return ret;
+}
+
+struct fill_root_ref_data {
+ struct files_ref_store *refs;
+ struct ref_dir *dir;
+};
+
+static int fill_root_ref(const char *refname, void *cb_data)
+{
+ struct fill_root_ref_data *data = cb_data;
+ loose_fill_ref_dir_regular_file(data->refs, refname, data->dir);
+ return 0;
+}
+
+/*
+ * Add root refs to the ref dir by parsing the directory for any files which
+ * follow the root ref syntax.
+ */
+static void add_root_refs(struct files_ref_store *refs,
+ struct ref_dir *dir)
+{
+ struct fill_root_ref_data data = {
+ .refs = refs,
+ .dir = dir,
+ };
+
+ for_each_root_ref(refs, fill_root_ref, &data);
}
static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs,
--
2.45.1.246.gb9cfe4845c.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 07/12] refs/files: fix NULL pointer deref when releasing ref store
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
` (5 preceding siblings ...)
2024-05-28 6:31 ` [PATCH v3 06/12] refs/files: extract function to iterate through root refs Patrick Steinhardt
@ 2024-05-28 6:31 ` Patrick Steinhardt
2024-05-28 6:31 ` [PATCH v3 08/12] reftable: inline `merged_table_release()` Patrick Steinhardt
` (5 subsequent siblings)
12 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 6:31 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 702 bytes --]
The `free_ref_cache()` function is not `NULL` safe and will thus
segfault when being passed such a pointer. This can easily happen when
trying to release a partially initialized "files" ref store. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs/ref-cache.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index b6c53fc8ed..4ce519bbc8 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -71,6 +71,8 @@ static void free_ref_entry(struct ref_entry *entry)
void free_ref_cache(struct ref_cache *cache)
{
+ if (!cache)
+ return;
free_ref_entry(cache->root);
free(cache);
}
--
2.45.1.246.gb9cfe4845c.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 08/12] reftable: inline `merged_table_release()`
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
` (6 preceding siblings ...)
2024-05-28 6:31 ` [PATCH v3 07/12] refs/files: fix NULL pointer deref when releasing ref store Patrick Steinhardt
@ 2024-05-28 6:31 ` Patrick Steinhardt
2024-05-28 6:31 ` [PATCH v3 09/12] worktree: don't store main worktree twice Patrick Steinhardt
` (4 subsequent siblings)
12 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 6:31 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 2400 bytes --]
The function `merged_table_release()` releases a merged table, whereas
`reftable_merged_table_free()` releases a merged table and then also
free's its pointer. But all callsites of `merged_table_release()` are in
fact followed by `reftable_merged_table_free()`, which is redundant.
Inline `merged_table_release()` into `reftable_merged_table_free()` to
get rid of this redundance.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
reftable/merged.c | 12 ++----------
reftable/merged.h | 2 --
reftable/stack.c | 8 ++------
3 files changed, 4 insertions(+), 18 deletions(-)
diff --git a/reftable/merged.c b/reftable/merged.c
index f85a24c678..804fdc0de0 100644
--- a/reftable/merged.c
+++ b/reftable/merged.c
@@ -207,19 +207,11 @@ int reftable_new_merged_table(struct reftable_merged_table **dest,
return 0;
}
-/* clears the list of subtable, without affecting the readers themselves. */
-void merged_table_release(struct reftable_merged_table *mt)
-{
- FREE_AND_NULL(mt->stack);
- mt->stack_len = 0;
-}
-
void reftable_merged_table_free(struct reftable_merged_table *mt)
{
- if (!mt) {
+ if (!mt)
return;
- }
- merged_table_release(mt);
+ FREE_AND_NULL(mt->stack);
reftable_free(mt);
}
diff --git a/reftable/merged.h b/reftable/merged.h
index a2571dbc99..9db45c3196 100644
--- a/reftable/merged.h
+++ b/reftable/merged.h
@@ -24,6 +24,4 @@ struct reftable_merged_table {
uint64_t max;
};
-void merged_table_release(struct reftable_merged_table *mt);
-
#endif
diff --git a/reftable/stack.c b/reftable/stack.c
index a59ebe038d..984fd866d0 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -261,10 +261,8 @@ static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
new_tables = NULL;
st->readers_len = new_readers_len;
- if (st->merged) {
- merged_table_release(st->merged);
+ if (st->merged)
reftable_merged_table_free(st->merged);
- }
if (st->readers) {
reftable_free(st->readers);
}
@@ -968,10 +966,8 @@ static int stack_write_compact(struct reftable_stack *st,
done:
reftable_iterator_destroy(&it);
- if (mt) {
- merged_table_release(mt);
+ if (mt)
reftable_merged_table_free(mt);
- }
reftable_ref_record_release(&ref);
reftable_log_record_release(&log);
st->stats.entries_written += entries;
--
2.45.1.246.gb9cfe4845c.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 09/12] worktree: don't store main worktree twice
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
` (7 preceding siblings ...)
2024-05-28 6:31 ` [PATCH v3 08/12] reftable: inline `merged_table_release()` Patrick Steinhardt
@ 2024-05-28 6:31 ` Patrick Steinhardt
2024-05-28 6:31 ` [PATCH v3 10/12] refs: implement removal of ref storages Patrick Steinhardt
` (3 subsequent siblings)
12 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 6:31 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 3113 bytes --]
In `get_worktree_ref_store()` we either return the repository's main ref
store, or we look up the ref store via the map of worktree ref stores.
Which of these worktrees gets picked depends on the `is_current` bit of
the worktree, which indicates whether the worktree is the one that
corresponds to `the_repository`.
The bit is getting set in `get_worktrees()`, but only after we have
computed the list of all worktrees. This is too late though, because at
that time we have already called `get_worktree_ref_store()` on each of
the worktrees via `add_head_info()`. The consequence is that the current
worktree will not have been marked accordingly, which means that we did
not use the main ref store, but instead created a new ref store. We thus
have two separate ref stores now that map to the same ref database.
Fix this by setting `is_current` before we call `add_head_info()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
worktree.c | 29 +++++++++++------------------
1 file changed, 11 insertions(+), 18 deletions(-)
diff --git a/worktree.c b/worktree.c
index 12eadacc61..70844d023a 100644
--- a/worktree.c
+++ b/worktree.c
@@ -53,6 +53,15 @@ static void add_head_info(struct worktree *wt)
wt->is_detached = 1;
}
+static int is_current_worktree(struct worktree *wt)
+{
+ char *git_dir = absolute_pathdup(get_git_dir());
+ const char *wt_git_dir = get_worktree_git_dir(wt);
+ int is_current = !fspathcmp(git_dir, absolute_path(wt_git_dir));
+ free(git_dir);
+ return is_current;
+}
+
/**
* get the main worktree
*/
@@ -76,6 +85,7 @@ static struct worktree *get_main_worktree(int skip_reading_head)
*/
worktree->is_bare = (is_bare_repository_cfg == 1) ||
is_bare_repository();
+ worktree->is_current = is_current_worktree(worktree);
if (!skip_reading_head)
add_head_info(worktree);
return worktree;
@@ -102,6 +112,7 @@ struct worktree *get_linked_worktree(const char *id,
worktree->repo = the_repository;
worktree->path = strbuf_detach(&worktree_path, NULL);
worktree->id = xstrdup(id);
+ worktree->is_current = is_current_worktree(worktree);
if (!skip_reading_head)
add_head_info(worktree);
@@ -111,23 +122,6 @@ struct worktree *get_linked_worktree(const char *id,
return worktree;
}
-static void mark_current_worktree(struct worktree **worktrees)
-{
- char *git_dir = absolute_pathdup(get_git_dir());
- int i;
-
- for (i = 0; worktrees[i]; i++) {
- struct worktree *wt = worktrees[i];
- const char *wt_git_dir = get_worktree_git_dir(wt);
-
- if (!fspathcmp(git_dir, absolute_path(wt_git_dir))) {
- wt->is_current = 1;
- break;
- }
- }
- free(git_dir);
-}
-
/*
* NEEDSWORK: This function exists so that we can look up metadata of a
* worktree without trying to access any of its internals like the refdb. It
@@ -164,7 +158,6 @@ static struct worktree **get_worktrees_internal(int skip_reading_head)
ALLOC_GROW(list, counter + 1, alloc);
list[counter] = NULL;
- mark_current_worktree(list);
return list;
}
--
2.45.1.246.gb9cfe4845c.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 10/12] refs: implement removal of ref storages
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
` (8 preceding siblings ...)
2024-05-28 6:31 ` [PATCH v3 09/12] worktree: don't store main worktree twice Patrick Steinhardt
@ 2024-05-28 6:31 ` Patrick Steinhardt
2024-05-28 6:32 ` [PATCH v3 11/12] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
` (2 subsequent siblings)
12 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 6:31 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 8487 bytes --]
We're about to introduce logic to migrate ref storages. One part of the
migration will be to delete the files that are part of the old ref
storage format. We don't yet have a way to delete such data generically
across ref backends though.
Implement a new `delete` callback and expose it via a new
`ref_storage_delete()` function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 5 ++++
refs.h | 5 ++++
refs/files-backend.c | 63 +++++++++++++++++++++++++++++++++++++++++
refs/packed-backend.c | 15 ++++++++++
refs/refs-internal.h | 7 +++++
refs/reftable-backend.c | 52 ++++++++++++++++++++++++++++++++++
6 files changed, 147 insertions(+)
diff --git a/refs.c b/refs.c
index 66e9585767..9b112b0527 100644
--- a/refs.c
+++ b/refs.c
@@ -1861,6 +1861,11 @@ int ref_store_create_on_disk(struct ref_store *refs, int flags, struct strbuf *e
return refs->be->create_on_disk(refs, flags, err);
}
+int ref_store_remove_on_disk(struct ref_store *refs, struct strbuf *err)
+{
+ return refs->be->remove_on_disk(refs, err);
+}
+
int repo_resolve_gitlink_ref(struct repository *r,
const char *submodule, const char *refname,
struct object_id *oid)
diff --git a/refs.h b/refs.h
index 50a2b3ab09..61ee7b7a15 100644
--- a/refs.h
+++ b/refs.h
@@ -129,6 +129,11 @@ int ref_store_create_on_disk(struct ref_store *refs, int flags, struct strbuf *e
*/
void ref_store_release(struct ref_store *ref_store);
+/*
+ * Remove the ref store from disk. This deletes all associated data.
+ */
+int ref_store_remove_on_disk(struct ref_store *refs, struct strbuf *err);
+
/*
* Return the peeled value of the oid currently being iterated via
* for_each_ref(), etc. This is equivalent to calling:
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b7268b26c8..cb752d32b6 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3340,11 +3340,74 @@ static int files_ref_store_create_on_disk(struct ref_store *ref_store,
return 0;
}
+struct remove_one_root_ref_data {
+ const char *gitdir;
+ struct strbuf *err;
+};
+
+static int remove_one_root_ref(const char *refname,
+ void *cb_data)
+{
+ struct remove_one_root_ref_data *data = cb_data;
+ struct strbuf buf = STRBUF_INIT;
+ int ret = 0;
+
+ strbuf_addf(&buf, "%s/%s", data->gitdir, refname);
+
+ ret = unlink(buf.buf);
+ if (ret < 0)
+ strbuf_addf(data->err, "could not delete %s: %s\n",
+ refname, strerror(errno));
+
+ strbuf_release(&buf);
+ return ret;
+}
+
+static int files_ref_store_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct files_ref_store *refs =
+ files_downcast(ref_store, REF_STORE_WRITE, "remove");
+ struct remove_one_root_ref_data data = {
+ .gitdir = refs->base.gitdir,
+ .err = err,
+ };
+ struct strbuf sb = STRBUF_INIT;
+ int ret = 0;
+
+ strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addf(err, "could not delete refs: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/logs", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addf(err, "could not delete logs: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ ret = for_each_root_ref(refs, remove_one_root_ref, &data);
+ if (ret < 0)
+ ret = -1;
+
+ if (ref_store_remove_on_disk(refs->packed_ref_store, err) < 0)
+ ret = -1;
+
+ strbuf_release(&sb);
+ return ret;
+}
+
struct ref_storage_be refs_be_files = {
.name = "files",
.init = files_ref_store_init,
.release = files_ref_store_release,
.create_on_disk = files_ref_store_create_on_disk,
+ .remove_on_disk = files_ref_store_remove_on_disk,
.transaction_prepare = files_transaction_prepare,
.transaction_finish = files_transaction_finish,
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 2789fd92f5..c4c1e36aa2 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1,5 +1,6 @@
#include "../git-compat-util.h"
#include "../config.h"
+#include "../dir.h"
#include "../gettext.h"
#include "../hash.h"
#include "../hex.h"
@@ -1266,6 +1267,19 @@ static int packed_ref_store_create_on_disk(struct ref_store *ref_store UNUSED,
return 0;
}
+static int packed_ref_store_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct packed_ref_store *refs = packed_downcast(ref_store, 0, "remove");
+
+ if (remove_path(refs->path) < 0) {
+ strbuf_addstr(err, "could not delete packed-refs");
+ return -1;
+ }
+
+ return 0;
+}
+
/*
* Write the packed refs from the current snapshot to the packed-refs
* tempfile, incorporating any changes from `updates`. `updates` must
@@ -1724,6 +1738,7 @@ struct ref_storage_be refs_be_packed = {
.init = packed_ref_store_init,
.release = packed_ref_store_release,
.create_on_disk = packed_ref_store_create_on_disk,
+ .remove_on_disk = packed_ref_store_remove_on_disk,
.transaction_prepare = packed_transaction_prepare,
.transaction_finish = packed_transaction_finish,
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 33749fbd83..cbcb6f9c36 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -517,6 +517,12 @@ typedef int ref_store_create_on_disk_fn(struct ref_store *refs,
int flags,
struct strbuf *err);
+/*
+ * Remove the reference store from disk.
+ */
+typedef int ref_store_remove_on_disk_fn(struct ref_store *refs,
+ struct strbuf *err);
+
typedef int ref_transaction_prepare_fn(struct ref_store *refs,
struct ref_transaction *transaction,
struct strbuf *err);
@@ -649,6 +655,7 @@ struct ref_storage_be {
ref_store_init_fn *init;
ref_store_release_fn *release;
ref_store_create_on_disk_fn *create_on_disk;
+ ref_store_remove_on_disk_fn *remove_on_disk;
ref_transaction_prepare_fn *transaction_prepare;
ref_transaction_finish_fn *transaction_finish;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index bffed9257f..e555be4671 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1,6 +1,7 @@
#include "../git-compat-util.h"
#include "../abspath.h"
#include "../chdir-notify.h"
+#include "../dir.h"
#include "../environment.h"
#include "../gettext.h"
#include "../hash.h"
@@ -343,6 +344,56 @@ static int reftable_be_create_on_disk(struct ref_store *ref_store,
return 0;
}
+static int reftable_be_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct reftable_ref_store *refs =
+ reftable_be_downcast(ref_store, REF_STORE_WRITE, "remove");
+ struct strbuf sb = STRBUF_INIT;
+ int ret = 0;
+
+ /*
+ * Release the ref store such that all stacks are closed. This is
+ * required so that the "tables.list" file is not open anymore, which
+ * would otherwise make it impossible to remove the file on Windows.
+ */
+ reftable_be_release(ref_store);
+
+ strbuf_addf(&sb, "%s/reftable", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addf(err, "could not delete reftables: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/HEAD", refs->base.gitdir);
+ if (unlink(sb.buf) < 0) {
+ strbuf_addf(err, "could not delete stub HEAD: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/refs/heads", refs->base.gitdir);
+ if (unlink(sb.buf) < 0) {
+ strbuf_addf(err, "could not delete stub heads: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+ if (rmdir(sb.buf) < 0) {
+ strbuf_addf(err, "could not delete stub heads: %s",
+ strerror(errno));
+ ret = -1;
+ }
+
+ strbuf_release(&sb);
+ return ret;
+}
+
struct reftable_ref_iterator {
struct ref_iterator base;
struct reftable_ref_store *refs;
@@ -2196,6 +2247,7 @@ struct ref_storage_be refs_be_reftable = {
.init = reftable_be_init,
.release = reftable_be_release,
.create_on_disk = reftable_be_create_on_disk,
+ .remove_on_disk = reftable_be_remove_on_disk,
.transaction_prepare = reftable_be_transaction_prepare,
.transaction_finish = reftable_be_transaction_finish,
--
2.45.1.246.gb9cfe4845c.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 11/12] refs: implement logic to migrate between ref storage formats
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
` (9 preceding siblings ...)
2024-05-28 6:31 ` [PATCH v3 10/12] refs: implement removal of ref storages Patrick Steinhardt
@ 2024-05-28 6:32 ` Patrick Steinhardt
2024-05-28 6:32 ` [PATCH v3 12/12] builtin/refs: new command to migrate " Patrick Steinhardt
2024-05-28 18:16 ` [PATCH v3 00/12] refs: ref storage format migrations Junio C Hamano
12 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 6:32 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 11939 bytes --]
With the introduction of the new "reftable" backend, users may want to
migrate repositories between the backends without having to recreate the
whole repository. Add the logic to do so.
The implementation is generic and works with arbitrary ref storage
formats so that a backend does not need to implement any migration
logic. It does have a few limitations though:
- We do not migrate repositories with worktrees, because worktrees
have separate ref storages. It makes the overall affair more complex
if we have to migrate multiple storages at once.
- We do not migrate reflogs, because we have no interfaces to write
many reflog entries.
- We do not lock the repository for concurrent access, and thus
concurrent writes may end up with weird in-between states. There is
no way to fully lock the "files" backend for writes due to its
format, and thus we punt on this topic altogether and defer to the
user to avoid those from happening.
In other words, this version is a minimum viable product for migrating a
repository's ref storage format. It works alright for bare repos, which
often have neither worktrees nor reflogs. But it will not work for many
other repositories without some preparations. These limitations are not
set into stone though, and ideally we will eventually address them over
time.
The logic is not yet used by anything, and thus there are no tests for
it. Those will be added in the next commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 305 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
refs.h | 18 ++++
2 files changed, 323 insertions(+)
diff --git a/refs.c b/refs.c
index 9b112b0527..f7c7765d23 100644
--- a/refs.c
+++ b/refs.c
@@ -2570,3 +2570,308 @@ int ref_update_check_old_target(const char *referent, struct ref_update *update,
referent, update->old_target);
return -1;
}
+
+struct migration_data {
+ struct ref_store *old_refs;
+ struct ref_transaction *transaction;
+ struct strbuf *errbuf;
+};
+
+static int migrate_one_ref(const char *refname, const struct object_id *oid,
+ int flags, void *cb_data)
+{
+ struct migration_data *data = cb_data;
+ struct strbuf symref_target = STRBUF_INIT;
+ int ret;
+
+ if (flags & REF_ISSYMREF) {
+ ret = refs_read_symbolic_ref(data->old_refs, refname, &symref_target);
+ if (ret < 0)
+ goto done;
+
+ ret = ref_transaction_update(data->transaction, refname, NULL, null_oid(),
+ symref_target.buf, NULL,
+ REF_SKIP_CREATE_REFLOG | REF_NO_DEREF, NULL, data->errbuf);
+ if (ret < 0)
+ goto done;
+ } else {
+ ret = ref_transaction_create(data->transaction, refname, oid,
+ REF_SKIP_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION,
+ NULL, data->errbuf);
+ if (ret < 0)
+ goto done;
+ }
+
+done:
+ strbuf_release(&symref_target);
+ return ret;
+}
+
+static int move_files(const char *from_path, const char *to_path, struct strbuf *errbuf)
+{
+ struct strbuf from_buf = STRBUF_INIT, to_buf = STRBUF_INIT;
+ size_t from_len, to_len;
+ DIR *from_dir;
+ int ret;
+
+ from_dir = opendir(from_path);
+ if (!from_dir) {
+ strbuf_addf(errbuf, "could not open source directory '%s': %s",
+ from_path, strerror(errno));
+ ret = -1;
+ goto done;
+ }
+
+ strbuf_addstr(&from_buf, from_path);
+ strbuf_complete(&from_buf, '/');
+ from_len = from_buf.len;
+
+ strbuf_addstr(&to_buf, to_path);
+ strbuf_complete(&to_buf, '/');
+ to_len = to_buf.len;
+
+ while (1) {
+ struct dirent *ent;
+
+ errno = 0;
+ ent = readdir(from_dir);
+ if (!ent)
+ break;
+
+ if (!strcmp(ent->d_name, ".") ||
+ !strcmp(ent->d_name, ".."))
+ continue;
+
+ strbuf_setlen(&from_buf, from_len);
+ strbuf_addstr(&from_buf, ent->d_name);
+
+ strbuf_setlen(&to_buf, to_len);
+ strbuf_addstr(&to_buf, ent->d_name);
+
+ ret = rename(from_buf.buf, to_buf.buf);
+ if (ret < 0) {
+ strbuf_addf(errbuf, "could not link file '%s' to '%s': %s",
+ from_buf.buf, to_buf.buf, strerror(errno));
+ goto done;
+ }
+ }
+
+ if (errno) {
+ strbuf_addf(errbuf, "could not read entry from directory '%s': %s",
+ from_path, strerror(errno));
+ ret = -1;
+ goto done;
+ }
+
+ ret = 0;
+
+done:
+ strbuf_release(&from_buf);
+ strbuf_release(&to_buf);
+ if (from_dir)
+ closedir(from_dir);
+ return ret;
+}
+
+static int count_reflogs(const char *reflog UNUSED, void *payload)
+{
+ size_t *reflog_count = payload;
+ (*reflog_count)++;
+ return 0;
+}
+
+static int has_worktrees(void)
+{
+ struct worktree **worktrees = get_worktrees();
+ int ret = 0;
+ size_t i;
+
+ for (i = 0; worktrees[i]; i++) {
+ if (is_main_worktree(worktrees[i]))
+ continue;
+ ret = 1;
+ }
+
+ free_worktrees(worktrees);
+ return ret;
+}
+
+int repo_migrate_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format,
+ unsigned int flags,
+ struct strbuf *errbuf)
+{
+ struct ref_store *old_refs = NULL, *new_refs = NULL;
+ struct ref_transaction *transaction = NULL;
+ struct strbuf buf = STRBUF_INIT;
+ struct migration_data data;
+ size_t reflog_count = 0;
+ char *new_gitdir = NULL;
+ int did_migrate_refs = 0;
+ int ret;
+
+ old_refs = get_main_ref_store(repo);
+
+ /*
+ * We do not have any interfaces that would allow us to write many
+ * reflog entries. Once we have them we can remove this restriction.
+ */
+ if (refs_for_each_reflog(old_refs, count_reflogs, &reflog_count) < 0) {
+ strbuf_addstr(errbuf, "cannot count reflogs");
+ ret = -1;
+ goto done;
+ }
+ if (reflog_count) {
+ strbuf_addstr(errbuf, "migrating reflogs is not supported yet");
+ ret = -1;
+ goto done;
+ }
+
+ /*
+ * Worktrees complicate the migration because every worktree has a
+ * separate ref storage. While it should be feasible to implement, this
+ * is pushed out to a future iteration.
+ *
+ * TODO: we should really be passing the caller-provided repository to
+ * `has_worktrees()`, but our worktree subsystem doesn't yet support
+ * that.
+ */
+ if (has_worktrees()) {
+ strbuf_addstr(errbuf, "migrating repositories with worktrees is not supported yet");
+ ret = -1;
+ goto done;
+ }
+
+ /*
+ * The overall logic looks like this:
+ *
+ * 1. Set up a new temporary directory and initialize it with the new
+ * format. This is where all refs will be migrated into.
+ *
+ * 2. Enumerate all refs and write them into the new ref storage.
+ * This operation is safe as we do not yet modify the main
+ * repository.
+ *
+ * 3. If we're in dry-run mode then we are done and can hand over the
+ * directory to the caller for inspection. If not, we now start
+ * with the destructive part.
+ *
+ * 4. Delete the old ref storage from disk. As we have a copy of refs
+ * in the new ref storage it's okay(ish) if we now get interrupted
+ * as there is an equivalent copy of all refs available.
+ *
+ * 5. Move the new ref storage files into place.
+ *
+ * 6. Change the repository format to the new ref format.
+ */
+ strbuf_addf(&buf, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
+ new_gitdir = mkdtemp(xstrdup(buf.buf));
+ if (!new_gitdir) {
+ strbuf_addf(errbuf, "cannot create migration directory: %s",
+ strerror(errno));
+ ret = -1;
+ goto done;
+ }
+
+ new_refs = ref_store_init(repo, format, new_gitdir,
+ REF_STORE_ALL_CAPS);
+ ret = ref_store_create_on_disk(new_refs, 0, errbuf);
+ if (ret < 0)
+ goto done;
+
+ transaction = ref_store_transaction_begin(new_refs, errbuf);
+ if (!transaction)
+ goto done;
+
+ data.old_refs = old_refs;
+ data.transaction = transaction;
+ data.errbuf = errbuf;
+
+ /*
+ * We need to use the internal `do_for_each_ref()` here so that we can
+ * also include broken refs and symrefs. These would otherwise be
+ * skipped silently.
+ *
+ * Ideally, we would do this call while locking the old ref storage
+ * such that there cannot be any concurrent modifications. We do not
+ * have the infra for that though, and the "files" backend does not
+ * allow for a central lock due to its design. It's thus on the user to
+ * ensure that there are no concurrent writes.
+ */
+ ret = do_for_each_ref(old_refs, "", NULL, migrate_one_ref, 0,
+ DO_FOR_EACH_INCLUDE_ROOT_REFS | DO_FOR_EACH_INCLUDE_BROKEN,
+ &data);
+ if (ret < 0)
+ goto done;
+
+ /*
+ * TODO: we might want to migrate to `initial_ref_transaction_commit()`
+ * here, which is more efficient for the files backend because it would
+ * write new refs into the packed-refs file directly. At this point,
+ * the files backend doesn't handle pseudo-refs and symrefs correctly
+ * though, so this requires some more work.
+ */
+ ret = ref_transaction_commit(transaction, errbuf);
+ if (ret < 0)
+ goto done;
+ did_migrate_refs = 1;
+
+ if (flags & REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN) {
+ printf(_("Finished dry-run migration of refs, "
+ "the result can be found at '%s'\n"), new_gitdir);
+ ret = 0;
+ goto done;
+ }
+
+ /*
+ * Until now we were in the non-destructive phase, where we only
+ * populated the new ref store. From hereon though we are about
+ * to get hands by deleting the old ref store and then moving
+ * the new one into place.
+ *
+ * Assuming that there were no concurrent writes, the new ref
+ * store should have all information. So if we fail from hereon
+ * we may be in an in-between state, but it would still be able
+ * to recover by manually moving remaining files from the
+ * temporary migration directory into place.
+ */
+ ret = ref_store_remove_on_disk(old_refs, errbuf);
+ if (ret < 0)
+ goto done;
+
+ ret = move_files(new_gitdir, old_refs->gitdir, errbuf);
+ if (ret < 0)
+ goto done;
+
+ if (rmdir(new_gitdir) < 0)
+ warning_errno(_("could not remove temporary migration directory '%s'"),
+ new_gitdir);
+
+ /*
+ * We have migrated the repository, so we now need to adjust the
+ * repository format so that clients will use the new ref store.
+ * We also need to swap out the repository's main ref store.
+ */
+ initialize_repository_version(hash_algo_by_ptr(repo->hash_algo), format, 1);
+
+ free(new_refs->gitdir);
+ new_refs->gitdir = xstrdup(old_refs->gitdir);
+ repo->refs_private = new_refs;
+ ref_store_release(old_refs);
+
+ ret = 0;
+
+done:
+ if (ret && did_migrate_refs) {
+ strbuf_complete(errbuf, '\n');
+ strbuf_addf(errbuf, _("migrated refs can be found at '%s'"),
+ new_gitdir);
+ }
+
+ if (ret && new_refs)
+ ref_store_release(new_refs);
+ ref_transaction_free(transaction);
+ strbuf_release(&buf);
+ free(new_gitdir);
+ return ret;
+}
diff --git a/refs.h b/refs.h
index 61ee7b7a15..76d25df4de 100644
--- a/refs.h
+++ b/refs.h
@@ -1070,6 +1070,24 @@ int is_root_ref(const char *refname);
*/
int is_pseudo_ref(const char *refname);
+/*
+ * The following flags can be passed to `repo_migrate_ref_storage_format()`:
+ *
+ * - REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN: perform a dry-run migration
+ * without touching the main repository. The result will be written into a
+ * temporary ref storage directory.
+ */
+#define REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN (1 << 0)
+
+/*
+ * Migrate the ref storage format used by the repository to the
+ * specified one.
+ */
+int repo_migrate_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format,
+ unsigned int flags,
+ struct strbuf *err);
+
/*
* The following functions have been removed in Git v2.45 in favor of functions
* that receive a `ref_store` as parameter. The intent of this section is
--
2.45.1.246.gb9cfe4845c.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 12/12] builtin/refs: new command to migrate ref storage formats
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
` (10 preceding siblings ...)
2024-05-28 6:32 ` [PATCH v3 11/12] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
@ 2024-05-28 6:32 ` Patrick Steinhardt
2024-05-31 23:46 ` Junio C Hamano
2024-05-28 18:16 ` [PATCH v3 00/12] refs: ref storage format migrations Junio C Hamano
12 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-05-28 6:32 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 15604 bytes --]
Introduce a new command that allows the user to migrate a repository
between ref storage formats. This new command is implemented as part of
a new git-refs(1) executable. This is due to two reasons:
- There is no good place to put the migration logic in existing
commands. git-maintenance(1) felt unwieldy, and git-pack-refs(1) is
not the correct place to put it, either.
- I had it in my mind to create a new low-level command for accessing
refs for quite a while already. git-refs(1) is that command and can
over time grow more functionality relating to refs. This should help
discoverability by consolidating low-level access to refs into a
single executable.
As mentioned in the preceding commit that introduces the ref storage
format migration logic, the new `git refs migrate` command still has a
bunch of restrictions. These restrictions are documented accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
.gitignore | 1 +
Documentation/git-refs.txt | 62 ++++++++++
Makefile | 1 +
builtin.h | 1 +
builtin/refs.c | 75 ++++++++++++
git.c | 1 +
t/t1460-refs-migrate.sh | 243 +++++++++++++++++++++++++++++++++++++
7 files changed, 384 insertions(+)
create mode 100644 Documentation/git-refs.txt
create mode 100644 builtin/refs.c
create mode 100755 t/t1460-refs-migrate.sh
diff --git a/.gitignore b/.gitignore
index 612c0f6a0f..8caf3700c2 100644
--- a/.gitignore
+++ b/.gitignore
@@ -126,6 +126,7 @@
/git-rebase
/git-receive-pack
/git-reflog
+/git-refs
/git-remote
/git-remote-http
/git-remote-https
diff --git a/Documentation/git-refs.txt b/Documentation/git-refs.txt
new file mode 100644
index 0000000000..3e9c05185a
--- /dev/null
+++ b/Documentation/git-refs.txt
@@ -0,0 +1,62 @@
+git-refs(1)
+===========
+
+NAME
+----
+
+git-refs - Low-level access to refs
+
+SYNOPSIS
+--------
+
+[verse]
+'git refs migrate' --ref-format=<format> [--dry-run]
+
+DESCRIPTION
+-----------
+
+This command provides low-level access to refs.
+
+COMMANDS
+--------
+
+migrate::
+ Migrate ref store between different formats.
+
+OPTIONS
+-------
+
+The following options are specific to 'git refs migrate':
+
+--ref-format=<format>::
+ The ref format to migrate the ref store to. Can be one of:
++
+include::ref-storage-format.txt[]
+
+--dry-run::
+ Perform the migration, but do not modify the repository. The migrated
+ refs will be written into a separate directory that can be inspected
+ separately. The name of the directory will be reported on stdout. This
+ can be used to double check that the migration works as expected before
+ performing the actual migration.
+
+KNOWN LIMITATIONS
+-----------------
+
+The ref format migration has several known limitations in its current form:
+
+* It is not possible to migrate repositories that have reflogs.
+
+* It is not possible to migrate repositories that have worktrees.
+
+* There is no way to block concurrent writes to the repository during an
+ ongoing migration. Concurrent writes can lead to an inconsistent migrated
+ state. Users are expected to block writes on a higher level. If your
+ repository is registered for scheduled maintenance, it is recommended to
+ unregister it first with git-maintenance(1).
+
+These limitations may eventually be lifted.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index cf504963c2..2d702b552c 100644
--- a/Makefile
+++ b/Makefile
@@ -1283,6 +1283,7 @@ BUILTIN_OBJS += builtin/read-tree.o
BUILTIN_OBJS += builtin/rebase.o
BUILTIN_OBJS += builtin/receive-pack.o
BUILTIN_OBJS += builtin/reflog.o
+BUILTIN_OBJS += builtin/refs.o
BUILTIN_OBJS += builtin/remote-ext.o
BUILTIN_OBJS += builtin/remote-fd.o
BUILTIN_OBJS += builtin/remote.o
diff --git a/builtin.h b/builtin.h
index 28280636da..7eda9b2486 100644
--- a/builtin.h
+++ b/builtin.h
@@ -207,6 +207,7 @@ int cmd_rebase(int argc, const char **argv, const char *prefix);
int cmd_rebase__interactive(int argc, const char **argv, const char *prefix);
int cmd_receive_pack(int argc, const char **argv, const char *prefix);
int cmd_reflog(int argc, const char **argv, const char *prefix);
+int cmd_refs(int argc, const char **argv, const char *prefix);
int cmd_remote(int argc, const char **argv, const char *prefix);
int cmd_remote_ext(int argc, const char **argv, const char *prefix);
int cmd_remote_fd(int argc, const char **argv, const char *prefix);
diff --git a/builtin/refs.c b/builtin/refs.c
new file mode 100644
index 0000000000..46dcd150d4
--- /dev/null
+++ b/builtin/refs.c
@@ -0,0 +1,75 @@
+#include "builtin.h"
+#include "parse-options.h"
+#include "refs.h"
+#include "repository.h"
+#include "strbuf.h"
+
+#define REFS_MIGRATE_USAGE \
+ N_("git refs migrate --ref-format=<format> [--dry-run]")
+
+static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
+{
+ const char * const migrate_usage[] = {
+ REFS_MIGRATE_USAGE,
+ NULL,
+ };
+ const char *format_str = NULL;
+ enum ref_storage_format format;
+ unsigned int flags = 0;
+ struct option options[] = {
+ OPT_STRING_F(0, "ref-format", &format_str, N_("format"),
+ N_("specify the reference format to convert to"),
+ PARSE_OPT_NONEG),
+ OPT_BIT(0, "dry-run", &flags,
+ N_("perform a non-destructive dry-run"),
+ REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN),
+ OPT_END(),
+ };
+ struct strbuf errbuf = STRBUF_INIT;
+ int err;
+
+ argc = parse_options(argc, argv, prefix, options, migrate_usage, 0);
+ if (argc)
+ usage(_("too many arguments"));
+ if (!format_str)
+ usage(_("missing --ref-format=<format>"));
+
+ format = ref_storage_format_by_name(format_str);
+ if (format == REF_STORAGE_FORMAT_UNKNOWN) {
+ err = error(_("unknown ref storage format '%s'"), format_str);
+ goto out;
+ }
+
+ if (the_repository->ref_storage_format == format) {
+ err = error(_("repository already uses '%s' format"),
+ ref_storage_format_to_name(format));
+ goto out;
+ }
+
+ if (repo_migrate_ref_storage_format(the_repository, format, flags, &errbuf) < 0) {
+ err = error("%s", errbuf.buf);
+ goto out;
+ }
+
+ err = 0;
+
+out:
+ strbuf_release(&errbuf);
+ return err;
+}
+
+int cmd_refs(int argc, const char **argv, const char *prefix)
+{
+ const char * const refs_usage[] = {
+ REFS_MIGRATE_USAGE,
+ NULL,
+ };
+ parse_opt_subcommand_fn *fn = NULL;
+ struct option opts[] = {
+ OPT_SUBCOMMAND("migrate", &fn, cmd_refs_migrate),
+ OPT_END(),
+ };
+
+ argc = parse_options(argc, argv, prefix, opts, refs_usage, 0);
+ return fn(argc, argv, prefix);
+}
diff --git a/git.c b/git.c
index 637c61ca9c..683bb69194 100644
--- a/git.c
+++ b/git.c
@@ -594,6 +594,7 @@ static struct cmd_struct commands[] = {
{ "rebase", cmd_rebase, RUN_SETUP | NEED_WORK_TREE },
{ "receive-pack", cmd_receive_pack },
{ "reflog", cmd_reflog, RUN_SETUP },
+ { "refs", cmd_refs, RUN_SETUP },
{ "remote", cmd_remote, RUN_SETUP },
{ "remote-ext", cmd_remote_ext, NO_PARSEOPT },
{ "remote-fd", cmd_remote_fd, NO_PARSEOPT },
diff --git a/t/t1460-refs-migrate.sh b/t/t1460-refs-migrate.sh
new file mode 100755
index 0000000000..f7c0783d30
--- /dev/null
+++ b/t/t1460-refs-migrate.sh
@@ -0,0 +1,243 @@
+#!/bin/sh
+
+test_description='migration of ref storage backends'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_migration () {
+ git -C "$1" for-each-ref --include-root-refs \
+ --format='%(refname) %(objectname) %(symref)' >expect &&
+ git -C "$1" refs migrate --ref-format="$2" &&
+ git -C "$1" for-each-ref --include-root-refs \
+ --format='%(refname) %(objectname) %(symref)' >actual &&
+ test_cmp expect actual &&
+
+ git -C "$1" rev-parse --show-ref-format >actual &&
+ echo "$2" >expect &&
+ test_cmp expect actual
+}
+
+test_expect_success 'setup' '
+ rm -rf .git &&
+ # The migration does not yet support reflogs.
+ git config --global core.logAllRefUpdates false
+'
+
+test_expect_success "superfluous arguments" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate foo 2>err &&
+ cat >expect <<-EOF &&
+ usage: too many arguments
+ EOF
+ test_cmp expect err
+'
+
+test_expect_success "missing ref storage format" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate 2>err &&
+ cat >expect <<-EOF &&
+ usage: missing --ref-format=<format>
+ EOF
+ test_cmp expect err
+'
+
+test_expect_success "unknown ref storage format" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=unknown 2>err &&
+ cat >expect <<-EOF &&
+ error: unknown ref storage format ${SQ}unknown${SQ}
+ EOF
+ test_cmp expect err
+'
+
+ref_formats="files reftable"
+for from_format in $ref_formats
+do
+ for to_format in $ref_formats
+ do
+ if test "$from_format" = "$to_format"
+ then
+ continue
+ fi
+
+ test_expect_success "$from_format: migration to same format fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$from_format 2>err &&
+ cat >expect <<-EOF &&
+ error: repository already uses ${SQ}$from_format${SQ} format
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: migration with reflog fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_config -C repo core.logAllRefUpdates true &&
+ test_commit -C repo logged &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$to_format 2>err &&
+ cat >expect <<-EOF &&
+ error: migrating reflogs is not supported yet
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: migration with worktree fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ git -C repo worktree add wt &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$to_format 2>err &&
+ cat >expect <<-EOF &&
+ error: migrating repositories with worktrees is not supported yet
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: unborn HEAD" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: single ref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: bare repository" '
+ test_when_finished "rm -rf repo repo.git" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git clone --ref-format=$from_format --mirror repo repo.git &&
+ test_migration repo.git "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: dangling symref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo symbolic-ref BROKEN_HEAD refs/heads/nonexistent &&
+ test_migration repo "$to_format" &&
+ echo refs/heads/nonexistent >expect &&
+ git -C repo symbolic-ref BROKEN_HEAD >actual &&
+ test_cmp expect actual
+ '
+
+ test_expect_success "$from_format -> $to_format: broken ref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ test-tool -C repo ref-store main update-ref "" refs/heads/broken \
+ "$(test_oid 001)" "$ZERO_OID" REF_SKIP_CREATE_REFLOG,REF_SKIP_OID_VERIFICATION &&
+ test_migration repo "$to_format" &&
+ test_oid 001 >expect &&
+ git -C repo rev-parse refs/heads/broken >actual &&
+ test_cmp expect actual
+ '
+
+ test_expect_success "$from_format -> $to_format: pseudo-refs" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo update-ref FOO_HEAD HEAD &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: special refs are left alone" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo rev-parse HEAD >repo/.git/MERGE_HEAD &&
+ git -C repo rev-parse MERGE_HEAD &&
+ test_migration repo "$to_format" &&
+ test_path_is_file repo/.git/MERGE_HEAD
+ '
+
+ test_expect_success "$from_format -> $to_format: a bunch of refs" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+
+ test_commit -C repo initial &&
+ cat >input <<-EOF &&
+ create FOO_HEAD HEAD
+ create refs/heads/branch-1 HEAD
+ create refs/heads/branch-2 HEAD
+ create refs/heads/branch-3 HEAD
+ create refs/heads/branch-4 HEAD
+ create refs/tags/tag-1 HEAD
+ create refs/tags/tag-2 HEAD
+ EOF
+ git -C repo update-ref --stdin <input &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: dry-run migration does not modify repository" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo refs migrate --dry-run \
+ --ref-format=$to_format >output &&
+ grep "Finished dry-run migration of refs" output &&
+ test_path_is_dir repo/.git/ref_migration.* &&
+ echo $from_format >expect &&
+ git -C repo rev-parse --show-ref-format >actual &&
+ test_cmp expect actual
+ '
+ done
+done
+
+test_expect_success 'migrating from files format deletes backend files' '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=files repo &&
+ test_commit -C repo first &&
+ git -C repo pack-refs --all &&
+ test_commit -C repo second &&
+ git -C repo update-ref ORIG_HEAD HEAD &&
+ git -C repo rev-parse HEAD >repo/.git/FETCH_HEAD &&
+
+ test_path_is_file repo/.git/HEAD &&
+ test_path_is_file repo/.git/ORIG_HEAD &&
+ test_path_is_file repo/.git/refs/heads/main &&
+ test_path_is_file repo/.git/packed-refs &&
+
+ test_migration repo reftable &&
+
+ echo "ref: refs/heads/.invalid" >expect &&
+ test_cmp expect repo/.git/HEAD &&
+ echo "this repository uses the reftable format" >expect &&
+ test_cmp expect repo/.git/refs/heads &&
+ test_path_is_file repo/.git/FETCH_HEAD &&
+ test_path_is_missing repo/.git/ORIG_HEAD &&
+ test_path_is_missing repo/.git/refs/heads/main &&
+ test_path_is_missing repo/.git/logs &&
+ test_path_is_missing repo/.git/packed-refs
+'
+
+test_expect_success 'migrating from reftable format deletes backend files' '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=reftable repo &&
+ test_commit -C repo first &&
+
+ test_path_is_dir repo/.git/reftable &&
+ test_migration repo files &&
+
+ test_path_is_missing repo/.git/reftable &&
+ echo "ref: refs/heads/main" >expect &&
+ test_cmp expect repo/.git/HEAD &&
+ test_path_is_file repo/.git/refs/heads/main
+'
+
+test_done
--
2.45.1.246.gb9cfe4845c.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH v3 12/12] builtin/refs: new command to migrate ref storage formats
2024-05-28 6:32 ` [PATCH v3 12/12] builtin/refs: new command to migrate " Patrick Steinhardt
@ 2024-05-31 23:46 ` Junio C Hamano
2024-06-02 1:03 ` Junio C Hamano
0 siblings, 1 reply; 103+ messages in thread
From: Junio C Hamano @ 2024-05-31 23:46 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Eric Sunshine, Ramsay Jones, Justin Tobler
Patrick Steinhardt <ps@pks.im> writes:
> diff --git a/git.c b/git.c
> index 637c61ca9c..683bb69194 100644
> --- a/git.c
> +++ b/git.c
> @@ -594,6 +594,7 @@ static struct cmd_struct commands[] = {
> { "rebase", cmd_rebase, RUN_SETUP | NEED_WORK_TREE },
> { "receive-pack", cmd_receive_pack },
> { "reflog", cmd_reflog, RUN_SETUP },
> + { "refs", cmd_refs, RUN_SETUP },
> { "remote", cmd_remote, RUN_SETUP },
> { "remote-ext", cmd_remote_ext, NO_PARSEOPT },
> { "remote-fd", cmd_remote_fd, NO_PARSEOPT },
One thing missing is an entry in command-list.
If you ran "make check-docs", you would have seen
$ make check-docs
no link: git-refs
The Documentation/MyFirstContribution.txt file does mention
command-list, but it is rather messy and unorganized. I think the
checklist at the top of <builtin.h> would be the best source of
information at this moment.
Thanks.
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v3 12/12] builtin/refs: new command to migrate ref storage formats
2024-05-31 23:46 ` Junio C Hamano
@ 2024-06-02 1:03 ` Junio C Hamano
2024-06-03 7:37 ` Patrick Steinhardt
0 siblings, 1 reply; 103+ messages in thread
From: Junio C Hamano @ 2024-06-02 1:03 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Eric Sunshine, Ramsay Jones, Justin Tobler
Junio C Hamano <gitster@pobox.com> writes:
> One thing missing is an entry in command-list.
>
> If you ran "make check-docs", you would have seen
>
> $ make check-docs
> no link: git-refs
>
> The Documentation/MyFirstContribution.txt file does mention
> command-list, but it is rather messy and unorganized. I think the
> checklist at the top of <builtin.h> would be the best source of
> information at this moment.
>
> Thanks.
You'd need something like this.
With the command missing from command-list.txt, git.1 (which has the
list of commands) will fail to mention the command, of course.
The fix to the documentation file itself is also crucial, as the
name section is where we grab the list of command descriptions used
in "git help -a", and with the extra blank line, git.1 will fail to
build.
--- >8 ---
Subject: SQUASH???
diff --git a/Documentation/git-refs.txt b/Documentation/git-refs.txt
index 3e9c05185a..5b99e04385 100644
--- a/Documentation/git-refs.txt
+++ b/Documentation/git-refs.txt
@@ -3,12 +3,11 @@ git-refs(1)
NAME
----
-
git-refs - Low-level access to refs
+
SYNOPSIS
--------
-
[verse]
'git refs migrate' --ref-format=<format> [--dry-run]
diff --git a/command-list.txt b/command-list.txt
index c4cd0f352b..e0bb87b3b5 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -157,6 +157,7 @@ git-read-tree plumbingmanipulators
git-rebase mainporcelain history
git-receive-pack synchelpers
git-reflog ancillarymanipulators complete
+git-refs ancillarymanipulators complete
git-remote ancillarymanipulators complete
git-repack ancillarymanipulators complete
git-replace ancillarymanipulators complete
^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH v3 12/12] builtin/refs: new command to migrate ref storage formats
2024-06-02 1:03 ` Junio C Hamano
@ 2024-06-03 7:37 ` Patrick Steinhardt
0 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-03 7:37 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, Eric Sunshine, Ramsay Jones, Justin Tobler, psteinhardt
[-- Attachment #1: Type: text/plain, Size: 1311 bytes --]
On Sat, Jun 01, 2024 at 06:03:27PM -0700, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
> > One thing missing is an entry in command-list.
> >
> > If you ran "make check-docs", you would have seen
> >
> > $ make check-docs
> > no link: git-refs
> >
> > The Documentation/MyFirstContribution.txt file does mention
> > command-list, but it is rather messy and unorganized. I think the
> > checklist at the top of <builtin.h> would be the best source of
> > information at this moment.
> >
> > Thanks.
>
> You'd need something like this.
>
> With the command missing from command-list.txt, git.1 (which has the
> list of commands) will fail to mention the command, of course.
>
> The fix to the documentation file itself is also crucial, as the
> name section is where we grab the list of command descriptions used
> in "git help -a", and with the extra blank line, git.1 will fail to
> build.
Thanks, I'll squash this in and send a new version.
It would of course be great if CI had noticed this. And we do execute
`make check-docs` via "ci/test-documentation.sh" indeed. But the problem
is that `make check-docs` does not return an error when there is a
missing link.
I'll send a follow-up for this test gap later this week.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v3 00/12] refs: ref storage format migrations
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
` (11 preceding siblings ...)
2024-05-28 6:32 ` [PATCH v3 12/12] builtin/refs: new command to migrate " Patrick Steinhardt
@ 2024-05-28 18:16 ` Junio C Hamano
2024-05-28 18:26 ` Junio C Hamano
12 siblings, 1 reply; 103+ messages in thread
From: Junio C Hamano @ 2024-05-28 18:16 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Eric Sunshine, Ramsay Jones, Justin Tobler
Patrick Steinhardt <ps@pks.im> writes:
> - Swapped out calls to `remove_path()` to `unlink()`. We do not want
> to walk up and remove empty parent directories, even though this is
> harmless in practice.
Hmph.
It is customary to remove a directory when the last file in it gets
removed in the working tree, because Git tracks contents and does
not track directories, and it seems that the files backend does the
equivalent in the files_transaction_finish() method with
unlink_or_warn() followed by try_remove_empty_parents(). If we are
transitioning from the files backend to the reftable backend, don't
we want to end with no loose ref files under $GIT_DIR/refs/ and no
empty directories to house those loose ref files that would be
created in the future?
Let's find out why this is needed in [10/12]. It may just be a
simple matter of "let's not bother removing directories as we remove
loose ref files one by one---we know the whole hierarchy can be
removed after we are done", in which case I do think it is nicer.
> - Release the reftable refdb before removing it. This closes the
> cached "tables.list" file descriptor, which would otherwise break
> removal of this file on Windows.
>
> - Fix a bug with worktrees where we store the current worktree refdb
> twice. This caused us to keep file descriptors open, which breaks
> removal of the refdb on Windows.
Wow. Windows' limitation sometimes helps us catch real bugs ;-).
Thanks, will replace to take a look.
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v3 00/12] refs: ref storage format migrations
2024-05-28 18:16 ` [PATCH v3 00/12] refs: ref storage format migrations Junio C Hamano
@ 2024-05-28 18:26 ` Junio C Hamano
0 siblings, 0 replies; 103+ messages in thread
From: Junio C Hamano @ 2024-05-28 18:26 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Eric Sunshine, Ramsay Jones, Justin Tobler
Junio C Hamano <gitster@pobox.com> writes:
> Patrick Steinhardt <ps@pks.im> writes:
>
>> - Swapped out calls to `remove_path()` to `unlink()`. We do not want
>> to walk up and remove empty parent directories, even though this is
>> harmless in practice.
> ...
> Let's find out why this is needed in [10/12]. It may just be a
> simple matter of "let's not bother removing directories as we remove
> loose ref files one by one---we know the whole hierarchy can be
> removed after we are done", in which case I do think it is nicer.
Ah, it is not something as sophisticated like that. It simply is
wrong to use remove_path() to remove files used by files_backend, as
the helper is designed to work on working tree files.
The reason it is wrong is because "now I removed this path, if the
containing directory has become empty, I need to remove that
directory, and I need to go up recursively doing that" has to stop
somewhere, and for remove_path() that is on the working tree side
the natural place to stop is at the root of the working tree,
i.e. above ".git/" directory. Of course, when removing extra
directories above a loose ref file, the recursion must stop much
earlier than going up to ".git/", and try_remove_empty_parents()
in files-backend.c is the helper that was designed for the task.
Looking at the difference between the result of applying v2 and v3,
I think this "unlink" thing is only about removing root refs? So I
agree that a simple unlink() is not just adequate but is absolutely
the right thing to do. There is no reason for us to go up and remove
empty directories when we remove ORIG_HEAD or other stuff.
Thanks.
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v4 00/12] refs: ref storage migrations
2024-05-23 8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
` (11 preceding siblings ...)
2024-05-28 6:31 ` [PATCH v3 00/12] refs: ref storage format migrations Patrick Steinhardt
@ 2024-06-03 9:30 ` Patrick Steinhardt
2024-06-03 9:30 ` [PATCH v4 01/12] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
` (11 more replies)
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
13 siblings, 12 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-03 9:30 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 4537 bytes --]
Hi,
this is the fourth version of my patch series that implements the logic
to migrate between ref storage formats.
Changes compared to v3:
- Add missing entry in "command-list.txt".
- Adapt manpage so that relevant data can be extracted from it.
Thanks!
Patrick
Patrick Steinhardt (12):
setup: unset ref storage when reinitializing repository version
refs: convert ref storage format to an enum
refs: pass storage format to `ref_store_init()` explicitly
refs: allow to skip creation of reflog entries
refs/files: refactor `add_pseudoref_and_head_entries()`
refs/files: extract function to iterate through root refs
refs/files: fix NULL pointer deref when releasing ref store
reftable: inline `merged_table_release()`
worktree: don't store main worktree twice
refs: implement removal of ref storages
refs: implement logic to migrate between ref storage formats
builtin/refs: new command to migrate ref storage formats
.gitignore | 1 +
Documentation/git-refs.txt | 61 +++++++
Makefile | 1 +
builtin.h | 1 +
builtin/clone.c | 2 +-
builtin/init-db.c | 2 +-
builtin/refs.c | 75 ++++++++
command-list.txt | 1 +
git.c | 1 +
refs.c | 340 +++++++++++++++++++++++++++++++++++--
refs.h | 41 ++++-
refs/files-backend.c | 123 ++++++++++++--
refs/packed-backend.c | 15 ++
refs/ref-cache.c | 2 +
refs/refs-internal.h | 7 +
refs/reftable-backend.c | 55 +++++-
reftable/merged.c | 12 +-
reftable/merged.h | 2 -
reftable/stack.c | 8 +-
repository.c | 3 +-
repository.h | 10 +-
setup.c | 10 +-
setup.h | 9 +-
t/helper/test-ref-store.c | 1 +
t/t1460-refs-migrate.sh | 243 ++++++++++++++++++++++++++
worktree.c | 29 ++--
26 files changed, 974 insertions(+), 81 deletions(-)
create mode 100644 Documentation/git-refs.txt
create mode 100644 builtin/refs.c
create mode 100755 t/t1460-refs-migrate.sh
Range-diff against v3:
1: afb705f6a0 = 1: afb705f6a0 setup: unset ref storage when reinitializing repository version
2: 7989e82dcd = 2: 7989e82dcd refs: convert ref storage format to an enum
3: 7d1a86292c = 3: 7d1a86292c refs: pass storage format to `ref_store_init()` explicitly
4: d0539b7456 = 4: d0539b7456 refs: allow to skip creation of reflog entries
5: 7f9ce5af2e = 5: 7f9ce5af2e refs/files: refactor `add_pseudoref_and_head_entries()`
6: f7577a0ab3 = 6: f7577a0ab3 refs/files: extract function to iterate through root refs
7: 56baa798fb = 7: 56baa798fb refs/files: fix NULL pointer deref when releasing ref store
8: c7e8ab40b5 = 8: c7e8ab40b5 reftable: inline `merged_table_release()`
9: 7a89aae515 = 9: 7a89aae515 worktree: don't store main worktree twice
10: f9d9420cf9 = 10: f9d9420cf9 refs: implement removal of ref storages
11: 1f26051eff = 11: 1f26051eff refs: implement logic to migrate between ref storage formats
12: d832414d1f ! 12: 83cb3f8c96 builtin/refs: new command to migrate ref storage formats
@@ Documentation/git-refs.txt (new)
+
+NAME
+----
-+
+git-refs - Low-level access to refs
+
++
+SYNOPSIS
+--------
-+
+[verse]
+'git refs migrate' --ref-format=<format> [--dry-run]
+
@@ builtin/refs.c (new)
+ return fn(argc, argv, prefix);
+}
+ ## command-list.txt ##
+@@ command-list.txt: git-read-tree plumbingmanipulators
+ git-rebase mainporcelain history
+ git-receive-pack synchelpers
+ git-reflog ancillarymanipulators complete
++git-refs ancillarymanipulators complete
+ git-remote ancillarymanipulators complete
+ git-repack ancillarymanipulators complete
+ git-replace ancillarymanipulators complete
+
## git.c ##
@@ git.c: static struct cmd_struct commands[] = {
{ "rebase", cmd_rebase, RUN_SETUP | NEED_WORK_TREE },
--
2.45.1.410.g58bac47f8e.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v4 01/12] setup: unset ref storage when reinitializing repository version
2024-06-03 9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
@ 2024-06-03 9:30 ` Patrick Steinhardt
2024-06-03 9:30 ` [PATCH v4 02/12] refs: convert ref storage format to an enum Patrick Steinhardt
` (10 subsequent siblings)
11 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-03 9:30 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 1259 bytes --]
When reinitializing a repository's version we may end up unsetting the
hash algorithm when it matches the default hash algorithm. If we didn't
do that then the previously configured value might remain intact.
While the same issue exists for the ref storage extension, we don't do
this here. This has been fine for most of the part because it is not
supported to re-initialize a repository with a different ref storage
format anyway. We're about to introduce a new command to migrate ref
storages though, so this is about to become an issue there.
Prepare for this and unset the ref storage format when reinitializing a
repository with the "files" format.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
setup.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/setup.c b/setup.c
index 7975230ffb..8c84ec9d4b 100644
--- a/setup.c
+++ b/setup.c
@@ -2028,6 +2028,8 @@ void initialize_repository_version(int hash_algo,
if (ref_storage_format != REF_STORAGE_FORMAT_FILES)
git_config_set("extensions.refstorage",
ref_storage_format_to_name(ref_storage_format));
+ else if (reinit)
+ git_config_set_gently("extensions.refstorage", NULL);
}
static int is_reinit(void)
--
2.45.1.410.g58bac47f8e.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v4 02/12] refs: convert ref storage format to an enum
2024-06-03 9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
2024-06-03 9:30 ` [PATCH v4 01/12] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
@ 2024-06-03 9:30 ` Patrick Steinhardt
2024-06-03 9:30 ` [PATCH v4 03/12] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
` (9 subsequent siblings)
11 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-03 9:30 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 8337 bytes --]
The ref storage format is tracked as a simple unsigned integer, which
makes it harder than necessary to discover what that integer actually is
or where its values are defined.
Convert the ref storage format to instead be an enum.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/clone.c | 2 +-
builtin/init-db.c | 2 +-
refs.c | 7 ++++---
refs.h | 10 ++++++++--
repository.c | 3 ++-
repository.h | 10 ++++------
setup.c | 8 ++++----
setup.h | 9 +++++----
8 files changed, 29 insertions(+), 22 deletions(-)
diff --git a/builtin/clone.c b/builtin/clone.c
index 1e07524c53..e808e02017 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -970,7 +970,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
int submodule_progress;
int filter_submodules = 0;
int hash_algo;
- unsigned int ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
+ enum ref_storage_format ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
const int do_not_override_repo_unix_permissions = -1;
const char *template_dir;
char *template_dir_dup = NULL;
diff --git a/builtin/init-db.c b/builtin/init-db.c
index 0170469b84..582dcf20f8 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -81,7 +81,7 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
const char *ref_format = NULL;
const char *initial_branch = NULL;
int hash_algo = GIT_HASH_UNKNOWN;
- unsigned int ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
+ enum ref_storage_format ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
int init_shared_repository = -1;
const struct option init_db_options[] = {
OPT_STRING(0, "template", &template_dir, N_("template-directory"),
diff --git a/refs.c b/refs.c
index 31032588e0..e6db85a165 100644
--- a/refs.c
+++ b/refs.c
@@ -37,14 +37,15 @@ static const struct ref_storage_be *refs_backends[] = {
[REF_STORAGE_FORMAT_REFTABLE] = &refs_be_reftable,
};
-static const struct ref_storage_be *find_ref_storage_backend(unsigned int ref_storage_format)
+static const struct ref_storage_be *find_ref_storage_backend(
+ enum ref_storage_format ref_storage_format)
{
if (ref_storage_format < ARRAY_SIZE(refs_backends))
return refs_backends[ref_storage_format];
return NULL;
}
-unsigned int ref_storage_format_by_name(const char *name)
+enum ref_storage_format ref_storage_format_by_name(const char *name)
{
for (unsigned int i = 0; i < ARRAY_SIZE(refs_backends); i++)
if (refs_backends[i] && !strcmp(refs_backends[i]->name, name))
@@ -52,7 +53,7 @@ unsigned int ref_storage_format_by_name(const char *name)
return REF_STORAGE_FORMAT_UNKNOWN;
}
-const char *ref_storage_format_to_name(unsigned int ref_storage_format)
+const char *ref_storage_format_to_name(enum ref_storage_format ref_storage_format)
{
const struct ref_storage_be *be = find_ref_storage_backend(ref_storage_format);
if (!be)
diff --git a/refs.h b/refs.h
index fe7f0db35e..a7afa9bede 100644
--- a/refs.h
+++ b/refs.h
@@ -11,8 +11,14 @@ struct string_list;
struct string_list_item;
struct worktree;
-unsigned int ref_storage_format_by_name(const char *name);
-const char *ref_storage_format_to_name(unsigned int ref_storage_format);
+enum ref_storage_format {
+ REF_STORAGE_FORMAT_UNKNOWN,
+ REF_STORAGE_FORMAT_FILES,
+ REF_STORAGE_FORMAT_REFTABLE,
+};
+
+enum ref_storage_format ref_storage_format_by_name(const char *name);
+const char *ref_storage_format_to_name(enum ref_storage_format ref_storage_format);
/*
* Resolve a reference, recursively following symbolic refererences.
diff --git a/repository.c b/repository.c
index d29b0304fb..166863f852 100644
--- a/repository.c
+++ b/repository.c
@@ -124,7 +124,8 @@ void repo_set_compat_hash_algo(struct repository *repo, int algo)
repo_read_loose_object_map(repo);
}
-void repo_set_ref_storage_format(struct repository *repo, unsigned int format)
+void repo_set_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format)
{
repo->ref_storage_format = format;
}
diff --git a/repository.h b/repository.h
index 4bd8969005..a35cd77c35 100644
--- a/repository.h
+++ b/repository.h
@@ -1,6 +1,7 @@
#ifndef REPOSITORY_H
#define REPOSITORY_H
+#include "refs.h"
#include "strmap.h"
struct config_set;
@@ -26,10 +27,6 @@ enum fetch_negotiation_setting {
FETCH_NEGOTIATION_NOOP,
};
-#define REF_STORAGE_FORMAT_UNKNOWN 0
-#define REF_STORAGE_FORMAT_FILES 1
-#define REF_STORAGE_FORMAT_REFTABLE 2
-
struct repo_settings {
int initialized;
@@ -181,7 +178,7 @@ struct repository {
const struct git_hash_algo *compat_hash_algo;
/* Repository's reference storage format, as serialized on disk. */
- unsigned int ref_storage_format;
+ enum ref_storage_format ref_storage_format;
/* A unique-id for tracing purposes. */
int trace2_repo_id;
@@ -220,7 +217,8 @@ void repo_set_gitdir(struct repository *repo, const char *root,
void repo_set_worktree(struct repository *repo, const char *path);
void repo_set_hash_algo(struct repository *repo, int algo);
void repo_set_compat_hash_algo(struct repository *repo, int compat_algo);
-void repo_set_ref_storage_format(struct repository *repo, unsigned int format);
+void repo_set_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format);
void initialize_repository(struct repository *repo);
RESULT_MUST_BE_USED
int repo_init(struct repository *r, const char *gitdir, const char *worktree);
diff --git a/setup.c b/setup.c
index 8c84ec9d4b..b49ee3e95f 100644
--- a/setup.c
+++ b/setup.c
@@ -1997,7 +1997,7 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree)
}
void initialize_repository_version(int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
int reinit)
{
char repo_version_string[10];
@@ -2044,7 +2044,7 @@ static int is_reinit(void)
return ret;
}
-void create_reference_database(unsigned int ref_storage_format,
+void create_reference_database(enum ref_storage_format ref_storage_format,
const char *initial_branch, int quiet)
{
struct strbuf err = STRBUF_INIT;
@@ -2243,7 +2243,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
}
static void validate_ref_storage_format(struct repository_format *repo_fmt,
- unsigned int format)
+ enum ref_storage_format format)
{
const char *name = getenv("GIT_DEFAULT_REF_FORMAT");
@@ -2263,7 +2263,7 @@ static void validate_ref_storage_format(struct repository_format *repo_fmt,
int init_db(const char *git_dir, const char *real_git_dir,
const char *template_dir, int hash,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
const char *initial_branch,
int init_shared_repository, unsigned int flags)
{
diff --git a/setup.h b/setup.h
index b3fd3bf45a..cd8dbc2497 100644
--- a/setup.h
+++ b/setup.h
@@ -1,6 +1,7 @@
#ifndef SETUP_H
#define SETUP_H
+#include "refs.h"
#include "string-list.h"
int is_inside_git_dir(void);
@@ -128,7 +129,7 @@ struct repository_format {
int is_bare;
int hash_algo;
int compat_hash_algo;
- unsigned int ref_storage_format;
+ enum ref_storage_format ref_storage_format;
int sparse_index;
char *work_tree;
struct string_list unknown_extensions;
@@ -192,13 +193,13 @@ const char *get_template_dir(const char *option_template);
int init_db(const char *git_dir, const char *real_git_dir,
const char *template_dir, int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
const char *initial_branch, int init_shared_repository,
unsigned int flags);
void initialize_repository_version(int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
int reinit);
-void create_reference_database(unsigned int ref_storage_format,
+void create_reference_database(enum ref_storage_format ref_storage_format,
const char *initial_branch, int quiet);
/*
--
2.45.1.410.g58bac47f8e.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v4 03/12] refs: pass storage format to `ref_store_init()` explicitly
2024-06-03 9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
2024-06-03 9:30 ` [PATCH v4 01/12] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
2024-06-03 9:30 ` [PATCH v4 02/12] refs: convert ref storage format to an enum Patrick Steinhardt
@ 2024-06-03 9:30 ` Patrick Steinhardt
2024-06-04 8:23 ` Karthik Nayak
2024-06-03 9:30 ` [PATCH v4 04/12] refs: allow to skip creation of reflog entries Patrick Steinhardt
` (8 subsequent siblings)
11 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-03 9:30 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 2646 bytes --]
We're about to introduce logic to migrate refs from one storage format
to another one. This will require us to initialize a ref store with a
different format than the one used by the passed-in repository.
Prepare for this by accepting the desired ref storage format as
parameter.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/refs.c b/refs.c
index e6db85a165..7c3f4df457 100644
--- a/refs.c
+++ b/refs.c
@@ -1894,13 +1894,14 @@ static struct ref_store *lookup_ref_store_map(struct strmap *map,
* gitdir.
*/
static struct ref_store *ref_store_init(struct repository *repo,
+ enum ref_storage_format format,
const char *gitdir,
unsigned int flags)
{
const struct ref_storage_be *be;
struct ref_store *refs;
- be = find_ref_storage_backend(repo->ref_storage_format);
+ be = find_ref_storage_backend(format);
if (!be)
BUG("reference backend is unknown");
@@ -1922,7 +1923,8 @@ struct ref_store *get_main_ref_store(struct repository *r)
if (!r->gitdir)
BUG("attempting to get main_ref_store outside of repository");
- r->refs_private = ref_store_init(r, r->gitdir, REF_STORE_ALL_CAPS);
+ r->refs_private = ref_store_init(r, r->ref_storage_format,
+ r->gitdir, REF_STORE_ALL_CAPS);
r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
return r->refs_private;
}
@@ -1982,7 +1984,8 @@ struct ref_store *repo_get_submodule_ref_store(struct repository *repo,
free(subrepo);
goto done;
}
- refs = ref_store_init(subrepo, submodule_sb.buf,
+ refs = ref_store_init(subrepo, the_repository->ref_storage_format,
+ submodule_sb.buf,
REF_STORE_READ | REF_STORE_ODB);
register_ref_store_map(&repo->submodule_ref_stores, "submodule",
refs, submodule);
@@ -2011,12 +2014,12 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt)
struct strbuf common_path = STRBUF_INIT;
strbuf_git_common_path(&common_path, wt->repo,
"worktrees/%s", wt->id);
- refs = ref_store_init(wt->repo, common_path.buf,
- REF_STORE_ALL_CAPS);
+ refs = ref_store_init(wt->repo, wt->repo->ref_storage_format,
+ common_path.buf, REF_STORE_ALL_CAPS);
strbuf_release(&common_path);
} else {
- refs = ref_store_init(wt->repo, wt->repo->commondir,
- REF_STORE_ALL_CAPS);
+ refs = ref_store_init(wt->repo, the_repository->ref_storage_format,
+ wt->repo->commondir, REF_STORE_ALL_CAPS);
}
if (refs)
--
2.45.1.410.g58bac47f8e.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH v4 03/12] refs: pass storage format to `ref_store_init()` explicitly
2024-06-03 9:30 ` [PATCH v4 03/12] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
@ 2024-06-04 8:23 ` Karthik Nayak
0 siblings, 0 replies; 103+ messages in thread
From: Karthik Nayak @ 2024-06-04 8:23 UTC (permalink / raw)
To: Patrick Steinhardt, git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 794 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> We're about to introduce logic to migrate refs from one storage format
> to another one. This will require us to initialize a ref store with a
> different format than the one used by the passed-in repository.
>
> Prepare for this by accepting the desired ref storage format as
> parameter.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> refs.c | 17 ++++++++++-------
> 1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index e6db85a165..7c3f4df457 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1894,13 +1894,14 @@ static struct ref_store *lookup_ref_store_map(struct strmap *map,
> * gitdir.
Nit: Would be nice to update the documentation here, simply even
s/gitdir/gitdir and ref storage format.
[snip]
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v4 04/12] refs: allow to skip creation of reflog entries
2024-06-03 9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
` (2 preceding siblings ...)
2024-06-03 9:30 ` [PATCH v4 03/12] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
@ 2024-06-03 9:30 ` Patrick Steinhardt
2024-06-04 11:04 ` Karthik Nayak
2024-06-03 9:30 ` [PATCH v4 05/12] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
` (7 subsequent siblings)
11 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-03 9:30 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 3845 bytes --]
The ref backends do not have any way to disable the creation of reflog
entries. This will be required for upcoming ref format migration logic
so that we do not create any entries that didn't exist in the original
ref database.
Provide a new `REF_SKIP_CREATE_REFLOG` flag that allows the caller to
disable reflog entry creation.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 6 ++++++
refs.h | 8 +++++++-
refs/files-backend.c | 4 ++++
refs/reftable-backend.c | 3 ++-
t/helper/test-ref-store.c | 1 +
5 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/refs.c b/refs.c
index 7c3f4df457..66e9585767 100644
--- a/refs.c
+++ b/refs.c
@@ -1194,6 +1194,12 @@ int ref_transaction_update(struct ref_transaction *transaction,
{
assert(err);
+ if ((flags & REF_FORCE_CREATE_REFLOG) &&
+ (flags & REF_SKIP_CREATE_REFLOG)) {
+ strbuf_addstr(err, _("refusing to force and skip creation of reflog"));
+ return -1;
+ }
+
if (!(flags & REF_SKIP_REFNAME_VERIFICATION) &&
((new_oid && !is_null_oid(new_oid)) ?
check_refname_format(refname, REFNAME_ALLOW_ONELEVEL) :
diff --git a/refs.h b/refs.h
index a7afa9bede..50a2b3ab09 100644
--- a/refs.h
+++ b/refs.h
@@ -659,13 +659,19 @@ struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
*/
#define REF_SKIP_REFNAME_VERIFICATION (1 << 11)
+/*
+ * Skip creation of a reflog entry, even if it would have otherwise been
+ * created.
+ */
+#define REF_SKIP_CREATE_REFLOG (1 << 12)
+
/*
* Bitmask of all of the flags that are allowed to be passed in to
* ref_transaction_update() and friends:
*/
#define REF_TRANSACTION_UPDATE_ALLOWED_FLAGS \
(REF_NO_DEREF | REF_FORCE_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION | \
- REF_SKIP_REFNAME_VERIFICATION)
+ REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG)
/*
* Add a reference update to transaction. `new_oid` is the value that
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 73380d7e99..bd0d63bcba 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1750,6 +1750,9 @@ static int files_log_ref_write(struct files_ref_store *refs,
{
int logfd, result;
+ if (flags & REF_SKIP_CREATE_REFLOG)
+ return 0;
+
if (log_all_ref_updates == LOG_REFS_UNSET)
log_all_ref_updates = is_bare_repository() ? LOG_REFS_NONE : LOG_REFS_NORMAL;
@@ -2251,6 +2254,7 @@ static int split_head_update(struct ref_update *update,
struct ref_update *new_update;
if ((update->flags & REF_LOG_ONLY) ||
+ (update->flags & REF_SKIP_CREATE_REFLOG) ||
(update->flags & REF_IS_PRUNING) ||
(update->flags & REF_UPDATE_VIA_HEAD))
return 0;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index f6edfdf5b3..bffed9257f 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1103,7 +1103,8 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data
if (ret)
goto done;
- } else if (u->flags & REF_HAVE_NEW &&
+ } else if (!(u->flags & REF_SKIP_CREATE_REFLOG) &&
+ (u->flags & REF_HAVE_NEW) &&
(u->flags & REF_FORCE_CREATE_REFLOG ||
should_write_log(&arg->refs->base, u->refname))) {
struct reftable_log_record *log;
diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index c9efd74c2b..ad24300170 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -126,6 +126,7 @@ static struct flag_definition transaction_flags[] = {
FLAG_DEF(REF_FORCE_CREATE_REFLOG),
FLAG_DEF(REF_SKIP_OID_VERIFICATION),
FLAG_DEF(REF_SKIP_REFNAME_VERIFICATION),
+ FLAG_DEF(REF_SKIP_CREATE_REFLOG),
{ NULL, 0 }
};
--
2.45.1.410.g58bac47f8e.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH v4 04/12] refs: allow to skip creation of reflog entries
2024-06-03 9:30 ` [PATCH v4 04/12] refs: allow to skip creation of reflog entries Patrick Steinhardt
@ 2024-06-04 11:04 ` Karthik Nayak
0 siblings, 0 replies; 103+ messages in thread
From: Karthik Nayak @ 2024-06-04 11:04 UTC (permalink / raw)
To: Patrick Steinhardt, git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 2104 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
[snip]
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 73380d7e99..bd0d63bcba 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -1750,6 +1750,9 @@ static int files_log_ref_write(struct files_ref_store *refs,
> {
> int logfd, result;
>
> + if (flags & REF_SKIP_CREATE_REFLOG)
> + return 0;
> +
> if (log_all_ref_updates == LOG_REFS_UNSET)
> log_all_ref_updates = is_bare_repository() ? LOG_REFS_NONE : LOG_REFS_NORMAL;
>
> @@ -2251,6 +2254,7 @@ static int split_head_update(struct ref_update *update,
> struct ref_update *new_update;
>
> if ((update->flags & REF_LOG_ONLY) ||
> + (update->flags & REF_SKIP_CREATE_REFLOG) ||
> (update->flags & REF_IS_PRUNING) ||
> (update->flags & REF_UPDATE_VIA_HEAD))
> return 0;
So updates to refs which are pointed by HEAD usually trigger a reflog
entry for HEAD itself. Here we skip that since REF_SKIP_CREATE_REFLOG is
set. Nice, this is an edge case that could have been easy to miss.
> diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
> index f6edfdf5b3..bffed9257f 100644
> --- a/refs/reftable-backend.c
> +++ b/refs/reftable-backend.c
> @@ -1103,7 +1103,8 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data
>
> if (ret)
> goto done;
> - } else if (u->flags & REF_HAVE_NEW &&
> + } else if (!(u->flags & REF_SKIP_CREATE_REFLOG) &&
> + (u->flags & REF_HAVE_NEW) &&
> (u->flags & REF_FORCE_CREATE_REFLOG ||
> should_write_log(&arg->refs->base, u->refname))) {
> struct reftable_log_record *log;
> diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
> index c9efd74c2b..ad24300170 100644
> --- a/t/helper/test-ref-store.c
> +++ b/t/helper/test-ref-store.c
> @@ -126,6 +126,7 @@ static struct flag_definition transaction_flags[] = {
> FLAG_DEF(REF_FORCE_CREATE_REFLOG),
> FLAG_DEF(REF_SKIP_OID_VERIFICATION),
> FLAG_DEF(REF_SKIP_REFNAME_VERIFICATION),
> + FLAG_DEF(REF_SKIP_CREATE_REFLOG),
> { NULL, 0 }
> };
>
> --
> 2.45.1.410.g58bac47f8e.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v4 05/12] refs/files: refactor `add_pseudoref_and_head_entries()`
2024-06-03 9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
` (3 preceding siblings ...)
2024-06-03 9:30 ` [PATCH v4 04/12] refs: allow to skip creation of reflog entries Patrick Steinhardt
@ 2024-06-03 9:30 ` Patrick Steinhardt
2024-06-03 9:30 ` [PATCH v4 06/12] refs/files: extract function to iterate through root refs Patrick Steinhardt
` (6 subsequent siblings)
11 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-03 9:30 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 1937 bytes --]
The `add_pseudoref_and_head_entries()` function accepts both the ref
store as well as a directory name as input. This is unnecessary though
as the ref store already uniquely identifies the root directory of the
ref store anyway.
Furthermore, the function is misnamed now that we have clarified the
meaning of pseudorefs as it doesn't add pseudorefs, but root refs.
Rename it accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs/files-backend.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index bd0d63bcba..b4e5437ffe 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -324,16 +324,14 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
}
/*
- * Add pseudorefs to the ref dir by parsing the directory for any files
- * which follow the pseudoref syntax.
+ * Add root refs to the ref dir by parsing the directory for any files which
+ * follow the root ref syntax.
*/
-static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
- struct ref_dir *dir,
- const char *dirname)
+static void add_root_refs(struct files_ref_store *refs,
+ struct ref_dir *dir)
{
- struct files_ref_store *refs =
- files_downcast(ref_store, REF_STORE_READ, "fill_ref_dir");
struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
+ const char *dirname = refs->loose->root->name;
struct dirent *de;
size_t dirnamelen;
DIR *d;
@@ -388,8 +386,7 @@ static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs,
dir = get_ref_dir(refs->loose->root);
if (flags & DO_FOR_EACH_INCLUDE_ROOT_REFS)
- add_pseudoref_and_head_entries(dir->cache->ref_store, dir,
- refs->loose->root->name);
+ add_root_refs(refs, dir);
/*
* Add an incomplete entry for "refs/" (to be filled
--
2.45.1.410.g58bac47f8e.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v4 06/12] refs/files: extract function to iterate through root refs
2024-06-03 9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
` (4 preceding siblings ...)
2024-06-03 9:30 ` [PATCH v4 05/12] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
@ 2024-06-03 9:30 ` Patrick Steinhardt
2024-06-05 10:07 ` Jeff King
2024-06-03 9:30 ` [PATCH v4 07/12] refs/files: fix NULL pointer deref when releasing ref store Patrick Steinhardt
` (5 subsequent siblings)
11 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-03 9:30 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 2782 bytes --]
Extract a new function that can be used to iterate through all root refs
known to the "files" backend. This will be used in the next commit,
where we start to teach ref backends to remove themselves.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs/files-backend.c | 49 ++++++++++++++++++++++++++++++++++++--------
1 file changed, 40 insertions(+), 9 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b4e5437ffe..b7268b26c8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -323,17 +323,15 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
add_per_worktree_entries_to_dir(dir, dirname);
}
-/*
- * Add root refs to the ref dir by parsing the directory for any files which
- * follow the root ref syntax.
- */
-static void add_root_refs(struct files_ref_store *refs,
- struct ref_dir *dir)
+static int for_each_root_ref(struct files_ref_store *refs,
+ int (*cb)(const char *refname, void *cb_data),
+ void *cb_data)
{
struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
const char *dirname = refs->loose->root->name;
struct dirent *de;
size_t dirnamelen;
+ int ret;
DIR *d;
files_ref_path(refs, &path, dirname);
@@ -341,7 +339,7 @@ static void add_root_refs(struct files_ref_store *refs,
d = opendir(path.buf);
if (!d) {
strbuf_release(&path);
- return;
+ return -1;
}
strbuf_addstr(&refname, dirname);
@@ -357,14 +355,47 @@ static void add_root_refs(struct files_ref_store *refs,
strbuf_addstr(&refname, de->d_name);
dtype = get_dtype(de, &path, 1);
- if (dtype == DT_REG && is_root_ref(de->d_name))
- loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
+ if (dtype == DT_REG && is_root_ref(de->d_name)) {
+ ret = cb(refname.buf, cb_data);
+ if (ret)
+ goto done;
+ }
strbuf_setlen(&refname, dirnamelen);
}
+
+done:
strbuf_release(&refname);
strbuf_release(&path);
closedir(d);
+ return ret;
+}
+
+struct fill_root_ref_data {
+ struct files_ref_store *refs;
+ struct ref_dir *dir;
+};
+
+static int fill_root_ref(const char *refname, void *cb_data)
+{
+ struct fill_root_ref_data *data = cb_data;
+ loose_fill_ref_dir_regular_file(data->refs, refname, data->dir);
+ return 0;
+}
+
+/*
+ * Add root refs to the ref dir by parsing the directory for any files which
+ * follow the root ref syntax.
+ */
+static void add_root_refs(struct files_ref_store *refs,
+ struct ref_dir *dir)
+{
+ struct fill_root_ref_data data = {
+ .refs = refs,
+ .dir = dir,
+ };
+
+ for_each_root_ref(refs, fill_root_ref, &data);
}
static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs,
--
2.45.1.410.g58bac47f8e.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH v4 06/12] refs/files: extract function to iterate through root refs
2024-06-03 9:30 ` [PATCH v4 06/12] refs/files: extract function to iterate through root refs Patrick Steinhardt
@ 2024-06-05 10:07 ` Jeff King
2024-06-06 4:50 ` Patrick Steinhardt
0 siblings, 1 reply; 103+ messages in thread
From: Jeff King @ 2024-06-05 10:07 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
On Mon, Jun 03, 2024 at 11:30:35AM +0200, Patrick Steinhardt wrote:
> +static int for_each_root_ref(struct files_ref_store *refs,
> + int (*cb)(const char *refname, void *cb_data),
> + void *cb_data)
> {
> struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
> const char *dirname = refs->loose->root->name;
> struct dirent *de;
> size_t dirnamelen;
> + int ret;
> DIR *d;
Should we initialize ret to 0 here?
We set it only inside the loop over dir entries:
> @@ -357,14 +355,47 @@ static void add_root_refs(struct files_ref_store *refs,
> strbuf_addstr(&refname, de->d_name);
>
> dtype = get_dtype(de, &path, 1);
> - if (dtype == DT_REG && is_root_ref(de->d_name))
> - loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
> + if (dtype == DT_REG && is_root_ref(de->d_name)) {
> + ret = cb(refname.buf, cb_data);
> + if (ret)
> + goto done;
> + }
>
> strbuf_setlen(&refname, dirnamelen);
> }
...but if the directory is empty (or only has "." files and ".lock"
files), we won't call "cb" at all, and hence won't ever set "ret".
And then at the end:
> +done:
> strbuf_release(&refname);
> strbuf_release(&path);
> closedir(d);
> + return ret;
> +}
We return uninitialized garbage.
(Sorry for the late review; this got flagged by coverity since the topic
hit 'next').
-Peff
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v4 06/12] refs/files: extract function to iterate through root refs
2024-06-05 10:07 ` Jeff King
@ 2024-06-06 4:50 ` Patrick Steinhardt
2024-06-06 5:15 ` Patrick Steinhardt
0 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 4:50 UTC (permalink / raw)
To: Jeff King; +Cc: git, Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 628 bytes --]
On Wed, Jun 05, 2024 at 06:07:28AM -0400, Jeff King wrote:
> On Mon, Jun 03, 2024 at 11:30:35AM +0200, Patrick Steinhardt wrote:
>
> > +static int for_each_root_ref(struct files_ref_store *refs,
> > + int (*cb)(const char *refname, void *cb_data),
> > + void *cb_data)
> > {
> > struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
> > const char *dirname = refs->loose->root->name;
> > struct dirent *de;
> > size_t dirnamelen;
> > + int ret;
> > DIR *d;
>
> Should we initialize ret to 0 here?
Yeah, we should. Or rather, I'll set `ret = 0;` on the successful path.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v4 06/12] refs/files: extract function to iterate through root refs
2024-06-06 4:50 ` Patrick Steinhardt
@ 2024-06-06 5:15 ` Patrick Steinhardt
2024-06-06 6:32 ` Patrick Steinhardt
0 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 5:15 UTC (permalink / raw)
To: Jeff King; +Cc: git, Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 1592 bytes --]
On Thu, Jun 06, 2024 at 06:50:56AM +0200, Patrick Steinhardt wrote:
> On Wed, Jun 05, 2024 at 06:07:28AM -0400, Jeff King wrote:
> > On Mon, Jun 03, 2024 at 11:30:35AM +0200, Patrick Steinhardt wrote:
> >
> > > +static int for_each_root_ref(struct files_ref_store *refs,
> > > + int (*cb)(const char *refname, void *cb_data),
> > > + void *cb_data)
> > > {
> > > struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
> > > const char *dirname = refs->loose->root->name;
> > > struct dirent *de;
> > > size_t dirnamelen;
> > > + int ret;
> > > DIR *d;
> >
> > Should we initialize ret to 0 here?
>
> Yeah, we should. Or rather, I'll set `ret = 0;` on the successful path.
>
> Patrick
I was wondering why the compiler didn't flag it, because I know that GCC
has `-Wmaybe-uninitialized`. Turns out that this warning only works when
having optimizations enabled, but if we do then it correctly flags this
use:
(git) ~/Development/git:HEAD $ make refs/files-backend.o
* new build flags
CC refs/files-backend.o
refs/files-backend.c: In function ‘for_each_root_ref’:
refs/files-backend.c:371:16: error: ‘ret’ may be used uninitialized [-Werror=maybe-uninitialized]
371 | return ret;
| ^~~
refs/files-backend.c:334:13: note: ‘ret’ was declared here
334 | int ret;
| ^~~
cc1: all warnings being treated as errors
I'll have a look at our CI jobs and adapt my own config.mak to include
`-Og`.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v4 06/12] refs/files: extract function to iterate through root refs
2024-06-06 5:15 ` Patrick Steinhardt
@ 2024-06-06 6:32 ` Patrick Steinhardt
0 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 6:32 UTC (permalink / raw)
To: Jeff King; +Cc: git, Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 1948 bytes --]
On Thu, Jun 06, 2024 at 07:15:38AM +0200, Patrick Steinhardt wrote:
> On Thu, Jun 06, 2024 at 06:50:56AM +0200, Patrick Steinhardt wrote:
> > On Wed, Jun 05, 2024 at 06:07:28AM -0400, Jeff King wrote:
> > > On Mon, Jun 03, 2024 at 11:30:35AM +0200, Patrick Steinhardt wrote:
> > >
> > > > +static int for_each_root_ref(struct files_ref_store *refs,
> > > > + int (*cb)(const char *refname, void *cb_data),
> > > > + void *cb_data)
> > > > {
> > > > struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
> > > > const char *dirname = refs->loose->root->name;
> > > > struct dirent *de;
> > > > size_t dirnamelen;
> > > > + int ret;
> > > > DIR *d;
> > >
> > > Should we initialize ret to 0 here?
> >
> > Yeah, we should. Or rather, I'll set `ret = 0;` on the successful path.
> >
> > Patrick
>
> I was wondering why the compiler didn't flag it, because I know that GCC
> has `-Wmaybe-uninitialized`. Turns out that this warning only works when
> having optimizations enabled, but if we do then it correctly flags this
> use:
>
> (git) ~/Development/git:HEAD $ make refs/files-backend.o
> * new build flags
> CC refs/files-backend.o
> refs/files-backend.c: In function ‘for_each_root_ref’:
> refs/files-backend.c:371:16: error: ‘ret’ may be used uninitialized [-Werror=maybe-uninitialized]
> 371 | return ret;
> | ^~~
> refs/files-backend.c:334:13: note: ‘ret’ was declared here
> 334 | int ret;
> | ^~~
> cc1: all warnings being treated as errors
>
> I'll have a look at our CI jobs and adapt my own config.mak to include
> `-Og`.
>
> Patrick
I've sent out a patch series [1] that would have made CI detect this
issue before hitting any of the mainline branches.
[1]: https://lore.kernel.org/git/cover.1717655210.git.ps@pks.im/
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v4 07/12] refs/files: fix NULL pointer deref when releasing ref store
2024-06-03 9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
` (5 preceding siblings ...)
2024-06-03 9:30 ` [PATCH v4 06/12] refs/files: extract function to iterate through root refs Patrick Steinhardt
@ 2024-06-03 9:30 ` Patrick Steinhardt
2024-06-03 9:30 ` [PATCH v4 08/12] reftable: inline `merged_table_release()` Patrick Steinhardt
` (4 subsequent siblings)
11 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-03 9:30 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 702 bytes --]
The `free_ref_cache()` function is not `NULL` safe and will thus
segfault when being passed such a pointer. This can easily happen when
trying to release a partially initialized "files" ref store. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs/ref-cache.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index b6c53fc8ed..4ce519bbc8 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -71,6 +71,8 @@ static void free_ref_entry(struct ref_entry *entry)
void free_ref_cache(struct ref_cache *cache)
{
+ if (!cache)
+ return;
free_ref_entry(cache->root);
free(cache);
}
--
2.45.1.410.g58bac47f8e.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v4 08/12] reftable: inline `merged_table_release()`
2024-06-03 9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
` (6 preceding siblings ...)
2024-06-03 9:30 ` [PATCH v4 07/12] refs/files: fix NULL pointer deref when releasing ref store Patrick Steinhardt
@ 2024-06-03 9:30 ` Patrick Steinhardt
2024-06-03 9:30 ` [PATCH v4 09/12] worktree: don't store main worktree twice Patrick Steinhardt
` (3 subsequent siblings)
11 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-03 9:30 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 2400 bytes --]
The function `merged_table_release()` releases a merged table, whereas
`reftable_merged_table_free()` releases a merged table and then also
free's its pointer. But all callsites of `merged_table_release()` are in
fact followed by `reftable_merged_table_free()`, which is redundant.
Inline `merged_table_release()` into `reftable_merged_table_free()` to
get rid of this redundance.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
reftable/merged.c | 12 ++----------
reftable/merged.h | 2 --
reftable/stack.c | 8 ++------
3 files changed, 4 insertions(+), 18 deletions(-)
diff --git a/reftable/merged.c b/reftable/merged.c
index f85a24c678..804fdc0de0 100644
--- a/reftable/merged.c
+++ b/reftable/merged.c
@@ -207,19 +207,11 @@ int reftable_new_merged_table(struct reftable_merged_table **dest,
return 0;
}
-/* clears the list of subtable, without affecting the readers themselves. */
-void merged_table_release(struct reftable_merged_table *mt)
-{
- FREE_AND_NULL(mt->stack);
- mt->stack_len = 0;
-}
-
void reftable_merged_table_free(struct reftable_merged_table *mt)
{
- if (!mt) {
+ if (!mt)
return;
- }
- merged_table_release(mt);
+ FREE_AND_NULL(mt->stack);
reftable_free(mt);
}
diff --git a/reftable/merged.h b/reftable/merged.h
index a2571dbc99..9db45c3196 100644
--- a/reftable/merged.h
+++ b/reftable/merged.h
@@ -24,6 +24,4 @@ struct reftable_merged_table {
uint64_t max;
};
-void merged_table_release(struct reftable_merged_table *mt);
-
#endif
diff --git a/reftable/stack.c b/reftable/stack.c
index a59ebe038d..984fd866d0 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -261,10 +261,8 @@ static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
new_tables = NULL;
st->readers_len = new_readers_len;
- if (st->merged) {
- merged_table_release(st->merged);
+ if (st->merged)
reftable_merged_table_free(st->merged);
- }
if (st->readers) {
reftable_free(st->readers);
}
@@ -968,10 +966,8 @@ static int stack_write_compact(struct reftable_stack *st,
done:
reftable_iterator_destroy(&it);
- if (mt) {
- merged_table_release(mt);
+ if (mt)
reftable_merged_table_free(mt);
- }
reftable_ref_record_release(&ref);
reftable_log_record_release(&log);
st->stats.entries_written += entries;
--
2.45.1.410.g58bac47f8e.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v4 09/12] worktree: don't store main worktree twice
2024-06-03 9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
` (7 preceding siblings ...)
2024-06-03 9:30 ` [PATCH v4 08/12] reftable: inline `merged_table_release()` Patrick Steinhardt
@ 2024-06-03 9:30 ` Patrick Steinhardt
2024-06-03 9:30 ` [PATCH v4 10/12] refs: implement removal of ref storages Patrick Steinhardt
` (2 subsequent siblings)
11 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-03 9:30 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 3113 bytes --]
In `get_worktree_ref_store()` we either return the repository's main ref
store, or we look up the ref store via the map of worktree ref stores.
Which of these worktrees gets picked depends on the `is_current` bit of
the worktree, which indicates whether the worktree is the one that
corresponds to `the_repository`.
The bit is getting set in `get_worktrees()`, but only after we have
computed the list of all worktrees. This is too late though, because at
that time we have already called `get_worktree_ref_store()` on each of
the worktrees via `add_head_info()`. The consequence is that the current
worktree will not have been marked accordingly, which means that we did
not use the main ref store, but instead created a new ref store. We thus
have two separate ref stores now that map to the same ref database.
Fix this by setting `is_current` before we call `add_head_info()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
worktree.c | 29 +++++++++++------------------
1 file changed, 11 insertions(+), 18 deletions(-)
diff --git a/worktree.c b/worktree.c
index 12eadacc61..70844d023a 100644
--- a/worktree.c
+++ b/worktree.c
@@ -53,6 +53,15 @@ static void add_head_info(struct worktree *wt)
wt->is_detached = 1;
}
+static int is_current_worktree(struct worktree *wt)
+{
+ char *git_dir = absolute_pathdup(get_git_dir());
+ const char *wt_git_dir = get_worktree_git_dir(wt);
+ int is_current = !fspathcmp(git_dir, absolute_path(wt_git_dir));
+ free(git_dir);
+ return is_current;
+}
+
/**
* get the main worktree
*/
@@ -76,6 +85,7 @@ static struct worktree *get_main_worktree(int skip_reading_head)
*/
worktree->is_bare = (is_bare_repository_cfg == 1) ||
is_bare_repository();
+ worktree->is_current = is_current_worktree(worktree);
if (!skip_reading_head)
add_head_info(worktree);
return worktree;
@@ -102,6 +112,7 @@ struct worktree *get_linked_worktree(const char *id,
worktree->repo = the_repository;
worktree->path = strbuf_detach(&worktree_path, NULL);
worktree->id = xstrdup(id);
+ worktree->is_current = is_current_worktree(worktree);
if (!skip_reading_head)
add_head_info(worktree);
@@ -111,23 +122,6 @@ struct worktree *get_linked_worktree(const char *id,
return worktree;
}
-static void mark_current_worktree(struct worktree **worktrees)
-{
- char *git_dir = absolute_pathdup(get_git_dir());
- int i;
-
- for (i = 0; worktrees[i]; i++) {
- struct worktree *wt = worktrees[i];
- const char *wt_git_dir = get_worktree_git_dir(wt);
-
- if (!fspathcmp(git_dir, absolute_path(wt_git_dir))) {
- wt->is_current = 1;
- break;
- }
- }
- free(git_dir);
-}
-
/*
* NEEDSWORK: This function exists so that we can look up metadata of a
* worktree without trying to access any of its internals like the refdb. It
@@ -164,7 +158,6 @@ static struct worktree **get_worktrees_internal(int skip_reading_head)
ALLOC_GROW(list, counter + 1, alloc);
list[counter] = NULL;
- mark_current_worktree(list);
return list;
}
--
2.45.1.410.g58bac47f8e.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v4 10/12] refs: implement removal of ref storages
2024-06-03 9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
` (8 preceding siblings ...)
2024-06-03 9:30 ` [PATCH v4 09/12] worktree: don't store main worktree twice Patrick Steinhardt
@ 2024-06-03 9:30 ` Patrick Steinhardt
2024-06-04 11:17 ` Karthik Nayak
2024-06-05 10:12 ` Jeff King
2024-06-03 9:31 ` [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
2024-06-03 9:31 ` [PATCH v4 12/12] builtin/refs: new command to migrate " Patrick Steinhardt
11 siblings, 2 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-03 9:30 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 8487 bytes --]
We're about to introduce logic to migrate ref storages. One part of the
migration will be to delete the files that are part of the old ref
storage format. We don't yet have a way to delete such data generically
across ref backends though.
Implement a new `delete` callback and expose it via a new
`ref_storage_delete()` function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 5 ++++
refs.h | 5 ++++
refs/files-backend.c | 63 +++++++++++++++++++++++++++++++++++++++++
refs/packed-backend.c | 15 ++++++++++
refs/refs-internal.h | 7 +++++
refs/reftable-backend.c | 52 ++++++++++++++++++++++++++++++++++
6 files changed, 147 insertions(+)
diff --git a/refs.c b/refs.c
index 66e9585767..9b112b0527 100644
--- a/refs.c
+++ b/refs.c
@@ -1861,6 +1861,11 @@ int ref_store_create_on_disk(struct ref_store *refs, int flags, struct strbuf *e
return refs->be->create_on_disk(refs, flags, err);
}
+int ref_store_remove_on_disk(struct ref_store *refs, struct strbuf *err)
+{
+ return refs->be->remove_on_disk(refs, err);
+}
+
int repo_resolve_gitlink_ref(struct repository *r,
const char *submodule, const char *refname,
struct object_id *oid)
diff --git a/refs.h b/refs.h
index 50a2b3ab09..61ee7b7a15 100644
--- a/refs.h
+++ b/refs.h
@@ -129,6 +129,11 @@ int ref_store_create_on_disk(struct ref_store *refs, int flags, struct strbuf *e
*/
void ref_store_release(struct ref_store *ref_store);
+/*
+ * Remove the ref store from disk. This deletes all associated data.
+ */
+int ref_store_remove_on_disk(struct ref_store *refs, struct strbuf *err);
+
/*
* Return the peeled value of the oid currently being iterated via
* for_each_ref(), etc. This is equivalent to calling:
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b7268b26c8..cb752d32b6 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3340,11 +3340,74 @@ static int files_ref_store_create_on_disk(struct ref_store *ref_store,
return 0;
}
+struct remove_one_root_ref_data {
+ const char *gitdir;
+ struct strbuf *err;
+};
+
+static int remove_one_root_ref(const char *refname,
+ void *cb_data)
+{
+ struct remove_one_root_ref_data *data = cb_data;
+ struct strbuf buf = STRBUF_INIT;
+ int ret = 0;
+
+ strbuf_addf(&buf, "%s/%s", data->gitdir, refname);
+
+ ret = unlink(buf.buf);
+ if (ret < 0)
+ strbuf_addf(data->err, "could not delete %s: %s\n",
+ refname, strerror(errno));
+
+ strbuf_release(&buf);
+ return ret;
+}
+
+static int files_ref_store_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct files_ref_store *refs =
+ files_downcast(ref_store, REF_STORE_WRITE, "remove");
+ struct remove_one_root_ref_data data = {
+ .gitdir = refs->base.gitdir,
+ .err = err,
+ };
+ struct strbuf sb = STRBUF_INIT;
+ int ret = 0;
+
+ strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addf(err, "could not delete refs: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/logs", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addf(err, "could not delete logs: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ ret = for_each_root_ref(refs, remove_one_root_ref, &data);
+ if (ret < 0)
+ ret = -1;
+
+ if (ref_store_remove_on_disk(refs->packed_ref_store, err) < 0)
+ ret = -1;
+
+ strbuf_release(&sb);
+ return ret;
+}
+
struct ref_storage_be refs_be_files = {
.name = "files",
.init = files_ref_store_init,
.release = files_ref_store_release,
.create_on_disk = files_ref_store_create_on_disk,
+ .remove_on_disk = files_ref_store_remove_on_disk,
.transaction_prepare = files_transaction_prepare,
.transaction_finish = files_transaction_finish,
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 2789fd92f5..c4c1e36aa2 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1,5 +1,6 @@
#include "../git-compat-util.h"
#include "../config.h"
+#include "../dir.h"
#include "../gettext.h"
#include "../hash.h"
#include "../hex.h"
@@ -1266,6 +1267,19 @@ static int packed_ref_store_create_on_disk(struct ref_store *ref_store UNUSED,
return 0;
}
+static int packed_ref_store_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct packed_ref_store *refs = packed_downcast(ref_store, 0, "remove");
+
+ if (remove_path(refs->path) < 0) {
+ strbuf_addstr(err, "could not delete packed-refs");
+ return -1;
+ }
+
+ return 0;
+}
+
/*
* Write the packed refs from the current snapshot to the packed-refs
* tempfile, incorporating any changes from `updates`. `updates` must
@@ -1724,6 +1738,7 @@ struct ref_storage_be refs_be_packed = {
.init = packed_ref_store_init,
.release = packed_ref_store_release,
.create_on_disk = packed_ref_store_create_on_disk,
+ .remove_on_disk = packed_ref_store_remove_on_disk,
.transaction_prepare = packed_transaction_prepare,
.transaction_finish = packed_transaction_finish,
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 33749fbd83..cbcb6f9c36 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -517,6 +517,12 @@ typedef int ref_store_create_on_disk_fn(struct ref_store *refs,
int flags,
struct strbuf *err);
+/*
+ * Remove the reference store from disk.
+ */
+typedef int ref_store_remove_on_disk_fn(struct ref_store *refs,
+ struct strbuf *err);
+
typedef int ref_transaction_prepare_fn(struct ref_store *refs,
struct ref_transaction *transaction,
struct strbuf *err);
@@ -649,6 +655,7 @@ struct ref_storage_be {
ref_store_init_fn *init;
ref_store_release_fn *release;
ref_store_create_on_disk_fn *create_on_disk;
+ ref_store_remove_on_disk_fn *remove_on_disk;
ref_transaction_prepare_fn *transaction_prepare;
ref_transaction_finish_fn *transaction_finish;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index bffed9257f..e555be4671 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1,6 +1,7 @@
#include "../git-compat-util.h"
#include "../abspath.h"
#include "../chdir-notify.h"
+#include "../dir.h"
#include "../environment.h"
#include "../gettext.h"
#include "../hash.h"
@@ -343,6 +344,56 @@ static int reftable_be_create_on_disk(struct ref_store *ref_store,
return 0;
}
+static int reftable_be_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct reftable_ref_store *refs =
+ reftable_be_downcast(ref_store, REF_STORE_WRITE, "remove");
+ struct strbuf sb = STRBUF_INIT;
+ int ret = 0;
+
+ /*
+ * Release the ref store such that all stacks are closed. This is
+ * required so that the "tables.list" file is not open anymore, which
+ * would otherwise make it impossible to remove the file on Windows.
+ */
+ reftable_be_release(ref_store);
+
+ strbuf_addf(&sb, "%s/reftable", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addf(err, "could not delete reftables: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/HEAD", refs->base.gitdir);
+ if (unlink(sb.buf) < 0) {
+ strbuf_addf(err, "could not delete stub HEAD: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/refs/heads", refs->base.gitdir);
+ if (unlink(sb.buf) < 0) {
+ strbuf_addf(err, "could not delete stub heads: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+ if (rmdir(sb.buf) < 0) {
+ strbuf_addf(err, "could not delete stub heads: %s",
+ strerror(errno));
+ ret = -1;
+ }
+
+ strbuf_release(&sb);
+ return ret;
+}
+
struct reftable_ref_iterator {
struct ref_iterator base;
struct reftable_ref_store *refs;
@@ -2196,6 +2247,7 @@ struct ref_storage_be refs_be_reftable = {
.init = reftable_be_init,
.release = reftable_be_release,
.create_on_disk = reftable_be_create_on_disk,
+ .remove_on_disk = reftable_be_remove_on_disk,
.transaction_prepare = reftable_be_transaction_prepare,
.transaction_finish = reftable_be_transaction_finish,
--
2.45.1.410.g58bac47f8e.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH v4 10/12] refs: implement removal of ref storages
2024-06-03 9:30 ` [PATCH v4 10/12] refs: implement removal of ref storages Patrick Steinhardt
@ 2024-06-04 11:17 ` Karthik Nayak
2024-06-05 10:12 ` Jeff King
1 sibling, 0 replies; 103+ messages in thread
From: Karthik Nayak @ 2024-06-04 11:17 UTC (permalink / raw)
To: Patrick Steinhardt, git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 2117 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
[snip]
> diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
> index bffed9257f..e555be4671 100644
> --- a/refs/reftable-backend.c
> +++ b/refs/reftable-backend.c
> @@ -1,6 +1,7 @@
> #include "../git-compat-util.h"
> #include "../abspath.h"
> #include "../chdir-notify.h"
> +#include "../dir.h"
> #include "../environment.h"
> #include "../gettext.h"
> #include "../hash.h"
> @@ -343,6 +344,56 @@ static int reftable_be_create_on_disk(struct ref_store *ref_store,
> return 0;
> }
>
> +static int reftable_be_remove_on_disk(struct ref_store *ref_store,
> + struct strbuf *err)
> +{
> + struct reftable_ref_store *refs =
> + reftable_be_downcast(ref_store, REF_STORE_WRITE, "remove");
> + struct strbuf sb = STRBUF_INIT;
> + int ret = 0;
> +
> + /*
> + * Release the ref store such that all stacks are closed. This is
> + * required so that the "tables.list" file is not open anymore, which
> + * would otherwise make it impossible to remove the file on Windows.
> + */
> + reftable_be_release(ref_store);
> +
> + strbuf_addf(&sb, "%s/reftable", refs->base.gitdir);
> + if (remove_dir_recursively(&sb, 0) < 0) {
> + strbuf_addf(err, "could not delete reftables: %s",
> + strerror(errno));
> + ret = -1;
> + }
> + strbuf_reset(&sb);
> +
> + strbuf_addf(&sb, "%s/HEAD", refs->base.gitdir);
> + if (unlink(sb.buf) < 0) {
> + strbuf_addf(err, "could not delete stub HEAD: %s",
> + strerror(errno));
> + ret = -1;
> + }
> + strbuf_reset(&sb);
> +
> + strbuf_addf(&sb, "%s/refs/heads", refs->base.gitdir);
> + if (unlink(sb.buf) < 0) {
> + strbuf_addf(err, "could not delete stub heads: %s",
> + strerror(errno));
> + ret = -1;
> + }
> + strbuf_reset(&sb);
> +
> + strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
> + if (rmdir(sb.buf) < 0) {
> + strbuf_addf(err, "could not delete stub heads: %s",
Nit: Wouldn't it be nicer to be able to differentiate this from the
previous case? Both have the same error message.
Otherwise this patch looks good, since we use unlink(2), that handles
symrefs which are symbolic links too.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v4 10/12] refs: implement removal of ref storages
2024-06-03 9:30 ` [PATCH v4 10/12] refs: implement removal of ref storages Patrick Steinhardt
2024-06-04 11:17 ` Karthik Nayak
@ 2024-06-05 10:12 ` Jeff King
2024-06-05 16:54 ` Junio C Hamano
2024-06-06 4:51 ` Patrick Steinhardt
1 sibling, 2 replies; 103+ messages in thread
From: Jeff King @ 2024-06-05 10:12 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
On Mon, Jun 03, 2024 at 11:30:55AM +0200, Patrick Steinhardt wrote:
> +static int files_ref_store_remove_on_disk(struct ref_store *ref_store,
> + struct strbuf *err)
> +{
> [...]
> + strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
> + if (remove_dir_recursively(&sb, 0) < 0) {
> + strbuf_addf(err, "could not delete refs: %s",
> + strerror(errno));
> + ret = -1;
> + }
> + strbuf_reset(&sb);
> +
> + strbuf_addf(&sb, "%s/logs", refs->base.gitdir);
> + if (remove_dir_recursively(&sb, 0) < 0) {
> + strbuf_addf(err, "could not delete logs: %s",
> + strerror(errno));
> + ret = -1;
> + }
> + strbuf_reset(&sb);
If removing either of the directories fails, we set ret to "-1". Make
sense. But...
> + ret = for_each_root_ref(refs, remove_one_root_ref, &data);
> + if (ret < 0)
> + ret = -1;
...then we unconditionally overwrite it, forgetting the earlier error.
Either we should jump to the end on the first failure, or if the goal is
to do as much as possible, should we |= the result? I'm not clear why we
assign "ret" and then immediately check it to assign "-1" again.
Is that a mistake, or are we normalizing other negative values? Maybe
just:
if (for_each_root_ref(refs, remove_one_root_ref, &data) < 0)
ret = -1;
would work?
-Peff
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v4 10/12] refs: implement removal of ref storages
2024-06-05 10:12 ` Jeff King
@ 2024-06-05 16:54 ` Junio C Hamano
2024-06-06 4:51 ` Patrick Steinhardt
1 sibling, 0 replies; 103+ messages in thread
From: Junio C Hamano @ 2024-06-05 16:54 UTC (permalink / raw)
To: Jeff King
Cc: Patrick Steinhardt, git, Eric Sunshine, Ramsay Jones,
Justin Tobler
Jeff King <peff@peff.net> writes:
> If removing either of the directories fails, we set ret to "-1". Make
> sense. But...
>
>> + ret = for_each_root_ref(refs, remove_one_root_ref, &data);
>> + if (ret < 0)
>> + ret = -1;
>
> ...then we unconditionally overwrite it, forgetting the earlier error.
Ouch.
> Is that a mistake, or are we normalizing other negative values? Maybe
> just:
>
> if (for_each_root_ref(refs, remove_one_root_ref, &data) < 0)
> ret = -1;
>
> would work?
Sounds sensible.
Thanks for carefully reading.
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v4 10/12] refs: implement removal of ref storages
2024-06-05 10:12 ` Jeff King
2024-06-05 16:54 ` Junio C Hamano
@ 2024-06-06 4:51 ` Patrick Steinhardt
1 sibling, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 4:51 UTC (permalink / raw)
To: Jeff King; +Cc: git, Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 1492 bytes --]
On Wed, Jun 05, 2024 at 06:12:00AM -0400, Jeff King wrote:
> On Mon, Jun 03, 2024 at 11:30:55AM +0200, Patrick Steinhardt wrote:
>
> > +static int files_ref_store_remove_on_disk(struct ref_store *ref_store,
> > + struct strbuf *err)
> > +{
> > [...]
> > + strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
> > + if (remove_dir_recursively(&sb, 0) < 0) {
> > + strbuf_addf(err, "could not delete refs: %s",
> > + strerror(errno));
> > + ret = -1;
> > + }
> > + strbuf_reset(&sb);
> > +
> > + strbuf_addf(&sb, "%s/logs", refs->base.gitdir);
> > + if (remove_dir_recursively(&sb, 0) < 0) {
> > + strbuf_addf(err, "could not delete logs: %s",
> > + strerror(errno));
> > + ret = -1;
> > + }
> > + strbuf_reset(&sb);
>
> If removing either of the directories fails, we set ret to "-1". Make
> sense. But...
>
> > + ret = for_each_root_ref(refs, remove_one_root_ref, &data);
> > + if (ret < 0)
> > + ret = -1;
>
> ...then we unconditionally overwrite it, forgetting the earlier error.
> Either we should jump to the end on the first failure, or if the goal is
> to do as much as possible, should we |= the result? I'm not clear why we
> assign "ret" and then immediately check it to assign "-1" again.
>
> Is that a mistake, or are we normalizing other negative values? Maybe
> just:
>
> if (for_each_root_ref(refs, remove_one_root_ref, &data) < 0)
> ret = -1;
>
> would work?
Yup, that would work, good catch.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats
2024-06-03 9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
` (9 preceding siblings ...)
2024-06-03 9:30 ` [PATCH v4 10/12] refs: implement removal of ref storages Patrick Steinhardt
@ 2024-06-03 9:31 ` Patrick Steinhardt
2024-06-04 15:28 ` Karthik Nayak
2024-06-05 10:03 ` Jeff King
2024-06-03 9:31 ` [PATCH v4 12/12] builtin/refs: new command to migrate " Patrick Steinhardt
11 siblings, 2 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-03 9:31 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 11939 bytes --]
With the introduction of the new "reftable" backend, users may want to
migrate repositories between the backends without having to recreate the
whole repository. Add the logic to do so.
The implementation is generic and works with arbitrary ref storage
formats so that a backend does not need to implement any migration
logic. It does have a few limitations though:
- We do not migrate repositories with worktrees, because worktrees
have separate ref storages. It makes the overall affair more complex
if we have to migrate multiple storages at once.
- We do not migrate reflogs, because we have no interfaces to write
many reflog entries.
- We do not lock the repository for concurrent access, and thus
concurrent writes may end up with weird in-between states. There is
no way to fully lock the "files" backend for writes due to its
format, and thus we punt on this topic altogether and defer to the
user to avoid those from happening.
In other words, this version is a minimum viable product for migrating a
repository's ref storage format. It works alright for bare repos, which
often have neither worktrees nor reflogs. But it will not work for many
other repositories without some preparations. These limitations are not
set into stone though, and ideally we will eventually address them over
time.
The logic is not yet used by anything, and thus there are no tests for
it. Those will be added in the next commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 305 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
refs.h | 18 ++++
2 files changed, 323 insertions(+)
diff --git a/refs.c b/refs.c
index 9b112b0527..f7c7765d23 100644
--- a/refs.c
+++ b/refs.c
@@ -2570,3 +2570,308 @@ int ref_update_check_old_target(const char *referent, struct ref_update *update,
referent, update->old_target);
return -1;
}
+
+struct migration_data {
+ struct ref_store *old_refs;
+ struct ref_transaction *transaction;
+ struct strbuf *errbuf;
+};
+
+static int migrate_one_ref(const char *refname, const struct object_id *oid,
+ int flags, void *cb_data)
+{
+ struct migration_data *data = cb_data;
+ struct strbuf symref_target = STRBUF_INIT;
+ int ret;
+
+ if (flags & REF_ISSYMREF) {
+ ret = refs_read_symbolic_ref(data->old_refs, refname, &symref_target);
+ if (ret < 0)
+ goto done;
+
+ ret = ref_transaction_update(data->transaction, refname, NULL, null_oid(),
+ symref_target.buf, NULL,
+ REF_SKIP_CREATE_REFLOG | REF_NO_DEREF, NULL, data->errbuf);
+ if (ret < 0)
+ goto done;
+ } else {
+ ret = ref_transaction_create(data->transaction, refname, oid,
+ REF_SKIP_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION,
+ NULL, data->errbuf);
+ if (ret < 0)
+ goto done;
+ }
+
+done:
+ strbuf_release(&symref_target);
+ return ret;
+}
+
+static int move_files(const char *from_path, const char *to_path, struct strbuf *errbuf)
+{
+ struct strbuf from_buf = STRBUF_INIT, to_buf = STRBUF_INIT;
+ size_t from_len, to_len;
+ DIR *from_dir;
+ int ret;
+
+ from_dir = opendir(from_path);
+ if (!from_dir) {
+ strbuf_addf(errbuf, "could not open source directory '%s': %s",
+ from_path, strerror(errno));
+ ret = -1;
+ goto done;
+ }
+
+ strbuf_addstr(&from_buf, from_path);
+ strbuf_complete(&from_buf, '/');
+ from_len = from_buf.len;
+
+ strbuf_addstr(&to_buf, to_path);
+ strbuf_complete(&to_buf, '/');
+ to_len = to_buf.len;
+
+ while (1) {
+ struct dirent *ent;
+
+ errno = 0;
+ ent = readdir(from_dir);
+ if (!ent)
+ break;
+
+ if (!strcmp(ent->d_name, ".") ||
+ !strcmp(ent->d_name, ".."))
+ continue;
+
+ strbuf_setlen(&from_buf, from_len);
+ strbuf_addstr(&from_buf, ent->d_name);
+
+ strbuf_setlen(&to_buf, to_len);
+ strbuf_addstr(&to_buf, ent->d_name);
+
+ ret = rename(from_buf.buf, to_buf.buf);
+ if (ret < 0) {
+ strbuf_addf(errbuf, "could not link file '%s' to '%s': %s",
+ from_buf.buf, to_buf.buf, strerror(errno));
+ goto done;
+ }
+ }
+
+ if (errno) {
+ strbuf_addf(errbuf, "could not read entry from directory '%s': %s",
+ from_path, strerror(errno));
+ ret = -1;
+ goto done;
+ }
+
+ ret = 0;
+
+done:
+ strbuf_release(&from_buf);
+ strbuf_release(&to_buf);
+ if (from_dir)
+ closedir(from_dir);
+ return ret;
+}
+
+static int count_reflogs(const char *reflog UNUSED, void *payload)
+{
+ size_t *reflog_count = payload;
+ (*reflog_count)++;
+ return 0;
+}
+
+static int has_worktrees(void)
+{
+ struct worktree **worktrees = get_worktrees();
+ int ret = 0;
+ size_t i;
+
+ for (i = 0; worktrees[i]; i++) {
+ if (is_main_worktree(worktrees[i]))
+ continue;
+ ret = 1;
+ }
+
+ free_worktrees(worktrees);
+ return ret;
+}
+
+int repo_migrate_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format,
+ unsigned int flags,
+ struct strbuf *errbuf)
+{
+ struct ref_store *old_refs = NULL, *new_refs = NULL;
+ struct ref_transaction *transaction = NULL;
+ struct strbuf buf = STRBUF_INIT;
+ struct migration_data data;
+ size_t reflog_count = 0;
+ char *new_gitdir = NULL;
+ int did_migrate_refs = 0;
+ int ret;
+
+ old_refs = get_main_ref_store(repo);
+
+ /*
+ * We do not have any interfaces that would allow us to write many
+ * reflog entries. Once we have them we can remove this restriction.
+ */
+ if (refs_for_each_reflog(old_refs, count_reflogs, &reflog_count) < 0) {
+ strbuf_addstr(errbuf, "cannot count reflogs");
+ ret = -1;
+ goto done;
+ }
+ if (reflog_count) {
+ strbuf_addstr(errbuf, "migrating reflogs is not supported yet");
+ ret = -1;
+ goto done;
+ }
+
+ /*
+ * Worktrees complicate the migration because every worktree has a
+ * separate ref storage. While it should be feasible to implement, this
+ * is pushed out to a future iteration.
+ *
+ * TODO: we should really be passing the caller-provided repository to
+ * `has_worktrees()`, but our worktree subsystem doesn't yet support
+ * that.
+ */
+ if (has_worktrees()) {
+ strbuf_addstr(errbuf, "migrating repositories with worktrees is not supported yet");
+ ret = -1;
+ goto done;
+ }
+
+ /*
+ * The overall logic looks like this:
+ *
+ * 1. Set up a new temporary directory and initialize it with the new
+ * format. This is where all refs will be migrated into.
+ *
+ * 2. Enumerate all refs and write them into the new ref storage.
+ * This operation is safe as we do not yet modify the main
+ * repository.
+ *
+ * 3. If we're in dry-run mode then we are done and can hand over the
+ * directory to the caller for inspection. If not, we now start
+ * with the destructive part.
+ *
+ * 4. Delete the old ref storage from disk. As we have a copy of refs
+ * in the new ref storage it's okay(ish) if we now get interrupted
+ * as there is an equivalent copy of all refs available.
+ *
+ * 5. Move the new ref storage files into place.
+ *
+ * 6. Change the repository format to the new ref format.
+ */
+ strbuf_addf(&buf, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
+ new_gitdir = mkdtemp(xstrdup(buf.buf));
+ if (!new_gitdir) {
+ strbuf_addf(errbuf, "cannot create migration directory: %s",
+ strerror(errno));
+ ret = -1;
+ goto done;
+ }
+
+ new_refs = ref_store_init(repo, format, new_gitdir,
+ REF_STORE_ALL_CAPS);
+ ret = ref_store_create_on_disk(new_refs, 0, errbuf);
+ if (ret < 0)
+ goto done;
+
+ transaction = ref_store_transaction_begin(new_refs, errbuf);
+ if (!transaction)
+ goto done;
+
+ data.old_refs = old_refs;
+ data.transaction = transaction;
+ data.errbuf = errbuf;
+
+ /*
+ * We need to use the internal `do_for_each_ref()` here so that we can
+ * also include broken refs and symrefs. These would otherwise be
+ * skipped silently.
+ *
+ * Ideally, we would do this call while locking the old ref storage
+ * such that there cannot be any concurrent modifications. We do not
+ * have the infra for that though, and the "files" backend does not
+ * allow for a central lock due to its design. It's thus on the user to
+ * ensure that there are no concurrent writes.
+ */
+ ret = do_for_each_ref(old_refs, "", NULL, migrate_one_ref, 0,
+ DO_FOR_EACH_INCLUDE_ROOT_REFS | DO_FOR_EACH_INCLUDE_BROKEN,
+ &data);
+ if (ret < 0)
+ goto done;
+
+ /*
+ * TODO: we might want to migrate to `initial_ref_transaction_commit()`
+ * here, which is more efficient for the files backend because it would
+ * write new refs into the packed-refs file directly. At this point,
+ * the files backend doesn't handle pseudo-refs and symrefs correctly
+ * though, so this requires some more work.
+ */
+ ret = ref_transaction_commit(transaction, errbuf);
+ if (ret < 0)
+ goto done;
+ did_migrate_refs = 1;
+
+ if (flags & REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN) {
+ printf(_("Finished dry-run migration of refs, "
+ "the result can be found at '%s'\n"), new_gitdir);
+ ret = 0;
+ goto done;
+ }
+
+ /*
+ * Until now we were in the non-destructive phase, where we only
+ * populated the new ref store. From hereon though we are about
+ * to get hands by deleting the old ref store and then moving
+ * the new one into place.
+ *
+ * Assuming that there were no concurrent writes, the new ref
+ * store should have all information. So if we fail from hereon
+ * we may be in an in-between state, but it would still be able
+ * to recover by manually moving remaining files from the
+ * temporary migration directory into place.
+ */
+ ret = ref_store_remove_on_disk(old_refs, errbuf);
+ if (ret < 0)
+ goto done;
+
+ ret = move_files(new_gitdir, old_refs->gitdir, errbuf);
+ if (ret < 0)
+ goto done;
+
+ if (rmdir(new_gitdir) < 0)
+ warning_errno(_("could not remove temporary migration directory '%s'"),
+ new_gitdir);
+
+ /*
+ * We have migrated the repository, so we now need to adjust the
+ * repository format so that clients will use the new ref store.
+ * We also need to swap out the repository's main ref store.
+ */
+ initialize_repository_version(hash_algo_by_ptr(repo->hash_algo), format, 1);
+
+ free(new_refs->gitdir);
+ new_refs->gitdir = xstrdup(old_refs->gitdir);
+ repo->refs_private = new_refs;
+ ref_store_release(old_refs);
+
+ ret = 0;
+
+done:
+ if (ret && did_migrate_refs) {
+ strbuf_complete(errbuf, '\n');
+ strbuf_addf(errbuf, _("migrated refs can be found at '%s'"),
+ new_gitdir);
+ }
+
+ if (ret && new_refs)
+ ref_store_release(new_refs);
+ ref_transaction_free(transaction);
+ strbuf_release(&buf);
+ free(new_gitdir);
+ return ret;
+}
diff --git a/refs.h b/refs.h
index 61ee7b7a15..76d25df4de 100644
--- a/refs.h
+++ b/refs.h
@@ -1070,6 +1070,24 @@ int is_root_ref(const char *refname);
*/
int is_pseudo_ref(const char *refname);
+/*
+ * The following flags can be passed to `repo_migrate_ref_storage_format()`:
+ *
+ * - REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN: perform a dry-run migration
+ * without touching the main repository. The result will be written into a
+ * temporary ref storage directory.
+ */
+#define REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN (1 << 0)
+
+/*
+ * Migrate the ref storage format used by the repository to the
+ * specified one.
+ */
+int repo_migrate_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format,
+ unsigned int flags,
+ struct strbuf *err);
+
/*
* The following functions have been removed in Git v2.45 in favor of functions
* that receive a `ref_store` as parameter. The intent of this section is
--
2.45.1.410.g58bac47f8e.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats
2024-06-03 9:31 ` [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
@ 2024-06-04 15:28 ` Karthik Nayak
2024-06-05 5:52 ` Patrick Steinhardt
2024-06-05 10:03 ` Jeff King
1 sibling, 1 reply; 103+ messages in thread
From: Karthik Nayak @ 2024-06-04 15:28 UTC (permalink / raw)
To: Patrick Steinhardt, git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 7054 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
[snip]
> diff --git a/refs.c b/refs.c
> index 9b112b0527..f7c7765d23 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -2570,3 +2570,308 @@ int ref_update_check_old_target(const char *referent, struct ref_update *update,
> referent, update->old_target);
> return -1;
> }
> +
> +struct migration_data {
> + struct ref_store *old_refs;
> + struct ref_transaction *transaction;
> + struct strbuf *errbuf;
> +};
> +
> +static int migrate_one_ref(const char *refname, const struct object_id *oid,
> + int flags, void *cb_data)
> +{
> + struct migration_data *data = cb_data;
> + struct strbuf symref_target = STRBUF_INIT;
> + int ret;
> +
> + if (flags & REF_ISSYMREF) {
> + ret = refs_read_symbolic_ref(data->old_refs, refname, &symref_target);
> + if (ret < 0)
> + goto done;
> +
> + ret = ref_transaction_update(data->transaction, refname, NULL, null_oid(),
> + symref_target.buf, NULL,
> + REF_SKIP_CREATE_REFLOG | REF_NO_DEREF, NULL, data->errbuf);
> + if (ret < 0)
> + goto done;
> + } else {
> + ret = ref_transaction_create(data->transaction, refname, oid,
> + REF_SKIP_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION,
> + NULL, data->errbuf);
> + if (ret < 0)
> + goto done;
> + }
I was a little perplexed about the first scenario being
`ref_transaction_update` and the second being `ref_transaction_create`,
I then realized that this is because the latter doesn't support creating
symrefs yet (changes in my series kn/update-ref-symref), makes sense to
do it this way.
[snip]
> +int repo_migrate_ref_storage_format(struct repository *repo,
> + enum ref_storage_format format,
> + unsigned int flags,
> + struct strbuf *errbuf)
> +{
> + struct ref_store *old_refs = NULL, *new_refs = NULL;
> + struct ref_transaction *transaction = NULL;
> + struct strbuf buf = STRBUF_INIT;
> + struct migration_data data;
> + size_t reflog_count = 0;
> + char *new_gitdir = NULL;
> + int did_migrate_refs = 0;
> + int ret;
> +
> + old_refs = get_main_ref_store(repo);
Should we add a check to ensure the `old_refs->repo->ref_storage_format`
and `format` are different?
> +
> + /*
> + * We do not have any interfaces that would allow us to write many
> + * reflog entries. Once we have them we can remove this restriction.
> + */
> + if (refs_for_each_reflog(old_refs, count_reflogs, &reflog_count) < 0) {
> + strbuf_addstr(errbuf, "cannot count reflogs");
> + ret = -1;
> + goto done;
> + }
> + if (reflog_count) {
> + strbuf_addstr(errbuf, "migrating reflogs is not supported yet");
> + ret = -1;
> + goto done;
> + }
Isn't this restrictive? It would be nice to perhaps say "git refs
migrate --ignore-reflogs", which could make it possible to not care
about reflogs. But maybe that can be part of a follow up.
> + /*
> + * Worktrees complicate the migration because every worktree has a
> + * separate ref storage. While it should be feasible to implement, this
> + * is pushed out to a future iteration.
> + *
> + * TODO: we should really be passing the caller-provided repository to
> + * `has_worktrees()`, but our worktree subsystem doesn't yet support
> + * that.
> + */
> + if (has_worktrees()) {
> + strbuf_addstr(errbuf, "migrating repositories with worktrees is not supported yet");
> + ret = -1;
> + goto done;
> + }
> +
Same as above.
> + /*
> + * The overall logic looks like this:
> + *
> + * 1. Set up a new temporary directory and initialize it with the new
> + * format. This is where all refs will be migrated into.
> + *
> + * 2. Enumerate all refs and write them into the new ref storage.
> + * This operation is safe as we do not yet modify the main
> + * repository.
> + *
> + * 3. If we're in dry-run mode then we are done and can hand over the
> + * directory to the caller for inspection. If not, we now start
> + * with the destructive part.
> + *
> + * 4. Delete the old ref storage from disk. As we have a copy of refs
> + * in the new ref storage it's okay(ish) if we now get interrupted
> + * as there is an equivalent copy of all refs available.
> + *
> + * 5. Move the new ref storage files into place.
> + *
> + * 6. Change the repository format to the new ref format.
> + */
> + strbuf_addf(&buf, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
> + new_gitdir = mkdtemp(xstrdup(buf.buf));
> + if (!new_gitdir) {
> + strbuf_addf(errbuf, "cannot create migration directory: %s",
> + strerror(errno));
> + ret = -1;
> + goto done;
> + }
> +
> + new_refs = ref_store_init(repo, format, new_gitdir,
> + REF_STORE_ALL_CAPS);
> + ret = ref_store_create_on_disk(new_refs, 0, errbuf);
> + if (ret < 0)
> + goto done;
> +
> + transaction = ref_store_transaction_begin(new_refs, errbuf);
> + if (!transaction)
> + goto done;
> +
> + data.old_refs = old_refs;
> + data.transaction = transaction;
> + data.errbuf = errbuf;
> +
> + /*
> + * We need to use the internal `do_for_each_ref()` here so that we can
> + * also include broken refs and symrefs. These would otherwise be
> + * skipped silently.
> + *
> + * Ideally, we would do this call while locking the old ref storage
> + * such that there cannot be any concurrent modifications. We do not
> + * have the infra for that though, and the "files" backend does not
> + * allow for a central lock due to its design. It's thus on the user to
> + * ensure that there are no concurrent writes.
> + */
> + ret = do_for_each_ref(old_refs, "", NULL, migrate_one_ref, 0,
> + DO_FOR_EACH_INCLUDE_ROOT_REFS | DO_FOR_EACH_INCLUDE_BROKEN,
> + &data);
> + if (ret < 0)
> + goto done;
> +
> + /*
> + * TODO: we might want to migrate to `initial_ref_transaction_commit()`
> + * here, which is more efficient for the files backend because it would
> + * write new refs into the packed-refs file directly. At this point,
> + * the files backend doesn't handle pseudo-refs and symrefs correctly
> + * though, so this requires some more work.
> + */
> + ret = ref_transaction_commit(transaction, errbuf);
> + if (ret < 0)
> + goto done;
> + did_migrate_refs = 1;
> +
> + if (flags & REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN) {
> + printf(_("Finished dry-run migration of refs, "
> + "the result can be found at '%s'\n"), new_gitdir);
> + ret = 0;
> + goto done;
> + }
> +
> + /*
> + * Until now we were in the non-destructive phase, where we only
> + * populated the new ref store. From hereon though we are about
> + * to get hands by deleting the old ref store and then moving
> + * the new one into place.
> + *
> + * Assuming that there were no concurrent writes, the new ref
> + * store should have all information. So if we fail from hereon
> + * we may be in an in-between state, but it would still be able
> + * to recover by manually moving remaining files from the
> + * temporary migration directory into place.
> + */
This also means that the recovery would only be possible into the new
format. Makes sense.
[snip]
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats
2024-06-04 15:28 ` Karthik Nayak
@ 2024-06-05 5:52 ` Patrick Steinhardt
0 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-05 5:52 UTC (permalink / raw)
To: Karthik Nayak
Cc: git, Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 2810 bytes --]
On Tue, Jun 04, 2024 at 03:28:07PM +0000, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> > +int repo_migrate_ref_storage_format(struct repository *repo,
> > + enum ref_storage_format format,
> > + unsigned int flags,
> > + struct strbuf *errbuf)
> > +{
> > + struct ref_store *old_refs = NULL, *new_refs = NULL;
> > + struct ref_transaction *transaction = NULL;
> > + struct strbuf buf = STRBUF_INIT;
> > + struct migration_data data;
> > + size_t reflog_count = 0;
> > + char *new_gitdir = NULL;
> > + int did_migrate_refs = 0;
> > + int ret;
> > +
> > + old_refs = get_main_ref_store(repo);
>
> Should we add a check to ensure the `old_refs->repo->ref_storage_format`
> and `format` are different?
Hm, yeah. We do have that check in git-refs(1), but having it here as
well wouldn't hurt. As the patch series has been merged to `next`, I'll
leave this for a future iteration though. Probably the one where I
implement support for migrating reflogs.
> > +
> > + /*
> > + * We do not have any interfaces that would allow us to write many
> > + * reflog entries. Once we have them we can remove this restriction.
> > + */
> > + if (refs_for_each_reflog(old_refs, count_reflogs, &reflog_count) < 0) {
> > + strbuf_addstr(errbuf, "cannot count reflogs");
> > + ret = -1;
> > + goto done;
> > + }
> > + if (reflog_count) {
> > + strbuf_addstr(errbuf, "migrating reflogs is not supported yet");
> > + ret = -1;
> > + goto done;
> > + }
>
> Isn't this restrictive? It would be nice to perhaps say "git refs
> migrate --ignore-reflogs", which could make it possible to not care
> about reflogs. But maybe that can be part of a follow up.
Oh yeah, it is. In this case it would be possible to add a flag to
override this check, because the result would be that we simply discard
all reflogs altogether. But I don't think adding such a flag makes
sense, because I'd much rather want to remove the underlying restriction
itself and start handling the migration of reflogs.
> > + /*
> > + * Worktrees complicate the migration because every worktree has a
> > + * separate ref storage. While it should be feasible to implement, this
> > + * is pushed out to a future iteration.
> > + *
> > + * TODO: we should really be passing the caller-provided repository to
> > + * `has_worktrees()`, but our worktree subsystem doesn't yet support
> > + * that.
> > + */
> > + if (has_worktrees()) {
> > + strbuf_addstr(errbuf, "migrating repositories with worktrees is not supported yet");
> > + ret = -1;
> > + goto done;
> > + }
> > +
>
> Same as above.
Allowing users to override this would leave them with broken worktree
refdbs, so I don't think we should add such a flag, either.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats
2024-06-03 9:31 ` [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
2024-06-04 15:28 ` Karthik Nayak
@ 2024-06-05 10:03 ` Jeff King
2024-06-05 16:59 ` Junio C Hamano
2024-06-06 4:51 ` Patrick Steinhardt
1 sibling, 2 replies; 103+ messages in thread
From: Jeff King @ 2024-06-05 10:03 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
On Mon, Jun 03, 2024 at 11:31:00AM +0200, Patrick Steinhardt wrote:
> +int repo_migrate_ref_storage_format(struct repository *repo,
> + enum ref_storage_format format,
> + unsigned int flags,
> + struct strbuf *errbuf)
> +{
> [...]
> + new_gitdir = mkdtemp(xstrdup(buf.buf));
> + if (!new_gitdir) {
> + strbuf_addf(errbuf, "cannot create migration directory: %s",
> + strerror(errno));
> + ret = -1;
> + goto done;
> + }
Coverity complains here of a leak of the xstrdup(). The return from
mkdtemp() should generally point to the same buffer we passed in, but if
it sees an error it will return NULL and the new heap buffer will be
lost.
Probably unlikely, but since you are on a leak-checking kick, I thought
I'd mention it. ;)
Since you have a writable strbuf already, maybe:
new_gitdir = mkdtemp(buf.buf);
if (!new_gitdir)
...
new_gitdir = strbuf_detach(&buf, NULL); /* same pointer, but now we own it */
Or since "buf" is not used for anything else, we could just leave it
attached to the strbuf. And probably give it a better name. Maybe:
diff --git a/refs.c b/refs.c
index 166b6f269e..9a6655abee 100644
--- a/refs.c
+++ b/refs.c
@@ -2726,10 +2726,9 @@ int repo_migrate_ref_storage_format(struct repository *repo,
{
struct ref_store *old_refs = NULL, *new_refs = NULL;
struct ref_transaction *transaction = NULL;
- struct strbuf buf = STRBUF_INIT;
+ struct strbuf new_gitdir = STRBUF_INIT;
struct migration_data data;
size_t reflog_count = 0;
- char *new_gitdir = NULL;
int did_migrate_refs = 0;
int ret;
@@ -2787,16 +2786,15 @@ int repo_migrate_ref_storage_format(struct repository *repo,
*
* 6. Change the repository format to the new ref format.
*/
- strbuf_addf(&buf, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
- new_gitdir = mkdtemp(xstrdup(buf.buf));
- if (!new_gitdir) {
+ strbuf_addf(&new_gitdir, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
+ if (!mkdtemp(new_gitdir.buf)) {
strbuf_addf(errbuf, "cannot create migration directory: %s",
strerror(errno));
ret = -1;
goto done;
}
- new_refs = ref_store_init(repo, format, new_gitdir,
+ new_refs = ref_store_init(repo, format, new_gitdir.buf,
REF_STORE_ALL_CAPS);
ret = ref_store_create_on_disk(new_refs, 0, errbuf);
if (ret < 0)
@@ -2841,7 +2839,7 @@ int repo_migrate_ref_storage_format(struct repository *repo,
if (flags & REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN) {
printf(_("Finished dry-run migration of refs, "
- "the result can be found at '%s'\n"), new_gitdir);
+ "the result can be found at '%s'\n"), new_gitdir.buf);
ret = 0;
goto done;
}
@@ -2862,13 +2860,13 @@ int repo_migrate_ref_storage_format(struct repository *repo,
if (ret < 0)
goto done;
- ret = move_files(new_gitdir, old_refs->gitdir, errbuf);
+ ret = move_files(new_gitdir.buf, old_refs->gitdir, errbuf);
if (ret < 0)
goto done;
- if (rmdir(new_gitdir) < 0)
+ if (rmdir(new_gitdir.buf) < 0)
warning_errno(_("could not remove temporary migration directory '%s'"),
- new_gitdir);
+ new_gitdir.buf);
/*
* We have migrated the repository, so we now need to adjust the
@@ -2888,13 +2886,12 @@ int repo_migrate_ref_storage_format(struct repository *repo,
if (ret && did_migrate_refs) {
strbuf_complete(errbuf, '\n');
strbuf_addf(errbuf, _("migrated refs can be found at '%s'"),
- new_gitdir);
+ new_gitdir.buf);
}
if (ret && new_refs)
ref_store_release(new_refs);
ref_transaction_free(transaction);
- strbuf_release(&buf);
- free(new_gitdir);
+ strbuf_release(&new_gitdir);
return ret;
}
^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats
2024-06-05 10:03 ` Jeff King
@ 2024-06-05 16:59 ` Junio C Hamano
2024-06-06 4:51 ` Patrick Steinhardt
2024-06-06 4:51 ` Patrick Steinhardt
1 sibling, 1 reply; 103+ messages in thread
From: Junio C Hamano @ 2024-06-05 16:59 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: Jeff King, git, Eric Sunshine, Ramsay Jones, Justin Tobler
Jeff King <peff@peff.net> writes:
> Coverity complains here of a leak of the xstrdup(). The return from
> mkdtemp() should generally point to the same buffer we passed in, but if
> it sees an error it will return NULL and the new heap buffer will be
> lost.
>
> Probably unlikely, but since you are on a leak-checking kick, I thought
> I'd mention it. ;)
>
> Since you have a writable strbuf already, maybe:
>
> new_gitdir = mkdtemp(buf.buf);
> if (!new_gitdir)
> ...
> new_gitdir = strbuf_detach(&buf, NULL); /* same pointer, but now we own it */
>
> Or since "buf" is not used for anything else, we could just leave it
> attached to the strbuf. And probably give it a better name. Maybe:
> ...
Hmph, I think this is the second one we want to amend on the topic
and it seems that I merged it a bit too prematurely.
I do not mind reverting the topic out of 'next' and actually would
prefer replacing it with a corrected version, which would allow us
to merge the clean copy to the next release.
Thanks.
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats
2024-06-05 16:59 ` Junio C Hamano
@ 2024-06-06 4:51 ` Patrick Steinhardt
2024-06-06 7:01 ` Jeff King
0 siblings, 1 reply; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 4:51 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, git, Eric Sunshine, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 1655 bytes --]
On Wed, Jun 05, 2024 at 09:59:14AM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > Coverity complains here of a leak of the xstrdup(). The return from
> > mkdtemp() should generally point to the same buffer we passed in, but if
> > it sees an error it will return NULL and the new heap buffer will be
> > lost.
> >
> > Probably unlikely, but since you are on a leak-checking kick, I thought
> > I'd mention it. ;)
> >
> > Since you have a writable strbuf already, maybe:
> >
> > new_gitdir = mkdtemp(buf.buf);
> > if (!new_gitdir)
> > ...
> > new_gitdir = strbuf_detach(&buf, NULL); /* same pointer, but now we own it */
> >
> > Or since "buf" is not used for anything else, we could just leave it
> > attached to the strbuf. And probably give it a better name. Maybe:
> > ...
>
> Hmph, I think this is the second one we want to amend on the topic
> and it seems that I merged it a bit too prematurely.
>
> I do not mind reverting the topic out of 'next' and actually would
> prefer replacing it with a corrected version, which would allow us
> to merge the clean copy to the next release.
I wouldn't exactly say prematurely, given that it likely wouldn't have
gotten a review without the merge because it was spurred by Coverity :)
I really wish that the Coverity tooling was available to run at will and
locally in our pipelines so that we can stop reacting to it, but instead
address whatever it flags _before_ the code hits the target branch. But,
well, that's not how Coverity works.
Anyway, I'll send a revised version in a bit. Thanks for your extra
review, Peff!
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats
2024-06-06 4:51 ` Patrick Steinhardt
@ 2024-06-06 7:01 ` Jeff King
2024-06-06 15:41 ` Junio C Hamano
0 siblings, 1 reply; 103+ messages in thread
From: Jeff King @ 2024-06-06 7:01 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: Junio C Hamano, git, Eric Sunshine, Ramsay Jones, Justin Tobler
On Thu, Jun 06, 2024 at 06:51:15AM +0200, Patrick Steinhardt wrote:
> > I do not mind reverting the topic out of 'next' and actually would
> > prefer replacing it with a corrected version, which would allow us
> > to merge the clean copy to the next release.
>
> I wouldn't exactly say prematurely, given that it likely wouldn't have
> gotten a review without the merge because it was spurred by Coverity :)
> I really wish that the Coverity tooling was available to run at will and
> locally in our pipelines so that we can stop reacting to it, but instead
> address whatever it flags _before_ the code hits the target branch. But,
> well, that's not how Coverity works.
Yeah, I'd agree with the analysis here. While somebody _could_ have
found these by inspection, in practice it was the merge to next that led
me to them.
It may imply that we should be running Coverity earlier, though.
In my fork I trigger Coverity runs based on my personal integration
branch, which is based on next plus a list of non-garbage topics I'm
working on. So I get to see (and fix) my own bugs before anybody else
does. But I don't see other people's bugs until they're in next.
I could try running against "seen", but it's a minor hassle. I don't
otherwise touch that branch at all, and I certainly don't want my daily
driver built off of it. Plus it sometimes has test failures or other
hiccups, and I already get enough false positive noise from Coverity (so
even if I ran it, I'd be unlikely to spend much time digging into
failures).
So what I'd suggest is that you try setting up the Coverity workflow
yourself. There are rough instructions in the GitHub workflow file, and
I imagine you'd be able to port it to GitLab. Coverity does do SSO login
with GitHub, but I don't think it's relevant once you've got an account
there. The opaque token they give you is all you need to upload a build.
-Peff
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats
2024-06-06 7:01 ` Jeff King
@ 2024-06-06 15:41 ` Junio C Hamano
2024-06-08 11:36 ` Jeff King
0 siblings, 1 reply; 103+ messages in thread
From: Junio C Hamano @ 2024-06-06 15:41 UTC (permalink / raw)
To: Jeff King
Cc: Patrick Steinhardt, git, Eric Sunshine, Ramsay Jones,
Justin Tobler
Jeff King <peff@peff.net> writes:
> In my fork I trigger Coverity runs based on my personal integration
> branch, which is based on next plus a list of non-garbage topics I'm
> working on. So I get to see (and fix) my own bugs before anybody else
> does. But I don't see other people's bugs until they're in next.
I am on a mostly same boat but doing a bit better ;-) in that my
daily driver is a point marked as 'jch', somewhere between 'next'
and 'seen', that appears on "git log --first-parent --oneline
master..seen", and this serves as a very small way [*] to see
breakages by others before they hit 'next'.
Side note: This does not work as well as I should, because my
use cases are too narrow to prevent all breakage from getting
into 'next'.
> I could try running against "seen", but it's a minor hassle. I don't
> otherwise touch that branch at all, and I certainly don't want my daily
> driver built off of it. Plus it sometimes has test failures or other
> hiccups, and I already get enough false positive noise from Coverity (so
> even if I ran it, I'd be unlikely to spend much time digging into
> failures).
I'd recommend against anybody using "seen" as their daily driver.
Being in 'seen' merely is "I happened to have seen it floating on
the list", and the only guarantee I can give them is that I at least
have read sections of their code that happened to conflict with
other topics more carefully than just giving a casual reading over
them.
If CI is broken for more than a few days for 'seen', I may look at
them a bit more carefully, only to see which one is causing the
breakage. But that is not necessarily to fix the breakage myself
but to just eject it out of 'seen' ;-).
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats
2024-06-06 15:41 ` Junio C Hamano
@ 2024-06-08 11:36 ` Jeff King
2024-06-08 19:06 ` Junio C Hamano
0 siblings, 1 reply; 103+ messages in thread
From: Jeff King @ 2024-06-08 11:36 UTC (permalink / raw)
To: Junio C Hamano
Cc: Patrick Steinhardt, git, Eric Sunshine, Ramsay Jones,
Justin Tobler
On Thu, Jun 06, 2024 at 08:41:10AM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > In my fork I trigger Coverity runs based on my personal integration
> > branch, which is based on next plus a list of non-garbage topics I'm
> > working on. So I get to see (and fix) my own bugs before anybody else
> > does. But I don't see other people's bugs until they're in next.
>
> I am on a mostly same boat but doing a bit better ;-) in that my
> daily driver is a point marked as 'jch', somewhere between 'next'
> and 'seen', that appears on "git log --first-parent --oneline
> master..seen", and this serves as a very small way [*] to see
> breakages by others before they hit 'next'.
>
> Side note: This does not work as well as I should, because my
> use cases are too narrow to prevent all breakage from getting
> into 'next'.
Possibly I should base my daily driver branch on "jch". Like you, there
are many parts of the code I won't exercise day to day. But it would
mean I'd do more testing (and CI) on those topics. The big question is
whether that would introduce a bunch of noise from not-quite-ready
topics being merged to jch. It depends how careful / conservative you
are. :)
-Peff
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats
2024-06-08 11:36 ` Jeff King
@ 2024-06-08 19:06 ` Junio C Hamano
0 siblings, 0 replies; 103+ messages in thread
From: Junio C Hamano @ 2024-06-08 19:06 UTC (permalink / raw)
To: Jeff King
Cc: Patrick Steinhardt, git, Eric Sunshine, Ramsay Jones,
Justin Tobler
Jeff King <peff@peff.net> writes:
> Possibly I should base my daily driver branch on "jch". Like you, there
> are many parts of the code I won't exercise day to day. But it would
> mean I'd do more testing (and CI) on those topics. The big question is
> whether that would introduce a bunch of noise from not-quite-ready
> topics being merged to jch. It depends how careful / conservative you
> are. :)
I am not all that conservative. Especially with the parts of the
system that I do not exercise myself.
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats
2024-06-05 10:03 ` Jeff King
2024-06-05 16:59 ` Junio C Hamano
@ 2024-06-06 4:51 ` Patrick Steinhardt
1 sibling, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 4:51 UTC (permalink / raw)
To: Jeff King; +Cc: git, Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 1288 bytes --]
On Wed, Jun 05, 2024 at 06:03:18AM -0400, Jeff King wrote:
> On Mon, Jun 03, 2024 at 11:31:00AM +0200, Patrick Steinhardt wrote:
>
> > +int repo_migrate_ref_storage_format(struct repository *repo,
> > + enum ref_storage_format format,
> > + unsigned int flags,
> > + struct strbuf *errbuf)
> > +{
> > [...]
> > + new_gitdir = mkdtemp(xstrdup(buf.buf));
> > + if (!new_gitdir) {
> > + strbuf_addf(errbuf, "cannot create migration directory: %s",
> > + strerror(errno));
> > + ret = -1;
> > + goto done;
> > + }
>
> Coverity complains here of a leak of the xstrdup(). The return from
> mkdtemp() should generally point to the same buffer we passed in, but if
> it sees an error it will return NULL and the new heap buffer will be
> lost.
>
> Probably unlikely, but since you are on a leak-checking kick, I thought
> I'd mention it. ;)
>
> Since you have a writable strbuf already, maybe:
>
> new_gitdir = mkdtemp(buf.buf);
> if (!new_gitdir)
> ...
> new_gitdir = strbuf_detach(&buf, NULL); /* same pointer, but now we own it */
>
> Or since "buf" is not used for anything else, we could just leave it
> attached to the strbuf. And probably give it a better name. Maybe:
I like that version, thanks!
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v4 12/12] builtin/refs: new command to migrate ref storage formats
2024-06-03 9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
` (10 preceding siblings ...)
2024-06-03 9:31 ` [PATCH v4 11/12] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
@ 2024-06-03 9:31 ` Patrick Steinhardt
11 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-03 9:31 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler
[-- Attachment #1: Type: text/plain, Size: 16398 bytes --]
Introduce a new command that allows the user to migrate a repository
between ref storage formats. This new command is implemented as part of
a new git-refs(1) executable. This is due to two reasons:
- There is no good place to put the migration logic in existing
commands. git-maintenance(1) felt unwieldy, and git-pack-refs(1) is
not the correct place to put it, either.
- I had it in my mind to create a new low-level command for accessing
refs for quite a while already. git-refs(1) is that command and can
over time grow more functionality relating to refs. This should help
discoverability by consolidating low-level access to refs into a
single executable.
As mentioned in the preceding commit that introduces the ref storage
format migration logic, the new `git refs migrate` command still has a
bunch of restrictions. These restrictions are documented accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
.gitignore | 1 +
Documentation/git-refs.txt | 61 ++++++++++
Makefile | 1 +
builtin.h | 1 +
builtin/refs.c | 75 ++++++++++++
command-list.txt | 1 +
git.c | 1 +
t/t1460-refs-migrate.sh | 243 +++++++++++++++++++++++++++++++++++++
8 files changed, 384 insertions(+)
create mode 100644 Documentation/git-refs.txt
create mode 100644 builtin/refs.c
create mode 100755 t/t1460-refs-migrate.sh
diff --git a/.gitignore b/.gitignore
index 612c0f6a0f..8caf3700c2 100644
--- a/.gitignore
+++ b/.gitignore
@@ -126,6 +126,7 @@
/git-rebase
/git-receive-pack
/git-reflog
+/git-refs
/git-remote
/git-remote-http
/git-remote-https
diff --git a/Documentation/git-refs.txt b/Documentation/git-refs.txt
new file mode 100644
index 0000000000..5b99e04385
--- /dev/null
+++ b/Documentation/git-refs.txt
@@ -0,0 +1,61 @@
+git-refs(1)
+===========
+
+NAME
+----
+git-refs - Low-level access to refs
+
+
+SYNOPSIS
+--------
+[verse]
+'git refs migrate' --ref-format=<format> [--dry-run]
+
+DESCRIPTION
+-----------
+
+This command provides low-level access to refs.
+
+COMMANDS
+--------
+
+migrate::
+ Migrate ref store between different formats.
+
+OPTIONS
+-------
+
+The following options are specific to 'git refs migrate':
+
+--ref-format=<format>::
+ The ref format to migrate the ref store to. Can be one of:
++
+include::ref-storage-format.txt[]
+
+--dry-run::
+ Perform the migration, but do not modify the repository. The migrated
+ refs will be written into a separate directory that can be inspected
+ separately. The name of the directory will be reported on stdout. This
+ can be used to double check that the migration works as expected before
+ performing the actual migration.
+
+KNOWN LIMITATIONS
+-----------------
+
+The ref format migration has several known limitations in its current form:
+
+* It is not possible to migrate repositories that have reflogs.
+
+* It is not possible to migrate repositories that have worktrees.
+
+* There is no way to block concurrent writes to the repository during an
+ ongoing migration. Concurrent writes can lead to an inconsistent migrated
+ state. Users are expected to block writes on a higher level. If your
+ repository is registered for scheduled maintenance, it is recommended to
+ unregister it first with git-maintenance(1).
+
+These limitations may eventually be lifted.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index cf504963c2..2d702b552c 100644
--- a/Makefile
+++ b/Makefile
@@ -1283,6 +1283,7 @@ BUILTIN_OBJS += builtin/read-tree.o
BUILTIN_OBJS += builtin/rebase.o
BUILTIN_OBJS += builtin/receive-pack.o
BUILTIN_OBJS += builtin/reflog.o
+BUILTIN_OBJS += builtin/refs.o
BUILTIN_OBJS += builtin/remote-ext.o
BUILTIN_OBJS += builtin/remote-fd.o
BUILTIN_OBJS += builtin/remote.o
diff --git a/builtin.h b/builtin.h
index 28280636da..7eda9b2486 100644
--- a/builtin.h
+++ b/builtin.h
@@ -207,6 +207,7 @@ int cmd_rebase(int argc, const char **argv, const char *prefix);
int cmd_rebase__interactive(int argc, const char **argv, const char *prefix);
int cmd_receive_pack(int argc, const char **argv, const char *prefix);
int cmd_reflog(int argc, const char **argv, const char *prefix);
+int cmd_refs(int argc, const char **argv, const char *prefix);
int cmd_remote(int argc, const char **argv, const char *prefix);
int cmd_remote_ext(int argc, const char **argv, const char *prefix);
int cmd_remote_fd(int argc, const char **argv, const char *prefix);
diff --git a/builtin/refs.c b/builtin/refs.c
new file mode 100644
index 0000000000..46dcd150d4
--- /dev/null
+++ b/builtin/refs.c
@@ -0,0 +1,75 @@
+#include "builtin.h"
+#include "parse-options.h"
+#include "refs.h"
+#include "repository.h"
+#include "strbuf.h"
+
+#define REFS_MIGRATE_USAGE \
+ N_("git refs migrate --ref-format=<format> [--dry-run]")
+
+static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
+{
+ const char * const migrate_usage[] = {
+ REFS_MIGRATE_USAGE,
+ NULL,
+ };
+ const char *format_str = NULL;
+ enum ref_storage_format format;
+ unsigned int flags = 0;
+ struct option options[] = {
+ OPT_STRING_F(0, "ref-format", &format_str, N_("format"),
+ N_("specify the reference format to convert to"),
+ PARSE_OPT_NONEG),
+ OPT_BIT(0, "dry-run", &flags,
+ N_("perform a non-destructive dry-run"),
+ REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN),
+ OPT_END(),
+ };
+ struct strbuf errbuf = STRBUF_INIT;
+ int err;
+
+ argc = parse_options(argc, argv, prefix, options, migrate_usage, 0);
+ if (argc)
+ usage(_("too many arguments"));
+ if (!format_str)
+ usage(_("missing --ref-format=<format>"));
+
+ format = ref_storage_format_by_name(format_str);
+ if (format == REF_STORAGE_FORMAT_UNKNOWN) {
+ err = error(_("unknown ref storage format '%s'"), format_str);
+ goto out;
+ }
+
+ if (the_repository->ref_storage_format == format) {
+ err = error(_("repository already uses '%s' format"),
+ ref_storage_format_to_name(format));
+ goto out;
+ }
+
+ if (repo_migrate_ref_storage_format(the_repository, format, flags, &errbuf) < 0) {
+ err = error("%s", errbuf.buf);
+ goto out;
+ }
+
+ err = 0;
+
+out:
+ strbuf_release(&errbuf);
+ return err;
+}
+
+int cmd_refs(int argc, const char **argv, const char *prefix)
+{
+ const char * const refs_usage[] = {
+ REFS_MIGRATE_USAGE,
+ NULL,
+ };
+ parse_opt_subcommand_fn *fn = NULL;
+ struct option opts[] = {
+ OPT_SUBCOMMAND("migrate", &fn, cmd_refs_migrate),
+ OPT_END(),
+ };
+
+ argc = parse_options(argc, argv, prefix, opts, refs_usage, 0);
+ return fn(argc, argv, prefix);
+}
diff --git a/command-list.txt b/command-list.txt
index c4cd0f352b..e0bb87b3b5 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -157,6 +157,7 @@ git-read-tree plumbingmanipulators
git-rebase mainporcelain history
git-receive-pack synchelpers
git-reflog ancillarymanipulators complete
+git-refs ancillarymanipulators complete
git-remote ancillarymanipulators complete
git-repack ancillarymanipulators complete
git-replace ancillarymanipulators complete
diff --git a/git.c b/git.c
index 637c61ca9c..683bb69194 100644
--- a/git.c
+++ b/git.c
@@ -594,6 +594,7 @@ static struct cmd_struct commands[] = {
{ "rebase", cmd_rebase, RUN_SETUP | NEED_WORK_TREE },
{ "receive-pack", cmd_receive_pack },
{ "reflog", cmd_reflog, RUN_SETUP },
+ { "refs", cmd_refs, RUN_SETUP },
{ "remote", cmd_remote, RUN_SETUP },
{ "remote-ext", cmd_remote_ext, NO_PARSEOPT },
{ "remote-fd", cmd_remote_fd, NO_PARSEOPT },
diff --git a/t/t1460-refs-migrate.sh b/t/t1460-refs-migrate.sh
new file mode 100755
index 0000000000..f7c0783d30
--- /dev/null
+++ b/t/t1460-refs-migrate.sh
@@ -0,0 +1,243 @@
+#!/bin/sh
+
+test_description='migration of ref storage backends'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_migration () {
+ git -C "$1" for-each-ref --include-root-refs \
+ --format='%(refname) %(objectname) %(symref)' >expect &&
+ git -C "$1" refs migrate --ref-format="$2" &&
+ git -C "$1" for-each-ref --include-root-refs \
+ --format='%(refname) %(objectname) %(symref)' >actual &&
+ test_cmp expect actual &&
+
+ git -C "$1" rev-parse --show-ref-format >actual &&
+ echo "$2" >expect &&
+ test_cmp expect actual
+}
+
+test_expect_success 'setup' '
+ rm -rf .git &&
+ # The migration does not yet support reflogs.
+ git config --global core.logAllRefUpdates false
+'
+
+test_expect_success "superfluous arguments" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate foo 2>err &&
+ cat >expect <<-EOF &&
+ usage: too many arguments
+ EOF
+ test_cmp expect err
+'
+
+test_expect_success "missing ref storage format" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate 2>err &&
+ cat >expect <<-EOF &&
+ usage: missing --ref-format=<format>
+ EOF
+ test_cmp expect err
+'
+
+test_expect_success "unknown ref storage format" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=unknown 2>err &&
+ cat >expect <<-EOF &&
+ error: unknown ref storage format ${SQ}unknown${SQ}
+ EOF
+ test_cmp expect err
+'
+
+ref_formats="files reftable"
+for from_format in $ref_formats
+do
+ for to_format in $ref_formats
+ do
+ if test "$from_format" = "$to_format"
+ then
+ continue
+ fi
+
+ test_expect_success "$from_format: migration to same format fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$from_format 2>err &&
+ cat >expect <<-EOF &&
+ error: repository already uses ${SQ}$from_format${SQ} format
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: migration with reflog fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_config -C repo core.logAllRefUpdates true &&
+ test_commit -C repo logged &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$to_format 2>err &&
+ cat >expect <<-EOF &&
+ error: migrating reflogs is not supported yet
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: migration with worktree fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ git -C repo worktree add wt &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$to_format 2>err &&
+ cat >expect <<-EOF &&
+ error: migrating repositories with worktrees is not supported yet
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: unborn HEAD" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: single ref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: bare repository" '
+ test_when_finished "rm -rf repo repo.git" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git clone --ref-format=$from_format --mirror repo repo.git &&
+ test_migration repo.git "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: dangling symref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo symbolic-ref BROKEN_HEAD refs/heads/nonexistent &&
+ test_migration repo "$to_format" &&
+ echo refs/heads/nonexistent >expect &&
+ git -C repo symbolic-ref BROKEN_HEAD >actual &&
+ test_cmp expect actual
+ '
+
+ test_expect_success "$from_format -> $to_format: broken ref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ test-tool -C repo ref-store main update-ref "" refs/heads/broken \
+ "$(test_oid 001)" "$ZERO_OID" REF_SKIP_CREATE_REFLOG,REF_SKIP_OID_VERIFICATION &&
+ test_migration repo "$to_format" &&
+ test_oid 001 >expect &&
+ git -C repo rev-parse refs/heads/broken >actual &&
+ test_cmp expect actual
+ '
+
+ test_expect_success "$from_format -> $to_format: pseudo-refs" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo update-ref FOO_HEAD HEAD &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: special refs are left alone" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo rev-parse HEAD >repo/.git/MERGE_HEAD &&
+ git -C repo rev-parse MERGE_HEAD &&
+ test_migration repo "$to_format" &&
+ test_path_is_file repo/.git/MERGE_HEAD
+ '
+
+ test_expect_success "$from_format -> $to_format: a bunch of refs" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+
+ test_commit -C repo initial &&
+ cat >input <<-EOF &&
+ create FOO_HEAD HEAD
+ create refs/heads/branch-1 HEAD
+ create refs/heads/branch-2 HEAD
+ create refs/heads/branch-3 HEAD
+ create refs/heads/branch-4 HEAD
+ create refs/tags/tag-1 HEAD
+ create refs/tags/tag-2 HEAD
+ EOF
+ git -C repo update-ref --stdin <input &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: dry-run migration does not modify repository" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo refs migrate --dry-run \
+ --ref-format=$to_format >output &&
+ grep "Finished dry-run migration of refs" output &&
+ test_path_is_dir repo/.git/ref_migration.* &&
+ echo $from_format >expect &&
+ git -C repo rev-parse --show-ref-format >actual &&
+ test_cmp expect actual
+ '
+ done
+done
+
+test_expect_success 'migrating from files format deletes backend files' '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=files repo &&
+ test_commit -C repo first &&
+ git -C repo pack-refs --all &&
+ test_commit -C repo second &&
+ git -C repo update-ref ORIG_HEAD HEAD &&
+ git -C repo rev-parse HEAD >repo/.git/FETCH_HEAD &&
+
+ test_path_is_file repo/.git/HEAD &&
+ test_path_is_file repo/.git/ORIG_HEAD &&
+ test_path_is_file repo/.git/refs/heads/main &&
+ test_path_is_file repo/.git/packed-refs &&
+
+ test_migration repo reftable &&
+
+ echo "ref: refs/heads/.invalid" >expect &&
+ test_cmp expect repo/.git/HEAD &&
+ echo "this repository uses the reftable format" >expect &&
+ test_cmp expect repo/.git/refs/heads &&
+ test_path_is_file repo/.git/FETCH_HEAD &&
+ test_path_is_missing repo/.git/ORIG_HEAD &&
+ test_path_is_missing repo/.git/refs/heads/main &&
+ test_path_is_missing repo/.git/logs &&
+ test_path_is_missing repo/.git/packed-refs
+'
+
+test_expect_success 'migrating from reftable format deletes backend files' '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=reftable repo &&
+ test_commit -C repo first &&
+
+ test_path_is_dir repo/.git/reftable &&
+ test_migration repo files &&
+
+ test_path_is_missing repo/.git/reftable &&
+ echo "ref: refs/heads/main" >expect &&
+ test_cmp expect repo/.git/HEAD &&
+ test_path_is_file repo/.git/refs/heads/main
+'
+
+test_done
--
2.45.1.410.g58bac47f8e.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v5 00/12] refs: ref storage migrations
2024-05-23 8:25 [PATCH 0/9] refs: ref storage format migrations Patrick Steinhardt
` (12 preceding siblings ...)
2024-06-03 9:30 ` [PATCH v4 00/12] refs: ref storage migrations Patrick Steinhardt
@ 2024-06-06 5:28 ` Patrick Steinhardt
2024-06-06 5:28 ` [PATCH v5 01/12] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
` (13 more replies)
13 siblings, 14 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 5:28 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler,
Karthik Nayak, Jeff King
[-- Attachment #1: Type: text/plain, Size: 8227 bytes --]
Hi,
the ref storage migration was merged to `next`, but got reverted due to
some additional findings by Peff and/or Coverity.
Changes compared to v4:
- Adapt comment of `ref_store_init()` to the new parameter.
- Fix use of an uninitialized return value in `for_each_root_ref()`.
- Fix overwrite of ret code in `files_ref_store_remove_on_disk()`.
- Adapt an error message to more clearly point out that deletion of
"refs/" directory failed in `reftable_be_remove_on_disk()`.
- Fix a leak when `mkdtemp()` fails.
Thanks!
Patrick
Patrick Steinhardt (12):
setup: unset ref storage when reinitializing repository version
refs: convert ref storage format to an enum
refs: pass storage format to `ref_store_init()` explicitly
refs: allow to skip creation of reflog entries
refs/files: refactor `add_pseudoref_and_head_entries()`
refs/files: extract function to iterate through root refs
refs/files: fix NULL pointer deref when releasing ref store
reftable: inline `merged_table_release()`
worktree: don't store main worktree twice
refs: implement removal of ref storages
refs: implement logic to migrate between ref storage formats
builtin/refs: new command to migrate ref storage formats
.gitignore | 1 +
Documentation/git-refs.txt | 61 +++++++
Makefile | 1 +
builtin.h | 1 +
builtin/clone.c | 2 +-
builtin/init-db.c | 2 +-
builtin/refs.c | 75 ++++++++
command-list.txt | 1 +
git.c | 1 +
refs.c | 345 +++++++++++++++++++++++++++++++++++--
refs.h | 41 ++++-
refs/files-backend.c | 124 +++++++++++--
refs/packed-backend.c | 15 ++
refs/ref-cache.c | 2 +
refs/refs-internal.h | 7 +
refs/reftable-backend.c | 55 +++++-
reftable/merged.c | 12 +-
reftable/merged.h | 2 -
reftable/stack.c | 8 +-
repository.c | 3 +-
repository.h | 10 +-
setup.c | 10 +-
setup.h | 9 +-
t/helper/test-ref-store.c | 1 +
t/t1460-refs-migrate.sh | 243 ++++++++++++++++++++++++++
worktree.c | 29 ++--
26 files changed, 979 insertions(+), 82 deletions(-)
create mode 100644 Documentation/git-refs.txt
create mode 100644 builtin/refs.c
create mode 100755 t/t1460-refs-migrate.sh
Range-diff against v4:
1: afb705f6a0 = 1: afb705f6a0 setup: unset ref storage when reinitializing repository version
2: 7989e82dcd = 2: 7989e82dcd refs: convert ref storage format to an enum
3: 7d1a86292c ! 3: 26005abb28 refs: pass storage format to `ref_store_init()` explicitly
@@ Commit message
## refs.c ##
@@ refs.c: static struct ref_store *lookup_ref_store_map(struct strmap *map,
- * gitdir.
+
+ /*
+ * Create, record, and return a ref_store instance for the specified
+- * gitdir.
++ * gitdir using the given ref storage format.
*/
static struct ref_store *ref_store_init(struct repository *repo,
+ enum ref_storage_format format,
4: d0539b7456 = 4: 053f1be657 refs: allow to skip creation of reflog entries
5: 7f9ce5af2e = 5: 29147da2b9 refs/files: refactor `add_pseudoref_and_head_entries()`
6: f7577a0ab3 ! 6: 86cf0c84b1 refs/files: extract function to iterate through root refs
@@ refs/files-backend.c: static void add_root_refs(struct files_ref_store *refs,
strbuf_setlen(&refname, dirnamelen);
}
+
++ ret = 0;
++
+done:
strbuf_release(&refname);
strbuf_release(&path);
7: 56baa798fb = 7: 6b0aaf2ac8 refs/files: fix NULL pointer deref when releasing ref store
8: c7e8ab40b5 = 8: 0690d5eae9 reftable: inline `merged_table_release()`
9: 7a89aae515 = 9: 89699a641d worktree: don't store main worktree twice
10: f9d9420cf9 ! 10: 7b5fee2185 refs: implement removal of ref storages
@@ refs/files-backend.c: static int files_ref_store_create_on_disk(struct ref_store
+ }
+ strbuf_reset(&sb);
+
-+ ret = for_each_root_ref(refs, remove_one_root_ref, &data);
-+ if (ret < 0)
++ if (for_each_root_ref(refs, remove_one_root_ref, &data) < 0)
+ ret = -1;
+
+ if (ref_store_remove_on_disk(refs->packed_ref_store, err) < 0)
@@ refs/reftable-backend.c: static int reftable_be_create_on_disk(struct ref_store
+
+ strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+ if (rmdir(sb.buf) < 0) {
-+ strbuf_addf(err, "could not delete stub heads: %s",
++ strbuf_addf(err, "could not delete refs directory: %s",
+ strerror(errno));
+ ret = -1;
+ }
11: 1f26051eff ! 11: 893d99e98e refs: implement logic to migrate between ref storage formats
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+{
+ struct ref_store *old_refs = NULL, *new_refs = NULL;
+ struct ref_transaction *transaction = NULL;
-+ struct strbuf buf = STRBUF_INIT;
++ struct strbuf new_gitdir = STRBUF_INIT;
+ struct migration_data data;
+ size_t reflog_count = 0;
-+ char *new_gitdir = NULL;
+ int did_migrate_refs = 0;
+ int ret;
+
++ if (repo->ref_storage_format == format) {
++ strbuf_addstr(errbuf, "current and new ref storage format are equal");
++ ret = -1;
++ goto done;
++ }
++
+ old_refs = get_main_ref_store(repo);
+
+ /*
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+ *
+ * 6. Change the repository format to the new ref format.
+ */
-+ strbuf_addf(&buf, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
-+ new_gitdir = mkdtemp(xstrdup(buf.buf));
-+ if (!new_gitdir) {
++ strbuf_addf(&new_gitdir, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
++ if (!mkdtemp(new_gitdir.buf)) {
+ strbuf_addf(errbuf, "cannot create migration directory: %s",
+ strerror(errno));
+ ret = -1;
+ goto done;
+ }
+
-+ new_refs = ref_store_init(repo, format, new_gitdir,
++ new_refs = ref_store_init(repo, format, new_gitdir.buf,
+ REF_STORE_ALL_CAPS);
+ ret = ref_store_create_on_disk(new_refs, 0, errbuf);
+ if (ret < 0)
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+
+ if (flags & REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN) {
+ printf(_("Finished dry-run migration of refs, "
-+ "the result can be found at '%s'\n"), new_gitdir);
++ "the result can be found at '%s'\n"), new_gitdir.buf);
+ ret = 0;
+ goto done;
+ }
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+ if (ret < 0)
+ goto done;
+
-+ ret = move_files(new_gitdir, old_refs->gitdir, errbuf);
++ ret = move_files(new_gitdir.buf, old_refs->gitdir, errbuf);
+ if (ret < 0)
+ goto done;
+
-+ if (rmdir(new_gitdir) < 0)
++ if (rmdir(new_gitdir.buf) < 0)
+ warning_errno(_("could not remove temporary migration directory '%s'"),
-+ new_gitdir);
++ new_gitdir.buf);
+
+ /*
+ * We have migrated the repository, so we now need to adjust the
@@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
+ if (ret && did_migrate_refs) {
+ strbuf_complete(errbuf, '\n');
+ strbuf_addf(errbuf, _("migrated refs can be found at '%s'"),
-+ new_gitdir);
++ new_gitdir.buf);
+ }
+
+ if (ret && new_refs)
+ ref_store_release(new_refs);
+ ref_transaction_free(transaction);
-+ strbuf_release(&buf);
-+ free(new_gitdir);
++ strbuf_release(&new_gitdir);
+ return ret;
+}
12: 83cb3f8c96 = 12: ec0c6d3cf1 builtin/refs: new command to migrate ref storage formats
--
2.45.2.409.g7b0defb391.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v5 01/12] setup: unset ref storage when reinitializing repository version
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
@ 2024-06-06 5:28 ` Patrick Steinhardt
2024-06-06 5:29 ` [PATCH v5 02/12] refs: convert ref storage format to an enum Patrick Steinhardt
` (12 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 5:28 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler,
Karthik Nayak, Jeff King
[-- Attachment #1: Type: text/plain, Size: 1259 bytes --]
When reinitializing a repository's version we may end up unsetting the
hash algorithm when it matches the default hash algorithm. If we didn't
do that then the previously configured value might remain intact.
While the same issue exists for the ref storage extension, we don't do
this here. This has been fine for most of the part because it is not
supported to re-initialize a repository with a different ref storage
format anyway. We're about to introduce a new command to migrate ref
storages though, so this is about to become an issue there.
Prepare for this and unset the ref storage format when reinitializing a
repository with the "files" format.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
setup.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/setup.c b/setup.c
index 7975230ffb..8c84ec9d4b 100644
--- a/setup.c
+++ b/setup.c
@@ -2028,6 +2028,8 @@ void initialize_repository_version(int hash_algo,
if (ref_storage_format != REF_STORAGE_FORMAT_FILES)
git_config_set("extensions.refstorage",
ref_storage_format_to_name(ref_storage_format));
+ else if (reinit)
+ git_config_set_gently("extensions.refstorage", NULL);
}
static int is_reinit(void)
--
2.45.2.409.g7b0defb391.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v5 02/12] refs: convert ref storage format to an enum
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
2024-06-06 5:28 ` [PATCH v5 01/12] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
@ 2024-06-06 5:29 ` Patrick Steinhardt
2024-06-06 5:29 ` [PATCH v5 03/12] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
` (11 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 5:29 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler,
Karthik Nayak, Jeff King
[-- Attachment #1: Type: text/plain, Size: 8337 bytes --]
The ref storage format is tracked as a simple unsigned integer, which
makes it harder than necessary to discover what that integer actually is
or where its values are defined.
Convert the ref storage format to instead be an enum.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/clone.c | 2 +-
builtin/init-db.c | 2 +-
refs.c | 7 ++++---
refs.h | 10 ++++++++--
repository.c | 3 ++-
repository.h | 10 ++++------
setup.c | 8 ++++----
setup.h | 9 +++++----
8 files changed, 29 insertions(+), 22 deletions(-)
diff --git a/builtin/clone.c b/builtin/clone.c
index 1e07524c53..e808e02017 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -970,7 +970,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
int submodule_progress;
int filter_submodules = 0;
int hash_algo;
- unsigned int ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
+ enum ref_storage_format ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
const int do_not_override_repo_unix_permissions = -1;
const char *template_dir;
char *template_dir_dup = NULL;
diff --git a/builtin/init-db.c b/builtin/init-db.c
index 0170469b84..582dcf20f8 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -81,7 +81,7 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
const char *ref_format = NULL;
const char *initial_branch = NULL;
int hash_algo = GIT_HASH_UNKNOWN;
- unsigned int ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
+ enum ref_storage_format ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN;
int init_shared_repository = -1;
const struct option init_db_options[] = {
OPT_STRING(0, "template", &template_dir, N_("template-directory"),
diff --git a/refs.c b/refs.c
index 31032588e0..e6db85a165 100644
--- a/refs.c
+++ b/refs.c
@@ -37,14 +37,15 @@ static const struct ref_storage_be *refs_backends[] = {
[REF_STORAGE_FORMAT_REFTABLE] = &refs_be_reftable,
};
-static const struct ref_storage_be *find_ref_storage_backend(unsigned int ref_storage_format)
+static const struct ref_storage_be *find_ref_storage_backend(
+ enum ref_storage_format ref_storage_format)
{
if (ref_storage_format < ARRAY_SIZE(refs_backends))
return refs_backends[ref_storage_format];
return NULL;
}
-unsigned int ref_storage_format_by_name(const char *name)
+enum ref_storage_format ref_storage_format_by_name(const char *name)
{
for (unsigned int i = 0; i < ARRAY_SIZE(refs_backends); i++)
if (refs_backends[i] && !strcmp(refs_backends[i]->name, name))
@@ -52,7 +53,7 @@ unsigned int ref_storage_format_by_name(const char *name)
return REF_STORAGE_FORMAT_UNKNOWN;
}
-const char *ref_storage_format_to_name(unsigned int ref_storage_format)
+const char *ref_storage_format_to_name(enum ref_storage_format ref_storage_format)
{
const struct ref_storage_be *be = find_ref_storage_backend(ref_storage_format);
if (!be)
diff --git a/refs.h b/refs.h
index fe7f0db35e..a7afa9bede 100644
--- a/refs.h
+++ b/refs.h
@@ -11,8 +11,14 @@ struct string_list;
struct string_list_item;
struct worktree;
-unsigned int ref_storage_format_by_name(const char *name);
-const char *ref_storage_format_to_name(unsigned int ref_storage_format);
+enum ref_storage_format {
+ REF_STORAGE_FORMAT_UNKNOWN,
+ REF_STORAGE_FORMAT_FILES,
+ REF_STORAGE_FORMAT_REFTABLE,
+};
+
+enum ref_storage_format ref_storage_format_by_name(const char *name);
+const char *ref_storage_format_to_name(enum ref_storage_format ref_storage_format);
/*
* Resolve a reference, recursively following symbolic refererences.
diff --git a/repository.c b/repository.c
index d29b0304fb..166863f852 100644
--- a/repository.c
+++ b/repository.c
@@ -124,7 +124,8 @@ void repo_set_compat_hash_algo(struct repository *repo, int algo)
repo_read_loose_object_map(repo);
}
-void repo_set_ref_storage_format(struct repository *repo, unsigned int format)
+void repo_set_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format)
{
repo->ref_storage_format = format;
}
diff --git a/repository.h b/repository.h
index 4bd8969005..a35cd77c35 100644
--- a/repository.h
+++ b/repository.h
@@ -1,6 +1,7 @@
#ifndef REPOSITORY_H
#define REPOSITORY_H
+#include "refs.h"
#include "strmap.h"
struct config_set;
@@ -26,10 +27,6 @@ enum fetch_negotiation_setting {
FETCH_NEGOTIATION_NOOP,
};
-#define REF_STORAGE_FORMAT_UNKNOWN 0
-#define REF_STORAGE_FORMAT_FILES 1
-#define REF_STORAGE_FORMAT_REFTABLE 2
-
struct repo_settings {
int initialized;
@@ -181,7 +178,7 @@ struct repository {
const struct git_hash_algo *compat_hash_algo;
/* Repository's reference storage format, as serialized on disk. */
- unsigned int ref_storage_format;
+ enum ref_storage_format ref_storage_format;
/* A unique-id for tracing purposes. */
int trace2_repo_id;
@@ -220,7 +217,8 @@ void repo_set_gitdir(struct repository *repo, const char *root,
void repo_set_worktree(struct repository *repo, const char *path);
void repo_set_hash_algo(struct repository *repo, int algo);
void repo_set_compat_hash_algo(struct repository *repo, int compat_algo);
-void repo_set_ref_storage_format(struct repository *repo, unsigned int format);
+void repo_set_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format);
void initialize_repository(struct repository *repo);
RESULT_MUST_BE_USED
int repo_init(struct repository *r, const char *gitdir, const char *worktree);
diff --git a/setup.c b/setup.c
index 8c84ec9d4b..b49ee3e95f 100644
--- a/setup.c
+++ b/setup.c
@@ -1997,7 +1997,7 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree)
}
void initialize_repository_version(int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
int reinit)
{
char repo_version_string[10];
@@ -2044,7 +2044,7 @@ static int is_reinit(void)
return ret;
}
-void create_reference_database(unsigned int ref_storage_format,
+void create_reference_database(enum ref_storage_format ref_storage_format,
const char *initial_branch, int quiet)
{
struct strbuf err = STRBUF_INIT;
@@ -2243,7 +2243,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
}
static void validate_ref_storage_format(struct repository_format *repo_fmt,
- unsigned int format)
+ enum ref_storage_format format)
{
const char *name = getenv("GIT_DEFAULT_REF_FORMAT");
@@ -2263,7 +2263,7 @@ static void validate_ref_storage_format(struct repository_format *repo_fmt,
int init_db(const char *git_dir, const char *real_git_dir,
const char *template_dir, int hash,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
const char *initial_branch,
int init_shared_repository, unsigned int flags)
{
diff --git a/setup.h b/setup.h
index b3fd3bf45a..cd8dbc2497 100644
--- a/setup.h
+++ b/setup.h
@@ -1,6 +1,7 @@
#ifndef SETUP_H
#define SETUP_H
+#include "refs.h"
#include "string-list.h"
int is_inside_git_dir(void);
@@ -128,7 +129,7 @@ struct repository_format {
int is_bare;
int hash_algo;
int compat_hash_algo;
- unsigned int ref_storage_format;
+ enum ref_storage_format ref_storage_format;
int sparse_index;
char *work_tree;
struct string_list unknown_extensions;
@@ -192,13 +193,13 @@ const char *get_template_dir(const char *option_template);
int init_db(const char *git_dir, const char *real_git_dir,
const char *template_dir, int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
const char *initial_branch, int init_shared_repository,
unsigned int flags);
void initialize_repository_version(int hash_algo,
- unsigned int ref_storage_format,
+ enum ref_storage_format ref_storage_format,
int reinit);
-void create_reference_database(unsigned int ref_storage_format,
+void create_reference_database(enum ref_storage_format ref_storage_format,
const char *initial_branch, int quiet);
/*
--
2.45.2.409.g7b0defb391.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v5 03/12] refs: pass storage format to `ref_store_init()` explicitly
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
2024-06-06 5:28 ` [PATCH v5 01/12] setup: unset ref storage when reinitializing repository version Patrick Steinhardt
2024-06-06 5:29 ` [PATCH v5 02/12] refs: convert ref storage format to an enum Patrick Steinhardt
@ 2024-06-06 5:29 ` Patrick Steinhardt
2024-06-06 5:29 ` [PATCH v5 04/12] refs: allow to skip creation of reflog entries Patrick Steinhardt
` (10 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 5:29 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler,
Karthik Nayak, Jeff King
[-- Attachment #1: Type: text/plain, Size: 2775 bytes --]
We're about to introduce logic to migrate refs from one storage format
to another one. This will require us to initialize a ref store with a
different format than the one used by the passed-in repository.
Prepare for this by accepting the desired ref storage format as
parameter.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/refs.c b/refs.c
index e6db85a165..423684b8b8 100644
--- a/refs.c
+++ b/refs.c
@@ -1891,16 +1891,17 @@ static struct ref_store *lookup_ref_store_map(struct strmap *map,
/*
* Create, record, and return a ref_store instance for the specified
- * gitdir.
+ * gitdir using the given ref storage format.
*/
static struct ref_store *ref_store_init(struct repository *repo,
+ enum ref_storage_format format,
const char *gitdir,
unsigned int flags)
{
const struct ref_storage_be *be;
struct ref_store *refs;
- be = find_ref_storage_backend(repo->ref_storage_format);
+ be = find_ref_storage_backend(format);
if (!be)
BUG("reference backend is unknown");
@@ -1922,7 +1923,8 @@ struct ref_store *get_main_ref_store(struct repository *r)
if (!r->gitdir)
BUG("attempting to get main_ref_store outside of repository");
- r->refs_private = ref_store_init(r, r->gitdir, REF_STORE_ALL_CAPS);
+ r->refs_private = ref_store_init(r, r->ref_storage_format,
+ r->gitdir, REF_STORE_ALL_CAPS);
r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
return r->refs_private;
}
@@ -1982,7 +1984,8 @@ struct ref_store *repo_get_submodule_ref_store(struct repository *repo,
free(subrepo);
goto done;
}
- refs = ref_store_init(subrepo, submodule_sb.buf,
+ refs = ref_store_init(subrepo, the_repository->ref_storage_format,
+ submodule_sb.buf,
REF_STORE_READ | REF_STORE_ODB);
register_ref_store_map(&repo->submodule_ref_stores, "submodule",
refs, submodule);
@@ -2011,12 +2014,12 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt)
struct strbuf common_path = STRBUF_INIT;
strbuf_git_common_path(&common_path, wt->repo,
"worktrees/%s", wt->id);
- refs = ref_store_init(wt->repo, common_path.buf,
- REF_STORE_ALL_CAPS);
+ refs = ref_store_init(wt->repo, wt->repo->ref_storage_format,
+ common_path.buf, REF_STORE_ALL_CAPS);
strbuf_release(&common_path);
} else {
- refs = ref_store_init(wt->repo, wt->repo->commondir,
- REF_STORE_ALL_CAPS);
+ refs = ref_store_init(wt->repo, the_repository->ref_storage_format,
+ wt->repo->commondir, REF_STORE_ALL_CAPS);
}
if (refs)
--
2.45.2.409.g7b0defb391.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v5 04/12] refs: allow to skip creation of reflog entries
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
` (2 preceding siblings ...)
2024-06-06 5:29 ` [PATCH v5 03/12] refs: pass storage format to `ref_store_init()` explicitly Patrick Steinhardt
@ 2024-06-06 5:29 ` Patrick Steinhardt
2024-06-06 5:29 ` [PATCH v5 05/12] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
` (9 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 5:29 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler,
Karthik Nayak, Jeff King
[-- Attachment #1: Type: text/plain, Size: 3845 bytes --]
The ref backends do not have any way to disable the creation of reflog
entries. This will be required for upcoming ref format migration logic
so that we do not create any entries that didn't exist in the original
ref database.
Provide a new `REF_SKIP_CREATE_REFLOG` flag that allows the caller to
disable reflog entry creation.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 6 ++++++
refs.h | 8 +++++++-
refs/files-backend.c | 4 ++++
refs/reftable-backend.c | 3 ++-
t/helper/test-ref-store.c | 1 +
5 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/refs.c b/refs.c
index 423684b8b8..fa3b0a82d4 100644
--- a/refs.c
+++ b/refs.c
@@ -1194,6 +1194,12 @@ int ref_transaction_update(struct ref_transaction *transaction,
{
assert(err);
+ if ((flags & REF_FORCE_CREATE_REFLOG) &&
+ (flags & REF_SKIP_CREATE_REFLOG)) {
+ strbuf_addstr(err, _("refusing to force and skip creation of reflog"));
+ return -1;
+ }
+
if (!(flags & REF_SKIP_REFNAME_VERIFICATION) &&
((new_oid && !is_null_oid(new_oid)) ?
check_refname_format(refname, REFNAME_ALLOW_ONELEVEL) :
diff --git a/refs.h b/refs.h
index a7afa9bede..50a2b3ab09 100644
--- a/refs.h
+++ b/refs.h
@@ -659,13 +659,19 @@ struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
*/
#define REF_SKIP_REFNAME_VERIFICATION (1 << 11)
+/*
+ * Skip creation of a reflog entry, even if it would have otherwise been
+ * created.
+ */
+#define REF_SKIP_CREATE_REFLOG (1 << 12)
+
/*
* Bitmask of all of the flags that are allowed to be passed in to
* ref_transaction_update() and friends:
*/
#define REF_TRANSACTION_UPDATE_ALLOWED_FLAGS \
(REF_NO_DEREF | REF_FORCE_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION | \
- REF_SKIP_REFNAME_VERIFICATION)
+ REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG)
/*
* Add a reference update to transaction. `new_oid` is the value that
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 73380d7e99..bd0d63bcba 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1750,6 +1750,9 @@ static int files_log_ref_write(struct files_ref_store *refs,
{
int logfd, result;
+ if (flags & REF_SKIP_CREATE_REFLOG)
+ return 0;
+
if (log_all_ref_updates == LOG_REFS_UNSET)
log_all_ref_updates = is_bare_repository() ? LOG_REFS_NONE : LOG_REFS_NORMAL;
@@ -2251,6 +2254,7 @@ static int split_head_update(struct ref_update *update,
struct ref_update *new_update;
if ((update->flags & REF_LOG_ONLY) ||
+ (update->flags & REF_SKIP_CREATE_REFLOG) ||
(update->flags & REF_IS_PRUNING) ||
(update->flags & REF_UPDATE_VIA_HEAD))
return 0;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index f6edfdf5b3..bffed9257f 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1103,7 +1103,8 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data
if (ret)
goto done;
- } else if (u->flags & REF_HAVE_NEW &&
+ } else if (!(u->flags & REF_SKIP_CREATE_REFLOG) &&
+ (u->flags & REF_HAVE_NEW) &&
(u->flags & REF_FORCE_CREATE_REFLOG ||
should_write_log(&arg->refs->base, u->refname))) {
struct reftable_log_record *log;
diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index c9efd74c2b..ad24300170 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -126,6 +126,7 @@ static struct flag_definition transaction_flags[] = {
FLAG_DEF(REF_FORCE_CREATE_REFLOG),
FLAG_DEF(REF_SKIP_OID_VERIFICATION),
FLAG_DEF(REF_SKIP_REFNAME_VERIFICATION),
+ FLAG_DEF(REF_SKIP_CREATE_REFLOG),
{ NULL, 0 }
};
--
2.45.2.409.g7b0defb391.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v5 05/12] refs/files: refactor `add_pseudoref_and_head_entries()`
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
` (3 preceding siblings ...)
2024-06-06 5:29 ` [PATCH v5 04/12] refs: allow to skip creation of reflog entries Patrick Steinhardt
@ 2024-06-06 5:29 ` Patrick Steinhardt
2024-06-06 5:29 ` [PATCH v5 06/12] refs/files: extract function to iterate through root refs Patrick Steinhardt
` (8 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 5:29 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler,
Karthik Nayak, Jeff King
[-- Attachment #1: Type: text/plain, Size: 1937 bytes --]
The `add_pseudoref_and_head_entries()` function accepts both the ref
store as well as a directory name as input. This is unnecessary though
as the ref store already uniquely identifies the root directory of the
ref store anyway.
Furthermore, the function is misnamed now that we have clarified the
meaning of pseudorefs as it doesn't add pseudorefs, but root refs.
Rename it accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs/files-backend.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index bd0d63bcba..b4e5437ffe 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -324,16 +324,14 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
}
/*
- * Add pseudorefs to the ref dir by parsing the directory for any files
- * which follow the pseudoref syntax.
+ * Add root refs to the ref dir by parsing the directory for any files which
+ * follow the root ref syntax.
*/
-static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
- struct ref_dir *dir,
- const char *dirname)
+static void add_root_refs(struct files_ref_store *refs,
+ struct ref_dir *dir)
{
- struct files_ref_store *refs =
- files_downcast(ref_store, REF_STORE_READ, "fill_ref_dir");
struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
+ const char *dirname = refs->loose->root->name;
struct dirent *de;
size_t dirnamelen;
DIR *d;
@@ -388,8 +386,7 @@ static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs,
dir = get_ref_dir(refs->loose->root);
if (flags & DO_FOR_EACH_INCLUDE_ROOT_REFS)
- add_pseudoref_and_head_entries(dir->cache->ref_store, dir,
- refs->loose->root->name);
+ add_root_refs(refs, dir);
/*
* Add an incomplete entry for "refs/" (to be filled
--
2.45.2.409.g7b0defb391.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v5 06/12] refs/files: extract function to iterate through root refs
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
` (4 preceding siblings ...)
2024-06-06 5:29 ` [PATCH v5 05/12] refs/files: refactor `add_pseudoref_and_head_entries()` Patrick Steinhardt
@ 2024-06-06 5:29 ` Patrick Steinhardt
2024-06-06 5:29 ` [PATCH v5 07/12] refs/files: fix NULL pointer deref when releasing ref store Patrick Steinhardt
` (7 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 5:29 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler,
Karthik Nayak, Jeff King
[-- Attachment #1: Type: text/plain, Size: 2797 bytes --]
Extract a new function that can be used to iterate through all root refs
known to the "files" backend. This will be used in the next commit,
where we start to teach ref backends to remove themselves.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs/files-backend.c | 51 ++++++++++++++++++++++++++++++++++++--------
1 file changed, 42 insertions(+), 9 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b4e5437ffe..de8cc83174 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -323,17 +323,15 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
add_per_worktree_entries_to_dir(dir, dirname);
}
-/*
- * Add root refs to the ref dir by parsing the directory for any files which
- * follow the root ref syntax.
- */
-static void add_root_refs(struct files_ref_store *refs,
- struct ref_dir *dir)
+static int for_each_root_ref(struct files_ref_store *refs,
+ int (*cb)(const char *refname, void *cb_data),
+ void *cb_data)
{
struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
const char *dirname = refs->loose->root->name;
struct dirent *de;
size_t dirnamelen;
+ int ret;
DIR *d;
files_ref_path(refs, &path, dirname);
@@ -341,7 +339,7 @@ static void add_root_refs(struct files_ref_store *refs,
d = opendir(path.buf);
if (!d) {
strbuf_release(&path);
- return;
+ return -1;
}
strbuf_addstr(&refname, dirname);
@@ -357,14 +355,49 @@ static void add_root_refs(struct files_ref_store *refs,
strbuf_addstr(&refname, de->d_name);
dtype = get_dtype(de, &path, 1);
- if (dtype == DT_REG && is_root_ref(de->d_name))
- loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
+ if (dtype == DT_REG && is_root_ref(de->d_name)) {
+ ret = cb(refname.buf, cb_data);
+ if (ret)
+ goto done;
+ }
strbuf_setlen(&refname, dirnamelen);
}
+
+ ret = 0;
+
+done:
strbuf_release(&refname);
strbuf_release(&path);
closedir(d);
+ return ret;
+}
+
+struct fill_root_ref_data {
+ struct files_ref_store *refs;
+ struct ref_dir *dir;
+};
+
+static int fill_root_ref(const char *refname, void *cb_data)
+{
+ struct fill_root_ref_data *data = cb_data;
+ loose_fill_ref_dir_regular_file(data->refs, refname, data->dir);
+ return 0;
+}
+
+/*
+ * Add root refs to the ref dir by parsing the directory for any files which
+ * follow the root ref syntax.
+ */
+static void add_root_refs(struct files_ref_store *refs,
+ struct ref_dir *dir)
+{
+ struct fill_root_ref_data data = {
+ .refs = refs,
+ .dir = dir,
+ };
+
+ for_each_root_ref(refs, fill_root_ref, &data);
}
static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs,
--
2.45.2.409.g7b0defb391.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v5 07/12] refs/files: fix NULL pointer deref when releasing ref store
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
` (5 preceding siblings ...)
2024-06-06 5:29 ` [PATCH v5 06/12] refs/files: extract function to iterate through root refs Patrick Steinhardt
@ 2024-06-06 5:29 ` Patrick Steinhardt
2024-06-06 5:29 ` [PATCH v5 08/12] reftable: inline `merged_table_release()` Patrick Steinhardt
` (6 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 5:29 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler,
Karthik Nayak, Jeff King
[-- Attachment #1: Type: text/plain, Size: 702 bytes --]
The `free_ref_cache()` function is not `NULL` safe and will thus
segfault when being passed such a pointer. This can easily happen when
trying to release a partially initialized "files" ref store. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs/ref-cache.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index b6c53fc8ed..4ce519bbc8 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -71,6 +71,8 @@ static void free_ref_entry(struct ref_entry *entry)
void free_ref_cache(struct ref_cache *cache)
{
+ if (!cache)
+ return;
free_ref_entry(cache->root);
free(cache);
}
--
2.45.2.409.g7b0defb391.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v5 08/12] reftable: inline `merged_table_release()`
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
` (6 preceding siblings ...)
2024-06-06 5:29 ` [PATCH v5 07/12] refs/files: fix NULL pointer deref when releasing ref store Patrick Steinhardt
@ 2024-06-06 5:29 ` Patrick Steinhardt
2024-06-06 5:29 ` [PATCH v5 09/12] worktree: don't store main worktree twice Patrick Steinhardt
` (5 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 5:29 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler,
Karthik Nayak, Jeff King
[-- Attachment #1: Type: text/plain, Size: 2400 bytes --]
The function `merged_table_release()` releases a merged table, whereas
`reftable_merged_table_free()` releases a merged table and then also
free's its pointer. But all callsites of `merged_table_release()` are in
fact followed by `reftable_merged_table_free()`, which is redundant.
Inline `merged_table_release()` into `reftable_merged_table_free()` to
get rid of this redundance.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
reftable/merged.c | 12 ++----------
reftable/merged.h | 2 --
reftable/stack.c | 8 ++------
3 files changed, 4 insertions(+), 18 deletions(-)
diff --git a/reftable/merged.c b/reftable/merged.c
index f85a24c678..804fdc0de0 100644
--- a/reftable/merged.c
+++ b/reftable/merged.c
@@ -207,19 +207,11 @@ int reftable_new_merged_table(struct reftable_merged_table **dest,
return 0;
}
-/* clears the list of subtable, without affecting the readers themselves. */
-void merged_table_release(struct reftable_merged_table *mt)
-{
- FREE_AND_NULL(mt->stack);
- mt->stack_len = 0;
-}
-
void reftable_merged_table_free(struct reftable_merged_table *mt)
{
- if (!mt) {
+ if (!mt)
return;
- }
- merged_table_release(mt);
+ FREE_AND_NULL(mt->stack);
reftable_free(mt);
}
diff --git a/reftable/merged.h b/reftable/merged.h
index a2571dbc99..9db45c3196 100644
--- a/reftable/merged.h
+++ b/reftable/merged.h
@@ -24,6 +24,4 @@ struct reftable_merged_table {
uint64_t max;
};
-void merged_table_release(struct reftable_merged_table *mt);
-
#endif
diff --git a/reftable/stack.c b/reftable/stack.c
index a59ebe038d..984fd866d0 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -261,10 +261,8 @@ static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
new_tables = NULL;
st->readers_len = new_readers_len;
- if (st->merged) {
- merged_table_release(st->merged);
+ if (st->merged)
reftable_merged_table_free(st->merged);
- }
if (st->readers) {
reftable_free(st->readers);
}
@@ -968,10 +966,8 @@ static int stack_write_compact(struct reftable_stack *st,
done:
reftable_iterator_destroy(&it);
- if (mt) {
- merged_table_release(mt);
+ if (mt)
reftable_merged_table_free(mt);
- }
reftable_ref_record_release(&ref);
reftable_log_record_release(&log);
st->stats.entries_written += entries;
--
2.45.2.409.g7b0defb391.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v5 09/12] worktree: don't store main worktree twice
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
` (7 preceding siblings ...)
2024-06-06 5:29 ` [PATCH v5 08/12] reftable: inline `merged_table_release()` Patrick Steinhardt
@ 2024-06-06 5:29 ` Patrick Steinhardt
2024-06-06 5:29 ` [PATCH v5 10/12] refs: implement removal of ref storages Patrick Steinhardt
` (4 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 5:29 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler,
Karthik Nayak, Jeff King
[-- Attachment #1: Type: text/plain, Size: 3113 bytes --]
In `get_worktree_ref_store()` we either return the repository's main ref
store, or we look up the ref store via the map of worktree ref stores.
Which of these worktrees gets picked depends on the `is_current` bit of
the worktree, which indicates whether the worktree is the one that
corresponds to `the_repository`.
The bit is getting set in `get_worktrees()`, but only after we have
computed the list of all worktrees. This is too late though, because at
that time we have already called `get_worktree_ref_store()` on each of
the worktrees via `add_head_info()`. The consequence is that the current
worktree will not have been marked accordingly, which means that we did
not use the main ref store, but instead created a new ref store. We thus
have two separate ref stores now that map to the same ref database.
Fix this by setting `is_current` before we call `add_head_info()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
worktree.c | 29 +++++++++++------------------
1 file changed, 11 insertions(+), 18 deletions(-)
diff --git a/worktree.c b/worktree.c
index 12eadacc61..70844d023a 100644
--- a/worktree.c
+++ b/worktree.c
@@ -53,6 +53,15 @@ static void add_head_info(struct worktree *wt)
wt->is_detached = 1;
}
+static int is_current_worktree(struct worktree *wt)
+{
+ char *git_dir = absolute_pathdup(get_git_dir());
+ const char *wt_git_dir = get_worktree_git_dir(wt);
+ int is_current = !fspathcmp(git_dir, absolute_path(wt_git_dir));
+ free(git_dir);
+ return is_current;
+}
+
/**
* get the main worktree
*/
@@ -76,6 +85,7 @@ static struct worktree *get_main_worktree(int skip_reading_head)
*/
worktree->is_bare = (is_bare_repository_cfg == 1) ||
is_bare_repository();
+ worktree->is_current = is_current_worktree(worktree);
if (!skip_reading_head)
add_head_info(worktree);
return worktree;
@@ -102,6 +112,7 @@ struct worktree *get_linked_worktree(const char *id,
worktree->repo = the_repository;
worktree->path = strbuf_detach(&worktree_path, NULL);
worktree->id = xstrdup(id);
+ worktree->is_current = is_current_worktree(worktree);
if (!skip_reading_head)
add_head_info(worktree);
@@ -111,23 +122,6 @@ struct worktree *get_linked_worktree(const char *id,
return worktree;
}
-static void mark_current_worktree(struct worktree **worktrees)
-{
- char *git_dir = absolute_pathdup(get_git_dir());
- int i;
-
- for (i = 0; worktrees[i]; i++) {
- struct worktree *wt = worktrees[i];
- const char *wt_git_dir = get_worktree_git_dir(wt);
-
- if (!fspathcmp(git_dir, absolute_path(wt_git_dir))) {
- wt->is_current = 1;
- break;
- }
- }
- free(git_dir);
-}
-
/*
* NEEDSWORK: This function exists so that we can look up metadata of a
* worktree without trying to access any of its internals like the refdb. It
@@ -164,7 +158,6 @@ static struct worktree **get_worktrees_internal(int skip_reading_head)
ALLOC_GROW(list, counter + 1, alloc);
list[counter] = NULL;
- mark_current_worktree(list);
return list;
}
--
2.45.2.409.g7b0defb391.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v5 10/12] refs: implement removal of ref storages
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
` (8 preceding siblings ...)
2024-06-06 5:29 ` [PATCH v5 09/12] worktree: don't store main worktree twice Patrick Steinhardt
@ 2024-06-06 5:29 ` Patrick Steinhardt
2024-06-06 5:29 ` [PATCH v5 11/12] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
` (3 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 5:29 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler,
Karthik Nayak, Jeff King
[-- Attachment #1: Type: text/plain, Size: 8477 bytes --]
We're about to introduce logic to migrate ref storages. One part of the
migration will be to delete the files that are part of the old ref
storage format. We don't yet have a way to delete such data generically
across ref backends though.
Implement a new `delete` callback and expose it via a new
`ref_storage_delete()` function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 5 ++++
refs.h | 5 ++++
refs/files-backend.c | 62 +++++++++++++++++++++++++++++++++++++++++
refs/packed-backend.c | 15 ++++++++++
refs/refs-internal.h | 7 +++++
refs/reftable-backend.c | 52 ++++++++++++++++++++++++++++++++++
6 files changed, 146 insertions(+)
diff --git a/refs.c b/refs.c
index fa3b0a82d4..31fd391214 100644
--- a/refs.c
+++ b/refs.c
@@ -1861,6 +1861,11 @@ int ref_store_create_on_disk(struct ref_store *refs, int flags, struct strbuf *e
return refs->be->create_on_disk(refs, flags, err);
}
+int ref_store_remove_on_disk(struct ref_store *refs, struct strbuf *err)
+{
+ return refs->be->remove_on_disk(refs, err);
+}
+
int repo_resolve_gitlink_ref(struct repository *r,
const char *submodule, const char *refname,
struct object_id *oid)
diff --git a/refs.h b/refs.h
index 50a2b3ab09..61ee7b7a15 100644
--- a/refs.h
+++ b/refs.h
@@ -129,6 +129,11 @@ int ref_store_create_on_disk(struct ref_store *refs, int flags, struct strbuf *e
*/
void ref_store_release(struct ref_store *ref_store);
+/*
+ * Remove the ref store from disk. This deletes all associated data.
+ */
+int ref_store_remove_on_disk(struct ref_store *refs, struct strbuf *err);
+
/*
* Return the peeled value of the oid currently being iterated via
* for_each_ref(), etc. This is equivalent to calling:
diff --git a/refs/files-backend.c b/refs/files-backend.c
index de8cc83174..e663781199 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3342,11 +3342,73 @@ static int files_ref_store_create_on_disk(struct ref_store *ref_store,
return 0;
}
+struct remove_one_root_ref_data {
+ const char *gitdir;
+ struct strbuf *err;
+};
+
+static int remove_one_root_ref(const char *refname,
+ void *cb_data)
+{
+ struct remove_one_root_ref_data *data = cb_data;
+ struct strbuf buf = STRBUF_INIT;
+ int ret = 0;
+
+ strbuf_addf(&buf, "%s/%s", data->gitdir, refname);
+
+ ret = unlink(buf.buf);
+ if (ret < 0)
+ strbuf_addf(data->err, "could not delete %s: %s\n",
+ refname, strerror(errno));
+
+ strbuf_release(&buf);
+ return ret;
+}
+
+static int files_ref_store_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct files_ref_store *refs =
+ files_downcast(ref_store, REF_STORE_WRITE, "remove");
+ struct remove_one_root_ref_data data = {
+ .gitdir = refs->base.gitdir,
+ .err = err,
+ };
+ struct strbuf sb = STRBUF_INIT;
+ int ret = 0;
+
+ strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addf(err, "could not delete refs: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/logs", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addf(err, "could not delete logs: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ if (for_each_root_ref(refs, remove_one_root_ref, &data) < 0)
+ ret = -1;
+
+ if (ref_store_remove_on_disk(refs->packed_ref_store, err) < 0)
+ ret = -1;
+
+ strbuf_release(&sb);
+ return ret;
+}
+
struct ref_storage_be refs_be_files = {
.name = "files",
.init = files_ref_store_init,
.release = files_ref_store_release,
.create_on_disk = files_ref_store_create_on_disk,
+ .remove_on_disk = files_ref_store_remove_on_disk,
.transaction_prepare = files_transaction_prepare,
.transaction_finish = files_transaction_finish,
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 2789fd92f5..c4c1e36aa2 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1,5 +1,6 @@
#include "../git-compat-util.h"
#include "../config.h"
+#include "../dir.h"
#include "../gettext.h"
#include "../hash.h"
#include "../hex.h"
@@ -1266,6 +1267,19 @@ static int packed_ref_store_create_on_disk(struct ref_store *ref_store UNUSED,
return 0;
}
+static int packed_ref_store_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct packed_ref_store *refs = packed_downcast(ref_store, 0, "remove");
+
+ if (remove_path(refs->path) < 0) {
+ strbuf_addstr(err, "could not delete packed-refs");
+ return -1;
+ }
+
+ return 0;
+}
+
/*
* Write the packed refs from the current snapshot to the packed-refs
* tempfile, incorporating any changes from `updates`. `updates` must
@@ -1724,6 +1738,7 @@ struct ref_storage_be refs_be_packed = {
.init = packed_ref_store_init,
.release = packed_ref_store_release,
.create_on_disk = packed_ref_store_create_on_disk,
+ .remove_on_disk = packed_ref_store_remove_on_disk,
.transaction_prepare = packed_transaction_prepare,
.transaction_finish = packed_transaction_finish,
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 33749fbd83..cbcb6f9c36 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -517,6 +517,12 @@ typedef int ref_store_create_on_disk_fn(struct ref_store *refs,
int flags,
struct strbuf *err);
+/*
+ * Remove the reference store from disk.
+ */
+typedef int ref_store_remove_on_disk_fn(struct ref_store *refs,
+ struct strbuf *err);
+
typedef int ref_transaction_prepare_fn(struct ref_store *refs,
struct ref_transaction *transaction,
struct strbuf *err);
@@ -649,6 +655,7 @@ struct ref_storage_be {
ref_store_init_fn *init;
ref_store_release_fn *release;
ref_store_create_on_disk_fn *create_on_disk;
+ ref_store_remove_on_disk_fn *remove_on_disk;
ref_transaction_prepare_fn *transaction_prepare;
ref_transaction_finish_fn *transaction_finish;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index bffed9257f..da6b3162f3 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1,6 +1,7 @@
#include "../git-compat-util.h"
#include "../abspath.h"
#include "../chdir-notify.h"
+#include "../dir.h"
#include "../environment.h"
#include "../gettext.h"
#include "../hash.h"
@@ -343,6 +344,56 @@ static int reftable_be_create_on_disk(struct ref_store *ref_store,
return 0;
}
+static int reftable_be_remove_on_disk(struct ref_store *ref_store,
+ struct strbuf *err)
+{
+ struct reftable_ref_store *refs =
+ reftable_be_downcast(ref_store, REF_STORE_WRITE, "remove");
+ struct strbuf sb = STRBUF_INIT;
+ int ret = 0;
+
+ /*
+ * Release the ref store such that all stacks are closed. This is
+ * required so that the "tables.list" file is not open anymore, which
+ * would otherwise make it impossible to remove the file on Windows.
+ */
+ reftable_be_release(ref_store);
+
+ strbuf_addf(&sb, "%s/reftable", refs->base.gitdir);
+ if (remove_dir_recursively(&sb, 0) < 0) {
+ strbuf_addf(err, "could not delete reftables: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/HEAD", refs->base.gitdir);
+ if (unlink(sb.buf) < 0) {
+ strbuf_addf(err, "could not delete stub HEAD: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/refs/heads", refs->base.gitdir);
+ if (unlink(sb.buf) < 0) {
+ strbuf_addf(err, "could not delete stub heads: %s",
+ strerror(errno));
+ ret = -1;
+ }
+ strbuf_reset(&sb);
+
+ strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+ if (rmdir(sb.buf) < 0) {
+ strbuf_addf(err, "could not delete refs directory: %s",
+ strerror(errno));
+ ret = -1;
+ }
+
+ strbuf_release(&sb);
+ return ret;
+}
+
struct reftable_ref_iterator {
struct ref_iterator base;
struct reftable_ref_store *refs;
@@ -2196,6 +2247,7 @@ struct ref_storage_be refs_be_reftable = {
.init = reftable_be_init,
.release = reftable_be_release,
.create_on_disk = reftable_be_create_on_disk,
+ .remove_on_disk = reftable_be_remove_on_disk,
.transaction_prepare = reftable_be_transaction_prepare,
.transaction_finish = reftable_be_transaction_finish,
--
2.45.2.409.g7b0defb391.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v5 11/12] refs: implement logic to migrate between ref storage formats
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
` (9 preceding siblings ...)
2024-06-06 5:29 ` [PATCH v5 10/12] refs: implement removal of ref storages Patrick Steinhardt
@ 2024-06-06 5:29 ` Patrick Steinhardt
2024-06-06 5:29 ` [PATCH v5 12/12] builtin/refs: new command to migrate " Patrick Steinhardt
` (2 subsequent siblings)
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 5:29 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler,
Karthik Nayak, Jeff King
[-- Attachment #1: Type: text/plain, Size: 12062 bytes --]
With the introduction of the new "reftable" backend, users may want to
migrate repositories between the backends without having to recreate the
whole repository. Add the logic to do so.
The implementation is generic and works with arbitrary ref storage
formats so that a backend does not need to implement any migration
logic. It does have a few limitations though:
- We do not migrate repositories with worktrees, because worktrees
have separate ref storages. It makes the overall affair more complex
if we have to migrate multiple storages at once.
- We do not migrate reflogs, because we have no interfaces to write
many reflog entries.
- We do not lock the repository for concurrent access, and thus
concurrent writes may end up with weird in-between states. There is
no way to fully lock the "files" backend for writes due to its
format, and thus we punt on this topic altogether and defer to the
user to avoid those from happening.
In other words, this version is a minimum viable product for migrating a
repository's ref storage format. It works alright for bare repos, which
often have neither worktrees nor reflogs. But it will not work for many
other repositories without some preparations. These limitations are not
set into stone though, and ideally we will eventually address them over
time.
The logic is not yet used by anything, and thus there are no tests for
it. Those will be added in the next commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 308 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
refs.h | 18 ++++
2 files changed, 326 insertions(+)
diff --git a/refs.c b/refs.c
index 31fd391214..1304d3dd87 100644
--- a/refs.c
+++ b/refs.c
@@ -2570,3 +2570,311 @@ int ref_update_check_old_target(const char *referent, struct ref_update *update,
referent, update->old_target);
return -1;
}
+
+struct migration_data {
+ struct ref_store *old_refs;
+ struct ref_transaction *transaction;
+ struct strbuf *errbuf;
+};
+
+static int migrate_one_ref(const char *refname, const struct object_id *oid,
+ int flags, void *cb_data)
+{
+ struct migration_data *data = cb_data;
+ struct strbuf symref_target = STRBUF_INIT;
+ int ret;
+
+ if (flags & REF_ISSYMREF) {
+ ret = refs_read_symbolic_ref(data->old_refs, refname, &symref_target);
+ if (ret < 0)
+ goto done;
+
+ ret = ref_transaction_update(data->transaction, refname, NULL, null_oid(),
+ symref_target.buf, NULL,
+ REF_SKIP_CREATE_REFLOG | REF_NO_DEREF, NULL, data->errbuf);
+ if (ret < 0)
+ goto done;
+ } else {
+ ret = ref_transaction_create(data->transaction, refname, oid,
+ REF_SKIP_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION,
+ NULL, data->errbuf);
+ if (ret < 0)
+ goto done;
+ }
+
+done:
+ strbuf_release(&symref_target);
+ return ret;
+}
+
+static int move_files(const char *from_path, const char *to_path, struct strbuf *errbuf)
+{
+ struct strbuf from_buf = STRBUF_INIT, to_buf = STRBUF_INIT;
+ size_t from_len, to_len;
+ DIR *from_dir;
+ int ret;
+
+ from_dir = opendir(from_path);
+ if (!from_dir) {
+ strbuf_addf(errbuf, "could not open source directory '%s': %s",
+ from_path, strerror(errno));
+ ret = -1;
+ goto done;
+ }
+
+ strbuf_addstr(&from_buf, from_path);
+ strbuf_complete(&from_buf, '/');
+ from_len = from_buf.len;
+
+ strbuf_addstr(&to_buf, to_path);
+ strbuf_complete(&to_buf, '/');
+ to_len = to_buf.len;
+
+ while (1) {
+ struct dirent *ent;
+
+ errno = 0;
+ ent = readdir(from_dir);
+ if (!ent)
+ break;
+
+ if (!strcmp(ent->d_name, ".") ||
+ !strcmp(ent->d_name, ".."))
+ continue;
+
+ strbuf_setlen(&from_buf, from_len);
+ strbuf_addstr(&from_buf, ent->d_name);
+
+ strbuf_setlen(&to_buf, to_len);
+ strbuf_addstr(&to_buf, ent->d_name);
+
+ ret = rename(from_buf.buf, to_buf.buf);
+ if (ret < 0) {
+ strbuf_addf(errbuf, "could not link file '%s' to '%s': %s",
+ from_buf.buf, to_buf.buf, strerror(errno));
+ goto done;
+ }
+ }
+
+ if (errno) {
+ strbuf_addf(errbuf, "could not read entry from directory '%s': %s",
+ from_path, strerror(errno));
+ ret = -1;
+ goto done;
+ }
+
+ ret = 0;
+
+done:
+ strbuf_release(&from_buf);
+ strbuf_release(&to_buf);
+ if (from_dir)
+ closedir(from_dir);
+ return ret;
+}
+
+static int count_reflogs(const char *reflog UNUSED, void *payload)
+{
+ size_t *reflog_count = payload;
+ (*reflog_count)++;
+ return 0;
+}
+
+static int has_worktrees(void)
+{
+ struct worktree **worktrees = get_worktrees();
+ int ret = 0;
+ size_t i;
+
+ for (i = 0; worktrees[i]; i++) {
+ if (is_main_worktree(worktrees[i]))
+ continue;
+ ret = 1;
+ }
+
+ free_worktrees(worktrees);
+ return ret;
+}
+
+int repo_migrate_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format,
+ unsigned int flags,
+ struct strbuf *errbuf)
+{
+ struct ref_store *old_refs = NULL, *new_refs = NULL;
+ struct ref_transaction *transaction = NULL;
+ struct strbuf new_gitdir = STRBUF_INIT;
+ struct migration_data data;
+ size_t reflog_count = 0;
+ int did_migrate_refs = 0;
+ int ret;
+
+ if (repo->ref_storage_format == format) {
+ strbuf_addstr(errbuf, "current and new ref storage format are equal");
+ ret = -1;
+ goto done;
+ }
+
+ old_refs = get_main_ref_store(repo);
+
+ /*
+ * We do not have any interfaces that would allow us to write many
+ * reflog entries. Once we have them we can remove this restriction.
+ */
+ if (refs_for_each_reflog(old_refs, count_reflogs, &reflog_count) < 0) {
+ strbuf_addstr(errbuf, "cannot count reflogs");
+ ret = -1;
+ goto done;
+ }
+ if (reflog_count) {
+ strbuf_addstr(errbuf, "migrating reflogs is not supported yet");
+ ret = -1;
+ goto done;
+ }
+
+ /*
+ * Worktrees complicate the migration because every worktree has a
+ * separate ref storage. While it should be feasible to implement, this
+ * is pushed out to a future iteration.
+ *
+ * TODO: we should really be passing the caller-provided repository to
+ * `has_worktrees()`, but our worktree subsystem doesn't yet support
+ * that.
+ */
+ if (has_worktrees()) {
+ strbuf_addstr(errbuf, "migrating repositories with worktrees is not supported yet");
+ ret = -1;
+ goto done;
+ }
+
+ /*
+ * The overall logic looks like this:
+ *
+ * 1. Set up a new temporary directory and initialize it with the new
+ * format. This is where all refs will be migrated into.
+ *
+ * 2. Enumerate all refs and write them into the new ref storage.
+ * This operation is safe as we do not yet modify the main
+ * repository.
+ *
+ * 3. If we're in dry-run mode then we are done and can hand over the
+ * directory to the caller for inspection. If not, we now start
+ * with the destructive part.
+ *
+ * 4. Delete the old ref storage from disk. As we have a copy of refs
+ * in the new ref storage it's okay(ish) if we now get interrupted
+ * as there is an equivalent copy of all refs available.
+ *
+ * 5. Move the new ref storage files into place.
+ *
+ * 6. Change the repository format to the new ref format.
+ */
+ strbuf_addf(&new_gitdir, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
+ if (!mkdtemp(new_gitdir.buf)) {
+ strbuf_addf(errbuf, "cannot create migration directory: %s",
+ strerror(errno));
+ ret = -1;
+ goto done;
+ }
+
+ new_refs = ref_store_init(repo, format, new_gitdir.buf,
+ REF_STORE_ALL_CAPS);
+ ret = ref_store_create_on_disk(new_refs, 0, errbuf);
+ if (ret < 0)
+ goto done;
+
+ transaction = ref_store_transaction_begin(new_refs, errbuf);
+ if (!transaction)
+ goto done;
+
+ data.old_refs = old_refs;
+ data.transaction = transaction;
+ data.errbuf = errbuf;
+
+ /*
+ * We need to use the internal `do_for_each_ref()` here so that we can
+ * also include broken refs and symrefs. These would otherwise be
+ * skipped silently.
+ *
+ * Ideally, we would do this call while locking the old ref storage
+ * such that there cannot be any concurrent modifications. We do not
+ * have the infra for that though, and the "files" backend does not
+ * allow for a central lock due to its design. It's thus on the user to
+ * ensure that there are no concurrent writes.
+ */
+ ret = do_for_each_ref(old_refs, "", NULL, migrate_one_ref, 0,
+ DO_FOR_EACH_INCLUDE_ROOT_REFS | DO_FOR_EACH_INCLUDE_BROKEN,
+ &data);
+ if (ret < 0)
+ goto done;
+
+ /*
+ * TODO: we might want to migrate to `initial_ref_transaction_commit()`
+ * here, which is more efficient for the files backend because it would
+ * write new refs into the packed-refs file directly. At this point,
+ * the files backend doesn't handle pseudo-refs and symrefs correctly
+ * though, so this requires some more work.
+ */
+ ret = ref_transaction_commit(transaction, errbuf);
+ if (ret < 0)
+ goto done;
+ did_migrate_refs = 1;
+
+ if (flags & REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN) {
+ printf(_("Finished dry-run migration of refs, "
+ "the result can be found at '%s'\n"), new_gitdir.buf);
+ ret = 0;
+ goto done;
+ }
+
+ /*
+ * Until now we were in the non-destructive phase, where we only
+ * populated the new ref store. From hereon though we are about
+ * to get hands by deleting the old ref store and then moving
+ * the new one into place.
+ *
+ * Assuming that there were no concurrent writes, the new ref
+ * store should have all information. So if we fail from hereon
+ * we may be in an in-between state, but it would still be able
+ * to recover by manually moving remaining files from the
+ * temporary migration directory into place.
+ */
+ ret = ref_store_remove_on_disk(old_refs, errbuf);
+ if (ret < 0)
+ goto done;
+
+ ret = move_files(new_gitdir.buf, old_refs->gitdir, errbuf);
+ if (ret < 0)
+ goto done;
+
+ if (rmdir(new_gitdir.buf) < 0)
+ warning_errno(_("could not remove temporary migration directory '%s'"),
+ new_gitdir.buf);
+
+ /*
+ * We have migrated the repository, so we now need to adjust the
+ * repository format so that clients will use the new ref store.
+ * We also need to swap out the repository's main ref store.
+ */
+ initialize_repository_version(hash_algo_by_ptr(repo->hash_algo), format, 1);
+
+ free(new_refs->gitdir);
+ new_refs->gitdir = xstrdup(old_refs->gitdir);
+ repo->refs_private = new_refs;
+ ref_store_release(old_refs);
+
+ ret = 0;
+
+done:
+ if (ret && did_migrate_refs) {
+ strbuf_complete(errbuf, '\n');
+ strbuf_addf(errbuf, _("migrated refs can be found at '%s'"),
+ new_gitdir.buf);
+ }
+
+ if (ret && new_refs)
+ ref_store_release(new_refs);
+ ref_transaction_free(transaction);
+ strbuf_release(&new_gitdir);
+ return ret;
+}
diff --git a/refs.h b/refs.h
index 61ee7b7a15..76d25df4de 100644
--- a/refs.h
+++ b/refs.h
@@ -1070,6 +1070,24 @@ int is_root_ref(const char *refname);
*/
int is_pseudo_ref(const char *refname);
+/*
+ * The following flags can be passed to `repo_migrate_ref_storage_format()`:
+ *
+ * - REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN: perform a dry-run migration
+ * without touching the main repository. The result will be written into a
+ * temporary ref storage directory.
+ */
+#define REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN (1 << 0)
+
+/*
+ * Migrate the ref storage format used by the repository to the
+ * specified one.
+ */
+int repo_migrate_ref_storage_format(struct repository *repo,
+ enum ref_storage_format format,
+ unsigned int flags,
+ struct strbuf *err);
+
/*
* The following functions have been removed in Git v2.45 in favor of functions
* that receive a `ref_store` as parameter. The intent of this section is
--
2.45.2.409.g7b0defb391.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v5 12/12] builtin/refs: new command to migrate ref storage formats
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
` (10 preceding siblings ...)
2024-06-06 5:29 ` [PATCH v5 11/12] refs: implement logic to migrate between ref storage formats Patrick Steinhardt
@ 2024-06-06 5:29 ` Patrick Steinhardt
2024-06-06 7:06 ` [PATCH v5 00/12] refs: ref storage migrations Jeff King
2024-06-06 16:18 ` Junio C Hamano
13 siblings, 0 replies; 103+ messages in thread
From: Patrick Steinhardt @ 2024-06-06 5:29 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler,
Karthik Nayak, Jeff King
[-- Attachment #1: Type: text/plain, Size: 16398 bytes --]
Introduce a new command that allows the user to migrate a repository
between ref storage formats. This new command is implemented as part of
a new git-refs(1) executable. This is due to two reasons:
- There is no good place to put the migration logic in existing
commands. git-maintenance(1) felt unwieldy, and git-pack-refs(1) is
not the correct place to put it, either.
- I had it in my mind to create a new low-level command for accessing
refs for quite a while already. git-refs(1) is that command and can
over time grow more functionality relating to refs. This should help
discoverability by consolidating low-level access to refs into a
single executable.
As mentioned in the preceding commit that introduces the ref storage
format migration logic, the new `git refs migrate` command still has a
bunch of restrictions. These restrictions are documented accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
.gitignore | 1 +
Documentation/git-refs.txt | 61 ++++++++++
Makefile | 1 +
builtin.h | 1 +
builtin/refs.c | 75 ++++++++++++
command-list.txt | 1 +
git.c | 1 +
t/t1460-refs-migrate.sh | 243 +++++++++++++++++++++++++++++++++++++
8 files changed, 384 insertions(+)
create mode 100644 Documentation/git-refs.txt
create mode 100644 builtin/refs.c
create mode 100755 t/t1460-refs-migrate.sh
diff --git a/.gitignore b/.gitignore
index 612c0f6a0f..8caf3700c2 100644
--- a/.gitignore
+++ b/.gitignore
@@ -126,6 +126,7 @@
/git-rebase
/git-receive-pack
/git-reflog
+/git-refs
/git-remote
/git-remote-http
/git-remote-https
diff --git a/Documentation/git-refs.txt b/Documentation/git-refs.txt
new file mode 100644
index 0000000000..5b99e04385
--- /dev/null
+++ b/Documentation/git-refs.txt
@@ -0,0 +1,61 @@
+git-refs(1)
+===========
+
+NAME
+----
+git-refs - Low-level access to refs
+
+
+SYNOPSIS
+--------
+[verse]
+'git refs migrate' --ref-format=<format> [--dry-run]
+
+DESCRIPTION
+-----------
+
+This command provides low-level access to refs.
+
+COMMANDS
+--------
+
+migrate::
+ Migrate ref store between different formats.
+
+OPTIONS
+-------
+
+The following options are specific to 'git refs migrate':
+
+--ref-format=<format>::
+ The ref format to migrate the ref store to. Can be one of:
++
+include::ref-storage-format.txt[]
+
+--dry-run::
+ Perform the migration, but do not modify the repository. The migrated
+ refs will be written into a separate directory that can be inspected
+ separately. The name of the directory will be reported on stdout. This
+ can be used to double check that the migration works as expected before
+ performing the actual migration.
+
+KNOWN LIMITATIONS
+-----------------
+
+The ref format migration has several known limitations in its current form:
+
+* It is not possible to migrate repositories that have reflogs.
+
+* It is not possible to migrate repositories that have worktrees.
+
+* There is no way to block concurrent writes to the repository during an
+ ongoing migration. Concurrent writes can lead to an inconsistent migrated
+ state. Users are expected to block writes on a higher level. If your
+ repository is registered for scheduled maintenance, it is recommended to
+ unregister it first with git-maintenance(1).
+
+These limitations may eventually be lifted.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index cf504963c2..2d702b552c 100644
--- a/Makefile
+++ b/Makefile
@@ -1283,6 +1283,7 @@ BUILTIN_OBJS += builtin/read-tree.o
BUILTIN_OBJS += builtin/rebase.o
BUILTIN_OBJS += builtin/receive-pack.o
BUILTIN_OBJS += builtin/reflog.o
+BUILTIN_OBJS += builtin/refs.o
BUILTIN_OBJS += builtin/remote-ext.o
BUILTIN_OBJS += builtin/remote-fd.o
BUILTIN_OBJS += builtin/remote.o
diff --git a/builtin.h b/builtin.h
index 28280636da..7eda9b2486 100644
--- a/builtin.h
+++ b/builtin.h
@@ -207,6 +207,7 @@ int cmd_rebase(int argc, const char **argv, const char *prefix);
int cmd_rebase__interactive(int argc, const char **argv, const char *prefix);
int cmd_receive_pack(int argc, const char **argv, const char *prefix);
int cmd_reflog(int argc, const char **argv, const char *prefix);
+int cmd_refs(int argc, const char **argv, const char *prefix);
int cmd_remote(int argc, const char **argv, const char *prefix);
int cmd_remote_ext(int argc, const char **argv, const char *prefix);
int cmd_remote_fd(int argc, const char **argv, const char *prefix);
diff --git a/builtin/refs.c b/builtin/refs.c
new file mode 100644
index 0000000000..46dcd150d4
--- /dev/null
+++ b/builtin/refs.c
@@ -0,0 +1,75 @@
+#include "builtin.h"
+#include "parse-options.h"
+#include "refs.h"
+#include "repository.h"
+#include "strbuf.h"
+
+#define REFS_MIGRATE_USAGE \
+ N_("git refs migrate --ref-format=<format> [--dry-run]")
+
+static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
+{
+ const char * const migrate_usage[] = {
+ REFS_MIGRATE_USAGE,
+ NULL,
+ };
+ const char *format_str = NULL;
+ enum ref_storage_format format;
+ unsigned int flags = 0;
+ struct option options[] = {
+ OPT_STRING_F(0, "ref-format", &format_str, N_("format"),
+ N_("specify the reference format to convert to"),
+ PARSE_OPT_NONEG),
+ OPT_BIT(0, "dry-run", &flags,
+ N_("perform a non-destructive dry-run"),
+ REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN),
+ OPT_END(),
+ };
+ struct strbuf errbuf = STRBUF_INIT;
+ int err;
+
+ argc = parse_options(argc, argv, prefix, options, migrate_usage, 0);
+ if (argc)
+ usage(_("too many arguments"));
+ if (!format_str)
+ usage(_("missing --ref-format=<format>"));
+
+ format = ref_storage_format_by_name(format_str);
+ if (format == REF_STORAGE_FORMAT_UNKNOWN) {
+ err = error(_("unknown ref storage format '%s'"), format_str);
+ goto out;
+ }
+
+ if (the_repository->ref_storage_format == format) {
+ err = error(_("repository already uses '%s' format"),
+ ref_storage_format_to_name(format));
+ goto out;
+ }
+
+ if (repo_migrate_ref_storage_format(the_repository, format, flags, &errbuf) < 0) {
+ err = error("%s", errbuf.buf);
+ goto out;
+ }
+
+ err = 0;
+
+out:
+ strbuf_release(&errbuf);
+ return err;
+}
+
+int cmd_refs(int argc, const char **argv, const char *prefix)
+{
+ const char * const refs_usage[] = {
+ REFS_MIGRATE_USAGE,
+ NULL,
+ };
+ parse_opt_subcommand_fn *fn = NULL;
+ struct option opts[] = {
+ OPT_SUBCOMMAND("migrate", &fn, cmd_refs_migrate),
+ OPT_END(),
+ };
+
+ argc = parse_options(argc, argv, prefix, opts, refs_usage, 0);
+ return fn(argc, argv, prefix);
+}
diff --git a/command-list.txt b/command-list.txt
index c4cd0f352b..e0bb87b3b5 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -157,6 +157,7 @@ git-read-tree plumbingmanipulators
git-rebase mainporcelain history
git-receive-pack synchelpers
git-reflog ancillarymanipulators complete
+git-refs ancillarymanipulators complete
git-remote ancillarymanipulators complete
git-repack ancillarymanipulators complete
git-replace ancillarymanipulators complete
diff --git a/git.c b/git.c
index 637c61ca9c..683bb69194 100644
--- a/git.c
+++ b/git.c
@@ -594,6 +594,7 @@ static struct cmd_struct commands[] = {
{ "rebase", cmd_rebase, RUN_SETUP | NEED_WORK_TREE },
{ "receive-pack", cmd_receive_pack },
{ "reflog", cmd_reflog, RUN_SETUP },
+ { "refs", cmd_refs, RUN_SETUP },
{ "remote", cmd_remote, RUN_SETUP },
{ "remote-ext", cmd_remote_ext, NO_PARSEOPT },
{ "remote-fd", cmd_remote_fd, NO_PARSEOPT },
diff --git a/t/t1460-refs-migrate.sh b/t/t1460-refs-migrate.sh
new file mode 100755
index 0000000000..f7c0783d30
--- /dev/null
+++ b/t/t1460-refs-migrate.sh
@@ -0,0 +1,243 @@
+#!/bin/sh
+
+test_description='migration of ref storage backends'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_migration () {
+ git -C "$1" for-each-ref --include-root-refs \
+ --format='%(refname) %(objectname) %(symref)' >expect &&
+ git -C "$1" refs migrate --ref-format="$2" &&
+ git -C "$1" for-each-ref --include-root-refs \
+ --format='%(refname) %(objectname) %(symref)' >actual &&
+ test_cmp expect actual &&
+
+ git -C "$1" rev-parse --show-ref-format >actual &&
+ echo "$2" >expect &&
+ test_cmp expect actual
+}
+
+test_expect_success 'setup' '
+ rm -rf .git &&
+ # The migration does not yet support reflogs.
+ git config --global core.logAllRefUpdates false
+'
+
+test_expect_success "superfluous arguments" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate foo 2>err &&
+ cat >expect <<-EOF &&
+ usage: too many arguments
+ EOF
+ test_cmp expect err
+'
+
+test_expect_success "missing ref storage format" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate 2>err &&
+ cat >expect <<-EOF &&
+ usage: missing --ref-format=<format>
+ EOF
+ test_cmp expect err
+'
+
+test_expect_success "unknown ref storage format" '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=unknown 2>err &&
+ cat >expect <<-EOF &&
+ error: unknown ref storage format ${SQ}unknown${SQ}
+ EOF
+ test_cmp expect err
+'
+
+ref_formats="files reftable"
+for from_format in $ref_formats
+do
+ for to_format in $ref_formats
+ do
+ if test "$from_format" = "$to_format"
+ then
+ continue
+ fi
+
+ test_expect_success "$from_format: migration to same format fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$from_format 2>err &&
+ cat >expect <<-EOF &&
+ error: repository already uses ${SQ}$from_format${SQ} format
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: migration with reflog fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_config -C repo core.logAllRefUpdates true &&
+ test_commit -C repo logged &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$to_format 2>err &&
+ cat >expect <<-EOF &&
+ error: migrating reflogs is not supported yet
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: migration with worktree fails" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ git -C repo worktree add wt &&
+ test_must_fail git -C repo refs migrate \
+ --ref-format=$to_format 2>err &&
+ cat >expect <<-EOF &&
+ error: migrating repositories with worktrees is not supported yet
+ EOF
+ test_cmp expect err
+ '
+
+ test_expect_success "$from_format -> $to_format: unborn HEAD" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: single ref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: bare repository" '
+ test_when_finished "rm -rf repo repo.git" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git clone --ref-format=$from_format --mirror repo repo.git &&
+ test_migration repo.git "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: dangling symref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo symbolic-ref BROKEN_HEAD refs/heads/nonexistent &&
+ test_migration repo "$to_format" &&
+ echo refs/heads/nonexistent >expect &&
+ git -C repo symbolic-ref BROKEN_HEAD >actual &&
+ test_cmp expect actual
+ '
+
+ test_expect_success "$from_format -> $to_format: broken ref" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ test-tool -C repo ref-store main update-ref "" refs/heads/broken \
+ "$(test_oid 001)" "$ZERO_OID" REF_SKIP_CREATE_REFLOG,REF_SKIP_OID_VERIFICATION &&
+ test_migration repo "$to_format" &&
+ test_oid 001 >expect &&
+ git -C repo rev-parse refs/heads/broken >actual &&
+ test_cmp expect actual
+ '
+
+ test_expect_success "$from_format -> $to_format: pseudo-refs" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo update-ref FOO_HEAD HEAD &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: special refs are left alone" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo rev-parse HEAD >repo/.git/MERGE_HEAD &&
+ git -C repo rev-parse MERGE_HEAD &&
+ test_migration repo "$to_format" &&
+ test_path_is_file repo/.git/MERGE_HEAD
+ '
+
+ test_expect_success "$from_format -> $to_format: a bunch of refs" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+
+ test_commit -C repo initial &&
+ cat >input <<-EOF &&
+ create FOO_HEAD HEAD
+ create refs/heads/branch-1 HEAD
+ create refs/heads/branch-2 HEAD
+ create refs/heads/branch-3 HEAD
+ create refs/heads/branch-4 HEAD
+ create refs/tags/tag-1 HEAD
+ create refs/tags/tag-2 HEAD
+ EOF
+ git -C repo update-ref --stdin <input &&
+ test_migration repo "$to_format"
+ '
+
+ test_expect_success "$from_format -> $to_format: dry-run migration does not modify repository" '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=$from_format repo &&
+ test_commit -C repo initial &&
+ git -C repo refs migrate --dry-run \
+ --ref-format=$to_format >output &&
+ grep "Finished dry-run migration of refs" output &&
+ test_path_is_dir repo/.git/ref_migration.* &&
+ echo $from_format >expect &&
+ git -C repo rev-parse --show-ref-format >actual &&
+ test_cmp expect actual
+ '
+ done
+done
+
+test_expect_success 'migrating from files format deletes backend files' '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=files repo &&
+ test_commit -C repo first &&
+ git -C repo pack-refs --all &&
+ test_commit -C repo second &&
+ git -C repo update-ref ORIG_HEAD HEAD &&
+ git -C repo rev-parse HEAD >repo/.git/FETCH_HEAD &&
+
+ test_path_is_file repo/.git/HEAD &&
+ test_path_is_file repo/.git/ORIG_HEAD &&
+ test_path_is_file repo/.git/refs/heads/main &&
+ test_path_is_file repo/.git/packed-refs &&
+
+ test_migration repo reftable &&
+
+ echo "ref: refs/heads/.invalid" >expect &&
+ test_cmp expect repo/.git/HEAD &&
+ echo "this repository uses the reftable format" >expect &&
+ test_cmp expect repo/.git/refs/heads &&
+ test_path_is_file repo/.git/FETCH_HEAD &&
+ test_path_is_missing repo/.git/ORIG_HEAD &&
+ test_path_is_missing repo/.git/refs/heads/main &&
+ test_path_is_missing repo/.git/logs &&
+ test_path_is_missing repo/.git/packed-refs
+'
+
+test_expect_success 'migrating from reftable format deletes backend files' '
+ test_when_finished "rm -rf repo" &&
+ git init --ref-format=reftable repo &&
+ test_commit -C repo first &&
+
+ test_path_is_dir repo/.git/reftable &&
+ test_migration repo files &&
+
+ test_path_is_missing repo/.git/reftable &&
+ echo "ref: refs/heads/main" >expect &&
+ test_cmp expect repo/.git/HEAD &&
+ test_path_is_file repo/.git/refs/heads/main
+'
+
+test_done
--
2.45.2.409.g7b0defb391.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH v5 00/12] refs: ref storage migrations
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
` (11 preceding siblings ...)
2024-06-06 5:29 ` [PATCH v5 12/12] builtin/refs: new command to migrate " Patrick Steinhardt
@ 2024-06-06 7:06 ` Jeff King
2024-06-06 16:18 ` Junio C Hamano
13 siblings, 0 replies; 103+ messages in thread
From: Jeff King @ 2024-06-06 7:06 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Eric Sunshine, Junio C Hamano, Ramsay Jones, Justin Tobler,
Karthik Nayak
On Thu, Jun 06, 2024 at 07:28:52AM +0200, Patrick Steinhardt wrote:
> the ref storage migration was merged to `next`, but got reverted due to
> some additional findings by Peff and/or Coverity.
>
> Changes compared to v4:
>
> - Adapt comment of `ref_store_init()` to the new parameter.
>
> - Fix use of an uninitialized return value in `for_each_root_ref()`.
>
> - Fix overwrite of ret code in `files_ref_store_remove_on_disk()`.
>
> - Adapt an error message to more clearly point out that deletion of
> "refs/" directory failed in `reftable_be_remove_on_disk()`.
>
> - Fix a leak when `mkdtemp()` fails.
These all looked good to me (though I did not carefully read the
original topic, so was just looking at the parts I mentioned earlier).
> 6: f7577a0ab3 ! 6: 86cf0c84b1 refs/files: extract function to iterate through root refs
> @@ refs/files-backend.c: static void add_root_refs(struct files_ref_store *refs,
> strbuf_setlen(&refname, dirnamelen);
> }
> +
> ++ ret = 0;
> ++
> +done:
> strbuf_release(&refname);
> strbuf_release(&path);
Since the context doesn't show much, I wondered whether there was any
case where we'd overwrite an earlier "ret" here. But nope, we always
jump to "done" after finding "ret" contains a non-zero value. So setting
it to zero here is the right thing.
-Peff
^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v5 00/12] refs: ref storage migrations
2024-06-06 5:28 ` [PATCH v5 00/12] refs: ref storage migrations Patrick Steinhardt
` (12 preceding siblings ...)
2024-06-06 7:06 ` [PATCH v5 00/12] refs: ref storage migrations Jeff King
@ 2024-06-06 16:18 ` Junio C Hamano
13 siblings, 0 replies; 103+ messages in thread
From: Junio C Hamano @ 2024-06-06 16:18 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Eric Sunshine, Ramsay Jones, Justin Tobler, Karthik Nayak,
Jeff King
Patrick Steinhardt <ps@pks.im> writes:
> Hi,
>
> the ref storage migration was merged to `next`, but got reverted due to
> some additional findings by Peff and/or Coverity.
>
> Changes compared to v4:
>
> - Adapt comment of `ref_store_init()` to the new parameter.
>
> - Fix use of an uninitialized return value in `for_each_root_ref()`.
>
> - Fix overwrite of ret code in `files_ref_store_remove_on_disk()`.
>
> - Adapt an error message to more clearly point out that deletion of
> "refs/" directory failed in `reftable_be_remove_on_disk()`.
>
> - Fix a leak when `mkdtemp()` fails.
>
> Thanks!
>
> Patrick
Looking good. The use of strbuf for mkdtemp() template and relying
on the fact that mkdtemp() makes an in-place modification of the
template string made the resulting code easier to follow, I would
think.
Let's mark the topic ready for 'next'. Thanks.
^ permalink raw reply [flat|nested] 103+ messages in thread